Initial progress on category-typed attributes for C-sets

1. Motivation

Here are two equivalent categories. The first category, \mathsf{Fib} := \mathsf{Set}^{\mathsf{Arr}} (for fibered), is the category of copresheaves on the walking arrow category \mathsf{Arr} = \bullet \to \bullet. An object of this category consists of two sets and a map between them, and a morphism is a commuting square.

The second category, which I will call \mathsf{Ind} (for indexed), is the category of elements of the functor \mathsf{Set}^{-} \colon \mathsf{Set}^{\mathrm{op}} \to \mathsf{Cat}. An element of this category consists of a set X and a function F \colon X \to \mathsf{Set}, and a morphism from (X,F) to (Y,G) consists of a function h \colon X \to Y along with for every x \in X, a function \alpha_x \colon F(x) \to G(h(x)).

The equivalence between \mathsf{Fib} and \mathsf{Ind} is an example of the Grothendieck construction. In brief, we can send an object (X, F \colon X \to \mathsf{Set}) of \mathsf{Fib} to an object \pi_1 \colon \sum_{x \in X} F(x) \to X of \mathsf{Ind}, and in the reverse direction send an object f \colon Y \to X to (X, f^{-1} \colon X \to \mathsf{Set}).

However, although these are equivalent mathematically, they are not equivalent computationally. Often the indexed point of view is more natural. For instance, it is more natural to have a list of ports for each box in a wiring diagram (the indexed perspective), rather than having a global list of ports and a function from ports to boxes (the fibered perspective).

Additionally, in the indexed perspective, it is easy to change \mathsf{Ind} to the Grothendieck construction for \mathsf{Set}_{\cong}^{-} and get morphisms which have be “locally isomorphic”, i.e. \alpha_x \colon F(x) \to G(h(x)) is always an isomorphism. Analogously, we could do “locally injective”, “locally surjective”, or even more exotic things like “locally partial”, “locally relational”, or even “locally dual”, where we have the map going in the opposite direction \alpha_x \colon G(h(x)) \to F(x) (in which case we get \mathsf{Poly})!

We could even switch to a totally different category, like (\mathbb{R}, \leq). The Grothendieck construction on \mathbb{R}^{-} gives us the category of pairs (X, F \colon X \to \mathbb{R}), and morphisms (X,F) \to (Y,G) given by functions h \colon X \to Y such that F(x) \leq G(h(x)) for all x \in X.

In short, the indexed perspective is both more natural, and gives us a lot of degrees of freedom to play with.

The one current advantage that the fibered perspective has going for it is that it’s a normal C-set category, and we have an implementation in AlgebraicJulia for C-sets, but we don’t have an implementation for all of this fancy indexed stuff. So when we implement wiring diagrams, we use the fibered perspective; we have a set of all ports and then a map from ports to boxes.

In order to remedy this situation, I want to come up with a theory that lets us talk about the indexed perspective. I have a lot of other things that I need to be doing, so I’m actually hoping that someone else works out the details of this based on the sketch I give in this post, possibly @KevinArlin.

2. The story so far

I think that I can get what I want from an extension of the theory of ACSets to have category-typed attributes.

To recap briefly, the theory of ACSets is essentially the following. Going back to Functorial Data Migration it is known that you can represent databases with only foreign keys columns using C-sets, and databases with columns that have actual data in them (like strings or integers) can be represented by slice categories \mathsf{Set}^C / A, where the C-set A tells you what types are in your data columns. The difficulty with this picture is that actually writing down the C-set A can be somewhat challenging.

To see why, an example is useful. Let C be the schema for graphs, i.e. \{E,V, (\mathrm{src}, \mathrm{tgt}) \colon E \to V\}. If we want to decorate each vertex by a string, one might suppose that we would let A(V) = \mathtt{String}, A(E) = 1, because we aren’t decorating edges by anything. But the problem is that we can’t define A(\mathrm{src}) \colon 1 \to \mathtt{String}. In order to fix this, we actually need A(E) = \mathtt{String} \times \mathtt{String}, with A(\mathrm{src}) = \pi_1 and A(\mathrm{tgt}) = \pi_2.

The contribution of the ACSet paper was to come up with a simpler way of describing the data attributes of C-sets, to make what we call “attributed C-sets”.

Currently I have a way of doing “category-typed attributes” that is at the level of slice categories; it is yet an open problem to come up with convenient presentations for category-typed attributes.

The slice story for category-typed attributes is a fairly straightforward extension of the slice story for set-typed attributes. We start with a schema category C, and then we consider a functor A \colon C \to \mathsf{Cat}. We can then take the lax comma category \mathsf{Set}^C // A along the inclusion \mathsf{Set}^C \to \mathsf{Cat}^C. An object of \mathsf{Set}^C // A is a functor F \colon C \to \mathsf{Set} along with a natural transformation \alpha \colon F \to A. A morphism from (F, \alpha) to (G, \beta) is a diagram

where \aleph is a modification.

This allows morphisms to only “laxly” preserve attributes. I.e., for each c \in C, x \in F(c), there is a morphism \aleph_{c,x} \colon \alpha_c(x) \to \beta_c(\gamma_c(x)) in A(c).

To help understand this, let’s do an example.

Let C be the schema for graphs, and define A \colon C \to \mathsf{Cat} by

A(V) = \mathsf{Set}^{\{I,O\}}
A(E) = \{ (B_1 \in \mathsf{Set}^{\{I, O\}}, B_2 \in \mathsf{Set}^{\{I, O\}}, (o,i) \in B_1(O) \times B_2(I)) \}

Just like before, A(\mathrm{src}) and A(\mathrm{tgt}) are the two obvious projections. The category structure on A(E) can be given by an appropriate Grothendieck construction.

An object of \mathsf{Set}^C // A consists of a graph where each vertex has a set of input ports and a set of output ports, and where each wire is decorated with an output port of its source, and an input port of its target. In other words, this is a wiring diagram (without boundary).

A morphism of these maps vertices to vertices and edges to edges, but does not have to preserve the sets of input and output ports. Rather, if v \mapsto v', then we get a function from the input ports of v to the input ports of v', and so on.

We then can require that this function is a bijection by making A(V) = \mathsf{Set}_{\cong}^{\{I,O\}}, and so on.

What remains to be done is to come up with a way of writing down the schema for one of these things in a more convenient way than specifying A manually, just as we have a profunctor schema for ACSets. As the profunctor schema for ACSets involved double categories, this likely involves some kind of 3-category, which scares me a little bit…

I’m going to conclude by mentioning something that I noticed which might help the process of defining finite presentations for these things. The category A(E) that we defined earlier is the Grothendieck construction of a representable functor out of \Set^{\{I,O\} + \{I,O\}}. So categories that are given by Grothendieck constructions of representable functors out of C-set categories might be relevant.


I want to suggest a definition for acsets with category-valued attributes that’s as close to the set-value attribute case as possible without losing anything we need. I suspect that the following is about what we want.

Let S_0 and S_1 be small categories, S:S_0\nrightarrow S_1 a profunctor, and K:S_1\to \mathbf{Cat} a functor.[1] Then an acset relative to K is an oplax functor from the collage A:|S|\to \mathbf{Cat} such that A restricts to K over S_1 and to a (necessarily strict) functor F:S_0\to \mathbf{Set}. Thus the oplaxness of A only bites at attributes: if we have a commutative triangle a=a'\circ f for a:S(s,x),f:S_0(s,s'), and a':S(s',x) then A provides morphisms A_{f,a'}:Aa\to Aa'\circ Af in the category Ax, as shown below.

As a running example, we should generalize the case of \mathbf{Fib} vs \mathbf{Ind} by considering an arbitrary S_0 with S_1 and S both terminal and K(\star)=\mathbf{Set}. Thus every object of S_0 has a unique set-valued attribute and all these attributes have to (oplaxly) respect each other. In this case, giving the acset A is equivalent to giving an oplax natural transformation \varphi from F to the functor \Delta\mathbf{Set}:S_0\to \mathbf{Cat} constant at \mathbf{Set}.

In this case, what the oplaxators give you is, for every piece x:As of combinatorial data, a function \varphi_{f,x}:\varphi_s(x)\to\varphi_{s'}(Ff(x)). So all in all, what we have is: for every s,x:F(s), a set \varphi(x) (suppressing the implicit subscript) and for every morphism f:s\to s' and x:F(s), a function \varphi(x)\to \varphi(Ff(x)), plus coherence. Noting that (s,x) is the data of an object in \int F and that (f,x) is the data of a morphism x\to Ff(x), we see that in this case, our acset A is just the same as a functor \varphi:\int F\to \mathbf{Set}. We could then integrate again to see that our acset is equivalent to the tower \int\varphi\to\int F\to S_0 of discrete opfibrations. Spoiler: we’ll see below that the category of acsets relative to this K is actually equivalent to the category of discrete opfibrations over discrete opfibrations over S_0.

In particular, if S_0 is terminal, then \int F and thus \int \varphi must be discrete and we’re recovering the category of indexed sets. And again we can go lax or pseudo to have some fun.

OK, how about morphisms? If A,A':|S|\to \mathbf{Cat} are two acsets relative to K, I suggest defining a morphism \alpha:A\to A' to be an oplax natural transformation of oplax functors, restricting (necessarily) to an ordinary natural transformation over S_0 and constant (not, I suppose, necessarily) over S_1. Thus \alpha supplies components \alpha_s:A(s)\to A'(s) and \alpha_a:A(a)\to A'(a)\circ \alpha_s for each attribute a, satisfying some coherence conditions.

Let’s turn again to our example of a unique, coherent set-valued attribute for every object of S_0, so that we can identify A and A' with the pairs (F,\varphi) and (F',\varphi') of an S_0-set and an oplax natural transformation to \Delta\mathbf{Set}. We have functions \alpha_s:F(s)\to F'(s), which are strictly natural for morphisms in S_0. Since attributes are indexed by objects of S_0, we can now write that we have indexed-set morphisms \bar\alpha_s:\varphi_s\to \varphi'_s\circ\alpha_s. Evaluating once again at some x:F(s) gives plain functions \bar\alpha_{s,x}:\varphi_s(x)\to \varphi'_s(\alpha_s(x)). Now recall that we saw the \varphi_s could be integrated into the object part of a functor \varphi:\int F\to \mathbf{Set}, while we get another copresheaf but with a different domain, \varphi':\int F'\to\mathbf{Set}, corresponding to \alpha's codomain. Now, since the \alpha_s constitute a natural transformation F\to F', they give rise to a functor \alpha:\int F\to \int F' over S_0, that is, in the slice category \mathbf{Cat}/S_0. The \bar\alpha_{s,x} then assemble into a natural transformation as indicated below:

Integrating once again shows that \alpha is equivalent to this figure:

An inverted house like this is the correct notion of morphism of discrete opfibration in the full sub-2-category of \mathbf{Cat}/S_0 spanned by the discrete opfibrations. In particular, you never put an arrow in the square for morphisms of (discrete) (op)fibrations. There also isn’t any nontrivial 2-categorical structure to add this category, because a 2-morphism would have to have vertical components, but these opfibrations are discrete. That’s great since it suggests we can have this upgrade to attributes without actually having to increase the categorical level. I think that this case has worked out nicely enough to suggest that we’ve got a usable definition in general. Let me know any thoughts or questions!

  1. I’m not sure whether I need K to be pseudo, yet; let’s imagine that S_1 is discrete as in the original acsets paper so we don’t have to worry about any of that. ↩︎

1 Like

I think that oplaxity is to constraining; we should really just put 2-cells in the schema. Then the schema would be two categories and a 2-profunctor between them, so that the collage was a 2-category, and an instance would be a 2-functor from the collage into Cat.

This would allow us to do wiring diagrams via attributes on the graph schema, by having two set-valued attribute on vertices and a unit-valued attribute on edges, and then a 2-cell that looked like:

I’m not fully following the target example so let me just say what I gather so far. I figure you’re thinking V is the boxes and E is all the kinds of wires. The set-valued attributes on V give you the set of input and output ports on each box. But then I don’t understand how E can have src and tgt defined on it, unless the wiring diagram is closed, or V somehow includes the outer box as well? I also don’t know what the unit-valued attribute is, is \mathrm{Unit}=\{\mathrm{Wire},\mathrm{InWire},\mathrm{OutWire},\mathrm{PassingWire}\}?