Compositional Wizards

Wizards as State Machines

In this post, I’m going to lay out a lot of conjectures for a mathematical formalism of wizards. Unfortunately, I don’t mean wizard like this:

I mean wizard like this:

Why do I care about wizards? I think that there is a general pattern in wizards that is applicable to many domains beyond software installation, and moreover this general pattern has compositional structure.

What is a wizard? Well, first of all, a wizard is a state machine. A wizard has a current state, and that state can be advanced by interaction with the wizard.

In order to describe this, we use polynomial coalgebras. To understand how this works, let’s give an example. We want to model a wizard which at each step, displays some text to the user and gives the user a choice of two buttons.

We can model this with the polynomial functor.

p(y) = \mathtt{String}\; y^2

where \mathtt{String} is the set of unicode strings, for instance.

A coalgebra of this functor consists of a set S and a function S \to p(S), i.e. a function S \to \mathtt{String} \times S^2. For each state, we get a string, which is the message to display the user, and then a pair of states, one for each button displayed. When the user pushes a button, we change to the corresponding state, and the process is repeated.

But this misses two important parts of a wizard: they begin and they end. We can model this by having a function 1 \to S, which marks the initial state, and changing the function S \to p(S) to a function S \to p(S) + 1, which allows some subset of states to be marked as final.

At this point, we can make an observation: why not allow wizards to begin and end in different ways? That is, instead of a map 1 \to S, we could have a map A \to S, where A is the set of “entry points” into the wizard, and instead of a map S \to p(S) + 1, we could have a map S \to p(S) + B, where B is the set of “ways the wizard can end”. For instance, we could have A = \mathtt{String}, which would allow us to start the wizard pointing at a specific file, and B = 2, which would allow the wizard to end signaling failure or success.

This now makes wizards look suspiciously like morphisms from A to B. Let’s see if this can work for real.

A Category of Wizards

For any polynomial functor p, let \mathsf{Wiz}_p to be the category where

  • Objects are sets A, B
  • A morphism from A to B consists of a tuple (S, \mathtt{init} \colon A \to S, \mathtt{update} \colon S \to p(S) + B)

To define composition, suppose that we have sets A,B,C and morphisms (S_1, \mathtt{init}_1 \colon A \to S_1, \mathtt{update}_1 \colon S_1 \to p(S_1) + B) and (S_2, \mathtt{init}_2 \colon B \to S_2, \mathtt{update}_2 \colon S_2 \to p(S_2) + C).

Then their composite is the tuple (S, \mathtt{init} \colon A \to S, \mathtt{update} \colon S \to p(S) + C) where

S = S_1 + S_2

And \mathtt{init} and \mathtt{update} are defined by

where \mathtt{inj}_1 \colon S_1 \to S_1 + S_2 and \mathtt{inj}_2 \colon S_2 \to S_1 + S_2 are the canonical injections.

The intuition behind composition of wizards is that you go through the first wizard and then when it has completed, go on to the second wizard.

The identity on A consists of (A, 1_A \colon A \to A, \mathtt{inj}_2 \colon A \to p(A) + A). This is the “do nothing” wizard, which returns immediately with the value it was passed.

Now, notice that \mathsf{Wiz}_p is not a category, because A + S \neq S so composing with the identity doesn’t give the same morphism. In order to fix this, we need to quotient wizards by their behavior. That is, we consider (S, \mathtt{init}, \mathtt{update}) to be behaviorally equivalent to (S', \mathtt{init}', \mathtt{update}') if the user can’t tell the difference between them; i.e. for any a, the same choices of buttons to push give you the same sequence of texts, and you finish at the same time with the same end value. This can be formalized by defining a wizard to be a function A \to \mathbb{C}_{p+B}(1), where \mathbb{C}_{p + B} is the cofree comonad on p + B. Essentially, an element of \mathbb{C}_{p + B}(1) is a “behavior tree” of p + B.

So really, we want morphisms in \mathsf{Wiz}_p to be maps A \to \mathbb{C}_{p+B}(1). Because these are the same as equivalence classes of state machines, our earlier definition of composition still works; we leave it to the reader to formulate composition in a more “polynomial-ish” manner.

There is now a symmetric monoidal structure on \mathsf{Wiz}_p given on objects by coproduct, and on morphisms A_1 \to \mathbb{C}_{p + B_1}, A_2 \to \mathbb{C}_{p + B_2} by the composite

A_1 + A_2 \to \mathbb{C}_{p + B_1}(1) + \mathbb{C}_{p + B_2}(1) \to \mathbb{C}_{p + B_1 + B_2}(1) + \mathbb{C}_{p + B_1 + B_2}(1) \to \mathbb{C}_{p + B_1 + B_2}(1)

Therefore, we can consider string diagrams of wizards. We can go further than this and also try to put a traced structure on \mathbf{Wiz}_p; this allows wizards to be run in loops. However, in order for this to work, we probably need to switch to partial wizards, i.e. A \to \mathbb{C}_{p + B}(1) + 1, to allow for infinite loops.

Migrating Wizards

One might wonder if this construction is functorial in p. And I think it is, if we have a map of polynomials p \to q we get a functor \mathsf{Wiz}_p \to \mathsf{Wiz}_q given on objects by the identity, and on morphisms by functoriality of \mathbb{C}_{p + B} in p. So \mathsf{Wiz} is a functor from \mathsf{Poly} to \mathsf{Cat}.

We can think of this functor like a “stateless” effect handler; an effect in (position of) p is mapped to an effect in q, and the response to the effect is carried back.

However, as David Spivak helped me see, one can go even further than this and consider stateful effect handlers. An element m \in \mathbb{C}_{[p, q]}(1) represents a way of translating p effects into q effects, and then updating some internal state, so that the next p effect might be handled in a different way. There is a natural map

\mathbb{C}_{p + B} \otimes \mathbb{C}_{[p, q]} = \mathbb{C}_{p \triangleleft (y + B)} \otimes \mathbb{C}_{[p, q]} \to \mathbb{C}_{(p \triangleleft (y + B)) \otimes [p, q]} \to \mathbb{C}_{(p \otimes [p, q]) \triangleleft (y + B)} \to \mathbb{C}_{q \triangleleft (y + B)} = \mathbb{C}_{q + B}

So specifically, there is a function

\mathbb{C}_{p + B}(1) \times \mathbb{C}_{[p, q]}(1) \to \mathbb{C}_{q + B}(1)

and thus for every m \in \mathbb{C}_{[p,q]}, a function

\mathbb{C}_{p + B}(1) \to \mathbb{C}_{q + B}(1)

which is precisely what we need to turn a function A \to \mathbb{C}_{p + B}(1) to a function A \to \mathbb{C}_{q + B}(1).

There is a category \mathsf{Org} where the objects are polynomials, and a morphism from p to q is an element of \mathbb{C}_{[p, q]}(1). Then \mathsf{Wiz} is a functor from this category into \mathsf{Cat} as well, with the action on morphisms given by the above.

All of this needs to be proven in much more detail, and there are probably higher-categorical dimensions to this as well that we are ignoring. But this is enough to start guiding an implementation of wizards that hopefully I’ll be able to show off in Semagrams! If anyone wants to work on formalizing this, it would be very welcome; there’s probably at least a paper’s worth of material here.

2 Likes

I really like wizards, Owen!

I’m now seeing the polynomial core of this idea as coming down to a very interesting map. I haven’t fully “naturalized” it yet, meaning I don’t know what laws it satisfies, but here it is.

Polynomial core

For any p,q,r:\mathbf{Poly}, there is a map

\Phi_{p,q,r}\colon\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to \mathfrak{c}_{pq+r}

where I’m stubbornly writing \mathfrak{c}_p where Owen was writing \mathbb{C}_p, for the cofree comonoid on p.

First I’ll say why the above maps \Phi are all we need for wizards, then I’ll say where such maps come from.

Why \Phi is enough

Instantaneity or not?

Owen defines a p-wizard from A to B as a map

A\to S\to (p+B)(S)=B+p(S).

This means that, given an a:A, it can either instantaneously output a b:B or do some p-ish behavior for a while and then possibly output a b:B. But we could ask wizards instead to always do at least one unit of p-processing first. Then we’d replace (\mathcal{y}+B)\lhd p with p\lhd(\mathcal{y}+B) in the above. In other words, we’d consider wizards as maps

A\to S\to p\lhd(B+S).

In his writeup, Owen (following me skipping ahead yesterday) accidentally made this replacement, though his story works just as well either way.

From my point of view, p\lhd(\mathcal{y}+B) is much more interesting because it’s an exponential:

p\lhd(\mathcal{y}+B)\cong p^{\mathcal{y}^B}

This tells you to consider replacing \mathcal{y}^B with an arbitrary polynomial q. It tells you where to look for the desired map.

If p is a monad, then there’s automatically a map (\mathcal{y}+B)\lhd p\to p\lhd(\mathcal{y}+B), and it’s even a distributive law regardless of the set B and the monad p. So in these cases, I think Owen will be happy for to use p\lhd(\mathcal{y}+B) in the definition of wizard; hopefully he’ll find the idea of exponentiating sufficiently compelling to just allow me to make the replacement for arbitrary (not necessarily monad) p's, in the definition of wizard.

In what follows, I’ll assume that a p-wizard from A to B is a map

\mathcal{y}\to\mathcal{y}^A\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+B)}

or as I’d prefer to write it,

\mathcal{y}\to\mathcal{y}^A\lhd\mathfrak{c}_{p^{\mathcal{y}^B}}.

How to compose p-wizards using \Phi

Suppose given maps

\mathcal{y}\to\mathcal{y}^A\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+B)} \qquad\text{and}\qquad \mathcal{y}\to\mathcal{y}^B\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+C)}

Then by the universal property of products, we have

\mathcal{y}\to(\mathcal{y}^A\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+B)})\times(\mathcal{y}^B\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+C)})

There’s always a “strength” map (p\lhd q)\times r\to p\lhd(q\times r), so we get

\mathcal{y}\to\mathcal{y}^A\lhd\big(\mathfrak{c}_{p\lhd(\mathcal{y}+B)}\times(\mathcal{y}^B\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+C)})\big)

Now we’re ready to bring in \Phi, or more specifically

\Phi_{p\lhd(\mathcal{y}+B),\mathcal{y}^B,p\lhd(\mathcal{y}+C)}\colon \mathfrak{c}_{p\lhd(\mathcal{y}+B)}\times(\mathcal{y}^B\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+C)}) \to \mathfrak{c}_{(p\lhd(\mathcal{y}+B))\times\mathcal{y}^B+p\lhd(\mathcal{y}+C)}

Since p\lhd(\mathcal{y}+B)\cong p^{\mathcal{y}^B}, there is an evaluation map

(p\lhd(\mathcal{y}+B))\times\mathcal{y}^B\to p.

Putting it all together, we have the desired map

\mathcal{y}\to \mathcal{y}^A\lhd\mathfrak{c}_{p+p\lhd(\mathcal{y}+C)}\to \mathcal{y}^A\lhd\mathfrak{c}_{p\lhd(\mathcal{y}+C)}

to serve as the composite.

As an upshot, Owen, you can replace sets A,B:\mathbf{Set} with arbitrary polynomials a,b:\mathbf{Poly}, defining a p-wizard from a to b as a map

\mathcal{y}\to a\lhd\mathfrak{c}_{p^b}\;.

Then the composite of this with a wizard from b to c is given by

\mathcal{y}\to (a\lhd\mathfrak{c}_{p^b})(b\lhd\mathfrak{c}_{p^c})\to a\lhd(\mathfrak{c}_{p^b}(b\lhd\mathfrak{c}_{p^c}))\to a\lhd\mathfrak{c}_{p+p^c}\to a\lhd\mathfrak{c}_{p^c}.

where the first map is strength, the second is \Phi, and the third is the “constant over c” map p\to p^c.

Where \Phi comes from

We have seen that if we have our maps \Phi, then we can perform Owen’s composition operation, basically by evaluating exponentials. So where does \Phi come from?

For any p,q,r:\mathbf{Poly}, we need a map

\Phi_{p,q,r}\colon\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to \mathfrak{c}_{pq+r}

The cofree comonad is constructed as a limit

\mathfrak{c}_p\cong\lim_{i:\mathbb{N}}p^{(i)}

where p^{(0)}\coloneqq\mathcal{y} and p^{(i+1)}\coloneqq\mathcal{y}\times(p\lhd p^{(i)}). The point is, we may assume we have a map

\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to (pq+r)^{(i)}

and show that we can get a map

\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to^? (pq+r)^{(i+1)}

We have a map \mathfrak{c}_p\to\mathcal{y}, so it suffices to find a map

\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to^? (pq+r)\lhd(pq+r)^{(i)}.

Ready?

We start with the map

\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r)\to (p\lhd\mathfrak{c}_p)\times (q\lhd \mathfrak{c}_r)\times (q\lhd \mathfrak{c}_r)

given by unfolding \mathfrak{c}_p once and copying the rest. Then by strength we have

(p\lhd\mathfrak{c}_p)\times (q\lhd \mathfrak{c}_r)\times (q\lhd \mathfrak{c}_r)\to \Big(p\lhd\big(\mathfrak{c}_p\times (q\lhd\mathfrak{c}_r)\big)\Big)\times(q\lhd\mathfrak{c}_r)

By induction, we have

\Big(p\lhd\big(\mathfrak{c}_p\times (q\lhd\mathfrak{c}_r)\big)\Big)\times(q\lhd\mathfrak{c}_r)\to \big(p\lhd(pq+r)^{(i)}\big)\times(q\lhd r^{(i)})

Since there is a map r^{(i)}\to(pq+r)^{(i)}, we have

\big(p\lhd(pq+r)^{(i)}\big)\times(q\lhd r^{(i)})\to pq\lhd(pq+r)^{(i)}\to (pq+r)\lhd(pq+r)^{(i)}

This is the codomain we wanted! So composing all of the above, we get our desired map.

What does it mean? The map

\Phi_{p,q,r}\colon\mathfrak{c}_p\times (q\lhd \mathfrak{c}_r) \to \mathfrak{c}_{pq+r}

on positions takes as input a p-tree T, a position J:q(1) and for every direction j : q[J], a choice of r-tree U_j. It needs to provide a (pq+r)-tree. The root of that tree will be the pair (T.\text{root},J):(pq)(1). Given a direction there, either it is a direction i:p[T.\text{root}] or it is a direction j:q[J]. In the first case, we can take the new tree T.i and start again, keeping the same J, U. In the second case, we say that we have ``escaped out of p and into r‘’ because we never see p again. Instead, we get an r-tree U_j, and we consider it as a pq+r-tree.

For directions, given (T,J,U) and given a direction of the corresponding (pq+r)-tree, either it’s before the escape or after. If it’s before the escape, we get a direction in the p-tree T. If it’s after the escape, we get to know which q-direction j was taken, and we get to know a direction in that r-tree.

It’s interesting to see how, on directions, \Phi forgets all the history that occurred before the escape.

3 Likes

But if you require wizards to do one unit of p-processing, how do you have identities?

Good point.

Then I’d say you’re looking for a map

\Psi_{p,B,q}\colon \mathfrak{c}_{p+B}\times (\mathcal{y}^B\lhd\mathfrak{c}_{q})\to \mathfrak{c}_{p+q}

and there is one. From there you need

(\mathcal{y}^A\lhd\mathfrak{c}_{p+B})\times (\mathcal{y}^B\lhd\mathfrak{c}_{p+C})\to \mathcal{y}^A\lhd\mathfrak{c}_{p+C}

which is just a matter of strength and \Psi:

(\mathcal{y}^A\lhd\mathfrak{c}_{p+B})\times (\mathcal{y}^B\lhd\mathfrak{c}_{p+C}) \\\downarrow\\ \mathcal{y}^A\lhd\big(\mathfrak{c}_{p+B}\times(\mathcal{y}^B\lhd\mathfrak{c}_{p+C})\big) \\\downarrow\\ \mathcal{y}^A\lhd\mathfrak{c}_{p+p+C} \\\downarrow\\ \mathcal{y}^A\lhd\mathfrak{c}_{p+C}

I wish I knew how to better “naturalize” this map \Psi. I can use \Phi from the post to get almost there: for any p,b,q:\mathbf{Poly} a map

\mathfrak{c}_{p+\mathcal{y}+b}\times(\mathcal{y}^b\lhd\mathfrak{c}_q)\to \mathfrak{c}_{p+\mathcal{y}+q}

Do you see what this does? It puts together a tree T that sometimes does p, sometimes “thinks”, and sometimes does b, with a function that takes any position of b and outputs either a direction there or a q-tree. So the tree T is running p's and thinking, and every so often it pops out a b-position. The function triages it: if it knows the answer, it returns it as one round of “thinking” and the p-tree keeps going. If it doesn’t know the answer, it shunts the computation off to the q tree.

This feels similarly to what might happen in the trace also.

I think you misunderstood the question. In this new category of wizards, what is the map y \to y^A \triangleleft \mathfrak{c}_{p \triangleleft (y + A)} that serves as the identity?

But I think that I have an idea now; if we assume p is a monad, then the program can “output the unit effect” and then return A. Does that make sense? How would you write this intuition down?

I don’t think I misunderstood; I followed you and changed from \mathfrak{c}_{p\lhd(\mathcal{y}+A)} to \mathfrak{c}_{A+p}.

As for the case when p\coloneqq t carries a monad, I think my original map works because there’s a distributive law

(\mathcal{y}+A)\lhd t\to t\lhd(\mathcal{y}+A)\cong t^{\mathcal{y}^A}.

Ohhh, I understand now; I was asking for an identity in your framework, you were admitting that your framework doesn’t have identities, and switching to my framework, rather than exhibiting an identity in your framework.

And that makes sense for the monad! OK, now that I understand what you are trying to do, I have to read your previous comment more carefully.

1 Like

I started wondering: why did you choose cofree comonads instead of free monads? Suppose I define a p-wyzard from A to B to be a map

A\to\mathfrak{m}_{p}(B)

where \mathfrak{m}_p is the free monad on p. These programs always terminate, as opposed to wizards, which may not. The p-wyzard’s are again closed under composition, and I believe they again extend to a functor \mathsf{Org}\to\mathsf{Cat}.

Do you prefer p-wizards over p-wyzards? If so, why?

Because you can’t have the trace structure with free monads! I want to be able to loop in a wizard, and it’s hard to loop without the possibility of infinite looping.

I don’t quite see it yet. I think that the “+1” you need, both when working with free monads and with cofree comonads, handles the infinite loop case, and that otherwise, you want to land in \mathfrak{m}_p(B). So I’m not following your explanation of why you prefer cofree over free.

However, I do get the sense that I could define the “exception-handling” trace map

\mathcal{y}^A\lhd\mathfrak{c}_{t\lhd(A+\mathcal{y})}\to\mathcal{y}^A\lhd\mathfrak{c}_{t}

for monads (or more generally, pointed endofunctors) t, much more easily using just \mathbf{Poly} structures than I could if I tried to use the free monad \mathfrak{m}_t in place of \mathfrak{c}_t. So maybe that aligns with your thinking.

Note that the above “exception-handling” map extends to a trace. Indeed, for any B:\mathbf{Set}, if t is a monad (or pointed endofunctor) then so is t\lhd(B+\mathcal{y}). So substituting t\lhd(B+\mathcal{y}) in place of t, we have a map

\mathcal{y}^A\lhd\mathfrak{c}_{t\lhd(B+\mathcal{y})\lhd(A+\mathcal{y})}\to\mathcal{y}^A\lhd\mathfrak{c}_{t\lhd(B+\mathcal{y})}

serving as the trace. In other words, if we think of \mathcal{y}^A\lhd\mathfrak{c}_{t\lhd(B+\mathcal{y})} as the polynomial of t-wizards A\to B, then these should compose and (+) should be a traced monoidal structure.

1 Like

The +1 handles the infinite loop that doesn’t produce any effects, but I want an infinite loop continually producing effects!

1 Like

I didn’t follow 100% the discussion but I get a lot of indexed containers vibes from wizards. Indexed containers are the syntactic counterpart to ‘multivariable polynomials’, ie polynomial functors {\bf Set}^I \to {\bf Set}^J. They’ve been used by Hancock to model the kind of interaction it seems a wizard also is an example of.

1 Like

Thanks for writing this up Owen, and for the discussion, David! I feel like I’ve binged enough Poly videos to get the intuition behind this thread but I could definitely be misunderstanding something.

Owen, you mentioned separately that this is potentially similar to the AlgebraicRewriting work. In that effort and in this thread there are a number of claims that rely on some polynomial p or t being a polynomial monad.

In general though, if we’re thinking about p or t as encoding the interface of a wizard, it probably won’t be a monad right? Does that cause any significant problems, such as in defining a trace map?

Well, we can always take the free monad on p if that’s an issue! But also, from an implementation standpoint, I don’t think it will be a problem. That is, I have an idea about how I will implement this, and it doesn’t depend on p being a monad. The mathematics might require thinking about the free monad on p, but in code we won’t worry about that.

Cool, obviously I’d love to make more concrete implementation plans with you, Micah, and whomever else at some point soon!

Again thinking of installation wizards, am I right in thinking that there are two notions of wizard wiring diagrams that might be useful? The first is a wiring diagram where sequential composition of wizards from A->B and B->C is done by composing polynomials \mathcal{y}^A\lhd\mathfrak{c}_{t\lhd(B+\mathcal{y})} and \mathcal{y}^B\lhd\mathfrak{c}_{t\lhd(C+\mathcal{y})} (can’t say I can write out what that composition is but I assume it’s doable!). And you can use (+) to get traced monoidal diagrams. But then you could also compose the interfaces of wizards, like t_1 and t_2 or whatever, by drawing wiring diagrams like t_1 \otimes t_2 \to t_3 that specify what a composite interface would look like.

Right; you can compose wizards that have the same interface to run in sequence, but you can also “handle” the effects coming out of wizards to change the interface. David and I have explored effects handlers in poly before: Imperative Programming with Poly // The Topos Lab; hopefully something like this will be able to be used with wizards.