*The following has been informed by numerous conversations, among which some with David Spivak, Nima Mothamed, David Jaz Myers, and Nathaniel Virgo.*

Argmax is my nemesis. It keeps popping up everywhere in my work and yet escapes a structuralist treatment. It has very bad formal properties and thus it’s hard to justify (which means saying argmax is ‘just’ a blah in a blah). That makes me mad!

For now, let’s work in \bf Cat. Consider a functor f:X \to V therein (typicially, f:X \to \R where X is discrete). An object x^* \in X is in the argmax iff fx^* is “terminal in the image of f”, i.e. iff for all x \in X, there exists a (perhaps unique) fx \to fx^*.

If one tries to read this as a universal property, they are going to get all dizzy. The universal quantification ranges over objects of X but the invoked comparison maps are in V. This is weird.

One can frame \argmax as follows: consider the map f:X \to \R as a map of sets, and use it to pullback the order structure on \R onto X. This makes X look like the ‘level sets’ of f since x \leq x' \iff fx \leq fx', and thus in particular x \cong x' \iff fx = fx'. Then \argmax f is the isomorphism class of terminal objects in X.

‘Pulling back’ the order structure can be seen in various way. One is as a literal cartesian lift for the forgetful functor U:\bf Ord \to Set from orders to sets, which turns out to be a fibration. Indeed, the level sets construction is literally the functor \mathsf{lvl} : {\bf Set}/ U \to \bf Ord defining a cleavage for this fibration. Moreover {\bf Set}/U has a neat interpretation as ‘sets with utility’, and thus \mathsf{lvl} is a case of very natural and concrete construction (sending a utility function to its induced preference relation) having a decent universal property (being a right adjoint, being a cartesian lift functor, etc.).

That’s already quite cool! Let’s make this cooler by summoning equipments.

The operation of taking level sets can also be seen as follows: given a functor f:X \to V, one factors it as X \to \mathsf{lvl}f \to V where the first is a bijective-on-objects functor and the second is a fully-faithful functor (think: bijective-on-morphisms functor). Formally, this factorization comes from the a vertical-cartesian factorization system on \bf Cat induced by its fibration of objects \rm Ob: \bf Cat \to Set, which in turn is just an extension to categories of the fibration U above.

One can see this factorization in yet another way, by approaching categories as monads in spans. Recall that functors are tight maps in that equipment, thus in particular squares

where X_0,V_0 are the set of objects of X and V respectively, X_1 and V_1 the spans of morphisms, and f_0 and f_1 are similarly the action of f on objects and maps.

Now by being an equipment, we know this square has to factor along a restriction, thus yielding:

And this is basically the bo/ff factorization of f we had above. In fact it’s exactly the same since the 2-cell I labeled \text{cart} is the cartesian lift of the ‘category structure’ on V along the map of objects f_0.

But we can do something a bit different now. Let’s switch to think about this in the equipment \bf \mathbb Cat = \mathbb Mod(\mathbb Set) of categories, functors, profunctors and transformations. There, we can restrict the identity profunctor on V = (V_0, V_1) along the entire functor f, yielding:

Now V(f,f) is *extra structure on the category X*: it’s a promonad on it, thus the data of new morphisms on X, namely the ones it would have ‘in V’. This seems useful to talk about properties of elements of X with respect to morphisms between their f-images!

Besides, here’s a crucial insight: **\argmax f should really be thought as a predicate on X**. For instance, consider things like \mathrm{softmax} which assign a ‘maximality score’ to all things in a set: this smells a lot like *literally* argmax but seen as a ‘generalized predicate’ X \to [0,1]. And even without resorting to generalization, in a constructive setting one ends up encoding \argmax f as the X-indexed type which assign to each x the type of proofs of its argmaximality, in a classic proposition-as-types move.

So here’s a conjecture (I’m sure duality is gonna bite me): take the following lift in \bf Prof, the bicategory (you could probably swap if for the equipment above but I just don’t know how would that work):

By duality, one can compute this as a Kan extension, specifically (I’m stealing this idea from the nLab):

where we are considering profunctors 1 \nrightarrow X directly as copresheaves over X.

It’s not hard to convince oneself that the above formula corresponds to defining:

Clearly, had we setup the above in the 2-\bf \mathbb Prof (aka \bf \mathbb Mod(\mathbb Rel)), this is exactly what we would have found!

Moreover, in this fashion, we can also describe constrained optimization problems. Say you have P:X \to \bf 2 \hookrightarrow Set describing the predicate which constrains the optimization. This is again a copresheaf 1 \nrightarrow X (in fact a subsingleton), and lifting V(f,f) against it you get:

Notice we can also talk about whether an object x^* \in X ‘is in \argmax f’: this categorifies to asking *how much* it is in \argmax f, and is answered by checking membership categorically, i.e. by mapping from X(x^*,-) in {\bf Psh}(X), which by Yoneda simply returns the value of \argmax_P f at x^*.

When working in \bf \mathbb Mod(\mathbb Rel), the only possible answers are given by the only possible degrees of ‘being’ in \argmax f: either *not at all* (\bot) or *completely* (\top), whereas in other settings one will get different kinds of answers (e.g. in \bf \mathbb Prof, one gets either \varnothing or a set of witnesses that “fx \leq fx^*”, actually the hom-set fx\to fx^*).

Assuming this stuff is right, I can finally say: **argmax is defeated**! …or is it?

Let’s check what happens when we compute \argmax f in the equipment of [0,1]-enriched categories, where ([0, 1], \leq, {\cdot}, \multimap) is the base of enrichment, with a \multimap b =e ^{b\ \dot- a} = \min(e^b/e^a, 1).

Given f:X \to [0,1], we have

Despite the suggestive notation, \int^x fx is actually the maximum of f. If f=e^{-u}, where e^{-(-)} is base change [0,\infty] \to [0,1], one would expect \argmax f to be \mathrm{softmax}\ u, but the normalization is somehow wrong.

So it appears there is a room for another chapter in the battle against \argmax!