In type theory, we typically model operations on types like product and sum “structurally”. That is, we define types A × B and A + B, and operations on those types.
However, in actual programming languages, that is not quite how it works. We can have two types
data T1 = MkT1 Int Float
data T2 = MkT2 Int Float
and these two types are different!!
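To make this concrete: T1 and T2 have identical fields, yet the typechecker identifies them by name rather than by structure, so getting from one to the other takes explicit code. (The function below is just a hypothetical illustration.)

convert :: T1 -> T2
convert (MkT1 i f) = MkT2 i f

-- whereas `bad = MkT1 3 4.5 :: T2` is simply a type error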
I’ve been thinking about how to model this formally for a while, because it seems to be an essential part of the day-to-day practice of programming that’s neglected by the theory.
The way that I think about this now is that these data declarations are like let bindings. Except, instead of binding a value, they bind a type together with constructors and destructors for that type (i.e. a universal property). In later code, i.e. the body of the let binding, we have no way of knowing that T1 and T2 are the same type; we just know that they both satisfy the same universal property. Crucially, we could use different implementations/memory layouts for T1 and T2, and the later code has to be oblivious to this choice, because it only accesses T1 and T2 through the constructors/destructors.
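One way to make this picture concrete is a Church-style encoding in Haskell itself: a rank-2 function that binds an abstract type together with its constructor and destructor, then runs the “body of the let” against them. This is only a sketch of the idea, and the names (TBody, withT1, withT2) are hypothetical; note that the two implementations below use different representations, yet the body cannot tell them apart.

{-# LANGUAGE RankNTypes #-}

-- The body receives an abstract type t, a constructor, and a destructor;
-- it can use them, but it cannot inspect the representation of t.
type TBody a =
  forall t.
    (Int -> Float -> t)                          -- constructor
    -> (forall r. t -> (Int -> Float -> r) -> r) -- destructor
    -> a

-- Two "declarations" of the same universal property, with different layouts:
withT1 :: TBody a -> a
withT1 body = body (,) (\(i, f) k -> k i f)        -- t = (Int, Float)

withT2 :: TBody a -> a
withT2 body = body (flip (,)) (\(f, i) k -> k i f) -- t = (Float, Int)

-- A body that works with either binding, since it only uses the interface:
total :: Float
total = withT2 (\mk elim -> elim (mk 3 4.5) (\i f -> fromIntegral i + f))

Swapping withT1 for withT2 in total changes the memory layout but not the result, which is exactly the obliviousness the let-binding view predicts.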
I’m still not fully sure how to think about this categorically, but I’m now looking at it through the same lens as let bindings, and I have some ideas about how to tackle those via something like abstract binding trees.