Nominal typing vs. structural typing

In type theory, we typically model operations on types like product and sum “structurally”. That is, we define types A \times B and A + B, and operations on those types.

However, in actual programming languages, that is not quite how it works. We can have two types

data T1 = MkT1 Int Float

data T2 = MkT2 Int Float

and these two types are different!!

I’ve been thinking about how to model this formally for a while, because it seems to be an essential part of the day to day of programming that’s neglected by the theory.

The way that I think about this now is that these data declarations are like let bindings. Except, instead of binding a value, they bind a type and constructors and destructors for that type (i.e. a universal property). In later code, i.e. the body of the let binding, we have no way of knowing that T1 and T2 are the same type; we just know that they both satisfy the same universal property. Crucially, we could use different implementations/memory layouts for T1 and T2, and the later code has to be oblivious to this choice, because it only accesses T1 and T2 through the constructors/destructors.

I’m still not fully sure how to think about this categorically, but I’m now aiming at this through the same lens as how I’m looking at let bindings, and I have some idea about how to tackle those through some method like abstract binding trees.