Idea
InterTypes plans to have three basic types: sum types, product types, and acsets/array systems. Of these three, sum and product seem more fundamental; acsets/array systems seem comparitively adhoc.
This morning, I had an idea for how to break array systems into a composition of simpler features. Essentially, it came from the remembering that an array of type A could be expressed by the existential type
So why not just separate the two components of this? This is done in programming languages like Scala and OCaml for normal types, e.g. you can have a trait in Scala that looks like:
trait Monoid {
type T
val identity: T
def mul(x1: T, x2: T): T
}
Similarly, you can have a module in OCaml which looks like
module type MONOID = sig
type t
val mempty : t
val mappend: t -> t -> t
end
Of course, in a language with dependent types this is just a dependent record, but as OCaml and Scala show, it’s possible to do this without full dependent types.
In F-ing modules, Rossberg shows how a ML-style module system can be encoded via existential types, and Scala’s DOT calculus gives a similar picture.
In fact, this sort of “existential type” is how the type system of many object-oriented languages like Java work under the hood. An instance of a class is a specific, but unknown type (because it could be a subclass), along with a vtable which provides all of the methods that are defined in the class. This encoding is shown explicitly in the Haskell wiki page on existential types.
The point is that this is quite well-understood and implementable from a type-theoretic standpoint.
So what if we take this theory, and make a restriction. Specifically, to the underlying type theory we add an additional judgment form \Gamma \vdash T \: \mathbf{ftype} (where \mathbf{ftype} stands for “finite type”), along with axioms
and for all n \in \mathbb{N}
Having “finite types” in a language has been explored in the context of high-performance array processing in dex-lang and moreover the array support in SciLean is based off of a similar idea: see this localcharts comment for an example of how this works.
We could also allow finite types to be created from namespaces, which are prefix-free finite sets of lists of strings, which we represent as separated via dots. E.g. stuff like \{ \mathtt{a.b}, \mathtt{c}, \mathtt{p.f} \}. This representation of finite types is useful for finite sets of variables. We could also have finite sets of UUIDs, or of generational indices.
The neat thing about this approach is that different representations of finite sets can live together, because ftype
is a judgment, not a type.
Then we restrict function type creation to the case when T is an ftype, i.e.
We similarly restrict existential types. The end result would then be something like (in pseudo-Scala syntax)
type WGraph[T] {
ftype V
ftype E
def src(e: E): V
def tgt(e: E): V
def weight(e: E): T
}
Making this into a type theory would allow us to flexibly compose types. For instance, we could declare
type WGraphHom[T] {
val dom: WGraph[T]
val codom: WGraph[T]
def apply(v: dom.V): codom.V
def apply(e: dom.E): codom.E
}
Something similar to Scala’s support for refinement types would then allow things like
type WGraphCospan[T] {
val leftfoot: WGraph[T]
val apex: WGraph[T]
val rightfoot: WGraph[T]
val leftinj: WGraphHom[T] {
val dom = leftfoot
val codom = apex
}
val rightinj: WGraphHom[T] {
val dom = rightfoot
val codom = apex
}
}
We could then have additional rules which specified that certain types were ftypes if all of their fields were. For instance, if we had
type Pair[S,T] {
val _1: S
val _2: T
}
then if E
were an ftype
, we could also allow Pair[E,E]
to be an ftype
.
Assuming all of the type theory works out (which I think it does, because this is just a restriction of an existing type theory), it should be straightforward to serialize/deserialize any instance of these types. Namely, the ftype
fields are stored as whatever representation of finite set we want (integers, namespaces, finite sets of UUIDs, etc.), and the def
s are stored as arrays/tries/hashmaps/etc. val
fields are stored as normal.
Extensions
It could be possible to give a category of instances for each type, where the morphisms were “diffs”. The category of diffs for a primitive type like Int
would simply be a codiscrete category, but the category of diffs for a ftype
would record what elements were added and deleted.
Each of the type formation operations could then act on these categories. For instance, a product type would simply be a product of the categories of diffs. A sum type might be the “connected sum” of two categories, which is the coproduct, but also adds a unique morphism between each pair of objects in the opposite sides. So there would be a unique diff from Left(x)
to Right(y)
, but possibly many diffs from Left(x)
to Left(x')
, depending on the diffs from x
to x'
. I’m not totally sure how to interpret the existential types yet; the success of this depends whether or not there is a clean way of doing this.
Along with this, there should also be a notion of migrations between different versions of the same type. For instance, it should be possible to add a new optional field to a record type.
So essentially, I’m hoping that this could form the core type theory of chit.
Another extension is to add laws to the types. The laws could be first-order logical statements in which the only quantification was over ftypes
. Thus, the laws would be decidable.
And of course, it becomes very tempting to add function definitions as well. At a certain point, this just becomes its own programming language if we do this too much, and my hope is for this to not be its own programming language, rather this should be compiled into types in existing programming languages. Which is not incompatible with having function definitions, but ideally restricts the scope of function definitions somewhat.
We could also consider ftype
to stand for finitely presented types, or add a new judgment fptype
. This would add all sorts of complications, but might be possible. The serialization of a finitely presented type would be the list of generators, along with an e-graph expressing the equalities between those generators.
Where to go from here?
I would be very happy if someone picked up on this and fleshed it out, either on the theory side or the implementation side. I can’t focus on this right now, because it is too greenfield and I need to deliver on other projects, but if you are interested in working on this, please contact me.