Combinatorial data structures via finite existential types

Idea

InterTypes plans to have three basic types: sum types, product types, and acsets/array systems. Of these three, sum and product seem more fundamental; acsets/array systems seem comparitively adhoc.

This morning, I had an idea for how to break array systems into a composition of simpler features. Essentially, it came from the remembering that an array of type A could be expressed by the existential type

\sum_{X \in \mathsf{Fin}} A^X

So why not just separate the two components of this? This is done in programming languages like Scala and OCaml for normal types, e.g. you can have a trait in Scala that looks like:

trait Monoid {
type T

val identity: T

def mul(x1: T, x2: T): T
}


Similarly, you can have a module in OCaml which looks like

module type MONOID = sig
type t

val mempty : t

val mappend: t -> t -> t
end


Of course, in a language with dependent types this is just a dependent record, but as OCaml and Scala show, itâ€™s possible to do this without full dependent types.
In F-ing modules, Rossberg shows how a ML-style module system can be encoded via existential types, and Scalaâ€™s DOT calculus gives a similar picture.

In fact, this sort of â€śexistential typeâ€ť is how the type system of many object-oriented languages like Java work under the hood. An instance of a class is a specific, but unknown type (because it could be a subclass), along with a vtable which provides all of the methods that are defined in the class. This encoding is shown explicitly in the Haskell wiki page on existential types.

The point is that this is quite well-understood and implementable from a type-theoretic standpoint.

So what if we take this theory, and make a restriction. Specifically, to the underlying type theory we add an additional judgment form \Gamma \vdash T \: \mathbf{ftype} (where \mathbf{ftype} stands for â€śfinite typeâ€ť), along with axioms

and for all n \in \mathbb{N}

Having â€śfinite typesâ€ť in a language has been explored in the context of high-performance array processing in dex-lang and moreover the array support in SciLean is based off of a similar idea: see this localcharts comment for an example of how this works.

We could also allow finite types to be created from namespaces, which are prefix-free finite sets of lists of strings, which we represent as separated via dots. E.g. stuff like \{ \mathtt{a.b}, \mathtt{c}, \mathtt{p.f} \}. This representation of finite types is useful for finite sets of variables. We could also have finite sets of UUIDs, or of generational indices.

The neat thing about this approach is that different representations of finite sets can live together, because ftype is a judgment, not a type.

Then we restrict function type creation to the case when T is an ftype, i.e.

We similarly restrict existential types. The end result would then be something like (in pseudo-Scala syntax)

type WGraph[T] {
ftype V
ftype E

def src(e: E): V
def tgt(e: E): V
def weight(e: E): T
}


Making this into a type theory would allow us to flexibly compose types. For instance, we could declare

type WGraphHom[T] {
val dom: WGraph[T]
val codom: WGraph[T]

def apply(v: dom.V): codom.V
def apply(e: dom.E): codom.E
}


Something similar to Scalaâ€™s support for refinement types would then allow things like

type WGraphCospan[T] {
val leftfoot: WGraph[T]
val apex: WGraph[T]
val rightfoot: WGraph[T]

val leftinj: WGraphHom[T] {
val dom = leftfoot
val codom = apex
}

val rightinj: WGraphHom[T] {
val dom = rightfoot
val codom = apex
}
}


We could then have additional rules which specified that certain types were ftypes if all of their fields were. For instance, if we had

type Pair[S,T] {
val _1: S
val _2: T
}


then if E were an ftype, we could also allow Pair[E,E] to be an ftype.

Assuming all of the type theory works out (which I think it does, because this is just a restriction of an existing type theory), it should be straightforward to serialize/deserialize any instance of these types. Namely, the ftype fields are stored as whatever representation of finite set we want (integers, namespaces, finite sets of UUIDs, etc.), and the defs are stored as arrays/tries/hashmaps/etc. val fields are stored as normal.

Extensions

It could be possible to give a category of instances for each type, where the morphisms were â€śdiffsâ€ť. The category of diffs for a primitive type like Int would simply be a codiscrete category, but the category of diffs for a ftype would record what elements were added and deleted.

Each of the type formation operations could then act on these categories. For instance, a product type would simply be a product of the categories of diffs. A sum type might be the â€śconnected sumâ€ť of two categories, which is the coproduct, but also adds a unique morphism between each pair of objects in the opposite sides. So there would be a unique diff from Left(x) to Right(y), but possibly many diffs from Left(x) to Left(x'), depending on the diffs from x to x'. Iâ€™m not totally sure how to interpret the existential types yet; the success of this depends whether or not there is a clean way of doing this.

Along with this, there should also be a notion of migrations between different versions of the same type. For instance, it should be possible to add a new optional field to a record type.

So essentially, Iâ€™m hoping that this could form the core type theory of chit.

Another extension is to add laws to the types. The laws could be first-order logical statements in which the only quantification was over ftypes. Thus, the laws would be decidable.

And of course, it becomes very tempting to add function definitions as well. At a certain point, this just becomes its own programming language if we do this too much, and my hope is for this to not be its own programming language, rather this should be compiled into types in existing programming languages. Which is not incompatible with having function definitions, but ideally restricts the scope of function definitions somewhat.

We could also consider ftype to stand for finitely presented types, or add a new judgment fptype. This would add all sorts of complications, but might be possible. The serialization of a finitely presented type would be the list of generators, along with an e-graph expressing the equalities between those generators.

Where to go from here?

I would be very happy if someone picked up on this and fleshed it out, either on the theory side or the implementation side. I canâ€™t focus on this right now, because it is too greenfield and I need to deliver on other projects, but if you are interested in working on this, please contact me.

1 Like

With some judicious use of the subtyping and path types used in the DOT calculus, it might also be possible to recover some of the features of nested acsets. Consider:

type Ports {
ftype In
ftype Out
}

type HasSrcTgt[T <: Ports] {
val src: T
val tgt: T

val srcPort: src.Out
val tgtPort: tgt.In
}

type DWD {
ftype Box <: Ports
ftype Wire <: SrcTgt[Box]
}


Also, from the F-ing modules paper:

My idea is essentially to have â€ślarge, small, and very small typesâ€ť!

Another extension of this: sorted finite types, with affordances for comparing two elements, getting the elements in sorted order, making a new element that is â€ślarger than everything elseâ€ť, â€śsmaller than everything elseâ€ť, â€ścomes right before element xâ€ť, â€ścomes right after element xâ€ť. This would be implemented by assigning each element an integer key, but those integer keys could be reallocated opaquely.

For instance if the keys were â€ś2,4â€ť, and I wanted to put an element before 4, then I could just put it at 3.

But if the keys were â€ś3,4â€ť, and I wanted to put an element before 4, then I could â€śgarbage collectâ€ť, remake the keys to be â€ś0,10â€ť and put the new element at 5. With sufficient spacing, this garbage collection should happen rarely, so it should be efficient.

Simpler alternative to previous wiring diagram schema:

type HasPorts {
ftype In
ftype Out
}

type DWD {
ftype Box <: HasPorts
ftype Wire

type OutPort {
val box: Box
val port: box.Out
}

type InPort {
val box: Box
val port: box.In
}

def src(w: Wire): OutPort
def tgt(w: Wire): InPort
}


Variation with outer ports:

type HasPorts {
ftype In
ftype Out
}

type DWD <: HasPorts {
ftype Box <: HasPorts
ftype Wire

enum SrcPort {
case Inner(box: Box, port: box.Out)
case Outer(port: In)
}

enum TgtPort {
case Inner(box: Box, port: box.In)
case Outer(port: Out)
}

def src(w: Wire): SrcPort
def tgt(w: Wire): TgtPort
}


Putting things in the boxes via refinement types.

type FilledDWD[T <: HasPorts] <: HasPorts {
val dwd: DWD
type In = dwd.In
type Out = dwd.Out

def content(b: dwd.Box): T { ftype In = b.In, ftype Out = b.Out }
}

enum NestedDWD[T <: HasPorts] <: HasPorts {
case Primitive(value: T) {
ftype In = value.In
ftype Out = value.Out
}
case Nested(filled: FilledDWD[NestedDWD[T]]) {
ftype In = filled.In
ftype Out = filled.Out
}
}


A putative â€śJulia-likeâ€ť syntax for this:

  @record FancyGraph{T} begin
name::Symbol

V1::TABLE
V2::TABLE
E::TABLE

label(v::V1)::Point{T}
label(v::V2)::I64

src(e::E)::V1
tgt(e::E)::Either{V1, V2}
weight(e::E)::Map{I64, I64}
end

@record FancyGraphHom{T} begin
dom::FancyGraph{T}
codom::FancyGraph{T}

apply(v::dom.V1)::codom.V1
apply(v::dom.V2)::codom.V2
apply(v::dom.E)::codom.E
end

@record FancyGraphCospan{T} begin
leftfoot::FancyGraph{T}
apex::FancyGraph{T}
rightfoot::FancyGraph{T}

leftinj::FancyGraphHom{T} where {
dom = leftfoot
codom = apex
}

rightinj::FancyGraphHom where {
dom = rightfoot
codom = apex
}
end

@record Ports begin
In::TABLE
Out::TABLE
end

@record InPort{T<:Ports} begin
val::T
port::val.In
end

@record OutPort{T<:Ports} begin
val::T
port::val.Out
end

@record DWD <: Ports begin
Box::TABLE{Ports}
Wire::TABLE

src(w::Wire)::OutPort{Box}
tgt(w::Wire)::InPort{Box}
end