The more I think about T-expressions the more I feel that the dotted (setlike) portion really needs to be recursively decomposable. This means that in, eg, a T-expression like

(a . 1 2 3)

the expression ‘really’ needs to be represented internally as

(a . 1 . 2 . 3 .)

and then the parsing and printing convention, is just to suppress all dots after the first.

This means that a setlike expression breaks down into a pair of an element and a set, and the final set is always the empty set (.)

We could call such element-and-set pairs *unions*, to distinguish them from the more conventional conslike *pairs*. (We could also call a union a *column* and a pair a *row*).

Types would break down as:

atom

– symbol- number

cons

– union (starts with .)

– term (starts with |)

– pair (starts with .)

Under the more strict (and simpler) T-expression semantics I’ve been thinking about recently, there would be no auto-expanding of setlike expressions inside other setlike expressions. This makes finding equivalents for car/cdr much simpler:

first -takes the first element of a union, or simply returns a non-union

rest – takes the second element of a union, or simply returns a non-union

head – takes the first element of a pair

tail – takes the second element of a pair

The tail of a T-expression pair is *always* a set, even in the case of a simple list such as (a b c), which makes parsing them quite different from S-expressions. To get the equivalent of (cdr X) you would have to take (first (tail X)).

It might be useful to define car as (head first) and cdr as (tail first).

A list probably isn’t the best way to store sets internally. An obvious alternative would be a hash or tree structure (say a red/black tree), and it might be useful to have varying kinds of internal representation. But we still need a way to construct and serialise sets, and that’s probably going to need some kind of canonical ordering.

It becomes especially important if we want to have a Prolog or Kanren like language produce a sensible datastructure for its output: it would be nice if all predicates could return a first-class set of results rather than an ‘answer list’ which is almost, but not quite, entirely unlike a set.

Canonical orderings of computed sets is probably a complicated subject. There’s probably no way to optimise for both searching and generating a computed set. In the Prolog/Kanren way, I think the answer list of a predicate would be in a defined order but not necessarily one well-suited to searching. The idea would be to avoid even generating more sets than necessary.