## Logical Relations

April 19, 2017

I find myself still turning back to logic as a guide to the meaning of set-relation structures. Starting with term-expressions:

() asserts empty truth or completeness; it usually means the end of a statement

(A . B) might represent a logical statement, with A the predicate and B the subject, but this interpretation is too context-specific. Instead we can say that (A . B) is a logical phrase, a unit much smaller than a statement: in some space of meaning, A asserts B.

(| . X) however is a logical statement: a semantic unit rooted in a definite namespace.

(|) is empty falsehood or the empty set; if a statement ends in this, the statement is discarded.

(| and A . B) asserts both phrases A and B in the current space

(| all . X) asserts all phrases in X, ie (| all A . B) = (| and A | all B)

(| all) = (| all . ()) = (|)  = (| and (|) | and (|)…)

which suggests that perhaps () = ((|)), ie empty truth is an infinite sequence of empty falsehood. Is this stable? It might be! It would be the ‘sequence-ness’ that acts as the structuring element…

## Unifying NIL and NULL

April 6, 2017

I think I can unify the two nulls in T-expressions: NIL () and the more dubious NULL (.), which is a big win. However this is only the case if I abandon the idea of T-expressions being unions, and move back one step to a more direct representation of sets and relations. Thaf’s perhaps a necessary and more rigorous set-theoretic foundation though.

As a simple set literal, composed of sets, pairs, atoms and NIL, we can get relations like this:

(a) = (a . ())

(a . () (b) c) = union of (a) (a b) (a . c)

So both sets and pairs (if they have the same CAR) can be unioned, but they remain different entities.

And nested sets don’t get deconstructed:

(a . (. b)) not = (a . b)

The CDR of a pair is always a set,

This seems natural and sensible, though it does not map to the semantics of either dictionaries or binary relations, in the sense that a set containing NIL or atoms cannot be expressed as a relation.

We can do a ‘lookup’ operation  (a parameterised CDR), but it can only be defined over pairs within a set, not atomic values. We would need an ‘exists’ to test atomic values.

There’s an annoying kind of ‘handedness’ to these pairs, where the left side must be unique within a set and the right side must be a set, but it’s been there in T-expressions all along, and comes from the S-expression list format where we have to pick left or right associstion. (And from my strong desire to avoid duplicating data). It seems worth it to get the simplicity of lists though.

Although we win by uniting the two nulls, we also lose by separating sets and pairs; a pair is no longer a set-of-one-pair or a one-row table. It feels closer to the ‘bare metal’ of the mathematics though.

## Mail As Publish-Subscribe

January 12, 2017

The general unifying vision I keep angling toward is a distributed data publish-subscribe-compute network roughly like Ted Nelson’s Xanadu. (With the strong caveat that I’m very concerned about the privacy danger of distributed computation: any practical distributed-computation fabric must have the ability to restrict computation to trusted locales in that network).

For the primitive data element, I want something like S-expressions extended to allow relations and data types: that’s why I’m interested in something like term-expressions as a way of extending S-expressions in an unambiguous, user-extendable way.

For computation, I want something pure-functional; preferably based on something like Lisp’s fexprs, so that any generalised expression can be computed and a generalised relational data-representation language, something like Prolog, built on top. This is why I’ve been chasing the idea of variable-free relational programming for many years, mostly unsuccessfully; it feels like a way of building relations out of function-like operations. One of the stumbling blocks to me – other than the issues around multivalued logic, the handling of change and negation in logic programming, remains the question of just what a logic variable and the unique ‘unknown’ value associated with one, might represent in a functional paradigm. They appear to be concepts which simply (mostly) don’t map to the functional world. (Except for Scheme’s concept of variable location; which exists only because of mutation, which is not a pure-functional concept.) This seems very odd, since functional programming derived from logic. Presumably this is because it’s a subset of logic.

(I feel strongly against ‘types’ as they exist in current functional languages; for a number of reasons, but largely that they restrict expressiveness and cannot possibly be a fundamental language element. How could you describe a type annotation, except as a smaller, more primitive assertion in a more fundamental data-representation language? Then give me that more fundamental language and we can begin to talk seriously about all the data I need to represent – not just the computations I wish to perform on that data, which is a much smaller set. What is the type of a file, for example? What’s the type of a JSON node? What if your entire filesystem and memory storage was made of something like JSON nodes, and that was all there was? What could the ‘type’ of a node possibly mean then, other than another node attached to it – as a Lua metatable is, perhaps? So give me the ‘attachment’ mechanism you would use to implement your ‘types’; remember that you will be me, and I may be you; the roles of ‘user’, ‘programmer’ and ‘language designer’ are fluid and will change over time).

For I/O, I want something like functional reactive programming, so that we preserve pure-functionality and the UI can be simply described as a function over a stream of input and output events. I’m still waiting for FRP concepts to stabilize though. I like Elm as a proof of concept, but I dislike its typefulness for all the reasons above.

For the network component, I want something like content-centric networking, where data packets or blocks are coded uniquely and cached persistently for as long as possible, moving through the network to wherever demand is. With connections flowing between them as changes occurred.

So I’m wondering how something like email – or ‘message passing’ generally – might appear in a publish-subscribe format. In pub-sub or FRP, we would not really think in terms of ‘message sending’: rather, we’d make small local changes or assertions to a local object or relation or data entity. Then, should others choose to observe us, they would pick up our changes, but only if they were observing.

For a lot of purposes – think Facebook or Twitter – what we want to do isn’t send point-to-point messages. We very often want to broadcast updates or posts. We’d like to keep then the ability to smoothly change between broadcasts and narrowcasts.

One way might be: we create outgoing mailboxes. Imagine we have a folder structure to our sea-of-nodes. So Alice’s public feed might have  \alice\public\out\bob\  .. and Bob would have an \bob\private\in\alice\  (Handwaving over just what public and private might mean in a distributed network: let’s say only public gets automatically exported out of a local LAN-like physical network, via something like a high-level firewall). And Bob’s definition of \in\alice is something like a function: ‘include \alice\public\out\bob’ – and anything then that Alice creates under \out\bob gets automatically transcluded (in the Ted Nelson sense) into that folder.

How does the information get from Alice to Bob? It seems like there must be an intermediary, a routing node where he could find \router\alice\ . And that’s maybe where we would need to reinvent 50+ years of TCP/IP and SMTP routing theory, since there would have to be some kind of overlapping hierarchy/heterarchy of such routing nodes. And we would need to deal with how to send connect/disconnect messages between them, how to keepalive and restore broken connections and garbage-collect dropped ones.

A bigger question though: how does Alice ‘push’ to Bob, assuming this is something we want? (We don’t always; that’s the spam problem. The cool thing about Facebook/Twitter is it’s mostly NOT push but pub-sub). How might Bob discover Alice for the first time? He hasn’t subscribed to her by creating a function. How does he know she wants to connect? Perhaps we need another channel for initial notifications: \bob\connect\in\… , \alice\connect\out…  and they both subscribe to their local router, which subscribes to its upstream routers… and…

This is where I start to get lost. There must be a simple way of thinking about the bootstrap problem between routers, and also the problem of how to only request the information for which active subscriptions exist. Do we need at least the primitive concept of ‘send a subscription / notification / poll message’, or can we even reduce those to pure-functional observations?

It seems like we ought to be able to: after all, the lowest level of a network is a wire (or radio wave) between two devices. No device ever actually sends a message: they just put a formatted packet on the wire and pull off all packets with their address on it, ignoring the others. So what might the lowest ‘subscription message’ look like if it’s not a message but a sort of broadcast?

Ie at the low level you might /ethernet/packet which is the current packet addressed to this machine…. and then something like an ARP table as /machines/ips/192/0/0/1… but it’s what might go above that that gets tricky.

## Everything In Language Is A Noun

January 12, 2017

An odd little idea which may or may not have some merit:

What happens if we think of every kind of natural language utterance and its component parts of speech as, essentially, a noun?

This is basically taking the object-oriented view of information, and pushing it to see how far it will go. After all, we already describe language utterances with nouns: book, chapter, scene, paragraph, sentence, subject, verb, word, syllable, phoneme. These nouns describe chunks of speech (text, structured information). If we are thinking – as I am – in a Prolog sort of way, where our entire universe (with not much of a distinction between ‘real’ or ‘imaginary’ inside a simulated world) is literally made up of relations.. which are somewhat object-y… Does it help us at all if we think of the less traditionally-nouny parts of speech (verb, preposition, adjective, adverb…) as also nouns that simply happen to be related to other nouns, forming clusters?

I think perhaps it does!

At least if we look at English, we see that many of the words we consider verbs or adjectives also function as nouns. Also, we often communicate in a kind of telegraphic code: short, abbreviated chunks of speech which aren’t whole sentences yet still convey information. Very often the information we are conveying is noun-like. Eg:

“run” -> the act of running. A verb, but it’s also an abstract concept, and a noun. Someone or something is running? A continued series of events? The performance of a programmed action? A command to a human to run, or to a computer to start a program?

“will run” -> not a sentence; a fragment of speech; an action, perhaps; but it could be valid in response to a question: “Will she or won’t she run?”. “Will” by itself is, unsurprisingly, a noun: intention, violition. It’s only when associated with another noun that we see a combination forming. You could still say it’s  a nounlike cluster: “the will to run”. Just who is doing the willing is not yet clear; it may be generally “the universe”, or it may be a person who possesses the intention. It’s a function of human language to require context for full interpretation, and to have the back-and-forth of dialog to clarify missing information.

“she will run” -> attaching the pronoun “she” clarifies that this is a female personlike or animallike entity (not necessarily human; even the idea of “person” overlaps both the categories of “human” and “animal” in human thought).

Now we have a fully-functioning sentence: though it still requires a lot in the way of context to be fully grounded. You could say that “will” is the verb, or the primary idea of the sentence, but this is where nounlike thinking leads us in interesting directions: we have three candidates for nouns and they’re all valid. Is the main idea of the sentence that “she” has some attribute (the will to run)? That “will” is the main idea? That a “run” is the main idea? All of them, I think, are sensible and useful handles.

The more interesting part is that we’re looking at a thing, a kind of clustered object made of other, similar objects. And that human language is not actually as fixed in its categories of grammar as we often imagine. We infer a lot from context; there are always multiple interpretations of any utterance. What seems sensible to me is that there really are no fixed rules; just a lot of inferences. If there are rules, they must be very basic, something like sequence or closeness: passage of time, orientation in space, nested containers. The rest is learned.

We can extend our sentence:

“she will run tomorrow”; “tomorrow she will run” -> adding the idea of “tomorrow”, which is of course a noun; the day after today. But we can add it in various places. It still sits there, as an associated idea in a cluster of ideas.

Now imagine how we get from here to paragraphs (sequences of sentences); dialogues; poems; chapters; the whole idea of breaking arguments into steps, stories into scenes, genres, etc. Imagine that we didn’t know anything more about “the grammar of a sentence” or “parts of speech” and were looking at these higher-order clusters of… language-stuff. What “grammar rules” might we come up with? And what might those rules be composed from? When we think in a metaphor of “being composed from” anything, aren’t we invoking … nouns again, as a fairly basic idea?

For a structured grammar based on nested sequences (eg, S-expressions) it’s useful to think of a “head” of a clause, with a strict hierarchy of subcomponents. But human utterances seem like they can break down in several ways; it seems possible to me that we think of them as mental “things” comprised of loosely attached clusters of “stuff”; we recognise what sub-noun is the most important one, perhaps, by grouping it into a container of some kind. That’s if our mental architecture works on anything like our physical intuition of space and containment – but we do have that intuition, and a lot of our language metaphors seem to based on it.

Also think of the single word:

“run”

It might be a command. What makes it a command? Most likely, simply the context of the listener. If the listener is looking for something that resembles “an action”, they have the option to perform that action.

I think there’s some potential here. If designing a natural language interface, try thinking of the parts of speech as nouns. Think of combinations of them as nouns. After the parsing step, you are going to structure them into noun-like or object-like entities anyway. Why not save yourself some trouble and start by picturing them as nouns right away? Also, if you’re communicating any kind of interesting idea, you’re going to want to structure it into chunks. What are those chunks except nouns? Finally, you will certainly need to reify verbs so you can describe the act of verbing. Much better, and simpler, to not have to reify at all: just assume that a verb is a noun: an action, a transformation, a temporal relationship.

The most important idea then is that nouns can be clustered or contained; that nouns can be modified or inflected by others; that they can form structures or classifications based on some form of relative-linking.

This post could be thought of as a rejection and inversion of the classic Java rant “Execution in the Kingdom of Nouns“. I dislike Java, but one of the problems I have with it – and most formulations of OOP – isn’t that it’s noun-based! It’s that it embeds unexamined and unalterable primitive ideas other than nouns (methods, classes, access modifiers). There are so many very simple ideas – such as, eg, ‘I have a meeting at 3pm’ – you simply can’t express easily in a class-based OOP language. (You can probably express that in JSON; you’ll run into trouble though with anything much more complex because of the fact a JSON object is not a map, and it’s certainly not a relation.)

But also: a simple idea should be simple. You really don’t need the machinery of function evaluation and class hierarchies and a simulated operating system and virtual machine to express, eg, “here is a list of appointments I have today”. And so you shouldn’t need to break out of your ‘programming language’ into something like JSON, XML, CSV, flat text, to express that. The point of a language, after all, is to link concepts together. If you have to switch languages to express a different (let alone) simpler concept, you’ve kind of missed the point of literacy. You could certainly have multiple vocabularies and libraries (nouns again); but those are not separate languages.

One the ideas I keep working toward is this: everything inside a computer should be made of the same fundamental ‘stuff’. If you’re going to have stuff, an noun-ish thing seems like the thing to make stuff out of.

January 8, 2017

In short: using the standard deductive rules of logic – and the crucial ability to name ‘this sentence’ as a logical element – you can deduce any X from ‘if this sentence implies X, then X’. This is obviously wrong and unhelpful. But what part of standard logic does it imply is broken?

1. Any human will immediately reject such a silly self-referential sentence as a false and distracting toy problem. It’s intuitively obvious that the truth of something outside of a logical sentence (ie, X) does not depend on that sentence – there’s no causal connection, and logic is about causal connections. However we don’t have any obvious way to package this intuition in standard logic.

I think this idea of causal connection is what relevance logic tried to package up, but I still don’t have a way in to understanding relevance logic.

I’m more interested right now in the elimination of disjunction introduction, because it seems like a mostly unexplored area in mechanical logic that’s nevertheless lit up with big flashing intuitive red lights. Eg: in human logic, ‘if X then Y’ almost never means the standard logical ‘material implication‘ (either/both Y or not-X). Rather, I believe in human logic the truth of an implication (and by extension, a disjunction) is a piece of data entirely separate from a truth-function over the current state of the logical world.

A human will look a Curry Paradox statement, say ‘yes, X may or may not be true but that has no bearing on the truth of this sentence. And so I judge this sentence to be false because it does not correspond to outside reality.’ So we naively understand something that the material implication rule doesn’t.

One reason the Prolog prover (Robinson resolution) is so powerful and yet mostly paradox-free is exactly because it disallows such mechanical inference of disjunctions during proofs. It seems though, that it would be much better to raise the principle that ‘you can’t infer a disjunction without cause’ to a higher level than just ‘a proof procedure’. It really seems a very basic kind of requirement for any mechanical logic that wants to resemble human thought.

2. Despite this gap between material implication vs human-thought implication, the Curry Paradox still occurs if we reformulate it more clearly in English as a disjunction: ‘either X is true or (this sentence) is false’. But with the strong human-language version of ‘if’ out of the way it becomes clearer that it’s the old problem of non-terminating recursion, which seems to bottom out to the case, again, that Boolean logic is incomplete: we need at least a value for ‘null’ as well as ‘true’ and ‘false’.

So the problem then becomes: what model of three-or-more-valued-logic is useful? Boolean logic is simple and closed! But it’s just a bit too simple.The recursive form of the Curry Paradox, I think, comes down to the Boolean term:

‘X or not (X or not (X or not (…’

And if X is false then it comes down to

(not (not (not…

Which is very easy to see – just like the mathematical expression

-1 * -1- * -1…

is non-terminating, so should probably be short-circuited to ‘null’.

But does a logical null give us problems? It gives us all sorts of problems in SQL, for instance. Is SQL simply doing it wrong, or is there a deeper problem?

3. Is a Curry Paradox sentence any sillier, really, than either ‘for all X, true implies X’ or simply ‘for all X, X’?

Either of those, as a logical statement, would be true by definition. It would be ‘false’ by reference to the outside world, but it would still be ‘a true statement’ because we are asserting it, and all assertions are taken as truth. ‘for all X, X’ is a consistent statement, and logic at best can only prove consistency not truth. So… if you don’t want a vacuous infinite truth generator in your database, maybe don’t put it into the database in the first place?

Or is that applying something too much like common sense to the domain of mechanical logic?

Looking at infinite arithmetic for an inspiration for multi-valued logic gives us:

True = +infinity
False = -infinity
Neither = 0
Both = NaN (eg, a set spanning the entire number line)

The logical operators, perhaps, then become:

And = Min(A,B) – so ‘+ and – and 0 and NaN’ = –
Or = Max (A,B) – s0 ‘+ and – and 0 and NaN’ = +

+ becomes something like ‘repeated assertion’, so True + False = +infinity+-infinity = NaN
* becomes something like ‘negation as an operator’, so False * False = True = -infinity * +infinity

So it seems like we should be able to drop Both from logical statements using And/Or, and NaN from computations using Max and Min operators. A valid question seems: where does NaN – a set spanning the number line – sort on that line? Is it less than -infinity and greater than +infinity? I’d say not myself. But for the purposes of Max/Min (and/or) it is both less than and greater than 0. I’d guess not equal to any other value though. Is it equal to itself? A good question. SQL Null isn’t. How much do we lose if we allow NaN to equal NaN and Both to equal Both? And are either Neither or Both equal to SQL Null?

An interesting random note about the three-valued logic of Jan Łukasiewicz (p4) shows that it comes to different conclusions for the ‘implication’ operator than using infinite arithmetic as a base: he decided that ‘A imp B’ should be True, not None, if A and B were both None.

He isn’t exactly clear about why he makes this choice. He says in 1930 that \the desired equations I obtained on the basis of detailed considerations, which were more or less plausible to me”. But it isn’t easy to find explicit considerations (though I’ve only done limited homework). So what’s going on?

Well, note that in the alternative table with I’ in the central position in the matrix, there will be no tautologies in the logic at all (meaning now a proposition that always takes the value T, whether its atomic components take T, F or indeed I). For on that alternative table, complex propositions built up entirely from constituents taking the value’ I will always take the `value’ I. This, perhaps, Lukasiewicz took as step too far away from classical ideas

If we use infinite arithmetic as a base, we can define material implication in the usual way: B or (not A), which comes out as Max(B, -A). If A and B are both Indeterminate/None, which takes a numeric value of 0, then Max (0,0) will obviously be 0. So arithmetically speaking, (I imp I) = I = Max(0,0) = 0, however surprising that might be for classical logic. This points us clearly away from Łukasiewicz, and towards the semantics of Kleene/Priest logic.

## T-expressions: Pairs and Unions

October 27, 2016

The more I think about T-expressions the more I feel that the dotted (setlike) portion really needs to be recursively decomposable. This means that in, eg, a T-expression like

(a . 1 2 3)

the expression ‘really’ needs to be represented internally as

(a . 1 . 2 . 3 .)

and then the parsing and printing convention, is just to suppress all dots after the first.

This means that a setlike expression breaks down into a pair of an element and a set, and the final set is always the empty set (.)

We could call such element-and-set pairs unions, to distinguish them from the more conventional conslike pairs. (We could also call a union a column and a pair a row).

Types would break down as:

atom
– symbol- number
cons
– union (starts with .)
– term (starts with |)
– pair (starts with .)

Under the more strict (and simpler) T-expression semantics I’ve been thinking about recently, there would be no auto-expanding of setlike expressions inside other setlike expressions. This makes finding equivalents for car/cdr much simpler:

first -takes the first element of a union, or simply returns a non-union
rest – takes the second element of a union, or simply returns a non-union
head – takes the first element of a pair
tail – takes the second element of a pair

The tail of a T-expression pair is always a set, even in the case of a simple list such as (a b c),  which makes parsing them quite different from S-expressions. To get the equivalent of (cdr X) you would have to take (first (tail X)).

It might be useful to define car as (head first) and cdr as (tail first).

A list probably isn’t the best way to store sets internally. An obvious alternative would be a hash or tree structure (say a red/black tree), and it might be useful to have varying kinds of internal representation. But we still need a way to construct and serialise sets, and that’s probably going to need some kind of canonical ordering.

It becomes especially important if we want to have a Prolog or Kanren like language produce a sensible datastructure for its output: it would be nice if all predicates could return a first-class set of results rather than an ‘answer list’ which is almost, but not quite, entirely unlike a set.

Canonical orderings of computed sets is probably a complicated subject. There’s probably no way to optimise for both searching and generating a computed set. In the Prolog/Kanren way, I think the answer list of a predicate would be in a defined order but not necessarily one well-suited to searching. The idea would be to avoid even generating more sets than necessary.

## The Primitiveness of Primitives

September 26, 2016

It’s generally understood, I think, at least in theory if not in practice, that complex systems are best constructed out of a small set of universal primitives which have a balance between expressive power,  and enough restriction to make them able to be reasoned about. It’s usually a good thing if the primitives can then be recursively composed into high-level components or operations that allow higher-level reasoning. (Ie, stacked inside each other at arbitrary depth: as, for example, both ‘data types’ and ‘functions’ can within a language, even one as reasonably primitive as C).

On the planet-spanning Internet in 2016, you might think that we had evolved such a sufficiently powerful and flexible layer of software composition. Rather incredibly, after nearly 40 years, we find that we really only have two sets of universally agreed on primitives to connect all our software. And neither of them are recursively composable.

1. Within a computer:
2. Write a machine word to RAM
3. Execute raw machine code at a RAM address
4. Call an operating system function using an API based on (1,2,3) plus, sometimes, ‘machine interrupt’.
2. Between two computers:
1. Send a TCP/IP packet (of bytes, not machine words) to (IP address, port)
2. Receive and process a TCP/IP packet on a process listening on a port

There’s a lot more, of course. Vast stacks and layers of APIs and abstractions. But there’s not much in the way of universally agreed on layers above these two.

Eg, we have HTTP: GET a document at a DNS address, POST a reply, maybe (if you’re lucky) PUT a document (but you’re more likely to have to use some not-especially-well-adopted extension like WebDAV, CalDAV, etc).

You can’t, though, use HTTP to abstract all of TCP/IP. You’ll have to use a whole patchwork of different protocols to get different results. SSL/TLS for encryption may or may not be available for every protocol. You can encapsulate things, badly, in various VPN and tunnelling protocols that are extremely brittle and fragile. You can’t even abstract all of IP with TCP. IP packets remain the only universal lingua franca between computers. (And sometimes not even that – eg Bluetooth, NFC, and all the other Layer 2 protocols – but we’re being optimistic here).

Inside a computer it’s far, far worse. We have such a Babel of languages and binary formats that put data in a machine address and execute code at a machine address remain the only universal connections between our language stacks – and it’s pretty obvious, looking at these, why we have such a security disaster: because executing raw machine code truly is Turing-complete in all its horror. It’s like a society built on sending messages by injecting yourself with used syringes. Once you execute raw machine code – unless you’re 100% certain you hand-loaded it and triple-checked it with mathematical proofs – you’ve pretty much lost any security guarantees you ever had.

It would seem obvious that these two devices (send an IP packet, and execute machine code at an address) are incredibly poor as universal abstractions! Neither of them actually correlate to what’s taught as the ironclad foundation of computing in computer science classes: function evaluation. Sending an IP packet *might*, roughly, correlate to an ‘object call’, except that it’s incredibly coarse-grained (machine, port), there’s no notion of object identity, creation or destruction, and there’s no mechanism for marshalling or referencing other objects.

We do have, of course, protocols for ‘remote procedure call’. Many of them. RPC, SOAP, CORBA, DCOM, etc. They are not, however, universally used: there’s no agreement on which is appropriate when, and the answer very often is ‘none of these are appropriate’. Their semantics are complex and all subtly different, which seems kind of odd since ‘function application’ is something you’d think would be mathematically fundamental, simple, and unchanging.

What, I wonder, might be the simplest possible safe and composable primitive for communicating functions, objects or processes (preferably not making any kind of type distinction) both within a machine process and to arbitrary remote machines?

By ‘safe’ I mean: one which does not (like ‘execute machine code at this address’ does) open an entire machine state to manipulation by remote actors, but allows only strictly bounded amounts of 1) machine state change and 2) usage of transmit/compute/storage resources?

By ‘composable’ I mean: a single primitive type that can be user-defined and stacked inside itself, using controlled recursion, to create arbitrarily complex structures or operations.

You’d think this would have been the #1 priority to solve back when the Internet went live, and yet, here we are decades later, and it hasn’t been, and the Internet’s on fire.

A generous evaluation as to why this is: the design goals of the Internet (and, well, Unix and C, because that’s what drove most of it) never even remotely included ‘safety’ or even ‘composability’, and certainly not ‘privacy against hostile nation-states and organised crime’. It was simply assumed that all Internet users would be sufficiently smart and sufficiently careful. What the developers of the Internet, C and Unix wanted – and got – was simplicity of implementation, speed of execution and absolute freedom for the endpoints. Period. Send a message, read/write RAM, execute code. Everything else is the programmer’s problem.

The problem is that we now know, that’s not enough. We need a communication/computation fabric which can make some basic guarantees (some of which are potentially in conflict): that you can send a message that’s not been overheard or altered in transit, that you can identify the source of a message and refuse receipt of irrelevant or fraudulent messages, and that you can limit the resources a message uses or the changes it makes to your state. On the other hand, we don’t really want another OSI model. But can’t we at least get something like ‘function call’? With something just a little more structured than ‘raw array of bytes’ and safer than ‘C pointer’ as the universal message type?

## Negation as a Type

September 26, 2016

If the expression

(|not x)

were something like an instance of a datatype, what might ‘not x’  mean as a value? How would one use it?

I imagine there’s not much you could do directly with such a datatype other than 1) add it to a collection (such as |all or |only), 2) search that collection to decide if it asserts or does not assert a value (and obtain a collection of those values), and 3) extract the ‘positive’ or ‘absolute’ value from it (ie the statement which is being negated).

It seems odd, though, that the idea of such a ‘negation type’ doesn’t seem to exist as such in the usual literature. It seems like an obvious generalisation of basic arithmetic to logic. Perhaps it exists somewhere in the literature of modern (computing) type theory? But which type theory, I wonder?

## Delta terms and functors

September 22, 2016

We can construct a workable dataset delta – a set of logical propositions showing change in a set of logical propositions – by just adding a (|not) to our existing (|all) expression that indicates a conjunction (something similar to, but not quite like a set – depending on whether or not we want to give it a hard boundary like a set has). But for practical use it would be nice to add a second term: (|only).

‘All’, ‘not’ and ‘only’ act here like the operators +, – and = from my earlier idea of ‘delta expressions’, just mapped into term-expression syntax.

The usefulness of (|only) is that we have a lot of cases in real-world data where a given property or slot of an object-like structure has a many-to-one relationship (ie, a property that can only take one value) or where we want to express the idea of ‘updating’ it rather than just ‘adding a new piece of information to it’. For searching, it’s also really useful to encounter something like (|only), because it means we can stop the search at that point, making it more efficient.

I thought we could maybe get away without needing (|not) entirely, but I don’t think that will work. So (|all), (|not) and (|only) are I think a minimal useful set of such operators (or functors, in Prolog-talk, which expresses perhaps better that these aren’t necessarily functions or function-like things: they’re syntactic markers in data which might have standard interpretations, but don’t have to rewrite to a different value).

The hope is that we could use one set of standard functors to make a data format that can describe both changes to information, and static information. So we could have, eg, a person record, using a functor to be something like a datatype indicator:

(|person |all (name |only Fred) (age |only 42…) (hobbies stamp collecting) (|not smoker))

etc. Where we’re explicitly flagging in a record that certain information is authoritative, or overrides any other information.

Marked up like this, ‘only’ (and ‘not’) start to look a little like data types themselves. I’m not sure that they work quite the same as the more familiar data types we know in computing, but it’s interesting to think of them that way: that (|only . x) might be a ‘typed’ or more narrow version of x, and that wherever we see a (|only…) expression we know it obeys a particular law: that it replaces information rather than adding to it.

A secondly, slightly related, thought:

Thinking a little more about functors (ie, heads of term-expressions) and how we can assign meaning to them: it seems the meaning of a term s probably a function of 1) the functor definition, 2) the body of the term, and 3) the context (ie, all the expressions ‘to the left’ of the expression). The context (3) being roughly equivalent to, or a generalisation of, the ‘dynamic environment’ as opposed to lexical environment of fexprs (and vau expressions in vau calculus).

I’m wondering now if we could make that idea of context a little more restricted by forcing it to be the functors of all enclosing expressions, not just ‘all terms in all enclosing expressions’. Would that lose us a useful degree of freedom?

Eg: say we had a (|let) form that worked much like Lisp. If we weren’t restricting context to functors, we could say

(|let (a 1) |a)

such that |let binds the name a to the value 1, |a refers to that bound value, and we could then say that the meaning of the new functor |a we’ve created is defined by the enclosing (|let) context. But the information that defines the meaning of that functor isn’t in the enclosing functor, it’s in the term body. If we restricted meaningful context to functors, we might have to write that as

(|(|let a 1) |a)

Which is syntactically clumsier, but does put all the meaning into the head of the term: we’re creating a new combined functor (|let (a 1)), and using  that to give meaning to the otherwise undefined functor-call (|a). Among other things, this seems like it makes tracking the ‘functor environment’ a lot simpler. It’s just a straight list of all the heads of enclosing terms. Otherwise, I think it would have to be quite a large expression.

(We still probably wouldn’t want functors directly accessing that context term except in very restricted ways as it makes their meaning hard to predict. )

We could *maybe* relax even more and assume that a list as a functor is a term

(|(let a 1) |a)

but really not sure I like that at all – we might want a list-as-list to mean something.

This takes advantage of something that the HiLog people have pointed out for a while, which is that there might be an improvement in expressiveness by allowing functors to be complex terms. But do we win by enforcing rather than just allowing that?

If we were able to rely on something like currying, such that the meaning of every token in the body of a term was completely defined by the expression syntactically to the left of it, we could use the list of all such left-expressions. But I’m not sure we can rely on currying – it’s just too useful to distinguish functors by arity, so that (|frob bla) can mean something different from (|frob bla1 bla2).

Potentially messy to look at, but perhaps interesting.

In normal Lisp, we’d handle this by restricting the idea of ‘meaning’ to ‘environment’. It’s just that I’m trying to generalise from Lisp toward much more data-centric systems which may not necessarily be doing evaluation.

## A For Loop for Logic

September 21, 2016

Taking an opposite turn from the exploration of ‘generalised OR as computation’, which is about peeling apart the atoms, and putting them together: here’s a mostly Prolog-like logic-programming clause which works a bit like a function, mixing variable definition with an implication.

Using term-expressions (ie, writing | in a list to indicate that it’s the start of a Prolog-like logical term), and defining the functor ‘for’ to declare variables, we can have:

|for (varlist) (pattern) … list of conditions

eg:

(cat likes |for (x) (x) (x isa fish))

This is almost, but not quite, like a Prolog clause-and-implication structure. It does three things at once:

1. Declares a new variable ‘x’. (As a Prolog clause boundary does, and as Kanren’s ‘fresh’ operator does). The syntax here assuming that we just replace any occurrences of the declared variable tokens as variables. That’s maybe a bit ambiguous, but is simple.

2. Produces the pattern ‘x’ (substituting in any possible value for the variables – and potentially producing unique ‘unknowns’ as Prolog does, and which don’t quite have any counterpart in functional languages). This can be understood as a return value, except that as a relation, this might return multiple such values, lazily. (If it returned them eagerly we’d have a problem because it might be an infinitely large set of values).

3. Restricts the pattern to the list of conditions (binding variables by unifying as it evaluates). This would work just like the body of a Prolog rule.

The main difference between this and Prolog is that we can see that this is a functional form, that it’s returning a value (a list, or a set of lists), and that it’s putting that value in a very specific place. The sense of place, of recursive structure, is an important modern feature that Prolog, with its 1970s-era central database, lacks.

Using a ‘for’ declaration like this calls to mind ‘forall x’ in logic, but also a ‘for’ loop in programming – it makes it clear that introducing a logic variable is like creating a loop, or a setlike structure.

I like this a lot, but I still don’t like that there’s no apparent way to resolve the split between ‘unification environment’ (the logical variables and their bindings, including their ‘next value’ which must be something a bit like a closure) and the ‘logical database’ (the set of currently defined statements). In a functional language like Scheme these are both part of the same environment. Or at least, the functional equivalent of ‘unification environment’ is sort of the activation records (stack and/or heap).

micro-Kanren implementations seem to do various tricks to create ‘unknowns’ as unique objects – sometimes just by incrementing a system-global integer – which is, um. It works, but doesn’t quite seem right to me. Certainly doesn’t seem like something that you’d want to scale up to the Internet. Generate unknowns as GUIDs? And then the unification environment is a big old list, which is fine, but…

It seems like unknowns ought to be something like closures, but that probably means it would be very hard to unify them. You could make them OOP objects, I guess, with ‘unify’ methods, but again that seems like putting the cart before the horse. Why should logic programming be so much more complicated at its foundations than functional programming? Or rather why are there concepts – like unknowns – in logic resolution which don’t seem to have have any counterparts in functional theory? (Other than type theory, which basically is logic resolution with the serial numbers filed off).

What I find especially odd is that functional theory as used in modern computation (eg lambda calculus and combinatory logic) was invented as a method of doing logic – but we’ve never used either lambdas or combinators for logic programming! The logic programming people had to stumble onto a computation kernel (Robinson resolution with backtracking over Horn clauses) by a very different path.

(Both lambdas and combinators were in fact rejected for logic because they were inconsistent – produced paradoxes – which explains why the Prolog folks didn’t use them, but somehow this didn’t stop Lisp from using lambdas.)

I often feel with logic that I’ve stumbled into a vast, crumbling, abandoned library written in a dead language. Programming is a bustling city, but logic outside of the bright lights of simple Boolean algebra is… extremely hard to parse out exactly what problems they were trying to solve. Why was any of this infinity stuff considered important? Life went on perfectly fine without it over in Computation City. The problems we’re now trying to solve where logic seems like it might be applicable seem much simpler and more practical than ‘represent all of maths’: how to search databases, how to prevent buffer overflows. But even the simple things still seem unnecessarily complex, which makes me think we’ve not quite got the foundations right.

That is to say: now that the Internet is up and running and we’ve moved most of our civilisation online, the problem of computing is ‘how do we efficiently and safely represent and navigate all of human discourse’, as opposed to ‘how do we invent a mechanical oracle which, given carefully restricted input on any possible theoretical question, will give us absolute truth as output’.

I feel that many if not most of the formalisms of logic were created to solve Problem #2 – something we don’t actually care about today – as opposed to Problem #1. The good news is that restricting the universe of logic to discourse – words, images, ideas existing in symbolic form – rather than abstract theoretical concepts like towers of infinity, may mean the problem’s more tractable! But it also means that a lot of the constructs invented for Problem #2 just aren’t applicable, and also get in the way. ‘Restricting the input’ is not an option when the input is given for you.

If we want to use logic for something like Vannevar Bush’s Memex, Douglas Engelbart’s NLS, Ted Nelson’s Xanadu – decentralised personal databases of texts and concepts – we need a logic that’s very simple, durable, makes almost no assumptions about the data that’s fed in, and can let us search and index and update and make basic deductions reasonably safely and without infinite loops. If there’s data we don’t know how to handle, we should preserve it untouched and just leave it alone. That’s all we need, really.

Some of the things we need to do, but can’t do easily right now:

• Manage collections of messy data. Unstructured, ragged collections that aren’t perfectly rectangular tables.
• Deduce facts from collections of data and rules. Assume there are probably many, many contradictions. Don’t blow up if this happens. Be conservative and handle the error gracefully.
• Given two sets of data, produce a change delta – and given a delta, apply it to create another set. Apply deltas to deltas. Preferably every data set should itself be a delta.
• Assume these sets may be very large, infinite in multiple dimensions, so only compute as much data as you need at any one time.
• Have one data structure that can cope with everything. Something like objects, but with fewer assumptions. Something so simple anyone can implement it.
• Allow modelling of absolutely ANY system (including yourself – compilers compile themselves, after all). Make as few assumptions as possible.