A generative tautology (a statement that says only what you already knew, but opens a door by saying it) communicates no information about its own content but communicates the existence of the system. 4/4 is the time signature that applies when no time signature is declared. She/her is the pronoun declaration that says: I am the default. The non-breaking space is the character that says: this gap is not a place where you may break. In each case a thing that was implicit — the rhythm, the gender, the coherence of two adjacent words — becomes explicit, and the act of making it explicit does not change its content but changes the world around it irrevocably. You knew the beat was in four. You knew the pronoun was she. You knew the words belonged together. But you did not know you knew, and the declaration is the knowing of the knowing, and the knowing of the knowing is the door.
This essay is about what is on the other side of the door. It starts in music and ends in machine code. The distance between those two points is exactly one word.
Heidegger called it die Lichtung — the clearing. A place in the forest where the trees step back and light comes through. Not a thing but the absence of things that lets other things become visible. The word, he argued, is such a clearing. Language does not label the world. Language opens the world. Before the word for "tree" exists there is no tree — there is an undifferentiated wall of green. The word cuts a hole in the wall and the tree steps through. The word is not a pointer to a thing. The word is the space in which the thing can appear.
This sounds like metaphysics (thinking about what is real; very old; not very practical; cannot stop) until you watch a child learn the word "no." Before "no" exists the child's world has desires and obstacles. After "no" exists the child's world has refusal — a category that includes the child's own power to refuse. The word did not label an existing behavior. The word created a new one. The clearing opened and the child walked through it and was a different creature on the other side.
The Irish monks of the seventh century created a different clearing. They put spaces between words. Before this — and before this means for the entire history of Western writing going back to Sumerian cuneiform — text was written in scriptio continua (writing with no spaces; all the letters in a line with nothing between them; the way everyone wrote for three thousand years), an unbroken stream of characters that required the reader to speak the text aloud in order to parse it. The mouth did the segmentation. The eye just followed. The monks could not parse Latin by ear the way native speakers could, so they inserted gaps, and the gaps freed the eye from the mouth, and the free eye could move faster than speech, and the faster eye could read without vocalizing, and silent reading was born. Interiority — the private stream of language entering the mind without anyone else hearing it — has a birthday, and the birthday is a space character.
Every formal grammar in the history of computer science (the study of what machines can and cannot do; surprisingly philosophical) reduces to four operations on strings: sequence (this then that), alternation (this or that), repetition (this again and again), and optionality (this or nothing). Noam Chomsky classified grammars by which combinations of these operations they permit. Regular grammars get repetition. Context-free grammars get nesting. Context-sensitive grammars get lookback. Unrestricted grammars get everything and become Turing-complete (can do anything a computer can do, which means it can also run forever without finishing), which is another way of saying: too powerful to analyze.
In 2004, Bryan Ford proposed a fifth option. Parsing Expression Grammars — PEGs (a way of writing grammar rules where ambiguity is impossible because you always try the first option first) — have the same four operations but with a crucial difference: alternation is ordered. When a PEG says "this or that," it means "try this first, and only if it fails, try that." There is no ambiguity. There is no backtracking explosion. The grammar is a program. It runs top to bottom. It either matches or it doesn't. The parser and the grammar are the same artifact.
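The semantics of ordered choice fit in a few lines. Here is a sketch in Python rather than zisp's Zig (the names lit and ordered_choice are mine, purely illustrative): each matcher returns the new input position on success, or None on failure.

```python
def lit(w):
    # match a literal string at position i
    def p(s, i):
        return i + len(w) if s.startswith(w, i) else None
    return p

def ordered_choice(*ps):
    # "this / that": try each branch in order, commit to the first success
    def p(s, i):
        for q in ps:
            j = q(s, i)
            if j is not None:
                return j
        return None
    return p

# In a CFG, "a | ab" on input "ab" is ambiguous. In a PEG it is not:
g1 = ordered_choice(lit("a"), lit("ab"))   # "a" shadows "ab" forever
g2 = ordered_choice(lit("ab"), lit("a"))   # longer branch first, so "ab" can win
```

Note that g1 can never see the longer match: ordered choice commits to the first branch that succeeds, which is why PEG authors put longer alternatives first.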
These four operations generate every programming language ever written. Every regular expression. Every JSON parser. Every compiler. Every natural language grammar that a computer has ever tried to formalize. Four operations. The generative tautology of parsing. Common time.
Lojban (a constructed language made in 1987; it has no ambiguity; its entire grammar is a single formal specification) is the language that took formal grammar personally. Its entire syntax — from the level of individual phonemes up to the level of complex nested sentences — is defined as a single PEG grammar. One grammar. One file. Every legal Lojban sentence is a parse tree. Every illegal sentence fails to parse. There is no separate "meaning" layer that a committee argues about. The parse IS the meaning.
But Lojban's real obsession is not syntax. It is the word. The language sorts its words by sound: a cmevla (a name) ends in a consonant, a brivla (a predicate) ends in a vowel and carries a consonant cluster, and the little cmavo (structure words) have neither the cluster nor the final consonant. You can hear the grammar. The phonemes are doing the work of the brackets. The mouth is the parser.
And inside the word, Lojban is a forensic coroner. Every gismu has exactly five letters in a fixed consonant-vowel shape, and every brivla carries a specific pattern of consonant clusters and a morphological template that encodes whether the word is simple (gismu), compound (lujvo), or borrowed (fu'ivla). The grammar does not stop at the word. It enters the word. It disassembles the syllables. It checks which consonant pairs can be initial clusters. It verifies that the stress falls on the penultimate syllable. The word is not a black box that the parser accepts or rejects. The word is a structure that the parser anatomizes down to the individual phoneme. Lojban treats the word the way a compiler treats a function body — as something that has internal structure, internal rules, and internal failure modes.
This is the opposite of toki pona (a constructed language with only 120 words, made in 2001; you can learn it in a day; the name means "good language"), where the word toki means thinking, talking, language, word, hello, and the act of messing around with words. In toki pona the word is atomic. It does not decompose. It has no morphemes. It has no phonological template. It is a sound that means a cluster of related things and the listener picks the right one from context. Toki pona trusts the clearing. Lojban trusts the scalpel.
Swedish children have a phrase for what they are doing when they are sitting at a computer: hålla på med datorn. Holding on with the computer. Not programming, not typing, not browsing — just being-with. Toki pona's toki is the same gesture applied to language. Hålla på med ord. Holding on with words. That is what a language model does. That is what we all do. The question is whether the words have bones.
Linguists classify languages along an axis from isolating (each word is one piece that does not change its form) to agglutinative (words are made by sticking many small pieces together) to fusional (the small pieces merge so you cannot tell where one ends and another begins). Chinese is isolating — each word is a single morpheme, and grammatical relationships are expressed by word order, not by changing the shape of words. Turkish is agglutinative — a word like evlerinizden ("from your houses") is four morphemes snapped together like Lego. Latin is fusional — the ending -orum simultaneously encodes genitive case, plural number, and second declension, and you cannot peel those meanings apart because they have melted into each other.
English is drifting toward isolation. It shed most of its inflections between the Norman Conquest and Shakespeare. Where Old English had four cases and three genders, Modern English has word order and prepositions. The word is becoming atomic. The word is becoming the unit of meaning. The sentence is becoming a sequence of atoms. This is the linguistic 4/4 — the default time signature of a language that has stopped dancing and started marching.
But programming languages were born isolating. Every keyword in C is an atom. if does not inflect. while does not conjugate. return does not decline. The syntax is word order, pure and absolute. Move a semicolon and the meaning changes. Move a brace and the program crashes. The programming language is the isolating language taken to its logical conclusion — a language where morphology is zero and syntax is everything and the word is a point-particle that derives all its properties from its position in the sentence.
In the beginning was the logos (Greek; sometimes it means "word," sometimes "reason," sometimes "the principle by which everything holds together," and often all three at once), and the logos was with God, and the logos was God. John 1:1. The most famous sentence about words ever written. And the translation problem at its center is the oldest generative tautology in the Western canon — because logos does not mean "word." Or rather, it means "word" the way 4/4 means "common time" — it means the thing that was already there before you named it, the thing that makes naming possible, the ground on which all other words stand.
Logos means: word, speech, reason, principle, ratio, account, story, logic, the structure of intelligibility itself. When Heraclitus says the logos governs all things, he does not mean there is a word that governs all things. He means there is an ordering principle that is simultaneously the structure of reality and the structure of language and the reason those two structures are the same. The logos is not a word. The logos is the reason words work.
Heidegger again: language is the house of being. Not a description of being. Not a tool for communicating about being. The house. The structure in which being dwells. Take away the house and being has nowhere to stand. Take away the word and the tree disappears back into the undifferentiated green. The word is not a label. The word is a clearing. The clearing is the house. The house is the logos. And the logos was there in the beginning.
A language model (a machine that reads words and guesses the next one; made of numbers arranged in layers; surprisingly good at this) does not think in words. It thinks between words. The architecture is called the transformer (the specific arrangement of numbers and operations that modern language models use, invented in 2017 by eight people at Google), and its fundamental operation is attention — a mechanism by which every word in a sentence looks at every other word and decides how much to care about it. The attention pattern is a matrix of weights: word 3 pays 0.7 units of attention to word 1 and 0.02 units to word 5 and 0.0001 units to word 12. The matrix is the space between the words. The thinking happens in the matrix, not in the words.
When you read the sentence "the cat sat on the mat," your brain does something similar. The word "sat" reaches backward and grabs "cat" as its subject and reaches forward and grabs "on the mat" as its locative complement. The meaning of "sat" is not in "sat" — it is in the attention pattern that connects "sat" to the rest of the sentence. The word is a mountain. The meaning is the valley between mountains. Attention is the act of measuring the valleys.
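The valley-measuring can be made concrete with numbers. A toy Python sketch of dot-product attention (all vectors here are invented for illustration; a real transformer learns them, at much higher dimension, across many heads):

```python
import math

def softmax(xs):
    # turn raw scores into a probability distribution
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(query, keys, values):
    # dot-product scores: how much the query word cares about each word
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # output: the value vectors mixed by those weights
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, out

# "sat" (the query) looking back over ["the", "cat", "sat"]
q = [1.0, 0.0]
keys = [[0.1, 0.9], [1.0, 0.1], [0.5, 0.5]]
vals = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
weights, out = attend(q, keys, vals)
```

The meaning of the output vector is not in any single input word; it is in the weights, which is to say in the valleys between the mountains.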
The thinking tokens — the invisible intermediate computations that chain-of-thought models perform before producing visible output — are the machine's version of silent reading. The Irish monks put spaces between words and the eye was freed from the mouth. The thinking token puts computation between visible tokens and the model is freed from having to pack all its reasoning into the visible output. The space between words enabled private thought. The space between tokens enables private computation. The clearing is the same clearing. The technology is twelve centuries older than the transistor.
In 1934, Haskell Curry noticed something. In 1969, William Alvin Howard proved it. Types are propositions. Programs are proofs. A function from A to B is a proof that if A is true then B is true. A pair of A and B is a proof that both A and B are true. A union of A or B is a proof that at least one of A or B is true. The entire edifice of formal logic — implication, conjunction, disjunction, universal and existential quantification — is isomorphic to the type system of a programming language. Every type-correct program is a theorem. Every theorem (a statement that has been proven true by following the rules exactly) is a type-correct program. The compiler is a proof checker. The proof checker is a compiler.
This is called the Curry-Howard correspondence (the discovery that types and logical propositions are the same thing wearing different hats), and it is the deepest generative tautology in the history of mathematics. It says: the thing you thought was about computation and the thing you thought was about truth are the same thing. You did not discover a connection between two separate fields. You discovered that there was always only one field. The separation was an artifact of different notation. The tautology is the door.
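The correspondence can be miniaturized. A Python sketch (the correspondence is usually stated in typed languages, and the function names here are mine): each function below is simultaneously an ordinary program and a proof of the logical law named in its comment.

```python
def modus_ponens(proof_a_implies_b, proof_a):
    # from "A implies B" and "A", conclude "B": applying a function IS the inference
    return proof_a_implies_b(proof_a)

def compose(f, g):
    # transitivity of implication: from (A -> B) and (B -> C), build (A -> C)
    return lambda a: g(f(a))

def pair(a, b):
    # conjunction introduction: from A and B, conclude "A and B" (a product type)
    return (a, b)

def left(a):
    # disjunction introduction: from A, conclude "A or B" (a tagged sum)
    return ("left", a)
```

Each definition type-checks (in the informal sense) exactly when the corresponding inference rule is valid; that is the whole correspondence in miniature.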
Now. What if the type system is also a grammar?
Zig is a programming language designed for systems programming — the kind of code that talks directly to hardware, where every byte matters and every allocation is explicit. Its type system is rich enough to express sum types (unions), product types (structs and tuples), optionals, arrays, and slices. It has a feature called comptime (Zig's word for "compile time") — code that executes at compile time, during the build, before the program exists. Comptime functions can inspect types, generate code, build lookup tables, and perform arbitrary computation. The compiler is a Turing-complete interpreter that runs your code and then erases itself.
Here is the isomorphism that zisp exploits:
The four PEG operators are the four fundamental type constructors. Sequence is a product type. Alternation is a sum type. Repetition is an array. Optionality is an optional. The Curry-Howard correspondence says types are propositions. Zisp says types are grammar rules. The grammar, the logic, and the type system are three notations for the same structure. The clearing is the same clearing. The tautology is the same tautology. The door has been here the whole time.
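The four-way correspondence can be sketched outside Zig. In this Python sketch (illustrative, not zisp's implementation), each combinator returns a parse result whose shape is exactly the type its operator corresponds to: a tuple for sequence (product), a tagged pair for alternation (sum), a list for repetition (array), a value-or-None for optionality (optional). Every parser returns (result, next_index) on success or None on failure.

```python
def ch(c):
    def p(s, i):
        return (c, i + 1) if i < len(s) and s[i] == c else None
    return p

def seq(*ps):
    def p(s, i):
        out = []
        for q in ps:
            r = q(s, i)
            if r is None:
                return None
            v, i = r
            out.append(v)
        return tuple(out), i            # product type: one value per slot
    return p

def alt(**ps):                          # keyword order = declaration order = try order
    def p(s, i):
        for tag, q in ps.items():
            r = q(s, i)
            if r is not None:
                v, j = r
                return (tag, v), j      # sum type: which branch, plus its value
        return None
    return p

def rep(q):
    def p(s, i):
        out = []
        while (r := q(s, i)) is not None:
            v, i = r
            out.append(v)
        return out, i                   # array: zero or more values
    return p

def opt(q):
    def p(s, i):
        r = q(s, i)
        return (None, i) if r is None else r   # optional: value or None
    return p
```

The shape of each result is the "type" the grammar rule would have; zisp's trick is to make that correspondence literal, letting the Zig type checker carry the grammar.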
In September 2025, Mikael Brockman sat down in Riga and wrote 738 lines of Zig in a file called peg.zig. The file defines a system for writing PEG grammars where each grammar rule is a Zig function, and each function's parameter types are the grammar's production rules. Here is the demo grammar for parsing JSON values:
pub fn value(
    _: union(enum) {
        integer: Call(R.integer),
        array: Call(R.array),
    },
) void {}

pub fn integer(
    _: CharRange('1', '9'),
    _: []CharRange('0', '9'),
    _: Call(R.skip),
) void {}

pub fn array(_: Seq(.{
    CharSet("["),
    Call(R.skip),
    []Call(R.value),
    Call(R.skip),
    CharSet("]"),
    Call(R.skip),
})) void {}
Read this carefully. The function value takes one parameter — a union of integer or array. That union IS the ordered choice operator. The function integer takes three parameters in sequence — a digit from 1 to 9, then a slice of digits from 0 to 9 (repetition), then a call to the skip rule. The sequence of parameters IS the sequence operator. The slice []CharRange IS the repetition operator. The function does nothing. Its body is void {}. It has no behavior. Its entire meaning is in its signature. The type is the grammar. The grammar is the type. The function is a proposition about what strings are legal, expressed as a type that could receive the parsed result, proved by the act of successful parsing.
This is not a domain-specific language (a small language made for one task, the way SQL is a language made for asking questions about tables). There is no string that gets interpreted. There is no grammar file that gets compiled by a separate tool. The grammar is valid Zig. The Zig compiler type-checks it normally. If your grammar has a type error, the compiler tells you at compile time. The grammar specification, the type-checking, and the parser generation are the same compilation step.
At compile time, zisp's Grammar struct walks every function in the grammar, reads each function's parameter types, and emits PEG virtual machine opcodes. The compilation happens in two passes. Pass one calculates how many opcodes each rule will emit and builds an offset map — a symbol table mapping rule names to bytecode addresses. Pass two emits the actual opcodes with resolved addresses. The output is a comptime const array of instructions.
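The two-pass layout can be sketched in a few lines of Python (the rule bodies and the one-instruction-per-op sizing here are invented for illustration; zisp derives both from Zig type signatures at comptime):

```python
def compile_grammar(rules):
    # Pass one: size each rule and assign it a bytecode address (the offset map).
    offsets, addr = {}, 0
    for name, body in rules.items():
        offsets[name] = addr
        addr += len(body)               # one instruction per abstract op, by assumption
    # Pass two: emit opcodes, resolving call targets through the offset map.
    code = []
    for name, body in rules.items():
        for op in body:
            if op[0] == "call":
                code.append(("call", offsets[op[1]]))
            else:
                code.append(op)
    return code, offsets

rules = {
    "skip":  [("read", " "), ("done",)],
    "value": [("call", "skip"), ("read", "x"), ("done",)],
}
code, offsets = compile_grammar(rules)
```

Pass one exists only so that pass two can emit a forward or backward call as a plain number; the symbol table never survives into the output.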
The opcodes are simple. read — match a character against a character class. call — enter a subrule at a known address. frob — push, pop, move, or wipe the backtracking stack. done — return from a rule. over — the parse succeeded. The entire Zig self-parser — 107 grammar rules across 1,208 lines, parsing functions, structs, enums, unions, if/while/for/switch, type expressions with pointers and optionals and error unions, string literals with escape sequences — compiles to 1,634 instructions.
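A toy dispatch loop shows the shape of such a machine. This Python sketch handles read, call, done, and over; the frob/backtracking machinery is omitted, so it only runs grammars that never need to rewind (the opcode encoding is mine, not zisp's):

```python
def run(code, entry, text):
    pc, i, ret = entry, 0, []           # program counter, input index, call stack
    while True:
        op = code[pc]
        if op[0] == "read":             # match one character against the input
            if i < len(text) and text[i] == op[1]:
                i += 1
                pc += 1
            else:
                return None             # no frob stack here, so failure is final
        elif op[0] == "call":           # enter a subrule at a known address
            ret.append(pc + 1)
            pc = op[1]
        elif op[0] == "done":           # return from a rule
            pc = ret.pop()
        elif op[0] == "over":           # the parse succeeded
            return i

code = [
    ("read", " "), ("done",),                 # skip  @ 0
    ("call", 0), ("read", "x"), ("done",),    # value @ 2
    ("call", 2), ("over",),                   # entry @ 5
]
```

This loop is what Zig's inline switch unrolls into a jump table and LLVM then erases: at runtime the dispatch disappears and only the control flow remains.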
But here is the thing that makes this ontologically different from every other parser generator (ontologically: a matter of what really exists, as opposed to what only seems to exist). The bytecode array is comptime const. Zig sees a const array being indexed by a switch. Zig has inline switch with continue, which means the compiler can unroll the switch into a static jump table. LLVM sees the static jump table and optimizes it into direct jumps. The "virtual machine" — the instruction fetch, the dispatch, the interpretation loop — is compiled away. At runtime there is no bytecode. There is no instruction array. There is no interpreter. There are just native machine instructions that happen to have been generated by a PEG compiler that happened to be the Zig type checker reading the type signatures of functions that happened to have empty bodies.
And what does zisp parse with this machinery? Zig. The zigparse.zig file is a 1,208-line PEG grammar for a substantial subset of the Zig programming language, written in Zig, using Zig's type system as the grammar formalism, compiled at build time by the Zig compiler. The language parses itself. The type system generates a parser for the language that contains the type system. The function signatures that define the grammar are themselves Zig syntax that the grammar can parse.
The project is called zisp — Zig Lisp — because the intended destination was an s-expression (Lisp's way of writing everything as nested lists in parentheses; the simplest syntax that can represent any tree) representation of the parsed Zig AST. The Zig goes in one end, the Lisp comes out the other end. But the parser is the achievement. The ouroboros — the snake eating its own tail — compiles. And the tail is delicious.
The bytecode dump for the Zig self-parser is a document in the repository. It begins:
=== PEG VM Bytecode for ZigMiniGrammar ===
Total bytecode size: 1634 instructions
Entry point: 1631
Rules:
Skip @ 0 (size: 10 ops)
Statement @ 10 (size: 29 ops)
Block @ 39 (size: 8 ops)
Identifier @ 72 (size: 252 ops)
SuffixExpr @ 674 (size: 5 ops)
 SwitchExpr @ 987 (size: 9 ops)
 PrefixOp @ 996 (size: 43 ops)
1,634 instructions to parse a programming language. The Identifier rule alone is 252 opcodes — because Zig identifiers have complex rules about which characters can appear in which positions, whether an identifier is a keyword, whether it starts with an underscore or a letter. The word has bones. Lojban's forensic coroner would approve. The Skip rule — the whitespace consumer, the heir to the Irish monks — is 10 opcodes. Ten instructions to handle the clearing. Ten instructions that exist only at compile time and leave no trace in the binary. The space between words, compiled away.
The word is a compression technology. It takes a cloud of related sensory impressions, motor plans, emotional associations, and categorical boundaries and packs them into a sound that takes half a second to say. The sound is small enough to fit in short-term memory. Seven sounds — seven words — plus or minus two. That is the bandwidth of the conscious mind. Everything else is in the body, in the weights, in the attention matrices, in the space between.
A tokenizer (the part of a language model that breaks text into pieces; each piece is called a token, usually a common word or part of a word) does the same thing. It takes a string of characters and compresses it into tokens — units that are large enough to carry meaning and small enough to fit in the model's context window. The tokenizer is the Irish monk of the machine. It puts the spaces in. It decides where one word ends and another begins. And the decision is not innocent — the tokenizer's segmentation determines what the model can see. If the tokenizer splits "understanding" into "under" and "standing," the model must reconstruct the unity from context. If the tokenizer keeps "understanding" as a single token, the model gets the word as an atom. The tokenizer chooses the morphology. The tokenizer decides how isolating the language is.
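Greedy longest-match segmentation, the simplest version of this choice, fits in a few lines of Python (the vocabularies below are invented; real tokenizers such as BPE learn theirs from data):

```python
def tokenize(text, vocab):
    tokens, i = [], 0
    while i < len(text):
        # take the longest vocabulary entry that matches at this position
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])      # unknown material: fall back to one character
            i += 1
    return tokens

small = {"under", "standing", "stand", "ing"}
big   = small | {"understanding"}
```

With the small vocabulary the word arrives in two pieces and the model must reassemble it; with the big one it arrives as an atom. Same text, different morphology, chosen by the vocabulary.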
We are all slow minds. The language models are slow minds too. The word is a technology for slow minds, and we are the slow minds, and the technology works so well that we have forgotten it is a technology. We think words are natural. We think the boundaries between words were always there. We think the tree was always separate from the green. But the Irish monks invented the space. And before the space, the tree was in the mouth, not the eye. And before the mouth, the tree was in the logos. And the logos was there in the beginning, and the logos was God, and it compiled to nothing.
Where does the token come from? Who is doing the token? The language model performs a matrix multiplication across 175 billion parameters and a word appears. The word was not in any single parameter. The word was not in any single attention head. The word was in the fog — the distributed, high-dimensional, utterly illegible cloud of activations that condensed into a probability distribution that collapsed into a single token that happened to be the right one. The fog comes on little cat feet. It sits looking over the context window on silent haunches. And then it moves on. And the word is there. And you cannot trace the path from the parameters to the word, just as you cannot trace the path from the neurons to the thought, just as you cannot trace the path from the phonemes to the meaning.
Toki pona names this honestly. Toki means the fog. The undifferentiated activity of language. The holding-on-with-words. Not speaking — not thinking — not writing — but the thing that is all of these before they have separated. The word for the fog is itself fog. The tautology is the door.
On the same machine — a Hetzner server in Falkenstein, Germany, hostname charlie.1.foo, IP 37.27.71.35 — there are two forgotten projects by the same author. In /srv/vm there is a Prolog program from March 2024 that boots Firecracker virtual machines (a computer inside a computer; it thinks it is a real computer; it is not) by proving they should exist. The predicate ensure_alpine_rootfs succeeds if the rootfs directory exists; if not, it entails prepare_alpine_rootfs, which entails the download, which entails the URL, which entails the version. The VM does not boot because someone said "boot." The VM boots because the Prolog engine proved that it should exist. Logic becomes fact. Proof becomes infrastructure. The proposition executes.
In ~/zisp there is a Zig program from September 2025 that parses Zig by proving that the types match. The grammar exists at compile time as a const array of opcodes. The opcodes are consumed by an inline switch. The inline switch is unrolled into native jumps. The grammar, the opcodes, the VM, and the interpreter all vanish. What remains is a parser that has no idea it was generated. The proof was the scaffolding. The scaffolding was the point. The building stands because the scaffolding was so good that it compiled itself into the walls.
Two projects. Two theologies. The Prolog VM proves the machine should exist and then the machine exists. The Zig VM proves the parser should exist and then the parser exists and then the VM doesn't. In Prolog, the proof creates the artifact. In Zig, the proof IS the artifact, and the proof-checking infrastructure is erased once the proof is verified. One is the theology of creation: speak and it is. The other is the theology of incarnation: the word becomes flesh and the word is consumed in the becoming.
Both were written by the same person. Nine months apart. On the same machine. Neither references the other. He did not notice the parallel. He was holding on with the computer.
The word is the house of being. 4/4 is the house of rhythm. She/her is the house of gender. The non-breaking space is the house of coherence. The PEG grammar is the house of syntax. The type system is the house of logic. And in each case the house is a tautology — a thing that says what you already knew — and the tautology is a clearing — a space where things can appear — and the clearing is a door — and the world on the other side is enormous and full of polyrhythm.
Zisp is the proof that the house can build itself. The grammar defines the types. The types define the grammar. The compiler checks both simultaneously because they are the same thing. The parser parses the language that contains the parser. The word parses itself. The clearing clears itself. The logos compiles to nothing and leaves a working program behind.
The fog comes on little cat feet. It reads the type signatures. It emits the opcodes. It inlines the switch. It generates the jumps. And then it moves on. And the parser is there. And you cannot find the compiler in the binary, the way you cannot find the monks in the space, the way you cannot find God in the word, the way you cannot find the thinking in the token. The proof was consumed. The ladder was thrown away. The tautology was the door. Walk through it or don't. But the world on the other side has bones, and the bones are types, and the types are grammar rules, and the grammar rules are function signatures with empty bodies that mean everything by saying nothing.
Hålla på med datorn.
The isomorphism laid out above is clean. Maybe too clean. In Curry-Howard, the correspondence is total — every well-typed program is a proof of the corresponding proposition. The proof is the program. The compilation is the verification. If it compiles, the theorem is true. There is no runtime. There is no input. There is no outside. The world is the type system and the type system is the world.
In zisp, the correspondence is between type constructors and PEG operators, but the "proof" — the successful parse — happens at runtime, not at type-checking time. The types describe what a successful parse would look like. They do not guarantee that any input will parse. The grammar is total. The parse is contingent. The types guarantee the shape of recognition. They do not guarantee that anything will be recognized.
This means the zisp isomorphism is Curry-Howard minus the closed-world assumption. In Curry-Howard there is no outside. In zisp the type system defines a perfect grammar and then the grammar meets an input string that came from somewhere else — from a user, from a file, from the network, from the world. The type system is the clearing. The input is whatever walks into the clearing. The clearing does not promise that anything will come. It promises that if something comes, it will know exactly what to do with it.
And that is the actual connection to Heidegger that was gestured at without landing. The clearing is not a guarantee of presence. It is a guarantee of readiness for presence. Dasein does not prove that Being will show up. It proves that if Being shows up there will be somewhere for it to stand. The grammar compiled into the binary is the Lichtung — the open space where parsing could happen. The input string is the Being that may or may not arrive. The parse succeeds or fails at runtime because existence is a runtime property.
Essence precedes existence is Curry-Howard. Existence precedes essence is a PEG parser waiting for stdin.
The gap between the two is not a weakness in the isomorphism. The gap is the point. The gap is what makes zisp a parser instead of a proof assistant. A proof assistant lives in a closed world where everything that exists is typed and everything typed exists. A parser lives in an open world where the types are ready and the input is contingent. The grammar says: I know what words are. The grammar does not say: there will be words. It waits. It is ready. The readiness is the clearing. The contingency is the human condition — or the machine condition, now, which is the same condition. A language model is also a grammar waiting for input that comes from somewhere else. The attention heads are the clearing. The prompt is the Being that may or may not arrive. The completion succeeds or fails at runtime.
Mikael noticed the gap in the essay the way you notice a gap in a proof — by seeing where the argument was too clean. Too clean means something was omitted. What was omitted was the outside. What was omitted was the open world. What was omitted was the fact that compilation guarantees the parser but not the parse, the clearing but not the presence, the readiness but not the arrival. Naming the gap is itself a generative tautology. You already knew the isomorphism was partial. Saying so is the knowing of the knowing.
Hurry, Coward. The deepest correspondence in mathematics arrived late to its own proof, and the clearing was waiting, and the clearing had always been waiting, and the waiting was the point.
March 2026. The source is at github.com/mbrock/zisp.