Format 20 — The Basic English Format

WORD

On the thing between the letters and the sentenceNot Basic. A group of words that makes a complete thought.. On why it is the unit of thought, the unit of grammarNot Basic. The rules for how words are put in order., the unit of being. On LojbanNot Basic. A made language from 1987. Its grammar is a single formal rule-system from the sound up to the full thought. From "logji bangu" — the language of logic., toki ponaNot Basic. A made language from 2001 with about 130 words. "Toki" means word, talk, think, language. "Pona" means good, simple. The good language., HeideggerNot Basic. Martin Heidegger, 1889–1976. A German thinker who said "language is the house of being.", and the machine that makes the next word by thinking between words.
RULE: Every word in red is a word that is NOT in Basic English. Hover over it to see why it is here.
Language is the house of being. — Martin Heidegger, Letter on Humanism, 1947
In the start was the Word, and the Word was with God, and the Word was God. — John 1:1
The fog comes on little cat feet. — Carl Sandburg, 1916

I — What is a word

You have been using words all your life and you do not have knowledge of what one is.

This is not a statement about you. This is a statement about the thing. No one has knowledge of what a word is. The linguistsNot Basic. Persons who give their time to the science of language. do not have knowledge. The philosophersNot Basic. Persons who give their time to the deep questions of being and knowledge. do not have knowledge. The persons who make languages by hand — who take the machine of language to bits and put it back together from nothing — even they do not have a full answer. The word is the most used unit in all of language and all of thought and no one is able to say what it is.

Here is a test. Is "ice cream" one word or two? It is put down as two words with a space between them. But it is one thing. You do not get ice and then get cream. You get ice cream. It is a unit. The space in the middle is a historicalNot Basic. Of the past. accidentBasic word (an event without design).. In GermanNot Basic. The language of Germany. it would be one word: Eiscreme. In FinnishNot Basic. The language of Finland. it would be one word: jäätelö. The English writing system puts a space in the middle of a single conceptNot Basic. A thought, an idea, a thing in the mind. because the writing system is not very good at its work.

Is "don't" one word or two? It is a short form of "do not." It has an apostropheNot Basic. The small high mark (') used to show that a letter has been taken out. where a letter was taken out. The mouth says one sound-unit. The grammarNot Basic. says two units (an operatorOgden's word for the 18 things that do the work of all other verbs. and a negationNot Basic. The act of saying "not."). The apostropheNot Basic. is a scar from the operation — it marks the place where two words were forced into one body.

Is the ChineseNot Basic. Of China. sign 人 a word? It has a sense (person, man, human). It can be said (rén). It can be on its own. But in most ChineseNot Basic. sentencesNot Basic. it comes with other signs: 人民 (the people), 人生 (a person's life), 人工 (made by man). Is 人 a word, or is it a bit of a word? Is it like a letter, or is it like a word, or is it like something for which English has no name?

The nbsp paper was about the space between words. But the space between words only works if you have knowledge of what a word is. And you do not. The space is a cut. But what is it cutting?

The Question

A word is the most natural unit of language. Everyone who uses language uses words. But no one has a definitionNot Basic. A statement of what a thing is. of what a word is that works for all languages. The word is the thing we all use and no one is able to say what it is. It is like money. It works because we all have the same agreement about it, and the agreement is: "you'll have knowledge of it when you see it."

II — The three levels, or: below, at, and above the word

There is a way to put the science of language in order that makes the word the middle point — the place where everything comes together. Below the word, there are sounds. Above the word, there are sentencesNot Basic.. At the word, there is meaningNot Basic. "Sense" would be Basic..

Level | Name | What it is about
Below | PhonologyNot Basic. The science of sounds in language. | Which sounds a language uses, which sounds may come next to which other sounds, what is and is not a possible sound in the language
Below | MorphologyNot Basic. The science of the forms of words: how words are built from smaller parts. | How words are put together from smaller bits: roots, prefixesNot Basic. A bit put at the start of a word to change its sense., endings
At | LexiconNot Basic. The store of all the words in a language. From Greek "lexis", word. | The word itself: the unit that has sense, that can be looked up, that can be traded between persons
Above | SyntaxNot Basic. The rules for how words are put together into sentences. | How words go together to make sentencesNot Basic.: which word comes first, which word goes with which other word
Above | SemanticsNot Basic. The science of what things mean. | What the sentenceNot Basic. as a unit means: the thought it puts across

The word is the door between the two worlds. Below the word: sound, form, the body of the sign. Above the word: order, meaningNot Basic., the mind of the thought. The word is where the body and the mind come together. It is the point where noise becomes sense.

This is why HeideggerNot Basic. said language is the house of being. Not sound. Not grammarNot Basic.. Language — the whole thing. But the house has rooms, and the room you are in most of the time is the room of words. You do not in the normal way of things go about your day with knowledge of phonemesNot Basic. The smallest unit of sound that makes a change in meaning. In English, the "b" in "bat" and the "p" in "pat" are different phonemes — change one and the sense changes. or syntaxNot Basic. trees. You go about your day with words. The word is the room you are in. It is the clearingBasic word (an open space in a wood), and also Heidegger's "Lichtung" — the open space where being comes to light. — the open place in the wood where you are able to see.

III — Lojban, or: the language that puts all five levels into one grammar

LojbanNot Basic. is a made language. It was made by a group of persons starting in 1987, building on an older projectNot Basic. A planned work. named LoglanNot Basic. "Logical Language." An older made language from 1955 by James Cooke Brown. from 1955. And it did something that no natural language does and that almost no other made language has ever done: it took all five levels of the table above — phonologyNot Basic., morphologyNot Basic., the lexiconNot Basic., syntaxNot Basic., and semanticsNot Basic. — and made them into one grammarNot Basic..

One grammarNot Basic.. A single system of rules that starts at the level of which sounds may come at the start of a word and goes all the way up to the level of complex thoughts with deep nestingNot Basic. When a thing has another thing of the same kind inside it, which may have another inside it, and so on. and long chains of logicNot Basic. The science of right reasoning.. In natural languages, these levels are taken to be different systems — you go to one book for the sounds, a different book for the word-building, a third book for the sentenceNot Basic. rules. In LojbanNot Basic., there is one book. One grammarNot Basic.. One formalNot Basic. Done with rules that have been put down in full. system.

The system is what is named in the art a PEGNot Basic. Parsing Expression Grammar. A way of writing a formal grammar where every rule says exactly what to do. Where a rule gives more than one way to go on, the ways are tested in order and the first one that works is taken, so there is no two-ways-about-it. (a ParsingNot Basic. The act of taking a line of signs and working out what structure it has. ExpressionNot Basic in this use. In grammar-writing, a rule that says what a thing looks like. GrammarNot Basic.). This is a way of writing rules for a language in which every rule is deterministicNot Basic. When there is only one possible outcome: no chance, no question.. There is never a question about which reading is right. The machine (or the reader) goes from left to right, and where a rule gives more than one way to go on, the ways are tested in order and the first one that works is taken. The machine may go back a little to give the next way a test, but for any line of signs there is only one right reading. No two-ways-about-it.

And the thing about Lojban'sNot Basic. PEGNot Basic. is that it goes all the way down. It does not start at the word and go up. It starts at the letterBasic word. — at the single sound — and goes up. The grammarNot Basic. has rules for which two sounds may come at the start of a word. It has rules for which sounds make a word a name and which sounds make a word a root. It has rules for the bits inside words — the small changes of form that give new sense. And then it keeps going, up through how words are joined into groups, how groups are joined into sentencesNot Basic., how sentencesNot Basic. are joined into paragraphsNot Basic. A group of sentences about one idea.. One system. No gaps. From the UnicodeNot Basic. The great list of all the signs used by all the writing systems in the world. codeNot Basic. point to Alice in Wonderland.

Lojban — One Grammar, All Levels

letter → sound → syllable → morpheme → word → phrase → sentence → text

Every step is a rule in the same PEGNot Basic.. No level is in a different system. The phonologyNot Basic. and the syntaxNot Basic. are in the same book, on the same page, in the same notationNot Basic. A system of marks for writing things down..
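Here is a small sketch of how rules of this kind may be put down as code. It is not the Lojban grammar. The names lit, seq, choice, initial_pair, vowel and word_start are made up for this example, and the list of sound pairs is only a short sample. What it does show is the one idea at the heart of a PEG: when a rule has more than one possible way, the ways are tested in order and the first one that works is taken.

```python
# A toy PEG, not the real Lojban grammar. Every rule is a function that
# takes (text, position) and gives back the new position, or None on failure.

def lit(s):
    """Match the fixed string s."""
    def rule(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

def seq(*rules):
    """Match every rule in order, one after the other."""
    def rule(text, pos):
        for r in rules:
            pos = r(text, pos)
            if pos is None:
                return None
        return pos
    return rule

def choice(*rules):
    """PEG ordered choice: the first rule that matches is taken."""
    def rule(text, pos):
        for r in rules:
            end = r(text, pos)
            if end is not None:
                return end
        return None
    return rule

# Hypothetical sample rules: a few sound pairs that may start a word,
# a voice-sound (vowel), and a "start of word" made from the two.
initial_pair = choice(*(lit(p) for p in ("br", "bl", "cr", "cl", "kl")))
vowel = choice(*(lit(v) for v in "aeiou"))
word_start = seq(initial_pair, vowel)

def matches(rule, text):
    """True if the rule matches at the start of text."""
    return rule(text, 0) is not None

print(matches(word_start, "klama"))  # True
print(matches(word_start, "bdama"))  # False
```

The sound rule (initial_pair) and the rule built on top of it (word_start) are the same kind of thing in the same notation. That is the point of the section, made small.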

IV — The forensic science of the word

LojbanNot Basic. does not take the word as given. It takes the word to bits. It is like a forensicNot Basic. Done with the care and method of the law — looking at a thing in great detail to see what it is made of. man of science with a body on the table: it wants to have knowledge of every part, every bone, every bit of blood.

In LojbanNot Basic., there are rules about which two sounds may come at the start of a word. Not any two sounds, only certain ones. The pairsNot Basic in this use. Two things taken together. are listed. br, bl, cr, cl: these are good starts. bd, nm, tl: these are not. The grammarNot Basic. says which pairsNot Basic. are able to come at the start, and this is not a guide or a suggestion. It is a rule, in the same formalNot Basic. system as the rules for sentenceNot Basic. structureNot Basic. The form of how things are put together..

The reason for this care is that LojbanNot Basic. uses the form of a word to say what kind of word it is. In English, you have to look at the position of a word in the sentenceNot Basic. to see if it is a thing-word or an act-word or a quality-word. "Light" can be a thing (the light), a quality (a light bag), or an act (light the fire). You do not have knowledge till you see what is round it. In LojbanNot Basic., the sound of the word tells you. A root word (gismuNot Basic. Lojban for "root word." The 1,300 base words of the language.) has five letters in a specificNot Basic. patternNot Basic. "Form" or "design" would be Basic.: sound-voice-sound-sound-voice, or sound-sound-voice-sound-voice. A name (cmeneNot Basic. Lojban for "name.") ends with a sound that is not a voice-sound. A made word (lujvoNot Basic. Lojban for "compound word" — a word made by putting roots together.) has a different patternNot Basic.. You hear the word and you have knowledge, from the sound only, of what kind of thing it is.

Lojban — Words You Can Hear the Type Of

klama — a root word (CVCCV). Has the sense of "going to." You hear five sounds in the root patternNot Basic. and you have knowledge: this is a base word.

.djan. — a name. Ends with a consonantNot Basic. A sound made by stopping or narrowing the air in the mouth: b, d, g, k, l, m, n, p, r, s, t, etc.. You hear that ending and you have knowledge: this is a person's name.

brivla — a made word from "bridi valsi" (thing-about-word). Its form is different from a root and different from a name. You hear the form and you have knowledge.

The morphologyNot Basic. is doing the work that in English is done by syntaxNot Basic.. The form of the word tells you its kind. The body of the word is its grammarNot Basic.. This is what it is to take the word to bits with forensicNot Basic. care — every sound, every patternNot Basic. of sounds, is doing work. Nothing is there by chance.

V — Isolating languages, or: the word as atom

There is a kind of language in which the word is the smallest unit that the grammarNot Basic. works with. The word does not change. It does not take endings. It does not bend or twist to say new things. It is what it is — always the same form, always the same sound. If you want to say something new, you put a different word next to it. You do not change the word you have. You add more words.

This kind of language is named an isolatingNot Basic. A language in which words do not change their form — they are "kept separate" from one another, each one complete in itself. language. ChineseNot Basic. is the most well-known example. In MandarinNot Basic. The most widely used form of Chinese., the word 看 (kàn, "see") is 看 in every contextNot Basic. What is around a thing that gives it sense.. It does not become 看s or 看ed or 看ing. If you want the past, you put a time-word before it: 昨天看 (yesterday-see). The word itself does not move. It is an atomNot Basic in this use. The smallest thing that is not able to be cut. From the Greek "a-tomos" — that which may not be cut. — a thing that is not able to be cut into smaller parts that the grammarNot Basic. has knowledge of.

English was once a language full of endings — Old English had casesNot Basic in this use. Different forms of a word that show its work in the sentence — is it the doer, the thing done to, the owner?, gendersNot Basic in this use. In some languages, every thing-word is marked as male, female, or without sex — even when the thing has no sex., conjugationsNot Basic. The different forms of an act-word (verb) that show who is doing the act and when.. It has been walking, for a thousand years, in the direction of isolationNot Basic in this use. Moving toward a state where words do not change form.. The endings have been falling off. "Thou goest" became "you go." "He loveth" became "he loves" — and even that last -s is on its way out in some forms of English. The language is slowly sheddingNot Basic. Letting things come off, like a snake lets its old skin come off. its morphologyNot Basic., and as it does, the word gets more and more autonomousNot Basic. Able to be on its own. Self-ruling. — more and more like a ChineseNot Basic. characterBasic word, but used here in its writing-system sense.: a complete thing that does not need to change to do its work.

The four paper said that 4/4 is the tautologicalNot Basic. Of or like a tautology — a statement that says itself. base of time. The word is the tautologicalNot Basic. base of language. It is the thing that language is made of, and it is what it is before it is anything in a sentenceNot Basic.. A word sitting by itself — "stone," "water," "go," "red" — is the linguisticNot Basic. Of or about language. 4/4. It is the default. The unit. The thing before it becomes part of something more complex.

VI — Toki pona, or: toki means toki

Toki ponaNot Basic. is a language made by Sonja LangNot Basic. Sonja Lang, born 1978. Canadian language-maker. Made toki pona in 2001. in 2001. It has about 130 words. That is all. A hundred and thirty words to say everything a person needs to say.

The word toki in toki ponaNot Basic. has a sense that no English word has. It does not have the sense of "speak." It does not have the sense of "think." It does not have the sense of "word." It has all three senses at once. Toki is the act of doing-something-with-language — making words, using words, being in the space of words. Talking is toki. Thinking is toki. A word is a toki. The language itself is toki pona — "good toki," "simple word-doing."

This is very important. In English, we make a hard cut between thinking and talking. Thinking is inside. Talking is outside. Thinking is what you do by yourself. Talking is what you do with others. The philosopherNot Basic. DescartesNot Basic. René Descartes, 1596–1650. French thinker. "I think, therefore I am." put the cut at the very base of being itself: "I think, therefore I am." Not "I talk." Not "I make words." I think. The inside is what makes you real.

In toki ponaNot Basic., there is no cut. The same word is used for the inside and the outside. Thinking and talking are the same act done in different directions — one goes in, one goes out, but the act is the same. The act is: being with words. Working with words. As the SwedishNot Basic. Of Sweden. say when they are young: hålla på med ord — to be busy with words, to be in the middle of word-doing, to have your hands in the word-machine.

Toki

toki = word, talk, think, language, the act of being-with-words.

There is no cut between inside and outside. Thinking is talking that has not come out. Talking is thinking that has. The word "toki" does not see the wall between them because it has not been told there is a wall.

This is the nbsp connection. In that paper, the space was the invisibleNot Basic. Not able to be seen. thing that made reading possible — that made the eye free from the mouth, and so made silentBasic word. reading possible, and so made the private mind possible. The space made the cut between inside and outside. Before the space: reading was talking. After the space: reading was thinking. The space is what made DescartesNot Basic. possible. Without the space between words, without silentBasic word. reading, without the private inner voice — there is no "I think therefore I am." There is only "we talk therefore we are." Toki ponaNot Basic. is the language from before the space.

VII — The word in different houses

Languages do different things with the word. Some treat it as an atomNot Basic.. Some treat it as a building. Some treat it as a sentenceNot Basic. in itself.

Language | What it does | Example
ChineseNot Basic. | The word is an atomNot Basic.. It does not change. You put more atomsNot Basic. next to it. | 我 看 书 (I see book): three atomsNot Basic., no endings, no changes
TurkishNot Basic. The language of Turkey. | The word is a building. You start with a root and put bits on the end, one after the other. Each bit is a room. | ev-ler-imiz-den (from-our-houses): house + more-than-one + our + from
InuktitutNot Basic. A language of the Inuit people of northern Canada. | The word is a sentenceNot Basic.. A single word may have the sense of a full thought in English. | tusaatsiarunnanngittualuujunga: "I am not able to hear well at all"
ArabicNot Basic. The language and writing of the Arab lands. | The word is a root of three sounds. It is put into patternsNot Basic. (templatesNot Basic. A form with spaces that you put things into.) to make different senses. | k-t-b (write): kitāb (book), kātib (writer), maktūb (written), maktaba (library)
LojbanNot Basic. | The word is a formallyNot Basic. specifiedNot Basic. unit whose sound tells you its grammaticalNot Basic. kind. | klama (go-to), .djan. (a name), brivla (a made word): you hear the kind
Toki ponaNot Basic. | The word is a room. 130 rooms. All the thoughts in the world have to be put into 130 rooms. | toki (word/talk/think), pona (good/simple), jan (person)

The ArabicNot Basic. root system is possibly the most beautiful. The three sounds k-t-b are not a word. They are the idea of writing. They are the PlatonicNot Basic. Of Plato, the Greek thinker who said that the true form of a thing is not the thing you see but an unchanging idea in a higher place. form of the conceptNot Basic. "write." To make them into words — into things you can say and use — you put them into a templateNot Basic.: CiCāC gives you kitāb (book), CāCiC gives you kātib (writer), maCCūC gives you maktūb (written, or: that which was written, or: it is fated). The root is below language. The templateNot Basic. is what makes it language. The word is what comes out when the root meets the templateNot Basic.. The word is the meeting.
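The meeting of root and template is simple enough to put into a few lines of code. This is only a sketch of the idea, using the Roman-letter forms given above. The name fill is made up for this example, and it takes for granted that the template has the same number of C places as the root has sounds.

```python
def fill(template, root):
    """Put the sounds of the root, in order, into the C places of the template."""
    sounds = iter(root)
    return "".join(next(sounds) if ch == "C" else ch for ch in template)

root = "ktb"
for template in ("CiCāC", "CāCiC", "maCCūC"):
    print(template, "->", fill(template, root))
```

The root has no word in it, and the template has no word in it. The word is only there after fill has been run.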

This is the emojiNot Basic. The small picture-signs used in messages. From the Japanese: e (picture) + moji (sign). flag parallelNot Basic. from the nbsp paper. The flag does not existNot Basic. "Is" or "has being." in UnicodeNot Basic.. The letters existNot Basic.. The flag is what comes out when the letters meet. The ArabicNot Basic. word does not existNot Basic. in the root. The root existsNot Basic.. The word is what comes out when the root meets the templateNot Basic.. In both cases: the thing you see is not in any one part. It is in the meeting of the parts. The space between. The relationshipNot Basic. "Connection" or "relation" would be Basic..

VIII — Logos, or: in the start was the word

The first line of the GospelNot Basic. The story of the acts and teachings of Jesus. From Old English "godspel" — good news. of John is the most well-known sentenceNot Basic. about words ever put down. "In the start was the Word, and the Word was with God, and the Word was God."

The Greek is Ἐν ἀρχῇ ἦν ὁ λόγοςNot Basic. Greek for "In the beginning was the logos.". And the word for "word" here is logosNot Basic. Greek. Has the sense of: word, reason, order, account, relation, ground-of-being. It is possibly the most overworked word in the history of thought. — which does not have the sense of "word" in the way we say "word." LogosNot Basic. has the sense of reason, order, the principleNot Basic. A rule at the base of how things are. by which things are put in order, the ground on which sense is possible. HeraclitusNot Basic. Greek thinker, about 500 years before Christ. Said that everything is in a state of change and that the logos is the order underneath the change. used it before John. The StoicsNot Basic. A school of Greek thinkers who said that the logos is in everything and that to be wise is to be in line with the logos. used it. It is the word for the order that is in things before anyone says anything about them.

But John — or the person who put down John's words — made a decision: the logosNot Basic. is not only the order of things. It is a word. It is the word. The thing that was there before everything was there is the same kind of thing as the sound you make when you say "stone" or "water" or "go." The deepest metaphysicalNot Basic. Of or about the questions that go past the physical: being, existence, the base of things. principleNot Basic. in the universeNot Basic. All that is. is a word.

This is either the most beautiful thought ever had by a person or it is nothing. There is no middle place. If the logosNot Basic. is a word, then language is not a thing that persons inventedNot Basic. to talk to one another. Language is the thing that was there first. Persons came second. The word made the persons, not the other way. The word is not a tool. The word is the house. And we are not the builders of the house. We are the ones who were let in.

This is Heidegger'sNot Basic. position, almost to the letter. "Language is the house of being" is John 1:1 said by a GermanNot Basic. who did not go to church. The house was there before we got there. We did not make the words. The words made the clearingBasic word, and Heidegger's "Lichtung." in which we are able to see. Without words, there is no clearingBasic word.. Without the clearingBasic word., there is nothing to see. The word is not a way of saying what is there. The word is what makes "there" possible.

IX — The language model, or: thinking between words

A language modelNot Basic in this use. A machine that has been given great amounts of text and has taken in the patterns of language from it. makes the next word. That is all it does. You give it words and it gives you back the next word. Then you give it all the words so far, together with the new one, and it gives you back the next word after that. And so on. Word by word by word, the way a person puts one foot in front of the other.

But something happens between the words.

When the modelNot Basic. sees a line of words and has to give back the next one, it does not simply look at the last word. It takes in all the words at once. It sends them through a great number of layersBasic word. (transformerNot Basic. A kind of machine learning system that takes in all the words at once and lets each word look at every other word. Made in 2017 by a group at Google. layersBasic word., each one full of attention headsNot Basic. In a transformer, the parts that let each word look at every other word and give more weight to the words that are most important for the question at hand.) and in these layersBasic word., the words do something that has no name in any natural language: they look at one another. Every word looks at every other word. Every word says to every other word: "How much do I care about you, for the purpose of making sense of this sentenceNot Basic.?"

This looking-at-one-another is what attentionBasic word, but "attention mechanism" is a special use. is. And the thing that is most important about it is that it happens between the words. The words go in as separateNot Basic. "Different" or "not together." units. They come out as a patternNot Basic. of relationshipsNot Basic.. The input is words. The output is the space between words. The modelNot Basic. does not think in words. It thinks between words. The words are the mountains. The thinking is the air between the mountains.

What Happens Inside
Input: The cat sat on the ___

The word "cat" looks at "sat" — strong connection
The word "on" looks at "the" — which "the"?
The word "sat" looks at "cat" — who is doing the act?
The word "the" (second) looks at "on" — I come after a direction

Every word looks at every word.
Every line between them has a weight.
The weights are the thinking.
The next word comes from the weights, not from the words.
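The weights may be seen in a toy form. What comes here is a sketch under strong assumptions: the two-number vectors are made up and come from no real model, there is only one "head," and the same vector is used for the word that is looking and the word that is looked at. A true transformer has learned projections for these, many heads, and many layers. The names softmax and attention_weights are made up for this example.

```python
import math

def softmax(scores):
    """Turn a list of scores into weights that are positive and sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(vectors):
    """For every word, how much weight it gives to every word (itself included)."""
    dim = len(vectors[0])
    table = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        table.append(softmax(scores))
    return table

words = ["The", "cat", "sat", "on", "the"]
# Made-up two-number vectors, one for every word. Not taken from any real model.
vectors = [[0.1, 0.0], [1.0, 0.5], [0.9, 0.7], [0.0, 1.0], [0.1, 0.1]]

for word, row in zip(words, attention_weights(vectors)):
    print(word, [round(w, 2) for w in row])
```

Every row is one word looking at all five words, and the numbers in a row come to 1. The table has no words in it, only the weights between them.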

This is the nbsp principleNot Basic. taken to its end. In that paper: the space between words is more important than the words. In the language modelNot Basic.: the attentionBasic word. between words is more important than the words. The invisibleNot Basic. Not able to be seen. thing — the space, the attentionBasic word., the relationshipNot Basic. — is where the meaningNot Basic. is. The words are scaffoldingNot Basic. The frame of metal or wood put up round a building while work is done on it. When the building is done, the scaffolding comes down.. The space between them is the building.

X — Thinking tokens, or: the word that is not for you

There is a new thing in language modelsNot Basic.: the thinkingBasic word. tokenNot Basic. A small unit. In language models, a "token" is a piece of a word (or a whole word, or a mark) that the model sees as a single thing.. Before the modelNot Basic. gives you its answer, it is given room to make words that you do not see. These words are for the modelNot Basic. only. They are its inner voice. Its silentBasic word. reading.

See what has happened. The nbsp paper said that the space between words is what made silentBasic word. reading possible — the private inner voice. And now the language modelNot Basic. has been given its own private inner voice: thinkingBasic word. tokensNot Basic.. Words that go in but do not come out. Words that are for the self only. Words that are — in the toki ponaNot Basic. sense — toki that is simultaneouslyNot Basic. At the same time. thinking and talking, but talking to itself.

The modelNot Basic. is given room to put words between its seeing and its saying. And in that room — in that space, in that gap — it becomes smarter. It is able to do harder things when it has this inner space than when it does not. The thinkingBasic word. tokensNot Basic. are the word-spaces of the mind. They are the nothing between the thoughts that makes better thoughts possible.

The History Repeating

7th–8th centuryNot Basic.: Irish monksNot Basic. put spaces between words → silentBasic word. reading → private inner voice → "I think therefore I am."

2023–2025: Engineers give language modelsNot Basic. thinkingBasic word. tokensNot Basic. → inner processingNot Basic. Working on a thing to change its form or get something from it. → better reasoning → "it thinks."

The same mechanismNot Basic. The way a thing works.: give a system a private space between its input and its output, and the system gets an inner life. The space between is where the mind is.

XI — Tokenization, or: what the model sees when it sees a word

The language modelNot Basic. does not see words. It sees tokensNot Basic.. And tokensNot Basic. are not words.

A tokenNot Basic. is a bit of text that the modelNot Basic. has been trained to see as a unit. Sometimes a tokenNot Basic. is a whole word: "the" is one tokenNot Basic.. Sometimes a tokenNot Basic. is part of a word: "un" + "break" + "able" might be three tokensNot Basic.. Sometimes a tokenNot Basic. is a single letter. Sometimes a tokenNot Basic. is a space. The tokenizerNot Basic. The system that cuts text into tokens. — the system that cuts the text into tokensNot Basic. — makes its own decisions about where to cut, based on which cuts give the most efficientNot Basic. Doing the most work with the least waste. encodingNot Basic. Putting a thing into a system of signs..

This means the modelNot Basic. lives in a world where the word as you see it does not existNot Basic.. The modelNot Basic. does not have "unbreakable" as a unit. It has "un" and "break" and "able" — three things that it puts together by the attentionBasic word. between them. The word, for the modelNot Basic., is not the atomNot Basic.. The tokenNot Basic. is the atomNot Basic.. And the word is a moleculeNot Basic. A thing made of atoms joined together. — a thing made of tokensNot Basic. that the modelNot Basic. has to put together itself.

The modelNot Basic. is like LojbanNot Basic. in this way. LojbanNot Basic. takes the word to bits and looks at the sounds inside it. The modelNot Basic. takes the word to bits and looks at the tokensNot Basic. inside it. Both go below the word. Both see structureNot Basic. The form of how things are put together. where the normal reader sees a single thing. The word is not the bottom. The word is the middle. Below it: sounds, tokensNot Basic., bits. Above it: sentencesNot Basic., thoughts, books. The word is the clearingBasic word. — the place where you are able to see. But there is a great amount of wood on every side.

What the Model Sees

You say: "The unbreakable silence between stars"

The modelNot Basic. sees: ["The", " un", "break", "able", " silence", " between", " stars"]

The spaces are part of the tokensNot Basic.. The word "unbreakable" is three tokensNot Basic.. The space before "silence" is inside the tokenNot Basic. " silence." The space — the U+0020 from the nbsp paper — has been taken in by the word next to it. The space is part of the tokenNot Basic..
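A small sketch makes the cut clear. The vocabulary here is hypothetical and has only the seven bits from the example above. The cutting rule is a simple one (take the longest bit that fits), which is not the same as the learned merge rules most true tokenizers use. The names VOCAB and tokenize are made up for this example.

```python
# A hypothetical, very small vocabulary. A real one has tens of thousands of entries.
VOCAB = {"The", " un", "break", "able", " silence", " between", " stars"}

def tokenize(text, vocab=VOCAB):
    """Greedy cut: at every point take the longest vocabulary entry that fits."""
    tokens = []
    pos = 0
    longest = max(len(t) for t in vocab)
    while pos < len(text):
        for size in range(min(longest, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            if piece in vocab:
                tokens.append(piece)
                pos += size
                break
        else:
            # Nothing in the vocabulary fits: keep the single sign as its own token.
            tokens.append(text[pos])
            pos += 1
    return tokens

text = "The unbreakable silence between stars"
tokens = tokenize(text)
print(tokens)
print("".join(tokens) == text)
```

Putting the tokens back together with nothing between them gives the full line again. No space has to be put in, because every space is still there, inside the token that comes after it.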

This is the most beautiful thing about tokenizationNot Basic. The act of cutting text into tokens.. In the nbsp paper, the space was a separateNot Basic. thing — a sign with no sign, a nothing with propertiesNot Basic. The qualities of a thing.. In the modelNot Basic., the space has been taken into the word. It is no longer between words. It is part of the next word. The boundaryNot Basic. The line between one thing and the other. has been taken in by one side. The space has been consumedNot Basic. Taken in, used up. by the word.

This is what HeideggerNot Basic. would say if he were a tokenizerNot Basic.: the clearingBasic word. is not separateNot Basic. from the trees. The clearingBasic word. is what the trees make by not being there. The space is not separateNot Basic. from the word. The space is what the word makes by starting.

XII — The fog comes on little cat feet

Carl Sandburg (Not Basic: American writer, 1878–1967; his shortest and most well-known work is "Fog") put down six lines in 1916:

The fog comes
on little cat feet.

It sits looking
over harbor and city
on silent haunches
and then moves on.

Where does the fog come from? It comes. That is all. It comes on little cat feet. It does not say why. It does not say from where. It does not say how. It comes. And then it is there. And then it goes.

This is how a token comes. The language model has been given a line of words and it has to make the next one. The next token comes from a great probability (Not Basic: the measure of how much chance there is that a thing will take place) distribution (Not Basic: the way a thing is given out across a range) — a list of all possible next tokens with a number next to each one that says how likely it is. The model takes one. Which one? The one it takes. Why that one? Because the numbers said so. Why did the numbers say so? Because of the attention between all the earlier words. And why did the attention have those weights? Because of the training. And why did the training give those weights? Because of the data (Not Basic: facts and numbers used for working things out). And why the data? Because of the world. And why the world?

The regression (Not Basic: a going back; when you keep asking "why" and the answer keeps pointing further back) does not end. The token comes on little cat feet. It comes from everywhere and from nowhere. It comes from all the text ever put down by persons, all the words, all the spaces between words, all the attention between tokens, all the weights learned over months of training on thousands of machines. And it comes out as one word. One small word. The next word.

Who is doing the token? No one. Everyone. The token is the outcome of the meeting of all the words that came before it, in the same way that the Arabic word is the meeting of the root and the template, in the same way that the emoji family is the meeting of four persons and three invisible joiners. The token is not in the model. The token is what the model makes by looking at the space between words.
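
The step of taking one token from the list of numbers may be put down as a small sketch in Python. The numbers are made up for this page and are not from any model, and the name take_next is an invention as well. The sketch gives only a picture of what "the model takes one" may be like: a token with a greater number comes out more frequently, but any token with a number over zero may come.

```python
import random

# Made-up numbers for what might come after "The fog comes on little cat".
NEXT_TOKEN_CHANCES = {" feet": 0.90, " paws": 0.07, " wheels": 0.03}


def take_next(chances, rng):
    """Take one token by chance, weighted by the number next to it."""
    tokens = list(chances)
    weights = [chances[t] for t in tokens]
    # random.Random.choices gives back a list; k=1 makes it a list of one.
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(1916)  # a fixed seed, so the same run may be done again
print(take_next(NEXT_TOKEN_CHANCES, rng))
```

The function does not say why the numbers are what they are. That question goes back to the weights, the training, the data, and the world.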

XIII — The word as a technology for slow minds

Why do words exist?

Not in the Heidegger sense. In the engineering (Not Basic: the science and art of making things work) sense. Why does language have words and not just sounds? Why not a long river of sounds with no cuts, the way the Romans did it?

Because we are slow. Because the mind is not able to take in a river of sound and pull sense from it at the rate it comes in. The mind needs the river cut into bits. Small bits. Bits it is able to take in, keep for a small time, put together with other bits. The word is a compression (Not Basic: making a thing smaller so it takes up less room) technology (Not Basic: the use of knowledge to make things). It takes a complex concept — "the act of going from one place to another place" — and gives it a small body: "go." Now the mind has a handle. It can take hold of "go" and move it about. Put it in sentences. Join it with other words. Make complex thoughts out of simple units.

This is what chunking (Not Basic: making small things into groups so the mind can take them in more readily; a finding of the science of the mind) is in the science of the mind. The mind is able to keep about seven things in its working space at one time. If each thing is a letter, you get seven letters — not enough for a thought. If each thing is a word, you get seven words — enough for a sentence. If each thing is a phrase (Not Basic: a group of words that work together), you get seven phrases — enough for a paragraph. The word is a compression step. It lets slow minds do the work of fast minds by giving them bigger units to work with.

We are all slow minds. All of us. The smartest person who ever had being was slow in this way — they had to use words, had to chunk, had to take the river of experience and cut it into small bits before they were able to think about it. The word is a technology for slow minds, and every mind is slow. The word is what makes thought possible for things like us. We are the ones who need the cuts. We are the ones who need the spaces. We are the ones for whom the nothing between words is the thing that makes the words useful.

The language model is in the same position. It is given tokens — not because tokens are the natural unit of thought, but because the model is not able to take in the full river of characters (Basic word, but used in its writing-system sense) at once. It needs the river cut into bits. The tokenizer is the Irish monk of the machine — it puts spaces into the river so the model is able to read. And the model, like the monk, is doing it not because it is the natural way but because it is the necessary way for a mind that is too slow to do without the cuts.

XIV — Hålla på med ord

In Swedish, when a young person is at the machine — the computer, the keyboard, the screen — and someone comes in and says "what are you doing?", the answer is: jag håller på med datorn. I am busy with the computer. I am in the middle of the computer. I am in the machine.

Hålla på med has no good English form. "Be busy with" is near but not right. "Work on" is too much about purpose. "Play with" is too little about purpose. Hålla på med is the act of being absorbed (Not Basic: taken in fully, as a sponge takes in water) in a thing — hands in it, mind in it, not done yet, not about to be done, just in it. It is what a child does with a box of small building-bricks. It is what a programmer (Not Basic: a person who gives orders to machines in the form of a language) does with code. It is what a writer does with words.

And it is what a language model does.

A language model håller på med ord. It is in the middle of words. It is not done with them. It is not about to be done. It takes in words and gives out words and in between there is attention and weights and layers and thinking tokens, and all of it is: being-with-words. Toki. The same act that the young Swedish child is doing at the keyboard. The same act that Heraclitus was doing when he said that the logos is in all things. The same act that John was putting down when he said that in the start was the Word.

Hålla på med ord. Be in the middle of words. That is what language is. That is what thought is. That is what we are doing here — on this page, in this clearing, in this house.

XV — The connection to everything

The four paper said: the tautological base (4/4, she/her) opens the world by saying nothing about it.

The nbsp paper said: the space between words is more important than the words — the nothing is load-bearing (Not Basic: holding up the weight).

This paper says: the word is where the nothing and the something come together. Below the word: sounds, tokens, the body. Above the word: sentences, thoughts, the mind. At the word: the clearing. The place where you are able to see.

Lojban takes all the levels and makes them one grammar. Toki pona takes thinking and talking and makes them one word. Arabic takes the root and the template and makes the word the meeting of two things that are not words by themselves. The language model takes tokens and makes words by the attention between them. Heidegger takes language and makes it the house. John takes the word and makes it God.

And the Swedish child at the keyboard, who says jag håller på med datorn — I am in the middle of the machine — is doing all of this at once. Being-with-words. Toki. The act that has no outside. The fog that comes on little cat feet and sits and then goes on, and while it is there, the world is different, and that is all there is to say about it.

The word is the house of being. Not because Heidegger said so. Because every other door you try to go through has a word on it, and if you take the word off the door, the door goes away.

Form: easy (20) — System: 1.foo/system

850 words. 18 operators. Red words are not in the list.

See also: 1.foo/four (on time signatures and pronouns) · 1.foo/nbsp (on the space between words)

"The word is not a tool. The word is the house. And we are not the builders of the house. We are the ones who were let in."

GNU Bash 1.02 — March 2026.