Compositional natural-language generation for Thai and English · driven entirely by symbolic frames + lexicon · every output ships with an audit trail.
No pre-trained weights. No LLM API call. No neural model in the runtime path. The composer reads a frame definition, picks a register-aware construction, fills slots, and applies Thai sentence-final particles + classifiers + English subject-verb agreement — all from rules.
All 12 examples below were precomputed by running the actual composer module offline. No backend call is made from this page.
Run via compose(frame_id, bindings, lang, register, ...) — output, construction id, and confidence shown verbatim.
Pure data-flow · no learned parameters anywhere along this path.
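A minimal sketch of the call shape described above, assuming a toy frame table: the frame names, slot names, and constructions here are invented for illustration and are not the shipped 512-frame library.

```python
# Illustrative only: frame definition -> register-aware construction pick
# -> slot fill, in the spirit of compose(frame_id, bindings, lang, register).
FRAMES = {
    "GIVING": {  # hypothetical frame, not from the real library
        "core": ["Donor", "Recipient", "Theme"],
        "constructions": {
            "en": {
                "neutral": "{Donor} gives {Theme} to {Recipient}.",
                "formal": "{Donor} presents {Theme} to {Recipient}.",
            },
        },
    },
}

def compose(frame_id, bindings, lang="en", register="neutral"):
    frame = FRAMES[frame_id]
    construction = frame["constructions"][lang][register]  # register-aware pick
    return construction.format(**bindings)                  # fill slots from bindings

print(compose("GIVING", {"Donor": "Ann", "Theme": "a book", "Recipient": "Ben"}))
# Ann gives a book to Ben.
print(compose("GIVING", {"Donor": "Ann", "Theme": "a book", "Recipient": "Ben"},
              register="formal"))
# Ann presents a book to Ben.
```

Because the construction is chosen by a plain dictionary lookup and the slots are filled by plain substitution, there is nothing stochastic anywhere in the path.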
For multi-sentence output (paragraphs), compose_paragraph() wraps the
above per-sentence pipeline, then layers anaphora resolution (adjacent-sentence pronoun replacement)
and discourse markers (ดังนั้น · แล้ว · So · Then · But) inferred from frame relations.
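The paragraph layering can be sketched as follows. This is a simplified stand-in, not the real compose_paragraph(): the function signature, the pronoun logic, and the relation-to-marker table are assumptions for illustration.

```python
# Hypothetical sketch: per-sentence strings come in; adjacent-sentence
# pronoun replacement and a discourse marker inferred from the frame
# relation are layered on top.
def resolve_anaphora(sentences, subject, pronoun):
    """Replace repeats of `subject` after its first mention with `pronoun`."""
    out, seen = [], False
    for s in sentences:
        if seen and s.startswith(subject + " "):
            s = pronoun + s[len(subject):]
        if subject in s:
            seen = True
        out.append(s)
    return out

def compose_paragraph(sentences, relation, subject, pronoun="She"):
    markers = {"result": "So", "sequence": "Then", "contrast": "But"}
    sents = resolve_anaphora(sentences, subject, pronoun)
    if relation in markers and len(sents) > 1:
        sents[1] = markers[relation] + " " + sents[1][0].lower() + sents[1][1:]
    return " ".join(sents)

print(compose_paragraph(
    ["Mia opened the shop.", "Mia sold ten books."],
    relation="sequence", subject="Mia"))
# Mia opened the shop. Then she sold ten books.
```

The real module presumably works over structured sentence plans rather than strings, but the layering order (sentences first, then anaphora, then markers) is the point being illustrated.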
The composer returns a trace[] listing the frame loaded, the lexical
unit picked, the construction selected, every slot fill, and any post-processing rule applied.
You can replay the decision path. An LLM cannot.
Given the same frame + bindings + register, the composer returns the same surface form, byte-identical.
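The trace-plus-determinism property can be demonstrated with a toy composer. Everything here (trace entry format, construction id) is invented for the sketch; only the property itself comes from the description above.

```python
# Sketch: a composer that records every decision in a trace[] and is
# deterministic by construction, so same inputs give byte-identical output.
def compose_with_trace(frame_id, bindings):
    trace = [f"frame:{frame_id}"]
    template = "{Agent} eats {Food}."            # pretend construction lookup
    trace.append("construction:EAT_SVO_EN")      # invented construction id
    surface = template
    for slot, value in sorted(bindings.items()): # sorted: order-stable fills
        surface = surface.replace("{" + slot + "}", value)
        trace.append(f"slot:{slot}={value}")
    return surface, trace

out1, trace1 = compose_with_trace("EATING", {"Agent": "The cat", "Food": "fish"})
out2, trace2 = compose_with_trace("EATING", {"Agent": "The cat", "Food": "fish"})
assert out1 == out2 and trace1 == trace2   # replayable and byte-identical
print(out1)    # The cat eats fish.
print(trace1)
```

Every entry in the trace corresponds to one decision, so replaying the path is just reading the list top to bottom.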
Missing CORE frame elements show up as __MISSING_X__ placeholders
with a warning — the composer never invents content.
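A hedged sketch of that missing-core-element behaviour. The warning code and frame element names below are illustrative, not the composer's actual diagnostics.

```python
# Unbound CORE frame elements surface as __MISSING_X__ plus a warning,
# instead of invented content.
def fill_slots(core_elements, template, bindings):
    warnings = []
    surface = template
    for fe in core_elements:
        if fe in bindings:
            surface = surface.replace("{" + fe + "}", bindings[fe])
        else:
            surface = surface.replace("{" + fe + "}",
                                      f"__MISSING_{fe.upper()}__")
            warnings.append(f"W_MISSING_CORE: {fe}")  # invented warning code
    return surface, warnings

surface, warns = fill_slots(
    ["Agent", "Theme"], "{Agent} delivers {Theme}.", {"Agent": "The courier"})
print(surface)   # The courier delivers __MISSING_THEME__.
print(warns)     # ['W_MISSING_CORE: Theme']
```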
ครับ / ค่ะ politeness, ไหม / หน่อย question and request softeners, and 38 lexical classifiers (เล่ม / คัน / คน / ตัว / ใบ ...) are first-class concepts — not bolted-on filters over English-trained weights.
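As a sketch of what "first-class" means here, a politeness particle can be a rule over register and speaker gender rather than a post-hoc filter. The function name and rule set are simplified assumptions.

```python
# Illustrative only: register-aware Thai sentence-final particle
# attachment. Real pragmatics (ไหม, หน่อย, mixed-register text) is richer.
def add_polite_particle(sentence, register, speaker_gender):
    if register != "polite":
        return sentence
    particle = "ครับ" if speaker_gender == "m" else "ค่ะ"
    return sentence + particle

print(add_polite_particle("ขอบคุณ", "polite", "f"))   # ขอบคุณค่ะ
print(add_polite_particle("ขอบคุณ", "casual", "f"))   # ขอบคุณ
```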
The composer is pure Python + JSON files. No external API calls, no telemetry, no tokens leaving the host. The same module runs identically online and offline.
compose() is a few hundred microseconds of dictionary lookups and regex substitutions per sentence. There is no per-token billing because no tokens are being inferred — only patterns being filled.
We trade open-domain fluency for trace + privacy + determinism. If a frame for a given event type has not been authored yet, the composer cannot say it. That is a feature, not a bug.
What the composer does not do well yet. We surface this so you can decide where to deploy and where not to.
Across the 512-frame library, 18 frames need a Thai-native review pass before being safe for production Thai output. They cover fragile pragmatics (REQUEST scaling, register-coupling around politeness, gender-marked particles in mixed-register paragraphs).
When the lexicon picker and the construction picker disagree on register (e.g. only a casual LU is available for a frame whose constructions are formal-leaning), the composer falls back gracefully but emits a warning. This is honest behaviour rather than a hallucinated smoothing — some sentence cards below show the warning verbatim.
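The fallback shape can be sketched like this. The warning code and data layout are invented for illustration; only the behaviour (compose anyway, warn loudly) comes from the description above.

```python
# Sketch: when no lexical unit matches the wanted register, fall back to
# what exists and emit a warning instead of silently smoothing.
def pick_lu(available_lus, wanted_register):
    warnings = []
    exact = [lu for lu in available_lus if lu["register"] == wanted_register]
    if exact:
        return exact[0], warnings
    fallback = available_lus[0]
    warnings.append(
        f"W_REGISTER_MISMATCH: wanted {wanted_register}, "
        f"only {fallback['register']} LU available")   # invented warning code
    return fallback, warnings

# Only a casual LU exists for a formal-leaning construction:
lu, warns = pick_lu([{"lemma": "กิน", "register": "casual"}], "formal")
print(lu["lemma"], warns)
```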
This page demonstrates the generation half (frame → sentence) only. The reverse direction — free-form Thai/English sentence → frame + bindings — is currently a scaffold, not a benchmarked module. End-to-end conversational use therefore still leans on the wider Onion brain stack, not the composer alone.
The Thai classifier system covers ~150-200 nouns harvested from lex_quantity (38 classifier lexemes with example noun lists). Nouns outside this set fall through gracefully (no rewrite, no fabricated classifier) rather than guess.
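A sketch of that graceful fallthrough, using a three-entry classifier table invented for illustration (the real lex_quantity data has 38 classifier lexemes):

```python
# Known nouns get the Thai "NOUN NUM CLF" ordering; unknown nouns pass
# through without a fabricated classifier.
CLASSIFIERS = {
    "หนังสือ": "เล่ม",   # books
    "รถ": "คัน",         # vehicles
    "แมว": "ตัว",        # animals
}

def quantify(noun, number):
    clf = CLASSIFIERS.get(noun)
    if clf is None:
        return f"{noun} {number}"        # fall through: no guessed classifier
    return f"{noun} {number} {clf}"      # Thai order: noun, number, classifier

print(quantify("หนังสือ", 3))    # หนังสือ 3 เล่ม
print(quantify("ความฝัน", 2))    # ความฝัน 2  (not in table, no classifier)
```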
The agreement layer rewrites the first verb after the subject NP for ~30 common verbs. Auxiliary chains, second clauses, and modals (will / can / must) are intentionally left invariant. Out-of-table verbs pass through unchanged rather than being guessed.
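The scope of the agreement layer can be sketched as below. The verb table and helper signature are illustrative, not the shipped ~30-verb table.

```python
# Rewrite only the first verb after the subject NP, from a small lookup
# table; modals and out-of-table verbs pass through unchanged.
THIRD_SG = {"go": "goes", "have": "has", "eat": "eats", "run": "runs"}
MODALS = {"will", "can", "must", "may", "should"}

def agree(sentence, subject, third_person_singular):
    rest = sentence[len(subject):].lstrip()
    first_word = rest.split(" ", 1)[0]
    if not third_person_singular or first_word in MODALS:
        return sentence                  # modals / plural subjects: invariant
    new = THIRD_SG.get(first_word)
    if new is None:
        return sentence                  # out-of-table verb: pass through
    return subject + " " + new + rest[len(first_word):]

print(agree("The dog run home", "The dog", True))       # The dog runs home
print(agree("The dog will run home", "The dog", True))  # unchanged (modal)
print(agree("The dog sprint home", "The dog", True))    # unchanged (out-of-table)
```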
Inserting an ASCII digit ("3") into a Thai construction slot raises a "lang mismatch" warning that currently dampens the confidence score even though the surface output is correct. This is on the Phase 2.x list as W_NUM_LANG_PRIOR. The Thai classifier examples below show this honestly — correct output, conservative score.
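The dampening behaviour can be sketched as follows. The multiplier and base confidence are invented numbers; only the shape (warn, dampen the score, leave the surface form alone) comes from the description above.

```python
# Sketch of the W_NUM_LANG_PRIOR issue: an ASCII digit in a Thai slot
# triggers a "lang mismatch" warning that dampens confidence even though
# the surface output is correct.
def score(slot_value, slot_lang, base_confidence=0.95):
    warnings = []
    confidence = base_confidence
    if slot_lang == "th" and any(c.isascii() and c.isdigit() for c in slot_value):
        warnings.append("W_NUM_LANG_PRIOR: lang mismatch (ASCII digit in th slot)")
        confidence *= 0.8   # conservative dampening; output itself is unchanged
    return confidence, warnings

conf, warns = score("3", "th")
print(round(conf, 2), warns)   # 0.76 plus the warning
```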