RFC-0026: Oceans Model as calculus — pre-backend normalization, preservation theorem, formal grammar

Status: Draft (Exploration)
Author: Mark Truluck mark.truluck@cogiton.com
Created: 2026-05-17
Companion to: RFC-0025 (the executable quality-remediation companion)

Exploration framing — read this first. This RFC opens a research thread. There is no execution commitment, no timeline, no implementation milestones. The questions raised may sit dormant indefinitely. The purpose is to make the architectural question legible so a future contributor — or a future Claude — can pick it up without re-deriving the analysis. If you came here looking for “what we’re doing,” look at RFC-0025 or the roadmap. This RFC is “what we are deliberately not deciding yet.”

Origin: the persona-critique exchange

The full three-persona review that spawned this RFC and its companions (RFC-0025, RFC-0027) is recorded in the Origin section of RFC-0025. The relevant review for this RFC is Review 2 — John Backus (in the spirit of the Turing lecture), reproduced here in full so this RFC stands on its own:

I have read the Oceans Model description. It is a clean idea expressed in two sentences: native code is the ocean, system blocks are islands; the transpiler touches only the islands. That is a denotationally honest decomposition. It admits, plainly, that you are not a language — you are a code generator embedded inside a host language, and the host’s semantics are someone else’s problem.

This is the right humility. The compiler community spent decades pretending DSLs were standalone universes. The Oceans Model says: no, we are a substring rewriter with awareness of structure. RFC-0024 (removing @@import in favor of native import syntax pass-through) is the consequence of taking this seriously. Good.

But the Model is not yet a theory. A theory would say:

For each system block S, framec defines a function F_lang(S) → string, such that the identity P + S + Q → P + F_lang(S) + Q preserves the host language’s well-formedness on the spliced result.

That is your proof obligation. Right now F_lang is defined by 17 hand-written backends with ~127 match arms each, no formal specification of what is preserved, no algebraic laws between F_python and F_java. They are siblings with no shared theorem.

The pipeline is six stages. Stages 0–3 are an analysis pyramid (parse, scope, validate). Stage 4 lowers FrameAst to CodegenNode, your language-neutral IR. This is the only stage where algebra is possible. Below it, you fan out to 17 backends and the symmetry is lost. So the question is: how much of the per-target divergence could be expressed as CodegenNode rewrites applied before backend dispatch? Today’s backends differ in their match arms for boilerplate (Java factories, C++ coroutines, GDScript class wrappers). Some of that is irreducible. Most of it is not — it is target-language idiom that could be a normalized lowering pass plus thin emit.

Your scanner and parser are hand-rolled. That is a choice. The BNF I gave the world was useful precisely because it made these things mechanical. Frame’s surface grammar is small enough that an explicit grammar file + generator would be more honest than 1,228 lines of hand-walked AST. You’d also catch the kind of inline if/then/else asymmetries listed in your bug log at the grammar level, not the codegen level.

The Oceans Model is an idea worth defending. Now make it a calculus.

Summary

Three loosely-coupled research directions, any of which could be pursued independently. None are commitments.

The preservation question. Formalize what F_lang preserves and make the property mechanically checkable, at least under property-based testing.
Pre-backend normalization. Audit the 17 backends for per-target divergence that is idiom rather than constraint, and lift that into normalization passes on CodegenNode applied before backend dispatch.
Formal grammar. Replace the hand-walked AST with an explicit grammar file (BNF or PEG) and a generated parser.

Motivation

The Oceans Model is a useful framing, but the glossary calibration that lands alongside this RFC is honest about its status:

The Oceans Model is a pragmatic decomposition — the transpiler touches @@system blocks and passes through everything else verbatim. It is architectural humility, not a theoretical breakthrough.

Backus’s critique is that “we touch the islands, we pass through the ocean” is a behavior, not a theory. A theory would tell you something the behavior does not: for example, under what conditions the splice produces well-formed host-language code. The behavior says “we hope so”; the theory would prove it (or pin down exactly when it doesn’t).

The cost of having no theory is paid in two places:

Per-backend divergence has no upper bound. Each of the 17 backends is a separate hand-written program. There is no invariant they share beyond “implements LanguageBackend::emit.” When a regression hits one backend, it is almost certainly hitting more — but the only way to find out is to run them all. (This RFC’s companion RFC-0027 addresses the symptom; this RFC addresses the underlying lack of invariant.)
Idiom lives in the wrong place. Backends emit boilerplate that, if expressed as an IR rewrite, would be written once and cost zero per-backend. The work is duplicated because the IR doesn’t carry enough structure to make the rewrite expressible pre-fan-out.

Direction 1 — The preservation question

Statement

For each target language lang ∈ {17 backends}, framec defines an emission function F_lang : FrameAst → String.

For an input file decomposed by the segmenter into prolog + [system_block₁, …, system_blockₙ] + epilog, the spliced output is

prolog + F_lang(S₁) + … + F_lang(Sₙ) + epilog

The preservation property is: *if `prolog + Sᵢ_native_skeleton

epilog would have been well-formed langcode (had the system blocks been replaced by minimal valid native stubs), then the spliced output is well-formedlang` code.*

This is a useful framing because it factors out the question Frame authors actually have (“will my host file compile?”) into two independent obligations:

The author ensures the ocean (prolog/epilog) is well-formed host code with stubs in place of systems.
framec ensures F_lang(S) is a well-formed host-language declaration at that splice point.

Why it’s hard to prove

“Well-formed” is host-language-specific. No general checker exists across 17 languages. Each backend would need its own validity oracle (the host compiler, or a parser).
F_lang is hand-written and large. Proving the property by direct inspection is infeasible.
The property as stated isn’t even quite right — backends emit imports, package declarations, etc., which interact with prolog content in nontrivial ways (cf. RFC-0022.1’s Java fully-qualified- types decision, which exists specifically to reduce that coupling).

Tractable subset: property-based testing

Without a proof, we can still get a check:

Generate random valid Frame sources (sized fuzz already exists in framec-test-env).
For each, compile to all 17 backends.
Pipe each result to the host compiler (or its closest in-process equivalent: cc, rustc --emit metadata, tsc --noEmit, etc.).
A failure is either a generator bug or an F_lang bug.

This is exactly what the matrix is doing today, end-to-end with runtime assertions. The proposal here is narrower and cheaper: compile-only property testing inside the framec repo, against a much larger generated corpus than the matrix can run. Open question: is the compile-only check cheap enough at fuzz scale to be useful? (GDScript and Erlang are the practical bottlenecks.)

Open questions for Direction 1

Is F_lang even total? (Some Frame programs may not have a well-defined lowering for some targets — Erlang’s message-passing constraints, for example. RFC-0026.1 could formalize a partiality predicate.)
How do we treat backends that auto-emit additional declarations (imports, helper classes)? They violate the stated property as written; the statement needs refinement.
Is the right framing actually denotational (mapping Frame semantics to host semantics) rather than syntactic (well-formedness)? The latter is weaker but mechanically checkable; the former is what users actually care about.

Direction 2 — Pre-backend normalization

Statement

Examine the 17 backend modules in framec/src/frame_c/compiler/codegen/backends/ and the 17 <lang>_system/ modules side-by-side. For each emission decision that varies across backends, classify:

Genuine constraint — the language enforces this (Java filename must match public class; Go field must capitalize for export; Erlang has no early return). The backend has no choice.
Idiom — the backend made a stylistic choice (factory method naming; whether to inline a small helper; how to format match arms). Could have been done differently.
Common pattern, accidentally duplicated — every backend does effectively the same thing (sometimes verbatim, sometimes with trivial renaming). This is the rewrite target.

The hypothesis is that the third category is large. The work would be to move it into normalization passes on CodegenNode that run once, before fan-out, leaving each backend with only the first two categories.

Why it might be wrong

The backends may be already so divergent that “common pattern” is rare. Hand-written code under pressure tends to optimize locally; what looks duplicated may have ten subtly different variants. A spike would tell.
Some of what looks like idiom is actually constraint that the reviewer doesn’t know about. A normalization pass could break a backend without the author noticing until the matrix runs.
Adding a normalization layer is a tax on every change to CodegenNode. If the lift saves less than the tax costs, net negative.

Tractable first step

Pick two backends with high apparent similarity (Python and Ruby? JS and TS?). Diff their backend modules. Quantify the genuinely common code paths. If the number is large (>30% by LOC), the hypothesis has support. If small, the hypothesis is wrong and the work doesn’t proceed.

Open questions for Direction 2

What’s the right shape of the normalization IR? Multiple CodegenNode passes, or a separate post-CodegenNode IR layer?
Does normalization compose with the body_closer and native_region_scanner modules, both of which already do per-language work post-codegen?
How do we keep the normalized form debuggable? Once a backend emits from a rewritten IR, the connection back to the user’s Frame source is one layer further away.

Direction 3 — Formal grammar

Statement

Replace framec/src/frame_c/compiler/parser/ and the hand-walked AST construction with a grammar file (BNF or PEG) plus a generated parser (e.g. via lalrpop, pest, or chumsky).

Why it might be worth it

Frame’s surface grammar is small. The whole language fits in a few pages.
A grammar file is the spec. Today the spec is implicit in frame_language.md plus 1,200 lines of parser code, and they drift. A grammar file is the single source of truth.
Several bugs in _scratch/FRAMEC_BUGS.md are parse-level asymmetries (inline if/then/else form differences across bodies, comment-handling edge cases). These would be impossible by construction if the grammar were generated.

Why it might not be worth it

The hand-written parser produces excellent error messages tailored to Frame’s idioms. Generated parsers often give worse messages, and Frame’s UX has been a deliberate investment.
Migrating off the hand-written parser is a big change with no user-visible benefit on day one. It only pays back over time as the grammar evolves.
The bug-prevention argument is real but narrow. Most framec bugs are codegen bugs, not parser bugs.

Tractable first step

Write the grammar in lalrpop or chumsky as an external spec — not wired into the build, just committed alongside the hand-written parser, with a test that asserts both accept the same corpus. If the grammar stays in sync for six months on its own and is genuinely useful as documentation, then consider migrating the build to it.

Open questions for Direction 3

Which parser generator? lalrpop (LALR(1), well-established but rigid), pest (PEG, flexible but performance-sensitive), chumsky (combinator, ergonomic but newer)?
Can the generated parser preserve Frame’s spans-and-comments fidelity? framec carries comment annotations through the AST for emission; some generators lose this.
Do we lose the segmenter’s “I don’t know how to parse the ocean, I’m just keeping track of { } depth” trick? That trick is load-bearing — it’s what makes the Oceans Model work — and a generated parser may not accommodate it.

Drawbacks

This RFC’s existence is itself a small drawback: it puts a research thread in front of contributors who may interpret it as roadmap. The opening framing tries hard to prevent that misreading. If the framing fails and someone shows up planning to “start RFC-0026 work” without scoping it down to a tractable spike, the RFC has done damage. Mitigation: this is also documented in docs/glossary.md’s Calibration note.

Unresolved questions

The “Open questions” subsections under each Direction are the unresolved questions. The meta-unresolved question for this RFC as a whole is: which of the three Directions, if any, has enough return-on-investment to justify a spike? That question is left open deliberately. A future maintainer with bandwidth — and ideally a specific failure mode that one of the Directions would fix — is in a better position to answer than we are today.

References

RFC-0025 — the executable companion (quality remediation: structured errors + typed payload).
RFC-0027 — in-tree snapshot tests (the symptomatic-relief companion: addresses the consequence of having no inter-backend invariant, while this RFC addresses the cause).
RFC-0024 — Oceans Model practical consequence (removing @@import).
Glossary § Oceans Model — includes the calibration note.
John Backus, “Can Programming Be Liberated from the von Neumann Style?”, 1977 Turing Award lecture — the methodological frame the Review 2 persona was channeling.
_scratch/roadmap.md — task #432 (RFC-0026 exploration thread, no execution commitment).