RFC-0028: In-process framec API — utility, callers, and IDE integration paths

  • Status: Draft (Forward-looking)
  • Author: Mark Truluck <mark.truluck@cogiton.com>
  • Created: 2026-05-18
  • Closes (supersedes): Roadmap #171 — “In-process framec API (large refactor)”

Framing — read this first. Roadmap entry #171 framed framec as “CLI-subprocess today, needs in-process API.” That framing is stale: the in-process API was always there and is in active use. What’s missing is adoption — callers that still fork+exec the CLI binary out of historical habit, and the green-field integrations (IDE, REPL, playground) that would build on the existing library. This RFC documents what the API is, what it already does, what value remains untapped, and what each future adoption path is worth. It is not an execution plan for a single project — it’s the call-shop catalog so future work picks the right caller to convert at the right time.

Summary

The framec crate already exposes compile_module(&str, TargetLanguage) -> Result<String, RunError> from framec::frame_c::compiler. This in-process API is invoked directly by:

  • the RFC-0027 snapshot tests (tests/common/mod.rs, 204 invocations per cargo test run as of 2026-05-18)
  • internal validator / pipeline tests (~400 unit tests)
  • the CLI binary framec, whose 3-line main.rs wraps cli::run(); that function parses argv and ultimately calls the same compile_module

The “framec is CLI-subprocess” framing applies only to callers outside the framec crate — the matrix test harness (shell scripts in framec-test-env), the doc-sample validator (scripts/validate_doc_samples.py), and any hypothetical IDE integration that doesn’t yet exist.

Each external caller has a different cost/benefit profile for conversion to in-process. This RFC enumerates them and frames when each conversion is worth doing.

What the API already is

pub fn compile_module(
    content_str: &str,
    lang: TargetLanguage,
) -> Result<String, RunError>
  • Takes Frame source text + a target language enum
  • Runs the full pipeline (Segmenter → Lexer → Parser → Arcanum → Validator → Codegen → Backend → Assembler)
  • Returns either generated target code as a String, or a typed RunError with code + message + optional source location

Plus compile_module_with_path(...) (variant carrying a path for diagnostics) and the lower-level pipeline::compile_module(...) (returns a richer CompileResult with stage-specific data).

Already callable from any Rust crate that adds framec as a dependency. No new design is needed for the basic compile path.
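
A minimal sketch of what a caller looks like. To stay self-contained, the block defines local stand-ins for TargetLanguage, RunError, and compile_module. Their variant names, field names, and the "E000" code are assumptions modeled on the description above (code, message, optional location), not the crate's real definitions; a real caller would import the real items from the framec crate instead.

```rust
// Stand-ins for the real framec types. Names and fields are assumptions
// for illustration; a real caller would import them from the framec crate.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TargetLanguage {
    Python, // hypothetical variant name
    Rust,   // hypothetical variant name
}

#[derive(Debug, PartialEq)]
pub struct RunError {
    pub code: String,                     // hypothetical value, e.g. "E000"
    pub message: String,
    pub location: Option<(usize, usize)>, // (line, column), assumed shape
}

// Stub with the same signature as the documented entry point.
// It does not compile Frame; it only mimics the Ok/Err contract.
pub fn compile_module(content_str: &str, lang: TargetLanguage) -> Result<String, RunError> {
    if content_str.trim().is_empty() {
        return Err(RunError {
            code: "E000".to_string(),
            message: "empty source".to_string(),
            location: None,
        });
    }
    Ok(format!("// generated for {:?}\n{}", lang, content_str))
}

// What a caller does with the result: keep the code or render the error.
pub fn compile_or_report(source: &str, lang: TargetLanguage) -> String {
    match compile_module(source, lang) {
        Ok(code) => code,
        Err(e) => match e.location {
            Some((line, col)) => format!("{} at {}:{}: {}", e.code, line, col, e.message),
            None => format!("{}: {}", e.code, e.message),
        },
    }
}
```

The point of the sketch is the call shape: one function call, an owned String on success, a typed error on failure, and no temp files or process spawn in between.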

What’s NOT in the API (limits worth knowing)

The library exposes the compile path. It does not currently expose:

  1. Streaming / incremental compilation. Each call re-runs the whole pipeline. No re-use of arcanum across calls; no token-cache for editor incremental updates. Editor integrations that need sub-100ms response on every keystroke would need this.
  2. Diagnostic-only mode. You can ask “did it compile?”, but the typed error returned is a single RunError, even though the validator can accumulate several errors through the multi-error pipeline. A proper “give me all warnings + errors + their source spans, do NOT emit codegen” mode would be cleaner for LSP diagnostics.
  3. Symbol queries. No “what’s the type of self.x at this line/column” API. The arcanum has the data; no public accessor.
  4. Thread-safety audit. Naively, compile_module looks thread-safe (it takes &str and returns an owned String). However, framec uses thread_local! in places (e.g. the NEW_CONTRACT_SYSTEMS registry, per the auto-memory entry on nested-system codegen). Multiple concurrent compilations from one process should work, but they have not been audited for Send + Sync, data races on shared thread-locals, or panic poisoning.
  5. Stable semver contract. Right now the public surface drifts with every framec release. A stable compile_module(...) and TargetLanguage::* enum would help downstream integrators but would slow internal refactors.

These are the unbuilt pieces — work for the integrations that need them, not blockers for callers that just want “compile this source, give me the output.”
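
To make the thread-safety concern in item 4 concrete, here is a small sketch of how thread_local! state behaves. REGISTRY and fake_compile are hypothetical stand-ins, not framec's actual registry or compile path. Separate threads each get their own copy of the state, so they cannot race on it; the case an audit would need to look at is a worker thread that is reused for a second compilation without the state being reset.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Hypothetical per-thread registry, standing in for state like the
    // NEW_CONTRACT_SYSTEMS registry mentioned above.
    static REGISTRY: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Stand-in for a compile that records a system name and never clears it.
// Returns how many entries the current thread's registry holds afterwards.
fn fake_compile(system_name: &str) -> usize {
    REGISTRY.with(|r| {
        let mut r = r.borrow_mut();
        r.push(system_name.to_string());
        r.len()
    })
}

// Each spawned thread starts with an empty registry: no cross-thread sharing.
fn counts_on_fresh_threads() -> Vec<usize> {
    let handles: Vec<_> = vec!["A", "B", "C"]
        .into_iter()
        .map(|name| thread::spawn(move || fake_compile(name)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

// Two compiles on the same thread see each other's leftovers.
fn counts_on_reused_thread() -> (usize, usize) {
    thread::spawn(|| (fake_compile("A"), fake_compile("B")))
        .join()
        .unwrap()
}
```

In this sketch, fresh threads each report one entry, while the reused thread reports one and then two. Whether framec's real thread-locals are reset between compiles is exactly what the audit would need to establish.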

The callers — when does conversion pay?

Caller 1: framec-test-env matrix harness

Today: Per-test the shell runner calls framec compile -l <lang> -o /tmp ... via fork+exec. 5,455 fixtures × ~5ms fork overhead = ~27s of CPU spent on process spawn per matrix run.

If converted: Rust orchestrator calls compile_module() in process. Each compile is just a function call. Could reuse a single process for all 17 backends.

Realistic wall-clock win: ~5-10s on a 117s matrix wall (most of the 27s of fork overhead is parallelized across containers, so the critical-path saving is smaller). Single-digit-percent improvement.
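
The arithmetic behind these figures, as a small sketch. The 5ms per-spawn cost and the 5-10s critical-path saving are the estimates quoted above, not measurements made here; the function names are illustrative.

```rust
// Total process-spawn overhead in seconds for `calls` fork+exec invocations,
// each costing `per_call_ms` milliseconds.
fn spawn_overhead_secs(calls: u32, per_call_ms: f64) -> f64 {
    f64::from(calls) * per_call_ms / 1000.0
}

// Fraction of a wall-clock total that a saving represents.
fn fraction_of_wall(saved_secs: f64, wall_secs: f64) -> f64 {
    saved_secs / wall_secs
}
```

With the numbers above, spawn_overhead_secs(5455, 5.0) gives about 27.3 seconds of CPU, and fraction_of_wall for a 5s to 10s saving against the 117s wall lands between roughly 4% and 9%.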

Cost to convert: Major. The runners are per-language shell scripts that handle compile + run + classify for each backend, ~3000 LOC in total. Rewriting them in Rust means re-implementing every backend’s “call gcc / call npx / call kotlinc / call swiftc / …” plus its TAP classifier. Multi-week project. The Docker per-container model itself also constrains us: each container has its own toolchain.

Verdict: Not worth it today. The 117s matrix wall is bounded by container compile/run times (kotlin 110s, c 107s, erlang 117s), not by framec fork overhead. The matrix shell runners are working and tested.

Caller 2: doc-sample validator (scripts/validate_doc_samples.py)

Today: Python script extracts runnable Frame blocks from docs/*.md, calls target/debug/framec compile ... per block via subprocess.run. ~120 calls per validation pass (every pre-commit hook + every CI run).

If converted: Either rewrite in Rust (replaces Python entirely) or expose a PyO3 binding so Python calls compile_module directly.

Realistic wall-clock win: 120 calls × ~5ms fork = ~600ms saved per validation pass. The actual validator pass takes ~6s today (framec compile + python execution of generated code). Net win: ~10%.

Cost to convert: Moderate. The doc validator is 199 LOC of Python; rewriting in Rust would be ~300-500 LOC. PyO3 bindings are ~50 LOC plus toolchain setup (build script, wheels).

Verdict: Modest value. Do it when bored or when the validator becomes a real bottleneck. The current 6s pre-commit cost is fine.

Caller 3: framec_cached.sh (per-test framec cache wrapper)

Today: Shell script (70 LOC) computes content-hash of the .frm source + framec binary hash, looks up in a per-language tar cache, calls framec if miss. Used by every matrix runner to avoid redundant framec re-runs on identical input.

If converted: Rust function that does the same caching in process. Or even bypass caching entirely, since in-process compile is fast.
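
A rough sketch of what the in-process equivalent could look like, assuming an in-memory map keyed by a hash of the source, the target language, and a compiler build identifier. CompileCache, cache_key, and the compiler_id parameter are hypothetical names; the compile step is passed in as a closure so the sketch does not depend on framec.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hash of everything that affects the output: the source text, the target
// language name, and an identifier for the compiler build.
fn cache_key(source: &str, lang: &str, compiler_id: &str) -> u64 {
    let mut h = DefaultHasher::new();
    source.hash(&mut h);
    lang.hash(&mut h);
    compiler_id.hash(&mut h);
    h.finish()
}

// In-memory cache that sits in front of any compile function.
pub struct CompileCache {
    entries: HashMap<u64, String>,
    pub misses: usize,
}

impl CompileCache {
    pub fn new() -> Self {
        CompileCache { entries: HashMap::new(), misses: 0 }
    }

    // Returns the cached output on a hit; on a miss, calls `compile`,
    // stores the result, and returns it.
    pub fn get_or_compile<F>(
        &mut self,
        source: &str,
        lang: &str,
        compiler_id: &str,
        compile: F,
    ) -> String
    where
        F: FnOnce(&str) -> String,
    {
        let key = cache_key(source, lang, compiler_id);
        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        let out = compile(source);
        self.entries.insert(key, out.clone());
        out
    }
}
```

This only covers a cache that lives for one process. The shell wrapper persists results across runs, and DefaultHasher's algorithm is not guaranteed to stay the same across Rust releases, so a persistent version would need a hash with a stable definition.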

Realistic wall-clock win: Negligible. The cache hit path is already a 1ms shell op (read a tar, untar to output dir). The miss path runs framec, which would take the same time in or out of process.

Verdict: Don’t bother. The cache wrapper is a working, correct, language-agnostic solution. Replacing shell with Rust buys nothing here.

Caller 4: IDE / language server integration (doesn’t exist yet)

The real motivation for #171. Build new code that consumes the in-process API for value the CLI can’t provide.

Three plausible IDE shapes:

4a. Diagnostics-only LSP

A minimal language server that exposes textDocument/publishDiagnostics by calling compile_module() on save (or with debounce on edit). Returns the typed error stream as LSP diagnostics with source spans. Users see Frame compile errors and validator warnings (E1xx, W7xx, E601-E815, E950s) inline in VS Code / Neovim / JetBrains.

  • Effort: ~1-2 weeks. Tower-lsp-style scaffolding + map RunError → LSP Diagnostic. Re-runs framec on every save; cold compile is fast enough (<100ms for typical fixtures).
  • Win: Probably the highest-impact framec UX improvement available. Frame is a small language and the validator is already strict — surfacing those errors in real time would make it dramatically easier to learn.
  • Blocker: None that I see. The API is already there.
  • Need from framec: A “validate but don’t codegen” mode would be nice (faster + no temp files needed), plus exposing the multi-error accumulator instead of returning at first error.
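
The RunError → LSP Diagnostic mapping could be sketched roughly as follows. The RunError shape here (code, message, optional 1-based line/column) is an assumption based on the description earlier in this RFC, and the Diagnostic, Range, and Position structs are local mirrors of the corresponding LSP structures rather than types from an LSP crate. Treating codes that start with "W" as warnings is also an assumption drawn from the E/W code families mentioned above.

```rust
// Assumed shape of a framec error: code, message, optional location.
pub struct RunError {
    pub code: String,
    pub message: String,
    pub location: Option<(u32, u32)>, // (line, column), assumed 1-based
}

// Local mirrors of the LSP structures used here. LSP positions are zero-based.
#[derive(Debug, PartialEq)]
pub struct Position {
    pub line: u32,
    pub character: u32,
}

#[derive(Debug, PartialEq)]
pub struct Range {
    pub start: Position,
    pub end: Position,
}

#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub range: Range,
    pub severity: u32, // LSP DiagnosticSeverity: 1 = Error, 2 = Warning
    pub code: String,
    pub source: String,
    pub message: String,
}

pub fn to_diagnostic(err: &RunError) -> Diagnostic {
    // With no location, pin the diagnostic to the start of the document.
    let (line, col) = err.location.unwrap_or((1, 1));
    let start = Position {
        line: line.saturating_sub(1),
        character: col.saturating_sub(1),
    };
    // Without an end position, mark a single character.
    let end = Position {
        line: start.line,
        character: start.character + 1,
    };
    // Assumption: "W..." codes are warnings, everything else is an error.
    let severity = if err.code.starts_with('W') { 2 } else { 1 };
    Diagnostic {
        range: Range { start, end },
        severity,
        code: err.code.clone(),
        source: "framec".to_string(),
        message: err.message.clone(),
    }
}
```

If framec exposed the multi-error accumulator, the same function would simply be mapped over the full list of errors for each publishDiagnostics notification.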

4b. Hover-types + go-to-definition

Adds textDocument/hover returning the inferred type of identifiers under the cursor, and textDocument/definition for $State / @@:self.method / domain-field references.

  • Effort: ~2-4 weeks on top of 4a. Needs framec to expose arcanum (symbol table) state to callers.
  • Win: Significant editor UX, but lower priority than diagnostics (most Frame programs are small enough you don’t need go-to-def to find a state).
  • Need from framec: Public arcanum accessors: lookup_symbol_at(source, line, col), state_definition(name), interface_method(name). Plus serializable spans on every AST node (the parser likely has them; need to surface them).
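
As an illustration of what a position lookup involves, here is a sketch that finds the innermost symbol whose span contains a cursor position. The Symbol struct, its fields, and the kind strings are hypothetical; the real arcanum's layout is not described in this RFC. Unlike the lookup_symbol_at(source, line, col) signature proposed above, this version takes an already-built symbol list.

```rust
// Hypothetical symbol record: a name, a kind, and the span it occupies.
#[derive(Debug, Clone, PartialEq)]
pub struct Symbol {
    pub name: String,
    pub kind: &'static str, // e.g. "system", "state" (illustrative values)
    pub start: (u32, u32),  // inclusive (line, col)
    pub end: (u32, u32),    // exclusive (line, col)
}

// Finds the innermost symbol whose span contains the given position.
// Tuples compare lexicographically, so (line, col) pairs order correctly.
pub fn lookup_symbol_at(symbols: &[Symbol], line: u32, col: u32) -> Option<&Symbol> {
    let pos = (line, col);
    symbols
        .iter()
        .filter(|s| s.start <= pos && pos < s.end)
        // Among properly nested spans, the latest start is the innermost.
        .max_by_key(|s| s.start)
}
```

A linear scan is enough for small Frame programs; a larger symbol table might want an interval index, but that is a design choice for whoever builds 4b.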

4c. Web playground (browser REPL)

Compile framec to WASM, host a web page where users paste Frame source and pick a target backend → see generated code. Like the TypeScript playground or rustc’s godbolt integration.

  • Effort: ~1-2 weeks once WASM build is wired up. lib.rs already has WASM entry points hinted at in the build script.
  • Win: Best “try Frame” onboarding tool possible. Lowers barrier to evaluation from “install Rust + clone repo + cargo build --release” to “open URL.”
  • Blocker: Verifying that the WASM build works. Some framec deps may not compile to wasm32-unknown-unknown (filesystem access for fixtures, threading primitives, etc.). The compile_module function itself is pure (takes &str, returns String), so it should work in WASM.

Caller 5: future fuzz runtime exec (roadmap #172)

The fuzz harness today generates Frame source + compiles via framec, but doesn’t run the generated programs. A runtime-exec wave would. Some fuzz phases might benefit from in-process compile (to avoid 35k fork-execs), but the bigger benefit is having framec embedded in a Rust orchestrator that can drive parallel compile + spawn-target-runtime loops.

Verdict: Same library, different orchestrator. Whoever builds #172 will use compile_module directly. Already supported.
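
A minimal sketch of the orchestrator shape described here: compile a batch of sources across a few worker threads without any fork+exec. compile_stub is a stand-in for the real compile entry point, and compile_batch is a hypothetical name. Running the real compiler this way assumes the thread-safety audit listed among the limits above comes back clean.

```rust
use std::thread;

// Stand-in for the real compile entry point.
fn compile_stub(source: &str) -> Result<String, String> {
    if source.is_empty() {
        Err("empty source".to_string())
    } else {
        Ok(format!("compiled:{}", source))
    }
}

// Compiles a batch across up to `workers` scoped threads.
// Results come back in the same order as the inputs.
pub fn compile_batch(sources: &[String], workers: usize) -> Vec<Result<String, String>> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because chunks(0) would panic.
    let chunk_len = ((sources.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = sources
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|src| compile_stub(src))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Join in spawn order to preserve input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Scoped threads let the workers borrow the source slice directly, so nothing needs to be cloned or wrapped in Arc for the compile step itself.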

What this RFC recommends

No single project. Instead:

  1. Close roadmap #171 as superseded by this RFC. The in-process API is shipped; the “multi-week refactor” framing was wrong. The library is ready.

  2. Document the API publicly in a new section of CONTRIBUTING.md (or a dedicated docs/contributing/library_api.md) so future integrators know what’s available without re-deriving it from tests/common/mod.rs.

  3. When a specific caller needs conversion, open a dedicated task — not “convert all subprocess callers.” Each conversion (4a LSP, 4b hover-types, 4c WASM playground) has its own value model.

  4. When a real integration project starts, audit thread-safety and the unbuilt pieces above (incremental compile, public arcanum accessors, validate-only mode). Right now they’re imagined needs; let the first real consumer drive the design.

  5. Don’t preemptively stabilize the semver surface. Wait until there’s an external consumer to negotiate stability with.

Drawbacks

  • Closing #171 without “completing” it may look like work was punted. It wasn’t — the work was already done, just unlabeled. This RFC’s job is to make that legible.
  • No single owner for the integration paths. Each is plausible but speculative. Without a concrete user, they may sit indefinitely. That’s fine — better than building speculative infrastructure.

Unresolved questions

  • Is the LSP (4a) worth building speculatively, or wait for a user to ask? A 1-2 week LSP would be a huge UX win. There is no obvious technical reason to defer today — the language surface (state machines, HSM, persist, lifecycle, push/pop, imports via Oceans Model) is stable enough as of 2026-05 that an LSP wouldn’t churn meaningfully on syntax changes. The honest question is whether the user-pull justifies the investment now versus higher-ROI work elsewhere — not a stability concern. (Earlier draft of this section claimed RFC-0015 / RFC-0019 / RFC-0024 needed to “ship and stick” before building an LSP. All three were already shipped/accepted when this RFC was written; the deferral was a hand-wave, not a real blocker.)

  • Should the WASM playground (4c) be in framec/ or its own repo? WASM consumer is small and could ship as a thin separate repo that depends on framec as a crate. Keeps build complexity out of the main crate.

  • PyO3 binding for the doc validator (Caller 2) — worth doing proactively? No urgent reason. The validator runs in seconds today. Note as a quick win if someone has half a day.

References

  • framec/src/lib.rs — current crate root
  • framec/src/frame_c/compiler/mod.rs:42 (compile_module public entry point)
  • framec/tests/common/mod.rs — example of in-process usage from RFC-0027 snapshot tests
  • Roadmap #171 (closed by this RFC) — original framing as “subprocess-only, needs in-process refactor”
  • Roadmap #172 — Fuzz runtime exec pipeline (independent integration that would use this API)