A Brief History of State Machine Languages — And Why Frame Is Different

Prompt Engineer: Mark Truluck mark@frame-lang.org

Automata theory has been settled science since the 1950s. The concepts — states, transitions, inputs, outputs — are well-defined, mathematically rigorous, and practically useful. And yet, sixty years later, the mainstream programming languages used to build software still have no native concept of state. The theory won in the textbooks. It lost in the editors.

This isn’t because nobody tried. There is a long history of languages, notations, tools, and standards designed to bring state machines into the practice of software development. Each made a different set of tradeoffs. Each found an audience. None broke through to general adoption. Understanding why tells you something important about what it takes for a state machine tool to actually get used — and what Frame does differently.

The Early Formalisms (1950s–1970s)

The mathematical foundations were laid in the 1950s by people like Mealy, Moore, Kleene, and Rabin & Scott. Their work defined what state machines are, what they can compute, and how they relate to formal languages. This was theory, not tooling — the output was proofs and papers, not software.

The practical applications came first in hardware. Digital circuit designers used state machines directly, expressing them as state transition tables and implementing them in flip-flops and logic gates. The hardware community never lost touch with automata theory because the hardware itself is a state machine — there’s no abstraction gap to bridge.

Software took a different path. As programming languages evolved from assembly through FORTRAN, COBOL, C, and into the object-oriented era, they organized around data and operations rather than states and transitions. State machines became a “technique you use when you need one” rather than a structural foundation, and the gap between the theory and the practice began to widen.

Harel Statecharts (1987)

David Harel’s 1987 paper “Statecharts: A Visual Formalism for Complex Systems” was the single most important contribution to making state machines practical for software. The paper identified the limitations of flat state machines — the explosion of states when modeling complex systems — and introduced three extensions that addressed them:

Hierarchy (nested states). A state can contain substates. The parent state defines shared behavior; child states specialize it. This collapses the combinatorial explosion: instead of duplicating error-handling logic in every state, you put it in a parent and let child states inherit it.

Concurrency (orthogonal regions). A system can be in multiple states simultaneously, in independent regions that evolve in parallel. A phone can be simultaneously “ringing” (call state) and “on battery” (power state) without needing a cross-product of every combination.

History (memory of previous substates). When a state is re-entered, it can resume in the substate it was last in, rather than always starting from the initial substate. This models the common pattern of “go do something and come back to where you were.”

Statecharts were a visual formalism — defined as diagrams, not text. They were designed to be drawn, not typed. This was their strength (people could see the whole system at a glance) and, as it turned out, their limitation for adoption into the daily practice of writing code.

UML State Machines (1990s–2000s)

When the Unified Modeling Language absorbed Harel’s statecharts in the mid-1990s, it gave them an industry-standard home. UML state machine diagrams became the canonical way to model behavioral logic in enterprise software projects. CASE tools — Rational Rose, MagicDraw, Enterprise Architect, and others — provided graphical editors for drawing statecharts and, in some cases, generating code from them.

This was the closest state machines came to mainstream software practice, and it’s worth understanding why it didn’t stick.

The model-code gap. UML diagrams lived in modeling tools, separate from the codebase. Developers drew diagrams, then wrote code that was supposed to implement the diagrams. Over time, the code diverged from the diagrams. Keeping them in sync was a manual discipline that few teams sustained. The diagrams became documentation artifacts — useful for initial design, unreliable after the first few sprints.

Round-trip engineering never worked well. The promise was that you could modify the diagram and the code would update, or modify the code and the diagram would update. In practice, the impedance mismatch between a visual diagram and textual code made this fragile. Generated code was often unreadable or heavily annotated with tool-specific metadata. Hand-edited code couldn’t be reliably parsed back into a diagram.

Heavyweight tooling. UML tools were expensive, complex, and required their own learning curve. They were adopted in enterprise environments with formal processes but resisted by the agile and open-source communities that increasingly set the norms for software development.

Modeling vs. programming. UML positioned state machines as a modeling activity — something you do before or alongside coding. For most developers, especially as agile practices took hold, modeling-as-a-separate-phase lost credibility. Developers wanted to express behavior in code, not in a separate tool that generated code.

The UML era established that state machines were useful for design. It did not establish that they were practical for implementation. The gap between the whiteboard and the editor remained.

SCXML (2005–2015)

The W3C’s State Chart XML initiative spent a decade (2005 to 2015) standardizing an XML-based language for defining statecharts with formal execution semantics. SCXML was thorough: it specified compound states, parallel states, history, event processing order, data models, and executable content. It was designed as an interchange format — a way to define a state machine once and execute it on any conformant runtime.

SCXML succeeded as a specification. It is implemented in Java (Apache Commons SCXML), JavaScript (SCION), C++ (Qt’s SCXML module), and other languages. The Qt framework integrated it directly into its toolchain, with a visual editor in Qt Creator.

But SCXML’s adoption remained narrow, for reasons that illuminate what developers actually need:

XML is not a programming language. SCXML is verbose, hard to read, and painful to write by hand. The specification itself acknowledges this — it was designed as a machine-readable interchange format, not a human-authored language. Developers who tried to write SCXML directly found it hostile. Those who used graphical editors to generate it faced the same model-code gap as UML.

Runtime dependency. SCXML state machines require an SCXML engine to execute. The engine is a runtime dependency with its own semantics, quirks, and performance characteristics. Developers couldn’t just read the generated code and understand it — they had to understand the engine’s execution model.

Separate from the code. SCXML files are external artifacts, not part of the source code. They must be loaded, parsed, and executed by a separate runtime. The state machine and the application logic that surrounds it live in different files, different languages, and different mental models.

Stateflow (MathWorks)

In the control systems and embedded engineering world, MathWorks’ Stateflow has been the dominant state machine tool for decades. Integrated with MATLAB and Simulink, Stateflow provides graphical statechart editing with simulation, testing, and code generation (typically C/C++ for embedded targets).

Stateflow is genuinely effective within its domain. Control engineers design state machines graphically, simulate them against plant models, and generate production code that runs on embedded hardware. The model is the implementation — there’s no gap.

But Stateflow is tied to the MathWorks ecosystem: expensive licenses, a proprietary environment, and a workflow oriented toward signal processing and control engineering. It never crossed over to general-purpose software development. A web developer, an agent developer, or a backend engineer has no path to Stateflow.

Ragel

Ragel, created by Adrian Thurston, took a different approach entirely. It’s a state machine compiler that generates code in C, C++, Java, Ruby, and other languages. You write regular expressions and state machine definitions in Ragel’s syntax, and it generates efficient, readable code — specifically optimized for parsing and protocol handling.

Ragel is notable because it got two things right that many other tools missed. First, the generated code is plain, dependency-free source code in the target language — no runtime required. Second, the tool was practical enough to be used in production systems (the Mongrel and Puma web servers for Ruby used Ragel-generated HTTP parsers).

But Ragel was narrowly focused on lexing and parsing. It wasn’t designed for general application state management, UI workflows, or business processes. It demonstrated that state machine code generation could be practical, but for a specific domain.

XState (2018–present)

XState, created by David Khourshid, is the most successful modern attempt to bring statecharts into mainstream software development. It’s a JavaScript/TypeScript library that implements Harel statechart semantics — hierarchy, parallel states, history, guards, actions — with a developer-friendly API.

XState got several things right. It works in the JavaScript ecosystem where a huge number of developers live. It has a visual editor (Stately Studio) that generates code and stays in sync with it. It has strong TypeScript support. It was designed for UI development, which is where most JavaScript developers encounter complex state.

XState’s approach is a runtime library: you define a state machine as a JavaScript object, and the XState interpreter executes it at runtime. This means the state machine definition is in the code (solving the model-code gap) and the semantics are handled by the library (solving the edge-case problem).

The tradeoff is runtime dependency. XState is a library your code depends on. The state machine definition uses XState’s API and data structures. If you decide to stop using XState, you’re rewriting, not removing an import. The generated visualizations are excellent, but the underlying code is XState’s execution model, not plain language constructs.

XState proved there’s significant demand for state machine tooling in mainstream development. It also demonstrated that the JavaScript/TypeScript ecosystem — the largest developer community in the world — is receptive to the idea.

Erlang/OTP gen_statem

Erlang’s gen_statem behavior module deserves mention because it represents the closest any mainstream language ecosystem comes to treating state machines as a foundational abstraction rather than an add-on.

In OTP (Erlang’s standard framework), gen_statem is a behavior — a pattern that your module implements. You define callback functions for each state, and the OTP framework handles dispatch, transitions, timeouts, and supervision. The state machine structure is visible in the code: each state is a function clause, events are matched per-state, and transitions are explicit.

The Erlang approach works because the language’s concurrency model (lightweight processes, message passing) naturally produces systems that are state machines — each process receives messages and responds based on its current state. The language and the abstraction are aligned.

But Erlang is a niche language. gen_statem is excellent within the Erlang/Elixir ecosystem and irrelevant outside it. It demonstrates what’s possible when a platform takes state machines seriously, without providing a solution for the vast majority of developers who work in other languages.

The Recurring Pattern

Looking across this history, the same failure modes repeat:

The modeling trap. UML and CASE tools positioned state machines as a design activity separate from coding. This created a gap that widened over time. Developers don’t want to model in one tool and implement in another.

The runtime trap. SCXML and XState require a runtime engine or library to execute the state machine. This creates a dependency, an abstraction layer to learn, and code that can’t be understood without understanding the framework.

The domain trap. Stateflow and Ragel are excellent within their domains (control systems, parsing) and irrelevant outside them. General-purpose state machine support requires a general-purpose approach.

The syntax trap. SCXML proved that XML is a terrible authoring format for humans. UML proved that graphical-only notations don’t integrate with text-based development workflows. Any notation that developers are expected to write needs to be concise, readable, and comfortable in a text editor.

The ecosystem trap. Erlang’s gen_statem works because it’s integrated into a language whose concurrency model naturally produces state machines. Porting the concept to Python or TypeScript means rebuilding the integration from scratch.

What Frame Does Differently

Frame is a state machine language that was designed with this history in mind. Its design decisions are responses to the specific failure modes described above.

Text-first, not diagrams-first

Frame is a textual language, written in a text editor, stored in source control, diffed in pull requests. There is no mandatory graphical tool. You can generate diagrams from Frame (the framepiler outputs GraphViz DOT), but the source of truth is the text. This avoids the model-code gap entirely: the Frame specification is in the codebase, version-controlled alongside everything else.

Embedded, not separate

Frame lives inside your source files. A Frame system block is embedded in a Python, TypeScript, Rust, C, or Go file, surrounded by native code. The framepiler finds the @@system blocks, generates code from them, and passes everything else through unchanged. Frame’s design metaphor is “native code is the ocean, Frame systems are islands.” The state machine and the application code that uses it are in the same file, the same language, the same build.

This is fundamentally different from SCXML (a separate XML file), UML (a separate modeling tool), and XState (a runtime library with its own API). Frame doesn’t ask you to move your state logic somewhere else. It adds state structure to the code you’re already writing.

No runtime dependency

The framepiler generates a plain class in the target language. No runtime library, no framework, no engine. The generated code is readable — straightforward dispatch logic that any developer can step through in a debugger. If you decide to stop using Frame, the generated code stands alone. You can maintain it directly, modify it, or use it as a starting point for hand-rolled code.

This is the Ragel principle applied to general-purpose state machines: the tool disappears after compilation, leaving behind clean, dependency-free source code.

Multi-language code generation

Frame generates code in 17 languages: Python, TypeScript, JavaScript, Rust, C, C++, Java, C#, Go, PHP, Kotlin, Swift, Ruby, Erlang, Lua, Dart, and GDScript. The same Frame specification can target any of them. This means the state machine design is portable across platforms, but the output is native to each — not an interchange format that requires a universal runtime.

State is the organizing principle

In Frame, code is organized by state. Each state is a block containing the event handlers that apply in that state. The behavior for “Going Up” is in one place — all of its event responses grouped together. This is the transposition that the history of tools never achieved within the source code itself: code organized by row (state) rather than by column (event).

Events not handled in a state are silently ignored — there is no else clause to accidentally capture new states. Or, with hierarchical state machines, unhandled events can be forwarded to a parent state that defines the default behavior. This replaces defensive guard clauses with structural guarantees.

The full automata spectrum

Frame isn’t limited to finite state machines. It supports hierarchical state machines (Harel’s nested states, through parent-child state relationships), pushdown automata (through a built-in state stack with push and pop operations), and general-purpose computation through native code within handlers. The framepiler doesn’t restrict which automata features you use — a simple two-state toggle and a complex hierarchical workflow with state history use the same syntax and the same generated architecture.

Diagrams are derived, not primary

Frame can generate GraphViz diagrams from any specification with a single command. The diagram is always in sync with the code because it’s generated from the same source. This inverts the UML model: instead of drawing diagrams and generating code (with inevitable drift), you write the specification and derive diagrams (with guaranteed accuracy). The diagram becomes a review tool and a documentation artifact, not a primary authoring surface.

The Gap Frame Fills

The history of state machine tools is a history of almost-but-not-quite. Each tool solved part of the problem while introducing new friction that limited adoption:

UML gave us the visual formalism but separated it from the code. SCXML gave us standardized semantics but buried them in XML and a runtime engine. Stateflow gave us production-quality code generation but locked it in a proprietary ecosystem. Ragel gave us dependency-free code generation but limited it to parsing. XState gave us developer-friendly statecharts but tied them to a runtime library in one language. Erlang gave us first-class state machine support but only for Erlang.

Frame’s bet is that you can have all the properties that matter — textual, embedded, dependency-free, multi-language, state-organized, diagrammable — without the tradeoffs that held previous tools back. Whether that bet pays off depends on whether enough developers recognize the state gap in their own code and decide to close it. The theory has been ready for sixty years. The question has always been the tooling.