Notes · Dissecting Real Systems
growing
A Language That Can't Loop Forever
Bazel's Starlark forbids unbounded loops and recursion on purpose, and gets analyzability, caching, and parallelism in return. Buck2 quietly allows recursion back — and the split shows which restriction is load-bearing.
It seems that perfection is attained not when there is nothing more to add, but when there is nothing more to take away.
— Antoine de Saint-Exupéry, Terre des hommes (1939)
Cite this
Mangalapilly, Y. J. (2026, June). A Language That Can't Loop Forever. Saṃhitā Notes. https://yesudeep.com/blog/a-language-that-cant-loop-forever/ @online{mangalapilly2026a,
author = {Yesudeep Jose Mangalapilly},
title = {A Language That Can't Loop Forever},
journal = {Sa\d{m}hit\=a Notes},
year = {2026},
month = {June},
url = {https://yesudeep.com/blog/a-language-that-cant-loop-forever/},
urldate = {2026-07-02},
} Yesudeep Jose Mangalapilly. “A Language That Can't Loop Forever.” Saṃhitā Notes, 2026. https://yesudeep.com/blog/a-language-that-cant-loop-forever/. TY - ELEC
AU - Mangalapilly, Yesudeep Jose
TI - A Language That Can't Loop Forever
T2 - Saṃhitā Notes
PY - 2026
UR - https://yesudeep.com/blog/a-language-that-cant-loop-forever/
Y2 - 2026-07-02
ER - The fourth of five pieces on build systems — the language-design essay the others keep gesturing at. By the end you'll understand why Starlark forbids while loops and recursion on purpose, what that restriction buys (analyzability, safe caching, parallelism), where Buck2's implementation deliberately relaxes it, where the design sits on the ladder of machines below the Turing cliff — and how the same "bounded fixed point" pattern recurs across DAGs (dependency graphs that forbid cycles), CRDTs (replicated data types that forbid conflict), and regular expressions.
Both Bazel and Buck2 write their build files in the same little language, Starlark, and both lean on a property of it without quite saying so out loud. So let's say it out loud. Starlark — as specified, and as Bazel enforces it — is, by deliberate design, not Turing-complete. You cannot write an infinite loop in it. You cannot write unbounded recursion. Those missing features are not an oversight or a roadmap item — they are the point. And the most instructive fact about the restriction is that Buck2's implementation put half of it back: it allows recursion. Where the two engines agree tells you what the design cannot live without; where they split tells you what was a stance.
Turing-complete — able to express any computation a general-purpose computer can, including loops that may never halt. Deciding in advance whether an arbitrary such program finishes is the halting problem — provably impossible. A language that is not Turing-complete gives that up in exchange for a guarantee that every program terminates. Learn more.
This is the cleanest example I know of a principle worth internalizing:
Sometimes you make a system more capable by making its language less powerful.
Starlark gives up arbitrary computation and gets, in exchange, evaluation you can treat as analysis: it always finishes, touches nothing outside the file, and yields the same answer every time — the foundation everything else in a modern build stands on.
What it looks like
Before dissecting what the language refuses, look at the language — if you are deciding whether you'd want to live in it, this is the whole tour. Starlark wears two faces, and if you can read Python, you can already read both.
The everyday face — the one you'd touch daily — is the BUILD file: pure declaration. Name a target, list its sources and dependencies, and you're done. No commands, no control flow, and the order of the blocks doesn't matter. This one is from this site's own repository, and the build you're reading was gated on it:
cc_library(
name = "hello_c",
srcs = ["hello.c"],
hdrs = ["hello.h"],
)
cc_test(
name = "hello_c_test",
size = "small",
srcs = ["hello_test.c"],
deps = [":hello_c"],
)Declarations all the way down: a library, a test that depends on it by label, and nothing else — no shell commands, no ordering, no control flow. This is what most Starlark in the wild looks like.
The second face lives in .bzl library files that BUILD files load(): a small, friendly Python — functions, for loops over collections that already exist, comprehensions, if / else — used to stamp out repetitive declarations. Here is a complete macro:
def echo_targets(names):
"""One rule per name — a for loop over a list that already exists."""
for name in names:
native.genrule(
name = name + "_txt",
outs = [name + ".txt"],
cmd = "echo %s > $@" % name,
)A real function with a real loop — but the loop walks names, a list that exists before the loop starts, so its trip count is fixed at entry.
load(":defs.bzl", "echo_targets")
echo_targets(names = [
"alpha",
"beta",
"gamma",
])And its call site: the BUILD file loads the macro and hands it the bounded list. Evaluating this package produces three genrule targets — the same answer, every time, on every machine.
Here is what "produces" means. A target declaration is a node; every label in srcs, deps, or data is an edge; evaluating the file is the mapping from one to the other:
srcs and data labels become edges. This is the value evaluation returns: run bazel query 'deps(:echo_targets_test) intersect :*' on the package and it prints exactly this graph.And that is very nearly the entire language. What you will never see in any Starlark file — what the rest of this essay is about — is either of these:
def countdown(n):
while n > 0:
n -= 1while — rejected at parse time by both engines: 'while' not supported, use 'for' instead.
def size(t):
n = 1
for k in t.kids:
n += size(k)
return nRecursion — in Bazel's dialect, a dynamic error at the first re-entry: function 'size' called recursively.
So: Python's readable surface, minus the two constructs whose iteration count is discovered rather than declared. Why those two, exactly — and what refusing them buys — is where the dissection starts.
What Starlark refuses
Starlark looks like a small, friendly Python. It has functions, for-loops over collections, comprehensions, if / else, immutable data. What it pointedly lacks, you can read straight out of the parsers and evaluators of both implementations.
There is no while loop — in either implementation, and neither offers a flag to turn one on. Bazel's parser rejects the keyword with a message that is almost cheeky, in Parser.java:
case WHILE -> "'while' not supported, use 'for' instead";Buck2's independent Rust implementation, starlark-rust, refuses it one layer earlier: while sits in the reserved-words list in its lexer, and using it is a parse error ("cannot use reserved keyword"). A for loop iterates a collection that already exists, so it runs a known, finite number of times. A while loop runs until a condition you compute says stop — which is exactly the door to non-termination, and both teams keep it shut.
A babysitter who promises "five songs, then lights out" can tell the parents exactly when bedtime ends before singing a note. One who promises "songs until she falls asleep" cannot — nobody can, without living through it. for is the first promise; while is the second. Starlark only hires the first babysitter.
Recursion is where the two implementations part ways. The spec forbids it — but read the wording carefully: "It is a dynamic error for a function to call itself or another function value with the same declaration," followed by an explicit license: "an implementation may allow clients to disable this check, allowing unbounded recursion."
Bazel enforces the rule. Call a function that calls itself, directly or through a chain, and the evaluator stops you at the first re-entrant call, in StarlarkFunction.java:
// Check recursion
if (!thread.isRecursionAllowed() && thread.isRecursiveCall(owner)) {
throw Starlark.errorf("function '%s' called recursively", owner.getName());
}The check compares function declarations, not values — a source comment notes that comparing closures would let you defeat it by writing the Y combinator.
The Y combinator — a higher-order function that manufactures recursion out of nothing but function application: it lets an anonymous function call itself without ever being given a name to refer to. That is exactly why declaration-identity is the right thing to check. Guard on values instead and a self-caller need not textually name itself — Y hands it a fresh closure to recurse through, so no two frames share a value and the "is this me again?" test never fires. Comparing the static declaration closes that door: every closure Y spins up still comes from the one source lambda. Learn more.
And that isRecursionAllowed() guard is not decoration: the interpreter library carries an ALLOW_RECURSION option for other embedders, whose own javadoc reads "This option is not exposed to Bazel, which unconditionally prohibits recursion." (starlark-go exposes the same escape hatch as a -recursion flag.)
Buck2 takes the license. starlark-rust allows recursion outright — its conformance suite skips the upstream "called recursively" test with the comment // We allow recursion — and what stops a runaway program is not a language rule but fuel: a pre-allocated call stack, 50 frames deep by default, raisable per project via the starlark_max_callstack_size buckconfig. Recurse past it and evaluation fails with "Starlark call stack overflow." The design comment in the source says plainly what the limit protects: the native stack — each Starlark frame costs roughly a kilobyte of it — not a semantic boundary.
So the two teams did not draw the same line. Bazel made "every program terminates" a property of the language: the error arrives at recursion depth one, deterministically, on any machine. Buck2 made it a property of the runtime: its Starlark is Turing-complete on paper, and every evaluation halts because the tank is finite. There is even an inversion one layer down — Buck2 is stricter than Bazel at the build-file layer, disabling def entirely in BUCK files (enable_def: false in file_type.rs), so recursion is only expressible in a .bzl library to begin with.
f is a language error at depth 1, and the stack never grows; in Buck2's, the call is legal and what runs dry is a 50-frame fuel tank. Toggle the dialect and make the call.The ladder and the cliff
"Not Turing-complete" sounds like a defect — as if the designers ran out of time. To see why it is a design, zoom out from Starlark for a moment. Machines — and the languages they interpret — come on a ladder, and every rung up trades analysis away for expressiveness.
On the bottom rung sits the finite automaton: a fixed set of states, no memory beyond which state it is in. It is what a regular expression compiles to, and it is completely transparent — does it halt (always), can it ever match, are these two machines equivalent: all decidable. One rung up, the pushdown automaton adds a stack — enough to match nested brackets and parse programming languages — and already the fog rolls in: whether two grammars describe the same language is undecidable. At the top stands the Turing machine: unbounded memory, unbounded time. That rung is where the general-purpose languages live — Python, Groovy, and, on paper, Buck2's Starlark.
Finite and pushdown automata — the two lower rungs of the classical hierarchy: a machine with fixed states only, and one with fixed states plus a stack. Regular expressions compile to the first; grammars and parsers correspond to the second. Learn more.
The top rung is not a summit but a cliff. Rice's theorem says that every non-trivial question about what a Turing-complete program will do — not just "does it halt," but "does it ever touch the network," "is its output always the same" — is undecidable in general. Below the cliff, a tool can answer questions about a program by inspection. Above it, there is exactly one fully general way to learn what a program does: run it, and hope. That is the precise sense in which Gradle has to execute its build scripts, and the precise thing Starlark's designers refused.
Rice's theorem (1953) — for any behavioral property that some programs have and others lack, no algorithm can decide it for all programs in a Turing-complete language. The halting problem is the special case everyone meets first. Learn more.
Starlark stands on a rung just below the cliff, and the rung has a name. In 1967, Albert Meyer and Dennis Ritchie — yes, that Dennis Ritchie, five years before he built C — studied "loop programs": programs whose only loop construct repeats a body a number of times fixed before the loop starts. Their result: such programs compute exactly the primitive recursive functions — a class vast enough to cover essentially anything a build file could want — and every one of them terminates, for a reason you can check by hand: every loop's turn-count is settled before its first turn, so each pass only spends the budget down, and a stack of finite budgets is finite. Add a while loop, whose iteration count is discovered along the way rather than fixed at entry, and the guarantee is gone: you are on the Turing rung. Starlark's for-over-a-collection versus no-while is that 1967 boundary, enacted by a parser. And recursion is the same door by another name — a recursive call is a loop whose bound nobody declared — which is why the spec bars the two together, and why Buck2's choice to re-admit recursion is exactly what makes its dialect Turing-complete on paper.
Primitive recursive — the functions computable using only bounded ("do this n times") iteration. Total by construction: every primitive recursive program halts. Learn more.
Nor is Starlark alone on its rung. Dhall is a configuration language whose headline feature is the same refusal — no general recursion, every expression normalizes. Inside the Linux kernel, the eBPF verifier statically proves that every loaded program terminates before it is allowed to run — for years by admitting no loops at all, today by admitting only loops it can bound. Wherever a host must run a guest's code and stay alive, the guest's language gets pushed below the cliff. The school has a name: David Turner called it total functional programming (2004) — surrender Turing-completeness, and every program becomes a value you can reason about rather than a process you must supervise.
Why the loss is a gain
Take it as given that a Starlark evaluation always halts — a theorem in Bazel's dialect, an operational fact in Buck2's. What does that guarantee unlock?
Its evaluation is analysis. This is the big one, and it's the thread back to the whole build systems series. A build system needs to know the dependency graph a BUILD file describes. The file does get run — evaluation is execution — but it is execution of a special kind: guaranteed to finish, forbidden to touch the world outside the file. If the file were written in an unrestricted language, the only fully general way to find out what it does is to run it with no such guarantees. Because Starlark's evaluation is bounded, it doubles as a safe, predictable analysis step: the engine can evaluate a package definition to extract its targets — the declarations-to-graph mapping drawn in the specimen section — confident it will terminate, and cache the result.
It can be cached and reused safely. A terminating, side-effect-free evaluation is a pure function from inputs to a graph. Same inputs, same graph — so the result is cacheable, shareable, and content-addressable, exactly the property Skyframe and DICE depend on. A language that could loop forever could not offer that, because "the result of evaluating this file" might not exist.
It can be parallelized and sandboxed. Evaluations that are guaranteed to terminate and don't perform arbitrary I/O can be run concurrently across packages without fear of one wedging a core forever. Bounded computation is safe computation, in the very concrete sense that the build tool can run a lot of it at once and trust it to come back.
Be honest about the load-bearing walls, though. Two more properties share the weight, and they are precisely the ones both implementations kept: hermeticity — a Starlark file cannot read the filesystem, the clock, or the network; effects exist only through what the engine injects — and freezing — module globals become immutable once a file finishes loading, so parallel evaluation cannot race. Termination guarantees the answer exists; hermeticity and determinism guarantee it is the same answer everywhere. Buck2's dialect keeps safe caching and parallelism while permitting recursion, which is strong evidence that those shared properties, not the recursion ban, carry most of the load. What the recursion ban adds — and it is Bazel's deliberate stance — is that the termination half holds as a fact about the language rather than a fact about one runtime's fuel gauge.
Contrast this with the survey's cautionary example, Gradle, whose build scripts are full Groovy and Kotlin. Because that configuration can do anything, Gradle has to actually execute it to learn the graph — and then bolt on a "configuration cache" to avoid paying that cost twice. The configuration cache is the price of an unbounded language. Starlark declines to incur it by declining the power that creates it.
The general shape: the bounded fixed point
Starlark is one instance of a pattern that shows up wherever a system has to reason about programs rather than merely run them. The right primitive is rarely the most powerful one; it's the most powerful one that stays decidable.
- A DAG forbids cycles, so a topological order — and termination — always exists.
- A CRDT forbids the operations that wouldn't commute, so concurrent merges can't conflict.
- A regular expression (the real kind) forbids backreferences, so an automata-based engine (RE2, Rust's
regex, Go's) can match it in linear time — the guarantee lives in the engine, not the syntax: a backtracking engine can still blow up on nested quantifiers alone. - Starlark, in Bazel's dialect, forbids unbounded computation, so its programs can be analyzed, cached, and parallelized.
DAG — a directed acyclic graph: directed edges, no cycles. Because no path can return to where it started, the nodes admit a topological order, and any edge-following traversal is guaranteed to finish — the same termination guarantee Starlark buys by forbidding loops. Learn more.
CRDT — a conflict-free replicated data type. Its update operations are chosen so they all commute; replicas that see the same updates in any order converge to the same state, so concurrent merges can't conflict and need no central coordinator. Learn more.
Backreferences — the regex feature that matches whatever an earlier captured group did. Dropping them keeps a regular expression truly regular — equivalent to a finite automaton — so matching is linear and can't tip into catastrophic backtracking. Learn more.
In each case the restriction isn't a compromise grudgingly accepted — it's the enabling condition. You give up the unbounded version precisely because the bounded version has a property you need more than you need the power. The art of designing these systems is finding the boundary: general enough to express what users actually need, bounded enough to keep the property that makes the system tractable.
Is bounded enough?
The fair objection: doesn't real build logic sometimes need a loop of unknown length, or recursion? In practice, remarkably rarely. The work a BUILD file does — declare targets, glob files, map over a known list of sources, compose rule calls — is naturally bounded; it's iterating over things that exist, not searching an open-ended space. When a genuinely complex transformation is needed, it lives in a rule implementation or a provider, structured so the engine can still treat the whole thing as a bounded, analyzable unit. The constraint pushes complexity to where it can be managed rather than letting it leak into the configuration layer where it would poison analyzability for everyone.
Two independent teams built two implementations of this language, kept the same wall against while, and split on recursion. The split is the tell — it shows which bricks are load-bearing. The ones nobody touched: no unbounded loops, no I/O, frozen modules, deterministic evaluation. Take those out and the building falls down — no safe analysis, no cacheable graph, no parallel evaluation, no shared remote cache. The recursion ban is Bazel's additional stance, and it is a coherent one: it hoists "this terminates" from an engineering fact into a language guarantee that holds at any stack depth on any machine, and it deters the configuration layer from growing into a programming environment. Buck2 judged a fifty-frame fuel gauge enough. One honest caveat belongs next to both: termination is not tractability. A nest of for loops over range(2**20) — about a million turns each, and nested loops multiply — halts, eventually — which is why Bazel also ships --max_computation_steps as a hard budget on BUILD-file evaluation (it defaults to zero: unbounded). Either way, no Starlark evaluation loops forever — by theorem in one engine, by fuel in the other — and that is why everything built on it can be trusted to finish.
That split has a history, and the history is short. The language was built to be small on purpose, and stayed that way through a rename, three implementations, and a fork that put the recursion question back on the table.
- Blaze — Google starts an internal build tool; its build files are evaluated by a real Python interpreter.
- The restricted language is designed — Laurent Le Brun begins work on a bounded Python dialect, then called Skylark, to replace that interpreter.
- Bazel is open-sourced (March) — Skylark ships inside it as the build-file language.
- Google moves onto it — hundreds of thousands of internal Python build files are migrated to the restricted language.
- Renamed Starlark (August) — the language is spun into its own repository with a specification separate from any implementation; the Go implementation is published the same year.
- Bazel 1.0 (October).
- Meta takes over
starlark-rust— Google's Rust implementation, started by Damien Martin-Guillerez, passes to Meta's care. - Buck2 is open-sourced (April) — built on
starlark-rust, it takes the spec's explicit license and re-admits recursion, bounded by a fifty-frame fuel cap (DEFAULT_STACK_SIZE: usize = 50inevaluator.rs— check for yourself). The fork in this essay.
Credit where it is due, because languages do not refuse things — people do. Starlark was designed and implemented in Java by Jon Brandvein, Alan Donovan, Laurent Le Brun, Dmitry Lomov, Vladimir Moskva, François-René Rideau, Gergely Svigruha, and Florian Weikert, "standing on the shoulders of the Python community," as the starlark-go credits put it. That Go implementation was written by Alan Donovan and Jay Conrod, its scanner derived from one written by Russ Cox; and starlark-rust, the implementation Buck2 leans on, was started by Damien Martin-Guillerez and passed from Google to Meta's care. Saint-Exupéry's line about perfection opens this essay; these are the people who knew what to take away.
The last piece in this arc follows that trust outward — to the content-addressed store and Merkle trees that let a whole team share one build cache, turning "proportional to my change" into "proportional to anyone's."
Lessons
- Starlark, as specified and as Bazel enforces it, is deliberately not Turing-complete: no
whileloops (rejected at parse time by both implementations, with no flag to enable them) and no recursion (a dynamic error in Bazel, checked against declarations so the Y combinator can't sneak it back in). - Buck2's starlark-rust takes the spec's explicit license and allows recursion, bounding it with fuel instead — a 50-frame call stack by default. There, "every evaluation halts" is a runtime backstop, not a language guarantee.
- The refusal has a pedigree: loops whose bound is fixed at entry compute exactly the primitive recursive functions — total, analyzable, and enough (Meyer & Ritchie, 1967).
while— or recursion, the same door by another name — is precisely what tips a language over the cliff where Rice's theorem makes every question about behavior undecidable. - The payoff of guaranteed termination: evaluating a build file is the analysis — a pure, always-finishing function from inputs to a graph — so the result is cacheable, shareable, and parallelizable.
- The properties both engines kept — bounded loops, hermeticity, frozen modules, deterministic evaluation — are what safe caching and parallelism actually ride on; the recursion ban is Bazel's stance on top, making termination a property of the language itself.
- The cost of an unbounded language is concrete: Gradle must run its Groovy/Kotlin scripts to learn the graph, then cache the execution.
- This is the bounded fixed point pattern: the right primitive is the most general one that stays decidable. DAGs, CRDTs, and regular expressions all make the same trade — the restriction is the feature.
References
- “The Starlark language.” Bazel docs. — · the Starlark spec, whose recursion rule is a dynamic error "an implementation may allow clients to disable"
- “net.starlark.java.” Bazel source. —
Parser.java(whilerejection),StarlarkFunction.java(recursion check),StarlarkSemantics.java(ALLOW_RECURSION, "not exposed to Bazel") - “starlark-rust.” — Buck2's independent implementation; recursion allowed —
DEFAULT_STACK_SIZE = 50in eval/runtime/evaluator.rs (a depth cap is the proof it's legal); started by Damien Martin-Guillerez - “Starlark credits.” starlark-go README. — the designers and implementors, named — the Java team, and Alan Donovan & Jay Conrod's Go implementation with a scanner after Russ Cox
- “Buck2 per-file-type dialects.” Buck2 source. —
BUCKfiles getenable_def: false;.bzlgets the full language - “Turing completeness.” — and the halting problem for the theory
- “The complexity of loop programs.” Meyer & Ritchie, Proc. ACM 1967. — bounded loops compute exactly the primitive recursive functions — the rung Starlark stands on (PDF at MIT)
- Turner. “Total Functional Programming.” J.UCS, 2004. — the design school Starlark practices: every program terminates by construction (PDF)
- “Rice's theorem.” — why everything above the cliff must be run to be known
- “Dhall.” — a config language on the same rung; the eBPF verifier is the same trade inside the kernel
- Shapiro, Preguiça, Baquero & Zawirski. “Conflict-free Replicated Data Types.” SSS, 2011. — the bounded-by-design sibling, from the source (PDF) · DAGs as another
How to cite
Mangalapilly, Y. J. (2026, June). A Language That Can't Loop Forever. Saṃhitā Notes. https://yesudeep.com/blog/a-language-that-cant-loop-forever/ @online{mangalapilly2026a,
author = {Yesudeep Jose Mangalapilly},
title = {A Language That Can't Loop Forever},
journal = {Sa\d{m}hit\=a Notes},
year = {2026},
month = {June},
url = {https://yesudeep.com/blog/a-language-that-cant-loop-forever/},
urldate = {2026-07-02},
} Yesudeep Jose Mangalapilly. “A Language That Can't Loop Forever.” Saṃhitā Notes, 2026. https://yesudeep.com/blog/a-language-that-cant-loop-forever/. TY - ELEC
AU - Mangalapilly, Yesudeep Jose
TI - A Language That Can't Loop Forever
T2 - Saṃhitā Notes
PY - 2026
UR - https://yesudeep.com/blog/a-language-that-cant-loop-forever/
Y2 - 2026-07-02
ER - Webmentions
Annotations
Thank you — your note is held for review and will appear once approved.
Thank you — your note is published.
Please sign in below to leave a note.
