The Build That Restarts Itself

Inside Skyframe, Bazel's incremental engine — and the strange trick at its heart.

build-systems, bazel, skyframe, incremental-computation, dissecting-systems

Laziness: The quality that makes you go to great effort to reduce overall energy expenditure.

— Larry Wall, Programming Perl (1996)

Cite this
APA
Mangalapilly, Y. J. (2026, June). The Build That Restarts Itself. Saṃhitā Notes. https://yesudeep.com/blog/the-build-that-restarts-itself/
BibTeX
@online{mangalapilly2026the,
          author  = {Yesudeep Jose Mangalapilly},
          title   = {The Build That Restarts Itself},
          journal = {Sa\d{m}hit\=a Notes},
          year    = {2026},
          month   = {June},
          url     = {https://yesudeep.com/blog/the-build-that-restarts-itself/},
          urldate = {2026-07-01},
        }
Plain
Yesudeep Jose Mangalapilly. “The Build That Restarts Itself.” Saṃhitā Notes, 2026. https://yesudeep.com/blog/the-build-that-restarts-itself/.
RIS
TY  - ELEC
        AU  - Mangalapilly, Yesudeep Jose
        TI  - The Build That Restarts Itself
        T2  - Saṃhitā Notes
        PY  - 2026
        UR  - https://yesudeep.com/blog/the-build-that-restarts-itself/
        Y2  - 2026-07-01
        ER  - 

The second of five pieces on build systems; it opens up the engine the survey named. By the end you'll understand how Bazel's Skyframe discovers a build graph by letting computations demand their dependencies, why returning null and getting restarted is a feature rather than a bug, and how comparing recomputed values — "change pruning" — stops a rebuild the moment a value stops changing.

In the survey I claimed that Bazel decides what to rebuild by content digest and a memoized graph called Skyframe, and then moved on. This piece opens that engine up. The interesting part isn't that it caches — every build system caches. It's how it discovers what to cache, and the answer is a trick that looks, at first, like a bug:

A function runs, asks for a dependency it doesn't have, gives up, and gets run again from the top.

That sounds wasteful. It is the opposite, and understanding why is understanding how a modern incremental build actually thinks.

The problem restated

A build is a graph of computations. To build app you must first build lib, which needs a.o and b.o, which need their sources. The catch that makes this hard: you often don't know a computation's dependencies until you start computing it. Reading a BUILD file tells you what a target depends on — but you only learn that by reading the file, which is itself a computation with its own dependencies. The dependency graph is not handed to you; it is discovered as you evaluate.

Most systems handle this with an explicit two-phase split: first load and analyze everything to build the graph, then execute it. Skyframe does something subtler and more uniform. It treats every step — loading a package, resolving a configuration, executing an action — as the same kind of thing, a node in one graph, and discovers the edges by letting each node ask.

Three types, and that's nearly all

Read src/main/java/com/google/devtools/build/skyframe/ and the whole framework hangs from three interfaces.

A SkyValue is the immutable result of a computation — a marker interface. There is no hashing here: the engine compares values with plain equals(), which is all it needs to cache them and to notice when one stops changing.

A SkyKey names a computation: a function name plus its arguments. PACKAGE:foo, FILE:bar.c, ACTION:link-app. Keys are interned so identity comparison is cheap.
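To make "interned" concrete, here is a minimal sketch of key interning. It is not Bazel's interner: the class KeyInternSketch, its Key type, and the of factory are hypothetical names chosen for illustration. The idea is only that equal keys are funneled through one pool, so two lookups of the same (function, argument) pair hand back the same object.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Toy key interning; hypothetical names, only the idea behind interned keys. */
final class KeyInternSketch {
    static final class Key {
        final String function;
        final String argument;

        private Key(String function, String argument) {
            this.function = function;
            this.argument = argument;
        }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return function.equals(other.function) && argument.equals(other.argument);
        }

        @Override public int hashCode() { return Objects.hash(function, argument); }

        @Override public String toString() { return function + ":" + argument; }
    }

    private static final ConcurrentHashMap<Key, Key> POOL = new ConcurrentHashMap<>();

    /** Returns the one canonical instance for this (function, argument) pair. */
    static Key of(String function, String argument) {
        Key fresh = new Key(function, argument);
        Key existing = POOL.putIfAbsent(fresh, fresh);
        return existing != null ? existing : fresh;
    }

    public static void main(String[] args) {
        Key first = of("FILE", "bar.c");
        Key second = of("FILE", "bar.c");
        System.out.println(first + " same instance: " + (first == second));
    }
}
```

Once every key goes through the pool, a reference comparison is enough to tell two keys apart, which is the cheapness the paragraph above refers to.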

A SkyFunction is the computation itself. Here is its entire contract, from SkyFunction.java:

SkyValue compute(SkyKey skyKey, Environment env)
    throws SkyFunctionException, InterruptedException;

You are handed your key and an Environment. You produce a value. To reach a dependency, you ask the environment for it by key: env.getValue(someKey). That's the surface. The depth is in what happens when the thing you asked for isn't ready yet.

The trick: return null, get restarted

Here is the mechanism, quoted from the source, because no paraphrase does it justice:

The implementation can request arbitrary values using Environment#getValue. If the values are not ready, the call will return null; in that case the implementation should just return null, in which case the missing dependencies will be computed and the compute method will be started again.

That re-invocation has a name in the codebase: a Skyframe restart. Your function ran partway, discovered it needed FILE:b.c, found it wasn't computed yet, returned null — and Skyframe went off, computed b.c (and anything it needed, recursively), and then called your function again from the beginning, this time with b.c sitting ready in the environment.

The first time you meet this, it reads like a performance disaster: you're re-running compute functions repeatedly, throwing away partial work. But look at what it buys.

Imagine you're following a recipe and you reach a step that needs melted butter — but the butter is still in the fridge. You don't stand at the counter blocking the kitchen. You set the recipe down ("I'll come back to this"), melt the butter, and then start the recipe again from the top — except now the butter is ready, so you sail past that step. Annoying to start over? Barely — reading the recipe again is instant; melting the butter was the slow part, and you only did that once. Skyframe does exactly this with build steps.

Figure (interactive in the original): a SkyFunction asks for a dependency, returns null when it isn't ready, and is restarted once the dependency is computed.

The function never has to declare its dependencies up front — it discovers them by demanding them, and the act of demanding is the act of recording the edge. Skyframe learns the graph for free, as a side-effect of evaluation. There is no separate "figure out the dependencies" phase to keep in sync with the "do the work" phase; they are the same phase. The two can never drift apart, because they are one thing.
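The whole loop fits in a page. What follows is a toy sketch, not Skyframe's implementation: RestartSketch, Env, Fn, and the integer "values" are all invented for illustration, and the evaluator is single-threaded. It shows the two things the text claims: a function that returns null is re-invoked from the top once its missing dependency exists, and the edges are learned purely from what the function asked for.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy restart-based evaluator; every name here is hypothetical, not Bazel's API. */
final class RestartSketch {
    /** What a function sees: a way to demand dependencies by key. */
    static final class Env {
        private final Map<String, Integer> ready;
        final Set<String> requested = new LinkedHashSet<>();
        final List<String> missing = new ArrayList<>();

        Env(Map<String, Integer> ready) { this.ready = ready; }

        /** Returns the dependency's value, or null if it has not been computed yet. */
        Integer getValue(String key) {
            requested.add(key);                 // demanding is what records the edge
            Integer value = ready.get(key);
            if (value == null) missing.add(key);
            return value;
        }
    }

    interface Fn { Integer compute(String key, Env env); }

    final Map<String, Fn> functions = new HashMap<>();
    final Map<String, Integer> values = new HashMap<>();       // memoized results
    final Map<String, Set<String>> edges = new HashMap<>();    // graph learned by asking
    final Map<String, Integer> invocations = new HashMap<>();  // compute() calls per key

    Integer evaluate(String key) {
        Integer cached = values.get(key);
        if (cached != null) return cached;
        while (true) {
            Env env = new Env(values);
            invocations.merge(key, 1, Integer::sum);
            Integer result = functions.get(key).compute(key, env);
            if (result != null) {
                edges.put(key, env.requested);  // the completing run asked for every dep
                values.put(key, result);
                return result;
            }
            if (env.missing.isEmpty()) {
                throw new IllegalStateException(key + " returned null with nothing missing");
            }
            for (String dep : env.missing) evaluate(dep);  // build what was missing,
            // then loop: the function is restarted from the top.
        }
    }

    static RestartSketch demo() {
        RestartSketch engine = new RestartSketch();
        engine.functions.put("a.o", (key, env) -> 1);
        engine.functions.put("b.o", (key, env) -> 2);
        engine.functions.put("lib", (key, env) -> {
            Integer a = env.getValue("a.o");
            if (a == null) return null;         // not ready: give up, expect a restart
            Integer b = env.getValue("b.o");
            if (b == null) return null;
            return a + b;
        });
        engine.functions.put("app", (key, env) -> {
            Integer lib = env.getValue("lib");
            return lib == null ? null : lib * 10;
        });
        return engine;
    }

    public static void main(String[] args) {
        RestartSketch engine = demo();
        System.out.println("app = " + engine.evaluate("app"));
        System.out.println("compute() calls: " + engine.invocations);
        System.out.println("edges: " + engine.edges);
    }
}
```

In this toy, lib's compute runs three times (it bails out once for a.o and once for b.o) and app's runs twice, while a.o and b.o are each computed exactly once. Nothing ever declared that lib depends on a.o and b.o; the edges map ends up holding that fact anyway.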

And the re-runs are cheaper than they look. A restart happens at most once per newly discovered batch of dependencies, not once per dependency: well-written functions request everything they can before bailing out, so the engine can compute a whole frontier in parallel before restarting. The partial work thrown away is usually the cheap part; the expensive part (building b.c) is done exactly once and cached. "Usually" is doing honest work in that sentence: for functions whose pre-restart work is genuinely expensive (loading a large package, resolving hundreds of dependencies), Bazel later added an escape hatch, SkyFunction.Environment#getState, which lets a function carry partial state across restarts. The restart is a beautiful default, not a free one.
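The difference between "once per dependency" and "once per batch" is easy to count. The sketch below is a deliberately tiny model under stated assumptions: BatchSketch, Env.has, and the four-entry DEPS list are hypothetical, and "computing" a dependency is reduced to adding its name to a set. It does not use Bazel's batch lookup API; it only contrasts the two request styles.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Counts invocations for two request styles; a hypothetical model, not Bazel's API. */
final class BatchSketch {
    static final class Env {
        private final Set<String> ready;
        final Set<String> missing = new LinkedHashSet<>();

        Env(Set<String> ready) { this.ready = ready; }

        /** Stand-in for "getValue(key) != null": true only if the dep is built. */
        boolean has(String key) {
            if (ready.contains(key)) return true;
            missing.add(key);
            return false;
        }
    }

    static final List<String> DEPS = List.of("a.o", "b.o", "c.o", "d.o");

    /** Bails out at the first missing dependency, so each restart reveals one more. */
    static boolean oneAtATime(Env env) {
        for (String dep : DEPS) {
            if (!env.has(dep)) return false;
        }
        return true;
    }

    /** Asks for everything first and bails out once, so one restart covers the batch. */
    static boolean batched(Env env) {
        boolean allReady = true;
        for (String dep : DEPS) {
            allReady &= env.has(dep);   // non-short-circuit: every dep gets requested
        }
        return allReady;
    }

    /** Runs fn until it completes; returns how many times it was invoked. */
    static int invocationsUntilDone(Predicate<Env> fn) {
        Set<String> ready = new HashSet<>();
        int invocations = 0;
        while (true) {
            Env env = new Env(ready);
            invocations++;
            if (fn.test(env)) return invocations;
            if (env.missing.isEmpty()) {
                throw new IllegalStateException("gave up with nothing missing");
            }
            ready.addAll(env.missing);  // pretend the engine built the whole frontier
        }
    }

    public static void main(String[] args) {
        System.out.println("one at a time: " + invocationsUntilDone(BatchSketch::oneAtATime));
        System.out.println("batched:       " + invocationsUntilDone(BatchSketch::batched));
    }
}
```

With four dependencies, the one-at-a-time style is invoked five times and the batched style twice. The gap grows linearly with the number of dependencies, which is why the request style matters more than the restart itself.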

Incrementality: dirty, or changed

The restart trick builds the graph. The second half of Skyframe is what makes a re-build fast: deciding which cached nodes are still good.

When something changes, Skyframe marks affected nodes — but it distinguishes two kinds of staleness, and the distinction is the whole game. In NodeEntry.java, a node can be made DIRTY or CHANGED. A changed node must be recomputed no matter what — its input changed directly. A dirty node only might need recomputing: one of its dependencies is dirty, so on re-evaluation Skyframe walks that node's recorded dependencies and checks whether any actually produced a different value.

This is the payoff of comparing values, not timestamps or change events. Bazel's glossary calls the mechanism change pruning, and the comparison is exactly what it sounds like — DirtyBuildingState.java documents it as:

Returns true if newValue.equals the value from the last time this node was built.

If a dependency was dirtied but recomputes to an equal value (you changed a comment, or touched a file's mtime without changing its bytes), the node that depends on it is marked clean without re-running. (Skyframe even keeps the old value object on a match, preferring == identity to a fresh-but-equal copy.) Staleness stops propagating the moment a value is unchanged. A one-character edit deep in a header does not blindly invalidate everything above it; it invalidates exactly as far up as the values actually differ.

Where the digest comes from. FileStateValue uses the filesystem's fast digest when one exists; otherwise it stores only size plus a ctime/mtime/inode FileContentsProxy, and its equality is proxy equality, so a touch dirties the file-state node. The content hash is computed one layer up, when the action's inputs are digested (FileArtifactValue via DigestUtils' manual fallback, cached by that same proxy): the re-hash comes back equal, the action sees unchanged input digests, and the rebuild stops there. The timestamp only triggered the look.

Change vs. dirty. A changed input forces recompute; a dirty node recomputes only if a dependency's value actually differs — so unchanged values halt the propagation.

Consider two edits to the same graph. Edit a.c is a real code change: every node above it re-runs and produces a changed value, all the way to app. Reformat BUILD re-runs only parse, which returns an equal value, so everything above it is marked clean without re-running.

Figure (animated in the original): change pruning on one graph under both edits. Dashed means "might"; solid means "did".

This is the "cost proportional to the change" promise from the survey, made mechanical. The cost isn't proportional to what might have been affected (the transitive closure); it's proportional to what was — the nodes whose recorded inputs genuinely changed value.
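The dirty-versus-changed distinction and the equals() check can be sketched in a few dozen lines. This is a toy under stated assumptions, not Skyframe's NodeEntry machinery: PruningSketch, its State enum, the string-valued nodes, and the build counter are all hypothetical, and the dependency edges are declared up front to keep the example short. One simplification is worth flagging: the sketch brings every dependency up to date before deciding, whereas the real engine checks recorded dependencies in the order they were requested.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy change pruning; names are illustrative, not Bazel's classes. */
final class PruningSketch {
    enum State { CLEAN, DIRTY, CHANGED }

    static final class Node {
        final String name;
        final Function<List<String>, String> fn;
        final List<Node> deps = new ArrayList<>();
        final List<Node> rdeps = new ArrayList<>();
        String value;                 // last built value
        State state = State.CHANGED;  // never built, so it must run
        int changedAt;                // build in which the value last changed
        int runs;                     // times fn actually executed

        Node(String name, Function<List<String>, String> fn) { this.name = name; this.fn = fn; }
    }

    final Map<String, String> files = new HashMap<>();
    final Map<String, Node> nodes = new HashMap<>();
    int build = 0;

    Node add(String name, Function<List<String>, String> fn, String... depNames) {
        Node n = new Node(name, fn);
        for (String d : depNames) { n.deps.add(nodes.get(d)); nodes.get(d).rdeps.add(n); }
        nodes.put(name, n);
        return n;
    }

    void source(String path, String contents) {
        files.put(path, contents);
        add(path, inputs -> files.get(path));
    }

    /** A direct input change: the source is CHANGED, everything above is only DIRTY. */
    void edit(String path, String contents) {
        files.put(path, contents);
        Node source = nodes.get(path);
        source.state = State.CHANGED;
        markDirty(source);
    }

    private void markDirty(Node n) {
        for (Node parent : n.rdeps) {
            if (parent.state == State.CLEAN) { parent.state = State.DIRTY; markDirty(parent); }
        }
    }

    void rebuild(String top) { build++; update(nodes.get(top)); }

    /** Brings n up to date; returns true if its value changed during this build. */
    private boolean update(Node n) {
        if (n.state == State.CLEAN) return n.changedAt == build;
        boolean mustRun = n.state == State.CHANGED;
        List<String> inputs = new ArrayList<>();
        for (Node d : n.deps) {
            mustRun |= update(d);     // a DIRTY node runs only if a dep really changed
            inputs.add(d.value);
        }
        if (mustRun) {
            n.runs++;
            String fresh = n.fn.apply(inputs);
            if (!fresh.equals(n.value)) { n.value = fresh; n.changedAt = build; }
        }
        n.state = State.CLEAN;        // equal value: marked clean, parents are pruned
        return n.changedAt == build;
    }

    int runs(String name) { return nodes.get(name).runs; }

    static PruningSketch demo() {
        PruningSketch g = new PruningSketch();
        g.source("BUILD", "cc_binary(name='app')");
        g.source("a.c", "int main(){return 0;}");
        g.add("parse", in -> in.get(0).replaceAll("\\s+", ""), "BUILD");
        g.add("compile", in -> "obj(" + in.get(0) + ")", "a.c");
        g.add("app", in -> "link(" + in.get(0) + "," + in.get(1) + ")", "parse", "compile");
        g.rebuild("app");             // first build: every node runs once
        return g;
    }

    public static void main(String[] args) {
        PruningSketch g = demo();
        g.edit("BUILD", "cc_binary( name = 'app' )");
        g.rebuild("app");
        System.out.println("reformat BUILD: parse runs=" + g.runs("parse") + ", app runs=" + g.runs("app"));
        g.edit("a.c", "int main(){return 1;}");
        g.rebuild("app");
        System.out.println("edit a.c: compile runs=" + g.runs("compile") + ", app runs=" + g.runs("app"));
    }
}
```

After the reformat, parse has run twice but app still only once: parse's whitespace-stripped value came back equal, so app was marked clean without executing. After the real edit to a.c, compile and app both run a second time, because the value genuinely differs at each level.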

Hermeticity is a property of the function

One more detail worth pulling out, because it's where the elegance becomes a contract. Each SkyFunctionName carries a hermeticity flag — createHermetic, createNonHermetic, createSemiHermetic in the source.

Hermetic: a build step is hermetic when it depends only on its declared inputs and nothing else (no wall clock, no network, no stray files), so it produces the same output every time, everywhere.

A hermetic function is a pure function of its declared dependencies: if they're unchanged, its value cannot change, so Skyframe is free to skip it entirely. A non-hermetic one might change even with identical dependencies (it reads the wall clock, or the filesystem outside the graph), so it can't be trusted to stay clean.

This is why hermeticity matters beyond reproducibility. It's not a virtue you bolt on for its own sake — it's the precondition that lets the engine not run your function. Every non-hermetic node is a node Skyframe must pessimistically reconsider. Purity is what makes the incremental graph trustworthy.

What Skyframe actually is

Step back and the shape is clean. Skyframe is a general demand-driven incremental computation engine that happens to be aimed at builds. Functions discover their dependencies by asking for them; the asking records the graph; change pruning lets staleness die the moment a value stops changing; purity lets whole subgraphs be skipped. The "restart" that looked like a bug is what unifies dependency discovery and execution into a single phase that cannot fall out of sync.

It generalizes well past Bazel: the same pattern (a computation that requests inputs and suspends when they're missing) shows up in reactive build systems, in query-based frameworks like Rust's salsa, and in incremental compilers. The PL literature had the idea before Bazel shipped it: Umut Acar's self-adjusting computation (2005), made demand-driven in Adapton (PLDI 2014). Skyframe is that idea, engineered at monorepo scale. Once you've seen the restart trick, you start seeing it everywhere: it is what demand-driven incrementality looks like when you write it down honestly.

The next piece looks at the system that took this idea and collapsed Bazel's remaining phases into one graph — Buck2's DICE — and asks why Meta decided to rebuild rather than adopt.

Lessons

  • Skyframe is a general demand-driven incremental computation engine: a function discovers its dependencies by asking for them, and the asking records the graph — so dependency discovery and execution are one phase that can't fall out of sync.
  • Returning null and being restarted looks wasteful but isn't: the expensive work is done once and cached; only cheap partial evaluation is re-run.
  • Change pruning — comparing a recomputed value with equals() against the cached one — makes staleness stop the moment a value recomputes unchanged; a dirtied dependency that produces an equal value doesn't propagate.
  • Hermeticity isn't decoration: a pure function of its inputs is one the engine can safely skip. Purity is what makes the incremental graph trustworthy.

References

  1. “Bazel glossary: Skyframe.” Bazel. See also the Skyframe source.
  2. “Hermeticity in Bazel.” Bazel.
  3. Hammer, Khoo, Hicks & Foster. “Adapton: Composable, Demand-Driven Incremental Computation.” PLDI, 2014. The demand-driven lineage Skyframe belongs to; Umut Acar's Self-Adjusting Computation (CMU thesis, 2005) is its root.
  4. “salsa.” The same demand-driven pattern as a general Rust framework.
  5. “Incremental computing.” For the broader idea.
