Notes · Dissecting Real Systems

growing

node_modules Is the Heaviest Object in the Universe

The same "the hash is the identity" idea that powers a build cache also explains why pnpm stores on disk what npm copies a hundred times over.

· · 11 min read

package-managers, pnpm, npm, content-addressing, dissecting-systems

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

— Andy Hunt & Dave Thomas, The Pragmatic Programmer (1999), Tip 15: "DRY—Don't Repeat Yourself"

Cite this
APA
Mangalapilly, Y. J. (2026, June). node_modules Is the Heaviest Object in the Universe. Saṃhitā Notes. https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/
BibTeX
@online{mangalapilly2026node,
          author  = {Yesudeep Jose Mangalapilly},
          title   = {node\_modules Is the Heaviest Object in the Universe},
          journal = {Sa\d{m}hit\=a Notes},
          year    = {2026},
          month   = {June},
          url     = {https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/},
          urldate = {2026-07-02},
        }
Plain
Yesudeep Jose Mangalapilly. “node_modules Is the Heaviest Object in the Universe.” Saṃhitā Notes, 2026. https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/.
RIS
TY  - ELEC
        AU  - Mangalapilly, Yesudeep Jose
        TI  - node_modules Is the Heaviest Object in the Universe
        T2  - Saṃhitā Notes
        PY  - 2026
        UR  - https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/
        Y2  - 2026-07-02
        ER  - 

A sequel to The Hash Is the Identity, pointed at a tool every JavaScript developer already fights. By the end you'll see that pnpm and a remote build cache are the same idea wearing different clothes: a file's content hash is its identity. In a build farm that idea lets a result be shared; on your laptop it lets a package be stored once instead of a hundred times — and the joke about node_modules stops being a joke.

There is a running joke that node_modules is the heaviest object in the universe. Like most good jokes it encodes a real grievance: install a handful of JavaScript projects and you will find the same dependency, byte-for-byte identical, copied into each one. Ten projects that all use lodash keep ten copies of lodash. The grievance is not that the packages are large. It is that they are duplicated — and duplication is a choice, not a law of physics.

node_modules is heavy not because packages are big, but because the same bytes are stored over and over.

The previous piece showed how a build farm makes a result shared across a whole team: name an artifact by the hash of its contents, and anyone who needs the same contents gets the same artifact, computed once. pnpm applies the identical principle to the most mundane thing on your machine — the packages in node_modules — and the payoff is the mirror image. Where the build cache used content addressing to share work, pnpm uses it to save space.

The npm model: a copy per project

The default behavior most developers know is npm's. Each project gets its own node_modules, and each node_modules contains its own full copy of every dependency. pnpm's own motivation page states the cost plainly:

If you have 100 projects using a dependency, you will have 100 copies of that dependency saved on disk.

That is the heaviest-object problem in one sentence. The bytes of a given package version are completely determined — version 4.17.21 of a package is the same files everywhere — and yet they are written to disk once per project that asks for them.

Here is the part worth getting exactly right: npm's problem is not that it lacks a content-addressed store. It has one — the cache under ~/.npm/_cacache is built on a library literally named cacache ("content-addressable cache"), keyed by Subresource Integrity hashes. The difference is what happens at install time. npm's own docs: a package "is loaded into the cache, and then unpacked into ./node_modules/foo". Unpacked — copied out. The shared name exists in the cache and is abandoned at the project boundary; each project's copy is thereafter identified by where it sits, not by what it contains. The split between npm and pnpm is not content-addressing-versus-not; it is copy out of the store versus link into it.

The pnpm model: one store, linked

pnpm makes one change with consequences all the way down: it keeps a single, global, content-addressed store, and a project's node_modules is built out of links into that store rather than copies. From the same motivation page:

All the files are saved in a single place on the disk. When packages are installed, their files are hard-linked from that single place, consuming no additional disk space.

Hard link — a second directory entry pointing at the same bytes on disk, not a copy of them. The file has one set of blocks and two (or a hundred) names; deleting one name leaves the bytes alive as long as another points at them. A reflink (copy-on-write clone) shares blocks the same way but gives the new name private blocks the moment either side writes. Learn more.

The title of this essay is a quantity, so weigh it yourself:

The heaviest object, weighed. p projects across, d megabytes of dependencies down, drawn to scale: npm keeps a copy per project, so the filled area is p ⋅ d — the same bytes, p times over. Flip to pnpm and the same projects share one content-addressed store; the area collapses to d, the shape every content-addressed design in this series converges on.

The store lives in a fixed place per machine — on Linux, ~/.local/share/pnpm/store; on macOS, ~/Library/pnpm/store — and holds each file once, addressed by a hash of its content. When you install a package, pnpm does not copy its files into your project. It links them. The precise mechanism is a per-filesystem choice — the default packageImportMethod: auto will, per pnpm's settings docs, "try to clone packages from the store. If cloning is not supported then hardlink packages from the store," and copy only as a last resort. On APFS (the macOS default) and other copy-on-write filesystems that means clones; elsewhere, hard links. Either way the effect is the same: your node_modules entry and the store entry share the same blocks on disk. A hundred projects linking the same package add the package to the store once and pay only for a hundred cheap directory entries, not a hundred copies of the bytes.

npm keeps a full copy of a dependency inside every project. pnpm keeps one copy in a content-addressed store and hard-links it into each project — the same bytes, named many times, stored once.

This is exactly the build-cache move from the previous article, run in reverse. There, FindMissingBlobs meant identical content was uploaded once, ever. Here, identical content is stored once, ever. Both work for the same reason: the content's hash is its identity, so two things with the same bytes are, to the system, the same thing.

Why content addressing pays off twice: per-file dedup

The deepest part is subtle, and it is where naming-by-content earns its keep a second time. The store is keyed by the hash of each file, not by the package the file came from. So the unit of sharing is the file, not the package — and dependency versions that overlap share their unchanged files automatically. From pnpm's docs:

For instance, if it has 100 files, and a new version has a change in only one of those files, pnpm update will only add 1 new file to the store, instead of cloning the entire dependency just for the singular change.

Imagine a library where every book is filed by a fingerprint of its contents, not by who wrote it or which shelf it came from. Two slightly different editions of the same book share every page that didn't change; only the one revised page is filed anew. You never store the same page twice, even across different books.

Note pnpm's own "for instance": the one-file upgrade is an idealization. Any real release changes at least package.json (the version field lives there), so the practical floor is two files, and real releases touch a handful. The point survives intact: the unit of storage is the file, so an upgrade pays for the files that changed, not a fresh copy of the whole package. Because identity is content, the question "do I already have this?" is answered for files, not packages — and the answer is yes far more often than per-package thinking would suggest.

Two versions of a package differ in one file. Keyed by content hash, the store holds the shared files once and adds only the single changed file — five files on disk, not eight.

The honest limits

The mechanism is a lever, not a miracle. Three caveats keep the claim truthful.

  • Links need one filesystem. A hard link (or clone) can only point at bytes on the same volume, so pnpm keeps "one store per disk." Put the store on a different drive than your project and, in pnpm's own words, it "will copy packages from the store instead of hard-linking them."
  • Hard links cut both ways. Two names for the same blocks means an edit through either name edits both: patch a file inside a hard-linked node_modules and you have silently modified the shared store for every project on the machine. pnpm knows this — its settings docs note that with cloning "you may edit files in your node_modules and they will not be modified in the central content-addressable store," and verifyStoreIntegrity exists precisely because store mutation happens: "if a file in the store has been modified, the content of this file is checked before linking it to a project's node_modules."
  • It dedups whole files, not pieces of files. Sharing is per file: two files that are byte-identical share one copy; two files that differ by a single character are two separate stored objects. pnpm does not diff within a file. "Per file, not per package" is the honest framing — already a large win, but not block-level delta compression.
  • The savings are proportional, not absolute. pnpm's disk benefit grows with the number of projects and shared dependencies. One project with no overlap saves little; a laptop with forty projects sharing a deep, common dependency tree saves enormously. The benchmark numbers pnpm publishes are about speed and shift with each release — treat any specific figure as as-of-a-date, not a constant.

Worth one more sentence of breadth: pnpm's is not the only answer to the title's joke. Yarn Plug'n'Play removes node_modules altogether — a single generated loader file tells Node where every package lives — trading filesystem convention for resolution logic rather than deduplicating the convention.

The same idea, twice

Step back and the two articles are one idea applied in two directions. A file's content hash is its identity. Point that at a build farm and identical work is computed once and shared by everyone. Point it at your node_modules and identical files are stored once and deduplicated across every project. Same principle, opposite scarcity: in the cloud it buys time; on your disk it buys space.

Name a thing by its contents and duplication becomes impossible by construction — whether the thing is a compiled artifact or a file in node_modules.

The joke about node_modules was always really a joke about copying. Once the package manager names files by content instead of by location, the copies collapse into links, the heaviest object in the universe goes on a diet, and the grievance turns out to have had a one-line fix all along: stop storing the same bytes twice.

Lessons

  • npm has a content-addressed cache too (cacache, keyed by integrity hashes) — but it unpacks copies into every project's node_modules; "100 projects using a dependency" means "100 copies on disk."
  • pnpm keeps one global content-addressed store and links files into each project — copy-on-write clones where the filesystem supports them, hard links otherwise — so identical bytes are stored once and shared.
  • Because the store is keyed by file content, dedup is per-file: a new package version that changes one file adds one file, not a fresh copy of the whole package.
  • The flat .pnpm store plus a declared-only symlink graph also designs out phantom dependencies — you can't import what you didn't list.
  • It's the same "the hash is the identity" principle as a remote build cache, aimed at space instead of shared work. Limits: hard links need one filesystem; dedup is per-file, not block-level; savings scale with how much your projects overlap.

References

  1. pnpm: motivation.” pnpm. — "saving disk space," in pnpm's own words
  2. pnpm: the symlinked node_modules structure.” pnpm. — the store-and-symlink layout
  3. pnpm: settings (packageImportMethod, verifyStoreIntegrity).” pnpm. — clone vs hardlink vs copy, and the store-integrity check
  4. cacache.” npm. — npm's own content-addressable cache
  5. Yarn Plug'n'Play.” Yarn. — the eliminate-node_modules alternative
  6. The Hash Is the Identity.” — the same principle, in a remote build cache

How to cite

APA
Mangalapilly, Y. J. (2026, June). node_modules Is the Heaviest Object in the Universe. Saṃhitā Notes. https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/
BibTeX
@online{mangalapilly2026node,
          author  = {Yesudeep Jose Mangalapilly},
          title   = {node\_modules Is the Heaviest Object in the Universe},
          journal = {Sa\d{m}hit\=a Notes},
          year    = {2026},
          month   = {June},
          url     = {https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/},
          urldate = {2026-07-02},
        }
Plain
Yesudeep Jose Mangalapilly. “node_modules Is the Heaviest Object in the Universe.” Saṃhitā Notes, 2026. https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/.
RIS
TY  - ELEC
        AU  - Mangalapilly, Yesudeep Jose
        TI  - node_modules Is the Heaviest Object in the Universe
        T2  - Saṃhitā Notes
        PY  - 2026
        UR  - https://yesudeep.com/blog/node-modules-is-the-heaviest-object-in-the-universe/
        Y2  - 2026-07-02
        ER  - 

Annotations

Thank you — your note is held for review and will appear once approved.

Thank you — your note is published.

Please sign in below to leave a note.

Type to search · ↑↓ to move · ↵ to open · Esc to close