fixed-output derivations
how fetchers work and why network access requires a content hash.
every derivation you have seen so far is input-addressed. hash the inputs, get the output path. the sandbox blocks network access. same inputs, same output.
fetchers break this model. you cannot hash the inputs to a download because the interesting input is a URL, and URLs are not content-addressed. the same URL can serve different content tomorrow. a git repository at the same URL grows new commits.
fixed-output derivations solve this by flipping the guarantee: instead of "same inputs produce the same output", they say "the output must match this hash, or the build fails".
src = fetchurl {
url = "https://curl.se/download/curl-8.18.0.tar.xz";
sha256 = "sha256-QHkMV0L6dSQaf3sVGGeGIBJDQ7uftFaFMJfnMbMxNiE=";
};
fetchurl creates a derivation that:
- downloads the URL inside the sandbox (fixed-output derivations are granted network access)
- verifies the result against the declared sha256, failing the build on any mismatch

the store path is computed from the output hash, not the input hash. the fingerprint is:
"fixed:out:" <mode> <algo> ":" <hash> ":"
where mode is "" for flat or "r:" for recursive (see below). the URL, the builder script, the version of curl used to download; none of that appears in the fingerprint. two different URLs producing the same file get the same store path.
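a minimal python sketch of that fingerprint construction, to make the "the URL does not appear" point concrete. this is illustrative only: nix hashes this string again and truncates it to form the actual store path, but the shape of the string is what matters here.

```python
import hashlib

def fixed_output_fingerprint(hash_hex: str, algo: str = "sha256",
                             recursive: bool = False) -> str:
    """Build the fingerprint string for a fixed-output derivation.

    Only the declared output hash goes in -- the URL, the builder,
    and the fetch tool are absent, so two URLs serving identical
    bytes yield identical fingerprints (and the same store path).
    """
    mode = "r:" if recursive else ""  # "" = flat, "r:" = recursive
    return f"fixed:out:{mode}{algo}:{hash_hex}:"

# the same file served from two different mirrors -> same fingerprint,
# because the URL never enters the computation
digest = hashlib.sha256(b"example tarball bytes").hexdigest()
print(fixed_output_fingerprint(digest))
```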
fixed-output derivations require three special attributes: outputHash, outputHashAlgo, and outputHashMode.
outputHashMode controls how the output is digested:
- flat (default): hash the output file directly. fetchurl uses this. the output must be a single file.
- recursive (aliased as nar since nix 2.21): hash the NAR serialization of the output. fetchFromGitHub and fetchgit use this because their output is a directory.
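fetchurl sets these attributes for you, but nothing stops you from writing them on a raw derivation yourself. a sketch of what that looks like (the hash is a placeholder and the builder is simplified, not a working build):

```nix
derivation {
  name = "hello.txt";
  system = "x86_64-linux";
  builder = "/bin/sh";
  args = [ "-c" "echo hello > $out" ];
  # the three fixed-output attributes:
  outputHash = "sha256-...";   # placeholder: the declared content hash
  outputHashAlgo = "sha256";
  outputHashMode = "flat";     # single file; "recursive" for directories
}
```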
you declare the hash upfront. nix trusts that declaration at evaluation time. it computes the output path from the hash, wires it into downstream derivations, and only verifies at build time.
if the hash is wrong, the build fails:
hash mismatch in fixed-output derivation '/nix/store/...-curl-8.18.0.tar.xz':
specified: sha256-QHkMV0L6dSQaf3sVGGeGIBJDQ7uftFaFMJfnMbMxNiE=
got: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
no fallback. no warning. hard failure.
this is the one place where you make a claim about content and nix verifies it. everything else is structural: same inputs, same output, by construction.
one more relaxation: fixed-output derivations can use impureEnvVars to pass through environment variables from the host. fetchurl uses this for http_proxy, https_proxy, and related variables. this is safe because the output hash check catches any difference. if your proxy corrupts the download, the hash mismatches and the build fails.
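a sketch of how a fetcher opts into those variables. the attribute name and the proxy variables are real; the rest of the derivation is elided for illustration:

```nix
derivation {
  name = "curl-8.18.0.tar.xz";
  # ...builder setup elided...
  outputHash = "sha256-QHkMV0L6dSQaf3sVGGeGIBJDQ7uftFaFMJfnMbMxNiE=";
  outputHashAlgo = "sha256";
  outputHashMode = "flat";
  # passed through from the host environment; safe because the output
  # hash check catches any tampering or corruption on the way down
  impureEnvVars = [ "http_proxy" "https_proxy" "ftp_proxy" "all_proxy" "no_proxy" ];
}
```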
you do not compute the hash yourself. nix does it.
the standard workflow: leave the hash empty (or use a dummy), run the build, nix tells you the correct hash in the error message, paste it in.
src = fetchurl {
url = "https://curl.se/download/curl-8.18.0.tar.xz";
sha256 = "";
};
hash mismatch in fixed-output derivation:
specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
got: sha256-QHkMV0L6dSQaf3sVGGeGIBJDQ7uftFaFMJfnMbMxNiE=
copy the got: line. done. nix-prefetch-url and nix hash can also compute it without attempting a full build.
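the sha256-… string nix prints is an SRI hash: base64 of the raw digest with an algorithm prefix. a sketch of computing it yourself for a flat (single-file) fetch, assuming you already have the file's bytes locally:

```python
import base64
import hashlib

def sri_sha256(data: bytes) -> str:
    """Compute the SRI-style hash string nix prints, e.g. sha256-QHkM..."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode()

# for a flat fetch, this matches what nix reports for the downloaded file
print(sri_sha256(b"example file contents"))
```

nix-prefetch-url and nix hash compute the same thing from a URL or a local path, without running a full build.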
| fetcher | what it fetches |
|---|---|
| fetchurl | a single file by URL |
| fetchzip | a zip/tarball, auto-extracted |
| fetchFromGitHub | a GitHub repository at a specific rev |
| fetchgit | a git repository at a specific rev |
| fetchpatch | a patch file by URL |
all of them are fixed-output derivations. all of them require a hash. all of them get network access in the sandbox.
src = fetchFromGitHub {
owner = "curl";
repo = "curl";
rev = "curl-8_18_0";
sha256 = "sha256-...";
};
this resolves to a URL like https://github.com/curl/curl/archive/curl-8_18_0.tar.gz. the rev pins the exact commit or tag. without it, the hash would change on every new commit and your build would break.
pinning is the whole point. a fixed-output derivation with a pinned rev and a verified hash is as reproducible as an input-addressed one. the network access is a one-time fetch, not an ongoing dependency.
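the same source can be expressed with fetchzip directly, since fetchFromGitHub is (in the default, no-submodules case) a thin wrapper over it. note the hash is a placeholder here, and it would differ from fetchurl's hash of the tarball, because the mode is recursive over the unpacked tree:

```nix
src = fetchzip {
  url = "https://github.com/curl/curl/archive/curl-8_18_0.tar.gz";
  sha256 = "sha256-...";  # hash of the unpacked directory, not the tarball
};
```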
if content-addressing is so clean, why not use it for all derivations?
because you would have to build first and hash second. you could not compute output paths before building. you could not wire dependencies together before building everything. the entire lazy evaluation model from chapter 3 depends on knowing output paths upfront.
input-addressing gives you output paths at evaluation time. that is what makes binary caches work: you compute the output path from the .drv hash, check if the cache has it, and skip the build entirely. with content-addressing, you would have to build locally to discover the output path, defeating the purpose.
nix does support experimental content-addressed derivations (__contentAddressed = true). they hash the output after building, enabling deduplication of identical outputs from different inputs. but the standard model is input-addressed, and nearly everything in nixpkgs uses it.
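opting a derivation into that experimental model looks roughly like this (requires the ca-derivations experimental feature; sketch only):

```nix
stdenv.mkDerivation {
  name = "ca-example";
  buildCommand = "echo hello > $out";
  # floating content-addressed: no outputHash declared upfront;
  # the store path is derived from the output after the build,
  # unlike a fixed-output derivation where you declare the hash
  __contentAddressed = true;
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
}
```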
in the .drv file, the hash chain bottoms out at fixed-output derivations. every non-trivial package eventually depends on source code fetched from the internet. those fetches are the leaves of the derivation tree.
curl.drv
→ openssl.drv
→ openssl-3.6.1.tar.gz.drv (fixed-output)
→ curl-8.18.0.tar.xz.drv (fixed-output)
the leaves are fixed-output, verified by content hash. everything above is input-addressed. the chain is unbroken from source to binary.
nixpkgs is where all of this comes together. 100,000 packages, the callPackage pattern, overlays, and overrides.