Part II: The Drawing of the Image

Defining the Boundaries - Registry, Runtime, and the Breath of Life

Still learning new tech with the curiosity of a junior developer, but with the battle scars of a 50-year-old senior. Currently letting AI write my boilerplate while I aggressively judge its architecture.

"He could hear the water, but he could not see it yet. It was enough. The sound was the promise, and the promise was sufficient for a man who had already crossed a desert."

— Stephen King, The Drawing of the Three


Prologue: The Shoreline

The scaffold was complete. Three weeks of foundation work—the make ci-local wheel, the 57-linter proving grounds, the waystation state engine, the tower configuration reader, the specgen embryo, the cobra CLI skeleton—had produced a binary that knew its own shape but could do nothing. It was like a gunslinger who had been born, raised, and trained in the art of the revolver—but whose holsters were still empty.

The desert of the Spec, that enormous burning cartography of standards and container runtime interfaces, lay behind him now. A mapped country. Surveyed and understood. The OCI Runtime Specification said how a container ran. The OCI Image Specification said what a container image was. The OCI Distribution Specification said how to obtain one from a registry. All three had been read, annotated, absorbed.

Now came the harder thing: implementing them.

The shore Roland approached in these middle weeks—weeks four through eleven of the Phase 1 journey—was not a literal shore, but the metaphor arrived unbidden from King's second Dark Tower volume, The Drawing of the Three. Roland wakes on a beach beside three doors rising from the sand, each door opening into a different mind in a different world, and through each door he must draw one of his ka-tet companions, one by one, across the boundary between worlds. The act of drawing is intimate and dangerous and irreversible. Once you have pulled something through, it is here. It exists in this world. You are responsible for it.

Pulling an OCI image from a remote registry is exactly this. The image—nginx:latest, alpine:3.19, postgres:16—exists somewhere out there in the registry world, an arrangement of blobs and JSON and digest claims that has never been on this machine. The engineer reaches through the door of the HTTP API, touches the image's manifest, and begins drawing. Layer by layer, SHA by SHA, the image crosses the boundary. When the last layer lands and the final digest is verified, the image is here. Local. Real. Committed to the content-addressable store, protected by cryptographic integrity, ready to be mounted as a rootfs.

This is the Drawing. And before Roland could draw anything, he needed to build the door.

The weeks that followed were not neat. They were the weeks of a craftsman building tools to build tools: shardik to talk to registries, maturin to store what was pulled, drawing to orchestrate the sequence, gan to orchestrate the execution, eld to interface with the OCI runtime, prim to prepare filesystems, specgen to generate specifications, white to enforce security policy. Each package took weeks to reach 100% coverage. Each one had its own wall of bugs, its own archaeology of edge cases, its own surprise hidden behind an assumption that turned out to be wrong.

The shore was long. The doors were heavy. The companions did not come willingly—they had to be understood, modeled, tested, and drawn precisely or they would arrive broken, and a broken companion on the beach of a hostile world was worse than no companion at all.

But the drawing was done. And this is how.


Chapter 1: Shardik — Building the Bear

The first package written in the drawing phase was shardik, named after the Great Bear who guards the portal in King's mythos. In the Dark Tower novels, Shardik is one of the twelve Guardians of the Beams—enormous mechanical animals, each protecting one end of one of the six beams that hold the Dark Tower upright. Shardik specifically is a bear the size of a skyscraper, ancient and ruined and terrifying, its metal skin pitted by centuries of operation, its eyes burning with a trapped and hopeless intelligence.

The registry is Shardik. It is enormous: Docker Hub alone serves billions of pulls per month. It is mechanical: the OCI Distribution Specification defines exactly how each HTTP request must be structured and what each response must contain. It is ancient: Docker's public registry has been running since 2013, and its API has accumulated edge cases and undocumented behaviors over more than a decade. And it is terrifying to approach for the first time: authentication flows vary between registry implementations, manifest schemas have evolved through multiple incompatible versions, and the gap between what the specification says and what registries actually do in production is a chasm you can fall into without warning.

Roland's initial approach was to build shardik from first principles, implementing the distribution protocol by hand. He read the specification, sketched the HTTP calls, noted the token exchange sequence for Docker Hub authentication. Then he read the Go ecosystem research document he'd written during the planning phase, which had already evaluated the available libraries:

"go-containerregistry from Google is the standard. It is the library that crane, ko, skopeo (partially), and Kubernetes itself use for registry operations. It abstracts the authentication flows, the manifest schema differences, the redirect handling, and the dozens of per-registry quirks. It has excellent test coverage, is actively maintained, and handles cases that would take months to discover independently."

The conclusion was clear: use go-containerregistry. Not because it made the problem disappear, but because it removed the layer of problems that weren't Maestro's to solve. Maestro's differentiation was not in registry HTTP client implementation; it was in the full-stack integration: credential management, local image storage, spec generation, runtime invocation. The right strategy was to use the best available library for the standard protocol layer and spend engineering attention on the integration glue that no library could provide.

This decision—made consciously, documented, defended against the temptation to build everything by hand—was itself a form of gunslinger discipline. Roland did not forge his own gun barrels. He obtained them from the craftsmen who knew that art. He did the forging that only he could do.

1.1 — The RegistryClient Interface

The first thing written in shardik was not an implementation. It was an interface:

// RegistryClient defines the operations for interacting with an OCI-compatible registry.
type RegistryClient interface {
    // ImageMetadata retrieves the manifest digest and platform info for the given reference.
    ImageMetadata(ctx context.Context, ref string) (*ImageMetadata, error)
    // PullImage pulls an OCI image from the registry to local storage.
    PullImage(ctx context.Context, ref string, dest string) error
    // ListRegistryTags lists available tags for a given image reference.
    ListRegistryTags(ctx context.Context, ref string) ([]string, error)
}

Three methods. That was all drawing—the pull orchestrator—would ever need from shardik. The interface was narrow by design. Wide interfaces in Go are a trap: they force tests to mock methods that should be irrelevant to the behavior under test, they leak implementation decisions into the API, and they make the package harder to replace in the future.

The narrow interface also made testing immediate: a fakeRegistryClient struct implementing three methods, with configurable return values and call tracking, could drive the full Drawing test suite without ever making an HTTP connection. This was not test convenience—it was architectural integrity. If the Drawing package could be tested without a real registry, then the Drawing package was genuinely independent of the registry implementation. The interface was load-bearing: it proved the decoupling.
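A fake of this kind can be sketched in a few dozen lines. The version below is a minimal illustration, not the project's actual test helper: the ImageMetadata fields, the call-log format, and the pullWithFake helper are all assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ImageMetadata is a hypothetical stand-in for shardik's metadata type.
type ImageMetadata struct {
	Digest   string
	Platform string
}

// RegistryClient mirrors the three-method interface shown in the article.
type RegistryClient interface {
	ImageMetadata(ctx context.Context, ref string) (*ImageMetadata, error)
	PullImage(ctx context.Context, ref string, dest string) error
	ListRegistryTags(ctx context.Context, ref string) ([]string, error)
}

// fakeRegistryClient returns configured values and records every call.
type fakeRegistryClient struct {
	metadata *ImageMetadata
	tags     []string
	pullErr  error
	calls    []string // one entry per call, e.g. "PullImage(alpine:3.19)"
}

func (f *fakeRegistryClient) ImageMetadata(_ context.Context, ref string) (*ImageMetadata, error) {
	f.calls = append(f.calls, "ImageMetadata("+ref+")")
	if f.metadata == nil {
		return nil, errors.New("fake: no metadata configured")
	}
	return f.metadata, nil
}

func (f *fakeRegistryClient) PullImage(_ context.Context, ref, _ string) error {
	f.calls = append(f.calls, "PullImage("+ref+")")
	return f.pullErr
}

func (f *fakeRegistryClient) ListRegistryTags(_ context.Context, ref string) ([]string, error) {
	f.calls = append(f.calls, "ListRegistryTags("+ref+")")
	return f.tags, nil
}

// Compile-time check that the fake satisfies the interface.
var _ RegistryClient = (*fakeRegistryClient)(nil)

// pullWithFake drives a failing pull through the interface and returns
// the recorded calls together with the injected error.
func pullWithFake() ([]string, error) {
	fake := &fakeRegistryClient{pullErr: errors.New("registry unavailable")}
	var client RegistryClient = fake
	err := client.PullImage(context.Background(), "alpine:3.19", "/tmp/blobs")
	return fake.calls, err
}

func main() {
	calls, err := pullWithFake()
	fmt.Println(calls, err)
}
```

Because the orchestrator only ever sees the interface, swapping the fake for the real client requires no change in the code under test.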

1.2 — The Coverage War

With go-containerregistry handling the transport layer, Roland focused on the credential management, error handling, and the structural decisions that lived above the library. The shardik package grew from a dozen functions to dozens, and with each function came a test, and with each group of tests came the coverage measurement ritual:

$ go test -coverprofile=coverage.out ./internal/shardik/...
ok      github.com/rodrigo-baliza/maestro/internal/shardik    2.47s   coverage: 75.5% of statements

Seventy-five and a half percent. The linter did not complain about it—there was no linter rule for coverage percentage—but the declared discipline of the project was one hundred percent on all packages. Not "we aim for high coverage." Not "keep it above eighty." One hundred percent, or document explicitly why each uncovered line was genuinely unreachable.

The gap between 75.5% and 100% was not laziness. It was archaeology. Every uncovered line was a question: why is this line not being exercised? Sometimes the answer was "the test setup is incomplete." Easily fixed. Sometimes the answer was "this error path depends on the registry returning a malformed response that go-containerregistry never actually surfaces." Those paths required mock servers—HTTP test servers that returned exactly the wrong thing in exactly the right context to trigger the defensive code.

The mock server construction was its own sub-discipline. A mock registry server had to handle multiple sequential requests: first the authentication challenge, then the authentication token response, then the manifest request, then each layer blob request. Each response had to include the exact headers that go-containerregistry expected: Content-Type: application/vnd.oci.image.manifest.v1+json for manifests, Content-Length for blobs, Docker-Content-Digest for SHA-256 verification. Missing a header sent the library into unexpected code paths. Including wrong Content-Type values triggered schema validation errors. The mock server was not a toy—it was a precision instrument that had to model the exact behavior of production registries in order to drive the exact error paths in the production code.

The mock server was implemented using net/http/httptest.NewServer, with a state machine tracking which request number each test was at and returning the appropriate response. Test cases that needed to trigger specific errors would configure the mock to return a 401 on the third request, or a 500 on the second layer download, or a Docker-Content-Digest header that didn't match the body—anything the real world could produce, reproduced in a controlled test harness. This made the tests both exhaustive and documentation: reading a mock server test showed, precisely, what the shardik code would see from a misbehaving registry and what it was expected to do in response.
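The request-counting idea can be sketched as follows. This is a simplified illustration rather than the project's mock: the newFlakyRegistry and statusSequence helpers, the URL path, and the token realm are hypothetical, and a real mock would route by path and serve blobs as well.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newFlakyRegistry returns a test server that answers 401 on the failAt-th
// request (1-based) and a tiny manifest-like body on every other request.
func newFlakyRegistry(failAt int64) *httptest.Server {
	var count int64
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		n := atomic.AddInt64(&count, 1) // which request number is this?
		if n == failAt {
			// Hypothetical realm; real registries point at their token service.
			w.Header().Set("WWW-Authenticate", `Bearer realm="https://auth.example.test/token"`)
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/vnd.oci.image.manifest.v1+json")
		w.WriteHeader(http.StatusOK)
		io.WriteString(w, `{"schemaVersion":2}`)
	}))
}

// statusSequence issues n GET requests and returns the status codes seen.
func statusSequence(n int, failAt int64) ([]int, error) {
	srv := newFlakyRegistry(failAt)
	defer srv.Close()

	var codes []int
	for i := 0; i < n; i++ {
		resp, err := http.Get(srv.URL + "/v2/library/alpine/manifests/3.19")
		if err != nil {
			return nil, err
		}
		resp.Body.Close()
		codes = append(codes, resp.StatusCode)
	}
	return codes, nil
}

func main() {
	codes, err := statusSequence(3, 3)
	fmt.Println(codes, err)
}
```

Each test case then reduces to choosing which request fails and asserting what the client code does about it.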

By the time shardik reached 100% coverage, it had 74 tests across its internal and external test files. Twenty-nine of those tests had existed in the first working version. Forty-five were added specifically in service of the coverage campaign. Every one of them documented a real behavior—a real edge case that a real registry might trigger and that the code had to handle cleanly.

1.3 — The userHomeDirFn Injection

During the coverage campaign, one particularly stubborn uncovered line appeared in the credential loading code. The function that found the Docker credential store read its configuration from ~/.docker/config.json, which required calling os.UserHomeDir(). That is a package-level function whose result comes from the process environment ($HOME on Unix), so the only way to steer it from a test was to mutate process-wide state, which is fragile and does not mix with parallel tests. Without an injection point, the behavior stayed tied to the host.

The fix was to introduce userHomeDirFn as an injectable field on the credential loader struct:

type CredentialLoader struct {
    userHomeDirFn func() (string, error)
    // ... other fields
}

func NewCredentialLoader() *CredentialLoader {
    return &CredentialLoader{
        userHomeDirFn: os.UserHomeDir,
    }
}

Tests could inject a function that returned a temporary directory with a controlled config.json content. This made the error path—os.UserHomeDir failing—testable by injecting a function that returned an error:

loader := NewCredentialLoader()
loader.userHomeDirFn = func() (string, error) {
    return "", errors.New("no home directory")
}

The injection was a tiny structural change with outsized testing leverage. The pattern—replacing calls to global functions with injectable function fields—was applied throughout the codebase wherever the code under test needed to interact with the host environment in ways that tests couldn't control. os.Getuid(), exec.LookPath(), os.Hostname(), time.Now()—each became an injectable dependency in any function that needed deterministic behavior in tests. The result was a codebase where every line was reachable under some test configuration, and where the distance between "what the code does" and "what the tests verify" was continuously closing toward zero.

[Engineering Sidebar: The OCI Distribution Specification]

The OCI Distribution Specification (which grew out of the Docker Registry HTTP API V2) defines how container images are stored in and retrieved from registries. Authentication is largely left to the registry; in practice most registries follow the two-phase token flow documented for Docker's registry: the client first contacts the registry endpoint, which returns a 401 Unauthorized with a WWW-Authenticate header pointing to a token service; the client contacts the token service with credentials and scope parameters; the token service returns a bearer token; the client uses that bearer token for subsequent requests until it expires or a different scope is needed.
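The first half of that exchange, turning the challenge header into a token request, can be sketched as below. This is an illustrative parser only: the header value is a sample, the function names are hypothetical, and the scheme match is deliberately simplistic (a library such as go-containerregistry handles the full grammar).

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// sampleChallenge is an illustrative WWW-Authenticate value.
const sampleChallenge = `Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:library/alpine:pull,push"`

// parseBearerChallenge splits a Bearer challenge into its key/value
// parameters, respecting commas inside quoted values.
func parseBearerChallenge(header string) (map[string]string, error) {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return nil, fmt.Errorf("unsupported challenge %q", header)
	}
	params := map[string]string{}
	rest := header[len(prefix):]
	for {
		rest = strings.TrimLeft(rest, ", ")
		if rest == "" {
			return params, nil
		}
		eq := strings.IndexByte(rest, '=')
		if eq < 0 {
			return nil, fmt.Errorf("malformed parameter in %q", header)
		}
		key := rest[:eq]
		rest = rest[eq+1:]
		if strings.HasPrefix(rest, `"`) {
			end := strings.IndexByte(rest[1:], '"')
			if end < 0 {
				return nil, fmt.Errorf("unterminated quote in %q", header)
			}
			params[key] = rest[1 : 1+end]
			rest = rest[end+2:] // skip value and closing quote
			continue
		}
		end := strings.IndexByte(rest, ',')
		if end < 0 {
			end = len(rest)
		}
		params[key] = rest[:end]
		rest = rest[end:]
	}
}

// tokenURL builds the token-service request URL from the parsed challenge.
func tokenURL(params map[string]string) (string, error) {
	realm, ok := params["realm"]
	if !ok {
		return "", errors.New("challenge has no realm")
	}
	u, err := url.Parse(realm)
	if err != nil {
		return "", err
	}
	q := u.Query()
	for _, k := range []string{"service", "scope"} {
		if v := params[k]; v != "" {
			q.Set(k, v)
		}
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	params, err := parseBearerChallenge(sampleChallenge)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(tokenURL(params))
}
```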

Beyond authentication, the specification defines two core object types: manifests and blobs. A manifest is a JSON document that describes an image—its configuration blob and its ordered list of layer blobs, each referenced by digest (SHA-256). A blob is arbitrary binary data, identified solely by its SHA-256 hash, regardless of its content or original name. The content-addressable nature of the blob store is fundamental: the same blob at different registries, or the same layer referenced from multiple images, will have identical digests.

Multi-platform images are represented as manifest indexes—top-level manifests that list platform-specific sub-manifests. When you pull nginx:latest on an arm64 machine, the client first fetches the index, finds the sub-manifest for linux/arm64, and then fetches that sub-manifest to get the actual layers. This two-level structure is why platform-aware pulls require logic above the basic manifest fetch.
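The platform-selection step can be sketched with plain JSON decoding. The struct fields below follow the field names used in OCI image indexes, but the types are trimmed to what the example needs, the digests are placeholders, and selectManifest is a hypothetical helper rather than Maestro's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
}

type descriptor struct {
	MediaType string    `json:"mediaType"`
	Digest    string    `json:"digest"`
	Platform  *platform `json:"platform,omitempty"`
}

type imageIndex struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// sampleIndex is a hand-written index with placeholder digests.
const sampleIndex = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {"mediaType": "application/vnd.oci.image.manifest.v1+json",
     "digest": "sha256:amd64-placeholder",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"mediaType": "application/vnd.oci.image.manifest.v1+json",
     "digest": "sha256:arm64-placeholder",
     "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}`

// selectManifest returns the digest of the sub-manifest matching goos/arch.
func selectManifest(raw []byte, goos, arch string) (string, error) {
	var idx imageIndex
	if err := json.Unmarshal(raw, &idx); err != nil {
		return "", fmt.Errorf("decoding index: %w", err)
	}
	for _, m := range idx.Manifests {
		if m.Platform != nil && m.Platform.OS == goos && m.Platform.Architecture == arch {
			return m.Digest, nil
		}
	}
	return "", fmt.Errorf("no manifest for %s/%s", goos, arch)
}

func main() {
	fmt.Println(selectManifest([]byte(sampleIndex), "linux", "arm64"))
}
```

The digest returned here is what the client fetches next to obtain the actual layer list.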

The OCI Image Layout, a directory structure for storing images on a local filesystem (essentially a registry on disk), is defined by the companion OCI Image Specification rather than by the Distribution Specification. Maestro's maturin package implements a simplified version of this layout as its content-addressable store.


Chapter 2: Maturin Rises — The Content-Addressable Store

If Shardik was the bridge to the outside world, Maturin was the fortress that protected what crossed it. Named after the ancient Turtle—slow, immense, foundational, carrying the world on its back—the maturin package was Maestro's local image storage engine: a content-addressable filesystem where every blob was named by its SHA-256 digest and every read was verified against that digest before use.

The design was drawn directly from the OCI Image Layout specification, with additions dictated by operational reality:

~/.local/share/maestro/images/
├── index.json                    # catalog of known images: name+tag → digest
├── blobs/
│   └── sha256/
│       ├── <config-digest>       # image configuration JSON
│       ├── <layer-digest-1>      # compressed layer tar.gz
│       └── <layer-digest-2>
└── refs/
    └── <name>/
        └── <tag> -> ../../blobs/sha256/<manifest-digest>  # symlink

The refs/ directory used symlinks from human-readable image names to the blobs directory containing the manifest. This served two purposes: it allowed a tag like nginx:latest to be quickly resolved to a digest without parsing index.json, and it allowed the store to be inspected directly by a human who understood the layout—ls -la refs/nginx/ showed immediately the digest of the locally stored latest tag.

The most important structural decision in Maturin was the verifyingReader:

type verifyingReader struct {
    r        io.Reader
    expected digest.Digest
    h        hash.Hash
    read     int64
}

func (vr *verifyingReader) Read(p []byte) (int, error) {
    n, err := vr.r.Read(p)
    vr.h.Write(p[:n])
    vr.read += int64(n)
    if err == io.EOF {
        actual := digest.NewDigest("sha256", vr.h)
        if actual != vr.expected {
            return n, fmt.Errorf("digest mismatch: expected %s, got %s", vr.expected, actual)
        }
    }
    return n, err
}

Every blob written to Maturin was written through a verifyingReader that computed the SHA-256 hash as data streamed in. When the read reached EOF, it compared the computed hash to the expected digest from the manifest. If they differed—if even one byte had been corrupted in transit, if the registry had sent the wrong blob, if something had been truncated—the write failed with a clear error. Nothing was committed to storage without passing the integrity check.
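The same check can be exercised with nothing but the standard library. The sketch below swaps the go-digest types for hex strings so it runs without third-party imports; checkedReader, verifyStream, and verifyString are hypothetical names for this illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"strings"
)

// checkedReader hashes everything it reads and compares the result at EOF.
type checkedReader struct {
	r    io.Reader
	h    hash.Hash
	want string // expected SHA-256, hex-encoded
}

func (c *checkedReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.h.Write(p[:n]) // hash.Hash.Write never returns an error
	if err == io.EOF {
		if got := hex.EncodeToString(c.h.Sum(nil)); got != c.want {
			return n, fmt.Errorf("digest mismatch: expected %s, got %s", c.want, got)
		}
	}
	return n, err
}

// hexSHA256 returns the hex-encoded SHA-256 of s.
func hexSHA256(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// verifyStream drains src through a checkedReader, as a blob write would.
func verifyStream(src io.Reader, wantHex string) error {
	_, err := io.Copy(io.Discard, &checkedReader{r: src, h: sha256.New(), want: wantHex})
	return err
}

// verifyString is a convenience wrapper for in-memory content.
func verifyString(content, wantHex string) error {
	return verifyStream(strings.NewReader(content), wantHex)
}

func main() {
	want := hexSHA256("layer bytes")
	fmt.Println("intact:   ", verifyString("layer bytes", want))
	fmt.Println("truncated:", verifyString("layer byt", want))
}
```

Note that the mismatch only surfaces at EOF, after the bytes have already been handed to the writer, which is why the destination has to be a temporary file rather than the canonical blob path.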

This was not paranoia. This was the minimum viable integrity guarantee for a system that stored executable code. A corrupted layer could silently produce a container that ran wrong code, crashed in confusing ways, or exposed security vulnerabilities. "Trust but verify" was not the container runtime motto. The motto was "verify everything, trust nothing until verified."

2.1 — The Atomic Write Protocol

Maturin's write discipline went beyond integrity verification. Every blob write followed an atomic protocol:

  1. Write the blob data to a temporary file in the blobs directory (.tmp.sha256.<digest>)
  2. Verify the digest of the completed temporary file
  3. Only on successful verification, rename the temporary file to the final name

The rename was atomic at the filesystem level: os.Rename maps to rename(2), which POSIX requires to be atomic when source and destination live on the same filesystem, and keeping the temporary file inside the blobs directory guarantees exactly that. Either the final named file exists with complete verified data, or it doesn't exist at all. There was no intermediate state where a partially-written blob existed at the canonical path. This meant that if Maestro crashed mid-write (power loss, OOM kill, signal) the worst outcome was a .tmp. file that would be cleaned up on next startup, not a corrupted blob that would be silently used as if it were valid. Surviving power loss specifically also depends on the blob data being flushed to disk before the rename.
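The three-step protocol can be sketched as a single function. This is an assumption-laden illustration rather than Maturin's implementation: it names the temporary file with os.CreateTemp's random suffix instead of the digest, uses a hex string for the digest, and writeBlobAtomic and atomicDemo are hypothetical helpers.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeBlobAtomic streams src into a temp file inside dir, verifies the
// SHA-256, and only then renames the file to its digest-derived name.
func writeBlobAtomic(dir, wantHex string, src io.Reader) error {
	tmp, err := os.CreateTemp(dir, ".tmp.sha256.*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // fails harmlessly once the rename has succeeded

	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(tmp, h), src); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush data before it becomes visible
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); got != wantHex {
		return fmt.Errorf("digest mismatch: expected %s, got %s", wantHex, got)
	}
	return os.Rename(tmpName, filepath.Join(dir, wantHex))
}

// atomicDemo writes one valid and one mismatched blob and reports which
// canonical paths exist afterwards.
func atomicDemo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "blobs")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	sum := sha256.Sum256([]byte("layer data"))
	goodHex := hex.EncodeToString(sum[:])
	badHex := strings.Repeat("0", 64)

	if err := writeBlobAtomic(dir, goodHex, strings.NewReader("layer data")); err != nil {
		return false, false, err
	}
	_ = writeBlobAtomic(dir, badHex, strings.NewReader("layer data")) // expected to fail

	_, goodErr := os.Stat(filepath.Join(dir, goodHex))
	_, badErr := os.Stat(filepath.Join(dir, badHex))
	return goodErr == nil, badErr == nil, nil
}

func main() {
	fmt.Println(atomicDemo())
}
```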

2.2 — The Khef File Lock

Maturin used Khef—the ancient word from Roland's world meaning "the sharing of water at a secret place"—as the file-locking mechanism for concurrent access. If two maestro image pull commands ran simultaneously for the same image, they should not interfere with each other: one should win the lock, complete the pull, and release the lock; the other should either wait or detect that the image was already local. The Khef implementation used Linux syscall.Flock for advisory locking—a lock file in the image store root, acquired at the start of a pull operation and released on exit.

Advisory locking on Linux has limitations: it is advisory rather than mandatory, so it does nothing to stop a non-cooperative process from writing to locked files, and the lock file itself lingers on disk after a crash. The kernel does release the flock when the holding process exits, so a crashed pull cannot leave the store locked forever. For Maestro's use case (coordinating multiple maestro invocations from a single user on a single machine) that was sufficient. The goal was to prevent accidental corruption from concurrent operations, not to provide full distributed transaction semantics.
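A lock of this kind takes only a few lines on Linux and other Unix systems. The sketch below is illustrative: the lock file name and the acquireLock, releaseLock, and lockDemo helpers are assumptions, and it uses a non-blocking lock so the second contender fails fast instead of waiting.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// acquireLock opens (creating if needed) the lock file and takes an
// exclusive, non-blocking advisory lock on it.
func acquireLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, fmt.Errorf("store is locked: %w", err)
	}
	return f, nil
}

// releaseLock drops the lock and closes the file. The lock file itself
// stays on disk; only the kernel-level lock goes away.
func releaseLock(f *os.File) error {
	defer f.Close()
	return syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
}

// lockDemo reports whether a second acquisition was refused while the
// first was held, and whether the lock could be retaken after release.
func lockDemo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "khef")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.lock")

	first, err := acquireLock(path)
	if err != nil {
		return false, false, err
	}
	// A second open() yields an independent open file description, so the
	// kernel treats it as a competing locker even within one process.
	_, secondErr := acquireLock(path)

	if err := releaseLock(first); err != nil {
		return false, false, err
	}
	third, thirdErr := acquireLock(path)
	if thirdErr == nil {
		releaseLock(third)
	}
	return secondErr != nil, thirdErr == nil, nil
}

func main() {
	fmt.Println(lockDemo())
}
```

Dropping LOCK_NB turns the refusal into a wait, which matches the "wait for the other pull to finish" behavior described above.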

2.3 — The ListImages Bug

The first significant implementation bug in Maturin was found not through a test failure but through behavior inspection. The maestro image list command was implemented as a function that walked the refs directory and collected image entries. The naive implementation used three nested loops:

// First attempt — broken
for _, nameEntry := range nameEntries {
    for _, tagEntry := range tagEntries {
        for _, blobEntry := range blobEntries {
            // This structure doesn't match the actual directory layout
        }
    }
}

The problem was that the loop structure didn't correctly model the directory layout. The refs/ directory was organized as refs/<name>/<tag>, where <tag> was a symlink to a blob. Iterating nameEntries and tagEntries as sibling slices rather than as nested parent-child directories meant that tags from one image name could be incorrectly paired with entries from another. With a single image locally stored, this didn't manifest. With two images, it produced duplicate entries, missing entries, or mismatched name-tag-digest combinations depending on the iteration order.

The fix was to drop the nested loop structure entirely and use filepath.WalkDir:

// Second attempt — correct
err := filepath.WalkDir(refsDir, func(path string, d fs.DirEntry, err error) error {
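    if err != nil {
        return err // propagate traversal errors instead of masking them
    }
    if d.IsDir() {
        return nil // directories carry no tag; only the symlinks under them do
    }
    // path is refs/<name>/<tag>
    rel, err := filepath.Rel(refsDir, path)
    if err != nil {
        return err
    }
    parts := strings.SplitN(rel, string(filepath.Separator), 2)
    if len(parts) != 2 {
        return nil // skip stray files directly under refs/
    }
    name, tag := parts[0], parts[1]
    target, err := os.Readlink(path)
    if err != nil {
        return fmt.Errorf("reading ref %s: %w", rel, err)
    }
    // The base name of the symlink target is the manifest digest.
    images = append(images, ImageInfo{Name: name, Tag: tag, Digest: filepath.Base(target)})
    return nil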
})

filepath.WalkDir doesn't give you a flat list to misinterpret; it gives you each path in the tree exactly once, in lexicographic order, with its directory structure preserved. The path refs/nginx/latest unambiguously decomposed into name nginx and tag latest. There was no possibility of mispairing. The fix was seven lines of code. The lesson was that directory traversal, despite being superficially simple, is always better expressed as a recursive walk than as manually indexed nested slices.

[Engineering Sidebar: Content-Addressable Storage]

Content-addressable storage (CAS) is a data management strategy where objects are addressed by the cryptographic hash of their contents rather than by an assigned name or location. The hash is simultaneously the identity, the locator, and the integrity check: if you know the SHA-256 hash of a piece of data, you can (a) ask for it by that hash, (b) find it in any store that holds it, and (c) verify that what you received is exactly what was requested.

OCI image layers are pure CAS objects: each layer's SHA-256 hash appears in the manifest that references it, is used as the filename in local storage, and is verified after download. This means layers can be shared across images without duplication—if nginx:1.24 and nginx:1.25 share a base Debian layer, that layer is stored exactly once on disk, referenced by both manifests.
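The deduplication property falls out of the addressing scheme itself, as a toy in-memory store shows. This is a sketch for illustration only: casStore and dedupDemo are hypothetical, and the layer contents are stand-in strings rather than real tarballs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// casStore is a toy content-addressable store keyed by "sha256:<hex>".
type casStore struct {
	blobs map[string][]byte
}

func newCASStore() *casStore {
	return &casStore{blobs: map[string][]byte{}}
}

// key derives the address of data from its contents.
func key(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// Put stores data under its digest; storing the same bytes twice is a no-op.
func (s *casStore) Put(data []byte) string {
	k := key(data)
	if _, ok := s.blobs[k]; !ok {
		s.blobs[k] = append([]byte(nil), data...)
	}
	return k
}

// Get returns the blob and re-verifies it against its address.
func (s *casStore) Get(k string) ([]byte, error) {
	data, ok := s.blobs[k]
	if !ok {
		return nil, fmt.Errorf("blob %s not found", k)
	}
	if key(data) != k {
		return nil, fmt.Errorf("blob %s is corrupted", k)
	}
	return data, nil
}

// dedupDemo stores two two-layer images that share a base layer and
// reports whether the base digests match and how many blobs exist.
func dedupDemo() (bool, int) {
	store := newCASStore()
	base := []byte("debian base layer")
	imageA := []string{store.Put(base), store.Put([]byte("nginx 1.24 files"))}
	imageB := []string{store.Put(base), store.Put([]byte("nginx 1.25 files"))}
	return imageA[0] == imageB[0], len(store.blobs)
}

func main() {
	shared, count := dedupDemo()
	fmt.Println("base layer shared:", shared, "blobs stored:", count)
}
```

Four layer references resolve to three stored blobs, because the shared base layer hashes to the same address both times.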

CAS also enables several security properties. An attacker who compromises a registry and replaces a blob with malicious content cannot bypass client-side digest verification: the client computes the hash of what it received and compares it to the hash listed in the manifest it pulled earlier. If they differ, the pull fails. The only way to serve malicious content through a CAS system without detection is to also compromise the manifest, which is a harder target (manifests are typically signed in high-security deployments).

git uses the same principle: every commit, tree, and blob in a git repository is addressed by the hash of its contents (SHA-1 by default, with SHA-256 available as an opt-in object format). If two commits have the same hash, they are the same commit. The integrity of a branch's entire history is anchored to the hash of its tip commit, which chains recursively through every parent commit and tree.


Chapter 3: The Drawing — Pull Orchestration

Drawing was the third player in the image retrieval sequence: a coordinator package that held the RegistryClient interface and drove the pull workflow from user command to locally stored image. It was the thinnest of the three packages by line count, but the most consequential by integration scope.

The Drawing package's primary function, PullImage, was itself relatively short:

func (d *Drawing) PullImage(ctx context.Context, ref string) (*maturin.ImageRecord, error) {
    // 1. Resolve the reference
    normalized, err := d.resolver.Normalize(ref)
    if err != nil {
        return nil, fmt.Errorf("invalid image reference %q: %w", ref, err)
    }

    // 2. Check if already local
    if record, err := d.store.LookupImage(ctx, normalized); err == nil {
        return record, nil
    }

    // 3. Pull from registry
    if err := d.registry.PullImage(ctx, normalized, d.store.BlobDir()); err != nil {
        return nil, fmt.Errorf("pulling %s: %w", normalized, err)
    }

    // 4. Index and return
    return d.store.IndexImage(ctx, normalized)
}

Clean. Linear. Each step delegated to its specialist. The elegance of this function was the direct result of the architectural discipline invested in the interface design and the package separation. If the RegistryClient interface had been wider, the PullImage function would have been longer. If Maturin's API had been messier, the indexing step would have been more complex. Clean calling code was evidence of clean underlying design.

3.1 — The fakeImage Problem

Testing the Drawing orchestrator required a fakeRegistryClient that could return controlled metadata and simulate various failure modes. This was straightforward. The subtler problem was the fakeImage used in tests of the content-addressable store: a synthetic image whose blobs had known, predictable content.

The initial fakeImage implementation used hardcoded digest strings that didn't correspond to the actual SHA-256 of the synthetic blob content. This worked as long as no code verified those digests. The moment maturin began verifying blobs through verifyingReader, every test that used fakeImage broke with digest mismatch errors.

The fix required rewriting the fakeImage construction to compute actual SHA-256 hashes of the synthetic content:

func makeFakeBlob(content string) (digest.Digest, []byte) {
    data := []byte(content)
    h := sha256.Sum256(data)
    d := digest.NewDigestFromBytes(digest.SHA256, h[:])
    return d, data
}

Every fake blob now had a real, computed digest. Every fake manifest referenced those real digests. When verifyingReader checked them, they matched. The tests passed, and they did so because the fake data was internally consistent rather than because the tests had disabled verification.

This was a subtle but important discipline: test infrastructure that mirrors production behavior rather than bypassing it. A fakeImage with wrong digests was not a test helper—it was a liability that would hide digest verification bugs by ensuring that verification was never actually exercised in the test environment.


Chapter 4: Gan Creates — Building the Execution Engine

With images pullable and storable, the second major phase of the Drawing weeks was the execution engine: taking a locally stored image and turning it into a running container. This meant implementing three more packages—gan (container lifecycle), eld (OCI runtime interface), and prim (filesystem preparation)—and wiring them together with specgen to produce the complete create-and-run pipeline.

Gan was named after the Dark Tower's creator deity—the force that caused the universe to exist and that continues to hold it together. In Maestro, gan was the orchestrator that created container records, invoked eld to interface with the OCI runtime, and coordinated with prim to prepare the filesystem. Gan was the moment of creation: the instant when an image (static, inert, a collection of filesystem layers) became a container (dynamic, alive, a process namespace with a rootfs).

Eld was named after the bloodline of Roland himself—the line of descent from Arthur Eld, the high king, that made Roland what he was. In Maestro, eld was the OCI runtime abstraction: an interface with three implementations (crun, runc, youki) that invoked the selected runtime binary with the correct arguments and returned structured results.

Prim was named after the Prim—the primordial ocean from which all creation arose in King's mythology. In Maestro, prim was the storage driver: a layered filesystem abstraction that could use kernel OverlayFS, fuse-overlayfs, or plain VFS copy to construct a writable rootfs from the read-only OCI image layers.

The wiring between these packages was the responsibility of gan's CreateContainer function, which:

  1. Received a container creation request with an image reference and configuration
  2. Called maturin to locate the image data and identify its layers
  3. Called prim to mount the layers into a merged rootfs at a working directory
  4. Called specgen to generate the OCI Runtime Spec JSON for the container
  5. Called eld to invoke the runtime's create command with the spec bundle path
  6. Recorded the container state in waystation
  7. Returned a container record with the assigned ID

Each step was fallible. Each failure needed to be communicated clearly upward and result in cleanup of whatever partial state had been created. If prim succeeded but specgen failed, the mounts needed to be unmounted before returning. If eld succeeded but waystation failed to persist the state, the container lifecycle was in an inconsistent state. The error handling in CreateContainer was the most complex error handling in the Phase 1 codebase—a cascading cleanup chain that unwound each successfully completed step if a later step failed.
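One common way to express such an unwinding chain in Go is a stack of undo functions that runs in reverse order when the function returns an error. The sketch below simulates the steps with log entries; the step names, cleanupStack, and the failAt parameter are illustrative assumptions, not gan's real code.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanupStack collects undo actions and runs them last-in, first-out.
type cleanupStack struct {
	fns []func()
}

func (c *cleanupStack) push(fn func()) { c.fns = append(c.fns, fn) }

func (c *cleanupStack) unwind() {
	for i := len(c.fns) - 1; i >= 0; i-- {
		c.fns[i]()
	}
}

// createContainer runs the simulated steps in order. If the step named
// failAt fails, every step completed so far is undone in reverse order.
func createContainer(failAt string, log *[]string) (err error) {
	var undo cleanupStack
	defer func() {
		if err != nil {
			undo.unwind()
		}
	}()

	steps := []string{"mount rootfs", "write spec", "runtime create", "persist state"}
	for _, step := range steps {
		if step == failAt {
			return fmt.Errorf("%s: simulated failure", step)
		}
		*log = append(*log, "do "+step)
		s := step // capture the current step for the closure
		undo.push(func() { *log = append(*log, "undo "+s) })
	}
	return nil
}

// trace returns the sequence of actions as one readable string.
func trace(failAt string) string {
	var log []string
	_ = createContainer(failAt, &log)
	return strings.Join(log, "; ")
}

func main() {
	fmt.Println(trace("runtime create"))
	fmt.Println(trace(""))
}
```

The named return value is what lets the deferred function see whether the call is ending in failure, so the success path pays nothing for the cleanup logic.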

[Engineering Sidebar: The OCI Runtime Specification]

The OCI Runtime Specification defines a JSON document, conventionally named config.json, that completely describes how a container should be executed. It specifies the rootfs path, the process to run (including working directory, environment variables, and command line), the Linux namespaces to create (user, mount, PID, network, IPC, UTS), the capabilities to grant or drop, the seccomp profile to apply, the rlimits to set, the mounts to bind into the container (proc, sysfs, devpts, tmpfs, cgroup), and any hooks to invoke at specific lifecycle points.

The specification is explicit and exhaustive: you cannot "just run a command in a container"—you must specify every detail of the execution environment. This is the source of the specification's value (containers are reproducible because every parameter is explicit) and of its learning curve (beginners expect to run a command and get a container; instead they must author a 200-line JSON document).

specgen was Maestro's OCI spec generator: given a container creation request, it produced the complete config.json with sensible defaults for rootless operation, the correct mount list, the right namespace configuration, and per-request overrides. The complexity of specgen was a direct reflection of the complexity of the OCI Runtime Spec: every field it set represented a deliberate decision about what a "sensible default" meant in the context of a rootless, daemonless container runtime.


Chapter 5: The Silent Hang

The first maestro run alpine:latest echo hello did not fail. It did something worse: it silently hung, consuming CPU, making no progress, producing no output, refusing to respond to Ctrl-C. The terminal cursor blinked in a void.

Roland had encountered the Silent Hang. It was not a crash—crashes were comprehensible, they produced error messages, they had traceable call stacks. The Silent Hang was a prisoner that had stopped trying to escape, a system that had crossed into a state from which it had no path forward and no conceptual framework to describe where it had gone.

The debugging process took longer than the implementation of any single feature to that point. strace attached to the hung process showed it blocked on a read() syscall waiting for data from a pipe that would never be written. But why? What was the crun process waiting for?

The breakthrough came from reading the OCI Runtime Spec more carefully—specifically the section on the container initialization sequence. A container, during its create phase, goes through several initialization steps before it is in a state where start can be called to begin execution:

  1. The runtime forks a container init process
  2. The init process sets up namespaces, mounts, capabilities, seccomp, and cgroups
  3. The init process signals to the runtime via a synchronization pipe that initialization is complete
  4. The runtime calls the registered createRuntime and createContainer hooks
  5. The init process waits for a start signal from the start command
  6. On receipt, the init process execs the actual container command

Step 2 was the source of the hang. "Sets up... mounts" requires a mount list. The OCI spec config.json that specgen was generating did not include the essential mounts: /proc, /sys, tmpfs at /dev, devpts at /dev/pts, mqueue at /dev/mqueue, shm at /dev/shm, and the cgroup hierarchy. Without /proc, the container init process could not read its own status, could not determine its own PID, could not set up the PID namespace correctly. It waited for information it could never obtain, in a hole it had no exit from.

A container without /proc is not a container. It is a process in a box with no instruments. It cannot measure itself, it cannot see other processes, it cannot resolve /proc/self/fd references. Everything that depends on the proc filesystem—which is nearly everything a userspace process does—silently fails or blocks.

The fix—adding the correct mount list to specgen—was perhaps twenty lines of code. But finding the problem required understanding, at a deep level, what a container actually needed to function: not just "a process in namespaces" but "a process in namespaces with a complete virtual representation of a Linux operating environment."
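A minimal sketch of what such a mount list might look like in Go follows. The Mount type and the defaultMounts helper are hypothetical names, not Maestro's actual code, and the option strings are modeled on the defaults that runc's spec generator emits; a rootless setup may need to adjust some entries (for example, how /sys is provided).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Mount mirrors the shape of one entry in the "mounts" array of config.json.
type Mount struct {
	Destination string   `json:"destination"`
	Type        string   `json:"type"`
	Source      string   `json:"source"`
	Options     []string `json:"options,omitempty"`
}

// defaultMounts is a hypothetical helper returning a baseline mount list.
// The entries are assumptions modeled on the defaults generated by `runc spec`.
func defaultMounts() []Mount {
	return []Mount{
		{Destination: "/proc", Type: "proc", Source: "proc"},
		{Destination: "/dev", Type: "tmpfs", Source: "tmpfs",
			Options: []string{"nosuid", "strictatime", "mode=755", "size=65536k"}},
		{Destination: "/dev/pts", Type: "devpts", Source: "devpts",
			Options: []string{"nosuid", "noexec", "newinstance", "ptmxmode=0666", "mode=0620"}},
		{Destination: "/dev/shm", Type: "tmpfs", Source: "shm",
			Options: []string{"nosuid", "noexec", "nodev", "mode=1777", "size=65536k"}},
		{Destination: "/dev/mqueue", Type: "mqueue", Source: "mqueue",
			Options: []string{"nosuid", "noexec", "nodev"}},
		{Destination: "/sys", Type: "sysfs", Source: "sysfs",
			Options: []string{"nosuid", "noexec", "nodev", "ro"}},
		{Destination: "/sys/fs/cgroup", Type: "cgroup", Source: "cgroup",
			Options: []string{"nosuid", "noexec", "nodev", "relatime", "ro"}},
	}
}

func main() {
	// Print the list as it would appear under "mounts" in config.json.
	out, err := json.MarshalIndent(defaultMounts(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the list in one function makes it easy to assert in a unit test that no essential mount has been dropped by a later refactor.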

This was the real lesson of the Silent Hang: the OCI Runtime Spec was not documentation for how to run containers. It was a complete specification of what a minimal Linux environment looked like. Every mount in the default list was not an option—it was a fundamental requirement that containerized software assumed would be present, had always been present in traditional systems, and had to be explicitly provided by the runtime for the container to function.


Chapter 6: The PATH of Exile

The Silent Hang was resolved. The container could initialize. But the first attempts to run user commands inside the container produced errors of the form:

Error: container_linux.go:380: starting container process caused: exec: "nginx": executable file not found in $PATH

The executable existed in the rootfs. It was at /usr/sbin/nginx. But it was not being found, because the container's PATH environment variable was empty. The container's PATH was empty because specgen was generating a spec with no default environment variables—and the OCI Runtime Spec does not synthesize a PATH from the rootfs.

In a traditional shell session, PATH comes from the system /etc/profile, from the user's .bashrc, from login processing, from the PAM stack. None of that exists in a container's initialization sequence. The OCI runtime invokes the specified process directly, without a shell, without profile processing, without PAM. The only environment variables the process receives are those explicitly specified in the runtime spec's process.env array.

The fix was to add a sensible default PATH to specgen:

defaultPath := []string{
    "/usr/local/sbin",
    "/usr/local/bin",
    "/usr/sbin",
    "/usr/bin",
    "/sbin",
    "/bin",
}

But the correct fix was to make this injectable for testing, not hardcoded. The production function was wrapped in a dependency-injected form that accepted an optional PathProvider function, defaulting to the conventional Linux search paths:

type EnvConfig struct {
    PathProvider func() []string
}

func defaultPathProvider() []string {
    return []string{"/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"}
}

This was a small structural decision with large testing consequences. If PATH was hardcoded, tests of spec generation had to either accept hardcoded values or set environment variables. If it was injectable, tests could provide their own path functions and verify the spec generation logic independently of the default values. The injection made the code both more testable and more honest about its dependencies.
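As an illustration of how the injected provider might feed the spec's process.env array, here is a small sketch. The buildEnv helper and its behavior of leaving a caller-supplied PATH untouched are assumptions made for the example, not a description of Maestro's actual function.

```go
package main

import (
	"fmt"
	"strings"
)

// EnvConfig carries the injectable PATH source described in the text.
type EnvConfig struct {
	PathProvider func() []string
}

func defaultPathProvider() []string {
	return []string{"/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"}
}

// buildEnv is a hypothetical helper that returns the entries for process.env.
// It prepends a PATH built from the provider unless the request already sets one.
func buildEnv(cfg EnvConfig, requested []string) []string {
	provider := cfg.PathProvider
	if provider == nil {
		provider = defaultPathProvider
	}
	for _, kv := range requested {
		if strings.HasPrefix(kv, "PATH=") {
			// The caller chose a PATH explicitly; return a copy unchanged.
			return append([]string(nil), requested...)
		}
	}
	env := []string{"PATH=" + strings.Join(provider(), ":")}
	return append(env, requested...)
}

func main() {
	fmt.Println(buildEnv(EnvConfig{}, []string{"TERM=xterm"}))
}
```

A test can then pass a one-element provider and assert on the exact PATH string, without caring what the production defaults happen to be.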


Chapter 7: Rootless Identity — The User Namespace

The project's commitment to rootless operation—running containers without root privilege, without capabilities, without any kernel permission beyond what a normal user process had—introduced the most complex set of requirements in Phase 1: user namespace configuration and UID/GID mapping.

A user namespace in Linux is a kernel construct that virtualizes the mapping between UIDs/GIDs inside a namespace and UIDs/GIDs outside it. Inside the namespace, a process can appear to run as UID 0 (root), with full capabilities, while outside the namespace it is still running as UID 1000 (a normal user), with no elevated privileges. The magic of user namespaces is that the "root" inside the namespace only has authority over resources within that namespace—it cannot affect resources outside, cannot modify kernel data structures, cannot bypass the host operating system's security model.

This is the mechanism that makes rootless containers possible. The container runtime creates a user namespace, maps the host user's UID to UID 0 inside the namespace, and inside that namespace has full authority to create other namespaces (mount, PID, network), to perform mounts, and to drop capabilities; managing cgroups v2 additionally depends on the host delegating a cgroup subtree to the user. All of this looks like root from inside the namespace; none of it requires root from outside.

But the mapping (the relationship between host UIDs and namespace UIDs) had to be configured correctly. Rootless runtimes rely on two helper programs, newuidmap and newgidmap, to set up extended UID/GID mappings that go beyond the single-user mapping that an unprivileged process can set up by directly writing to /proc/self/uid_map. These helpers read from /etc/subuid and /etc/subgid, which define the subordinate ID ranges that each user is allowed to map, and then configure mappings in the kernel on behalf of the user.

specgen needed to correctly compute and emit the uidMappings and gidMappings arrays in the OCI spec, reflecting the actual subordinate ranges available to the running user. This required reading the current user's UID, parsing the user's subuid and subgid entries, and constructing the mapping array. The logic was non-trivial, testable only with injectable override functions for os.Getuid() and the subuid file path, and critical for correctness: a wrong UID mapping would either fail at container start or, worse, create a container whose file ownership appeared correct but was actually running processes as unexpected host UIDs.

[Engineering Sidebar: Linux User Namespaces and ID Mapping]

Linux user namespaces (introduced in kernel 3.8) provide the foundational mechanism for rootless containers. Each user namespace has a mapping table, configured in /proc/<pid>/uid_map and /proc/<pid>/gid_map, that defines how UIDs/GIDs translate across the namespace boundary. A typical rootless mapping looks like:

0       1000    1           # host UID 1000 → namespace UID 0
1       100000  65536       # host UIDs 100000-165535 → namespace UIDs 1-65536

The first row maps the current user (UID 1000) to root (UID 0) inside the namespace. The second row maps a range of subordinate UIDs (from /etc/subuid) to the rest of the namespace UID space. Because namespace UID 1 corresponds to host UID 100000, files owned by UID 1001, 1002, etc. inside the container (typical for images that use non-root users) map to real host UIDs 101000, 101001, etc., on which the kernel can enforce access control.

Writing uid_map has strict rules: an unprivileged process can only write a single-entry, single-UID mapping for its own UID. Multi-entry mappings require either CAP_SETUID in the parent user namespace (elevated privilege) or the use of newuidmap, a privileged helper (typically installed setuid root) that reads authorized ranges from /etc/subuid. This is why setting up subordinate ID ranges in /etc/subuid and /etc/subgid is a prerequisite for rootless container operations: without them, only the single-UID mapping is available, so exactly one UID inside the container (usually 0) corresponds to a real host user and every other UID is unmapped, which limits the images that can be run effectively.


Chapter 8: The Todash Concurrency Bug

Todash was a package within beam—the networking subsystem. Its name came from the Todash darkness in King's mythos: the void between worlds, the absence of light, the space through which things could move from one world to another without being seen. In Maestro's architecture, todash handled the network namespace configuration: creating new network namespaces, entering existing ones, attaching pasta or CNI network interfaces to them.

The bug in todash was one of the most subtle bugs in Phase 1. It manifested only under concurrent usage, appeared randomly, and left no consistent error message: sometimes it crashed with a vague "network namespace setup failed," sometimes it proceeded but left the network namespace in an inconsistent state, sometimes it worked perfectly.

The root cause was the intersection of two systems that each had global state: Linux network namespaces and the Go scheduler.

Network namespace operations in Linux use setns(2) to change the network namespace of the current thread. Not the current process: the current thread. In a multi-threaded process (which any Go program with goroutines is), setns only affects the thread that called it. Other threads remain in the original namespace. This is by design, since it allows different threads to be in different namespaces simultaneously, but it means that after a setns call, all subsequent network namespace-sensitive operations must occur on the same thread that called setns.

The Go scheduler, by default, freely migrates goroutines between OS threads. A goroutine that calls setns on thread A may, before its next line of code executes, be migrated by the scheduler to thread B. Thread B is in the original namespace. All subsequent operations seem to work—no error is returned—but they operate in the wrong namespace.

The fix was runtime.LockOSThread():

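import (
    "os"
    "runtime"

    "golang.org/x/sys/unix"
)

// enterNetworkNamespace pins the calling goroutine to its OS thread and moves
// that thread into the network namespace at nsPath. The returned cleanup
// function must be called from the same goroutine.
func enterNetworkNamespace(nsPath string) (func() error, error) {
    runtime.LockOSThread() // This goroutine is now pinned to one OS thread

    // Save the calling thread's current namespace. /proc/thread-self is used
    // because /proc/self/ns/net describes the main thread, not the caller.
    origNs, err := os.Open("/proc/thread-self/ns/net")
    if err != nil {
        runtime.UnlockOSThread()
        return nil, err
    }

    // Open and enter the target namespace
    targetNs, err := os.Open(nsPath)
    if err != nil {
        origNs.Close()
        runtime.UnlockOSThread()
        return nil, err
    }
    defer targetNs.Close() // the fd is not needed once setns has returned
    if err := unix.Setns(int(targetNs.Fd()), unix.CLONE_NEWNET); err != nil {
        origNs.Close()
        runtime.UnlockOSThread()
        return nil, err
    }

    // Return cleanup function
    return func() error {
        defer origNs.Close()
        if err := unix.Setns(int(origNs.Fd()), unix.CLONE_NEWNET); err != nil {
            // Stay locked: the thread is in the wrong namespace and must not be reused.
            return err
        }
        runtime.UnlockOSThread()
        return nil
    }, nil
}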

runtime.LockOSThread() told the Go runtime: this goroutine stays attached to its current OS thread, and no other goroutine runs on that thread, until runtime.UnlockOSThread() is called. The scheduler would not migrate it. All operations that needed to occur in the entered namespace would occur on the same thread that entered it, provided they ran on that same goroutine. The cleanup function, called from that goroutine, would return to the original namespace on the same thread before unlocking.

This pattern—lock thread → do namespace-sensitive work → unlock thread—was not Maestro's invention. It was the standard Go idiom for any operation that needed to maintain thread-local state across multiple system calls. runtime.LockOSThread existed precisely for this use case. But it was easy to miss in initial implementations, and missing it produced exactly the kind of intermittent, unreproducible failures that made debugging painful.

The lesson: whenever Go code interacts with Linux APIs that have per-thread semantics (namespaces, signal masks, thread credentials), runtime.LockOSThread is not optional. It is a correctness requirement.


Chapter 9: The setgroups=deny Curse

The most philosophically fraught battle of the Drawing weeks was the setgroups=deny problem.

Linux user namespaces have a security restriction: before an unprivileged process can write a GID mapping (/proc/self/gid_map), it must write deny to /proc/self/setgroups. This restriction was added in Linux 3.19 to close a specific privilege escalation: without it, an unprivileged user could create a user namespace, gain capabilities inside it, and use setgroups() to drop supplemental group memberships that the host relied on to deny access (files whose permissions grant a group less access than "other", so that membership restricts rather than grants). The setgroups=deny requirement closed this hole by ensuring that a user namespace set up without privilege cannot be used to shed group memberships.

The problem: setgroups=deny means exactly what it says. Any call to setgroups() inside that user namespace—from that point forward—returns EPERM. Setting up supplemental groups is permanently forbidden.

And nginx, the canonical test image, runs its worker processes as UID 101. When nginx starts, its initialization code calls initgroups() to set up the supplemental groups for that user. initgroups internally calls setgroups(). With setgroups=deny in effect, this call returns EPERM. nginx logged a permissions error and refused to start. The container was alive (the init process was running, the mounts were correct, the network namespace was configured), but the application inside was dead on arrival because it could not perform one of the most routine startup operations a non-root server does.

The diagnosis required cross-referencing three separate sources:

  1. The nginx error log inside the container: setgroups() failed: Operation not permitted. Clear message, but what caused setgroups to fail?
  2. The Linux kernel documentation for /proc/self/setgroups: explained the deny mechanism and its interaction with gid_map writes. This was the "aha" moment.
  3. The crun source code: showed the exact sequence in which crun wrote setgroups=deny before calling newgidmap, confirming that by the time the container process started, setgroups was permanently denied.

The solution was in the security subsystem that specgen generated: the white package (named after the concept of purity—the opposite of Roland's darkness). The OCI spec allowed configuring which supplemental groups the container runtime set up before setgroups=deny was written—that is, setting up group memberships at the container creation level rather than inside the container init process, bypassing the restriction entirely.

Concretely: if the OCI spec listed gid 101 in the container process's supplemental groups, the runtime (crun) would call setgroups([101, ...]) during the container creation phase, before the user namespace was fully locked down by setgroups=deny. Inside the container, initgroups would then find the groups already set and would not need to call setgroups itself. The setgroups=deny restriction was satisfied architecturally: it was in place before any container code ran, but the groups it would have blocked were already established by the time it went into effect.

This was not a workaround. This was the correct design, described in the OCI Runtime Spec, implemented by crun. The challenge was knowing that it was necessary and implementing it correctly in specgen's group configuration logic. Knowing came from reading, and reading came from discipline, and discipline came from a project culture that demanded understanding rather than trying random fixes until something works. The setgroups=deny curse was defeated not by luck but by literacy—reading the specification, the kernel documentation, and the runtime source until the answer was not just found but understood.

The white package that emerged from this battle became the security configuration subsystem for all of Maestro: capabilities, seccomp profiles, setgroups denial timing, supplemental group pre-seeding. It was named after the color—or the absence of color—that represented clarity and completeness in Roland's world. A container with a correctly configured white profile had no security debt: no unnecessary capabilities, no missing restrictions, no group escalation vectors.


Chapter 10: The Stabilization Sprint

The weeks between the first successful container run and the coverage accounting were not quiet. The initial execution milestone—echo hello—had been achieved on a skeleton: the create flow worked, the start flow worked in the basic case, but kill, inspect, and list were stubs with error returns. Log redirection didn't work. Port forwarding hadn't been connected. The container ID generation was naive.

Phase 1 stabilization was the process of completing the operation matrix: every user-facing command needed to work, not just the happy path but the full lifecycle including container stop, state inspection, and artifact cleanup on removal. The stabilization work was tracked in openspec/changes/p1-stabilization/, a document that read less like a feature specification and more like a field triage report—every known deficiency listed, prioritized by impact, with defined acceptance criteria for each fix.

10.1 — Container Operations Matrix

The operations matrix required:

| Operation | Status at first echo | Target |
| --- | --- | --- |
| create | Working | Working |
| start | Working (basic) | Complete |
| stop / kill | Stub | Complete |
| inspect | Stub | Complete |
| list | Partial (buggy) | Complete |
| remove | Stub | Complete |
| run (create + start) | Working (basic) | Complete |

Each stub had to become real. kill required finding the container's PID from waystation, verifying it was still alive via /proc/<pid>/status, and sending the correct signal (SIGTERM by default, SIGKILL for force). The edge cases: what if the container had already exited? What if the PID had been recycled by the kernel? What if the container was in a namespace that didn't inherit the kill permission from the host?

Each of these questions had a correct answer. The correct answers came from reading the crun source code, testing against real container lifecycles, and applying defensive coding: always re-verify state before acting, always check for PID recycling via the container's cgroup, propagate meaningful errors upward rather than swallowing failures.

10.2 — Log Redirection

Container stdout and stderr needed to be captured and available via maestro logs. The OCI Runtime Spec does not define a logging mechanism of its own; the container process inherits whatever stdio file descriptors the runtime was started with, so capturing output is the caller's job. The implementation challenge was managing those file descriptors:

The naive approach—using io.Pipe() to redirect stdout and stderr from the container process—relied on Go's internal goroutine management to copy data between the pipe ends. Under the OCI create/start split, the container init process was running between create and start, potentially writing to the log before start was called. If the reading goroutine wasn't running yet, the pipe would fill, the write would block, and the init process would deadlock waiting to write what no one was reading.

The fix was to redirect logs to files rather than pipes. File writes don't block on a reader. The container init process could write to its log file at any point in its lifecycle without waiting for a goroutine on the other side. maestro logs would then read from the file, with optional follow mode implemented via inotify watching for new writes.

10.3 — The Global State Purge

The most structural work of the stabilization phase was the global state purge. Unit tests written during rapid implementation had accumulated a pattern: several packages used package-level global variables for configuration and state, modified by tests, with the assumption that each test started from a fresh set of globals. This assumption held when a test was run on its own but failed under go test ./...: each package's tests are compiled into a single test binary and run in one process, so globals modified by one test persisted into the next, and tests marked with t.Parallel() could modify them concurrently.

The full audit found 177 test functions that depended on global state in some way—either directly setting package-level variables, relying on package-level initialization functions having run in a specific order, or using shared file system paths that multiple tests could collide on. Each of these 177 tests was rewritten to use fully injected dependencies through the constructor parameters of the types under test.

The resulting test suite was longer in source lines but dramatically more reliable: tests could be run in any order, in parallel, multiple times in the same process, under race detection (-race), and would always produce the same results. The investment in the purge was not about aesthetics—it was about the correctness guarantee that came with a fully isolated test architecture.


Chapter 11: Coverage Battles — The Long Accounting

The setgroups=deny fix was the last functional blocker. After it, maestro run nginx:latest started nginx. The first real HTTP response from a container—returned by curl http://localhost on the host to nginx running inside the container over a port-forwarded interface—was a confirmation so small and so enormous simultaneously that Roland sat with it for a moment before moving to the next task.

The work was not done. It was never done in that simple sense. But a container was alive. An HTTP server was serving requests. The image had been pulled, verified, stored, mounted, described by a spec, handed to crun, and initialized correctly. Every link in the chain had been forged.

What remained was the coverage accounting. The discipline of one hundred percent had been declared in week one; the execution engine and all its support packages needed to submit to the same reckoning that Shardik and Maturin had undergone:

testing internal/gan...     coverage: 82.3% → target: 100%
testing internal/prim...    coverage: 67.4% → target: 100%
testing internal/specgen... coverage: 71.1% → target: 100%
testing internal/eld...     coverage: 88.9% → target: 100%
testing internal/waystation... coverage: 100.0% ✓
testing internal/tower...   coverage: 100.0% ✓
testing internal/shardik... coverage: 100.0% ✓
testing internal/maturin... coverage: 100.0% ✓

The gaps in gan, prim, specgen, and eld were the archaeology of implementation: error paths that only triggered on specific filesystem states (prim on a kernel that didn't support overlayfs), timeout paths that required real timing delays (too slow for unit tests without injection), defensive nil checks on values that the package construction guaranteed would never be nil, and dead code—functions written in anticipation of future needs that were never called.

Each of these categories required a different treatment. Untested error paths needed either tests that injected the correct failure condition (mock filesystem, mock runtime binary, injected error function) or a //coverage:ignore annotation with a documented reason. Dead code was removed. The defensive nil checks were examined: some were removed after confirming the invariant, others were retained with annotations explaining why the invariant held.

prim represented the most interesting coverage challenge. The storage driver used a decision tree to select between kernel overlayfs, fuse-overlayfs, and VFS copy. Testing all three paths required either different kernel versions, different tools installed, or capability injection—the ability to make the driver believe fuse-overlayfs wasn't available and fall back to VFS. The injection architecture—replacing direct exec.LookPath calls with injectable lookupFn dependencies—was added specifically to make the fallback paths testable without the external tooling. In production, lookupFn was exec.LookPath. In tests, it was a function that returned "not found" for fuse-overlayfs, driving the fallback path.

The final coverage numbers, after weeks of this systematic archaeology, were all at 100%—with the genuinely unreachable paths documented rather than silently skipped. The documentation was the important part: not just //coverage:ignore but //coverage:ignore — only reachable when kernel overlayfs is partially broken in a way that no test environment exhibits.


Chapter 12: The Rootless Dependency Audit

Before the Birth of Gan could be declared complete—before the milestone could be checked and the work of Phase 1 considered ready for the final networking stage—a full rootless dependency audit was conducted. The question was not "does the container run?" but "does it run correctly in every rootless configuration a user might have?"

The audit identified three rootless-specific requirement categories:

Filesystem: The kernel overlayfs driver requires CAP_SYS_ADMIN to mount in the traditional model. Rootless containers do not have CAP_SYS_ADMIN outside their user namespace. The fix—using overlayfs from inside the user namespace, where the process has effective CAP_SYS_ADMIN scoped to the namespace—worked on kernels 5.11+, which added support for rootless overlayfs mounts. On older kernels, fuse-overlayfs was required. The prim fallback logic handled this, but the fallback required fuse-overlayfs to be installed and locatable in PATH. If neither was available, the VFS copy fallback was used—fully portable, but slow for large images. The audit documented each fallback's performance implications and user-facing behavior.

Namespace: User namespaces can be nested, but only to a limited depth. If maestro was run inside a container (container-in-container), the inner container might already be at the maximum nesting depth the kernel allows (32 levels of nested user namespaces), in which case creating another one fails. The audit documented this limitation and surfaced it in maestro system check output.

Subuid/subgid: The subordinate ID ranges in /etc/subuid and /etc/subgid must contain entries for the current user. On minimal system images (scratch containers, distroless images), these files might not exist. On multi-user systems, the current user might not have been assigned subordinate ranges. maestro system check was extended to verify the presence of these entries and provide actionable remediation messages when they were missing.

Each of these audit findings became either a feature of maestro system check (surfacing the problem to the user) or a fallback in the implementation (handling the condition gracefully rather than crashing). The system check command grew into a comprehensive preflight validator that could tell a user exactly why Maestro wouldn't work in their environment and how to fix it—a capability that the research phase had identified as essential for the user experience of a rootless-first tool.


Chapter 13: The Birth of Gan

The moment when maestro run alpine:latest echo hello returned the word hello on the terminal—without hanging, without an error, without a crash; cleanly, correctly, immediately—was not marked by ceremony. Roland noted it in the journey log with the brevity the moment warranted:

"Milestone 1.3 Gan Creates: First successful container execution. maestro run alpine:latest echo hello → hello. Container lifecycle complete: create → start → wait → collect exit code → record in waystation → report to user."

The brevity was appropriate. The implementation had been weeks of combat—the Silent Hang, the PATH of Exile, the UID mapping archaeology, the Todash concurrency bug, the setgroups curse—and one clean line of output was the victory flag that had been seen from across the battlefield during all of it. The line didn't need ceremony; it was the ceremony.

But the Birth of Gan was also incomplete, and Roland knew it. A container that could execute echo hello was not the same as a container that could run nginx and serve web traffic. Between those two lay the full networking stack, and none of it existed yet: no port forwarding, no IP address assignment, no DNS resolution inside the container, no way for the outside world to reach the inside. The container was alive (breathing, thinking, executing), but it was alone. It could not speak to anything beyond its own namespace boundary.

The container was alive but silent. A being that could not communicate was not yet complete. The network namespace the container lived in was a void—not the productive void of Todash, but the barren void of isolation. Inside: processes running, filesystem present, kernel syscalls succeeding. Outside: no route, no interface, no bridge, no masquerade.

The gunslinger had drawn his companion through the door, but the companion still stood on the beach. The world was large. The Tower was far. And the Beam—the one that connected the container to the host network, to the internet, to the services and clients and protocols that made a container runtime useful—had not yet been forged.

That was the work of Part III.


Chapter 14: The Milestone Report

The entry for milestone 1.3 Gan Creates was brief, as all those records were—it was a log, not an essay. But reading it weeks later, with Part III of the phase already completed and the full scope of Phase 1 visible in retrospect, that brevity felt like a form of compression, the exact summary of a moment that now revealed itself to be much larger:

### Milestone 1.3: Gan Creates
- First successful container execution via crun
- `maestro run alpine:latest echo hello` → `hello`
- Full CRUD lifecycle for containers in waystation
- Specgen generates compliant OCI Runtime Spec config.json
- Prim handles overlayfs, fuse-overlayfs, VFS fallback
- Eld abstracts crun/runc/youki interface
- Coverage: all new packages at 100%

Each bullet represented weeks. The "100% coverage" line was six weeks of mock server construction, injectable function refactoring, global state purging, and edge case archaeology. The "full CRUD lifecycle" line was two weeks of null-case handling and concurrency testing. The "Prim handles overlayfs" line was the kernel-version compatibility matrix, the fallback detection logic, and the fuse-overlayfs detection code. The "crun/runc/youki interface" was the three-runtime test matrix and the behavioral difference documentation.

This was the compression of a milestone: the journey entry held the conclusions; the narrative held the war. Future contributors reading the CHANGELOG would know what was achieved. Only the narrative would tell them how hard it was and why it mattered.

The intermediate weeks of Phase 1—the Drawing phase—were where Maestro became real. The scaffold phase had created the structure. The drawing phase had filled it with substance. The networking phase that followed would complete it. But completion was only possible because the drawing phase did not cut corners, did not declare victory at 75.5% coverage, did not leave the setgroups=deny problem to "someone else's issue tracker," did not accept a container that ran echo hello but couldn't run nginx. Every compromise the drawing phase refused to make was one fewer unexpected failure in Phase 2.

The shore was left clean. The doors were closed behind the companions now drawn through. The beach was ready for the next wave.

And the Beam—somewhere ahead, waiting to be connected—was the thing that would complete the tower's foundation. Roland did not rest on the beach. You did not rest in Mid-World, not when there were still ka-tet members to draw and beams to walk and towers to reach. The beach was a waystation, not a destination. The destination was elsewhere.

He checked his guns. Full cylinders. Clean barrel. The gunslinger discipline that had carried him through the desert and across the shore was intact. Every package at 100% coverage. Every test isolated and reproducible. Every dependency injected rather than hardcoded. Every error handled or documented as unreachable. The codebase behind him was clean, and clean codebases led to clean futures.

The network namespace waited ahead, dark and connectionless. The pasta was uncooked. The CNI plugins were undownloaded. But the container was alive and the engineer who had built it was ready.

The drawing was done. The running had begun. The Tower was still distant, but its outline on the horizon was sharper now—and the gunslinger's step was certain.


The container breathes. rootfs mounted, proc pseudo-mounted, devpts attached. crun watches over the init process with the impassive attention of a warden who has seen a thousand inmates and knows that any one of them might, given the right conditions, attempt to break free of the namespace boundary.

But the network namespace is a dark room. No interface. No route. No ports. The pasta is still dry, uncooked, waiting in the package.

And somewhere across the host kernel's network stack, the outside world waits for a connection that has not yet been possible. curl http://localhost returns connection refused. The container server responds to no one.

Shardik has been crossed. Maturin holds the image. Gan has lit the fire of process life. But without the Beam—without beam, mejis, doorway, guardian—the container is deaf and blind and mute in its own namespace, a mind without a voice.

The work continues.


The Forge of Purpose

The desert remains, but the Gunslinger no longer walks alone in the silence. The foundation has hardened, and the scaffold now echoes with the rhythmic strike of the hammer. Part II is no longer about intent; it is about the mechanism. The gears of the minimalist runtime are beginning to turn, grinding against the friction of the world, finding their resonance in the cold logic of Go 1.26.2.

The Tower is still distant, but its shadow has grown longer, reaching out to touch the code we have written. The path is narrowing, the details are becoming sharper, and the weight of the steel is real. We have moved beyond the first stone; we are now building the engine that will drive the quest forward.

If you seek the pulse of the machine, the updated glyphs of the forge await your inspection at the same eternal coordinates:

👉 https://github.com/garnizeh/maestro

Delve into the repository once more. Witness how the bones of the architecture have gained muscle and nerve. The second cycle of Ka is reaching its zenith, and the wheel turns with a heavier, more certain sound.

Until the next revolution.


End of Part II.


Maestro: Building an OCI Container Manager in Go

Part 2 of 5

Notes from a dev stepping out of his comfort zone. This series chronicles the raw, unpretentious journey of building "Maestro," a custom OCI v1.1-compatible container manager written entirely in Go. From deciphering low-level Linux namespaces and cgroups to navigating the Open Container Initiative specifications, this is a deep dive into system architecture and advanced Go interactions. We are not just blindly writing code here; we are aiming with the mind.
