tiny-docker-go: What I Learned from Building a Tiny Docker-like Runtime in Go

I use Docker almost every day.

I use it for local development, backend services, databases, staging environments, CI/CD pipelines, and sometimes even for debugging production-like issues. Like many developers, I became comfortable with commands like:

docker run
docker ps
docker logs
docker stop
docker compose up

But for a long time, Docker still felt like a black box to me.

I knew how to use it.

I knew how to write Dockerfiles.

I knew how to debug containers when something failed.

But I did not deeply understand what actually happens under the hood when we run a container.

So I decided to build a small Docker-like container runtime in Go.

The project is called tiny-docker-go.

GitHub repository:

https://github.com/amirsefati/tiny-docker-go

The goal was not to rebuild Docker.

Docker is a mature platform with a huge ecosystem: image builds, registries, storage drivers, networking drivers, logging drivers, security features, orchestration integrations, plugins, and many other production-grade details.

My goal was much smaller:

Build a tiny runtime step by step, so I can understand the Linux ideas behind containers.

Docker’s own documentation describes containers as isolated processes that run on a host and have their own filesystem, networking, and process tree. That sentence looks simple, but it hides a lot of Linux internals.

To understand that sentence, I needed to touch the real building blocks:

  • Linux namespaces
  • cgroups
  • root filesystems
  • chroot
  • /proc
  • process lifecycle
  • signals
  • logs
  • network namespaces
  • bridge networking
  • veth pairs
  • NAT
  • container metadata

This article is a summary of the full 10-day journey.

It is not a tutorial for building a production runtime.

It is a developer story about learning containers by building a tiny version of one.

Why I started with Go

I chose Go because it fits this kind of project very well.

Go makes it simple to build CLI tools, execute processes, work with files, handle signals, and call lower-level Linux syscalls when needed.

Also, many important container projects are written in Go. Docker itself, containerd, runc, Kubernetes, and many cloud-native tools use Go heavily.

So using Go felt natural.

For this project, I wanted the code to stay simple and readable. I did not want to hide everything behind too many abstractions too early.

At the same time, I wanted the structure to be extensible enough so I could add one feature every day without rewriting the whole project.

That balance became one of the main lessons of the project.

When you build systems software, the hard part is not only writing code that works today.

The hard part is writing code that can survive the next feature.

The 10-day plan

I split the project into 10 small parts:

  1. Project structure and CLI foundation
  2. Linux namespaces
  3. Root filesystem isolation
  4. Container IDs and metadata
  5. Logs
  6. Stop and lifecycle management
  7. cgroups and memory limits
  8. Network namespace
  9. Bridge and veth networking
  10. Polish, README, roadmap, and lessons learned

This helped me avoid one common mistake:

Trying to build “Docker” in one step.

That is too much.

Instead, I treated each day as one small question.

Day 1:

Can I execute a command through my own CLI?

Day 2:

Can I run that command inside new Linux namespaces?

Day 3:

Can I give that process a different root filesystem?

Day 4:

Can I remember what I started?

Day 5:

Can I capture logs?

Day 6:

Can I stop a running container?

Day 7:

Can I limit memory?

Day 8:

Can I isolate networking?

Day 9:

Can I connect the container back to the outside world?

Day 10:

Can I explain the architecture clearly?

That made the project much easier to continue.

Day 1: Project Setup and CLI Foundation

On Day 1, I did not start with namespaces.

That may sound strange because namespaces are one of the most exciting parts of containers.

But I wanted to start with the boring foundation first.

The initial project structure looked like this:

tiny-docker-go/
├── cmd/
│   └── tiny-docker-go/
│       └── main.go
├── internal/
│   ├── app/
│   ├── cli/
│   └── runtime/
├── go.mod
└── README.md

The idea was simple:

  • cmd/ contains the executable entrypoint.
  • internal/cli handles user-facing commands.
  • internal/runtime handles process execution.
  • internal/app wires things together.

I added basic commands:

tiny-docker-go run
tiny-docker-go ps
tiny-docker-go stop
tiny-docker-go logs

At this stage, only run actually did something.

It executed a normal Linux command on the host.

Example:

go run ./cmd/tiny-docker-go run echo hello

Output:

hello
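
Under the hood, that first run command was just a thin wrapper around Go's os/exec package. A minimal sketch of what it could look like (the runHost name is mine, not necessarily what the repo uses):

import (
    "os"
    "os/exec"
)

// runHost executes the user command directly on the host.
// No namespaces, no cgroups, no rootfs yet.
func runHost(args []string) error {
    cmd := exec.Command(args[0], args[1:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    return cmd.Run()
}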

This was not a container yet.

There was no isolation.

No cgroups.

No rootfs.

No networking.

But this step mattered because it gave me a stable CLI shape.

I wanted the outside interface to look like a tiny version of Docker:

tiny-docker-go run /bin/sh
tiny-docker-go ps
tiny-docker-go logs <id>
tiny-docker-go stop <id>

Even before the internals were ready, the product shape was clear.

That helped a lot later.

Small lesson from Day 1

A container runtime is still a command runner at the beginning.

Before thinking about advanced kernel features, I needed a clean way to receive a command, validate it, execute it, and return output to the terminal.

A lot of systems projects start like this.

First, build a simple interface.

Then make the implementation smarter behind that interface.

Day 2: Adding Linux Namespaces

Day 2 was where the project started to feel like a real container runtime.

Linux namespaces are one of the core ideas behind containers.

A namespace gives a process a different view of some system resource.

For example:

  • PID namespace gives a different process tree.
  • UTS namespace gives a different hostname.
  • Mount namespace gives a different mount table.
  • Network namespace gives a different network stack.
  • User namespace gives a different view of user and group IDs.
  • IPC namespace isolates IPC resources.
  • Cgroup namespace isolates cgroup views.

The important thing is this:

A container is not a virtual machine. It is still a Linux process, but it sees a more isolated view of the system.

That sentence changed how I think about Docker.

When I run:

docker run alpine sh

Docker does not boot a new kernel like a VM.

It starts a process on the host kernel, but configures isolation around it.

In Go, I started experimenting with syscall.SysProcAttr and clone flags.

A simplified version looks like this:

cmd.SysProcAttr = &syscall.SysProcAttr{
    Cloneflags: syscall.CLONE_NEWUTS |
        syscall.CLONE_NEWPID |
        syscall.CLONE_NEWNS,
}

This creates the child process in new namespaces.

The first namespaces I added were:

UTS namespace
PID namespace
Mount namespace

UTS namespace

UTS namespace lets the container have its own hostname.

Inside the child process, I could call:

syscall.Sethostname([]byte("tiny-container"))

Then inside the container:

hostname

would show:

tiny-container

That was a small moment, but it felt important.

The process was still running on my machine, but it had its own hostname.

That was the first visible sign of isolation.

PID namespace

PID namespace was more interesting.

With a new PID namespace, the process inside the container can see itself as PID 1.

That is a big deal.

On Linux, PID 1 is special.

It is the init process of that namespace. It has responsibilities around signal handling and reaping zombie processes.

This is why container entrypoints matter.

If the main process inside a container does not handle signals correctly, stopping the container can behave badly.

This also helped me understand why tools like tini exist in container environments.

Mount namespace

Mount namespace gave the container its own mount table.

That means the process can have different mounts from the host.

At this point, I was not yet fully changing the filesystem, but I prepared the project for mounting /proc later.

One small Linux detail I learned here:

When working with mount namespaces, mount propagation can surprise you.

If mounts are shared with the host, changes inside one namespace may propagate in ways you do not expect. Real runtimes are careful about making mounts private before doing container setup.

This is one of those details that you do not think about when using Docker normally.

But when building a runtime, it becomes visible very quickly.

Parent and child process model

One design pattern I used was the parent/child model with:

/proc/self/exe

The parent process receives the CLI command.

Then it starts a child process by re-executing the same binary:

exec.Command("https://dev.to/proc/self/exe", "child", ...)

The parent is responsible for setup and management.

The child enters the isolated environment and runs the target command.

This pattern made the code easier to reason about.

There is a clear split:

parent process
├── parse CLI
├── prepare config
├── start child with namespaces
└── track metadata

child process
├── set hostname
├── prepare filesystem
├── mount proc
└── exec user command
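
A minimal Go sketch of this split might look like the following (Linux-only, error handling trimmed, and the function names are illustrative rather than the exact code in the repo):

import (
    "os"
    "os/exec"
    "syscall"
)

// runParent re-executes this binary as "child" inside new namespaces.
func runParent(args []string) error {
    cmd := exec.Command("/proc/self/exe", append([]string{"child"}, args...)...)
    cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
    }
    return cmd.Run()
}

// runChild runs inside the new namespaces as PID 1, sets up the
// container view, then executes the user command.
func runChild(args []string) error {
    if err := syscall.Sethostname([]byte("tiny-container")); err != nil {
        return err
    }
    cmd := exec.Command(args[0], args[1:]...)
    cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
    return cmd.Run()
}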

This was the first time tiny-docker-go started to feel like a real runtime.

Day 3: RootFS and chroot

On Day 3, I added filesystem isolation.

Namespaces isolate views of system resources, but a container also needs a filesystem.

When I run an Alpine container, I expect to see Alpine files:

/bin/sh
/etc/os-release
/lib
/usr

I should not see the host root filesystem.

For the first version, I used chroot.

The idea is simple:

syscall.Chroot(rootfs)
os.Chdir("/")

After that, / inside the process points to the rootfs directory.

Example:

sudo tiny-docker-go run --rootfs ./rootfs/alpine /bin/sh

Inside the container:

cat /etc/os-release

shows Alpine information if the rootfs is Alpine.

This was another important moment.

Now the process had:

  • its own hostname
  • its own PID namespace
  • its own mount namespace
  • its own root filesystem

It still was not Docker, but it started to look like the core of a container.

chroot is not full container security

One important note:

chroot is useful for learning, but it is not complete container isolation by itself.

Historically, chroot was not designed as a full security boundary.

A real runtime usually uses more careful filesystem setup, often with pivot_root, mount namespaces, read-only mounts, bind mounts, capabilities, seccomp, AppArmor or SELinux, and other hardening layers.

For this project, chroot was enough because my goal was educational.

I wanted to understand the basic idea:

Give the process a different /.

That one idea explains a lot.

A container process does not magically have a filesystem.

The runtime prepares one.

Mounting /proc

After entering the rootfs, I mounted /proc:

syscall.Mount("proc", "/proc", "proc", 0, "")

Without /proc, commands like ps may not work correctly inside the container.
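
Put together, the child-side filesystem setup can be as small as the sketch below (a real runtime would prefer pivot_root and handle mount propagation much more carefully; setupRootfs is an illustrative name):

import (
    "os"
    "syscall"
)

// setupRootfs is called in the child, before executing the user command.
// It switches / to the container rootfs and mounts a fresh proc there.
func setupRootfs(rootfs string) error {
    if err := syscall.Chroot(rootfs); err != nil {
        return err
    }
    if err := os.Chdir("/"); err != nil {
        return err
    }
    // Mount proc relative to the new root so process tools see the
    // container's own PID namespace.
    return syscall.Mount("proc", "/proc", "proc", 0, "")
}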

This helped me understand another detail:

Many Linux tools do not get information from some secret API.

They read from virtual filesystems like /proc.

For example, ps depends on /proc to inspect processes.

So if the container has a PID namespace but /proc is not mounted correctly, the view inside the container can be confusing.

This is one of those small details that makes containers feel less magical.

Day 4: Container ID and Metadata

After Day 3, I could start isolated processes.

But I had a new problem:

How do I remember them?

Docker can do:

docker ps
docker inspect <id>
docker logs <id>
docker stop <id>

That means Docker stores metadata about containers.

So on Day 4, I added a simple metadata store.

I used a local directory like:

/var/lib/tiny-docker/containers/<id>/

Each container gets a config.json.

Example fields:

{
  "id": "abc123",
  "command": ["https://dev.to/bin/sh"],
  "hostname": "tiny-container",
  "rootfs": "./rootfs/alpine",
  "status": "running",
  "created_at": "2026-05-12T10:00:00Z",
  "pid": 12345
}
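
In Go, that record can be a small struct serialized with encoding/json. A sketch that mirrors the fields above (the struct and function names are mine, not necessarily the repo's):

import (
    "encoding/json"
    "os"
    "path/filepath"
    "time"
)

// ContainerMeta mirrors the config.json fields shown above.
type ContainerMeta struct {
    ID        string    `json:"id"`
    Command   []string  `json:"command"`
    Hostname  string    `json:"hostname"`
    Rootfs    string    `json:"rootfs"`
    Status    string    `json:"status"`
    CreatedAt time.Time `json:"created_at"`
    PID       int       `json:"pid"`
}

// saveMeta writes config.json into the container's state directory.
func saveMeta(baseDir string, m ContainerMeta) error {
    dir := filepath.Join(baseDir, m.ID)
    if err := os.MkdirAll(dir, 0o755); err != nil {
        return err
    }
    data, err := json.MarshalIndent(m, "", "  ")
    if err != nil {
        return err
    }
    return os.WriteFile(filepath.Join(dir, "config.json"), data, 0o644)
}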

This was simple, but it changed the architecture.

Before this, run was just executing a process.

After this, run was creating a managed container record.

That is a big conceptual difference.

A runtime needs memory.

Not RAM memory, but operational memory.

It needs to remember:

  • What did I start?
  • What PID belongs to this container?
  • Where are its logs?
  • Is it running or stopped?
  • What command did it start with?
  • What rootfs did it use?

Then ps became meaningful.

Instead of being a placeholder, it could read metadata files and show containers.

A very simple output could look like:

CONTAINER ID   PID     STATUS    COMMAND
abc123         12345   running   /bin/sh
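
Reading those records back is what makes ps work. A sketch that reuses the ContainerMeta struct from the previous example (baseDir would be /var/lib/tiny-docker/containers):

import (
    "encoding/json"
    "fmt"
    "os"
    "path/filepath"
    "strings"
)

// listContainers reads every config.json under baseDir and prints a table.
func listContainers(baseDir string) error {
    entries, err := os.ReadDir(baseDir)
    if err != nil {
        return err
    }
    fmt.Printf("%-14s %-7s %-9s %s\n", "CONTAINER ID", "PID", "STATUS", "COMMAND")
    for _, e := range entries {
        data, err := os.ReadFile(filepath.Join(baseDir, e.Name(), "config.json"))
        if err != nil {
            continue // skip incomplete records
        }
        var m ContainerMeta
        if err := json.Unmarshal(data, &m); err != nil {
            continue
        }
        fmt.Printf("%-14s %-7d %-9s %s\n", m.ID, m.PID, m.Status, strings.Join(m.Command, " "))
    }
    return nil
}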

Small lesson from Day 4

A container runtime is partly a process manager and partly a state manager.

Starting the process is only half of the job.

Remembering and managing it is the other half.

This helped me understand why Docker has a daemon.

If containers can continue running after the CLI exits, something needs to track them.

My tiny runtime did this in a simple way with JSON files.

Docker does it in a much more complete way.

But the idea is similar.

Day 5: Logs

On Day 5, I added logging.

This sounded easy at first.

Just redirect stdout and stderr to a file, right?

Something like:

logFile, _ := os.Create("container.log")
cmd.Stdout = logFile
cmd.Stderr = logFile

For detached containers, that works.

Then:

tiny-docker-go logs <id>

can read:

/var/lib/tiny-docker/containers/<id>/container.log

and print it.

But logs became more interesting when I thought about interactive mode.

If I run:

tiny-docker-go run /bin/sh

I want stdin, stdout, and stderr attached to my terminal.

But if I run a detached process, I want logs written to a file.

So the runtime needs to understand different modes:

interactive mode
├── stdin  -> terminal
├── stdout -> terminal
└── stderr -> terminal

detached mode
├── stdin  -> maybe closed
├── stdout -> log file
└── stderr -> log file

Docker has this same concept in a more advanced way.

docker logs reads logs from the container’s configured logging driver, and docker logs --follow streams new output.

For my tiny version, I kept it simple:

tiny-docker-go logs <id>
tiny-docker-go logs -f <id>

The -f mode can be implemented like a basic tail -f.
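
A very simple follow mode can poll the log file for new bytes. A sketch (real runtimes do better than polling, but this is enough for learning):

import (
    "io"
    "os"
    "time"
)

// followLog prints existing log content, then keeps printing new bytes
// as the container writes them. Like tail -f, it runs until interrupted.
func followLog(path string) error {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()
    for {
        n, err := io.Copy(os.Stdout, f) // copy whatever is available, stop at EOF
        if err != nil {
            return err
        }
        if n == 0 {
            time.Sleep(500 * time.Millisecond) // wait for more output
        }
    }
}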

Small Linux detail: stdout and stderr matter

A container does not need to know about “logging” as a high-level concept.

Most container logging starts from something simple:

The process writes to stdout and stderr.

The runtime captures those streams.

That is why good containerized apps usually log to stdout/stderr instead of writing only to local files.

This is a small detail, but it matters a lot in production.

If your app logs only to a file inside the container, then your logging pipeline may not see it unless you mount volumes or configure extra collection.

Day 6: Stop and Lifecycle Management

On Day 6, I implemented stop.

The first version was simple:

tiny-docker-go stop <id>

The runtime reads metadata, gets the PID, and sends a signal.

The normal graceful flow is:

send SIGTERM
wait
if still running, send SIGKILL
update metadata
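
Expressed in Go, the flow looks roughly like this sketch (the timeout value is my choice, and updating the metadata afterwards is omitted):

import (
    "os"
    "syscall"
    "time"
)

// stopContainer sends SIGTERM, waits up to timeout, then falls back to SIGKILL.
func stopContainer(pid int, timeout time.Duration) error {
    proc, err := os.FindProcess(pid)
    if err != nil {
        return err
    }
    if err := proc.Signal(syscall.SIGTERM); err != nil {
        return err
    }
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        // Signal 0 only checks whether the process still exists.
        if err := proc.Signal(syscall.Signal(0)); err != nil {
            return nil // process has exited
        }
        time.Sleep(200 * time.Millisecond)
    }
    return proc.Signal(syscall.SIGKILL)
}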

This is similar to Docker’s stop behavior.

Docker sends a termination signal first, and after a timeout it sends SIGKILL if the process does not exit.

This taught me a practical lesson:

Stopping a container is not the same as killing a process immediately.

A good runtime gives the process a chance to clean up.

For example, a backend service may need to:

  • close database connections
  • flush logs
  • finish current requests
  • release locks
  • write final state

If we send SIGKILL immediately, the process cannot handle it.

SIGKILL cannot be caught.

SIGTERM can be caught.

So graceful shutdown starts with SIGTERM.

PID 1 problem

This day also connected back to PID namespaces.

Inside a PID namespace, the main process becomes PID 1.

PID 1 has special behavior on Linux.

If it does not handle signals properly, stopping the container may not behave as expected.

That helped me understand why some containers use an init process.

It also made me more careful about what command I use as the container entrypoint.

A simple shell may behave differently from a proper application process.

This is one reason container lifecycle management is more subtle than it looks.

Day 7: cgroups and Memory Limits

Day 7 was about cgroups.

Namespaces answer this question:

What can the process see?

cgroups answer a different question:

How much can the process use?

That difference is important.

Namespaces isolate visibility.

cgroups control resources.

With cgroups, the runtime can limit or account for resources such as:

  • memory
  • CPU
  • pids
  • IO
  • sometimes devices and other controllers depending on system configuration

For this project, I focused on memory limit using cgroup v2.

On many modern Linux systems, cgroup v2 is mounted around:

/sys/fs/cgroup

A simplified container cgroup path might be:

/sys/fs/cgroup/tiny-docker/<container-id>/

To limit memory, the runtime can write to:

memory.max

For example:

echo 134217728 > memory.max

That means 128 MB.

Then the runtime adds the process PID to:

cgroup.procs

Example:

echo <pid> > cgroup.procs

After that, the kernel applies the limit to that process group.
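
In Go, this is plain file I/O against the cgroup filesystem. A sketch for cgroup v2 (the tiny-docker group name matches the path above; error handling is minimal and the function name is illustrative):

import (
    "os"
    "path/filepath"
    "strconv"
)

// applyMemoryLimit creates a cgroup v2 group, sets memory.max,
// and moves the container PID into it.
func applyMemoryLimit(containerID string, pid int, limitBytes int64) error {
    dir := filepath.Join("/sys/fs/cgroup/tiny-docker", containerID)
    if err := os.MkdirAll(dir, 0o755); err != nil {
        return err
    }
    limit := strconv.FormatInt(limitBytes, 10)
    if err := os.WriteFile(filepath.Join(dir, "memory.max"), []byte(limit), 0o644); err != nil {
        return err
    }
    // Once the PID is in cgroup.procs, the kernel enforces the limit.
    return os.WriteFile(filepath.Join(dir, "cgroup.procs"), []byte(strconv.Itoa(pid)), 0o644)
}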

This was one of my favorite parts of the project.

Because suddenly “memory limit” stopped being an abstract Docker option.

When I write:

docker run --memory 128m ...

behind the scenes, the runtime eventually has to express that limit to the kernel.

The exact implementation is more complex in Docker, but the basic idea became clear.

Testing memory limits

A simple way to test memory limits is to run a command that allocates memory.

For example, inside a container rootfs with Python:

python3 -c "a = 'x' * 200 * 1024 * 1024; print('allocated')"

If the memory limit is 128 MB, the process should fail or be killed by the kernel.

This is where container behavior becomes very real.

The runtime does not “watch memory” manually in a loop.

The kernel enforces the limit.

That is the power of cgroups.

cgroup v1 vs cgroup v2

I focused on cgroup v2 because it is the modern unified hierarchy.

In cgroup v1, different controllers could be mounted in different hierarchies.

In cgroup v2, the model is unified and cleaner.

But cgroup v2 also has rules that you need to respect.

For example, controller availability depends on the system, and some controllers must be enabled in parent cgroups before child cgroups can use them.

This is where I learned another systems programming lesson:

The code can be correct but the host can still reject the setup because the kernel or systemd cgroup configuration is different.

So a real runtime needs strong detection, good errors, and compatibility handling.

My tiny runtime does not handle every host setup.

But it made the concept clear.

Day 8: Network Namespace

On Day 8, I added network namespace support.

This was the day where containers became both clearer and more confusing.

A network namespace gives a process its own network stack.

That includes its own:

  • interfaces
  • routing table
  • IP addresses
  • firewall rules view
  • loopback device

When I added:

syscall.CLONE_NEWNET

the container got its own network namespace.

But then something interesting happened:

The container had no network.

That is expected.

A new network namespace starts isolated.

Even loopback may need to be brought up manually.

So the first step was simply:

ip link set lo up

inside the namespace.

This taught me a simple but important point:

Network isolation does not automatically mean working networking.

It means the container has a separate network world.

The runtime still needs to connect that world to something.

At this stage, I added a --net none or --net isolated style mode.

That made the behavior explicit.

tiny-docker-go run --net isolated --rootfs ./rootfs/alpine /bin/sh

Inside the container:

ip addr

would show only the isolated namespace interfaces.

No internet.

No host access.

Just isolation.

Small lesson from Day 8

Before this project, I mostly thought about Docker networking from the user side:

-p 8080:80
docker network ls
docker network inspect

But from the runtime side, networking starts much lower:

create network namespace
create interface
move interface into namespace
assign IP
set route
configure NAT

Docker hides all of that.

Building even a tiny version forced me to see the real steps.

Day 9: Bridge and veth Networking

Day 9 was one of the most difficult and useful parts.

The goal was to give the container internet access.

For that, I needed a simple bridge and veth pair.

The model looks like this:

Host network namespace
│
├── eth0 / main host interface
│
├── td0 bridge
│   └── veth-host
│
└── container network namespace
    └── veth-container

A veth pair works like a virtual cable.

Whatever enters one side comes out the other side.

The host keeps one side.

The container gets the other side.

The bridge connects the host-side veth to a small virtual network.

A simple IP plan:

bridge td0:       10.10.0.1/24
container eth0:   10.10.0.2/24
default gateway:  10.10.0.1

The steps are roughly:

ip link add td0 type bridge
ip addr add 10.10.0.1/24 dev td0
ip link set td0 up

ip link add veth-host type veth peer name veth-container
ip link set veth-host master td0
ip link set veth-host up

ip link set veth-container netns <container-pid>

Then inside the container namespace:

ip addr add 10.10.0.2/24 dev veth-container
ip link set veth-container name eth0
ip link set eth0 up
ip route add default via 10.10.0.1

Finally, on the host, NAT is needed:

iptables -t nat -A POSTROUTING -s 10.10.0.0/24 -j MASQUERADE

Also IP forwarding must be enabled:

sysctl -w net.ipv4.ip_forward=1
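
In the runtime, the simplest way to drive these steps is to shell out to the ip binary with os/exec, as in the sketch below for the bridge part; mature runtimes speak netlink directly instead:

import (
    "os"
    "os/exec"
)

// run executes a host command and surfaces its output for debugging.
func run(name string, args ...string) error {
    cmd := exec.Command(name, args...)
    cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
    return cmd.Run()
}

// setupBridge creates and configures the td0 bridge from the IP plan above.
func setupBridge() error {
    if err := run("ip", "link", "add", "td0", "type", "bridge"); err != nil {
        return err
    }
    if err := run("ip", "addr", "add", "10.10.0.1/24", "dev", "td0"); err != nil {
        return err
    }
    return run("ip", "link", "set", "td0", "up")
}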

This is the point where I started to appreciate Docker networking much more.

Because every simple Docker command hides many small Linux networking operations.

Debugging container networking

The useful commands were:

ip addr
ip link
ip route
ip netns
iptables -t nat -L -n -v
sysctl net.ipv4.ip_forward
ping

Some issues I hit or expected:

  • loopback was down
  • veth interface was created but not moved correctly
  • IP address was missing
  • default route was missing
  • NAT rule was missing
  • host forwarding was disabled
  • DNS was not configured
  • interface name inside namespace was not what I expected

This part reminded me that networking bugs are usually not one big bug.

They are often one missing small step.

One missing route.

One down interface.

One missing NAT rule.

One wrong namespace.

Day 10: Polish, README, and Architecture

On Day 10, I focused on making the project understandable.

A learning project is more valuable when other people can read it.

So I improved the README and documented:

  • project goal
  • architecture
  • installation
  • usage examples
  • known limitations
  • roadmap
  • what each feature demonstrates

The final mental model looks like this:

tiny-docker-go
│
├── CLI
│   ├── run
│   ├── ps
│   ├── logs
│   └── stop
│
├── Runtime
│   ├── parent process
│   ├── child process
│   ├── namespace setup
│   ├── rootfs setup
│   └── command execution
│
├── State
│   ├── container id
│   ├── metadata json
│   ├── pid
│   ├── status
│   └── created_at
│
├── Logs
│   └── stdout/stderr capture
│
├── Cgroups
│   ├── memory.max
│   └── cgroup.procs
│
└── Network
    ├── network namespace
    ├── bridge
    ├── veth pair
    └── NAT

And the user-facing commands look like this:

tiny-docker-go run --rootfs ./rootfs/alpine /bin/sh
tiny-docker-go ps
tiny-docker-go logs <id>
tiny-docker-go stop <id>

This is still tiny.

But it is not just a toy CLI anymore.

It demonstrates many of the core ideas behind containers.

What I learned about containers

After building this project, my mental model of Docker changed.

Before, I thought of Docker mostly as:

images + containers + Dockerfile + ports + volumes

Now I think about it more like:

container = isolated Linux process + prepared filesystem + resource limits + networking + lifecycle metadata

That is a much more useful model.

A container is not magic.

It is a process.

But it is a carefully prepared process.

The runtime says:

  • this process should see this hostname
  • this process should see this PID tree
  • this process should use this root filesystem
  • this process should have this memory limit
  • this process should write logs here
  • this process should be connected to this network
  • this process should be stopped with these signals

That is the core idea.

Namespaces vs cgroups

One of the clearest lessons was the difference between namespaces and cgroups.

I would explain it like this:

Namespaces control what a process can see.
Cgroups control what a process can use.

Examples:

PID namespace:
The process sees its own process tree.

UTS namespace:
The process sees its own hostname.

Mount namespace:
The process sees its own mount table.

Network namespace:
The process sees its own network interfaces and routes.

Cgroups:
The process can only use a limited amount of memory, CPU, pids, or IO.

This distinction is simple, but it explains so much.

If a container cannot see host processes, that is namespace isolation.

If a container gets killed after using too much memory, that is cgroup enforcement.

If a container has its own IP address, that is network namespace plus virtual networking.

If a container sees Alpine files instead of host files, that is rootfs setup plus mount isolation.

Docker combines all of these into one clean developer experience.

Small Linux details that mattered

This project taught me many small Linux details that are easy to miss when only using Docker.

1. PID 1 is special

The first process inside a PID namespace becomes PID 1.

PID 1 handles signals differently and is responsible for reaping orphaned child processes.

This matters for container shutdown.

2. /proc must match the PID namespace

If /proc is not mounted inside the container correctly, tools like ps may show confusing information.

Mounting proc inside the container is not just cosmetic.

It affects how process information is visible.

3. chroot changes /, but it is not a complete security model

chroot is useful for learning filesystem isolation.

But real containers need stronger filesystem and security handling.

4. Logs are mostly stdout and stderr

Container logging starts with capturing process output.

If your app logs to stdout/stderr, the runtime can collect it naturally.

5. Graceful stop matters

A runtime should usually send SIGTERM first.

SIGKILL should be the fallback.

This gives the process a chance to shut down cleanly.

6. cgroups are kernel-enforced

The runtime does not manually police memory in a loop.

It writes limits into cgroup files, then the kernel enforces them.

7. A new network namespace has no useful network by default

Isolation comes first.

Connectivity must be built.

8. veth pairs are like virtual cables

One side stays on the host.

One side goes into the container.

That simple idea powers a lot of container networking.

9. NAT is what makes outbound internet work in the simple bridge model

Without NAT and IP forwarding, the container may have an IP but still not reach the internet.

10. Metadata turns a process into something manageable

Without metadata, you only started a process.

With metadata, you can list it, stop it, inspect it, and read its logs.

What this project is not

tiny-docker-go is not a Docker replacement.

It does not support real image pulling.

It does not implement OCI fully.

It does not have production security.

It does not have a daemon.

It does not have advanced volume management.

It does not have complete port publishing.

It does not handle all cgroup configurations.

It does not support all namespace combinations safely.

It does not include seccomp, AppArmor, SELinux, or capabilities hardening yet.

And that is okay.

The goal is not production.

The goal is learning.

Actually, keeping it small made the learning better.

When a project becomes too complete, it can hide the concept again.

I wanted the opposite.

I wanted the concept to stay visible.

What I want to add next

After these 10 days, there are many possible next steps.

Some features I want to explore:

1. Better image support

Right now, rootfs is local.

A next step could be:

tiny-docker-go pull alpine

Even if it is not a full registry implementation, I can start with downloading and unpacking rootfs archives.

2. OverlayFS

Docker images are layer-based.

A good next step is to use OverlayFS:

lowerdir = image layer
upperdir = container writable layer
workdir  = overlay work directory
merged   = final container rootfs

This would make the filesystem model closer to real containers.
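
As a rough idea of where this would go, an overlay mount from Go could look like the sketch below (every path is a placeholder, and this is not implemented in the project yet):

import "syscall"

// mountOverlay builds a merged rootfs from an image layer plus a writable layer.
// All paths here are placeholders, not the project's real layout.
func mountOverlay() error {
    opts := "lowerdir=/var/lib/tiny-docker/images/alpine," +
        "upperdir=/var/lib/tiny-docker/containers/abc123/upper," +
        "workdir=/var/lib/tiny-docker/containers/abc123/work"
    merged := "/var/lib/tiny-docker/containers/abc123/merged"
    return syscall.Mount("overlay", merged, "overlay", 0, opts)
}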

3. Port mapping

Outbound internet is one thing.

Publishing container ports is another.

A next step:

tiny-docker-go run -p 8080:80 ...

This would require NAT/DNAT rules or a proxy approach.

4. Better process supervision

The runtime could track exit status, update metadata automatically, and clean up resources more reliably.

5. Capabilities

Linux capabilities are very important for container security.

Instead of giving a process full root power, Linux can split privileges into smaller capabilities.

Dropping capabilities would make the runtime more realistic.

6. Seccomp

Seccomp can restrict which syscalls a process can use.

This is another important container hardening feature.

7. User namespace

User namespaces are powerful because they can make a process think it is root inside the container while mapping it to a less privileged user on the host.

This is a very interesting security feature.

8. OCI runtime spec

Eventually, I want to read more about the OCI runtime spec and compare my tiny runtime with how real runtimes are structured.

Final thoughts

This project made Docker feel less magical and more impressive.

Less magical because I can now see the Linux pieces behind it.

More impressive because I understand how many details Docker handles for us.

Running a container sounds simple:

docker run nginx

But under that command, a runtime needs to prepare isolation, filesystem, networking, logs, metadata, signals, and resource limits.

Building tiny-docker-go helped me understand those pieces one by one.

The most important lesson for me was this:

A container is just a Linux process, but the runtime carefully shapes the world around that process.

That world includes what the process can see, what it can use, where its files come from, how its logs are captured, how it receives signals, and how it connects to the network.

This is why building a tiny container runtime is such a useful learning project.

You do not need to rebuild Docker completely.

You only need to rebuild enough of it to understand the ideas.

That is what I tried to do with tiny-docker-go.

You can follow the project here:

https://github.com/amirsefati/tiny-docker-go
