Rootless Podman user-namespaces in plain English

In my personal opinion, user namespaces are one of the most brain-twisting aspects of rootless containers to understand. As with Kubernetes, the learning curve can be quite steep. In this article, I will attempt to reduce that slope for new Podman users (and converts) with an easy-to-understand analogy. Hopefully this will help prevent any essential brain-goo leaking out of your ear-holes.

My assumption is that you’re just getting into containers in general, and rootless Podman in particular (i.e. operating as a normal, unprivileged user). It doesn’t matter how you’re using Podman: directly on the command line, remotely with podman-remote, through podman-machine, in a GUI like podman-desktop, or in a web UI like Cockpit. The basic concepts I’m going to illustrate apply in all of those cases.

First, a bit of context on namespaces.  Like container images and cgroups, namespaces are a fundamental building block of containers.  There are several different kinds of namespaces, covering areas like networking, storage, and the process tree.  You can think of namespaces like the infrastructure in a house.  A house has its own electrical system, plumbing, drainage, etc.  They’re all separate from each other, single-purposed, yet supplied and managed by the state or some utility company.

Continuing this analogy, the house infrastructure interacts with the utility services in a controlled manner. In other words, there’s a level of accounting and isolation from one house to the next. For example, electrical meters record usage separately, and breakdowns in one house don’t affect another. There’s also a useful separation between utilities, such that a clogged drain doesn’t cause a fuse (or circuit-breaker) to fail.

Each of these utility services represents a different kind of namespace, and (in case it’s not obvious) the houses represent containers. The oversight and management of services by the utility companies (or state) represents namespace interactions from the host system. To help understand user namespaces, let’s extend this analogy to the human occupants and their constant demand for sharing their thoughts. Since everybody now has a cell-phone from birth, everybody can easily communicate with each other. However, children have trouble remembering long numbers. So for communication within a household, let’s invent a shortcut system where each person is assigned a short number, separate from their cell-number.

As a further convenience, let’s use the phone number zero as a quick way to contact the phone company when dialing on the cellular network (outside the house). Inside the house, dialing zero on the shortcut system instead connects to whoever is responsible for paying the phone bill. So for example, little Billy can dial shortcut zero to call mom, and mom can dial cellular zero to request that restrictions be placed on Billy’s excessive text messaging.

If you’re still with me, you’ll have realized we’ve now created a few major hassles: people’s cell-phone and shortcut numbers may overlap. In other words, there’s no obvious difference between cell-number 123 and shortcut-number 123, yet who actually answers depends on which system is used and on which house the call is made from. Worse yet, dialing zero to complain about spam calls will get you very different results on the shortcut number compared to the cellular number. Speaking of which, let’s fix the spam-call problem by simply restricting all household phones to only dialing shortcut numbers, completely blocking access to the wider cellular network from inside a house.

As for the number-ambiguity problem, an obvious-seeming solution is a phone book that records everybody’s name, cell number, and shortcut number. However, a phone book will actually make the problem worse, because people in different houses may share the same name and shortcut number. Instead, let’s have a record for each house that reserves an exclusive collection of (otherwise unused) cell-phone numbers. Then each house can use a simple translation scheme to convert shortcut numbers into those reserved cell-numbers, keeping the phone company happy. Also, should a house member somehow manage to phreak the system and dial out on the wider cellular network, they will be blocked because their cell number belongs to an unused range.

This is all getting a bit cerebral, so let’s use a simple example. The house at 221B Baker Street might be given the reserved (and unused) cell-phone numbers 1000-1999 to back its shortcut numbers. Inside the house, shortcut number one translates to cell number 1000, two to 1001, and so on. So if Watson has the shortcut number three, it would translate to the (unused) cell number 1002 outside the house. Further, the shortcut number zero at this house would simply translate directly to the cell number allocated to Sherlock (again, outside the house). In this way, the phone utility can exactly mimic the kind of separation and isolation used for the other utilities. Communications are confined within each house, and there’s a high degree of separation between all houses.

Coming back to the real world of hosts and containers, you can apply this cell & shortcut analogy directly to rootless user namespaces. Each line in /etc/subuid and /etc/subgid holds three colon-separated values describing a reserved range of unused IDs on the host, which will be allocated for translation inside containers. The first item on each line is the name of the user who owns the range, and whose own ID will translate to ID zero inside the container. The second item is the beginning of an unused range of IDs on the host, and the third is the length of the allocated range.
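To make that three-field layout concrete, here is a minimal Python sketch that splits one such line into its parts. The names SubIdEntry and parse_subid_line are made up for this illustration, and the sample line is an invented entry, not one taken from any particular system.

```python
from typing import NamedTuple


class SubIdEntry(NamedTuple):
    """One line of /etc/subuid or /etc/subgid (hypothetical helper type)."""
    owner: str   # user the range is reserved for
    start: int   # first host ID in the reserved range
    count: int   # how many IDs the range contains


def parse_subid_line(line: str) -> SubIdEntry:
    """Split a 'name:start:count' line into its three fields."""
    owner, start, count = line.strip().split(":")
    return SubIdEntry(owner, int(start), int(count))


entry = parse_subid_line("alice:100000:65536")  # made-up sample line
last_host_id = entry.start + entry.count - 1    # final host ID in the range
print(entry.owner, entry.start, last_host_id)
```

Note that the range is described by a start and a length, so the last reserved host ID is always start + length - 1.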

So in terms of the Baker Street example above, an entry in these host files (one for UIDs, the other for GIDs) might be sherlock:1000:1000. This means a container started by Sherlock will have a root user (UID/GID zero) represented by Sherlock’s own ID on the host (whatever that happens to be). Subsequent UIDs/GIDs inside the container (such as Watson’s) simply translate, in order, to the unused range on the host starting at 1000. Container ID one maps to host ID 1000, two to 1001, and so on, which is why Watson’s container ID of three appears as ID 1002 outside the container. If an ID in the container tries to go beyond 1000, the system will simply throw an error, since the host-side ID allocation stops at 1999. Again, the key here is that the UIDs (and/or GIDs) 1000-1999 are not used by anything on the host, which maintains the confinement of all user namespaces on the system.
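The translation rule for the Baker Street range 1000-1999 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Podman or the kernel implements it. The function name to_host_id is invented here, and Sherlock’s host ID of 501 is purely an assumption, since the example never says what it is.

```python
def to_host_id(container_id: int, owner_host_id: int, start: int, count: int) -> int:
    """Translate an ID seen inside the container to the ID the host sees."""
    if container_id == 0:
        # Container root is the unprivileged user who started the container.
        return owner_host_id
    if 1 <= container_id <= count:
        # IDs 1..count map, in order, onto the reserved host range.
        return start + container_id - 1
    raise ValueError(f"container ID {container_id} has no host mapping")


SHERLOCK_HOST_ID = 501  # assumption: Sherlock's real host ID is not given

print(to_host_id(0, SHERLOCK_HOST_ID, start=1000, count=1000))  # Sherlock himself
print(to_host_id(3, SHERLOCK_HOST_ID, start=1000, count=1000))  # Watson
```

With these inputs, container ID zero comes back as Sherlock’s own host ID and Watson’s ID three comes back as 1002, while an ID far outside the reserved range (say 5000) raises an error instead of silently landing on some other host ID.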

On a real host, you’ll typically find much larger values in /etc/subuid and /etc/subgid. The range lengths are usually set to 65536, since most Linux container images will require that minimum. In any case, with these analogies in mind, you’re well equipped to understand the concept of container namespaces. In particular, the shortcut-to-cell number translation scheme mirrors how user namespaces work. When you’re ready to further reinforce these concepts, I highly recommend Dan Walsh’s blog article on controlling user-namespace access. Following that, for a more practical example, Dan’s article Dealing with user namespaces and SELinux on rootless containers is also very good.
