Container IDs for the audit subsystem
Linux containers are something of an amorphous beast, at least with respect to the kernel. There are lots of facilities that the kernel provides (namespaces, control groups, seccomp, and so on) that can be composed by user-space tools into containers of various shapes and colors; the kernel is blissfully unaware of how user space views that composition. But there is interest in having the kernel be more aware of containers and for it to be able to distinguish what user space considers to be a single container. One particular use case for the kernel managing container identifiers is the audit subsystem, which needs unforgeable IDs for containers that can be associated with audit trails.
Back in early October, Richard Guy Briggs posted the second version of his RFC for kernel container IDs that can be used by the audit subsystem. The first version was posted in mid-September, but Briggs's RFC is not the only proposal out there. David Howells proposed turning containers into full-fledged kernel objects back in May, but seemingly ran aground on objections that the proposal "muddies the waters and makes things more brittle", in the words of namespaces maintainer Eric W. Biederman.
Briggs's proposal is focused on the needs of the audit subsystem, rather than trying to solve any larger problem, however. He described some of the problems for the audit subsystem in a 2016 Linux Security Summit talk. In addition, he laid out some of the requirements for container tracking in response to a query from Carlos O'Donell about the first RFC:
- ability to filter unwanted, irrelevant or unimportant messages before they fill queue so important messages don't get lost. This is a certification requirement.
- ability to make security claims about containers, require tracking of actions within those containers to ensure compliance with established security policies.
- ability to route messages from events to relevant audit daemon instance or host audit daemon instance or both, as required or determined by user-initiated rules
As proposed, audit container IDs would be handled as follows. A container orchestration system would register the ID of a container (a 16-byte UUID) by writing to a special file in the /proc directory for the container's initial process. Briggs proposes a new capability (CAP_CONTAINER_ADMIN) that would be required for a process to be able to register a container ID, but no process would be able to change its own container ID even with the capability.
Registering the container ID would associate the process ID (PID) of the first process (in the initial PID namespace) and all of that process's namespaces (using the namespace filesystem device and inode numbers) with the ID in an AUDIT_CONTAINER record that gets logged. The container IDs would then be used in various audit log messages to associate auditable events with the container that performed them. Any child processes would inherit the container ID of their parent so that all of the processes and threads in a container would be associated with its ID. If the first process has already forked or created threads, the registration would either fail or all of the child processes/threads would be associated with the ID; the right course will be determined as part of the RFC and implementation process.
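The registration rules described above can be modeled in a few lines of user-space Python. This is only a toy sketch of the semantics, not kernel code: the `Task` class and `register_container_id()` function are hypothetical names invented for illustration, and the sketch assumes (for simplicity) that children also inherit their parent's capabilities. It does not take a position on the open question of what happens if the target has already forked.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional, Set

CAP_CONTAINER_ADMIN = "CAP_CONTAINER_ADMIN"  # capability name proposed in the RFC


@dataclass
class Task:
    """Toy stand-in for a process; hypothetical, not a kernel structure."""
    pid: int
    caps: Set[str] = field(default_factory=set)
    container_id: Optional[uuid.UUID] = None

    def fork(self, pid: int) -> "Task":
        # A child starts out with its parent's container ID.
        # Copying the capability set is a simplifying assumption.
        return Task(pid=pid, caps=set(self.caps), container_id=self.container_id)


def register_container_id(actor: Task, target: Task, cid: uuid.UUID) -> None:
    """Model the orchestrator's write to the target's /proc file."""
    if CAP_CONTAINER_ADMIN not in actor.caps:
        raise PermissionError("actor lacks CAP_CONTAINER_ADMIN")
    if actor is target:
        # Even with the capability, a process cannot set its own ID.
        raise PermissionError("a process may not set its own container ID")
    target.container_id = cid


orchestrator = Task(pid=1, caps={CAP_CONTAINER_ADMIN})
container_init = Task(pid=100)
cid = uuid.uuid4()  # a 16-byte UUID, as in the proposal
register_container_id(orchestrator, container_init, cid)
worker = container_init.fork(pid=101)
```

Here `worker.container_id` ends up equal to `cid` purely through inheritance, which is what would let every process and thread in the container be tied back to the same ID in the audit log.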
Audit events would be generated for all namespace creation and destruction operations. Creation events would be associated with the container ID of the process performing the action. Destruction events, which occur when there are no more references to a namespace, would log just the device and inode of the destroyed namespace. Changes to a process's namespaces would also generate an audit event that records the new and old namespace information.
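The asymmetry between creation and destruction records might look something like the following sketch. It assumes a simple reference count per namespace, keyed by (device, inode); the `NamespaceAuditLog` class, its method names, and the `ns_create`/`ns_destroy` event labels are all hypothetical and do not come from the RFC.

```python
from typing import Dict, List, Optional, Tuple

NsKey = Tuple[int, int]  # (device number, inode number) identifying a namespace


class NamespaceAuditLog:
    """Toy model of namespace lifetime events; names are illustrative only."""

    def __init__(self) -> None:
        self.records: List[dict] = []
        self._refs: Dict[NsKey, int] = {}

    def create(self, ns: NsKey, container_id: Optional[str]) -> None:
        # Creation is performed by a process, so its container ID is known.
        self._refs[ns] = 1
        self.records.append(
            {"event": "ns_create", "dev": ns[0], "ino": ns[1],
             "container_id": container_id}
        )

    def get(self, ns: NsKey) -> None:
        # Another user (for example, a forked child) references the namespace.
        self._refs[ns] += 1

    def put(self, ns: NsKey) -> None:
        # Drop a reference; the namespace is destroyed when none remain.
        self._refs[ns] -= 1
        if self._refs[ns] == 0:
            del self._refs[ns]
            # Only the device and inode are recorded on destruction.
            self.records.append({"event": "ns_destroy", "dev": ns[0], "ino": ns[1]})


log = NamespaceAuditLog()
ns = (4, 4026532001)  # made-up device and inode numbers
log.create(ns, container_id="example-container")
log.get(ns)
log.put(ns)
log.put(ns)
```

After the final `put()`, the log holds one creation record carrying a container ID and one destruction record carrying only the device and inode.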
The new capability for container IDs was one of the first things questioned about the proposal. Casey Schaufler asked how there could be a kernel container capability when the RFC clearly states that the kernel knows nothing about containers. Briggs likened container IDs to login user IDs and session IDs "that the kernel tracks for the convenience of userspace". He suggested that if the CAP_CONTAINER_ADMIN name was the problem, he would be fine with something like CAP_AUDIT_CONTAINERID, but that was not the core of Schaufler's complaint:
If it's audit behavior, you want CAP_AUDIT_CONTROL. If it's more than audit behavior you have to define what system security policy you're dealing with in order to pick the right capability.
We get this request pretty regularly. "I need my own capability because I have a niche thing that isn't part of the system security policy but that is important!" Fit the containerID into the system security policy, and if that results in using CAP_SYS_ADMIN, oh well.
There already are two capabilities for the audit subsystem (CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE) but, as Paul Moore explained, neither is quite right to govern the ability to register container IDs:
James Bottomley suggested sidestepping the capability question by making the container ID a write-once attribute; once set, nothing could change it. The idea of nested containers came up several times, though, which would require some way to change these container IDs. Bottomley suggested simply allowing new IDs to be appended to the container ID, so that the hierarchy is inherent in the chain of IDs. Moore agreed that write-once would work for the non-nested case:
But Aleksa Sarai pointed out that nested containers are a fairly common use case, for LXC system containers in particular (which will often have other container runtimes running inside them). Biederman noted that there is not, as yet, a solution for running the audit daemon in containers, so it may be premature to worry about nested container IDs at this point.
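Bottomley's append-only idea for nested containers can be illustrated with a small sketch. It assumes the ID is stored as an ordered chain that can grow but never be rewritten; the `ContainerIdChain` class and its methods are hypothetical, invented here to show how nesting would fall out of the chain itself.

```python
from typing import Tuple


class ContainerIdChain:
    """Append-only chain of container IDs; illustrative, not from any patch."""

    def __init__(self, ids: Tuple[str, ...] = ()) -> None:
        self._ids: Tuple[str, ...] = tuple(ids)

    @property
    def ids(self) -> Tuple[str, ...]:
        return self._ids

    def append(self, new_id: str) -> None:
        # Existing entries are never modified or removed; the chain only grows.
        self._ids = self._ids + (new_id,)

    def child(self) -> "ContainerIdChain":
        # A child process starts with a copy of its parent's chain.
        return ContainerIdChain(self._ids)

    def is_nested_in(self, outer: "ContainerIdChain") -> bool:
        # Nesting is just a strict-prefix relationship between chains.
        n = len(outer.ids)
        return len(self._ids) > n and self._ids[:n] == outer.ids


outer = ContainerIdChain()
outer.append("lxc-system-container")

inner = outer.child()
inner.append("nested-runtime-container")
```

With this arrangement, `inner.is_nested_in(outer)` is true while `outer.is_nested_in(inner)` is not, so the hierarchy can be read directly from the IDs without any separate bookkeeping. The non-nested, write-once case is simply a chain that is limited to a single entry.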
Schaufler is concerned that adding an ID for auditing containers is heading down the wrong path. He suggested the ptags Linux Security Module as a way forward; it would allow arbitrary tags with values to be set for a process.
Moore stressed that the effort was not aimed at a more general mechanism, but simply to address the needs of the audit subsystem at this point. He said that the ID is meant to be an "audit container ID" and not a more general "container ID". Using the audit ID for other purposes risks opening up problems in other areas (such as container migration), so he and Briggs are attempting to restrict the use cases.
At this point, there is no code on the table; it is purely a discussion of where things should go. Adding a new capability for registering these IDs seems to be a non-starter; the write-once scheme governed by one of the existing audit capabilities seems like it might plausibly pass muster. As Moore said, there seems to be a bigger need here, but more general solutions have so far been hard to come by. Adding IDs willy-nilly may be suboptimal but, until something more general comes along, might just be the right way forward.