Aristeu Rozanski <arozansk@xxxxxxxxxx> writes:

> On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote:
>> Adding the containers list so folks with container expertise can see
>> what is being proposed.
>>
>> Aristeu Rozanski <arozansk@xxxxxxxxxx> writes:
>>
>> > This patchset introduces a new audit record to follow all USER records which
>> > provides namespace information of the process. The idea is to allow processes
>> > in containers to create records in the host system while providing means to be
>> > filtered out.
>>
>> It looks like this mechanism makes it easy for an unprivileged program
>> to spam and overwhelm the audit log.
>>
>> > For each new namespace, a unique procfs inode number is allocated and this
>> > number has been used by userspace to determine which processes belong to the
>> > same namespace. These numbers are used in the new audit record.
>> >
>> > Applications such as libvirt-sandbox and lxc can then report the same numbers
>> > when a container is created and destroyed allowing to map records to a certain
>> > container. Maybe the next step would be having a record for whenever a new
>> > namespace is created?
>> >
>> > First 6 patches are needed in order to get each namespace's inode number.
>>
>> Grumble the existing methods can be used you don't have to introduce a
>> whole new set of methods. Grumble. Besides the bug of assuming that
>> the inodes now and forever will be the same across all instances of
>> proc.
>
> the existing methods are for procfs use and I didn't want to abuse it.
> like I said the other email, the fact that it's not a reliable way to
> indefinitely describe a namespace due to multiple procfs instances or
> migration, the whole idea is flawed.

It is always possible to pick the instance of /proc connected to the
initial pid namespace. And there is a device number you can use to say
that.

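
To illustrate the point about device numbers, here is a minimal userspace
sketch, not part of the patchset. It assumes a Linux system that exposes
namespace files under /proc/<pid>/ns/, and the helper names namespace_id and
same_namespace are hypothetical. Pairing the inode number with the device
number of the filesystem it came from states which instance the inode number
belongs to, rather than assuming inode numbers are global.

```python
import os


def namespace_id(path):
    """Return (device number, inode number) for a namespace file.

    `path` is expected to be something like /proc/self/ns/net. os.stat()
    follows the link, so the numbers describe the namespace object itself.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_namespace(path_a, path_b):
    """Two paths name the same namespace only if device AND inode match."""
    return namespace_id(path_a) == namespace_id(path_b)


if __name__ == "__main__":
    ns_dir = "/proc/self/ns"  # assumption: present on the running kernel
    if os.path.isdir(ns_dir):
        for name in sorted(os.listdir(ns_dir)):
            try:
                dev, ino = namespace_id(os.path.join(ns_dir, name))
            except OSError:
                continue  # entry not readable on this system
            print(f"{name}: dev={dev} ino={ino}")
```

For example, same_namespace("/proc/self/ns/net", "/proc/1/ns/net") would
report whether the caller shares a network namespace with pid 1, provided the
caller is allowed to read pid 1's entry.
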
Usually designs that need global identifiers for namespaces suffer from
the need for a namespace of namespaces (which we sort of have in /proc),
and I push back by default to get people to think about whether what they
are trying to do really makes sense.

>> > Patch 7 properly defines the new record that is related to the USER
>> > record
>>
>> Not augmenting the current user records seems a little odd to me.
>>
>> You also continue in this my current policy of not allowing any audit
>> records in the container itself, so I don't quite know what the point
>> of all of this is.
>
> your current policy wasn't known to me and
> /* Only support the initial namespaces for now. */
> sounds like something that didn't happen for other reasons

The reasons were simply that, to my knowledge, no one has thought through
how it makes sense for audit records and namespaces to interact. My
expectation would be that an extension of audit records would be logged on
a per-container basis. But I don't have any motivating examples.

>> > Patch 8 allows USER records to be generated from different namespaces
>>
>> Which essentially allows any user to create any USER record they want
>> whenever they want.
>>
>> > Here's an example of output:
>> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
>>
>> Ok. This seems totally bizarre. You are running a container with a
>> user namespace with some uid mapped to uid 0?
>
> on the notes section:
> - while the last patch allows a new userns to send audit records, I haven't
>   looked yet at making sure it has proper capabilities so regular users'
>   containers can create records
>
> so I haven't tried it with userns. It's a RFC.

I thought you would have taken the time to run it at least once, or
perhaps to have manually edited your example to see how things would fit
together.

> That's a regular record
> to show the related records, using initial namespaces. like I stated in
> the email, I wasn't sure how I'd handle capabilities but the idea would be
> to allow containers to log to the system's auditd. since inode numbers
> aren't reliable for more than a moment, I guess there's no other
> way than having an audit namespace and run an audit daemon inside the
> container (and communicate over the network like an individual host).

What was really missing from your RFC is a motivating example. I sort of
see one in your paragraph above, but it isn't clear to me.

What is lost by not allowing USER audit records from processes in
containers?

What is gained by implementing support for user processes to generate
them?

And of course, what are your thoughts on preventing unprivileged users
from overwhelming the audit subsystem?

In my minimal experience with the audit subsystem, it roughly feels like
hardly anyone really cares, although I may be wrong.

Eric

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers