Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> "Serge E. Hallyn" <serge@xxxxxxxxxx> writes:
>
> > Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> >> Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx> writes:
> >>
> >> > Introduce a system log namespace. The syslog ns is tied to a user
> >> > namespace. You must create a new user namespace before you can create a
> >> > new syslog ns. The syslog ns is created through a new command (11) to
> >> > the __NR_syslog system call.
> >> >
> >> > Once a task enters a new syslog ns, its "dmesg", "dmesg -c" and
> >> > /dev/kmsg actions affect only itself, so that user-created syslog
> >> > messages no longer are confusingly combined in the host's syslog.
> >> > "printk" itself always goes to the initial syslog_ns, and consoles
> >> > belong only to the initial syslog_ns. However printks relating to a
> >> > specific network namespace, for instance, can now be targeted to the
> >> > syslog ns for the user ns which owns the network ns, aiding in debugging
> >> > in a container.
> >> >
> >> > This patch is on top of the user namespace enhanced kernel at
> >> > git://kernel.ubuntu.com/serge/quantal-userns. It is good enough to
> >> > compile with stock ubuntu kernel options, boot, launch other syslog
> >> > namespaces and exercise them. It will need help before it will compile
> >> > with funky options like CONFIG_PRINTK=n. This is only being sent out to
> >> > get feedback on the general idea.
> >> >
> >> > Comments greatly appreciated.
> >> >
> >> > (See https://wiki.ubuntu.com/LxcSyslogNs for background).
> >>
> >> Overall I would say the goal sounds well thought out.
> >>
> >> I am not a fan of how this ties into the user namespace. I would prefer
> >> closer or looser ties. The recursive reference count loop where a
> >> userns refers to a syslogns and that syslogns refers to the same userns
> >> is unpleasant.
> >
> > We could make the nsproxy point to the syslog_ns, but this seemed simpler.
> > Note that the syslog_ns does not need to pin the user_ns, since by design
> > the user_ns owning a syslog_ns can't go away if the syslog_ns is still
> > alive.
> >
> > But yes, the question of "what should point to the syslog_ns" is what has
> > kept a syslog_ns from being seriously proposed since February 2010 :)
> >
> > Hm, wait. A nagging feeling made me look back, and I see that I do in
> > fact pin the user_ns from the syslog_ns. I didn't mean to (and I don't
> > release it :) and we don't need to. When a syslog_ns is created, it
> > can only be inherited by child user_ns's, and its owner, the parent user_ns,
> > can never go away until the child user_ns's go away.
>
> There is an argument to be made that syslog messages are the kind of
> security identifiers like uid, gids, and keys that should be part of a
> user namespace. I'm not fully convinced but there are some DOS attacks
> that would naturally prevent.

I can't really think of a good case for not putting the syslogns
straight into the userns (i.e. not having a separate syslogns), so I'd
say let's go that route.

There is a big locking bug (besides syslog_ns pinning user_ns) in my
patch - something needs to be done with struct cont, which pins the
syslog_ns. So either when a user_ns is freed we need to flush struct
cont if it is pinning this user_ns, or the struct cont should
explicitly pin the user_ns.

> >> The important case as I understand it is to handle injection of messages
> >> into dmesg by userspace?
> >
> > 1. injection of messages into dmesg by userspace, 2. clearing of messages
> > by userspace, but also 3. allowing appropriate kernel printks to be
> > targeted to containers.
> >
> >> I would really like to see how messages from networking devices and
> >> netfilter would be handled.
> >> Right now one of the ugliest bits of
> >
> > It would simply replace a
> > printk(KERN_NOTICE "doing something\n");
> > with
> > nsprintk(net->user_ns->syslog_ns, KERN_NOTICE "doing something\n");
> >
> > I'm not yet clear on whether we'd want nsprintk to print to both the
> > init_syslog_ns (with a ns prefix) and the child ns.
>
> There are some specialized forms of printk like dev_printk and in
> particular netdev_printk that it would be very interesting if they
> did the work behind the scenes. So that you could code the obvious
> thing and it would do the right thing automatically.

Agreed.

> >> lowering the permissions in the network namespace is what to do about the
> >> commands that set the message loglevel.
> >
> > Here I'm not sure what you mean.
>
> There is a possible DOS attack that by turning on debug messages in a
> user namespace you can overwhelm syslog.

Oh, I see.

> >> In general unless we can safely and sanely direct kernel messages into
> >> this new dmesg I don't actually see the point of having another ring
> >> buffer in the kernel. If the only success is userspace having the
> >> syslog facility simply be unavailable seems more palatable.
> >
> > No, I didn't do any in this patch, but directing kernel messages into the
> > new dmesg was definitely a goal and should be trivial now.
>
> Getting the semantics of which kernel messages should be directed at the
> new ring buffer and what that means seems to me to be a key factor in
> seeing how practical this is. Otherwise this seems to call out for a
> change in userspace.

Ok, I was hoping that once there was a trivial-to-use nsprintk the
appropriate users would be converted by others :), but I can take a
look at converting compelling users before I resend.

> Certainly inside a user namespace now you can't destructively touch the
> kernel's syslog at all.

That should be true, yes.
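
To make the nsprintk semantics under discussion a bit more concrete,
here is a minimal userspace sketch. It is not the code from the patch.
The struct layout, the name nsprintk_sketch, the "mirror" flag and the
fixed-size flat buffer are all assumptions made for illustration, and
log levels and locking are left out. It shows one possible answer to
the open question above: write the message to the target ns, and
optionally also to the initial ns with a ns prefix.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LOG_BUF_LEN 256

/* Hypothetical userspace stand-in for a per-namespace log buffer. */
struct syslog_ns {
	const char *name;       /* prefix used when mirroring to the initial ns */
	char buf[LOG_BUF_LEN];  /* flat buffer; a real one would be a ring */
	size_t len;
};

/* Stand-in for the initial syslog ns that plain printk always targets. */
static struct syslog_ns init_syslog_ns = { .name = "init" };

/* Append a string to one ns buffer, truncating when it is full. */
static void ns_append(struct syslog_ns *ns, const char *msg)
{
	size_t room = LOG_BUF_LEN - 1 - ns->len;
	size_t n = strlen(msg);

	if (n > room)
		n = room;
	memcpy(ns->buf + ns->len, msg, n);
	ns->len += n;
	ns->buf[ns->len] = '\0';
}

/*
 * Log to the given ns. If mirror is nonzero and ns is not the initial
 * ns, also log a "[name] "-prefixed copy to the initial ns.
 */
static void nsprintk_sketch(struct syslog_ns *ns, int mirror,
			    const char *fmt, ...)
{
	char msg[128];
	char line[160];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(msg, sizeof(msg), fmt, ap);
	va_end(ap);

	ns_append(ns, msg);
	if (mirror && ns != &init_syslog_ns) {
		snprintf(line, sizeof(line), "[%s] %s", ns->name, msg);
		ns_append(&init_syslog_ns, line);
	}
}

/* Rough analogue of "dmesg -c": clears only the given ns. */
static void ns_clear(struct syslog_ns *ns)
{
	ns->len = 0;
	ns->buf[0] = '\0';
}
```

With this shape, a clear issued inside a child ns leaves the initial
buffer untouched, which is the behaviour described for "dmesg -c" in
the patch description.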

thanks,
-serge
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers