On Wed, Sep 09, 2015 at 03:24:12PM -0400, Michael J Coss wrote: > On 9/8/2015 11:55 PM, Greg KH wrote: > > On Tue, Sep 08, 2015 at 10:10:29PM -0400, Michael J. Coss wrote: > >> Adds capability to allow userspace programs to forward a given event to > >> a specific network namespace as determined by the provided pid. In > >> addition, support for a per-namespace kobject_sequence counter was > >> added. Sysfs was modified to return the correct event counter based on > >> the current network namespace. > >> > >> Signed-off-by: Michael J. Coss <michael.coss@xxxxxxxxxxxxxxxxxx> > >> --- > >> include/linux/kobject.h | 3 ++ > >> include/net/net_namespace.h | 3 ++ > >> kernel/ksysfs.c | 12 ++++++ > >> lib/kobject_uevent.c | 90 +++++++++++++++++++++++++++++++++++++++++++++ > >> 4 files changed, 108 insertions(+) > >> > >> diff --git a/include/linux/kobject.h b/include/linux/kobject.h > >> index 637f670..d1bb509 100644 > >> --- a/include/linux/kobject.h > >> +++ b/include/linux/kobject.h > >> @@ -215,6 +215,9 @@ extern struct kobject *firmware_kobj; > >> int kobject_uevent(struct kobject *kobj, enum kobject_action action); > >> int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, > >> char *envp[]); > >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE) > >> +int kobject_uevent_forward(char *buf, size_t len, pid_t pid); > >> +#endif > >> > >> __printf(2, 3) > >> int add_uevent_var(struct kobj_uevent_env *env, const char *format, ...); > >> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h > >> index e951453..a4013e5 100644 > >> --- a/include/net/net_namespace.h > >> +++ b/include/net/net_namespace.h > >> @@ -134,6 +134,9 @@ struct net { > >> #if IS_ENABLED(CONFIG_MPLS) > >> struct netns_mpls mpls; > >> #endif > >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE) > >> + u64 kevent_seqnum; > >> +#endif > >> struct sock *diag_nlsk; > >> atomic_t fnhe_genid; > >> }; > >> diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c > >> index 6683cce..4bc15fd 100644 > >> --- a/kernel/ksysfs.c > >> +++ b/kernel/ksysfs.c > >> @@ -21,6 +21,9 @@ > >> #include <linux/compiler.h> > >> > >> #include <linux/rcupdate.h> /* rcu_expedited */ > >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE) > >> +#include <net/net_namespace.h> > >> +#endif > > #if isn't needed here, right? > > > >> > >> #define KERNEL_ATTR_RO(_name) \ > >> static struct kobj_attribute _name##_attr = __ATTR_RO(_name) > >> @@ -33,6 +36,15 @@ static struct kobj_attribute _name##_attr = \ > >> static ssize_t uevent_seqnum_show(struct kobject *kobj, > >> struct kobj_attribute *attr, char *buf) > >> { > >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE) > >> + pid_t p = task_pid_vnr(current); > >> + struct net *n = get_net_ns_by_pid(p); > >> + > >> + if (n != ERR_PTR(-ESRCH)) { > >> + if (!net_eq(n, &init_net)) > >> + return sprintf(buf, "%llu\n", n->kevent_seqnum); > >> + } > >> +#endif > >> return sprintf(buf, "%llu\n", (unsigned long long)uevent_seqnum); > >> } > >> KERNEL_ATTR_RO(uevent_seqnum); > >> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c > >> index d791e33..7589745 100644 > >> --- a/lib/kobject_uevent.c > >> +++ b/lib/kobject_uevent.c > >> @@ -379,6 +379,96 @@ int kobject_uevent(struct kobject *kobj, enum kobject_action action) > >> } > >> EXPORT_SYMBOL_GPL(kobject_uevent); > >> > >> +#if defined(CONFIG_UDEVNS) || defined(CONFIG_UDEVNS_MODULE) > >> +/** > >> + * kobject_uevent_forward - forward event to specified network namespace > >> + * > >> + * @buf: event buffer > >> + * @len: event length > >> + * @pid: pid of network namespace > >> + * > >> + * Returns 0 if kobject_uevent_forward() is completed with success or the > >> + * corresponding error when it fails. > >> + */ > >> +int kobject_uevent_forward(char *buf, size_t len, pid_t pid) > >> +{ > >> + int retval = 0; > >> +#if defined(CONFIG_NET) > >> + struct uevent_sock *ue_sk; > >> + struct net *pns; > >> + char *p; > >> + u64 num; > >> + > >> + /* grab the network namespace of the provided pid */ > >> + pns = get_net_ns_by_pid(pid); > >> + if (pns == ERR_PTR(-ESRCH)) > >> + return -ESRCH; > >> + > >> + /* find sequence number in buffer */ > >> + p = buf; > >> + num = 0; > >> + while (p < (buf + len)) { > >> + if (strncmp(p, "SEQNUM=", 7) == 0) { > >> + int r; > >> + > >> + p += 7; > >> + r = kstrtoull(p, 10, &num); > >> + if (r) { > >> + put_net(pns); > >> + return r; > >> + } > >> + break; > >> + } > >> + p += (strlen(p) + 1); > >> + } > > Ok, that's crazy, replacing an existing sequence number with a sequence > > number of the namespace? An interesting hack, yes, but something we > > want to maintain for the next 20+ years, no. Surely there's a better > > way to do this? > > > > But again, I'm not sold on this whole idea anyway. > > > > greg k-h > > > Re: the #if > The #if is only needed because the new udevns code references a > structure defined in that header. Obviously it could be included > without consequences but I #if it to show it was part of the forwarding > code. That's not an issue, we don't put #ifdefs in .c code if at all possible, you will note that you added a bunch of them :( > Re: sequence # > I wanted it as transparent as possible, without this the udevd running > inside the container has issues with the values of the sequence numbers > seen, by making it tied to the namespace, udevd could run unchanged. Oh I know why you did it, I just don't like it :) > Our goal was to minimize the changes to a linux distro and still be able > to run a full desktop inside a container. But even in absence of our > use case, the first question is should uevents be broadcasts to every > network namespace? I don't think that broadcasting is the correct > action. And follow on questions are what if anything should be done > with regards to uevents and containers. I don't think you should be running udevd within a container, as a device is never bound to a namespace (network devices are the only exception), it's a false thing to think that a uevent is only for a single namespace as well. I understand your wish to change only the kernel to get unmodified userspace to run, but I suggest modifying your userspace instead :) sorry, greg k-h _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers