Amir Goldstein <amir@xxxxxxxxxxx> writes:

> Excellent! let's focus the discussion on a new device driver we want
> to write which is namespace aware. let's call this device driver
> valarm-dev. Similarly to Android's alarm-dev, valarm-dev can be used
> to request RTC wakeup calls from user space and get/set RTC values,
> but with valarm-dev, every container may use different values for
> current time.
>
> As you can see in our patch set, we already have a version of
> alarm-dev that maintains its state inside a context, instead of in
> global variable, so it is capable of providing different context per
> namespace.
>
> And now for the 1M$ question: per *which* namespace do we attribute
> the current realtime clock time?

To none of them. Just use a different minor per instance, then you
don't have a hard question to answer.

> To UTS namespace (because T historically stands for Time)? To device
> namespace?
> Even if device namespace would exist, we do not want to tie the policy
> decision of "separate time" to a very wide definition of "separate
> devices".
>
> So what we want to create, is an API for device driver writers, that
> will enable to write a namespace aware device and allow userspace to
> configure when the namespace aware device context is unshared.
> We would like to share with you our very initial thoughts about how
> this will be implemented:
> - Extend register_pernet_subsys/device(ops) API
>   to register_perns_subsys/device(nstype, ops) API
> - Extend pernet_operations to perns_operations that include optional
>   migrate() and/or unshare() ops
> - Let valarm-dev register_peruser_subsys/device(&alarm_userns_ops)

For the network subsystem that makes sense. But it doesn't make sense
for devices. It is just an unneeded extra complication.

> - Implement a new syscall (or netlink command if it makes more sense)
>   setdevns(int dev_fd, int ns_fd, int nstype, int flags)

ioctl? master device? How do people communicate with raw devices these
days?

> - Unlike the netlink set netns case, this API is not used solely to
>   *move* a device to a different namespace, but also to *unshare* a
>   device context between namespaces, for those devices that
>   registered unshare() ops.

I really think this all makes most sense handled one driver (or one
virtual driver) at a time.

> This is our missing piece of the puzzle.
> After that, whether we make changes to existing drivers (e.g. evdev)
> or write new virtualized drivers (e.g. vevdev) is a technicality.
> We care not which way to go, whichever way seems more maintainable.
>
> What do you think of this master plan?

I think by making your devices' behavior depend on which namespace they
are in you are making the drivers unnecessarily fragile, and
unnecessarily unusable. I think the code will be simpler/cleaner/better
if you don't need to have context outside of your drivers.

> P.S. Please try to refrain from addressing the validity of the use
> case of alarm-dev in particular, as we do not wish to get engage
> "Android sucks" wars.
> We simply want to present the case for improving the namespace
> infrastructure to cater the needs of device driver writers that wish
> to tailor their drivers for containers based products.

I think this is a driver interface problem, not a namespace problem.
None of the similar drivers that exist in the network namespace change
their behavior depending on which namespace they are in.

The two practical choices I see are:

1) Use a bunch of minors for your driver.
2) Act roughly like /dev/pts and use different mounts of the
   filesystem to create new instances.

I think different minors is probably easier, but we have two successful
models I am aware of so I have mentioned both.

Eric
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
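For illustration only, here is a minimal userspace sketch of the
"different minor per instance" idea from the reply above: each minor
indexes its own context, so the clock offset of one instance never
leaks into another and no namespace lookup is needed. The names
valarm_ctx, valarm_table, VALARM_MAX_MINORS and the three helpers are
hypothetical and are not taken from the patch set. In a real character
driver the lookup would presumably happen in open(), keyed on
iminor(inode), with the result stored in file->private_data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on the number of instances (one per minor). */
#define VALARM_MAX_MINORS 4

/* Hypothetical per-instance state: all "time" context lives here,
 * inside the driver, instead of hanging off a namespace. */
struct valarm_ctx {
    int64_t time_offset_sec; /* instance time minus host time */
};

/* One context per minor; zero-initialized, so every instance
 * starts out agreeing with the host clock. */
static struct valarm_ctx valarm_table[VALARM_MAX_MINORS];

/* Map a minor number to its context, or NULL if out of range. */
struct valarm_ctx *valarm_ctx_for_minor(unsigned int minor)
{
    if (minor >= VALARM_MAX_MINORS)
        return NULL;
    return &valarm_table[minor];
}

/* Set this instance's idea of "now" without touching the others. */
void valarm_set_time(struct valarm_ctx *ctx, int64_t host_now_sec,
                     int64_t new_time_sec)
{
    assert(ctx != NULL);
    ctx->time_offset_sec = new_time_sec - host_now_sec;
}

/* Read this instance's time given the current host time. */
int64_t valarm_get_time(const struct valarm_ctx *ctx, int64_t host_now_sec)
{
    assert(ctx != NULL);
    return host_now_sec + ctx->time_offset_sec;
}
```

With this layout, a container manager would only need to hand each
container the device node for its own minor; the driver itself stays
unaware of namespaces.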