On Thu, 2017-05-18 at 16:09 +0000, Trond Myklebust wrote:
> On Thu, 2017-05-18 at 11:28 -0400, bfields@xxxxxxxxxxxx wrote:
> > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote:
> > > For the case that Stefan is discussing (kvm) it would literally be a
> > > single process that is being migrated. For lxc and
> > > docker/kubernetes-style containers, it would be a collection of
> > > processes.
> > >
> > > The mountpoints used by these containers are often owned by the host;
> > > they are typically set up before starting the containerised processes.
> > > Furthermore, there is typically no "start container" system call that
> > > we can use to identify which set of processes (or cgroups) are
> > > containerised, and should share a clientid.
> >
> > Is that such a hard problem?
> >
>
> Err, yes... isn't it? How do I identify a container and know where to
> set the lease boundary?
>
> Bear in mind that the definition of "container" is non-existent beyond
> the obvious "a loose collection of processes". It varies from the
> docker/lxc/virtuozzo style container, which uses namespaces to bound
> the processes, to the Google type of "container" that is actually just
> a set of cgroups and to the kvm/qemu single process.
>
> > In any case, from the protocol point of view these all sound like
> > client implementation details.
>
> If you are seeing an obvious architecture for the client, then please
> share...
>
> > The only problem I see with multiple client ID's is that you'd like to
> > keep their delegations from conflicting with each other so they can
> > share cache.
> >
> > But, maybe I'm missing something else.
>
> Having to an EXCHANGE_ID + CREATE_SESSION on every call to
> fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each process
> destructor? Lease renewal pings from 1000 processes running on 1000
> clients?
>
> This is what I mean about container boundaries.
> If they aren't well defined, then we're down to doing precisely the
> above.
>

This is the crux of the problem with containers in general. We've been
pretending for a long time that the kernel doesn't really need to
understand them and can just worry about namespaces, but that really
hasn't worked out well so far.

I think we need to consider making a "container" a first-class object in
the kernel. Note that that would also help solve the long-standing
problem of how to handle usermode helper upcalls in containers.

I do happen to know of one kernel developer (cc'ed here) who has been
working on something along those lines...

-- 
Jeff Layton <jlayton@xxxxxxxxxx>