Re: EXCHANGE_ID with same network address but different server owner

Jeff Layton <jlayton@xxxxxxxxxx> · Mon, 22 May 2017 10:25:28 -0400

On Thu, 2017-05-18 at 16:09 +0000, Trond Myklebust wrote:
> On Thu, 2017-05-18 at 11:28 -0400, bfields@xxxxxxxxxxxx wrote:
> > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote:
> > > For the case that Stefan is discussing (kvm) it would literally be a
> > > single process that is being migrated. For lxc and
> > > docker/kubernetes-style containers, it would be a collection of
> > > processes.
> > > 
> > > The mountpoints used by these containers are often owned by the host;
> > > they are typically set up before starting the containerised processes.
> > > Furthermore, there is typically no "start container" system call that
> > > we can use to identify which set of processes (or cgroups) are
> > > containerised, and should share a clientid.
> > 
> > Is that such a hard problem?
> > 
> 
> Err, yes... isn't it? How do I identify a container and know where to
> set the lease boundary?
> 
> Bear in mind that the definition of "container" is non-existent beyond
> the obvious "a loose collection of processes". It varies from the
> docker/lxc/virtuozzo style container, which uses namespaces to bound
> the processes, to the Google type of "container" that is actually just
> a set of cgroups and to the kvm/qemu single process.
> 
> > In any case, from the protocol point of view these all sound like
> > client implementation details.
> 
> If you are seeing an obvious architecture for the client, then please
> share...
> 
> > The only problem I see with multiple client IDs is that you'd like to
> > keep their delegations from conflicting with each other so they can
> > share cache.
> > 
> > But, maybe I'm missing something else.
> 
> Having to do an EXCHANGE_ID + CREATE_SESSION on every call to
> fork()/clone() and a DESTROY_SESSION/DESTROY_CLIENTID in each process
> destructor? Lease renewal pings from 1000 processes running on 1000
> clients?
> 
> This is what I mean about container boundaries. If they aren't well
> defined, then we're down to doing precisely the above.
> 

This is the crux of the problem with containers in general.

We've been pretending for a long time that the kernel doesn't really
need to understand them and can just worry about namespaces, but that
really hasn't worked out well so far.

I think we need to consider making a "container" a first-class object in
the kernel. Note that that would also help solve the long-standing
problem of how to handle usermode helper upcalls in containers.

I do happen to know of one kernel developer (cc'ed here) who has been
working on something along those lines...
-- 
Jeff Layton <jlayton@xxxxxxxxxx>