On Mon, Dec 10, 2018 at 12:49:02PM -0500, Chuck Lever wrote:
> > On Dec 10, 2018, at 12:47 PM, bfields@xxxxxxxxxxxx wrote:
> > In a little more detail, as starting point, I was considering
> > naming each client directory with a small integer, and including
> > files like:
> >
> >     info: a text file with
> >         NFS protocol version
> >         ascii representation of client address
> >         krb5 principal if available
> >
> >     clientid: NFSv4 client ID; file absent for NFSv2/3 clients.
> >
> >     locks: list of locks, following something like the /proc/locks
> >         format.
> >
> >     opens: list of file opens, with access bits, inode numbers,
> >         device number.
> >
> > Does that sound reasonable?  Any other ideas?
>
> How do you plan to make this kernel API namespace-aware?

I may have some details wrong, but:

We associate most of nfsd's state with the network namespace.

The "nfsd" pseudofilesystem is where I'm thinking I might put this, and
it inherits the network namespace from the process that calls mount and
stores it in the superblock.  So each mount done in a different net
namespace should get its own superblock, its own inodes, etc.  (I think
that's how proc works, too?)

So when you set up containerized nfsd, you mount a new nfsd filesystem
in each container.  And the list of clients visible there should only
be the ones visible to that namespace.

I guess that means that if you share an export across multiple
containers, then if you want to find all the clients locking a given
file, you have to iterate over all the containers' "nfsd" mounts.

I suspect that's what we have to do, though.  I mean, the client
addresses, for example, may not even make sense unless you know which
network namespace they come from.

(On the other hand... how does /proc/locks actually work?  Looks to me
like it always lists every lock on the system.  Can it really translate
any process on the system into a pid that makes sense in any container?
I'm not following that, from a brief look at the code.)

--b.
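
For reference, a rough sketch of the mount-time namespace capture
described above.  This is paraphrased from memory of fs/nfsd/nfsctl.c
and fs/super.c around v4.20, so names and details may not match the
tree exactly:

#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/nsproxy.h>
#include <net/net_namespace.h>

static int nfsd_fill_super(struct super_block *sb, void *data, int silent);

static struct dentry *nfsd_mount(struct file_system_type *fs_type,
	int flags, const char *dev_name, void *data)
{
	/* Capture the net namespace of the process doing the mount. */
	struct net *net = current->nsproxy->net_ns;

	/*
	 * mount_ns() reuses an existing superblock only if it was set
	 * up for the same namespace; otherwise it allocates a new one
	 * and stashes "net" in sb->s_fs_info.
	 */
	return mount_ns(fs_type, flags, data, net, net->user_ns,
			nfsd_fill_super);
}

/*
 * Any file under that mount can then recover the namespace from the
 * superblock rather than from "current":
 */
static inline struct net *netns(struct file *file)
{
	return file_inode(file)->i_sb->s_fs_info;
}

Because the namespace comes from the superblock rather than from the
reading process, a per-client directory exposed under a given "nfsd"
mount can naturally restrict itself to the clients belonging to that
mount's namespace.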