On Wed, Jul 31, 2019 at 08:07:53AM -0400, Scott Mayhew wrote: > On Tue, 30 Jul 2019, J. Bruce Fields wrote: > > > On Tue, Jul 30, 2019 at 05:08:45PM -0400, Scott Mayhew wrote: > > > At the spring bakeathon, Chuck suggested that we should store the > > > kerberos principal in addition to the client id string in nfsdcld. The > > > idea is to prevent an illegitimate client from reclaiming another > > > client's opens by supplying that client's id string. This is an initial > > > attempt at doing that. > > > > Great to see this, thanks! > > > > > Initially I had played with the idea of hanging a version number off > > > either the cld_net or nfsd_net structure, exposing that via a proc file, > > > and having the userspace daemon write the desired version number to the > > > proc file prior to opening the rpc_pipefs file. That was potentially > > > problematic if nfsd was already trying to queue an upcall using a > > > version that nfscld didn't support... so I like the GetVersion upcall > > > idea better. > > > > Sounds OK to me. > > > > It queries the version once on nfsd startup and sticks with that > > version. We allow rpc.mountd to be restarted while nfsd is running. So > > the one case I think it wouldn't handle is the case where somebody > > downgrades mountd while nfsd is running. > > I'm assuming you meant nfsdcld rather than mountd here... Oops, right. > currently if > someone were to downgrade nfsdcld while nfsd is running then that case > wouldn't be handled, so a restart of nfsd would also be necessary. Right. > > My feeling is that handling that case would be way too much trouble, so > > actually I'm going to consider that a plus. But it might be worth > > documenting (if only in a patch changelog). > > To handle that scenario, We'd need to change the database schema. I'd > really rather do that in an external script than stick downgrade logic > into nfsdcld... mainly because I'd want to check if there was any data > in the columns being eliminated and warn the user & ask for > confirmation. > > We'd also need to change how nfsdcld reacts when it gets a message > version it doesn't support. Currently it just reopens the pipe file, > which causes the upcall to fail with -EAGAIN, which causes nfsd to retry > the upcall, and it just goes into a loop. I could implement a "version > not supported" downcall. I'm not sure what error number to use... maybe > -EPROTONOSUPP? Also even if we implemented a "version not supported" > downcall now, that wouldn't handle the problem with existing nfsdcld's. > The only thing I can think of there is to add a counter to > cld_pipe_upcall() and exit the loop after a certain number of iterations > (10? 100? 1000?). This sounds too complicated. Let's just add a note that downgrades require an nfsd restart. > > > The second patch actually adds the v2 message, which adds the principal > > > (actually just the first 1024 bytes) to the Cld_Create upcall and to the > > > Cld_GraceStart downcall (which is what loads the data in the > > > reclaim_str_hashtbl). I couldn't really figure out what the maximum length > > > of a kerberos principal actually is (looking at krb5.h the length field in > > > the struct krb5_data is an unsigned int, so I guess it can be pretty big). > > > I don't think the principal will be that large in practice, and even if > > > it is the first 1024 bytes should be sufficient for our purposes. > > > > How does it fail when principals are longer? Does it error out, or > > treat two principals as equal if they agree in the first 1024 bytes? > > It treats them as equal. Got it, thanks. --b.