> On Aug 17, 2018, at 12:54 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Thu, 2018-08-16 at 15:03 -0400, Chuck Lever wrote: >>> On Aug 16, 2018, at 12:27 PM, Trond Myklebust < >>> trondmy@xxxxxxxxxxxxxxx> wrote: >>> >>> On Thu, 2018-08-16 at 12:05 -0400, Chuck Lever wrote: >>>> A multi-homed NFS server may have more than one "nfs" key in its >>>> keytab. Enable the kernel to pick the key it wants as a machine >>>> credential when establishing a GSS context. >>>> >>>> This is useful for GSS-protected NFSv4.0 callbacks, which are >>>> required by RFC 7530 S3.3.3 to use the same principal as the >>>> service >>>> principal the client used when establishing its lease. >>>> >>>> A complementary modification to rpc.gssd is required to fully >>>> enable >>>> this feature. >>>> >>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >>>> --- >>>> net/sunrpc/auth_gss/auth_gss.c | 20 +++++++++++++++++--- >>>> 1 file changed, 17 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c >>>> b/net/sunrpc/auth_gss/auth_gss.c >>>> index be8f103..1943e11 100644 >>>> --- a/net/sunrpc/auth_gss/auth_gss.c >>>> +++ b/net/sunrpc/auth_gss/auth_gss.c >>>> @@ -284,7 +284,12 @@ struct gss_auth { >>>> return p; >>>> } >>>> >>>> -#define UPCALL_BUF_LEN 128 >>>> +/* XXX: Need some documentation about why UPCALL_BUF_LEN is so >>>> small. >>>> + * Is user space expecting no more than UPCALL_BUF_LEN >>>> bytes? >>>> + * Note that there are now _two_ NI_MAXHOST sized data >>>> items >>>> + * being passed in this string. >>>> + */ >>>> +#define UPCALL_BUF_LEN 256 >>>> >>> >>> Why? The services are currently "nfs" or "nfsd". Hostnames are >>> normally >>> < 64 characters. >> >> True, but that's an average. We actually should be accommodating the >> largest possible hostname here: /usr/include/netdb.h defines >> NI_MAXHOST >> as 1024. Thus even 256 is too small, but as you point out it will >> work >> fine in most real world cases. >> >> >>>> struct gss_upcall_msg { >>>> refcount_t count; >>>> @@ -462,8 +467,17 @@ static int gss_encode_v1_msg(struct >>>> gss_upcall_msg *gss_msg, >>>> p += len; >>>> gss_msg->msg.len += len; >>>> } >>>> - if (service_name != NULL) { >>>> - len = scnprintf(p, buflen, "service=%s ", >>>> service_name); >>>> + if (service_name) { >>>> + char *c = strchr(service_name, '@'); >>>> + >>>> + if (!c) >>>> + len = scnprintf(p, buflen, "service=%s ", >>>> + service_name); >>>> + else >>>> + len = scnprintf(p, buflen, >>>> + "service=%.*s srchost=%s ", >>>> + (int)(c - service_name), >>>> + service_name, c + 1); >>>> buflen -= len; >>>> p += len; >>>> gss_msg->msg.len += len; >>> >>> Isn't this just duplicating the functionality of the 'target' >>> argument? >> >> I might not understand your question, but I believe the answer is no. >> >> I have a multi-homed client and server. The primary hostname of the >> client is "manet.example.net" and the server is "klimt.example.net" >> Both have IB interfaces, respectively "manet.ib.example.net" and >> "klimt.ib.example.net". >> >> Now I do this: >> >> manet# mount -o vers=4.0,sec=sys klimt.ib.example.net:/export /mnt >> >> The mount is AUTH_SYS, but lease management is done using krb5i >> because >> both systems have a keytab. For the callback channel, RFC 7530 >> requires >> that the server use the client's lease management auth flavor. >> Moreover, >> the server is required to use the same acceptor that the client >> authenticated to (in this case, "klimt.ib.example.net"). >> >> So now the server establishes a GSS context for it's callback >> channel: >> >> Aug 16 11:00:17 klimt rpc.gssd[1186]: handle_gssd_upcall: 'mech=krb5 >> uid=0 target=host@xxxxxxxxxxxxxxxxx service=nfs >> srchost=klimt.ib.example.net enctypes=18,17,16,23,3,1,2 ' >> (nfsd4_cb/clnt4) >> >> The "target" is the principal of the remote service the server is >> authenticating to. In full, the target is: >> >> host/manet.example.net@xxxxxxxxxxx >> >> The "srchost" is the hostname part of the server's service principal. >> This >> has to match the acceptor string provided by the client, thus the >> full >> source principal has to be: >> >> nfs/klimt.ib.example.net@xxxxxxxxxxx >> >> gssd uses the service= and srchost= to select the correct key in its >> keytab when it establishes a GSS context for the callback channel. >> Without >> srchost it would select the key for klimt.example.net, which is the >> wrong key in this case. Since this commit: >> >> commit f11b2a1cfbf5dd783eb55cb470509d06e20d1c78 >> Author: Jeff Layton <jlayton@xxxxxxxxxxxxxxx> >> AuthorDate: Sat Jun 21 20:52:17 2014 -0400 >> Commit: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >> CommitDate: Sat Jul 12 18:41:25 2014 -0400 >> >> nfs4: copy acceptor name from context to nfs_client >> >> the client rejects NFSv4.0 callbacks using krb5 with the wrong >> acceptor >> string with "NFSv4 callback contains invalid cred". This occurs >> _after_ >> the server has established a GSS context and a connection to the >> client, >> so it has already granted delegations which it now cannot recall. >> > > The above story really does deserve an explanation in the code for the > upcall. > Right now, you have to hunt through nested layers of struct copies to > figure out that 'target' refers to the service on the server side > callback channel for NFSv4.0 only. A second hostname field for the same > use case just adds to that confusion. I can send an additional patch that adds one or more comments documenting the function of target=, service=, and srchost=. -- Chuck Lever