Re: NFSv4.0 callback with Kerberos not working

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Mon, 19 Sep 2022 20:15:13 +0000

> On Sep 19, 2022, at 3:32 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> 
> On Mon, Sep 19, 2022 at 2:16 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>> 
>> 
>> 
>>> On Sep 19, 2022, at 1:59 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> On Mon, 2022-09-19 at 15:31 +0000, Chuck Lever III wrote:
>>>> Hi-
>>>> 
>>>> I rediscovered recently that NFSv4.0 with Kerberos does not work on
>>>> multi-homed hosts. This is true even with sec=sys because the client
>>>> attempts to establish a GSS context when there is a keytab present.
>>>> 
>>>> Basically my test environment has to work for sec=sys and sec=krb*
>>>> and for all NFS versions and minor versions. Thus I keep a keytab
>>>> on it.
>>>> 
>>>> Now, I have three network interfaces on my client: one RoCE, one
>>>> IB, and one GbE. They are each on their own subnet and each has
>>>> a unique hostname (that varies in the domain part).
>>>> 
>>>> But mounting one of my IB or RoCE test servers with NFSv4.0 results
>>>> in the infamous "NFSv4: Invalid callback credential" message. The
>>>> client always uses the principal for the GbE interface.
>>>> 
>>>> This was working at one point, but seems to have devolved over time.
>>>> 
>>>> 
>>>> Here are some of the problems I found:
>>>> 
>>>> 1. The kernel always asks for service=* .
>>>> 
>>>> If your system's keytab has only "nfs" service principals in it,
>>>> that should be OK. If it has a "host" principal in it, that's
>>>> going to be the first one that gssd picks up.
>>>> 
>>>> NFSv4.0 callback does not work with a host@ acceptor -- it wants
>>>> nfs@.
>>>> 
>>>> There are two possible workarounds:
>>>> 
>>>> a. Remove all but the nfs@ keys from your system's keytab.
>>>> 
>>>> b. Modify the kernel to use "service=nfs" in the upcall.
>>>> 
>>> 
>>> There's also
>>> 
>>> c. Put the nfs service principal in its own keytab and use the '-k'
>>> option to tell rpc.gssd where to find it.
>>> 
>>> However note that 'host/<hostname@REALM>' is normally the expected
>>> principal name for authenticating as a specific hostname. So I'd expect
>>> clients to want to authenticate using that credential so that it is
>>> matched to the hostname entry in /etc/exports on the server.
>>> 
>>> The 'nfs/<hostname@REALM>' would normally be considered an NFS service
>>> principal name, so should really be used by the NFSv4 server to
>>> identify its service (see RFC5661 Section 2.2.1.1.1.3.) rather than
>>> being used by the NFS client.
>> 
>> Fair enough, we can leave the client's service name alone.
>> 
>> 
>>> The same principal is also used by the NFSv4 server to identify itself
>>> when acting as a client to the NFS callback service according to
>>> RFC7530 section 3.3.3.
>>> So what I'm saying is that for the standard NFS client, '*' is
>>> probably the right thing to use (with a slight preference for 'host/'),
>>> but for the NFS server use case of connecting to the callback service,
>>> it should specify the 'nfs/' prefix. It can do that right now by
>>> setting the clnt->cl_principal. As far as I can tell, the current
>>> behaviour in knfsd is to set it to the same prefix as the server
>>> svc_cred, and to default to 'nfs/' if the server svc_cred doesn't have
>>> such a prefix.
>> 
>> The server uses the client-provided service name in this case.
>> If the client authenticates as "host@" then the server will
>> authenticate to the "host@" service on the backchannel.
>> 
>> Maybe the only mismatch is that my server is using
>> "host@xxxxxxxxxxxxxxxxxxxxx" on the backchannel, and it should
>> be using "host@xxxxxxxxxxxxxxxxxx" instead?
> 
> Given that the spec says: "therefore, the realm name for the server
>   principal must be the same for the callback as it was for the
>   SETCLIENTID." Doesn't it mean that the server needs to use the same
> domain/realm name as what the client authenticated to in the
> forechannel (i.e., the server should be using the @client.ib.example.net
> realm for the callback channel)?

Yes.

If the server is using the client's acceptor, then it should
authenticate to whatever the client sent it. The server should
use @client.ib.example.net only if that's what the client sent.
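
To make that rule concrete, here is a rough Python sketch. It is
only an illustration of the matching logic, not what the Linux
server or rpc.gssd actually does. The helper names are made up
for this example, and "client.example.net" is a hypothetical
stand-in for the GbE hostname.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    service: str   # e.g. "host" or "nfs"
    hostname: str  # e.g. "client.ib.example.net"


def parse_principal(name: str) -> Principal:
    """Split a host-based service name of the form service@hostname."""
    service, sep, hostname = name.partition("@")
    if not sep or not service or not hostname:
        raise ValueError(f"not a service@hostname name: {name!r}")
    return Principal(service, hostname)


def callback_target(setclientid_principal: str) -> str:
    """Name the server should authenticate to on the backchannel.

    Assumption for this sketch: the server simply reuses whatever
    principal the client presented at SETCLIENTID time, unchanged.
    """
    p = parse_principal(setclientid_principal)
    return f"{p.service}@{p.hostname}"


def callback_credential_ok(setclientid_principal: str,
                           callback_principal: str) -> bool:
    """True only if both the service and the hostname parts match."""
    return (parse_principal(setclientid_principal)
            == parse_principal(callback_principal))


# The client mounted over IB, so that is the principal it sent.
sent = "host@client.ib.example.net"
print(callback_target(sent))
# A callback aimed at the (hypothetical) GbE name does not match.
print(callback_credential_ok(sent, "host@client.example.net"))
```

In this sketch the last call prints False: the service part is
the same, but the hostname part differs, which is the kind of
mismatch discussed above.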

The service name component was a red herring.

I'm looking into the Linux server's behavior now, but I have to
revert all my debugging crap to get a clear picture.

--
Chuck Lever