> On May 16, 2023, at 5:25 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Tue, 2023-05-16 at 19:25 +0000, Chuck Lever III wrote:
>>
>>> On May 16, 2023, at 3:23 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>>
>>> On Tue, 2023-05-02 at 14:14 +0000, Chuck Lever III wrote:
>>>>
>>>>> On May 2, 2023, at 7:01 AM, Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:
>>>>>
>>>>> On 08. 01. 23, 17:31, Chuck Lever wrote:
>>>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>>> To navigate around the space that svcauth_gss_accept() reserves
>>>>>> for the RPC payload body length and sequence number fields,
>>>>>> svcauth_gss_release() does a little dance with the reply's
>>>>>> accept_stat, moving the accept_stat value in the response buffer
>>>>>> down by two words.
>>>>>> Instead, let's have the ->accept() methods each set the proper
>>>>>> final location of the accept_stat to avoid having to move
>>>>>> things.
>>>>>
>>>>> Hi,
>>>>>
>>>>> I bisected to this (4bcf0343e8)
>>>>
>>>> Assuming you did the bisect on the NFS server's kernel?
>>>>
>>>>
>>>>> as it breaks nfs3-only servers in 6.3. I.e. /etc/nfs.conf containing:
>>>>> [nfsd]
>>>>> vers4=no
>>>>
>>>> Note: Changing the settings in /etc/nfs.conf had no effect
>>>> on my server, so I effected the change by stopping the
>>>> server and poking values into /proc/fs/nfsd/versions by
>>>> hand.
>>>>
>>>> Steve?
>>>>
>>>>
>>>>> The client sees:
>>>>> mount("10.0.2.15:/tmp", "/mnt", "nfs", 0, "vers=4.2,addr=10.0.2.15,clientad"...) = -1 EIO (Input/output error)
>>>>> write(2, "mount.nfs: mount system call fai"..., 45
>>>>> mount.nfs: mount system call failed for /mnt
>>>>>
>>>>> And the kernel says:
>>>>> nfs4_discover_server_trunking unhandled error -5. Exiting with error EIO
>>>>>
>>>>> I reported in downstream as:
>>>>> https://bugzilla.suse.com/show_bug.cgi?id=1210995
>>>>>
>>>>> It cannot be reverted cleanly on the top of 6.3.
>>>>>
>>>>> Any ideas?
>>>>
>>>> I can reproduce a similar problem. Network capture shows
>>>> that the server is responding with NFS4ERR_NOENT to the
>>>> EXCHANGE_ID operation, and the client kernel log says:
>>>>
>>>>> nfs4_discover_server_trunking unhandled error -121. Exiting with error EIO
>>>>
>>>> That's not the failure mode I expected given the commit
>>>> you bisected to, so it might not be the same problem you've
>>>> hit. I'll troubleshoot this and send a fix for testing.
>>>>
>>>
>>> Alex hit this problem in testing too, and I took a quick look.
>>>
>>> In the attached capture, the client should have gotten back a
>>> RPC_PROG_MISMATCH error, but the server has recorded an extra successful
>>> accept state before encoding the RPC_PROG_MISMATCH error, leading to a
>>> malformed reply.
>>>
>>> I think that the problem is that encoding the accept status too early
>>> means that we can't properly handle failures from the pg_init_request
>>> call.
>>>
>>> Chuck, any thoughts on how you'd like to handle this?
>>
>> With this:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-fixes&id=29cd2927fb914cc53b5ba4f67d2b74695c994ba4
>>
>> I plan to send the fix to Linus tomorrow.
>>
>>
>
> Oh! I hadn't seen that cross the list. Did I miss it?

https://lore.kernel.org/linux-nfs/8cd5d041-77c3-51dd-a960-7fd8ce1d1271@xxxxxxxxxx/T/#t

--
Chuck Lever
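
For readers following the thread, the malformed reply Jeff describes can be sketched roughly as below. This is an illustration only, not the kernel's encoder: it assumes an accepted RPC reply with a zero-length AUTH_NONE verifier, and the XID and the version range (3, 3) are made-up values. The helper names are hypothetical. The constants are the ones defined in RFC 5531.

```python
import struct

# ONC RPC constants from RFC 5531.
REPLY = 1
MSG_ACCEPTED = 0
AUTH_NONE = 0
SUCCESS = 0
PROG_MISMATCH = 2


def reply_header(xid):
    """xid, msg_type, reply_stat, then an empty AUTH_NONE verifier (assumed)."""
    return struct.pack(">5I", xid, REPLY, MSG_ACCEPTED, AUTH_NONE, 0)


def prog_mismatch_body(low, high):
    """accept_stat = PROG_MISMATCH followed by the supported version range."""
    return struct.pack(">3I", PROG_MISMATCH, low, high)


def parse_accept_stat(reply):
    """Return (accept_stat, remaining 32-bit words) as a client would read them."""
    words = struct.unpack(f">{len(reply) // 4}I", reply)
    # words[0:5] are xid, msg_type, reply_stat, verifier flavor, verifier length.
    return words[5], list(words[6:])


xid = 0x1234  # illustrative value

# Well-formed: accept_stat is encoded once, as PROG_MISMATCH.
good = reply_header(xid) + prog_mismatch_body(3, 3)

# Malformed: a SUCCESS accept_stat was already written before the
# PROG_MISMATCH error was encoded, so the error lands one word too late.
bad = reply_header(xid) + struct.pack(">I", SUCCESS) + prog_mismatch_body(3, 3)

print(parse_accept_stat(good))
print(parse_accept_stat(bad))
```

In the malformed case the client sees accept_stat SUCCESS and treats the following words as the procedure's results. The first of those words is 2, which is also the numeric value of NFS4ERR_NOENT; that would be consistent with the capture Chuck describes, though the thread itself does not confirm the link.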