Re: Data corruption with 5.10.x client -> 6.5.x server

> On Sep 24, 2023, at 12:51 PM, Mantas Mikulėnas <grawity@xxxxxxxxx> wrote:
> 
> On 2023-09-24 17:44, Chuck Lever III wrote:
>>> On Sep 24, 2023, at 10:32 AM, Mantas Mikulėnas <grawity@xxxxxxxxx> wrote:
>>> 
>>> On 2023-09-24 16:28, Chuck Lever III wrote:
>>>>> On Sep 24, 2023, at 9:07 AM, Mantas Mikulėnas <grawity@xxxxxxxxx> wrote:
>>>>> 
>>>>> I've recently upgraded my home NFS server from 6.4.12 to 6.5.4 (running Arch Linux x86_64).
>>>>> 
>>>>> Now, when I'm accessing the server over NFSv4.2 from a client that's running 5.10.0 (32-bit x86, Debian 11), if the mount is using sec=krb5i or sec=krb5p, trying to read a file that's <= 4092 bytes in size will return all-zero data. (That is, `hexdump -C file` shows "00 00 00...") Files that are 4093 bytes or larger seem to be unaffected.
>>>>> 
>>>>> Only sec=krb5i/krb5p are affected by this – plain sec=krb5 (or sec=sys for that matter) seems to work without any problems.
>>>>> 
>>>>> Newer clients (like 6.1.x or 6.4.x) don't seem to have any issues; only 5.10.0 does. It might also matter that the client is 32-bit, but the same client worked previously when the server was running older kernels, so I still suspect that 6.5.x on the server is the problem.
>>>>> 
>>>>> Upgrading to 6.6.0-rc2 on the server hasn't changed anything.
>>>>> The server is using Btrfs but I've tested with tmpfs as well.
>>>> I'm guessing proto=tcp as well (as opposed to proto=rdma).
>>> 
>>> Yes, it's TCP.
>>> 
>>> (I do have RDMA set up between two of the 6.5.x server systems, but in this case all the clients I've tested were TCP-only, and the home server that I originally noticed the problem with doesn't have RDMA at all.)
>>> 
>>>> Does the problem go away with vers=4.1 ?
>>> 
>>> No, it doesn't (nor with 4.0).
>>> 
>>>> Can you capture network traffic during the failure? Use sec=krb5i so
>>>> we can see the RPC payloads. On the client:
>>>> # tcpdump -iany -s0 -w/tmp/sniffer.pcap
>>> 
>>> Attached. (The script I've been using for testing mounts with -o sec=krb5i, cats three files, then unmounts.)
>>> 
>>> <nfs_krb5i.pcap>
>> I see three NFS READs in the capture.
>> The first READ payload is all zeroes. The second payload contains
>> "Hello World (4093 bytes)" repeatedly, and the third contains
>> "Hello World (4096 bytes)" repeatedly.
> 
> Right, whereas on the server, the first file is filled with "Hello World (4092 bytes)" as I originally tried to narrow down the issue.
> 
> Meanwhile, 6.4.x (Arch) clients don't seem to be having any problems with the same server, and with seemingly the same mount options.
> 
> Thanks for looking into it!
> 
> <nfs_krb5i_working_6.4client.pcap>

I found /a/ problem with the nfsd-fixes branch and krb5i, but
maybe not /your/ problem, and it's with a recent client. Scrounging
a v5.10-vintage client is a little more work; we'll see if that's
needed for confirming an eventual fix.
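
For anyone wanting to repeat the size-boundary test described in the
quoted report, here is a minimal sketch, not the reporter's actual
script. It assumes files of 4092, 4093, and 4096 bytes filled with a
repeated "Hello World (N bytes)" line, as seen in the capture. The
file names and the helper names (pattern, write_test_files,
check_file) are hypothetical. Run write_test_files() against the
exported directory on the server, then check_file() against the
krb5i/krb5p mount on the client.

```python
import os
import tempfile

# File sizes mentioned in the thread: 4092 reads back as zeroes, 4093 and 4096 do not.
SIZES = (4092, 4093, 4096)


def pattern(size):
    """Return exactly `size` bytes of a repeated 'Hello World (N bytes)' line."""
    line = "Hello World ({} bytes)\n".format(size).encode()
    reps = size // len(line) + 1
    return (line * reps)[:size]


def write_test_files(directory):
    """Create one test file per size in `directory`; return the paths in SIZES order."""
    paths = []
    for size in SIZES:
        # Hypothetical naming scheme, chosen only for this sketch.
        path = os.path.join(directory, "hello_{}.txt".format(size))
        with open(path, "wb") as f:
            f.write(pattern(size))
        paths.append(path)
    return paths


def check_file(path, size):
    """Classify what a read of `path` returns relative to the expected pattern."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != size:
        return "wrong-length"
    if data == pattern(size):
        return "ok"
    if data.count(0) == len(data):
        return "all-zero"  # the symptom reported for files <= 4092 bytes
    return "mismatch"


if __name__ == "__main__":
    # Local self-check only; point the two functions at the server export
    # and the client mount point to exercise the real read path.
    with tempfile.TemporaryDirectory() as d:
        for path, size in zip(write_test_files(d), SIZES):
            print(size, check_file(path, size))
```

On an affected client, the expectation from the report is "all-zero"
for the 4092-byte file and "ok" for the other two.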


--
Chuck Lever
