Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




More investigation...

After xdr_read_pages call the xdr stream points to the
beginning of RPC package and return the xid value on
the next read (which should be the size of the notification bitmap).


Regards,
   Tigran.

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> To: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
> Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>, "Anna Schumaker" <anna.schumaker@xxxxxxxxxx>
> Sent: Thursday, 26 November, 2020 18:17:41
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> I have added some debug info Ind got this:
> 
> decode_getdeviceinfo: layout type 4
> decode_getdeviceinfo: layout size 64
> decode_getdeviceinfo: layout notification bitmap size 1094994757
> 
> So it looks like that xdr_read_pages set xdr pointer to a wrong position
> and next read, which is bitmap size
> 
> 5857         pdev->mincount = be32_to_cpup(p);
> 5858         dprintk("%s: layout size %u\n", __func__, pdev->mincount);
> 5859         if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
> 5860                 return -EIO;
> 5861
> 5862         /* Parse notification bitmap, verifying that it is zero. */
> 5863         p = xdr_inline_decode(xdr, 4);
> 5864         if (unlikely(!p))
> 5865                 return -EIO;
> 5866         len = be32_to_cpup(p);
> 5867         dprintk("%s: layout notification bitmap size %u\n", __func__, len);
> 5868         if (len) {
> 5869                 uint32_t i;
> 
> 
> Tigran.
> 
> 
> ----- Original Message -----
>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
>> To: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
>> Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>, "Anna Schumaker"
>> <anna.schumaker@xxxxxxxxxx>
>> Sent: Tuesday, 17 November, 2020 15:50:57
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> Here is the result:
>> 
>> 
>> $ git bisect bad
>> c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
>> commit c567552612ece787b178e3b147b5854ad422a836
>> Author: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
>> Date:   Wed May 28 13:41:22 2014 -0400
>> 
>>    NFS: Add READ_PLUS data segment support
>>    
>>    This patch adds client support for decoding a single NFS4_CONTENT_DATA
>>    segment returned by the server. This is the simplest implementation
>>    possible, since it does not account for any hole segments in the reply.
>>    
>>    Signed-off-by: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
>> 
>> fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
>> fs/nfs/nfs4client.c       |   2 +
>> fs/nfs/nfs4proc.c         |  43 +++++++++++++-
>> fs/nfs/nfs4xdr.c          |   1 +
>> include/linux/nfs4.h      |   2 +-
>> include/linux/nfs_fs_sb.h |   1 +
>> include/linux/nfs_xdr.h   |   2 +-
>> 7 files changed, 187 insertions(+), 5 deletions(-)
>> 
>> 
>> Regards,
>>   Tigran.
>> 
>> 
>> 
>> ----- Original Message -----
>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
>>> To: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
>>> Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
>>> Sent: Monday, 16 November, 2020 21:55:50
>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>> channels
>> 
>>> Hi Trond,
>>> 
>>> I am afraid, that the fix didn't work. I bisecting it....
>>> 
>>> 
>>> Tigran.
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
>>>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
>>>> Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
>>>> Sent: Saturday, 14 November, 2020 15:29:01
>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>> channels
>>> 
>>>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>> > From: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
>>>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
>>>>> > Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
>>>>> > Sent: Friday, 13 November, 2020 23:45:00
>>>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>>>> > file+flexfiles data channels
>>>>> 
>>>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>>>> > > 
>>>>> > > After more testing, it looks like that client doesn't like
>>>>> > > notification bitmap:
>>>>> > > 
>>>>> > > 
>>>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>>>> > > 000000001d17c43e
>>>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>>>> > > highest_used=4294967295 max_slots=16
>>>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>>>> > > slotid=0
>>>>> > > [31576.789527] encode_sequence:
>>>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>>>> > > max_slotid=0 cache_this=0
>>>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>>>> > 
>>>>> > According to this, you appear to be returning a deviceinfo bitmap
>>>>> > with
>>>>> > at least one non-zero entry that is not in the first 32-bit word.
>>>>> > We
>>>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>>>> > zero
>>>>> > entries.
>>>>> 
>>>>> 
>>>>> according to packet capture only bitmap[0] has non zero bits set.
>>>>> This is the reply of compound starting from nfs staus code, tag
>>>>> length and so on.
>>>>> 
>>>>> 
>>>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>>>> 0090   00 00 00 00
>>>>> 
>>>>> 
>>>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>>>> 
>>>>> 
>>>>> This part of code in the didn't change since 2010, and I
>>>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>>>> tests with 5.9 did pass as expected. I will try to bisec it.
>>>> 
>>>> I don't think I've introduced this bug. I did not touch anything in the
>>>> getdeviceinfo proc or XDR code.
>>>> Does the following patch help?
>>>> 
>>>> 8<-------------------------------------------------------
>>>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>>>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>>>> reply
>>>> 
>>>> We can fit the device_addr4 opaque data padding in the pages.
>>>> 
>>>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>> ---
>>>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>> 
>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>> index c6dbfcae7517..c8714381d511 100644
>>>> --- a/fs/nfs/nfs4xdr.c
>>>> +++ b/fs/nfs/nfs4xdr.c
>>>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>>>> *req,
>>>> 	struct compound_hdr hdr = {
>>>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>>>> 	};
>>>> +	uint32_t replen;
>>>> 
>>>> 	encode_compound_hdr(xdr, req, &hdr);
>>>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>>>> +
>>>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>>>> +
>>>> 	encode_getdeviceinfo(xdr, args, &hdr);
>>>> 
>>>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>>>> -	 * so that notification bitmap is put in xdr_buf tail */
>>>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>>>> +	 * pages */
>>>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>>>> -				args->pdev->pglen, hdr.replen - 2);
>>>> +				args->pdev->pglen, replen + 2);
>>>> 	encode_nops(&hdr);
>>>> }
>>>> 
>>>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>>>> 	 * and places the remaining xdr data in xdr_buf->tail
>>>> 	 */
>>>> 	pdev->mincount = be32_to_cpup(p);
>>>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>>>> +	/* Calculate padding */
>>>> +	len = xdr_align_size(pdev->mincount);
>>>> +	if (xdr_read_pages(xdr, len) != len)
>>>> 		return -EIO;
>>>> 
>>>> 	/* Parse notification bitmap, verifying that it is zero. */
>>>> --
>>>> 2.28.0
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Trond Myklebust
>>>> Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@xxxxxxxxxxxxxxx




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux