Hi Trond,

I am afraid the fix didn't work. I am bisecting it...

Tigran.

----- Original Message -----
> From: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
> Sent: Saturday, 14 November, 2020 15:29:01
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>
>>
>> ----- Original Message -----
>> > From: "trondmy" <trondmy@xxxxxxxxxxxxxxx>
>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
>> > Cc: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
>> > Sent: Friday, 13 November, 2020 23:45:00
>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>> > file+flexfiles data channels
>>
>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>> > >
>> > > After more testing, it looks like the client doesn't like the
>> > > notification bitmap:
>> > >
>> > >
>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>> > > 000000001d17c43e
>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>> > > highest_used=4294967295 max_slots=16
>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>> > > slotid=0
>> > > [31576.789527] encode_sequence:
>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>> > > max_slotid=0 cache_this=0
>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>> >
>> > According to this, you appear to be returning a deviceinfo bitmap
>> > with at least one non-zero entry that is not in the first 32-bit
>> > word. We only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have
>> > non-zero entries.
>>
>>
>> According to the packet capture, only bitmap[0] has non-zero bits set.
>> This is the compound reply, starting from the NFS status code, tag
>> length and so on.
>>
>>
>> 0000 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>> 0010 00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>> 0020 00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>> 0030 00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>> 0040 00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>> 0050 74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>> 0060 31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>> 0070 00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>> 0080 00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>> 0090 00 00 00 00
>>
>>
>> The last 12 bytes: bitmap size, bitmap[0], bitmap[1]
>>
>>
>> This part of the code hasn't changed since 2010, and I
>> have no issues using the 5.8 kernel. I am pretty sure that
>> tests with 5.9 passed as expected. I will try to bisect it.
>
> I don't think I've introduced this bug. I did not touch anything in the
> getdeviceinfo proc or XDR code.
> Does the following patch help?
>
> 8<-------------------------------------------------------
> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> Date: Fri, 13 Nov 2020 21:42:16 -0500
> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
> reply
>
> We can fit the device_addr4 opaque data padding in the pages.
>
> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> ---
>  fs/nfs/nfs4xdr.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index c6dbfcae7517..c8714381d511 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
> *req,
>  	struct compound_hdr hdr = {
>  		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>  	};
> +	uint32_t replen;
>
>  	encode_compound_hdr(xdr, req, &hdr);
>  	encode_sequence(xdr, &args->seq_args, &hdr);
> +
> +	replen = hdr.replen + op_decode_hdr_maxsz;
> +
>  	encode_getdeviceinfo(xdr, args, &hdr);
>
> -	/* set up reply kvec. Subtract notification bitmap max size (2)
> -	 * so that notification bitmap is put in xdr_buf tail */
> +	/* set up reply kvec. device_addr4 opaque data is read into the
> +	 * pages */
>  	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
> -				args->pdev->pglen, hdr.replen - 2);
> +				args->pdev->pglen, replen + 2);
>  	encode_nops(&hdr);
>  }
>
> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>  	 * and places the remaining xdr data in xdr_buf->tail
>  	 */
>  	pdev->mincount = be32_to_cpup(p);
> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
> +	/* Calculate padding */
> +	len = xdr_align_size(pdev->mincount);
> +	if (xdr_read_pages(xdr, len) != len)
>  		return -EIO;
>
>  	/* Parse notification bitmap, verifying that it is zero. */
> --
> 2.28.0
>
>
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
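
For reference, the two checks discussed in this thread can be tried outside the kernel. The sketch below is a user-space illustration only, not the kernel code: be32_at(), xdr_pad4(), check_notify_bitmap() and capture_tail are hypothetical names introduced here. It assumes XDR's 4-byte unit, so opaque data is padded to a multiple of 4, and it assumes the notification bitmap is encoded as a word count followed by that many big-endian 32-bit words, as in the last 12 bytes of the capture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian (XDR) 32-bit word from a byte buffer. */
static uint32_t be32_at(const uint8_t *buf, size_t off)
{
	assert(buf != NULL);
	return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
	       ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/* Round a byte count up to the next multiple of 4 (the XDR unit). */
static size_t xdr_pad4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: validate a notification bitmap laid out as
 * <word count> <word 0> ... <word n-1>.
 * Returns 0 if every word after word 0 is zero, -1 otherwise
 * (including when the buffer is too short for the declared count).
 */
static int check_notify_bitmap(const uint8_t *buf, size_t len)
{
	uint32_t nwords, i;

	if (len < 4)
		return -1;
	nwords = be32_at(buf, 0);
	if ((len - 4) / 4 < nwords)
		return -1;
	for (i = 1; i < nwords; i++)
		if (be32_at(buf, 4 + 4 * (size_t)i) != 0)
			return -1;
	return 0;
}

/* Last 12 bytes of the packet capture quoted in this thread. */
static const uint8_t capture_tail[12] = {
	0x00, 0x00, 0x00, 0x02,	/* bitmap size: 2 words */
	0x00, 0x00, 0x00, 0x06,	/* bitmap[0] */
	0x00, 0x00, 0x00, 0x00,	/* bitmap[1] */
};
```

Calling check_notify_bitmap(capture_tail, sizeof(capture_tail)) accepts the captured bitmap, since only bitmap[0] is non-zero. That fits the suspicion that the decoder was reading the bitmap from a misaligned offset rather than the server sending a bad one. xdr_pad4() shows the rounding the patch applies before xdr_read_pages(): the 22-byte universal address in the capture occupies 24 bytes on the wire, while the 64-byte device_addr4 body is already aligned.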