Re: [PATCH 2/2] svcrdma: Remove extra writeargs sanity check for NFSv2/3

[ Correcting Steve’s e-mail address ]

On Jul 10, 2014, at 3:43 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:

> On Thu, Jul 10, 2014 at 03:07:34PM -0400, Chuck Lever wrote:
>> Hi Bruce-
>> 
>> On Jul 10, 2014, at 2:49 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>> 
>>> On Thu, Jul 10, 2014 at 02:24:57PM -0400, Chuck Lever wrote:
>>>> 
>>>> On Jul 10, 2014, at 2:19 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>> 
>>>>> On Thu, Jul 10, 2014 at 01:44:35PM -0400, Chuck Lever wrote:
>>>>>> The server should comply with RFC 5667,
>>>>> 
>>>>> Where's the relevant language?  (I took a quick look but didn't see it.)
>>>> 
>>>> Sorry, I listed the wrong RFC when I wrote the description of bug 246.
>>>> It’s actually RFC 5666, section 3.7.
>>> 
>>> Thanks.
>>> 
>>>>> So I think you just want to drop the round-up of len, not the whole
>>>>> check.
>>>> 
>>>> I’ll try that, thanks!
>> 
>> Works-as-expected.
>> 
>>> Actually, I'd really rather this get fixed up in the rpc layer.  The
>>> padding is really required for correct xdr.
>> 
>> How so?
> 
> Well, to be spec-lawyerly, rfc 1832 3.9 defines opaque data as including
> the zero-padding; a sequence of bytes isn't legal xdr if it just ends
> early.

RFC 5666 section 3.7 is talking about the situation where the transport
is dealing with a list of pages, not a stream of bytes. Each item in
the list is handled by a separate RDMA READ. The client isn’t sending
the payload bytes; the server is pulling them from the client. So the
integrity of the RPC request stream isn’t affected if the server doesn’t
read the last item in the page list, for example.

> 
>> All of NFSv4 and all other NFSv3 operations work as expected
>> without that padding present. There doesn’t seem to be any operational
>> dependency on the presence of padding. Help?
> 
> I can believe that the code deals with it now, I just wonder if this
> check may not be the only case where someone writing xdr code expects
> total length to be a multiple of four.

The exception in section 3.7 applies only to lists of pages. RPC/RDMA
requests sent via RDMA SEND or via RDMA_NOMSG would terminate the XDR
byte stream normally with a zero pad. Since such requests are transmitted
via a single RDMA operation anyway, there’s scant benefit to leaving
off the pad bytes.

Thus this exception applies only to the two operations that RFC 5667
allows to carry their payload in a read chunk (page list): WRITE and
SYMLINK.
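
To put numbers on the pad being discussed, here is a minimal sketch of
the round-up arithmetic. The helper names are hypothetical and this is
not the kernel's code; it only illustrates that the pad a read chunk
may leave off is never more than three bytes.

```c
#include <stddef.h>

/* XDR items are always a multiple of four bytes long. */
#define SKETCH_XDR_ALIGN 4u

/* Hypothetical helper: payload length rounded up to a multiple of four. */
static size_t sketch_xdr_padded_len(size_t len)
{
	return (len + SKETCH_XDR_ALIGN - 1) & ~(size_t)(SKETCH_XDR_ALIGN - 1);
}

/* Hypothetical helper: number of zero bytes (0..3) that follow the payload. */
static size_t sketch_xdr_pad_bytes(size_t len)
{
	return sketch_xdr_padded_len(len) - len;
}
```

For a 5-byte payload the padded length is 8 and the pad is 3 bytes; for
a payload that is already a multiple of four the pad is zero.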

> The drc code also depends on the length being right, see
> nfsd_cache_csum.  I don't know whether that will cause a practical
> problem in this case.

OK. DRC is kind of important, at least for NFSv3.

> (What about the krb5i case?)
> 
>>> The fact that RDMA doesn't
>>> require those zeroes to be literally transmitted across the network
>>> sounds like a transport-level detail that should be hidden from the xdr
>>> code.
>> 
>> The best I can think of is adding a false page array entry to the
>> xdr_buf if the last incoming page is short by a few bytes.
> 
> The padding just gets added to the end of whichever page the write ended
> on, and you only use another page if you run out of space, right?

I think I misunderstood the RDMA READ code. No extra page should be
necessary.

The only question I have is whether the receive pages are set up so
that the server itself can write pad bytes into them before the RDMA
READs are done.
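
As a rough illustration of where those pad bytes would land, here is a
sketch under some loud assumptions: the payload starts at offset 0 of
the first page, pages are 4096 bytes, and the function name and page
array are invented for the example. It says nothing about how svcrdma
actually lays out its receive pages.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Hypothetical helper: zero the XDR pad that follows a payload of
 * len bytes stored in pages[], starting at offset 0 of pages[0].
 * Returns the number of pad bytes written (0..3).
 */
static size_t sketch_zero_pad(unsigned char **pages, size_t npages, size_t len)
{
	size_t pad = (4 - (len & 3)) & 3;
	size_t idx = len / SKETCH_PAGE_SIZE;	/* page holding the first pad byte */
	size_t off = len % SKETCH_PAGE_SIZE;	/* offset of the pad in that page */

	if (pad == 0 || idx >= npages)
		return 0;

	/*
	 * The page size is a multiple of four, so off + pad never
	 * exceeds SKETCH_PAGE_SIZE and no extra page is needed.
	 */
	memset(pages[idx] + off, 0, pad);
	return pad;
}
```

Under these assumptions the pad always fits in the page where the
payload ends, which matches the point above that no extra page should
be necessary.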

> I don't know, if that's a huge pain then I guess we can live with this.

Fixing svcrdma was my first approach, but I abandoned it once I
realized that NFSv3 WRITE was the only failing case.

I think the sanity check you pointed out is strictly satisfied by
testing against the unaligned number of bytes. Is there a strong
reason to do the extra math for that check during each WRITE?
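
To make the two variants of the check concrete, here is a simplified
sketch. The function and parameter names are hypothetical and this is
not the actual decode routine; "avail" stands for the payload bytes
actually present in the receive buffer and "len" for the byte count
claimed in the WRITE arguments.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sanity check. With require_pad set, the buffer must
 * hold the payload rounded up to a multiple of four; without it,
 * the unaligned payload length is enough.
 */
static bool sketch_write_len_ok(size_t avail, size_t len, bool require_pad)
{
	size_t need = require_pad ? ((len + 3) & ~(size_t)3) : len;

	return avail >= need;
}
```

A 5-byte WRITE whose pad was left off (avail == 5) passes the unaligned
test but fails the rounded-up one, which is the NFSv3 WRITE failure
seen over RDMA.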

Otherwise, I’m looking closely at hooking this up in the transport
as you suggested, for comparison.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com


