Re: NFSoRDMA Fails for max_sge Less Than 18

> On Jan 11, 2017, at 3:18 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
> 
> <snip>
> 
>>>> It might be cool to have an API similar to rdma_rw that allows ULPs
>>>> to use a scatterlist for Send and Receive operations. It could hide
>>>> the driver and device maximum SGE values.
>>>> 
>>> 
>>> I'm not sure what you mean by "in place"?  (sorry for being not up to speed on
>>> this whole issue)  But perhaps some API like this could be added to rdma_rw...
>> 
>> "in place" == the SGE array would point to struct pages
>> containing parts of the message payload. That's basically what
>> this "support large inline threshold" patch is doing.
>> 
>> If the device supports only 4 SGEs, then the largest
>> message size that can be sent this way is just one or two
>> pages.
>> 
>> Some would prefer to send much larger payloads this way.
>> 
> 
> I'm not sure how the API would do this w/o having to send multiple ULP protocol
> SEND messages and thus that moves the logic into the ULP.  IE if the device only
> supports 4 SGE, and that only allows 2 pages worth of inline data, then the ULP
> needs to create multiple SEND messages with ULP headers in each, I would think.
> Not sure how this could be done below the ULP...

Right, me neither. It was just a thought.


>> I guess what I'm asking is whether 4 SGEs is going to be typical
>> of HCAs going forward, or whether there is a definite trend for
>> adding more in new device designs.
>> 
> 
> The iWARP spec mandates 4 as the minimum.   That's where the 4 came from for
> iWARP devices...
> 
> <snip>
> 
>>>> The original code needed only two SGEs for sending, and one for
>>>> receiving.
>>>> 
>>>> IIRC the RPC-over-RDMA receive path still needs just one SGE.
>>>> 
>>> 
>>> No I mean the code that bumps it up to 18.  Would that cause an immediate
>>> failure if cxgb4 supported 17 and only enforces it at post_send() time?
>> 
>> "mount" would fail immediately if the driver reported max_sge == 17.
>> The check that Ram mentioned happens at mount time, before anything
>> has been sent.
>> 
> 
> Hmm...
> 
>> 
>>> (haven't looked in detail at your patches...sorry).  Our QA ran testing on 4.9
>>> and didn't see this issue, so that's why I'm asking.    They have not yet run
>>> NFS/RDMA testing on 4.9-rc.  I've asked them to do a quick regression test asap.
>> 
> 
> Correction: I meant 4.10-rc above.   
> 
> But still, I believe Chelsio tested 4.9, so perhaps it isn't the "mount" that
> causes a failure but trying to send something with an SGE > 4 that happens
> immediately after the mount?  And since cxgb4 supports up to 17, the failure
> wouldn't be seen until some inline message was attempted that required 18
> sges...

The original check for 18 or 19 was too aggressive (that's the
bug here). With the default inline threshold settings, RPC-over-RDMA
won't ever use more than 4 (or at most 5) SGEs for RDMA Send.

So if somehow the mount was allowed, and no changes were made to the
default settings, everything should still work fine for cxgb4.
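
To make the arithmetic concrete, here is a rough user-space sketch of
how the Send SGE count grows with the inline threshold. It is a model
only: the function names, the 4096-byte page size, and the breakdown
(one SGE for the transport header, one for the RPC head, one per page
of page-aligned payload, one for the tail) are assumptions for
illustration, not the actual xprtrdma definitions.

```c
#define MODEL_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Hypothetical model of the worst-case Send SGE count:
 * 1 (transport header) + 1 (RPC head) + payload pages + 1 (tail).
 * The payload is assumed to start on a page boundary.
 */
static unsigned int model_send_sges(unsigned int inline_bytes)
{
	unsigned int pages;

	/* Round the payload up to whole pages. */
	pages = (inline_bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	return 1 + 1 + pages + 1;
}

/*
 * Returns 1 if a device advertising max_sge can carry a Send built
 * for the given inline threshold, 0 otherwise. A mount-time check
 * against the largest possible threshold is stricter than a check
 * against the threshold actually in use.
 */
static int model_device_ok(unsigned int max_sge, unsigned int inline_bytes)
{
	return max_sge >= model_send_sges(inline_bytes);
}
```

Under this model an assumed 4096-byte threshold needs 4 SGEs, while a
64KB threshold needs 19. A device limited to 17 passes the first check
and fails the second, which is why checking against the maximum rather
than the configured threshold rejects mounts that would have worked.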


>> That's curious!
>> 
>> 
>>>>> Note: the ib_device_attr only has a max_sge that pertains to both send and recv,
>>>>> so cxgb4 sets it to the min value.  We should probably add a max_recv_sge and
>>>>> max_send_sge to ib_device_attr...
>>>> 
>>>> I could go for that too.
>>>> 
>>> 
>>> I'm swamped right now to add this, but the changes should be trivial...
>> 
>> Maybe I could get to it, but no promises.
> 
> (I'm holding my breath! ;))
> 
> Steve.
> 
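
For what it's worth, the effect of splitting the limit can be sketched
in a few lines. This is a user-space model, not struct ib_device_attr:
the struct, the helper names, and the example limits are hypothetical,
and it only illustrates why reporting the minimum of the two values
hides Send capacity from the ULP.

```c
/* Hypothetical stand-in for the device attributes under discussion. */
struct model_dev_attr {
	int max_send_sge;	/* SGEs allowed per Send WR */
	int max_recv_sge;	/* SGEs allowed per Receive WR */
};

/* With a single max_sge field, a driver has to report the smaller value. */
static int model_combined_max_sge(const struct model_dev_attr *attr)
{
	return attr->max_send_sge < attr->max_recv_sge ?
	       attr->max_send_sge : attr->max_recv_sge;
}

/* ULP check when only the combined limit is visible. */
static int model_fits_combined(const struct model_dev_attr *attr,
			       int send_needed, int recv_needed)
{
	int max_sge = model_combined_max_sge(attr);

	return send_needed <= max_sge && recv_needed <= max_sge;
}

/* ULP check when Send and Receive limits are reported separately. */
static int model_fits_split(const struct model_dev_attr *attr,
			    int send_needed, int recv_needed)
{
	return send_needed <= attr->max_send_sge &&
	       recv_needed <= attr->max_recv_sge;
}
```

For a device with, say, 17 Send SGEs and 4 Receive SGEs (illustrative
values), a ULP that needs 5 Send SGEs and 1 Receive SGE is rejected by
the combined check but accepted by the split one.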

--
Chuck Lever


