2022-05-24 4:17 GMT+09:00, Long Li <longli@xxxxxxxxxxxxx>:
>> Subject: Re: RDMA (smbdirect) testing
>>
>> On 5/23/2022 11:05 AM, Namjae Jeon wrote:
>> > 2022-05-23 22:45 GMT+09:00, Tom Talpey <tom@xxxxxxxxxx>:
>> >> On 5/22/2022 7:06 PM, Namjae Jeon wrote:
>> >>> 2022-05-21 20:54 GMT+09:00, Tom Talpey <tom@xxxxxxxxxx>:
>> >>>> ...
>> >>>> Why does the code require 16 sge's, regardless of other size limits?
>> >>>> Normally, if the lower layer supports fewer, the upper layer will
>> >>>> simply reduce its operation sizes.
>> >>> This should be answered by Long Li. It seems that he set the optimized
>> >>> value for the NICs he used to implement RDMA in cifs.
>> >>
>> >> "Optimized" is a funny choice of words. If the provider doesn't
>> >> support the value, it's not much of an optimization to insist on 16. :)
>> > Ah, it's obvious that cifs hasn't been tested with soft-iWARP. And
>> > the same goes for ksmbd...
>> >>
>> >> Personally, I'd try building a kernel with smbdirect.h changed to
>> >> have SMBDIRECT_MAX_SGE set to 6, and see what happens. You might have
>> >> to reduce the r/w sizes in mount, depending on any other issues this
>> >> may reveal.
>> > Agreed, and ksmbd should be changed for the test as well as cifs. We are
>> > preparing patches to improve this in ksmbd, rather than changing this
>> > hardcoded value and rebuilding every time.
>>
>> So, the patch is just for this test, right? Because I don't think any
>> kernel-based storage upper layer should ever need more than 2 or 3.
>> How many memory regions are you doing per operation? I would expect one
>> for the SMB3 headers, and another, if needed, for data.
>> These would all be lmr-type and would not require actual new memreg's.
>>
>> And for bulk data, I would hope you're using fast-register, which takes a
>> different path and doesn't use the same sge's.
>>
>> Getting this right, and keeping things efficient both in SGE bookkeeping
>> as well as memory registration efficiency, is the rocket science behind
>> RDMA performance and correctness. Slapping "16" or "6" or whatever isn't
>> the long-term fix.
> Hi Long,
> I found max_sge is extremely large on Mellanox hardware, but smaller on
> other iWARP hardware.
>
> Hardcoding it to 16 is certainly not a good choice. I think we should set
> it to the smaller of 1) a predefined value (e.g. 8) and 2) the max_sge the
> hardware reports.
Okay, could you please send the patch for cifs.ko? Thanks.
>
> If the CIFS upper layer ever sends data with a larger number of SGEs, the
> send will fail.
>
> Long
>
>>
>> Tom.
>>
>> > diff --git a/fs/cifs/smbdirect.h b/fs/cifs/smbdirect.h
>> > index a87fca82a796..7003722ab004 100644
>> > --- a/fs/cifs/smbdirect.h
>> > +++ b/fs/cifs/smbdirect.h
>> > @@ -226,7 +226,7 @@ struct smbd_buffer_descriptor_v1 {
>> >  } __packed;
>> >
>> >  /* Default maximum number of SGEs in a RDMA send/recv */
>> > -#define SMBDIRECT_MAX_SGE 16
>> > +#define SMBDIRECT_MAX_SGE 6
>> >  /* The context for a SMBD request */
>> >  struct smbd_request {
>> >  	struct smbd_connection *info;
>> > diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
>> > index e646d79554b8..70662b3bd590 100644
>> > --- a/fs/ksmbd/transport_rdma.c
>> > +++ b/fs/ksmbd/transport_rdma.c
>> > @@ -42,7 +42,7 @@
>> >  /* SMB_DIRECT negotiation timeout in seconds */
>> >  #define SMB_DIRECT_NEGOTIATE_TIMEOUT 120
>> >
>> > -#define SMB_DIRECT_MAX_SEND_SGES 8
>> > +#define SMB_DIRECT_MAX_SEND_SGES 6
>> >  #define SMB_DIRECT_MAX_RECV_SGES 1
>> >
>> >  /*
>> >
>> > Thanks!
>> >>
>> >> Tom.
>>
>
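For illustration, a minimal sketch of the clamping Long Li proposes above, written against the in-kernel verbs API (struct ib_device and its attrs field). The helper smbd_max_send_sges() and the SMBD_SGE_CAP constant are hypothetical names, not code from cifs.ko or ksmbd, and this is not the patch Long was asked to send.

#include <linux/minmax.h>
#include <rdma/ib_verbs.h>

/* Upper bound chosen by the upper layer (Long Li's "predefined value, e.g. 8") */
#define SMBD_SGE_CAP 8

/*
 * Clamp the number of send SGEs to what the RDMA provider reports in its
 * device attributes, rather than insisting on a hardcoded value such as 16.
 */
static unsigned int smbd_max_send_sges(struct ib_device *dev)
{
	return min_t(unsigned int, SMBD_SGE_CAP, dev->attrs.max_send_sge);
}

The resulting value would be passed as qp_init_attr.cap.max_send_sge when the queue pair is created with ib_create_qp(), so the send path never builds a work request with more SGEs than the provider accepts (soft-iWARP, for example, advertises far fewer than Mellanox hardware, which is why the hardcoded 16 fails there).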