> Subject: Re: RDMA (smbdirect) testing
>
> 2022-05-24 4:17 GMT+09:00, Long Li <longli@xxxxxxxxxxxxx>:
> >> Subject: Re: RDMA (smbdirect) testing
> >>
> >> On 5/23/2022 11:05 AM, Namjae Jeon wrote:
> >> > 2022-05-23 22:45 GMT+09:00, Tom Talpey <tom@xxxxxxxxxx>:
> >> >> On 5/22/2022 7:06 PM, Namjae Jeon wrote:
> >> >>> 2022-05-21 20:54 GMT+09:00, Tom Talpey <tom@xxxxxxxxxx>:
> >> >>>> ...
> >> >>>> Why does the code require 16 sge's, regardless of other size
> >> >>>> limits? Normally, if the lower layer supports fewer, the upper
> >> >>>> layer will simply reduce its operation sizes.
> >> >>> This should be answered by Long Li. It seems that he set the
> >> >>> optimized value for the NICs he used to implement RDMA in cifs.
> >> >>
> >> >> "Optimized" is a funny choice of words. If the provider doesn't
> >> >> support the value, it's not much of an optimization to insist on 16.
> >> >> :)
> >> > Ah, it's obvious that cifs hasn't been tested with soft-iWARP. And
> >> > the same goes for ksmbd...
> >> >>
> >> >> Personally, I'd try building a kernel with smbdirect.h changed to
> >> >> have SMBDIRECT_MAX_SGE set to 6, and see what happens. You might
> >> >> have to reduce the r/w sizes in mount, depending on any other
> >> >> issues this may reveal.
> >> > Agreed, and ksmbd should be changed for the test as well as cifs.
> >> > We are preparing patches to improve this in ksmbd, rather than
> >> > changing and rebuilding this hardcoded value every time.
> >>
> >> So, the patch is just for this test, right? Because I don't think any
> >> kernel-based storage upper layer should ever need more than 2 or 3.
> >> How many memory regions are you doing per operation? I would expect
> >> one for the SMB3 headers, and another, if needed, for data.
> >> These would all be lmr-type and would not require actual new memreg's.
> >>
> >> And for bulk data, I would hope you're using fast-register, which
> >> takes a different path and doesn't use the same sge's.
> >>
> >> Getting this right, and keeping things efficient both in SGE
> >> bookkeeping as well as memory registration efficiency, is the rocket
> >> science behind RDMA performance and correctness. Slapping "16" or "6"
> >> or whatever isn't the long-term fix.
>
> Hi Long,
> > I found max_sge is extremely large on Mellanox hardware, but smaller
> > on other iWARP hardware.
> >
> > Hardcoding it to 16 is certainly not a good choice. I think we should
> > set it to the smaller of 1) a predefined value (e.g. 8), and
> > 2) the max_sge the hardware reports.
> Okay, could you please send the patch for cifs.ko?

Yes, will do.

Long

> Thanks.
> >
> > If the CIFS upper layer ever sends data with a larger number of SGEs,
> > the send will fail.
> >
> > Long
> >
> >>
> >> Tom.
> >>
> >> > diff --git a/fs/cifs/smbdirect.h b/fs/cifs/smbdirect.h
> >> > index a87fca82a796..7003722ab004 100644
> >> > --- a/fs/cifs/smbdirect.h
> >> > +++ b/fs/cifs/smbdirect.h
> >> > @@ -226,7 +226,7 @@ struct smbd_buffer_descriptor_v1 {
> >> >  } __packed;
> >> >
> >> >  /* Default maximum number of SGEs in a RDMA send/recv */
> >> > -#define SMBDIRECT_MAX_SGE 16
> >> > +#define SMBDIRECT_MAX_SGE 6
> >> >  /* The context for a SMBD request */
> >> >  struct smbd_request {
> >> >  	struct smbd_connection *info;
> >> > diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
> >> > index e646d79554b8..70662b3bd590 100644
> >> > --- a/fs/ksmbd/transport_rdma.c
> >> > +++ b/fs/ksmbd/transport_rdma.c
> >> > @@ -42,7 +42,7 @@
> >> >  /* SMB_DIRECT negotiation timeout in seconds */
> >> >  #define SMB_DIRECT_NEGOTIATE_TIMEOUT 120
> >> >
> >> > -#define SMB_DIRECT_MAX_SEND_SGES 8
> >> > +#define SMB_DIRECT_MAX_SEND_SGES 6
> >> >  #define SMB_DIRECT_MAX_RECV_SGES 1
> >> >
> >> >  /*
> >> >
> >> > Thanks!
> >> >>
> >> >> Tom.
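
For reference, the approach described above (use the smaller of a predefined
cap and the max_sge the hardware reports, rather than a hardcoded 16 or 8)
could look roughly like the sketch below. This is only an illustration of the
idea, not the patch that was eventually sent for cifs.ko; the cap of 8 and the
helper name smbdirect_pick_send_sges() are assumptions made for the example.

    /*
     * Illustrative sketch only: clamp the number of send SGEs to the
     * smaller of a software cap and what the RDMA provider reports,
     * instead of hardcoding it. The cap value and names are hypothetical.
     */
    #include <linux/minmax.h>
    #include <rdma/ib_verbs.h>

    #define SMBDIRECT_MAX_SEND_SGE 8	/* software cap, example value */

    static u32 smbdirect_pick_send_sges(struct ib_device *dev)
    {
    	/* e.g. soft-iWARP (siw) reports max_send_sge = 6 */
    	return min_t(u32, SMBDIRECT_MAX_SEND_SGE, dev->attrs.max_send_sge);
    }

The returned value would then be used for qp_attr.cap.max_send_sge when
creating the queue pair, and the send path would have to keep its scatter
lists within the same limit.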