On 5/8/2019 4:13 PM, Christoph Hellwig wrote:
>> There is a performance degradation when writing large block sizes.
>> The degradation is caused by the complexity of combining multiple
>> indirections and performing the RDMA READ operation on them. This
>> will be fixed in the following patches by reducing the indirections
>> where possible.
> It would be good to figure out if it is possible, as the regressions
> are pretty significant.
I'm not sure I understand. We've created an optimization: we'll perform
MTT mapping (similar to "PRP" in NVMe) instead of KLM (similar to
"SGL") if the buffers are not gappy.
In general, KLM is less effective than MTT since it's a more
complicated operation for the HW, so we should use it only when we
really need it (e.g., when a "gappy" SG list comes from the user).
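
For reference, a sketch of what such a gap check could look like (this
is my assumption of the condition, not the driver's actual check): a
list is "gappy" if any element after the first starts at a non-zero
offset, or any element before the last ends off a page boundary:

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/scatterlist.h>

static bool sg_list_has_gaps(struct scatterlist *sgl, int nents)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		/* only the first element may start mid-page */
		if (i > 0 && sg->offset)
			return true;
		/* only the last element may end mid-page */
		if (i < nents - 1 &&
		    !IS_ALIGNED(sg->offset + sg->length, PAGE_SIZE))
			return true;
	}
	return false;
}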
I also want to add an optimization for non-PI operations that will use
a KLM mkey only if an MTT mkey is not enough to map the SG list (i.e.,
for an SG_GAPS memory region), and then add SG_GAPS support to
NVMf/RDMA.
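
A rough sketch of that fallback using the existing verbs API (the
helper name is made up and error handling is elided; ib_alloc_mr() and
the MR types are the real ib_verbs interfaces):

#include <rdma/ib_verbs.h>

/* Allocate an SG_GAPS MR only when the SG list actually has gaps;
 * otherwise a regular (MTT-backed) MR is enough and cheaper.
 */
static struct ib_mr *alloc_mr_for_sgl(struct ib_pd *pd, u32 max_num_sg,
				      bool gappy)
{
	enum ib_mr_type type = gappy ? IB_MR_TYPE_SG_GAPS :
				       IB_MR_TYPE_MEM_REG;

	return ib_alloc_mr(pd, type, max_num_sg);
}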