On 2024/2/16 22:25, Wenjia Zhang wrote:
On 11.01.24 13:00, Wen Gu wrote:
This provides a way to {get|set} whether loopback-ism device supports
merging sndbuf with peer DMB to eliminate data copies between them.
echo 0 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # support
echo 1 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # not support
Besides the confusion that Niklas already mentioned, the name of the option does not make clear enough what it means.
What about:
echo 1 > /sys/devices/virtual/smc/loopback-ism/nocopy_support # merge mode
echo 0 > /sys/devices/virtual/smc/loopback-ism/nocopy_support # copy mode
OK, if we decide to keep the knobs, I will improve the name. Thanks!
The settings take effect after re-activating loopback-ism by:
echo 0 > /sys/devices/virtual/smc/loopback-ism/active
echo 1 > /sys/devices/virtual/smc/loopback-ism/active
After this, the link group related to loopback-ism will be flushed and
the sndbufs of subsequent connections will be merged or not merged with
peer DMB.
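
The sequence above (write the knob, then toggle `active`) could be wrapped in a
small helper. This is only a sketch under assumptions: the function name
set_dmb_copy and the SMC_LO_DIR override are hypothetical and not part of the
patch, and the sysfs attribute names are taken as quoted in this version of the
series (they may be renamed or removed, as discussed in this thread).

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: set the dmb_copy knob and
# re-activate loopback-ism so that the new value takes effect.
# SMC_LO_DIR defaults to the sysfs path quoted in the patch description and
# can be overridden, e.g. to point at a scratch directory for a dry run.
SMC_LO_DIR="${SMC_LO_DIR:-/sys/devices/virtual/smc/loopback-ism}"

set_dmb_copy() {
    mode="$1"   # 0 or 1, with the meaning given in the patch description
    case "$mode" in
        0|1) ;;
        *) echo "usage: set_dmb_copy 0|1" >&2; return 1 ;;
    esac

    # Write the knob first; it is only picked up on the next activation.
    echo "$mode" > "$SMC_LO_DIR/dmb_copy" || return 1

    # Deactivate and activate again, as described in the patch.
    echo 0 > "$SMC_LO_DIR/active" || return 1
    echo 1 > "$SMC_LO_DIR/active" || return 1
}
```

Calling `set_dmb_copy 1` as root would then apply the setting to subsequent
connections only, since the existing link group is flushed on re-activation.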
The motivation for this control is that bandwidth improves greatly
when sndbuf and DMB are merged. However, when a virtually contiguous
DMB is provided and merged with sndbuf, it is accessed concurrently
on Tx and Rx, which creates a bottleneck from lock contention
in find_vmap_area when there are many CPUs and CONFIG_HARDENED_USERCOPY
is set (see link below). So an option is provided.
Link: https://lore.kernel.org/all/238e63cd-e0e8-4fbf-852f-bc4d5bc35d5a@xxxxxxxxxxxxxxxxx/
Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>
---
We tried some simple workloads, and the performance of the no-copy case was remarkable. Thus, we're wondering whether the
tunable setting is necessary in this loopback case. Or rather, why do we need the copy option? Is that because
of the bottleneck caused by combining no-copy with virtually contiguous DMA? Or at least make no-copy
the default.
Yes, it is because of the bottleneck caused by combining no-copy
with virtual-DMB. If we have to use virtual-DMB and CONFIG_HARDENED_USERCOPY is
set, then we may be forced to use copy mode in environments with many CPUs to get
good latency (bandwidth still drops because of copy mode).
But if we agree that physical-DMB is acceptable (it costs one physical buffer per connection
side in loopback-ism no-copy mode, the same as what sndbuf costs when using s390 ISM), then
there is no such performance issue and the two knobs can be removed. (See also the reply
to patch 13/15 [1].)
[1] https://lore.kernel.org/netdev/442061eb-107a-421d-bc2e-13c8defb0f7b@xxxxxxxxxxxxxxxxx/
Thanks!