On 07.09.22 at 18:06, Eric Dumazet wrote:
On Wed, Sep 7, 2022 at 5:26 AM Alexandra Winter <wintera@xxxxxxxxxxxxx> wrote:
Since linear payload was removed even for single small messages,
an additional page is required and we are measuring a performance impact.
3613b3dbd1ad ("tcp: prepare skbs for better sack shifting")
explicitly allowed "payload in skb->head for first skb put in the queue,
to not impact RPC workloads."
472c2e07eef0 ("tcp: add one skb cache for tx")
made that obsolete and removed it.
When
d8b81175e412 ("tcp: remove sk_{tr}x_skb_cache")
reverted it, this piece was not reverted and not added back in.
When running uperf with a request-response pattern with 1k payload
and 250 parallel connections, we measure a 13% difference in throughput
for our PCI-based network interfaces since 472c2e07eef0.
(Our IOMMU is sensitive to the number of mapped pages.)
Could you please consider allowing linear payload for the first
skb in queue again? A patch proposal is appended below.
No.
Please add a workaround in your driver.
You can increase throughput by 20% by premapping a coherent piece of
memory into which you can copy small skbs (skb->head included).
Something like 256 bytes per slot in the TX ring.
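
The suggestion above could look roughly like the following. This is a
user-space sketch only, not driver code: struct tx_ring, struct pkt and
tx_xmit() are hypothetical names made up for illustration, and the bounce
array merely stands in for a region that a real driver would map once at
ring setup (for example with dma_alloc_coherent()). Packets that fit into
a slot are copied, so no per-packet mapping is needed; larger ones fall
back to the normal mapping path, which is only counted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_RING_SIZE 8    /* hypothetical number of TX descriptors */
#define TX_COPY_SLOT 256  /* bytes per slot, as suggested above */

/* Hypothetical TX ring. "bounce" stands in for memory that a real
 * driver would map once for the device when the ring is set up. */
struct tx_ring {
	uint8_t bounce[TX_RING_SIZE][TX_COPY_SLOT];
	unsigned int head;     /* next slot to fill */
	unsigned long copied;  /* packets sent through a bounce slot */
	unsigned long mapped;  /* packets needing per-packet mapping */
};

/* Simplified packet: a linear part (headers) plus one page fragment. */
struct pkt {
	const uint8_t *head;
	size_t head_len;
	const uint8_t *frag;
	size_t frag_len;
};

/* Returns the slot index if the packet was copied into the premapped
 * area, or -1 if it is too large and would be mapped as usual. */
static int tx_xmit(struct tx_ring *r, const struct pkt *p)
{
	size_t total;
	unsigned int slot;

	assert(r != NULL && p != NULL);

	total = p->head_len + p->frag_len;
	if (total > TX_COPY_SLOT) {
		r->mapped++;  /* normal path: map head and fragment */
		return -1;
	}

	/* Small packet: linearize headers and payload into the slot. */
	slot = r->head;
	memcpy(r->bounce[slot], p->head, p->head_len);
	if (p->frag_len)
		memcpy(r->bounce[slot] + p->head_len, p->frag, p->frag_len);

	r->head = (r->head + 1) % TX_RING_SIZE;
	r->copied++;
	return (int)slot;
}
```

The copy threshold is a tunable in this sketch. A real driver would also
have to make sure a slot is not reused before the device has completed
the corresponding descriptor, which is left out here.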
FWIW, this regression was with the standard Mellanox driver (nothing s390-specific).