Hi,
May I know whether this is really an intended fix to post officially,
or just a workaround/probe to make the offset in page_frag happy when
net_high_order_alloc_disable is true? If it is the former, even though
it could fix the issue, I would assume that clamping every buffer to a
page_frag smaller than a regular page may cause some performance
regression in the mergeable-buffer case. Can you justify the
performance impact with benchmark runs using a larger MTU and
mergeable rx buffers, to show that the regression is negligible? You
would need to compare against a baseline that does not incur the
inadvertent virtnet_rq_dma cost on any page, i.e. with all 4 patches
of this series reverted. Both tests are needed, with
net_high_order_alloc_disable set to on and to off.
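
To make the concern concrete, here is a rough stand-alone sketch of
the size arithmetic as I understand it. The PAGE_SIZE and
L1_CACHE_BYTES values and the stand-in for struct virtnet_rq_dma are
assumptions (4 KiB pages, 64-byte cache lines, guessed field layout),
not the real kernel definitions, so the numbers are illustrative only:

/*
 * Stand-alone sketch of the size arithmetic; PAGE_SIZE, L1_CACHE_BYTES
 * and the stand-in struct below are assumptions, not the real kernel
 * definitions.
 */
#include <stdio.h>

#define PAGE_SIZE	4096UL		/* assumed page size */
#define L1_CACHE_BYTES	64UL		/* assumed cache line size */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/* Rough stand-in for struct virtnet_rq_dma; layout guessed for sizing only. */
struct virtnet_rq_dma_guess {
	unsigned long long addr;	/* dma_addr_t */
	unsigned int ref;
	unsigned short len;
	unsigned short need_sync;
};

int main(void)
{
	unsigned long meta = ALIGN(sizeof(struct virtnet_rq_dma_guess),
				   L1_CACHE_BYTES);

	/*
	 * Before the fix the mergeable path could size a buffer up to
	 * PAGE_SIZE, while the rq_dma metadata is carved out of the same
	 * page_frag.  With net_high_order_alloc_disable=1 the frag cache
	 * falls back to order-0 pages, so the combined request can exceed
	 * a single page.
	 */
	printf("order-0 frag: %lu bytes, worst case before fix: %lu + %lu = %lu\n",
	       PAGE_SIZE, PAGE_SIZE, meta, PAGE_SIZE + meta);

	/*
	 * After the fix every buffer is clamped to PAGE_SIZE - meta, i.e.
	 * the per-buffer size reduction the benchmark question is about.
	 */
	printf("max_len after fix: %lu (%lu bytes smaller per buffer)\n",
	       PAGE_SIZE - meta, meta);
	return 0;
}

Under those assumptions max_len drops from 4096 to 4032 bytes, so each
mergeable buffer shrinks by one cache line's worth of metadata; whether
that matters in practice is exactly what the benchmarks should show.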
Thanks,
-Siwei
On 8/17/2024 6:20 AM, Xuan Zhuo wrote:
Hi, guys, I have a fix patch for this.
Could anybody test it?
Thanks.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index af474cc191d0..426d68c2d01d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2492,13 +2492,15 @@ static unsigned int get_mergeable_buf_len(struct receive_queue *rq,
 {
 	struct virtnet_info *vi = rq->vq->vdev->priv;
 	const size_t hdr_len = vi->hdr_len;
-	unsigned int len;
+	unsigned int len, max_len;
+
+	max_len = PAGE_SIZE - ALIGN(sizeof(struct virtnet_rq_dma), L1_CACHE_BYTES);
 
 	if (room)
-		return PAGE_SIZE - room;
+		return max_len - room;
 
 	len = hdr_len + clamp_t(unsigned int, ewma_pkt_len_read(avg_pkt_len),
-				rq->min_buf_len, PAGE_SIZE - hdr_len);
+				rq->min_buf_len, max_len - hdr_len);
 
 	return ALIGN(len, L1_CACHE_BYTES);
 }