On Tue, Aug 20, 2024 at 12:44:46PM -0700, Si-Wei Liu wrote:
>
>
> On 8/20/2024 12:19 AM, Xuan Zhuo wrote:
> > leads to regression on VM with the sysctl value of:
> >
> > - net.core.high_order_alloc_disable=1
> >
> > which could see reliable crashes or scp failure (scp a file 100M in size
> > to VM):
> >
> > The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
> > of a new frag. When the frag size is larger than PAGE_SIZE,
> > everything is fine. However, if the frag is only one page and the
> > total size of the buffer and virtnet_rq_dma is larger than one page, an
> > overflow may occur. In this case, if an overflow is possible, I adjust
> > the buffer size. If net.core.high_order_alloc_disable=1, the maximum
> > buffer size is 4096 - 16. If net.core.high_order_alloc_disable=0, only
> > the first buffer of the frag is affected.
> >
> > Fixes: f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api")
> > Reported-by: "Si-Wei Liu" <si-wei.liu@xxxxxxxxxx>
> > Closes: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@xxxxxxxxxx
> > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> > ---
> >   drivers/net/virtio_net.c | 12 +++++++++---
> >   1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index c6af18948092..e5286a6da863 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -918,9 +918,6 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
> >   	void *buf, *head;
> >   	dma_addr_t addr;
> > -	if (unlikely(!skb_page_frag_refill(size, alloc_frag, gfp)))
> > -		return NULL;
> > -
> >   	head = page_address(alloc_frag->page);
> >   	dma = head;
> > @@ -2421,6 +2418,9 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
> >   	len = SKB_DATA_ALIGN(len) +
> >   	      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > +	if (unlikely(!skb_page_frag_refill(len, &rq->alloc_frag, gfp)))
> > +		return -ENOMEM;
> > +
> Do you want to document the assumption that small packet case won't end up
> crossing the page frag boundary unlike the mergeable case? Add a comment
> block to explain or a WARN_ON() check against potential overflow would work
> with me.
>
> >   	buf = virtnet_rq_alloc(rq, len, gfp);
> >   	if (unlikely(!buf))
> >   		return -ENOMEM;
> > @@ -2521,6 +2521,12 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> >   	 */
> >   	len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
> > +	if (unlikely(!skb_page_frag_refill(len + room, alloc_frag, gfp)))
> > +		return -ENOMEM;
> > +
> > +	if (!alloc_frag->offset && len + room + sizeof(struct virtnet_rq_dma) > alloc_frag->size)
> > +		len -= sizeof(struct virtnet_rq_dma);
> > +
> This could address my previous concern for possibly regressing every buffer
> size for the mergeable case, thanks. Though I still don't get why carving up
> a small chunk from page_frag for storing the virtnet_rq_dma metadata, this
> would cause perf regression on certain MTU size

4Kbyte MTU exactly?

> that happens to end up with
> one more base page (and an extra descriptor as well) to be allocated
> compared to the previous code without the extra virtnet_rq_dma content. How
> hard would it be to allocate a dedicated struct to store the related
> information without affecting the (size of) datapath pages?
>
> FWIW, out of the code review perspective, I've looked up the past
> conversations but didn't see comprehensive benchmark was done before
> removing the old code and making premap the sole default mode. Granted this
> would reduce the footprint of additional code and the associated maintaining
> cost immediately, but I would assume at least there should have been
> thorough performance runs upfront to guarantee no regression is seen with
> every possible use case, or the negative effect is comparatively negligible
> even though there's slight regression in some limited case. If that kind of
> perf measurement hadn't been done before getting accepted/merged, I think at
> least it should allow both modes to coexist for a while such that every user
> could gauge the performance effect.
>
> Thanks,
> -Siwei
>
> >   	buf = virtnet_rq_alloc(rq, len + room, gfp);
> >   	if (unlikely(!buf))
> >   		return -ENOMEM;
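
For reference, the length adjustment the patch adds to add_recvbuf_mergeable() can be sketched as standalone userspace C. This is only an illustration, not driver code: the helper name clamp_first_buf_len() and the RQ_DMA_SIZE constant are hypothetical, and the 16-byte value is taken from the commit message rather than from sizeof(struct virtnet_rq_dma).

```c
#include <assert.h>

/* Assumption: 16 bytes, the size of virtnet_rq_dma quoted in the commit message. */
#define RQ_DMA_SIZE 16u

/*
 * Hypothetical helper mirroring the check the patch adds to
 * add_recvbuf_mergeable(): only the first buffer of a fresh frag
 * (offset == 0) shares the frag with the metadata header, so only that
 * buffer is shrunk, and only when header plus buffer would not fit.
 */
static inline unsigned int clamp_first_buf_len(unsigned int frag_offset,
                                               unsigned int frag_size,
                                               unsigned int len,
                                               unsigned int room)
{
    if (frag_offset == 0 && len + room + RQ_DMA_SIZE > frag_size) {
        /* Guard the subtraction against unsigned wrap-around. */
        assert(len >= RQ_DMA_SIZE);
        len -= RQ_DMA_SIZE;
    }
    return len;
}
```

With a one-page frag (size 4096, offset 0) and a requested length of 4096, the helper returns 4080, which matches the "4096 - 16" maximum stated in the commit message. With a 32768-byte frag the same request is left unchanged.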