On Mon, Oct 14, 2024 at 11:12 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
>
> When the frag has just been refilled with a page, a regression may
> occur on a VM. In particular, if the sysctl
> net.core.high_order_alloc_disable value is 1, the frag always gets
> exactly one page on refill.
>
> This can result in reliable crashes or scp failures (scp'ing a file
> 100M in size to the VM):
>
> The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
> of a new frag. When the frag size is larger than PAGE_SIZE,
> everything is fine. However, if the frag is only one page and the
> total size of the buffer and virtnet_rq_dma is larger than one page, an
> overflow may occur.
>
> Here, when the frag size is not sufficient, we reduce the buffer len to
> fix this problem.
>
> Fixes: f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api")
> Reported-by: "Si-Wei Liu" <si-wei.liu@xxxxxxxxxx>
> Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>

Though this may fix the problem, and we need it now, I would like to
see some tweaks in the future, basically because of the security
implications of sharing driver metadata with the device (most IOMMUs
can only enforce protection at page granularity), so we may end up
with device-triggerable misbehaviour. For safety, we should use
driver-private memory for the DMA metadata.

> ---
>  drivers/net/virtio_net.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f8131f92a392..59a99bbaf852 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -926,9 +926,6 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
>         void *buf, *head;
>         dma_addr_t addr;
>
> -       if (unlikely(!skb_page_frag_refill(size, alloc_frag, gfp)))
> -               return NULL;
> -
>         head = page_address(alloc_frag->page);
>
>         if (rq->do_dma) {
> @@ -2423,6 +2420,9 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
>         len = SKB_DATA_ALIGN(len) +
>               SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>
> +       if (unlikely(!skb_page_frag_refill(len, &rq->alloc_frag, gfp)))
> +               return -ENOMEM;
> +
>         buf = virtnet_rq_alloc(rq, len, gfp);
>         if (unlikely(!buf))
>                 return -ENOMEM;
> @@ -2525,6 +2525,12 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
>          */
>         len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
>
> +       if (unlikely(!skb_page_frag_refill(len + room, alloc_frag, gfp)))
> +               return -ENOMEM;

This makes me think of another question: how can we guarantee that
len + room is less than or equal to PAGE_SIZE, especially considering
that we need extra headroom and tailroom for XDP? If we can't, it is
a bug:

"""
/**
 * skb_page_frag_refill - check that a page_frag contains enough room
 * @sz: minimum size of the fragment we want to get
 * @pfrag: pointer to page_frag
 * @gfp: priority for memory allocation
 *
 * Note: While this allocator tries to use high order pages, there is
 * no guarantee that allocations succeed. Therefore, @sz MUST be
 * less or equal than PAGE_SIZE.
 */
"""

> +
> +       if (!alloc_frag->offset && len + room + sizeof(struct virtnet_rq_dma) > alloc_frag->size)
> +               len -= sizeof(struct virtnet_rq_dma);

Any reason we need to check alloc_frag->offset? (A small sketch of the
size arithmetic is appended after the sign-off.)

> +
>         buf = virtnet_rq_alloc(rq, len + room, gfp);

Btw, as pointed out in a previous review, we should have a consistent
API: either

1) hide the alloc_frag inside virtnet_rq_alloc(), or
2) pass the alloc_frag to virtnet_rq_alloc().

>         if (unlikely(!buf))
>                 return -ENOMEM;
> --
> 2.32.0.3.g01195cf9f
>

Thanks
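
P.S. Below is a minimal user-space sketch of the size arithmetic, not
kernel code: the 4K PAGE_SIZE, the 16-byte size of struct
virtnet_rq_dma, and the len/room values are assumptions taken from (or
made up to match) the commit message, just to show how a fresh
one-page frag can overflow and how the patch's adjustment avoids it.

#include <stdio.h>

#define PAGE_SIZE 4096 /* assumed 4K pages */
#define RQ_DMA_SZ 16   /* sizeof(struct virtnet_rq_dma), per the commit message */

int main(void)
{
        unsigned int frag_size = PAGE_SIZE; /* one-page frag, e.g. high_order_alloc_disable=1 */
        unsigned int offset = 0;            /* fresh frag: DMA metadata sits at the front */
        unsigned int len = 4000, room = 88; /* made-up request sizes */

        /* Unpatched: metadata + buffer run past the end of the page. */
        printf("before: end = %u, page = %u\n", RQ_DMA_SZ + len + room, frag_size);

        /* The patch's adjustment: shrink len so everything fits. */
        if (!offset && len + room + RQ_DMA_SZ > frag_size)
                len -= RQ_DMA_SZ;

        printf("after:  end = %u, page = %u\n", RQ_DMA_SZ + len + room, frag_size);
        return 0;
}

If I read virtnet_rq_alloc() right, the metadata is only placed at the
start of a new frag page, so allocations with a non-zero offset do not
pay the 16-byte cost again, which may be what the offset check is for.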