Re: [PATCH vhost v2 4/7] virtio_net: big mode support premapped

Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> · Tue, 23 Apr 2024 20:31:13 +0800

On Tue, 23 Apr 2024 12:36:42 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> On Mon, Apr 22, 2024 at 3:24 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> >
> > In big mode, pre-mapping DMA is beneficial because if the pages are not
> > used, we can reuse them without needing to unmap and remap.
> >
> > We require space to store the DMA address. I use the page.dma_addr to
> > store the DMA address from the pp structure inside the page.
> >
> > Every page retrieved from get_a_page() is mapped, and its DMA address is
> > stored in page.dma_addr. When a page is returned to the chain, we check
> > the DMA status; if it is not mapped (potentially having been unmapped),
> > we remap it before returning it to the chain.
> >
> > Based on the following points, we do not use page pool to manage these
> > pages:
> >
> > 1. virtio-net uses the DMA APIs wrapped by virtio core. Therefore,
> >    we can only prevent the page pool from performing DMA operations, and
> >    let the driver perform DMA operations on the allocated pages.
> > 2. But when the page pool releases the page, we have no chance to
> >    execute dma unmap.
> > 3. A solution to #2 is to execute dma unmap every time before putting
> >    the page back to the page pool. (This is actually a waste, we don't
> >    execute unmap so frequently.)
> > 4. But there is another problem, we still need to use page.dma_addr to
> >    save the dma address. Using page.dma_addr while using page pool is
> >    unsafe behavior.
> >
> > More:
> >     https://lore.kernel.org/all/CACGkMEu=Aok9z2imB_c5qVuujSh=vjj1kx12fy9N7hqyi+M5Ow@xxxxxxxxxxxxxx/
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/net/virtio_net.c | 123 ++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 108 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 2c7a67ad4789..d4f5e65b247e 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -439,6 +439,81 @@ skb_vnet_common_hdr(struct sk_buff *skb)
> >         return (struct virtio_net_common_hdr *)skb->cb;
> >  }
> >
> > +static void sg_fill_dma(struct scatterlist *sg, dma_addr_t addr, u32 len)
> > +{
> > +       sg->dma_address = addr;
> > +       sg->length = len;
> > +}
> > +
> > +/* For pages submitted to the ring, we need to record its dma for unmap.
> > + * Here, we use the page.dma_addr and page.pp_magic to store the dma
> > + * address.
> > + */
> > +static void page_chain_set_dma(struct page *p, dma_addr_t addr)
> > +{
> > +       if (sizeof(dma_addr_t) > sizeof(unsigned long)) {
>
> Need a macro like PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA.
>
> > +               p->dma_addr = lower_32_bits(addr);
> > +               p->pp_magic = upper_32_bits(addr);
>
> And this uses three fields on page_pool which I'm not sure the other
> maintainers are happy with. For example, re-using pp_maing might be
> dangerous. See c07aea3ef4d40 ("mm: add a signature in struct page").
>
> I think a more safe way is to reuse page pool, for example introducing
> a new flag with dma callbacks?

If we use page pool, how can we chain the pages allocated for a packet?

Yon know the "private" can not be used.

If the pp struct inside the page is not safe, how about:

		struct {	/* Page cache and anonymous pages */
			/**
			 * @lru: Pageout list, eg. active_list protected by
			 * lruvec->lru_lock.  Sometimes used as a generic list
			 * by the page owner.
			 */
			union {
				struct list_head lru;

				/* Or, for the Unevictable "LRU list" slot */
				struct {
					/* Always even, to negate PageTail */
					void *__filler;
					/* Count page's or folio's mlocks */
					unsigned int mlock_count;
				};

				/* Or, free page */
				struct list_head buddy_list;
				struct list_head pcp_list;
			};
			/* See page-flags.h for PAGE_MAPPING_FLAGS */
			struct address_space *mapping;
			union {
				pgoff_t index;		/* Our offset within mapping. */
				unsigned long share;	/* share count for fsdax */
			};
			/**
			 * @private: Mapping-private opaque data.
			 * Usually used for buffer_heads if PagePrivate.
			 * Used for swp_entry_t if PageSwapCache.
			 * Indicates order in the buddy system if PageBuddy.
			 */
			unsigned long private;
		};

Or, we can map the private space of the page as a new structure.

Thanks.

>
> Thanks
>