On 2025/1/17 0:09, Jesper Dangaard Brouer wrote: ... >> Mainly because there is no space available for keeping tracking of inflight >> pages, using page->pp can only find the page_pool owning the page, but page_pool >> is not able to keep track of the inflight page when the page is handled by >> networking stack. >> >> By using page_pool_item as below, the state is used to tell if a specific >> item is being used/dma mapped or not by scanning all the item blocks in >> pool->item_blocks. If a specific item is used by a page, then 'pp_netmem' >> will point to that page so that dma unmapping can be done for that page >> when page_pool_destroy() is called, otherwise free items sit in the >> pool->hold_items or pool->release_items by using 'lentry': >> >> struct page_pool_item { >> unsigned long state; >> >> union { >> netmem_ref pp_netmem; >> struct llist_node lentry; >> }; >> }; > > pahole -C page_pool_item vmlinux > struct page_pool_item { > /* An 'encoded_next' is a pointer to next item, lower 2 bits is used to > * indicate the state of current item. > */ > long unsigned int encoded_next; /* 0 8 */ > union { > netmem_ref pp_netmem; /* 8 8 */ > struct llist_node lentry; /* 8 8 */ > }; /* 8 8 */ > > /* size: 16, cachelines: 1, members: 2 */ > /* last cacheline: 16 bytes */ > }; > > >> When a page is added to the page_pool, a item is deleted from pool->hold_items >> or pool->release_items and set the 'pp_netmem' pointing to that page and set >> 'state' accordingly in order to keep track of that page. >> >> When a page is deleted from the page_pool, it is able to tell which page_pool >> this page belong to by using the below function, and after clearing the 'state', >> the item is added back to pool->release_items so that the item is reused for new >> pages. >> > > To understand below, I'm listing struct page_pool_item_block for other > reviewers: > > pahole -C page_pool_item_block vmlinux > struct page_pool_item_block { > struct page_pool * pp; /* 0 8 */ > struct list_head list; /* 8 16 */ > unsigned int flags; /* 24 4 */ > refcount_t ref; /* 28 4 */ > struct page_pool_item items[]; /* 32 0 */ > > /* size: 32, cachelines: 1, members: 5 */ > /* last cacheline: 32 bytes */ > }; > >> static inline struct page_pool_item_block * >> page_pool_item_to_block(struct page_pool_item *item) >> { >> return (struct page_pool_item_block *)((unsigned long)item & PAGE_MASK); > > This trick requires some comments explaining what is going on! > Please correct me if I'm wrong: Here you a masking off the lower bits of > the pointer to page_pool_item *item, as you know that a struct > page_pool_item_block is stored in the top of a struct page. This trick > is like a "container_of" for going from page_pool_item to > page_pool_item_block, right? Yes, you are right. > > I do notice that you have a comment above struct page_pool_item_block > (that says "item_block is always PAGE_SIZE"), which is nice, but to be > more explicit/clear: > I want a big comment block (placed above the main code here) that > explains the design and intention behind this newly invented > "item-block" scheme, like e.g. the connection between > page_pool_item_block and page_pool_item. Like the advantage/trick that > allows page->pp pointer to be an "item" and be mapped back to a "block" > to find the page_pool object it belongs to. Don't write *what* the code > does, but write about the intended purpose and design reasons behind the > code. The comment for page_pool_item_block is below, it seems I also wrote about intended purpose and design reasons here. /* The size of item_block is always PAGE_SIZE, so that the address of item_block * for a specific item can be calculated using 'item & PAGE_MASK' */ Anyway, If putting something like above for page_pool_item_to_block() does make it clearer, will add some comment for page_pool_item_to_block() too. > > >> } >> >> static inline struct page_pool *page_pool_get_pp(struct page *page) >> { >> return page_pool_item_to_block(page->pp_item)->pp; >> } >> >> >>> >>>> +static void __page_pool_item_init(struct page_pool *pool, struct page *page) >>>> +{ >>> >>> Function name is confusing. First I though this was init'ing a single >>> item, but looking at the code it is iterating over ITEMS_PER_PAGE. >>> >>> Maybe it should be called page_pool_item_block_init ? >> >> The __page_pool_item_init() is added to make the below >> page_pool_item_init() function more readable or maintainable, changing >> it to page_pool_item_block_init doesn't seems consistent? > > You (of-cause) also have to rename the other function, I though that was > implicitly understood. > > BUT does my suggested rename make sense? What I'm seeing is that all > the *items* in the "block" is getting inited. But we are also setting up > the "block" (e.g. "block->pp=pool"). I am not really sure about that, as using the PAGE_SIZE block to hold the item seems like a implementation detail, which might change in the future, renaming other function to something like that doesn't seem right to me IMHO. Also the next patch will add page_pool_item_blk_add() to support unlimited inflight pages, it seems a better name is needed for that too, perheps rename page_pool_item_blk_add() to page_pool_dynamic_item_add()? For __page_pool_item_init(), perhaps just inline it back to page_pool_item_init() as __page_pool_item_init() is only used by page_pool_item_init(), and both of them are not really large function.