On Wed, Aug 14, 2024 at 8:05 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > > On 2024/8/15 5:00, Alexander H Duyck wrote: ... > >> +static inline void page_frag_alloc_abort(struct page_frag_cache *nc, > >> + unsigned int fragsz) > >> +{ > >> + nc->pagecnt_bias++; > >> + nc->remaining += fragsz; > >> +} > >> + > > > > This doesn't add up. Why would you need abort if you have commit? Isn't > > this more of a revert? I wouldn't think that would be valid as it is > > possible you took some sort of action that might have resulted in this > > memory already being shared. We shouldn't allow rewinding the offset > > pointer without knowing that there are no other entities sharing the > > page. > > This is used for __tun_build_skb() in drivers/net/tun.c as below, mainly > used to avoid performance penalty for XDP drop case: Yeah, I reviewed that patch. As I said there, rewinding the offset should be avoided unless you can verify you are the only owner of the page as you have no guarantees that somebody else didn't take an access to the page/data to send it off somewhere else. Once you expose the page to any other entity it should be written off or committed in your case and you should move on to the next block. > > >> +static struct page *__page_frag_cache_reload(struct page_frag_cache *nc, > >> + gfp_t gfp_mask) > >> { > >> + struct page *page; > >> + > >> if (likely(nc->encoded_va)) { > >> - if (__page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_bias)) > >> + page = __page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_bias); > >> + if (page) > >> goto out; > >> } > >> > >> - if (unlikely(!__page_frag_cache_refill(nc, gfp_mask))) > >> - return false; > >> + page = __page_frag_cache_refill(nc, gfp_mask); > >> + if (unlikely(!page)) > >> + return NULL; > >> > >> out: > >> /* reset page count bias and remaining to start of new frag */ > >> nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; > >> nc->remaining = page_frag_cache_page_size(nc->encoded_va); > >> - return true; > >> + return page; > >> +} > >> + > > > > None of the functions above need to be returning page. > > Are you still suggesting to always use virt_to_page() even when it is > not really necessary? why not return the page here to avoid the > virt_to_page()? Yes. The likelihood of you needing to pass this out as a page should be low as most cases will just be you using the virtual address anyway. You are essentially trading off branching for not having to use virt_to_page. It is unnecessary optimization. > > >> +struct page *page_frag_alloc_pg(struct page_frag_cache *nc, > >> + unsigned int *offset, unsigned int fragsz, > >> + gfp_t gfp) > >> +{ > >> + unsigned int remaining = nc->remaining; > >> + struct page *page; > >> + > >> + VM_BUG_ON(!fragsz); > >> + if (likely(remaining >= fragsz)) { > >> + unsigned long encoded_va = nc->encoded_va; > >> + > >> + *offset = page_frag_cache_page_size(encoded_va) - > >> + remaining; > >> + > >> + return virt_to_page((void *)encoded_va); > >> + } > >> + > >> + if (unlikely(fragsz > PAGE_SIZE)) > >> + return NULL; > >> + > >> + page = __page_frag_cache_reload(nc, gfp); > >> + if (unlikely(!page)) > >> + return NULL; > >> + > >> + *offset = 0; > >> + nc->remaining = remaining - fragsz; > >> + nc->pagecnt_bias--; > >> + > >> + return page; > >> } > >> +EXPORT_SYMBOL(page_frag_alloc_pg); > > > > Again, this isn't returning a page. It is essentially returning a > > bio_vec without calling it as such. You might as well pass the bio_vec > > pointer as an argument and just have it populate it directly. > > I really don't think your bio_vec suggestion make much sense for now as > the reason mentioned in below: > > "Through a quick look, there seems to be at least three structs which have > similar values: struct bio_vec & struct skb_frag & struct page_frag. > > As your above agrument about using bio_vec, it seems it is ok to use any > one of them as each one of them seems to have almost all the values we > are using? > > Personally, my preference over them: 'struct page_frag' > 'struct skb_frag' > > 'struct bio_vec', as the naming of 'struct page_frag' seems to best match > the page_frag API, 'struct skb_frag' is the second preference because we > mostly need to fill skb frag anyway, and 'struct bio_vec' is the last > preference because it just happen to have almost all the values needed. That is why I said I would be okay with us passing page_frag in patch 12 after looking closer at the code. The fact is it should make the review of that patch set much easier if you essentially just pass the page_frag back out of the call. Then it could be used in exactly the same way it was before and should reduce the total number of lines of code that need to be changed. > Is there any specific reason other than the above "almost all the values you > are using are exposed by that structure already " that you prefer bio_vec?" > > 1. https://lore.kernel.org/all/ca6be29e-ab53-4673-9624-90d41616a154@xxxxxxxxxx/ My reason for preferring bio_vec is that of the 3 it is the most setup to be used as a local variable versus something stored in a struct such as page_frag or used for some specialty user case such as skb_frag_t. In addition it already has a set of helpers for converting it to a virtual address or copying data to and from it which would make it easier to get rid of a bunch of duplicate code.