On Mon, Apr 8, 2024 at 6:39 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > > On 2024/4/8 1:52, Alexander H Duyck wrote: > > On Sun, 2024-04-07 at 21:08 +0800, Yunsheng Lin wrote: > >> We are above to use page_frag_alloc_*() API to not just > >> allocate memory for skb->data, but also use them to do > >> the memory allocation for skb frag too. Currently the > >> implementation of page_frag in mm subsystem is running > >> the offset as a countdown rather than count-up value, > >> there may have several advantages to that as mentioned > >> in [1], but it may have some disadvantages, for example, > >> it may disable skb frag coaleasing and more correct cache > >> prefetching > >> > >> We have a trade-off to make in order to have a unified > >> implementation and API for page_frag, so use a initial zero > >> offset in this patch, and the following patch will try to > >> make some optimization to aovid the disadvantages as much > >> as possible. > >> > >> 1. https://lore.kernel.org/all/f4abe71b3439b39d17a6fb2d410180f367cadf5c.camel@xxxxxxxxx/ > >> > >> CC: Alexander Duyck <alexander.duyck@xxxxxxxxx> > >> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx> > >> --- > >> mm/page_frag_cache.c | 31 ++++++++++++++----------------- > >> 1 file changed, 14 insertions(+), 17 deletions(-) > >> > >> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c > >> index a0f90ba25200..3e3e88d9af90 100644 > >> --- a/mm/page_frag_cache.c > >> +++ b/mm/page_frag_cache.c > >> @@ -67,9 +67,8 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, > >> unsigned int fragsz, gfp_t gfp_mask, > >> unsigned int align_mask) > >> { > >> - unsigned int size = PAGE_SIZE; > >> + unsigned int size, offset; > >> struct page *page; > >> - int offset; > >> > >> if (unlikely(!nc->va)) { > >> refill: > >> @@ -77,10 +76,6 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, > >> if (!page) > >> return NULL; > >> > >> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) > >> - /* if size can vary use size else just use PAGE_SIZE */ > >> - size = nc->size; > >> -#endif > >> /* Even if we own the page, we do not use atomic_set(). > >> * This would break get_page_unless_zero() users. > >> */ > >> @@ -89,11 +84,18 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, > >> /* reset page count bias and offset to start of new frag */ > >> nc->pfmemalloc = page_is_pfmemalloc(page); > >> nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; > >> - nc->offset = size; > >> + nc->offset = 0; > >> } > >> > >> - offset = nc->offset - fragsz; > >> - if (unlikely(offset < 0)) { > >> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) > >> + /* if size can vary use size else just use PAGE_SIZE */ > >> + size = nc->size; > >> +#else > >> + size = PAGE_SIZE; > >> +#endif > >> + > >> + offset = ALIGN(nc->offset, -align_mask); > >> + if (unlikely(offset + fragsz > size)) { > > > > Rather than using "ALIGN" with a negative value it would probably make > > more sense to use __ALIGN_KERNEL_MASK with ~align_mask. I am not sure > > how well the compiler sorts out the use of negatives to flip values > > that are then converted to masks with the "(a) - 1". > > The next patch will remove the '-' in '-align_mask' as the 'ALIGN' operation > is done in the inline helper. I am not sure that matter much to use > __ALIGN_KERNEL_MASK with ~align_mask? It is a matter of making the negations more obvious. Basically you could achieve the same alignment by doing: (offset + (~align_mask)) & ~(~align_mask) rather than: (offset + ((-align_mask) - 1)) & ~((-align_mask) - 1) I'm not sure the compiler will pick up on the fact that the two are identical and can save a number of operations. Also my suggested approach is closer to how it used to work. Technically the one you are using only works if align_mask is a negative power of 2.