On Fri, Aug 16, 2024 at 4:56 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > > On 2024/8/15 23:03, Alexander Duyck wrote: > > On Wed, Aug 14, 2024 at 8:10 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > >> > >> On 2024/8/15 0:13, Alexander H Duyck wrote: > >>> On Thu, 2024-08-08 at 20:37 +0800, Yunsheng Lin wrote: > >>>> Currently there is one 'struct page_frag' for every 'struct > >>>> sock' and 'struct task_struct', we are about to replace the > >>>> 'struct page_frag' with 'struct page_frag_cache' for them. > >>>> Before begin the replacing, we need to ensure the size of > >>>> 'struct page_frag_cache' is not bigger than the size of > >>>> 'struct page_frag', as there may be tens of thousands of > >>>> 'struct sock' and 'struct task_struct' instances in the > >>>> system. > >>>> > >>>> By or'ing the page order & pfmemalloc with lower bits of > >>>> 'va' instead of using 'u16' or 'u32' for page size and 'u8' > >>>> for pfmemalloc, we are able to avoid 3 or 5 bytes space waste. > >>>> And page address & pfmemalloc & order is unchanged for the > >>>> same page in the same 'page_frag_cache' instance, it makes > >>>> sense to fit them together. > >>>> > >>>> After this patch, the size of 'struct page_frag_cache' should be > >>>> the same as the size of 'struct page_frag'. > >>>> > >>>> CC: Alexander Duyck <alexander.duyck@xxxxxxxxx> > >>>> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx> > >>>> --- > >>>> include/linux/mm_types_task.h | 16 +++++----- > >>>> include/linux/page_frag_cache.h | 52 +++++++++++++++++++++++++++++++-- > >>>> mm/page_frag_cache.c | 49 +++++++++++++++++-------------- > >>>> 3 files changed, 85 insertions(+), 32 deletions(-) > >>>> > >>>> diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h > >>>> index b1c54b2b9308..f2610112a642 100644 > >>>> --- a/include/linux/mm_types_task.h > >>>> +++ b/include/linux/mm_types_task.h > >>>> @@ -50,18 +50,18 @@ struct page_frag { > >>>> #define PAGE_FRAG_CACHE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) > >>>> #define PAGE_FRAG_CACHE_MAX_ORDER get_order(PAGE_FRAG_CACHE_MAX_SIZE) > >>>> struct page_frag_cache { > >>>> - void *va; > >>>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) > >>>> + /* encoded_va consists of the virtual address, pfmemalloc bit and order > >>>> + * of a page. > >>>> + */ > >>>> + unsigned long encoded_va; > >>>> + > >>> > >>> Rather than calling this an "encoded_va" we might want to call this an > >>> "encoded_page" as that would be closer to what we are actually working > >>> with. We are just using the virtual address as the page pointer instead > >>> of the page struct itself since we need quicker access to the virtual > >>> address than we do the page struct. > >> > >> Calling it "encoded_page" seems confusing enough when calling virt_to_page() > >> with "encoded_page" when virt_to_page() is expecting a 'va', no? > > > > It makes about as much sense as calling it an "encoded_va". What you > > have is essentially a packed page struct that contains the virtual > > address, pfmemalloc flag, and order. So if you want you could call it > > "packed_page" too I suppose. Basically this isn't a valid virtual > > address it is a page pointer with some extra metadata packed in. > > I think we are all argeed that is not a valid virtual address by adding > the 'encoded_' part. > I am not really sure if "encoded_page" or "packed_page" is better than > 'encoded_va' here, as there is no 'page pointer' that is implied by > "encoded_page" or "packed_page" here. For 'encoded_va', at least there > is 'virtual address' that is implied by 'encoded_va', and that 'virtual > address' just happen to be page pointer. Basically we are using the page's virtual address to encode the page into the struct. If you look, "virtual" is a pointer stored in the page to provide the virtual address on some architectures. It also happens that we have virt_to_page which provides an easy way to get back and forth between the values. > Yes, you may say the 'pfmemalloc flag and order' part is about page, not > about 'va', I guess there is trade-off we need to make here if there is > not a perfect name for it and 'va' does occupy most bits of 'encoded_va'. The naming isn't really a show stopper one way or another. It was more the fact that you had several functions accessing it that were using the name "encoded_page" as I recall. That is why I thought it might make sense to rename it to that. Why have functions called "encoded_page_order" work with an "encoded_va" versus an "encoded_page". It makes it easier to logically lump them all together.