On Tue, Jul 30, 2024 at 3:13 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jul 30, 2024 at 01:11:31AM +1200, Barry Song wrote:
> > for this zRAM case, it is a new allocated large folio, only
> > while all conditions are met, we will allocate and map
> > the whole folio. you can check can_swapin_thp() and
> > thp_swap_suitable_orders().
>
> YOU ARE DOING THIS WRONGLY!
>
> All of you anonymous memory people are utterly fixated on TLBs AND THIS
> IS WRONG. Yes, TLB performance is important, particularly with crappy
> ARM designs, which I know a lot of you are paid to work on. But you
> seem to think this is the only consideration, and you're making bad
> design choices as a result. It's overly complicated, and you're leaving
> performance on the table.
>
> Look back at the results Ryan showed in the early days of working on
> large anonymous folios. Half of the performance win on his system came
> from using larger TLBs. But the other half came from _reduced software
> overhead_. The LRU lock is a huge problem, and using large folios cuts
> the length of the LRU list, hence LRU lock hold time.
>
> Your _own_ data on how hard it is to get hold of a large folio due to
> fragmentation should be enough to convince you that the more large folios
> in the system, the better the whole system runs. We should not decline to
> allocate large folios just because they can't be mapped with a single TLB!

I am not convinced. For a newly allocated large folio, even
alloc_anon_folio() in do_anonymous_page() does exactly the same thing:

alloc_anon_folio()
{
        /*
         * Get a list of all the (large) orders below PMD_ORDER that are enabled
         * for this vma. Then filter out the orders that can't be allocated over
         * the faulting address and still be fully contained in the vma.
         */
        orders = thp_vma_allowable_orders(vma, vma->vm_flags,
                        TVA_IN_PF | TVA_ENFORCE_SYSFS, BIT(PMD_ORDER) - 1);
        orders = thp_vma_suitable_orders(vma, vmf->address, orders);
}

You are not going to allocate an mTHP at an unaligned address for a new
page fault. Please point out where this is wrong.

Thanks
Barry
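
For readers who want to see the filtering step from the comment above in isolation, here is a rough user-space sketch. It is not the kernel code: struct vma_range, order_fits(), suitable_orders(), MODEL_PAGE_SHIFT and MODEL_PMD_ORDER are hypothetical names made up for this illustration, 4 KiB pages are assumed, and only the anonymous-memory case is modelled. The idea is that an order survives only if the naturally aligned block around the faulting address lies entirely inside the vma.

```c
#include <stdbool.h>

/* Assumptions for this model only: 4 KiB pages, PMD order of 9. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PMD_ORDER  9

/* Hypothetical stand-in for the [vm_start, vm_end) range of a vma. */
struct vma_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Return true if a naturally aligned block of 2^order pages that covers
 * addr lies entirely inside the vma.
 */
static bool order_fits(const struct vma_range *vma, unsigned long addr,
		       int order)
{
	unsigned long size = 1UL << (MODEL_PAGE_SHIFT + order);
	unsigned long base = addr & ~(size - 1);	/* align down */

	return base >= vma->start && base + size <= vma->end;
}

/*
 * Walk the candidate orders from highest to lowest and clear every order
 * whose aligned block does not fit. Stop at the first order that fits:
 * a smaller aligned block around the same address is nested inside the
 * larger one, so all lower orders fit as well.
 */
static unsigned long suitable_orders(const struct vma_range *vma,
				     unsigned long addr,
				     unsigned long orders)
{
	for (int order = MODEL_PMD_ORDER; order >= 0; order--) {
		if (!(orders & (1UL << order)))
			continue;
		if (order_fits(vma, addr, order))
			break;
		orders &= ~(1UL << order);
	}
	return orders;
}
```

In this model, a vma covering [0x10000, 0x20000) with a fault at 0x13000 reduces the candidate mask 0x1ff (orders 0 to 8) to 0x1f, because the largest aligned block that still fits is 64 KiB (order 4). If the vma instead starts at 0x11000, the same fault keeps only orders 0 and 1.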