On Thu, Aug 31, 2023 at 03:21:53PM +0100, Ryan Roberts wrote: > On 25/08/2023 14:59, Matthew Wilcox (Oracle) wrote: > > By making release_pages() call folios_put(), we can get rid of the calls > > to compound_head() for the callers that already know they have folios. > > We can also get rid of the lock_batch tracking as we know the size of > > the batch is limited by folio_batch. > > - /* > > - * Make sure the IRQ-safe lock-holding time does not get > > - * excessive with a continuous string of pages from the > > - * same lruvec. The lock is held only if lruvec != NULL. > > - */ > > - if (lruvec && ++lock_batch == SWAP_CLUSTER_MAX) { > > SWAP_CLUSTER_MAX is 32. By using the folio_batch, I think you are limitted to 15 > in your batch, so I guess you could be taking/releasing the lock x2 as often? Is > there any perf implication? Yes, if the batch size is larger than 15, we'll take/release the lru lock more often. We could increase the size of the folio_batch if that becomes a problem. I'm not sure how often it's a problem; we already limit the number of folios to process to 15 in, eg, folio_batch_add_and_move(). I'm not really sure why this code gets to be special and hold the lock for twice as long as the callers of folio_batch_add_and_move. > > @@ -1040,6 +1020,40 @@ void release_pages(release_pages_arg arg, int nr) > > > > mem_cgroup_uncharge_list(&pages_to_free); > > free_unref_page_list(&pages_to_free); > > + folios->nr = 0; > > folio_batch_reinit(folios) ? I don't really like the abstraction here. Back to folio_batch_move_lru() as an example: for (i = 0; i < folio_batch_count(fbatch); i++) { struct folio *folio = fbatch->folios[i]; ... } ... folio_batch_reinit(fbatch); vs what I'd rather write: for (i = 0; i < fbatch->nr; i++) { struct folio *folio = fbatch->folios[i]; ... } ... fbatch->nr = 0; OK, we've successfully abstracted away that there is a member of folio_batch called 'nr', but we still have to go poking around inside folio_batch to extract the folio itself. So it's not like we've managed to make folio_batch a completely opaque type. And I don't think that folio_batch_count() is really all that much more descriptive than fbatch->nr. Indeed, I think the second one is easier to read; it's obviously a plain loop. I suppose that folio_batch_count() / _reinit() are easier to grep for than '>nr\>' but I don't think that's a particularly useful thing to do. We could add abstractions to get the folio_batch_folio(fbatch, i), but when we start to get into something like folio_batch_remove_exceptionals() (and there's something similar happening in this patchset where we strip out the hugetlb folios), you're messing with the internal structure of the folio_batch so much that you may as well not bother with any kind of abstraction. I'm really temped to rip out folio_batch_count() and folio_batch_reinit(). They don't seem useful enough.