On Tue, Jan 02, 2024 at 05:26:20PM +0100, David Sterba wrote:
> On Fri, Dec 22, 2023 at 05:59:34PM +0800, kernel test robot wrote:
> >
> > Hello,
> >
> > kernel test robot noticed a -18.0% regression of stress-ng.link.ops_per_sec on:
> >
> > commit: 8d993618350c86da11cb408ba529c13e83d09527 ("btrfs: migrate get_eb_page_index() and get_eb_offset_in_page() to folios")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> Unfortunately the conversion to folios adds a lot of assembly code and
> we can't rely on constants like PAGE_SIZE anymore. The calculations in
> extent buffer members are therefore slower, 18% is a lot but within my
> expected range for metadata-only operations.
>
> This could be improved by caching some values, like folio_size, so it's
> a dereference and not a calculation of "PAGE_SIZE << folio_order" with
> conditionals around.

You're in the unfortunate position of paying all the costs of a variable
folio size while not getting the benefit of variable folio sizes ...

There's no space in struct folio to cache folio_size(). It's an loff_t,
so potentially huge. Also there are people who have designs on the
remaining space in struct folio for a variety of purposes.

Would it be better to be PAGE_SIZE * folio_nr_pages(), which is cached?
That's at least dereference, then shift-variable-by-constant, rather than
dereference, shift-constant-by-variable.
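
To make the two forms concrete, here is a minimal userspace sketch of the
arithmetic being compared. It is not kernel code: struct folio_model,
folio_model_init(), MODEL_PAGE_SHIFT and MODEL_PAGE_SIZE are hypothetical
stand-ins, a 4 KiB page is assumed, and the conditionals around the real
helpers are left out. It only shows that both forms give the same size and
which operand of the shift is the variable one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PAGE_SHIFT/PAGE_SIZE; 4 KiB pages assumed. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical model of the two fields discussed; not the real struct folio. */
struct folio_model {
	unsigned int order;      /* log2 of the number of pages */
	unsigned long nr_pages;  /* cached page count, 1UL << order */
};

static struct folio_model folio_model_init(unsigned int order)
{
	struct folio_model f;

	assert(order < 20);      /* keep the model's shifts well in range */
	f.order = order;
	f.nr_pages = 1UL << order;
	return f;
}

/* Dereference, then shift a constant by a variable amount. */
static size_t size_from_order(const struct folio_model *f)
{
	return MODEL_PAGE_SIZE << f->order;
}

/* Dereference, then shift a variable by a constant amount. */
static size_t size_from_nr_pages(const struct folio_model *f)
{
	return (size_t)f->nr_pages << MODEL_PAGE_SHIFT;
}
```

Whether the second form is measurably cheaper depends on the architecture
and on what the compiler does with the surrounding code; the sketch says
nothing about that and would need to be checked against the generated
assembly.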