On Wed, Oct 06, 2021 at 05:25:23PM +0800, Hsin-Yi Wang wrote:
> Hi Matthew,
>
> We found that readahead performance regresses on multicore arm64
> platforms running the 5.10 kernel.
> - The platform we used: an 8-core (4x A53 (small), 4x A73 (big)) arm64 platform
> - The command we used: ureadahead $FILE ($FILE is a 1MB+ pack file;
>   note that if the file is small, the regression is not obvious)
>
> After we revert commit c1f6925e1091 ("mm: put readahead pages in
> cache earlier"), readahead performance is back to normal:
> - time ureadahead $FILE:
>   - 5.10: 1m23.124s
>   - with c1f6925e1091 reverted: 0m3.323s
>   - other LTS kernels (e.g. 5.4): 0m3.066s
>
> The slowest part is aops->readpage() in read_pages(), in the call
> read_pages(ractl, &page_pool, false); (the 3rd read_pages() call in
> page_cache_ra_unbounded()):

What filesystem are you using?

> static void read_pages(struct readahead_control *rac, struct list_head *pages,
> 		bool skip_page)
> {
> 	...
> 	if (aops->readahead) {
> 		...
> 	} else if (aops->readpages) {
> 		...
> 	} else {
> 		while ((page = readahead_page(rac))) {
> 			aops->readpage(rac->file, page); // most of the time is spent on this line
> 			put_page(page);
> 		}
> 	}
> 	...
> }
>
> We also found the following relevant measurements:
> - time ureadahead $FILE:
>   - 5.10
>     - taskset ureadahead to a small core: 0m7.411s
>     - taskset ureadahead to a big core: 0m5.982s
>
> Compared to the original 1m23s, pinning the ureadahead task to a
> single core also largely closes the gap.
>
> Do you have any idea why moving pages to the page cache earlier and
> then doing the page reads later causes such a difference?
>
> Thanks,
>
> Hsin-Yi
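
For context, the 5.10 page_cache_ra_unbounded() loop referred to above looks
roughly like the sketch below. It is trimmed and simplified from
mm/readahead.c (error handling and a few details omitted), so treat it as an
illustration rather than the verbatim source. For filesystems that only
implement ->readpage, each page is inserted into the page cache and LRU
inside this loop, and only afterwards does the final read_pages() call issue
->readpage for every page:

	void page_cache_ra_unbounded(struct readahead_control *ractl,
			unsigned long nr_to_read, unsigned long lookahead_size)
	{
		struct address_space *mapping = ractl->mapping;
		unsigned long index = readahead_index(ractl);
		LIST_HEAD(page_pool);
		gfp_t gfp_mask = readahead_gfp_mask(mapping);
		unsigned long i;

		for (i = 0; i < nr_to_read; i++) {
			struct page *page = xa_load(&mapping->i_pages, index + i);

			if (page && !xa_is_value(page)) {
				/* Page already cached: submit the current batch. */
				read_pages(ractl, &page_pool, true);	/* 1st call */
				continue;
			}

			page = __page_cache_alloc(gfp_mask);
			if (!page)
				break;
			if (mapping->a_ops->readpages) {
				/* Legacy ->readpages path: defer cache insertion. */
				page->index = index + i;
				list_add(&page->lru, &page_pool);
			} else if (add_to_page_cache_lru(page, mapping, index + i,
						gfp_mask) < 0) {
				put_page(page);
				read_pages(ractl, &page_pool, true);	/* 2nd call */
				continue;
			}
			if (i == nr_to_read - lookahead_size)
				SetPageReadahead(page);
			ractl->_nr_pages++;
		}

		/* Now start the I/O: the slow path reported above. */
		read_pages(ractl, &page_pool, false);		/* 3rd call */
	}

Before c1f6925e1091, the ->readpage fallback in read_pages() did the
add_to_page_cache_lru() itself, immediately before each ->readpage call, so
pages became visible in the mapping's i_pages XArray only as they were read;
that ordering change is what the commit title "put readahead pages in cache
earlier" refers to.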