> On Mar 13, 2021, at 11:39 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sat, Mar 13, 2021 at 01:16:48PM +0000, Mel Gorman wrote:
>>> I'm not claiming the pagevec is definitely a win, but it's very
>>> unclear which tradeoff is actually going to lead to better performance.
>>> Hopefully Jesper or Chuck can do some tests and figure out what actually
>>> works better with their hardware & usage patterns.
>>
>> The NFS user is often going to need to make round trips to get the pages it
>> needs. The pagevec would have to be copied into the target array meaning
>> it's not much better than a list manipulation.
>
> I don't think you fully realise how bad CPUs are at list manipulation.
> See the attached program (and run it on your own hardware). On my
> less-than-a-year-old core-i7:
>
> $ gcc -W -Wall -O2 -g array-vs-list.c -o array-vs-list
> $ ./array-vs-list
> walked sequential array in 0.001765s
> walked sequential list in 0.002920s
> walked sequential array in 0.001777s
> walked shuffled list in 0.081955s
> walked shuffled array in 0.007367s
>
> If you happen to get the objects in-order, it's only 64% worse to walk
> a list as an array. If they're out of order, it's *11.1* times as bad.
> <array-vs-list.c>

IME lists are indeed less CPU-efficient, but I wonder if that
expense is insignificant compared to serialization primitives like
disabling and re-enabling IRQs, which we are avoiding by using
bulk page allocation.

My initial experience with the current interface left me feeling
uneasy about re-using the lru list field. That seems to expose an
internal API feature to consumers of the page allocator. If we
continue with a list-centric bulk allocator API I hope there can
be some conveniently-placed documentation that explains when it is
safe to use that field. Or perhaps the field should be renamed.

I have a mild preference for an array-style interface because that's
more natural for the NFSD consumer, but I'm happy to have a bulk
allocator either way.

Purely from a code-reuse point of view, I wonder how many consumers
of alloc_pages_bulk() will be like svc_alloc_arg(), where they need
to fill in pages in an array. Each such consumer would need to
repeat the logic to convert the returned list into an array. We
have, for instance, release_pages(), which is an array-centric page
allocator API. Maybe a helper function or two might prevent
duplication of the list conversion logic.

And I agree with Mel that passing a single large array seems more
useful than having to build code at each consumer call-site to
iterate over smaller page_vecs until that array is filled.

--
Chuck Lever
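
For illustration only, the list-to-array conversion that each array-style
consumer would otherwise repeat might look roughly like the userspace sketch
below. Nothing here is kernel code: struct fake_page, its next field (standing
in for the reused lru linkage) and fill_array_from_list() are hypothetical
names invented for this example, and a real helper would operate on
struct page and struct list_head instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not a kernel type. */
struct fake_page {
	struct fake_page *next;	/* models the reused list linkage */
	int id;
};

/*
 * Hypothetical helper: pop pages off the head of *list and store them
 * in the empty (NULL) slots of array[0..nr). Slots that already hold a
 * page are left alone. Returns the number of slots filled; any pages
 * that did not fit remain on *list.
 */
size_t fill_array_from_list(struct fake_page **list,
			    struct fake_page **array, size_t nr)
{
	size_t filled = 0;

	assert(list != NULL && array != NULL);

	for (size_t i = 0; i < nr && *list != NULL; i++) {
		struct fake_page *page;

		if (array[i] != NULL)
			continue;	/* slot already populated */

		page = *list;
		*list = page->next;	/* unlink from the list */
		page->next = NULL;	/* linkage is no longer meaningful */
		array[i] = page;
		filled++;
	}

	return filled;
}
```

A caller in the style of svc_alloc_arg() would pass its page array and the
list returned by the bulk allocator, then check whether the return value
covers the number of empty slots it started with. With an array-style
allocator interface this copy step would not be needed at all.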