On Mon, Mar 22, 2021 at 08:32:54PM +0000, Chuck Lever III wrote:
> > It's not expected that the array implementation would be worse *unless*
> > you are passing in arrays with holes in the middle. Otherwise, the success
> > rate should be similar.
>
> Essentially, sunrpc will always pass an array with a hole.
> Each RPC consumes the first N elements in the rq_pages array.
> Sometimes N == ARRAY_SIZE(rq_pages). AFAIK sunrpc will not
> pass in an array with more than one hole. Typically:
>
> .....PPPP
>
> My results show that, because svc_alloc_arg() ends up calling
> __alloc_pages_bulk() twice in this case, it ends up being
> twice as expensive as the list case, on average, for the same
> workload.

Can you call memmove() to shift all the pointers down to be the
first N elements?  That prevents creating a situation where we have

PPPPPPPP (consume 6)
......PP (try to allocate 6, only 4 available)
PPPP..PP

instead, you'd do:

PPPPPPPP (consume 6)
PP...... (try to allocate 6, only 4 available)
PPPPPP..

Alternatively, you could consume from the tail of the array instead of
the head.  Some CPUs aren't as effective about backwards walks as they
are for forwards walks, but let's keep the pressure on CPU manufacturers
to make better CPUs.
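
For illustration, here is a minimal userspace sketch of the memmove() idea.
It assumes the consumed entries are always the first N slots of the array,
as described for rq_pages above.  compact_pages() is a hypothetical name,
not an existing sunrpc or mm function, and real kernel code would operate
on struct page pointers rather than void pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): the first `consumed` entries
 * of pages[0..nr) have been used up.  Slide the surviving tail down to
 * index 0 and clear the vacated slots, so the array has at most one
 * hole and that hole sits at the end.
 *
 * Returns the number of populated entries now at the head.
 */
static size_t compact_pages(void **pages, size_t nr, size_t consumed)
{
	size_t remaining, i;

	assert(pages != NULL && consumed <= nr);
	remaining = nr - consumed;

	/* memmove() handles overlapping source and destination ranges. */
	memmove(pages, pages + consumed, remaining * sizeof(*pages));

	/* Everything past the survivors is now free for refilling. */
	for (i = remaining; i < nr; i++)
		pages[i] = NULL;

	return remaining;
}
```

With 8 entries and 6 consumed, this turns ......PP into PP......, so a bulk
allocation that only manages to return 4 pages leaves PPPPPP.. rather than
PPPP..PP.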