On Thu, 2018-03-08 at 09:41 +0100, Hannes Reinecke wrote:
> IE the _entire_ request set is allocated as _one_ array, making it quite
> hard to handle from the lower-level CPU caches.
> Also the 'node' indicator doesn't really help us here, as the requests
> have to be accessed by all CPUs in the shared tag case.
>
> Would it be possible to move tags->rqs to become a _double_ pointer?
> Then we would have only a shared lookup table, but the requests
> themselves can be allocated per node, depending on the CPU map.
> _And_ it should be easier on the CPU cache ...

That is one possible solution. Another solution is to follow the approach
from sbitmap: allocate a single array that is slightly larger than needed
and use the elements in such a way that no two CPUs use the same cache
line.

Bart.
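
To illustrate the sbitmap-style layout, here is a rough userspace sketch, not
blk-mq code. It allocates one array, rounds each CPU's share of slots up to a
whole number of cache lines, and so never lets two CPUs' slots share a line.
Everything in it is an assumption made for illustration: the 64-byte
CACHE_LINE_SIZE (kernel code would use L1_CACHE_BYTES), the hypothetical
struct demo_request standing in for a real request, and the rq_pool_* helper
names. It also assumes each CPU mostly uses its own chunk, which is a
simplification of the shared tag case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64-byte cache lines (kernel code would use L1_CACHE_BYTES). */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for a request; much smaller than struct request. */
struct demo_request {
	int tag;
	void *payload;
};

/* The rounding below relies on a whole number of slots per cache line. */
static_assert(CACHE_LINE_SIZE % sizeof(struct demo_request) == 0,
	      "slot size must divide the cache line size");

struct rq_pool {
	struct demo_request *rqs;	/* one contiguous, aligned array */
	size_t per_cpu;			/* slots reserved for each CPU */
	size_t nr_cpus;
};

/* Slots per CPU, rounded up so each CPU's chunk fills whole cache lines. */
static size_t slots_per_cpu(size_t nr_rqs, size_t nr_cpus)
{
	size_t per_line = CACHE_LINE_SIZE / sizeof(struct demo_request);
	size_t per_cpu = (nr_rqs + nr_cpus - 1) / nr_cpus;

	return (per_cpu + per_line - 1) / per_line * per_line;
}

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
static int rq_pool_init(struct rq_pool *pool, size_t nr_rqs, size_t nr_cpus)
{
	if (nr_rqs == 0 || nr_cpus == 0)
		return -1;

	pool->nr_cpus = nr_cpus;
	pool->per_cpu = slots_per_cpu(nr_rqs, nr_cpus);
	/* Total size is a multiple of CACHE_LINE_SIZE, as aligned_alloc() needs. */
	pool->rqs = aligned_alloc(CACHE_LINE_SIZE,
				  nr_cpus * pool->per_cpu * sizeof(*pool->rqs));
	return pool->rqs ? 0 : -1;
}

/* i-th slot in the chunk owned by 'cpu'; the caller keeps i < per_cpu. */
static struct demo_request *rq_pool_slot(const struct rq_pool *pool,
					 size_t cpu, size_t i)
{
	return &pool->rqs[cpu * pool->per_cpu + i];
}

/* Index of the cache line that holds address p. */
static uintptr_t cache_line_of(const void *p)
{
	return (uintptr_t)p / CACHE_LINE_SIZE;
}

static void rq_pool_free(struct rq_pool *pool)
{
	free(pool->rqs);
	pool->rqs = NULL;
}
```

On a typical 64-bit target, where the demo struct is 16 bytes, asking for 10
requests across 4 CPUs yields 4 slots per CPU, 16 in total: slightly larger
than needed, but each CPU's chunk starts on its own cache line.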