On 11/05/2022 03:07, Ming Lei wrote:
Hi Ming,
Spreading the memory out does probably make sense, but we need to retain
the fast normal case. Making sbitmap support both, selected at init
time, would be far more likely to be acceptable imho.
I wanted to keep the code changes minimal for an initial RFC to test the
waters.
My original approach did not introduce the extra load for the normal path
and had init-time selection of a normal word map vs a numa word map, but
the code grew and became somewhat unmanageable. I'll revisit it to see how
to improve that.
I understand this approach just splits the shared sbitmap into
per-numa-node parts, but what if all IOs come from CPUs in the same numa
node? Wouldn't this cause tag starvation and waste?
We would not do this. If we can't find a free bit in one node then we
need to check the other nodes before giving up. This is some of the added
complexity which I hinted at. And things like batch get or RR support
become more complex.
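
To illustrate the fallback, here is a rough userspace sketch and not the
actual sbitmap code. The names node_map, node_get(), numa_get(), NR_NODES
and BITS_PER_NODE are all made up for the example, and it uses plain
non-atomic bit operations where the real code would need atomic ones. It
only shows the search order: try the home node first, then the remaining
nodes, and fail only when every node is full.

```c
#define NR_NODES      2   /* hypothetical node count */
#define BITS_PER_NODE 4   /* hypothetical number of tags per node */

/* One small bitmap per node; a set bit means the tag is in use. */
struct node_map {
	unsigned long word;
};

/* Claim the first free bit in one node's map; return its index or -1. */
static int node_get(struct node_map *nm)
{
	for (int bit = 0; bit < BITS_PER_NODE; bit++) {
		if (!(nm->word & (1UL << bit))) {
			nm->word |= 1UL << bit;	/* not atomic: sketch only */
			return bit;
		}
	}
	return -1;
}

/* Start at the caller's home node, then fall back to the other nodes. */
static int numa_get(struct node_map *maps, int home)
{
	for (int i = 0; i < NR_NODES; i++) {
		int node = (home + i) % NR_NODES;
		int bit = node_get(&maps[node]);

		if (bit >= 0)
			return node * BITS_PER_NODE + bit;
	}
	return -1;	/* every node is exhausted */
}
```

With all requests coming from node 0, this hands out tags 0-3 from node 0,
then tags 4-7 from node 1, and returns -1 only once both are full, so no
tags are wasted. The price is the extra loop over nodes in the slow path.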
Alternatively we could have the double pointer for numa spreading only,
which would make things simpler. I need to check which is better overall.
Adding the complexity for dealing with numa node sub-arrays may also
affect performance.
Thanks,
John