On Mon, 11 Nov 2024, Suren Baghdasaryan wrote:
> To minimize memory overhead, vm_lock implementation is changed from using rw_semaphore (40 bytes) to an atomic (8 bytes) and several vm_area_struct members are moved into the last cacheline, resulting in a less fragmented structure:
I am not a fan of building a custom lock, replacing a standard one. How much do we really care about this? rwsems are quite optimized and are known to heavily affect mm performance altogether. ...
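
To make the size trade-off under discussion concrete, here is a minimal userspace sketch, not the code from the patch. Everything in it is an assumption for illustration: `mock_rwsem` only approximates the fields of a non-debug rw_semaphore on a 64-bit (LP64) build, and `mock_atomic_lock` with its `mock_*` helpers is a hypothetical trylock-only reader/writer lock packed into one 8-byte word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a doubly linked list head: two pointers. */
struct mock_list_head {
    void *next, *prev;
};

/*
 * Assumed approximation of a non-debug rw_semaphore layout on LP64:
 * 8 + 8 + 4 + 4 + 16 = 40 bytes. Field names are illustrative only.
 */
struct mock_rwsem {
    int64_t count;                   /* reader/writer state */
    int64_t owner;                   /* owner tracking for optimistic spinning */
    int32_t osq_tail;                /* spin queue tail */
    int32_t wait_lock;               /* protects the wait list */
    struct mock_list_head wait_list; /* sleeping waiters */
};

/* Hypothetical single-word lock: bit 0 = writer held, upper bits = readers. */
struct mock_atomic_lock {
    _Atomic int64_t state;
};

#define MOCK_WRITER 1
#define MOCK_READER 2

/* Take a read reference unless a writer holds the lock. Never sleeps. */
static bool mock_read_trylock(struct mock_atomic_lock *l)
{
    int64_t old = atomic_load_explicit(&l->state, memory_order_relaxed);

    while (!(old & MOCK_WRITER)) {
        /* On failure, 'old' is reloaded and the writer bit is rechecked. */
        if (atomic_compare_exchange_weak_explicit(&l->state, &old,
                                                  old + MOCK_READER,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

static void mock_read_unlock(struct mock_atomic_lock *l)
{
    atomic_fetch_sub_explicit(&l->state, MOCK_READER, memory_order_release);
}

/* Succeeds only when there are no readers and no writer. Never sleeps. */
static bool mock_write_trylock(struct mock_atomic_lock *l)
{
    int64_t expected = 0;

    return atomic_compare_exchange_strong_explicit(&l->state, &expected,
                                                   MOCK_WRITER,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

static void mock_write_unlock(struct mock_atomic_lock *l)
{
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

The sketch only offers trylock semantics. The 32 bytes it saves relative to `mock_rwsem` are the owner, spin queue, and wait list fields, so a contended caller here has no way to sleep or spin fairly without extra machinery built around the word.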
> Performance measurements using pft test on x86 do not show considerable difference, on Pixel 6 running Android it results in 3-5% improvement in faults per second.
pft is very much a microbenchmark; these results do not justify this change, imo.

Thanks,
Davidlohr