On Thu, Nov 28, 2024 at 12:24 PM Bharata B Rao <bharata@xxxxxxx> wrote:
>
> On 28-Nov-24 10:07 AM, Bharata B Rao wrote:
> > On 28-Nov-24 9:52 AM, Matthew Wilcox wrote:
> >> On Thu, Nov 28, 2024 at 09:31:50AM +0530, Bharata B Rao wrote:
> >>> However a point of concern is that FIO bandwidth comes down drastically
> >>> after the change.
> >>>
> >>>                 default                         inode_lock-fix
> >>> rw=30%
> >>> Instance 1      r=55.7GiB/s,w=23.9GiB/s         r=9616MiB/s,w=4121MiB/s
> >>> Instance 2      r=38.5GiB/s,w=16.5GiB/s         r=8482MiB/s,w=3635MiB/s
> >>> Instance 3      r=37.5GiB/s,w=16.1GiB/s         r=8609MiB/s,w=3690MiB/s
> >>> Instance 4      r=37.4GiB/s,w=16.0GiB/s         r=8486MiB/s,w=3637MiB/s
> >>
> >> Something this dramatic usually only happens when you enable a debugging
> >> option. Can you recheck that you're running both A and B with the same
> >> debugging options both compiled in, and enabled?
> >
> > It is the same kernel tree with and w/o Mateusz's inode_lock changes to
> > block/fops.c. I see the config remains same for both the builds.
> >
> > Let me get a run for both base and patched case w/o running perf lock
> > contention to check if that makes a difference.
>
> Without perf lock contention
>
>                 default                         inode_lock-fix
> rw=30%
> Instance 1      r=54.6GiB/s,w=23.4GiB/s         r=11.4GiB/s,w=4992MiB/s
> Instance 2      r=52.7GiB/s,w=22.6GiB/s         r=11.4GiB/s,w=4981MiB/s
> Instance 3      r=53.3GiB/s,w=22.8GiB/s         r=12.7GiB/s,w=5575MiB/s
> Instance 4      r=37.7GiB/s,w=16.2GiB/s         r=10.4GiB/s,w=4581MiB/s
>

per my other e-mail can you follow willy's suggestion and increase the hash?

best case scenario this takes care of it and then some heuristic can be
added how to autosize the thing.

If someone feels like microoptimizing I also note there is magic infra
to have the size hotpatchable into generated asm instead of it being
read (see dentry cache as an example user).

--
Mateusz Guzik <mjguzik gmail.com>