Re: [RFC PATCH 0/1] Large folios in block buffered IO path

On 29-Nov-24 5:01 AM, Mateusz Guzik wrote:
On Thu, Nov 28, 2024 at 12:24 PM Bharata B Rao <bharata@xxxxxxx> wrote:

On 28-Nov-24 10:07 AM, Bharata B Rao wrote:
On 28-Nov-24 9:52 AM, Matthew Wilcox wrote:
On Thu, Nov 28, 2024 at 09:31:50AM +0530, Bharata B Rao wrote:
However, a point of concern is that FIO bandwidth comes down drastically
after the change.

                default                         inode_lock-fix
rw=30%
Instance 1      r=55.7GiB/s,w=23.9GiB/s         r=9616MiB/s,w=4121MiB/s
Instance 2      r=38.5GiB/s,w=16.5GiB/s         r=8482MiB/s,w=3635MiB/s
Instance 3      r=37.5GiB/s,w=16.1GiB/s         r=8609MiB/s,w=3690MiB/s
Instance 4      r=37.4GiB/s,w=16.0GiB/s         r=8486MiB/s,w=3637MiB/s

Something this dramatic usually only happens when you enable a debugging
option.  Can you recheck that you're running both A and B with the same
debugging options both compiled in, and enabled?

It is the same kernel tree with and w/o Mateusz's inode_lock changes to
block/fops.c. I see the config remains the same for both builds.

Let me get a run for both the base and patched cases w/o running perf lock
contention to check if that makes a difference.

Without perf lock contention

                  default                         inode_lock-fix
rw=30%
Instance 1      r=54.6GiB/s,w=23.4GiB/s         r=11.4GiB/s,w=4992MiB/s
Instance 2      r=52.7GiB/s,w=22.6GiB/s         r=11.4GiB/s,w=4981MiB/s
Instance 3      r=53.3GiB/s,w=22.8GiB/s         r=12.7GiB/s,w=5575MiB/s
Instance 4      r=37.7GiB/s,w=16.2GiB/s         r=10.4GiB/s,w=4581MiB/s
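
To put a single number on the drop, here is a minimal Python sketch that sums the
four instances from the table above and prints the aggregate slowdown. The helper
names (to_gib_per_s, total) are made up for illustration, and the MiB/s to GiB/s
conversion assumes FIO's binary units (1 GiB = 1024 MiB).

```python
import re

def to_gib_per_s(text):
    """Convert an FIO bandwidth string such as '54.6GiB/s' or '4992MiB/s' to GiB/s."""
    match = re.fullmatch(r"([0-9.]+)(GiB|MiB)/s", text)
    if match is None:
        raise ValueError(f"unrecognised bandwidth: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value if unit == "GiB" else value / 1024.0

# (read, write) per instance, copied from the "Without perf lock contention" table.
default = [
    ("54.6GiB/s", "23.4GiB/s"),
    ("52.7GiB/s", "22.6GiB/s"),
    ("53.3GiB/s", "22.8GiB/s"),
    ("37.7GiB/s", "16.2GiB/s"),
]
patched = [
    ("11.4GiB/s", "4992MiB/s"),
    ("11.4GiB/s", "4981MiB/s"),
    ("12.7GiB/s", "5575MiB/s"),
    ("10.4GiB/s", "4581MiB/s"),
]

def total(rows, column):
    """Sum one column (0 = read, 1 = write) across all instances, in GiB/s."""
    return sum(to_gib_per_s(row[column]) for row in rows)

read_ratio = total(default, 0) / total(patched, 0)
write_ratio = total(default, 1) / total(patched, 1)

print(f"aggregate read  slowdown: {read_ratio:.2f}x")
print(f"aggregate write slowdown: {write_ratio:.2f}x")
```

Both ratios come out a little above 4x, so reads and writes degrade by about the
same factor, as expected for a fixed rw=30% mix.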


Per my other e-mail, can you follow willy's suggestion and increase the hash table size?

With Mateusz's inode_lock fix and PAGE_WAIT_TABLE_BITS values of 10, 14, 16 and 20:
(The two values given for each instance below are FIO READ bw and WRITE bw.)

                10              14              16              20
rw=30%
Instance 1      11.3GiB/s       14.2GiB/s       14.8GiB/s       14.9GiB/s
                4965MiB/s       6225MiB/s       6487MiB/s       6552MiB/s
Instance 2      12.3GiB/s       10.4GiB/s       10.9GiB/s       11.0GiB/s
                5389MiB/s       4548MiB/s       4770MiB/s       4815MiB/s
Instance 3      11.1GiB/s       12.3GiB/s       11.2GiB/s       13.5GiB/s
                4864MiB/s       5410MiB/s       4923MiB/s       5927MiB/s
Instance 4      12.3GiB/s       13.7GiB/s       13.0GiB/s       11.4GiB/s
                5404MiB/s       6004MiB/s       5689MiB/s       5007MiB/s

The number of hash buckets doesn't seem to matter all that much in this case.
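
As a rough cross-check, a small Python sketch that totals the READ bandwidth per
PAGE_WAIT_TABLE_BITS value from the table above and compares it with the unpatched
aggregate. The 198.3 GiB/s baseline is my own sum of the four default-kernel read
figures from the earlier run without perf lock contention, not a separately
measured number.

```python
# READ bandwidth in GiB/s for instances 1..4, keyed by PAGE_WAIT_TABLE_BITS,
# copied from the table above.
reads = {
    10: [11.3, 12.3, 11.1, 12.3],
    14: [14.2, 10.4, 12.3, 13.7],
    16: [14.8, 10.9, 11.2, 13.0],
    20: [14.9, 11.0, 13.5, 11.4],
}

# Assumption: sum of the default-kernel read figures (54.6 + 52.7 + 53.3 + 37.7)
# from the run without perf lock contention.
DEFAULT_READ_TOTAL = 198.3

totals = {bits: sum(values) for bits, values in reads.items()}

for bits, aggregate in sorted(totals.items()):
    share = 100.0 * aggregate / DEFAULT_READ_TOTAL
    print(f"bits={bits:2d}  total read={aggregate:5.1f} GiB/s  ({share:4.1f}% of default)")

spread = max(totals.values()) - min(totals.values())
print(f"spread across hash sizes: {spread:.1f} GiB/s")
```

Going from 10 to 20 bits moves the aggregate by only a few GiB/s, and every setting
stays at roughly a quarter of the unpatched total.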

Regards,
Bharata.



