Hi Darrick,

Thank you for running these tests!

On Tue, Feb 16, 2010 at 1:07 PM, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> On Fri, Jan 15, 2010 at 02:30:09PM -0500, Theodore Ts'o wrote:
>
>> The plan is to merge this for 2.6.34.  I've looked this over pretty
>> carefully, but another pair of eyes would be appreciated, especially if
>
> I don't have a high speed disk but it was suggested that I give this
> patchset a whirl anyway, so down the rabbit hole I went.  I created a 16GB
> ext4 image in an equally big tmpfs, then ran the read/readall directio
> tests in ffsb to see if I could observe any difference.  The kernel is
> 2.6.33-rc8, and the machine in question has 2 Xeon E5335 processors and
> 24GB of RAM.  I reran the test several times, with varying thread counts,
> to produce the table below.  The units are MB/s.
>
> For the dio_lock case, mount options were:
> rw,relatime,barrier=1,data=ordered.
> For the dio_nolock case, they were:
> rw,relatime,barrier=1,data=ordered,dioread_nolock.
>
>            dio_nolock          dio_lock
> threads  read   readall    read   readall
>    1     37.6     149      39       159
>    2     59.2     245      62.4     246
>    4      114     453     112       445
>    8      111     444     115       459
>   16      109     442     113       448
>   32      114     443     121       484
>   64      106     422     108       434
>  128      104     417     101       393
>  256      101     412      90.5     366
>  512     93.3     377      84.8     349
> 1000     87.1     353      88.7     348
>
> It would seem that the old code paths are faster with a small number of
> threads, but the new patch seems to be faster when the thread counts
> become very high.  That said, I'm not all that familiar with what exactly
> tmpfs does, or how well it mimics an SSD (though I wouldn't be surprised
> to hear "poorly").  This of course makes me wonder--do other people see
> results like this, or is this particular to my harebrained setup?

The dioread_nolock patch set eliminates the need to hold the i_mutex lock
during DIO reads.  That is why we usually see larger improvements as the
number of threads increases on high-speed SSDs, and the performance
difference becomes more obvious as the bandwidth of the device increases.

I am surprised to see the roughly 6% performance drop in the single-thread
case.  The dioread_nolock patches change the ext4 buffered-write code path
a lot, but on the DIO read path the only change is that we no longer grab
the i_mutex lock.  I haven't seen such a difference in my tests.  I mostly
use fio for performance comparisons; I will give ffsb a try.  Meanwhile,
could you also post the standard deviation (stdev) numbers?

> For that matter, do I need to have more patches than just 2.6.33-rc8 and
> the four posted in this thread?
>
> I also observed that I could make the kernel spit up "Process hung for
> more than 120s!" messages if I happened to be running ffsb on a real disk
> during a heavy directio write load.  I'll poke around on that a little
> more and write back when I have more details.

Did the hang happen only with dioread_nolock, or did it also happen without
the patches applied?  It is not surprising to see such messages on a slow
disk, since the processes are all waiting for I/O.

> For poweroff testing, could one simulate a power failure by running IO
> workloads in a VM and then SIGKILLing the VM?  I don't remember seeing any
> sort of powerfail test suite from the Googlers, but my mail client has
> been drinking out of firehoses lately. ;)

As far as I know, these numbers have not been posted yet but should come out
soon.
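
Going back to the DIO read numbers above: in case it helps with reproducing
this outside of ffsb, the reads being measured are ordinary O_DIRECT preads.
Below is a minimal sketch of what each benchmark thread effectively does;
the file path and the 4K block size are made-up values for illustration
only.

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLK 4096        /* assumed multiple of the logical block size */

int main(void)
{
        void *buf;
        int fd = open("/mnt/ext4/testfile", O_RDONLY | O_DIRECT);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (posix_memalign(&buf, BLK, BLK) != 0) {
                fprintf(stderr, "posix_memalign failed\n");
                return 1;
        }

        /* O_DIRECT requires an aligned buffer, offset, and length.  With
         * dioread_nolock, threads issuing reads like this concurrently no
         * longer serialize on the inode's i_mutex inside ext4. */
        if (pread(fd, buf, BLK, 0) < 0)
                perror("pread");

        free(buf);
        close(fd);
        return 0;
}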
Jiaying