Re: [Kernel 6.5] Important read()/write() performance regression

On Saturday, 30 September 2023 at 20:33 +0200, Greg KH wrote:
> On Sat, Sep 30, 2023 at 07:07:06PM +0200, Florent DELAHAYE wrote:
> > Hello guys,
> > 
> > During the last few months, I have noticed a performance regression
> > when using read() and write() on my high-speed NVMe SSD (about 7GB/s).
> > 
> > To get more precise information about it, I quickly developed a
> > benchmark tool that basically runs read() or write() in a loop to
> > simulate a sequential file read or write. The tool also measures the
> > real time consumed by the loop. Finally, the tool can call open() with
> > or without O_DIRECT.
> > 
> > I ran the tests on Ext4 and Exfat with the following settings (buffer
> > values were chosen for best results):
> > - Write settings: buffer 400MB * 100
> > - Read settings: buffer 200MB
> > - Drop caches before each non-direct read/write test
> > 
> > With this hardware:  
> > - CPU AMD Ryzen 7600X  
> > - RAM DDR5 5200 32GB  
> > - SSD Kingston Fury Renegade 4TB with 4K LBA
> > 
> > 
> > Here are some results I got with the latest upstream kernels (default
> > config):
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > | ~42GB            | O_DIRECT | Linux 6.2.0      | Linux 6.3.0      | Linux 6.4.0      | Linux 6.5.0      | Linux 6.5.5      |
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > | Ext4 (sector 4k) |          |                  |                  |                  |                  |                  |
> > | Read             | no       | 7.2s (5800MB/s)  | 7.1s (5890MB/s)  | 8.3s (5050MB/s)  | 13.2s (3180MB/s) | 13.2s (3180MB/s) |
> > | Write            | no       | 12.0s (3500MB/s) | 12.6s (3340MB/s) | 12.2s (3440MB/s) | 28.9s (1450MB/s) | 28.9s (1450MB/s) |
> > | Read             | yes      | 6.0s (7000MB/s)  | 6.0s (7020MB/s)  | 5.9s (7170MB/s)  | 5.9s (7100MB/s)  | 5.9s (7100MB/s)  |
> > | Write            | yes      | 6.7s (6220MB/s)  | 6.7s (6290MB/s)  | 6.9s (6080MB/s)  | 6.9s (6080MB/s)  | 6.9s (6970MB/s)  |
> > | Exfat (sector ?) |          |                  |                  |                  |                  |                  |
> > | Read             | no       | 7.3s (5770MB/s)  | 7.2s (5830MB/s)  | 9s (4620MB/s)    | 13.3s (3150MB/s) | 13.2s (3180MB/s) |
> > | Write            | no       | 8.3s (5040MB/s)  | 8.9s (4750MB/s)  | 8.3s (5040MB/s)  | 18.3s (2290MB/s) | 18.5s (2260MB/s) |
> > | Read             | yes      | 6.2s (6760MB/s)  | 6.1s (6870MB/s)  | 6.0s (6980MB/s)  | 6.5s (6440MB/s)  | 6.6s (6320MB/s)  |
> > | Write            | yes      | 16.1s (2610MB/s) | 16.0s (2620MB/s) | 18.7s (2240MB/s) | 34.1s (1230MB/s) | 34.5s (1220MB/s) |
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > 
> > Please note that I rounded some values for readability. Small
> > variations can be considered within the margin of error.
> > 
> > Ext4 results: cached read/write times have increased by almost 100%
> > from 6.2.0 to 6.5.0, with a first increase in 6.4.0. Direct access
> > times have stayed similar, though.
> > Exfat results: performance decreases too, this time both with and
> > without direct access.
> > 
> > I realize there are thousands of commits in between, and the issue
> > could come from multiple parts of the kernel, such as the page cache,
> > the file system implementation (especially for Exfat), the I/O engine,
> > a driver, etc. The results also show that more than one version is
> > affected. In any case, performance has dropped considerably overall.
> > 
> > If you want to verify my benchmark tool source code, please ask.
> 
> Have you tried something like fio instead of a new benchmark tool?  That
> way others can test and verify the results on their systems as that is a
> well-known and tested benchmark tool.

I understand. Yes, I did, and I had similar results; however, I just ran
it again to record the average results. I ran the following fio commands:

write: fio --name=test-write --filename=testfio.out --readwrite=write --blocksize=400m --size=41943040000 --ioengine=libaio --iodepth=2 (--direct=1)

read: fio --name=test-read --filename=testfio.out --readwrite=read --blocksize=200m --size=41943040000 --ioengine=libaio --iodepth=2 (--direct=1)
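
For anyone without fio at hand, here is a minimal Python sketch of the kind
of timed sequential read loop discussed in this thread. It is not my
benchmark tool and not what fio does internally: the function name
timed_sequential_read is made up for illustration, it only covers the
buffered (non-O_DIRECT) case, and it assumes Linux or another Unix where
os.readv is available.

```python
import os
import time


def timed_sequential_read(path, block_size=200 * 1024 * 1024):
    """Read `path` front to back in a loop; return (bytes_read, seconds).

    Hypothetical helper for illustration only. Buffered I/O: O_DIRECT is
    not set, because it would require an aligned buffer.
    """
    buf = bytearray(block_size)      # one buffer reused for every call
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        while True:
            n = os.readv(fd, [buf])  # fills buf, returns bytes read (0 at EOF)
            if n == 0:
                break
            total += n
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

To compare with the cached rows of the tables, the page cache would need to
be dropped before each run, as in my original tests.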

See results below.

> Also, are you sure you just haven't been hit by the Spectre fixes that
> slow down the I/O path a lot?  Be sure you have feature parity on those
> older kernels please.  Many of the ones you list above do NOT have those
> required changes.

Indeed! mitigations=off clearly reduced the performance loss, thank you!
I thought mitigations were not very relevant nowadays, but they clearly
still are, especially from 6.4.0 to 6.5.0. However, there is still a
trend, as you can see here (fio results with mitigations=off):

+-----------+----------+-----------------+------------------+------------------+------------------+
|   ~42GB   | O_DIRECT |   Linux 6.2.0   |   Linux 6.3.0    |   Linux 6.4.0    |   Linux 6.5.0    |
+-----------+----------+-----------------+------------------+------------------+------------------+
| Ext4 (4k) |          |                 |                  |                  |                  |
| Read      | no       | 9.0s (4620MB/s) | 9.0s (4600MB/s)  | 13.2s (3180MB/s) | 13.6s (3080MB/s) |
| Write     | no       | 12s (3490MB/s)  | 12.2s (3430MB/s) | 12.0s (3500MB/s) | 11.0s (3820MB/s) |
| Read      | yes      | 5.9s (7070MB/s) | 5.9s (7070MB/s)  | 5.8s (7200MB/s)  | 5.8s (7200MB/s)  |
| Write     | yes      | 6.4s (6500MB/s) | 6.4s (6530MB/s)  | 6.5s (6430MB/s)  | 6.9s (6070MB/s)  |
+-----------+----------+-----------------+------------------+------------------+------------------+

-> So basically, direct read/write timings are similar across versions
and cached writes are roughly constant too; however, cached reads take
more time (9s -> 13s, about +50%).
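
The arithmetic behind that figure can be checked with a short Python
sketch. The size is the --size value passed to fio and the timings are the
rounded cached-read values from the table; the helper names are only
illustrative, and MB is assumed to mean 10**6 bytes.

```python
SIZE_BYTES = 41943040000  # --size value from the fio commands


def throughput_mb_s(seconds, size_bytes=SIZE_BYTES):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return size_bytes / seconds / 1e6


def pct_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


# Cached Ext4 reads, rounded timings taken from the table
print(round(throughput_mb_s(13.2)))      # throughput on 6.4.0
print(round(pct_change(9.0, 13.2), 1))   # 6.2.0 -> 6.4.0
print(round(pct_change(9.0, 13.6), 1))   # 6.2.0 -> 6.5.0
```

With the rounded timings this gives about 3178 MB/s for the 13.2s run and
an increase of roughly 47% to 51% in read time.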

How can I make sure I have feature parity across all kernels? I simply
run "make menuconfig" and exit with the default options before compiling.
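
One possible way to compare, as a sketch only: each kernel reports its
mitigation status under /sys/devices/system/cpu/vulnerabilities, so the
files there could be dumped on each boot and compared. The function names
below are made up for illustration, and I am assuming this is a reasonable
proxy for "feature parity"; it says nothing about other config differences.

```python
from pathlib import Path

VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability name: status string} for the running kernel."""
    status = {}
    for entry in sorted(Path(vuln_dir).iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


def diff_status(a, b):
    """Entries whose status differs between two kernels (missing -> None)."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Print the status for the currently booted kernel
    for name, value in read_mitigation_status().items():
        print(f"{name}: {value}")
```

Saving this output on each kernel and passing two of the resulting
dictionaries to diff_status would show which mitigations differ.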

Regards

Florent DELAHAYE




