On Saturday, 30 September 2023 at 20:33 +0200, Greg KH wrote:
> On Sat, Sep 30, 2023 at 07:07:06PM +0200, Florent DELAHAYE wrote:
> > Hello guys,
> > 
> > Over the last few months I have noticed a performance regression
> > when using read() and write() on my high-speed NVMe SSD (about
> > 7GB/s).
> > 
> > To get more precise information about it, I quickly developed a
> > benchmark tool that basically runs read() or write() in a loop to
> > simulate a sequential file read or write. The tool also measures
> > the real time consumed by the loop. Finally, the tool can call
> > open() with or without O_DIRECT.
> > 
> > I ran the tests on ext4 and exFAT with the following settings
> > (buffer sizes were chosen for best results):
> > - Write settings: buffer 400MB * 100
> > - Read settings: buffer 200MB
> > - Drop caches before each non-direct read/write test
> > 
> > With this hardware:
> > - CPU: AMD Ryzen 7600X
> > - RAM: DDR5 5200 32GB
> > - SSD: Kingston Fury Renegade 4TB with 4K LBA
> > 
> > Here are some results I got with recent upstream kernels (default
> > config):
> > 
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > | ~42GB            | O_DIRECT | Linux 6.2.0      | Linux 6.3.0      | Linux 6.4.0      | Linux 6.5.0      | Linux 6.5.5      |
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > | Ext4 (sector 4k) |          |                  |                  |                  |                  |                  |
> > | Read             | no       | 7.2s (5800MB/s)  | 7.1s (5890MB/s)  | 8.3s (5050MB/s)  | 13.2s (3180MB/s) | 13.2s (3180MB/s) |
> > | Write            | no       | 12.0s (3500MB/s) | 12.6s (3340MB/s) | 12.2s (3440MB/s) | 28.9s (1450MB/s) | 28.9s (1450MB/s) |
> > | Read             | yes      | 6.0s (7000MB/s)  | 6.0s (7020MB/s)  | 5.9s (7170MB/s)  | 5.9s (7100MB/s)  | 5.9s (7100MB/s)  |
> > | Write            | yes      | 6.7s (6220MB/s)  | 6.7s (6290MB/s)  | 6.9s (6080MB/s)  | 6.9s (6080MB/s)  | 6.9s (6970MB/s)  |
> > | Exfat (sector ?) |          |                  |                  |                  |                  |                  |
> > | Read             | no       | 7.3s (5770MB/s)  | 7.2s (5830MB/s)  | 9s (4620MB/s)    | 13.3s (3150MB/s) | 13.2s (3180MB/s) |
> > | Write            | no       | 8.3s (5040MB/s)  | 8.9s (4750MB/s)  | 8.3s (5040MB/s)  | 18.3s (2290MB/s) | 18.5s (2260MB/s) |
> > | Read             | yes      | 6.2s (6760MB/s)  | 6.1s (6870MB/s)  | 6.0s (6980MB/s)  | 6.5s (6440MB/s)  | 6.6s (6320MB/s)  |
> > | Write            | yes      | 16.1s (2610MB/s) | 16.0s (2620MB/s) | 18.7s (2240MB/s) | 34.1s (1230MB/s) | 34.5s (1220MB/s) |
> > +------------------+----------+------------------+------------------+------------------+------------------+------------------+
> > 
> > Please note that I rounded some values for readability. Small
> > variations can be considered margin of error.
> > 
> > Ext4 results: cached read/write times have increased by almost
> > 100% from 6.2.0 to 6.5.0, with a first increase at 6.4.0. Direct
> > access times have stayed similar, though.
> > Exfat results: performance decreases too, this time both with and
> > without direct access.
> > 
> > I realize there are thousands of commits in between, and that the
> > issue could come from multiple kernel parts such as the page
> > cache, the file system implementation (especially for exFAT), the
> > I/O engine, a driver, etc. The results also show that more than
> > one specific version is affected. In any case, overall performance
> > has decreased significantly.
> > 
> > If you want to verify my benchmark tool's source code, please ask.
> 
> Have you tried something like fio instead of a new benchmark tool?
> That way others can test and verify the results on their systems, as
> that is a well-known and tested benchmark tool.

I understand. Yes, I did and had similar results; however, I just ran
it again to record the average results.
I ran the following fio commands:

write: fio --name=test-write --filename=testfio.out --readwrite=write
--blocksize=400m --size=41943040000 --ioengine=libaio --iodepth=2
(--direct=1)

read: fio --name=test-read --filename=testfio.out --readwrite=read
--blocksize=200m --size=41943040000 --ioengine=libaio --iodepth=2
(--direct=1)

See results below.

> Also, are you sure you just haven't been hit by the spectre fixes
> that slow down the I/O path a lot? Be sure you have feature parity
> on those older kernels please. Many of the ones you list above do
> NOT have those required changes.

Indeed! mitigations=off clearly mitigated the performance loss, thank
you! I thought it was not very relevant nowadays, but it clearly still
is, especially from 6.4.0 to 6.5.0. However, there is still a trend,
as you can see here (fio results with mitigations=off):

+-----------+----------+-----------------+------------------+------------------+------------------+
| ~42GB     | O_DIRECT | Linux 6.2.0     | Linux 6.3.0      | Linux 6.4.0      | Linux 6.5.0      |
+-----------+----------+-----------------+------------------+------------------+------------------+
| Ext4 (4k) |          |                 |                  |                  |                  |
| Read      | no       | 9.0s (4620MB/s) | 9.0s (4600MB/s)  | 13.2s (3180MB/s) | 13.6s (3080MB/s) |
| Write     | no       | 12s (3490MB/s)  | 12.2s (3430MB/s) | 12.0s (3500MB/s) | 11.0s (3820MB/s) |
| Read      | yes      | 5.9s (7070MB/s) | 5.9s (7070MB/s)  | 5.8s (7200MB/s)  | 5.8s (7200MB/s)  |
| Write     | yes      | 6.4s (6500MB/s) | 6.4s (6530MB/s)  | 6.5s (6430MB/s)  | 6.9s (6070MB/s)  |
+-----------+----------+-----------------+------------------+------------------+------------------+

-> So basically, direct read/write timings are similar and cached
writes are roughly constant too; however, cached reads take more time
(9s -> 13s, roughly +50%).

How can I make sure I have feature parity across all kernels? I run a
simple "make menuconfig" and exit with the default options before
compiling.

Regards
Florent DELAHAYE
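P.S. One way I could compare the mitigation state across the kernels
under test (my own suggestion, using the standard sysfs vulnerabilities
interface; the file naming is just a convention):

```shell
# Snapshot the mitigation/vulnerability state for the currently booted
# kernel into a per-version file.
out="mitigations-$(uname -r).txt"
grep . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null | sort > "$out"
cat "$out"
# After booting each kernel under test, diff the snapshots, e.g.:
#   diff mitigations-6.2.0.txt mitigations-6.5.0.txt
```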