Re: [git pull] device mapper changes for 5.9

Just to bring in some more context: the primary trigger that made us look into this was high p99 read latency on a random read workload on modern-ish SATA SSD and NVMe disks. That is, on average things looked fine, but a fraction of requests, each needing only a small chunk of data fetched from the disk quickly, stalled for an unreasonable amount of time.
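
For reference, the kind of measurement that exposed this can be approximated
with a small user-space probe. The sketch below is purely illustrative (it is
not the harness we used; the sample count and block size are arbitrary): it
issues 4 KiB O_DIRECT reads at random offsets on a block device and reports
p50/p99 completion times.

/* readlat.c -- illustrative only: random 4 KiB O_DIRECT reads against a
 * block device, reporting p50/p99 latency.
 * Build: gcc -O2 -o readlat readlat.c
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

#define SAMPLES 10000
#define BS 4096

static int cmp_u64(const void *a, const void *b)
{
        unsigned long long x = *(const unsigned long long *)a;
        unsigned long long y = *(const unsigned long long *)b;
        return (x > y) - (x < y);
}

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <block device>\n", argv[0]);
                return 1;
        }
        int fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        unsigned long long dev_bytes = 0;
        if (ioctl(fd, BLKGETSIZE64, &dev_bytes) < 0) { perror("ioctl"); return 1; }
        unsigned long long nr_blocks = dev_bytes / BS;

        void *buf;
        if (posix_memalign(&buf, 4096, BS)) { perror("posix_memalign"); return 1; }

        static unsigned long long lat_ns[SAMPLES];
        srand48(getpid());
        for (int i = 0; i < SAMPLES; i++) {
                /* pick a random aligned 4 KiB block on the device */
                off_t off = (off_t)((unsigned long long)(drand48() * (double)nr_blocks) * BS);
                struct timespec t0, t1;
                clock_gettime(CLOCK_MONOTONIC, &t0);
                if (pread(fd, buf, BS, off) != BS) { perror("pread"); return 1; }
                clock_gettime(CLOCK_MONOTONIC, &t1);
                long long d = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
                              (t1.tv_nsec - t0.tv_nsec);
                lat_ns[i] = (unsigned long long)d;
        }
        /* sort completion times and print the percentiles */
        qsort(lat_ns, SAMPLES, sizeof(lat_ns[0]), cmp_u64);
        printf("p50 %llu us, p99 %llu us\n",
               lat_ns[SAMPLES / 2] / 1000, lat_ns[SAMPLES * 99 / 100] / 1000);
        return 0;
}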

Most modern IO-intensive workloads have reasonable provisions for dealing with slow writes; when writing data we usually care more about average throughput, i.e. having enough throughput to get all incoming data to disk without losing it. In contrast, many modern workloads (distributed key-value stores, caching systems, etc.) need small chunks of data fetched fast, so the emphasis there is on read latency rather than write throughput. This is where we think the synchronous behaviour provides the most benefit.

Additionally, anyone who cares about latency will not run such a workload on HDDs, and HDD IO latency is orders of magnitude higher than CPU scheduling latency (a seek costs milliseconds, while waking a worker thread costs microseconds). So benchmarking on HDDs does not make much sense: the HDD latency will likely hide any improvement or degradation from the synchronous IO handling in dm-crypt.

But even latency-wise, in our testing with larger block sizes (>2M) synchronous IO (reads and writes) can show worse performance, and without fully understanding why, we're probably not ready to recommend it as a default.

Regards,
Ignat

On Tue, Aug 18, 2020 at 9:40 PM John Dorminy <jdorminy@xxxxxxxxxx> wrote:
For what it's worth, I just ran two tests on a machine with dm-crypt
using the cipher_null:ecb cipher. Results are mixed: not offloading IO
submission resulted in anywhere from a -27% to a +23% change in
throughput across a selection of IO patterns on HDDs and SSDs.

(Note that the IO submission thread also reorders IO to attempt to
submit it in sector order, so that is an additional difference between
the two modes -- it's not just "offload writes to another thread" vs
"don't offload writes".) The summary (for my FIO workloads focused on
parallelism) is that offloading is useful for high IO depth random
writes on SSDs, and for long sequential small writes on HDDs.
Offloading reduced throughput for immensely high IO depths on SSDs,
where I would guess lock contention is reducing effective IO depth to
the disk; and for low IO depths of sequential writes on HDDs, where I
would guess (as it would for a zoned device) preserving submission order
is better than attempting to reorder before submission.
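
To make the reordering point above concrete, here is a toy user-space sketch
of the idea (this is not the dm-crypt code itself; in the kernel the queued
writes are sorted before they are submitted). It just drains a batch of
queued writes and issues them in ascending sector order instead of arrival
order:

/* reorder.c -- toy illustration of "reorder before submit". */
#include <stdio.h>
#include <stdlib.h>

struct pending_write {
        unsigned long long sector;   /* 512-byte start sector */
        unsigned int nr_sectors;     /* length of the write   */
};

static int by_sector(const void *a, const void *b)
{
        const struct pending_write *x = a, *y = b;
        return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Stand-in for actual submission: in real code this would issue the IO. */
static void submit(const struct pending_write *w)
{
        printf("submit sector %llu (+%u)\n", w->sector, w->nr_sectors);
}

int main(void)
{
        /* Writes queued in arrival order (e.g. from many submitting CPUs). */
        struct pending_write batch[] = {
                { 81920, 8 }, { 512, 8 }, { 40960, 8 }, { 520, 8 }, { 16384, 8 },
        };
        size_t n = sizeof(batch) / sizeof(batch[0]);

        /* The submission thread drains the queue and sorts by start sector,
         * so the device sees a mostly-sequential stream. */
        qsort(batch, n, sizeof(batch[0]), by_sector);
        for (size_t i = 0; i < n; i++)
                submit(&batch[i]);
        return 0;
}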

Two test regimes: randwrite on 7x Samsung SSD 850 PRO 128G, somewhat
aged, behind an LSI MegaRAID card providing raid0; 6 processors
(Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz) and 128G RAM. And seqwrite
on a software raid0 (512k chunk size) of 4 HDDs on the same machine.
Scheduler 'none' for both. LSI card in writethrough cache mode.
All data in MB/s.


depth   jobs   bs   dflt   no_wq   %chg   raw disk
--------------- randwrite, SSD ---------------
 128      1    4k    282     282      0    285
 256      4    4k    251     183    -27    283
2048      4    4k    266     283     +6    284
   1      4    1m    433     414     -4    403
--------------- seqwrite, HDD ----------------
 128      1    4k     87     107    +23     86
 256      4    4k    101      90    -11     91
2048      4    4k    273     233    -15    249
   1      4    1m    144     146     +1    146
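
(For clarity: %chg above appears to be the change of the no-workqueue mode
relative to the default, i.e. %chg = (no_wq - dflt) / dflt * 100. For
example, the 256/4/4k SSD row gives (183 - 251) / 251 ≈ -27%, and the
128/1/4k HDD row gives (107 - 87) / 87 ≈ +23%.)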

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
