On 23.01.2017 at 21:24, Sitsofe Wheeler wrote:
On 23 January 2017 at 19:40, Tobias Oberstein
<tobias.oberstein@xxxxxxxxx> wrote:
On 23.01.2017 at 20:13, Sitsofe Wheeler wrote:
On 23 January 2017 at 18:33, Tobias Oberstein
<tobias.oberstein@xxxxxxxxx> wrote:
libaio is nowhere near what I get with engine=sync and high job counts.
Mmh.
Plus the strange behavior.
Have you tried batching the IOs and controlling how much you are
reaping at any one time? See
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-iodepth_batch_submit
for some of the options for controlling this...
Thanks! Nice.
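(For reference, a minimal sketch of what those batching knobs look like in
a job file -- the device path and the values are only illustrative, not the
tuned settings from the run further below:)

[batching-example]
ioengine=libaio
direct=1
rw=randread
bs=4k
; illustrative target device
filename=/dev/nvme0n1
iodepth=64
; hand up to 16 IOs to the kernel per io_submit() call
iodepth_batch_submit=16
; reap between 1 and 16 completions per reap attempt
iodepth_batch_complete_min=1
iodepth_batch_complete_max=16
time_based=1
runtime=10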
For libaio, and with all the hints applied (no 4k sectors yet), I get (4k randread):
Individual NVMes: iops=7350.4K
MD (RAID-0) over NVMes: iops=4112.8K
The going up and down of IOPS is gone.
It's becoming more apparent, I'd say, that there is an MD bottleneck though.
If you're "just" trying for higher IOPS you can also try gtod_reduce
(see http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-gtod_reduce
). This subsumes things like disable_lat but you'll get fewer and less
accurate measurement stats back. With libaio, userspace reap
(http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-userspace_reap
) can sometimes nudge numbers up, but at the cost of CPU.
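(For reference, a minimal sketch of those two knobs in job-file form -- the
target device is illustrative, and userspace_reap is only sketched here; it
is not part of the job file at the end of this mail:)

[gtod-reap-example]
ioengine=libaio
direct=1
rw=randread
bs=4k
; illustrative target device
filename=/dev/nvme0n1
iodepth=64
; cut most gettimeofday() calls; subsumes disable_lat and similar options
gtod_reduce=1
; reap completions directly from user space instead of via io_getevents()
userspace_reap=1
time_based=1
runtime=10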
Using gtod_reduce, plus bumping to QD=64 and batch submit 16, I get:
plain NVMes: iops=7415.9K
MD over NVMes: iops=4112.4K
These are staggering numbers for sure!
In fact, the Intel P3608 4TB datasheet says: up to 850k random 4kB IOPS.
Since we have 8 (physical) of these, the real-world measurement (7.4
million) is even above the aggregate datasheet number (8 x 850k = 6.8 million).
I'd say: very good job Intel =)
The price, of course, is the CPU load needed to reach these numbers .. we have
the 2nd-largest Intel Xeon available,
Intel(R) Xeon(R) CPU E7-8880 v4 @ 2.20GHz,
and 4 of these .. and even that isn't enough to saturate these NVMe
beasts while still leaving room to do useful work (PostgreSQL).
So we're going to be CPU bound .. again - this is the 2nd iteration of such
a box. The first one has 48 E7 v2 cores and 8 x P3700 2TB; it is also CPU
bound on PostgreSQL .. with 3TB RAM.
Cheers,
/Tobias
randread-individual-nvmes: (groupid=0, jobs=128): err= 0: pid=37454: Mon
Jan 23 22:12:30 2017
read : io=869361MB, bw=28968MB/s, iops=7415.9K, runt= 30011msec
cpu : usr=6.14%, sys=64.55%, ctx=59170293, majf=0, minf=8320
randread-md-over-nvmes: (groupid=1, jobs=128): err= 0: pid=37582: Mon
Jan 23 22:12:30 2017
read : io=481982MB, bw=16064MB/s, iops=4112.4K, runt= 30004msec
cpu : usr=3.88%, sys=95.88%, ctx=14209, majf=0, minf=6784
[global]
group_reporting
size=30G
ioengine=libaio
iodepth=64
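; submit up to 16 IOs per io_submit() call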
iodepth_batch_submit=16
thread=1
direct=1
time_based=1
randrepeat=0
norandommap=1
disable_lat=1
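; gtod_reduce minimizes gettimeofday() calls and subsumes disable_lat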
gtod_reduce=1
bs=4k
runtime=30
[randread-individual-nvmes]
stonewall
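; the 16 NVMe block devices behind the 8 P3608 cards; fio spreads I/O
; across the listed files (round-robin by default)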
filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1:/dev/nvme10n1:/dev/nvme11n1:/dev/nvme12n1:/dev/nvme13n1:/dev/nvme14n1:/dev/nvme15n1
rw=randread
numjobs=128
[randread-md-over-nvmes]
stonewall
filename=/dev/md1
rw=randread
numjobs=128