> -----Original Message-----
> From: fio-owner@xxxxxxxxxxxxxxx <fio-owner@xxxxxxxxxxxxxxx> On Behalf Of
> Mauricio Tavares
> Sent: Wednesday, January 15, 2020 9:51 AM
> Subject: CPUs, threads, and speed
> ...
> [global]
> name=4k random write 4 ios in the queue in 32 queues
> filename=/dev/nvme0n1
> ioengine=libaio
> direct=1
> bs=4k
> rw=randwrite
> iodepth=4
> numjobs=32
> buffered=0
> size=100%
> loops=2
> randrepeat=0
> norandommap
> refill_buffers
>
> [job1]
>
> That is taking a ton of time, like days to go. Is there anything I can
> do to speed it up? For instance, what is the default value for
> cpus_allowed (or cpumask)[2]? Is it all CPUs? If not, what would I gain
> by throwing more cpus at the problem?
>
> I also read[2] that by default fio uses fork. What would I get by going
> to threads?
>
> Jobs: 32 (f=32): [w(32)][10.8%][w=301MiB/s][w=77.0k IOPS][eta 06d:13h:56m:51s]]

77 kIOPS for random writes isn't bad - check your drive data sheet. If the
drive is 1 TB, it should take 1 TB / (77k * 4 KiB) = 3170 s = 52.8 minutes
to write the whole drive once.

Best practice is to use all CPU cores, lock threads to cores, and be NUMA
aware. If the device is attached to physical CPU 0 and that CPU has 12
cores known to Linux as 0-11 (per "lscpu" or "numactl --hardware"), try:

    iodepth=16
    numjobs=12
    cpus_allowed=0-11
    cpus_allowed_policy=split

Based on these settings:

    numjobs=32
    size=100%
    loops=2

fio will run each job for that many bytes, so a 1 TB drive will result in
32 jobs * 2 loops * 1 TB = 64 TB of IO rather than 1 TB. That could easily
account for the multi-day estimate.

Other nits:
* thread - threading might be slightly more efficient than spawning full
  processes
* gtod_reduce=1 - precision latency measurements don't matter for this
* refill_buffers - presuming you don't care about the data contents, don't
  include this. zero_buffers is the simplest/fastest, unless you're
  concerned that the device might do compression or zero detection
* norandommap - if you want fio to hit each LBA a precise number of times,
  you can't include this; with it, fio won't remember which LBAs it has
  already written. Keeping the random map does add overhead, though.
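Putting those suggestions together, a sketch of a revised job file - assuming
the 12-core/NUMA-node-0 layout above, and that roughly one full-drive pass is
the goal (adjust cpus_allowed and sizes to your machine):

    [global]
    filename=/dev/nvme0n1
    ioengine=libaio
    direct=1
    bs=4k
    rw=randwrite
    iodepth=16
    numjobs=12
    cpus_allowed=0-11
    cpus_allowed_policy=split
    thread
    gtod_reduce=1
    zero_buffers
    randrepeat=0
    ; size=100% means EACH job writes the whole device, so 12 jobs = 12
    ; passes. Limit per-job bytes (e.g. with io_size, or size plus
    ; offset_increment to give each job its own slice) if one total pass
    ; is enough. loops=2 dropped - the default is a single pass.
    size=100%

    [job1]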