Hi All,

We're running queue depth sweeps with a 4k random read workload (sample config below) against a high performance PCIe SSD - the Micron p320h. We're seeing latency spikes to 1 sec when the 'thread' option is used. Instrumenting the driver, we see max latencies from driver entry point to block layer completion callback of <20 ms at high queue depths. If 'thread' is not used, the max latencies reported by fio align almost exactly with those seen by the driver.

There are typically only one or two of these latency outliers during a 40 sec run, for example, but they represent a significant enough excursion to pull our std. dev. very high.

Has anyone witnessed this sort of behavior? We see it with all the versions of fio that we have used (2.0.5+) with a variety of kernels. It's also very suspicious that the max latency is either almost exactly 1 sec or aligns with our hardware incurred latency for the given queue depth.

Thanks,
-Sam

[global]
#thread
group_reporting
direct=1
time_based
norandommap=1
refill_buffers
runtime=40
ioengine=libaio
filename=/dev/rssda
bs=4k
rw=randread

[read-qd-256]
numjobs=8
iodepth=32
stonewall

[read-qd-248]
numjobs=8
iodepth=31
stonewall

[read-qd-240]
numjobs=8
iodepth=30
stonewall

[read-qd-232]
numjobs=8
iodepth=29
stonewall

<snip>

[read-qd-64]
numjobs=8
iodepth=8
stonewall

[read-qd-56]
numjobs=8
iodepth=7
stonewall

[read-qd-48]
numjobs=8
iodepth=6
stonewall

[read-qd-40]
numjobs=8
iodepth=5
stonewall

[read-qd-32]
numjobs=8
iodepth=4
stonewall

[read-qd-24]
numjobs=8
iodepth=3
stonewall

[read-qd-16]
numjobs=8
iodepth=2
stonewall

[read-qd-8]
numjobs=8
iodepth=1
stonewall

[read-qd-4]
numjobs=4
iodepth=1
stonewall

[read-qd-1]
numjobs=1
iodepth=1
stonewall
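
As a rough illustration of why one or two ~1 sec outliers in a 40 sec run can dominate the std. dev. while barely moving the mean, here is a minimal Python sketch. The sample count and the baseline latencies are made-up placeholders chosen for illustration only; they are not measurements from the p320h or output from fio.

```python
import statistics


def latency_stats(samples_ms):
    """Return (mean, population std. dev., max) for latencies given in ms."""
    return (
        statistics.fmean(samples_ms),
        statistics.pstdev(samples_ms),
        max(samples_ms),
    )


# Hypothetical baseline: 100,000 completions alternating 0.4 ms and 0.6 ms.
baseline = [0.4, 0.6] * 50_000

# Same run with two hypothetical ~1 sec (1000 ms) excursions appended.
with_outliers = baseline + [1000.0, 1000.0]

base_mean, base_std, base_max = latency_stats(baseline)
out_mean, out_std, out_max = latency_stats(with_outliers)

print(f"baseline:      mean={base_mean:.3f} ms  std={base_std:.3f} ms  max={base_max:.1f} ms")
print(f"with outliers: mean={out_mean:.3f} ms  std={out_std:.3f} ms  max={out_max:.1f} ms")
```

Under these assumed numbers the mean shifts only slightly, while the std. dev. grows by more than an order of magnitude, which matches the qualitative effect described above.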