Hi,

I’m trying to isolate a performance issue I’ve noticed on a Supermicro A2SAV [1] motherboard. I’ve observed a difference in behavior between the Marvell and Intel SATA ports and am wondering if anyone on this list has any suggestions.

I’m using a 4.1.2 kernel and an out-of-kernel build of IET (iSCSI Enterprise Target, for legacy reasons) to export a SATA-connected HDD (TOSHIBA MQ01ABD1). On first bring-up of this hardware I noticed that read performance on benchmark tests (Iometer, 1M sequential read workload) was about 70 MB/s, approximately 30-40% slower than what we see on a similar hardware platform. The slowdown appears only when the disk is connected to the Marvell 88SE9230 ports on the motherboard; when I use the Intel ports (Atom E3940 SoC), the problem disappears. If I disable NCQ on the HDD while it is connected to the Marvell ports (echo 1 > /sys/class/block/sdb/device/queue_depth), performance returns to what I would expect, around 108 MB/s.

I’ve captured SATA traces of the activity on the HDD using a SATA analyzer and see a difference in pending IO as well as IO throughput, latency, and response time on the 88SE9230 compared to the Intel port. Here are the relevant screenshots:

Intel port with iSCSI reads: https://photos.app.goo.gl/MIcZpraFvBw9DazG3
Marvell port with iSCSI reads: https://photos.app.goo.gl/IbPgLZyuNkXpROWQ2

As you can see, the Intel SATA configuration keeps the pending IO queue depth near 7 for the duration of the transfer, whereas on the Marvell port it bounces around between 1 and 7, and IO latency/response times are similarly variable and overall longer than with the Intel configuration. I can share the full traces if anyone is interested. I don’t see obvious differences in terms of sequential data access or transfer size (all transfers are 256 KB). I’ve tried to reproduce the problem without the iSCSI target and have not yet been able to do so.
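For reference, here is a minimal sketch of the NCQ check/toggle I’m using (assuming the disk enumerates as sdb; adjust the device name as needed):

```shell
#!/bin/sh
# Minimal sketch: inspect (and optionally lower) a SATA device's queue
# depth via sysfs. Assumes the disk enumerates as sdb -- adjust as needed.
DEV=sdb
QD=/sys/class/block/$DEV/device/queue_depth

# With NCQ enabled this is typically 31; prints "n/a" if the device
# (or the sysfs attribute) is absent on this machine.
CUR=$(cat "$QD" 2>/dev/null || echo n/a)
echo "queue_depth for $DEV: $CUR"

# Setting the depth to 1 effectively disables NCQ (requires root):
#   echo 1 > "$QD"
```

Restoring NCQ afterwards is just a matter of writing the original depth back (e.g. echo 31 > the same file); the setting does not persist across reboots.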
I was able to demonstrate similar queue behavior, however, using aio-stress [2] with the command:

sudo ./aio-stress -c 1 -t 1 -O -o 1 -r 16K /dev/sdb

See the pending IO depth, throughput, and latencies in this case at https://photos.app.goo.gl/HEcgVftVR7gjASc78 . Overall read performance in this case nevertheless matched the Intel port at ~108 MB/s.

I haven’t tried moving to the latest kernel, and I think it would take some effort to get the IET target running on a recent kernel, but I could compare aio-stress queue-depth behavior on the most recent kernels if that would be interesting.

My main questions are whether the difference in pending IO and queue-depth behavior between the Marvell and Intel ports is expected, and whether there is anything other than disabling NCQ I can try to improve performance in this scenario on the Marvell 88SE9230 ports.

Thanks for reading and for any suggestions!

Dan

[1] https://www.supermicro.com/products/motherboard/Atom/A2SAV.cfm
[2] https://www.vi4io.org/tools/benchmarks/aio-stress