Hi Martin,

CC'ing linux-scsi here, as aacraid doesn't have an official maintainer atm.

--nab

On Wed, 2013-02-20 at 16:38 +0100, Martin Svec wrote:
> Hello,
>
> I've noticed read I/O starvation problems of the LIO iSCSI target when
> used on top of a writeback-enabled HW RAID controller (PERC H700 with
> 1GB cache). Under an intensive mixed read-write workload in virtualized
> environments, writes are able to consume over 95% of the IOPS
> throughput and cause starvation of reads.
>
> After a number of tests it seems to me it's a general issue of block
> layer I/O scheduling when running on top of a writeback device. If
> there is a write-intensive task, all writes go to the writeback cache
> with near-zero latency. This allows a writer to quickly saturate the
> device with thousands of writes while using only a minimal fraction of
> the queue depth. However, non-cached reads depend on spinning-drive
> latencies, which are orders of magnitude higher than writeback cache
> latencies, so readers cannot submit as many requests per second as
> writers. Consequently, I guess the controller has a totally wrong view
> of the incoming workload pattern, tries to satisfy the write flood
> first, and the net result is unacceptable starvation of reads, with
> latencies up to hundreds of milliseconds.
>
> A simple fio test with a 1TiB block device, where one thread does 4k
> random sync writes with iodepth=32 and one thread does 4k random reads
> with iodepth=32, shows that instead of the theoretical 50:50 IOPS
> ratio, the block device runs with a 95:5 ratio in favor of writes. In
> fact, the imbalance is so high that even a write iodepth of 2 is
> enough to produce the same numbers.
>
> Real workloads that tend to exhibit this problem are: initial zeroing
> of a virtual machine disk, virtual machine migration, virtual machine
> cloning, intensive swapping of one virtual machine, etc.
>
> I tried to set WCE=1 on the target iblock device, played with queue
> depths, tested all three I/O schedulers and their parameters, and the
> controller's parameters, but with no luck. To achieve reasonably good
> fairness, the only solution is to set nr_requests to 1 or to disable
> the controller's writeback cache entirely -- at the expense of
> degraded overall performance :-(
>
> Regarding nr_requests, there's an obvious relation between iodepths
> and read starvation: if (nr_requests >= workload iodepth), then
> starvation surely occurs. Lowering nr_requests below this threshold
> slowly starts improving fairness, and for every rd+wr iodepth pair
> there exists a sufficiently low nr_requests value at which the IOPS
> ratio is finally balanced according to the rd:wr iodepth ratio.
> Unfortunately, this means there is no single minimal nr_requests value
> suitable for all workloads. For iodepths around 2 to 8, only
> nr_requests=1 provides fair load balancing.
>
> Is this a known problem? Has anybody found block layer parameters that
> eliminate this problem for iscsi-target storage in mixed random
> read-write environments like virtualization? Or should I start writing
> my own I/O scheduler? ;-)
>
> Update: I've just found https://lkml.org/lkml/2012/12/10/550 ("Read
> starvation by sync writes"), where Jan Kara describes identical
> symptoms. But setting nr_requests=10000 doesn't help in my case.
> CC'ing LKML too (I'm not an LKML subscriber).
>
> Thanks,
> Martin
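
For anyone who wants to reproduce this on the list, a fio job file along
the lines of the mixed test described above might look like the sketch
below. The device path is a placeholder and the exact options are only my
guess at the setup, not the job file that was actually used:

# starvation-test.fio -- one 4k random sync-write job and one 4k random
# read job against the same device, both at iodepth=32.
# /dev/sdX is a placeholder for the H700-backed block device.
[global]
filename=/dev/sdX
bs=4k
ioengine=libaio
direct=1
runtime=60
time_based=1

# O_SYNC random writes
[writer]
rw=randwrite
sync=1
iodepth=32

# random reads, reportedly starved down to ~5% of total IOPS
[reader]
rw=randread
iodepth=32

Running "fio starvation-test.fio" and comparing the per-job IOPS should
show the 95:5 split described above if the controller's writeback cache
is enabled.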
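And for reference, these are the knobs I'd expect were being tuned; the
device name and the configfs path are examples only, since they depend
on the local setup and on how the iblock backstore was created:

# I/O scheduler and request queue depth of the backing device
cat /sys/block/sdX/queue/scheduler
echo deadline > /sys/block/sdX/queue/scheduler
# very small values may be rounded up by the kernel
echo 1 > /sys/block/sdX/queue/nr_requests

# WCE=1 on the LIO iblock backstore via configfs
echo 1 > /sys/kernel/config/target/core/iblock_0/mydisk/attrib/emulate_write_cache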