Fabio Checconi wrote: >>> Fabio Checconi wrote: >>>> - To detect hw tagging in BFQ we consider a sample valid iff the >>>> number of requests that the scheduler could have dispatched (given >>>> by cfqd->rb_queued + cfqd->rq_in_driver, i.e., the ones still into >>>> the scheduler plus the ones into the driver) is higher than the >>>> CFQ_HW_QUEUE_MIN threshold. This obviously caused no problems >>>> during testing, but the way CFQ uses now seems a little bit >>>> strange. >>> BFQ's tag detection logic is broken in the same way that CFQ's used to >>> be. Explanation is in this patch: >>> >> If you look at bfq_update_hw_tag(), the logic introduced by the patch >> you mention is still there; BFQ starts with ->hw_tag = 1, and updates it Yes, I missed that. So which part of CFQ's hw_tag detection is strange? >> every 32 valid samples. What changed WRT your patch, apart from the >> number of samples, is that the condition for a sample to be valid is: >> >> bfqd->rq_in_driver + bfqd->queued >= 5 >> >> while in your patch it is: >> >> cfqd->rq_queued > 5 || cfqd->rq_in_driver > 5 >> >> We preferred the first one because that sum better reflects the number >> of requests that could have been dispatched, and I don't think that this >> is wrong. I think it's fine too. CFQ's condition accounts for a few rare situations, such as the device stalling or hw_tag being updated right after a bunch of requests are queued. They are probably irrelevant, but can't hurt. >> There is a problem, but it's not within the tag detection logic itself. >> From some quick experiments, what happens is that when a process starts, >> CFQ considers it seeky (*), BFQ doesn't. As a side effect BFQ does not >> always dispatch enough requests to correctly detect tagging. >> >> At the first seek you cannot tell if the process is going to bee seeky >> or not, and we have chosen to consider it sequential because it improved >> fairness in some sequential workloads (the CIC_SEEKY heuristic is used >> also to determine the idle_window length in [bc]fq_arm_slice_timer()). >> >> Anyway, we're dealing with heuristics, and they tend to favor some >> workload over other ones. If recovering this thoughput loss is more >> important than a transient unfairness due to short idling windows assigned >> to sequential processes when they start, I've no problems in switching >> the CIC_SEEKY logic to consider a process seeky when it starts. >> >> Thank you for testing and for pointing out this issue, we missed it >> in our testing. >> >> >> (*) to be correct, the initial classification depends on the position >> of the first accessed sector. > > Sorry, I forgot the patch... This seems to solve the problem with > your workload here, does it work for you? Yes, it works fine now :) However, hw_tag detection (in CFQ and BFQ) is still broken in a few ways: * If you go from queue_depth=1 to queue_depth=large, it's possible that the detection logic fails. This could happen if setting queue_depth to a larger value at boot, which seems a reasonable situation. * It depends too much on the hardware. If you have a seekly load on a fast disk with a unit queue depth, idling sucks for performance (I imagine this is particularly bad on SSDs). If you have any disk with a deep queue, not idling sucks for fairness. I suppose CFQ's slice_resid is supposed to help here, but as far as I can tell, it doesn't do a thing. -- Aaron _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/virtualization