Vivek Goyal wrote:
> o Found another issue during testing. Consider the following hierarchy.
>
>           root
>           /  \
>         R1    G1
>              /  \
>            R2    W
>
> Generally in CFQ, when readers and writers are running, a reader immediately
> preempts writers and hence the reader gets better bandwidth. In a
> hierarchical setup, it becomes a little more tricky. In the above diagram, G1
> is a group, R1 and R2 are readers, and W is a writer task.
>
> Now assume W runs, then R1 runs, and then R2 runs. After R2 has used its
> time slice, if R1 is scheduled in, after a couple of ms R2 will get backlogged
> again in group G1 (streaming reader). But it will not preempt R1, as R1 is
> also a reader and also because preemption across groups is not allowed for
> isolation reasons. Hence R2 will get backlogged in G1 and will get a
> vdisktime much higher than W. So when G1 gets scheduled again, W will get
> to run its full slice length despite the fact that R2 is queued on the same
> service tree.
>
> The core issue here is that apart from regular preemptions (preemption
> across classes), CFQ also has this special notion of preemption within a
> class, and that can lead to issues when the active task is running in a
> different group than the one where the new queue gets backlogged.
>
> To solve the issue, keep track of this event (I am calling it late
> preemption). When a group becomes eligible to run again, if late_preemption
> is set, check if there are sync readers backlogged, and if yes, expire the
> writer after one round of dispatch.
>
> This solves the issue of the reader not getting enough bandwidth in
> hierarchical setups.
>
> Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>

Conceptually a nice solution. The code gets a little tricky,
but I guess any code dealing with these situations would end
up that way :)

Acked-by: Rik van Riel <riel@xxxxxxxxxx>

-- 
All rights reversed.
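For readers who want to see the late-preemption idea in isolation, here is a toy model of it in Python. It is not the patch and not CFQ code: the Queue and Group classes, on_backlog(), select_queue(), and the point at which the flag is cleared are all assumptions made for illustration. Only the name late_preemption and the rule "if a sync reader is backlogged, expire the writer after one round of dispatch" come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    sync: bool          # True for a reader, False for a writer
    vdisktime: int      # lower value is served first
    backlogged: bool = True


@dataclass
class Group:
    name: str
    queues: list = field(default_factory=list)
    late_preemption: bool = False   # flag name taken from the patch description


def on_backlog(group, queue, active_group):
    """Record a missed in-class preemption (hypothetical helper).

    A sync queue became backlogged in `group` while the active queue was
    running in another group, so it could not preempt anything.
    """
    queue.backlogged = True
    if queue.sync and active_group is not group:
        group.late_preemption = True


def select_queue(group):
    """Pick the next queue and report whether to expire it after one round."""
    ready = [q for q in group.queues if q.backlogged]
    if not ready:
        return None, False
    chosen = min(ready, key=lambda q: q.vdisktime)
    expire_early = False
    if group.late_preemption:
        readers_waiting = any(q.sync for q in ready)
        # Only cut a writer short, and only if a reader is actually waiting.
        expire_early = (not chosen.sync) and readers_waiting
        # Assumption: the event counts as handled once the group is selected.
        group.late_preemption = False
    return chosen, expire_early


# Scenario from the description: R1 runs in the root group, R2 and W are in G1.
root = Group("root")
g1 = Group("G1")
w = Queue("W", sync=False, vdisktime=10)
r2 = Queue("R2", sync=True, vdisktime=50, backlogged=False)
g1.queues = [w, r2]

# R2 becomes backlogged while R1 (in the root group) is the active queue.
on_backlog(g1, r2, active_group=root)

chosen, expire_early = select_queue(g1)
print(chosen.name, expire_early)
```

In this model W is still picked first because of its lower vdisktime, but it is flagged to give up the disk after one dispatch round so that R2 does not wait out a full writer slice.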