Vivek Goyal wrote:
o Found another issue during testing. Consider the following hierarchy.
			root
			/  \
		      R1    G1
			   /  \
			 R2    W
Generally in CFQ, when readers and writers are running, a reader immediately
preempts writers and hence the reader gets better bandwidth. In a
hierarchical setup it becomes a little more tricky. In the above diagram, G1
is a group, R1 and R2 are readers, and W is a writer task.
Now assume W runs, then R1 runs, and then R2 runs. After R2 has used its
time slice, if R1 is scheduled in, then after a couple of ms R2 will get
backlogged again in group G1 (streaming reader). But it will not preempt R1,
as R1 is also a reader and also because preemption across groups is not
allowed for isolation reasons. Hence R2 will get backlogged in G1 and will
get a vdisktime much higher than W. So when G1 gets scheduled again, W will
get to run its full slice length despite the fact that R2 is queued on the
same service tree.
The core issue here is that apart from regular preemptions (preemption
across classes), CFQ also has this special notion of preemption within a
class, and that can lead to issues when the active task is running in a
different group than the one where the new queue gets backlogged.
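
To make the two rules above concrete, here is a minimal sketch in C of the
preemption decision as described in this mail. It is an illustration only:
struct pq_queue, its fields, and pq_should_preempt() are hypothetical names
and are not taken from the CFQ code or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified queue; NOT the kernel's struct cfq_queue. */
struct pq_queue {
	int group_id;	/* group the queue belongs to (0 = root here) */
	bool is_sync;	/* true for a reader, false for a writer */
};

/*
 * Decide whether a newly backlogged queue may preempt the active one.
 * Models only the two rules from the description:
 *   - no preemption across groups (isolation)
 *   - within a group, a reader preempts a writer
 */
bool pq_should_preempt(const struct pq_queue *active,
		       const struct pq_queue *newq)
{
	assert(active != NULL && newq != NULL);

	if (active->group_id != newq->group_id)
		return false;	/* never preempt across groups */

	return newq->is_sync && !active->is_sync;
}
```

With R1 in the root group and R2 and W in G1, this check lets R2 preempt W
while W is active, but it refuses when R1 is active. That refusal is the case
that leaves R2 backlogged behind W.
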
To solve the issue, keep track of this event (I am calling it late
preemption). When a group becomes eligible to run again, if late_preemption
is set, check if there are sync readers backlogged, and if yes, expire the
writer after one round of dispatch.
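
The late preemption idea can be sketched roughly as follows. This is not the
code from the patch: struct lp_group, nr_sync_backlogged, and both helper
functions are hypothetical, and clearing the flag once it has been checked is
an assumption made here for the sake of the example. Only the name
late_preemption comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified group state; NOT the real kernel structures. */
struct lp_group {
	bool late_preemption;	/* a reader got backlogged while another group ran */
	int nr_sync_backlogged;	/* sync readers waiting on this group's service tree */
};

/*
 * Called when a queue gets backlogged in grp. If it is a sync reader and
 * the active queue belongs to a different group, it cannot preempt, so
 * remember the event for later.
 */
void lp_note_backlog(struct lp_group *grp, bool new_is_sync,
		     bool active_in_same_group)
{
	assert(grp != NULL);

	if (!new_is_sync)
		return;

	grp->nr_sync_backlogged++;
	if (!active_in_same_group)
		grp->late_preemption = true;
}

/*
 * Called when grp is scheduled in again and a queue has been selected.
 * Returns true if the selected queue is a writer that should be expired
 * after one round of dispatch because sync readers are waiting.
 */
bool lp_expire_after_one_round(struct lp_group *grp, bool picked_is_sync)
{
	assert(grp != NULL);

	if (!grp->late_preemption)
		return false;

	grp->late_preemption = false;	/* assumption: event is consumed here */
	return !picked_is_sync && grp->nr_sync_backlogged > 0;
}
```

In the scenario above, R2 getting backlogged in G1 while R1 runs would set the
flag, and the next time G1 is selected W would be cut short after one dispatch
round so that R2 can run.
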
This solves the issue of the reader not getting enough bandwidth in
hierarchical setups.
Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
Conceptually a nice solution. The code gets a little tricky,
but I guess any code dealing with these situations would end
up that way :)
Acked-by: Rik van Riel <riel@xxxxxxxxxx>
--
All rights reversed.
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel