Hi. On čtvrtek 28. prosince 2017 12:19:17 CET Paolo Valente wrote: > To maximise responsiveness, BFQ raises the weight, and performs device > idling, for bfq_queues associated with processes deemed as > interactive. In particular, weight raising has a maximum duration, > equal to the time needed to start a large application. If a > weight-raised process goes on doing I/O beyond this maximum duration, > it loses weight-raising. > > This mechanism is evidently vulnerable to the following false > positives: I/O-bound applications that will go on doing I/O for much > longer than the duration of weight-raising. These applications have > basically no benefit from being weight-raised at the beginning of > their I/O. On the opposite end, while being weight-raised, these > applications > a) unjustly steal throughput to applications that may truly need > low latency; > b) make BFQ uselessly perform device idling; device idling results > in loss of device throughput with most flash-based storage, and may > increase latencies when used purposelessly. > > This commit adds a countermeasure to reduce both the above > problems. To introduce this countermeasure, we provide the following > extra piece of information (full details in the comments added by this > commit). During the start-up of the large application used as a > reference to set the duration of weight-raising, involved processes > transfer at most ~110K sectors each. Accordingly, a process initially > deemed as interactive has no right to be weight-raised any longer, > once transferred 110K sectors or more. > > Basing on this consideration, this commit early-ends weight-raising > for a bfq_queue if the latter happens to have received an amount of > service at least equal to 110K sectors (actually, a little bit more, > to keep a safety margin). I/O-bound applications that reach a high > throughput, such as file copy, get to this threshold much before the > allowed weight-raising period finishes. Thus this early ending of > weight-raising reduces the amount of time during which these > applications cause the problems described above. > > Signed-off-by: Paolo Valente <paolo.valente@xxxxxxxxxx> > --- > block/bfq-iosched.c | 81 > +++++++++++++++++++++++++++++++++++++++++++++++------ block/bfq-iosched.h | > 5 ++++ > block/bfq-wf2q.c | 3 ++ > 3 files changed, 80 insertions(+), 9 deletions(-) > > diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c > index 6f75015d18c0..ea48b5c8f088 100644 > --- a/block/bfq-iosched.c > +++ b/block/bfq-iosched.c > @@ -209,15 +209,17 @@ static struct kmem_cache *bfq_pool; > * interactive applications automatically, using the following formula: > * duration = (R / r) * T, where r is the peak rate of the device, and > * R and T are two reference parameters. > - * In particular, R is the peak rate of the reference device (see below), > - * and T is a reference time: given the systems that are likely to be > - * installed on the reference device according to its speed class, T is > - * about the maximum time needed, under BFQ and while reading two files in > - * parallel, to load typical large applications on these systems. > - * In practice, the slower/faster the device at hand is, the more/less it > - * takes to load applications with respect to the reference device. > - * Accordingly, the longer/shorter BFQ grants weight raising to interactive > - * applications. > + * In particular, R is the peak rate of the reference device (see > + * below), and T is a reference time: given the systems that are > + * likely to be installed on the reference device according to its > + * speed class, T is about the maximum time needed, under BFQ and > + * while reading two files in parallel, to load typical large > + * applications on these systems (see the comments on > + * max_service_from_wr below, for more details on how T is obtained). > + * In practice, the slower/faster the device at hand is, the more/less > + * it takes to load applications with respect to the reference device. > + * Accordingly, the longer/shorter BFQ grants weight raising to > + * interactive applications. > * > * BFQ uses four different reference pairs (R, T), depending on: > * . whether the device is rotational or non-rotational; > @@ -254,6 +256,60 @@ static int T_slow[2]; > static int T_fast[2]; > static int device_speed_thresh[2]; > > +/* > + * BFQ uses the above-detailed, time-based weight-raising mechanism to > + * privilege interactive tasks. This mechanism is vulnerable to the > + * following false positives: I/O-bound applications that will go on > + * doing I/O for much longer than the duration of weight > + * raising. These applications have basically no benefit from being > + * weight-raised at the beginning of their I/O. On the opposite end, > + * while being weight-raised, these applications > + * a) unjustly steal throughput to applications that may actually need > + * low latency; > + * b) make BFQ uselessly perform device idling; device idling results > + * in loss of device throughput with most flash-based storage, and may > + * increase latencies when used purposelessly. > + * > + * BFQ tries to reduce these problems, by adopting the following > + * countermeasure. To introduce this countermeasure, we need first to > + * finish explaining how the duration of weight-raising for > + * interactive tasks is computed. > + * > + * For a bfq_queue deemed as interactive, the duration of weight > + * raising is dynamically adjusted, as a function of the estimated > + * peak rate of the device, so as to be equal to the time needed to > + * execute the 'largest' interactive task we benchmarked so far. By > + * largest task, we mean the task for which each involved process has > + * to do more I/O than for any of the other tasks we benchmarked. This > + * reference interactive task is the start-up of LibreOffice Writer, > + * and in this task each process/bfq_queue needs to have at most ~110K > + * sectors transferred. > + * > + * This last piece of information enables BFQ to reduce the actual > + * duration of weight-raising for at least one class of I/O-bound > + * applications: those doing sequential or quasi-sequential I/O. An > + * example is file copy. In fact, once started, the main I/O-bound > + * processes of these applications usually consume the above 110K > + * sectors in much less time than the processes of an application that > + * is starting, because these I/O-bound processes will greedily devote > + * almost all their CPU cycles only to their target, > + * throughput-friendly I/O operations. This is even more true if BFQ > + * happens to be underestimating the device peak rate, and thus > + * overestimating the duration of weight raising. But, according to > + * our measurements, once transferred 110K sectors, these processes > + * have no right to be weight-raised any longer. > + * > + * Basing on the last consideration, BFQ ends weight-raising for a > + * bfq_queue if the latter happens to have received an amount of > + * service at least equal to the following constant. The constant is > + * set to slightly more than 110K, to have a minimum safety margin. > + * > + * This early ending of weight-raising reduces the amount of time > + * during which interactive false positives cause the two problems > + * described at the beginning of these comments. > + */ > +static const unsigned long max_service_from_wr = 120000; > + > #define RQ_BIC(rq) icq_to_bic((rq)->elv.priv[0]) > #define RQ_BFQQ(rq) ((rq)->elv.priv[1]) > > @@ -1352,6 +1408,7 @@ static void bfq_update_bfqq_wr_on_rq_arrival(struct > bfq_data *bfqd, if (old_wr_coeff == 1 && wr_or_deserves_wr) { > /* start a weight-raising period */ > if (interactive) { > + bfqq->service_from_wr = 0; > bfqq->wr_coeff = bfqd->bfq_wr_coeff; > bfqq->wr_cur_max_time = bfq_wr_duration(bfqd); > } else { > @@ -3665,6 +3722,12 @@ static void bfq_update_wr_data(struct bfq_data *bfqd, > struct bfq_queue *bfqq) bfqq->entity.prio_changed = 1; > } > } > + if (bfqq->wr_coeff > 1 && > + bfqq->wr_cur_max_time != bfqd->bfq_wr_rt_max_time && > + bfqq->service_from_wr > max_service_from_wr) { > + /* see comments on max_service_from_wr */ > + bfq_bfqq_end_wr(bfqq); > + } > } > /* > * To improve latency (for this or other queues), immediately > diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h > index fcd941008127..350c39ae2896 100644 > --- a/block/bfq-iosched.h > +++ b/block/bfq-iosched.h > @@ -337,6 +337,11 @@ struct bfq_queue { > * last transition from idle to backlogged. > */ > unsigned long service_from_backlogged; > + /* > + * Cumulative service received from the @bfq_queue since its > + * last transition to weight-raised state. > + */ > + unsigned long service_from_wr; > > /* > * Value of wr start time when switching to soft rt > diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c > index 4456eda34e48..4498c43245e2 100644 > --- a/block/bfq-wf2q.c > +++ b/block/bfq-wf2q.c > @@ -838,6 +838,9 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served) > if (!bfqq->service_from_backlogged) > bfqq->first_IO_time = jiffies; > > + if (bfqq->wr_coeff > 1) > + bfqq->service_from_wr += served; > + > bfqq->service_from_backlogged += served; > for_each_entity(entity) { > st = bfq_entity_service_tree(entity); Building and running it routinely on my laptop didn't reveal any smoke (at least, yet), so Tested-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx> (actually, currently my Tested-by applies to all pending BFQ patches too) Regards, Oleksandr