On Sun, Jun 27, 2010 at 5:44 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> Vivek Goyal <vgoyal@xxxxxxxxxx> writes:
>
>> On Fri, Jun 25, 2010 at 01:03:20PM +0200, Christoph Hellwig wrote:
>>> On Wed, Jun 23, 2010 at 09:44:20PM -0400, Vivek Goyal wrote:
>>> I see the point of this logic for reads, where various workloads have
>>> dependent reads that might be close to each other, but I don't really
>>> see any point for writes.
>>>
>>> > So it looks like the fsync path will do a bunch of IO and then wait for
>>> > the jbd thread to finish the work. In this case idling is a waste of time.
>>>
>>> Given that ->writepage already does WRITE_SYNC_PLUG I/O, which includes
>>> REQ_NOIDLE, I'm still confused why we still have that issue.
>>
>> In its current form, cfq honors REQ_NOIDLE only conditionally, and that's
>> why we still have the issue. If you look at cfq_completed_request(), we
>> continue to idle in the following two cases.
>>
>> - If we classified the queue as SYNC_WORKLOAD.
>> - If there is another random read/write happening on the sync-noidle
>>   service tree.
>>
>> SYNC_WORKLOAD means that cfq thinks this particular queue is doing
>> sequential IO. For random IO queues, we don't idle on each individual
>> queue but on a group of queues.
>>
>> In Jeff's testing, the fsync thread/queue is sometimes viewed as a
>> sequential workload and goes on the SYNC_WORKLOAD tree. In that case, even
>> if a request is REQ_NOIDLE, we will continue to idle, hence the fsync issue.
>
> I'm now testing OCFS2, and I'm seeing performance that is not great
> (even with the blk_yield patches applied). What happens is that we
> successfully yield the queue to the journal thread, but then idle on the
> journal thread (even though RQ_NOIDLE was set).
>
> So, can we just get rid of idling when RQ_NOIDLE is set?

Hi Jeff,
I think I spotted a problem with the initial implementation of the tree-wide
idle when RQ_NOIDLE is set: I assumed that a queue would either send
possibly-idling requests or no-idle requests, but it seems that RQ_NOIDLE is
being used to mark the end of a stream of possibly-idling requests (in my
initial implementation, this then causes an unintended idle). The attached
patch should fix it, and I think the logic is simpler than Vivek's. Can you
give it a spin?

Otherwise, I think that reverting the "noidle_tree_requires_idle" behaviour
completely may be better than adding complexity, since it is really trying to
solve corner cases (that maybe happen only on synthetic workloads) while
negatively affecting more common cases.

About what it is trying to solve, since I think it was not clear:
- we have a workload of 2 queues, both issuing requests that are put in the
  no-idle tree (e.g. they are random), plus 1 queue issuing idling requests
  (e.g. sequential);
- if one of the 2 "random" queues marks its requests as RQ_NOIDLE, then the
  timeslice for the no-idle tree is not preserved, causing unfairness as soon
  as an RQ_NOIDLE request is serviced and the tree is empty.

Thanks,
Corrado

>
> Vivek sent me this patch to test, and it got rid of the performance
> issue for the fsync workload. Can we discuss its merits?
>
> Thanks,
> Jeff
>
> Index: linux-2.6/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6.orig/block/cfq-iosched.c	2010-06-25 15:57:33.832125786 -0400
> +++ linux-2.6/block/cfq-iosched.c	2010-06-25 15:59:19.788876361 -0400
> @@ -318,6 +318,7 @@
>  	CFQ_CFQQ_FLAG_split_coop,	/* shared cfqq will be splitted */
>  	CFQ_CFQQ_FLAG_deep,		/* sync cfqq experienced large depth */
>  	CFQ_CFQQ_FLAG_wait_busy,	/* Waiting for next request */
> +	CFQ_CFQQ_FLAG_group_idle,	/* This queue is doing group idle */
>  };
>
>  #define CFQ_CFQQ_FNS(name)						\
> @@ -347,6 +348,7 @@
>  CFQ_CFQQ_FNS(split_coop);
>  CFQ_CFQQ_FNS(deep);
>  CFQ_CFQQ_FNS(wait_busy);
> +CFQ_CFQQ_FNS(group_idle);
>  #undef CFQ_CFQQ_FNS
>
>  #ifdef CONFIG_CFQ_GROUP_IOSCHED
> @@ -1613,6 +1615,7 @@
>
>  	cfq_clear_cfqq_wait_request(cfqq);
>  	cfq_clear_cfqq_wait_busy(cfqq);
> +	cfq_clear_cfqq_group_idle(cfqq);
>
>  	/*
>  	 * If this cfqq is shared between multiple processes, check to
> @@ -3176,6 +3179,13 @@
>  	if (cfq_class_rt(new_cfqq) && !cfq_class_rt(cfqq))
>  		return true;
>
> +	/*
> +	 * If we're doing group_idle and we got a new request in the same
> +	 * group, preempt the queue.
> +	 */
> +	if (cfq_cfqq_group_idle(cfqq))
> +		return true;
> +
>  	if (!cfqd->active_cic || !cfq_cfqq_wait_request(cfqq))
>  		return false;
>
> @@ -3271,6 +3281,7 @@
>  	struct cfq_queue *cfqq = RQ_CFQQ(rq);
>
>  	cfq_log_cfqq(cfqd, cfqq, "insert_request");
> +	cfq_clear_cfqq_group_idle(cfqq);
>  	cfq_init_prio_data(cfqq, RQ_CIC(rq)->ioc);
>
>  	rq_set_fifo_time(rq, jiffies + cfqd->cfq_fifo_expire[rq_is_sync(rq)]);
> @@ -3416,10 +3427,12 @@
>  		 * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
>  		 * only if we processed at least one !rq_noidle request
>  		 */
> -		if (cfqd->serving_type == SYNC_WORKLOAD
> -		    || cfqd->noidle_tree_requires_idle
> -		    || cfqq->cfqg->nr_cfqq == 1)
> +		if (cfqd->noidle_tree_requires_idle)
> +			cfq_arm_slice_timer(cfqd);
> +		else if (cfqq->cfqg->nr_cfqq == 1) {
> +			cfq_mark_cfqq_group_idle(cfqq);
>  			cfq_arm_slice_timer(cfqd);
> +		}
>  	}
>  }
>

--
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo@xxxxxxxxx
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the
average man. The average man seeks certainty in the eyes of the onlooker
and calls that self-confidence. The warrior seeks impeccability in his
own eyes and calls that humbleness.
                               Tales of Power - C. Castaneda
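To make the failure mode described above concrete, here is a minimal
user-space sketch (toy struct and helper names, not the kernel code) of the
sticky boolean behaviour that the attached patch replaces: a queue issues a
few ordinary sync requests and then one REQ_NOIDLE request (the fsync case),
and because the flag is only ever OR-ed in, the slice still ends with an
idle.

#include <stdbool.h>
#include <stdio.h>

struct toy_cfqd {
        bool noidle_tree_requires_idle;
};

/* Pre-patch logic: the flag is sticky for the rest of the slice. */
static bool complete_request(struct toy_cfqd *cfqd, bool rq_noidle)
{
        cfqd->noidle_tree_requires_idle |= !rq_noidle;
        return cfqd->noidle_tree_requires_idle;  /* arm the idle timer? */
}

int main(void)
{
        struct toy_cfqd cfqd = { .noidle_tree_requires_idle = false };
        /* three ordinary sync requests, then the trailing REQ_NOIDLE one */
        bool stream[] = { false, false, false, true };
        bool idle = false;
        unsigned int i;

        for (i = 0; i < sizeof(stream) / sizeof(stream[0]); i++)
                idle = complete_request(&cfqd, stream[i]);

        /* prints 1: the tree still idles after the final REQ_NOIDLE request */
        printf("%d\n", idle);
        return 0;
}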
From 30bdf702604f304607ac8a986d7d89285c7141fa Mon Sep 17 00:00:00 2001
From: Corrado Zoccolo <czoccolo@xxxxxxxxx>
Date: Tue, 29 Jun 2010 10:48:21 +0200
Subject: [PATCH] cfq-iosched: fix tree-wide handling of rq_noidle

Patch 8e55063 (cfq-iosched: fix corner cases in idling logic) introduced the
possibility of idling even on no-idle requests in the no_idle tree, if any
previous request in the current slice could idle. The implementation had a
problem, though:
- if a queue sent several possibly-idling requests and a no-idle request as
  the last one in the same time slice, the tree was still marked as idling.
This patch fixes the misbehaviour by using a 31-bucket hash to keep the
idling/non-idling status of the queues in the no-idle tree.

Signed-off-by: Corrado Zoccolo <czoccolo@xxxxxxxxx>
---
 block/cfq-iosched.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index d1ad066..5ef9a5d 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -214,7 +214,7 @@ struct cfq_data {
 	enum wl_type_t serving_type;
 	unsigned long workload_expires;
 	struct cfq_group *serving_group;
-	bool noidle_tree_requires_idle;
+	u32 noidle_tree_requires_idle;
 
 	/*
 	 * Each priority tree is sorted by next_request position. These
@@ -2066,7 +2066,7 @@ static void choose_service_tree(struct cfq_data *cfqd, struct cfq_group *cfqg)
 	slice = max_t(unsigned, slice, CFQ_MIN_TT);
 	cfq_log(cfqd, "workload slice:%d", slice);
 	cfqd->workload_expires = jiffies + slice;
-	cfqd->noidle_tree_requires_idle = false;
+	cfqd->noidle_tree_requires_idle = 0U;
 }
 
 static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd)
@@ -3349,7 +3349,12 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
 			cfq_slice_expired(cfqd, 1);
 		else if (sync && cfqq_empty &&
 			 !cfq_close_cooperator(cfqd, cfqq)) {
-			cfqd->noidle_tree_requires_idle |= !rq_noidle(rq);
+			u32 bitmask = 1U << (((int)cfqq) % 31);
+			if (rq_noidle(rq))
+				cfqd->noidle_tree_requires_idle &= ~bitmask;
+			else
+				cfqd->noidle_tree_requires_idle |= bitmask;
+
 			/*
 			 * Idling is enabled for SYNC_WORKLOAD.
 			 * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
-- 
1.6.4.4
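For comparison, here is a similar user-space sketch (hypothetical helper
names; in the patch above the state lives in cfqd->noidle_tree_requires_idle)
of the bookkeeping this change introduces: each queue hashes to one of 31
bits, the bit records whether that queue's last completed request in the
slice was an idling one, and the tree keeps idling only while at least one
bit is still set.

#include <stdint.h>
#include <stdio.h>

static uint32_t noidle_tree_requires_idle;

/* One of 31 buckets, chosen from the queue's address. */
static uint32_t queue_bit(const void *cfqq)
{
        return 1U << ((uintptr_t)cfqq % 31);
}

static void complete_request(const void *cfqq, int rq_noidle)
{
        if (rq_noidle)
                noidle_tree_requires_idle &= ~queue_bit(cfqq);
        else
                noidle_tree_requires_idle |= queue_bit(cfqq);
}

int main(void)
{
        int q1, q2;     /* stand-ins for two cfq_queue objects */

        complete_request(&q1, 0);       /* q1 sends an idling request  */
        complete_request(&q1, 1);       /* ...then its trailing fsync  */
        complete_request(&q2, 0);       /* q2's last request can idle  */

        /* Prints "yes": q2's bucket is still set even though q1 cleared its own. */
        printf("idle? %s\n", noidle_tree_requires_idle ? "yes" : "no");
        return 0;
}

Two queues can of course hash to the same bucket, in which case whichever
completes last wins; presumably that is the intended trade-off for keeping
the whole status in a single 32-bit word.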