> On 31 May 2022, at 12:01, Jan Kara <jack@xxxxxxx> wrote:
> 
> On Tue 31-05-22 17:33:25, Yu Kuai wrote:
>> On 2022/05/31 17:19, Paolo Valente wrote:
>>>> On 31 May 2022, at 11:06, Yu Kuai <yukuai3@xxxxxxxxxx> wrote:
>>>> 
>>>> On 2022/05/31 16:36, Paolo VALENTE wrote:
>>>>>> On 30 May 2022, at 10:40, Yu Kuai <yukuai3@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>> On 2022/05/30 16:34, Yu Kuai wrote:
>>>>>>> On 2022/05/30 16:10, Paolo Valente wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 28 May 2022, at 11:50, Yu Kuai <yukuai3@xxxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> Currently, bfq can't handle sync I/Os concurrently as long as
>>>>>>>>> they are not issued from the root group. This is because
>>>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>>>> bfq_asymmetric_scenario().
>>>>>>>>> 
>>>>>>>>> The way that a bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>>> 
>>>>>>>>> Before this patch:
>>>>>>>>> 1) The root group is never counted.
>>>>>>>>> 2) Count the bfqg if it or its child bfqgs have pending requests.
>>>>>>>>> 3) Don't count the bfqg if it and its child bfqgs have completed
>>>>>>>>>    all their requests.
>>>>>>>>> 
>>>>>>>>> After this patch:
>>>>>>>>> 1) The root group is counted.
>>>>>>>>> 2) Count the bfqg if it has at least one bfqq that is marked busy.
>>>>>>>>> 3) Don't count the bfqg if it doesn't have any busy bfqqs.
>>>>>>>> 
>>>>>>>> Unfortunately, I see one last problem here. I see a double change:
>>>>>>>> (1) a bfqg is now counted only as a function of the state of its
>>>>>>>> child queues, and no longer also of its child bfqgs
>>>>>>>> (2) the state considered for counting a bfqg moves from having
>>>>>>>> pending requests to having busy queues
>>>>>>>> 
>>>>>>>> I'm ok with (1), which is a good catch (you already explained the
>>>>>>>> idea to me some time ago IIRC).
>>>>>>>> 
>>>>>>>> Yet I fear that (2) is not ok. A bfqq can become non-busy even if
>>>>>>>> it still has in-flight I/O, i.e. I/O being served in the drive.
>>>>>>>> The weight of such a bfqq must still be considered in the
>>>>>>>> weights_tree, and the group containing such a queue must still be
>>>>>>>> counted when checking whether the scenario is asymmetric.
>>>>>>>> Otherwise service guarantees are broken. The reason is that, if a
>>>>>>>> scenario is deemed symmetric because in-flight I/O is not taken
>>>>>>>> into account, then idling will not be performed to protect some
>>>>>>>> bfqq, and in-flight I/O may steal bandwidth from that bfqq in an
>>>>>>>> uncontrolled way.
>>>>>>> 
>>>>>>> Hi, Paolo
>>>>>>> 
>>>>>>> Thanks for your explanation.
>>>>>>> 
>>>>>>> My original thought was to use weights_tree insertion/removal;
>>>>>>> however, Jan convinced me that using bfq_add/del_bfqq_busy() is ok.
>>>>>>> 
>>>>>>> From what I see, when a bfqq dispatches its last request,
>>>>>>> bfq_del_bfqq_busy() will not be called from __bfq_bfqq_expire() if
>>>>>>> idling is needed, and it will be delayed until such a bfqq is
>>>>>>> scheduled as the in-service queue again. Which means the weight of
>>>>>>> such a bfqq should still be considered in the weights_tree.
>>>>>>> 
>>>>>>> I also ran some tests on null_blk with "irqmode=2
>>>>>>> completion_nsec=100000000(100ms) hw_queue_depth=1", and the tests
>>>>>>> show that service guarantees are still preserved on a slow device.
>>>>>>> 
>>>>>>> Do you think this is strong enough to cover your concern?
>>>>> 
>>>>> Unfortunately it is not. Your very argument is what made me believe
>>>>> that considering busy queues was enough in the first place. But, as
>>>>> I found out, the problem is caused by the queues that do not enjoy
>>>>> idling. With your patch (as well as in my initial version) they are
>>>>> not counted when they remain without requests queued. And this makes
>>>>> asymmetric scenarios be erroneously considered symmetric. The
>>>>> consequence is that idling gets switched off when it should have
>>>>> been kept on, and control over bandwidth is lost for the victim
>>>>> in-service queues.
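
To make the above concern concrete: in very reduced form, and leaving
out all the weight- and class-related conditions, the two checks
involved behave roughly as in the sketch below. This is only an
illustration of the group-counting part being discussed here, not the
actual code.

	/* Reduced sketch, for illustration only. */
	static bool bfq_asymmetric_scenario(struct bfq_data *bfqd,
					    struct bfq_queue *bfqq)
	{
		/* varied-weights and multiple-class checks omitted */
		return bfqd->num_groups_with_pending_reqs > 0;
	}

	static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
							 struct bfq_queue *bfqq)
	{
		/* other conditions omitted */
		return bfq_asymmetric_scenario(bfqd, bfqq);
	}

If the group of a bfqq that only has in-flight I/O is no longer counted
in num_groups_with_pending_reqs, the check above can report a symmetric
scenario, idling is then switched off, and the in-flight I/O may steal
bandwidth from the queue that should have been protected.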
>>>> 
>>>> Hi, Paolo
>>>> 
>>>> Thanks for your explanation. Are you saying that if a bfqq doesn't
>>>> enjoy idling, then such a bfqq will be cleared from busy after
>>>> dispatching its last request?
>>>> 
>>>> Please kindly correct me if I'm wrong about the following process:
>>>> 
>>>> If more than one bfqg is activated, then bfqqs that do not enjoy
>>>> idling are still left busy after dispatching their last request.
>>>> 
>>>> Details in __bfq_bfqq_expire():
>>>> 
>>>> if (RB_EMPTY_ROOT(&bfqq->sort_list) &&
>>>>     !(reason == BFQQE_PREEMPTED &&
>>>>       idling_needed_for_service_guarantees(bfqd, bfqq))) {
>>>> -> idling_needed_for_service_guarantees() will always return true,
>>> 
>>> It returns true only if the scenario is asymmetric. Not counting
>>> bfqqs with in-flight requests makes an asymmetric scenario be wrongly
>>> considered symmetric. See function bfq_asymmetric_scenario().
>> 
>> Hi, Paolo
>> 
>> Do you mean this gap?
>> 
>> 1. io1 is issued from bfqq1 (from bfqg1)
>> 2. bfqq1 dispatches this io, and its busy state is cleared
>> 3. *before io1 is completed*, io2 is issued from bfqq2 (from bfqg2)
> 
> Yes. So as far as I understand, Paolo is concerned about this scenario.
> 
>> 4. with this patchset, while dispatching io2 from bfqq2, the scenario
>> should be symmetric, while it's wrongly considered asymmetric.
> 
> But with this patchset, we will consider this scenario symmetric because at
> any point in time there is only one busy bfqq. Before, we considered this
> scenario asymmetric because two different bfq groups have a bfqq in their
> weights_tree. So before this patchset
> idling_needed_for_service_guarantees() returned true; after this patchset
> the function returns false, so we won't idle anymore, and Paolo argues that
> bfqq1 does not get adequate protection from bfqq2 as a result.
> 
> I agree with Paolo that this seems possible. The fix is relatively simple
> though - instead of changing how the weights_tree is used for weight-raised
> queues as you did originally, I'd move the accounting of groups with
> pending requests to bfq_add/del_bfqq_busy() and bfq_completed_request().
> 

Why don't we simply use the existing logic? I mean, as for the changes
made by this patch, we could simply turn the loop:

void bfq_weights_tree_remove(struct bfq_data *bfqd,
			     struct bfq_queue *bfqq)
{
	...
	for_each_entity(entity) {
		struct bfq_sched_data *sd = entity->my_sched_data;
		...
		if (entity->in_groups_with_pending_reqs) {
			entity->in_groups_with_pending_reqs = false;
			bfqd->num_groups_with_pending_reqs--;
		}
	}
	...
}

into a single:

	bfqd->num_groups_with_pending_reqs--;

so that only the parent group is concerned.

Thanks,
Paolo

> Honza
> -- 
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR
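
P.S. Spelled out, the simplification suggested above would make the
function look roughly like the following. This is only a sketch: the
elided parts are exactly the ones elided in the excerpt above, and
where exactly the decrement should go has to be checked against the
rest of the series.

	void bfq_weights_tree_remove(struct bfq_data *bfqd,
				     struct bfq_queue *bfqq)
	{
		...
		/*
		 * No walk over the ancestor entities anymore: only the
		 * parent group of bfqq stops being accounted for.
		 */
		bfqd->num_groups_with_pending_reqs--;
		...
	}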