On 2020/3/18 19:07, Paolo Valente wrote: > > >> Il giorno 18 mar 2020, alle ore 10:52, Zhiqiang Liu <liuzhiqiang26@xxxxxxxxxx> ha scritto: >> >> >> >> On 2020/3/18 16:45, Paolo Valente wrote: >>> >>> >>>>>> spin_lock_irqsave(&bfqd->lock, flags); >>>>>> - bfq_clear_bfqq_wait_request(bfqq); >>>>>> - >>>>>> if (bfqq != bfqd->in_service_queue) { >>>>>> spin_unlock_irqrestore(&bfqd->lock, flags); >>>>>> return; >>>>>> } >>>>>> >>>>>> + bfq_clear_bfqq_wait_request(bfqq); >>>>>> + >>>>> >>>>> Please add a comment on why you (correctly) clear this flag only if bfqq is in service. >>>>> >>>>> For the rest, seems ok to me. >>>>> >>>>> Thank you very much for spotting and fixing this bug, >>>>> Paolo >>>>> >>>> Thanks for your reply. >>>> Considering that the bfqq may be in race, we should firstly check whether bfqq is in service before >>>> doing something on it. >>>> >>> >>> The comment you propose is correct, but the correctness issue I raised >>> is essentially the opposite. Sorry for not being clear. >>> >>> Let me put it the other way round: why is it still correct that, if >>> bfqq is not the queue in service, then that flag is not cleared at all? >>> IOW, why is it not a problem that that flag remains untouched is bfqq >>> is not in service? >>> >>> Thanks, >>> Paolo >>> >> Thanks for your patient. >> As you comment in bfq_idle_slice_timer, there are two race situations as follows, >> a) bfqq is null >> bfq_idle_slice_timer will not call bfq_idle_slice_timer_body ->no problem >> b) bfqq are not in service >> 1) bfqq is freed >> it will cause use-after-free problem before calling bfq_clear_bfqq_wait_request >> in bfq_idle_slice_timer_body. -> use-after-free problem as analyzed in the patch. >> 2) bfqq is not freed >> it means in_service_queue has been set to a new bfqq. The old bfqq has been expired >> through __bfq_bfqq_expire func. Then the wait_request flags of old bfqq will be cleared >> in __bfq_bfqd_reset_in_service func. -> it is no a problem to re-clear the wait_request >> flags before checking whether bfqq is in service. > > Great, this item 2 is exactly what I meant. We need a comment > because, even if now this stuff is clear to you, imagine somebody > else getting to your modified piece of code after reading hundreds of > lines of code, about a non-trivial state machine as BFQ ... :) > > Thanks, > Paolo > Ok, I will add the following comment in the v3 patch. Considering that the bfqq may be in race, we should firstly check whether bfqq is in service before doing something on it. If the bfqq in race is not in service, it means the bfqq has been expired through __bfq_bfqq_expire func, and wait_request flags has been cleared in __bfq_bfqd_reset_in_service func. >> >> In one word, the old bfqq in race has already cleared the wait_request flag when switching in_service_queue. >> >> Thanks, >> Zhiqiang Liu >> >>>>>> >>>>> >>>>> >>>>> . >>> >>> >>> . >>> >> > > > . >