Re: races between blk-cgroup operations and I/O scheds in blk-mq (?)

Paolo Valente <paolo.valente@xxxxxxxxxx> · Thu, 18 May 2017 09:35:17 +0200

> On 17 May 2017, at 21:12, Tejun Heo <tj@xxxxxxxxxx> wrote:
> 
> Hello,
> 
> On Mon, May 15, 2017 at 09:49:13PM +0200, Paolo Valente wrote:
>> So, unless you tell me that there are other races I haven't seen, or,
>> even worse, that I'm just talking nonsense, I have thought of a simple
>> solution to address this issue without resorting to the request_queue
>> lock: further caching, on blkg lookups, the only policy or blkg data
>> the scheduler may use, and access this data directly when needed.  By
>> doing so, the issue is reduced to the occasional use of stale data.
>> And apparently this already happens, e.g., in cfq when it uses the
>> weight of a cfq_queue associated with a process whose group has just
>> been changed (and for which a blkg_lookup has not yet been invoked).
>> The same should happen when cfq invokes cfq_log_cfqq for such a
>> cfq_queue, as this function prints the path of the group the cfq_queue
>> belongs to.
> 
> I haven't studied the code but the problem sounds correct to me.  All
> of blkcg code assumes the use of rq lock.  And, yeah, none of the hot
> paths requires strong synchronization.  All the actual management
> operations can be synchronized separately and the hot lookup path can
> be protected with rcu and maybe percpu reference counters.
> 

Great, thanks for this ack.  User reports do confirm the problem, and,
so far, the effectiveness of a solution I have implemented.  I'm
finalizing the patch for submission.
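
For readers following the thread, here is a rough user-space
illustration of the caching idea quoted above.  It is not the patch,
and it uses no kernel API: group_data, sched_queue, cache_on_lookup
and queue_weight are all hypothetical names made up for this sketch.
It only shows the intended trade-off: the hot path reads a private
snapshot taken at lookup time, so the worst case is stale data until
the next lookup, never a dereference of a group that may be going
away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATH_LEN 32

/* Hypothetical stand-in for per-group policy data (not a kernel struct). */
struct group_data {
	unsigned int weight;
	char path[PATH_LEN];	/* assumed to be NUL-terminated */
};

/* Hypothetical scheduler queue holding a private snapshot of group data. */
struct sched_queue {
	unsigned int cached_weight;
	char cached_path[PATH_LEN];
};

/*
 * Lookup path: copy the only fields the scheduler will need later.
 * In a real scheduler this would run where the group is known to be
 * valid; that precondition is assumed here, not enforced.
 */
static void cache_on_lookup(struct sched_queue *q, const struct group_data *g)
{
	assert(q != NULL && g != NULL);
	q->cached_weight = g->weight;
	memcpy(q->cached_path, g->path, sizeof(q->cached_path));
}

/* Hot path: reads only the snapshot and never touches the group. */
static unsigned int queue_weight(const struct sched_queue *q)
{
	return q->cached_weight;
}
```

If the group's weight changes after cache_on_lookup(), queue_weight()
keeps returning the old value until the next lookup refreshes the
snapshot, which is the "occasional use of stale data" mentioned in the
quoted text.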

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun