Re: [PATCH 0/5] blk-mq: fix use-after-free on stale request

On Wed, Aug 26, 2020 at 08:24:07PM +0800, Ming Lei wrote:
> On Wed, Aug 26, 2020 at 01:03:37PM +0100, John Garry wrote:
> > On 21/08/2020 03:49, Ming Lei wrote:
> > > Hello Bart,
> > > 
> > > On Thu, Aug 20, 2020 at 01:30:38PM -0700, Bart Van Assche wrote:
> > > > On 8/20/20 11:03 AM, Ming Lei wrote:
> > > > > We can't run allocating driver tag and updating tags->rqs[tag] atomically,
> > > > > so stale request may be retrieved from tags->rqs[tag]. More seriously, the
> > > > > stale request may have been freed via updating nr_requests or switching
> > > > > elevator or other use cases.
> > > > > 
> > > > > > It is a long-standing issue, and Jianchao previously worked towards using
> > > > > > static_rqs[] for iterating requests; one problem is that it can be hard
> > > > > > to use when iterating over a tagset.
> > > > > 
> > > > > > This patchset takes a different approach to fixing the issue: cache
> > > > > > freed rqs pages and release them only after all tags->rqs[] references to
> > > > > > these pages are gone.
> > > > 
> > > > Hi Ming,
> > > > 
> > > > Is this the only possible solution? Would it e.g. be possible to protect the
> > > > code that iterates over all tags with rcu_read_lock() / rcu_read_unlock() and
> > > > to free pages that contain request pointers only after an RCU grace period has
> > > > expired?
> > > 
> > > That can't work, tags->rqs[] is host-wide, request pool belongs to scheduler tag
> > > and it is owned by request queue actually. When one elevator is switched on this
> > > request queue or updating nr_requests, the old request pool of this queue is freed,
> > > but IOs are still queued from other request queues in this tagset. Elevator switch
> > > > or updating nr_requests on one request queue shouldn't or can't affect other
> > > > request queues in the same tagset.
> > > 
> > > Meantime the reference in tags->rqs[] may stay a bit long, and RCU can't cover this
> > > case.
> > > 
> > > Also we can't reset the related tags->rqs[tag] simply somewhere, cause it may
> > > race with new driver tag allocation.
> > 
> > How about iterate all tags->rqs[] for all scheduler tags when exiting the
> > scheduler, etc, and clear any scheduler requests references, like this:
> > 
> > cmpxchg(&hctx->tags->rqs[tag], scheduler_rq, 0);
> > 
> > So we NULLify any tags->rqs[] entries which contain a scheduler request of
> > concern atomically, cleaning up any references.
> 
> Looks like this approach can work, given that cmpxchg() will prevent a new
> store to this address.

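Another process may still be reading this to-be-freed request via
blk_mq_queue_tag_busy_iter() or blk_mq_tagset_busy_iter() while the NULLify
is done and all requests of this scheduler are freed. The cmpxchg() only
clears the slot; it does nothing about a pointer the iterator has already
loaded.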
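
To make the interleaving concrete, here is a minimal userspace sketch, not
kernel code. It assumes C11 atomics as a stand-in for the kernel's cmpxchg(),
and `struct request`, `rqs[]`, `clear_if_matches()` and `replay_race()` are
hypothetical names made up for illustration. It replays the three steps in a
single thread, so it only shows the ordering problem, not a real concurrent
run.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct request. */
struct request { int tag; };

#define NR_TAGS 4

/* Userspace model of tags->rqs[]; assumed atomic only for this sketch. */
static _Atomic(struct request *) rqs[NR_TAGS];

/*
 * Models cmpxchg(&hctx->tags->rqs[tag], scheduler_rq, 0): clear the slot
 * only if it still points at the given scheduler request.
 * Returns 1 if the slot was cleared, 0 otherwise.
 */
static int clear_if_matches(int tag, struct request *sched_rq)
{
	struct request *expected = sched_rq;

	return atomic_compare_exchange_strong(&rqs[tag], &expected, NULL);
}

/*
 * Replays the problematic ordering. Returns 1 if the iterator still holds
 * a pointer into the request pool after the slot has been cleared.
 */
static int replay_race(void)
{
	struct request pool[NR_TAGS] = { {0}, {1}, {2}, {3} };
	struct request *seen;
	int cleared;

	atomic_store(&rqs[2], &pool[2]);

	/* Step 1: the busy iterator loads the pointer from the slot. */
	seen = atomic_load(&rqs[2]);

	/* Step 2: scheduler exit clears the slot with a compare-and-swap. */
	cleared = clear_if_matches(2, &pool[2]);

	/*
	 * Step 3: the pool would be freed here. The iterator's local copy
	 * still points into it; we only compare addresses, never dereference.
	 */
	return cleared && atomic_load(&rqs[2]) == NULL && seen == &pool[2];
}
```

In this sketch replay_race() returns 1: the slot is NULL, yet the iterator's
local copy still refers to the old pool. Anything that only clears
tags->rqs[] would need some additional way to wait for such readers before
the pages are freed.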


Thanks, 
Ming



