On Wed, Aug 26, 2020 at 01:03:37PM +0100, John Garry wrote:
> On 21/08/2020 03:49, Ming Lei wrote:
> > Hello Bart,
> >
> > On Thu, Aug 20, 2020 at 01:30:38PM -0700, Bart Van Assche wrote:
> > > On 8/20/20 11:03 AM, Ming Lei wrote:
> > > > We can't allocate a driver tag and update tags->rqs[tag] atomically,
> > > > so a stale request may be retrieved from tags->rqs[tag]. More seriously, the
> > > > stale request may have been freed via updating nr_requests or switching
> > > > elevator or other use cases.
> > > >
> > > > It is a long-standing issue, and Jianchao previously worked towards using
> > > > static_rqs[] for iterating requests; one problem is that it can be hard
> > > > to use when iterating over a tagset.
> > > >
> > > > This patchset takes a different approach to fixing the issue: cache
> > > > freed rqs pages and defer releasing them until all tags->rqs[] references
> > > > to these pages are gone.
> > >
> > > Hi Ming,
> > >
> > > Is this the only possible solution? Would it e.g. be possible to protect the
> > > code that iterates over all tags with rcu_read_lock() / rcu_read_unlock() and
> > > to free pages that contain request pointers only after an RCU grace period has
> > > expired?
> >
> > That can't work: tags->rqs[] is host-wide, while the request pool belongs to the
> > scheduler tags and is actually owned by the request queue. When an elevator is
> > switched on this request queue, or nr_requests is updated, the old request pool of
> > this queue is freed, but IOs are still queued from other request queues in this
> > tagset. An elevator switch or nr_requests update on one request queue shouldn't
> > and can't affect other request queues in the same tagset.
> >
> > Meanwhile, the reference in tags->rqs[] may stay around for quite a while, and RCU
> > can't cover this case.
> >
> > Also, we can't simply reset the related tags->rqs[tag] somewhere, because it may
> > race with new driver tag allocation.
>
> How about iterating over all tags->rqs[] for all scheduler tags when exiting the
> scheduler, etc., and clearing any scheduler request references, like this:
>
> cmpxchg(&hctx->tags->rqs[tag], scheduler_rq, 0);
>
> So we NULLify any tags->rqs[] entries which contain a scheduler request of
> concern atomically, cleaning up any references.

Looks like this approach can work, given that cmpxchg() will prevent a new store to this address.

>
> I quickly tried it and it looks to work, but maybe not so elegant.

I think this way is good enough.

thanks,
Ming
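
For illustration only, the cmpxchg() idea quoted above can be modelled in user space with C11 atomics. This is a rough sketch, not kernel code and not the patch under discussion: rqs[], NR_TAGS, assign_tag() and clear_stale_refs() are hypothetical names, and atomic_compare_exchange_strong() stands in for the kernel's cmpxchg(). The point it shows is that a slot is cleared only if it still holds a pointer into the pool being freed, so a request pointer stored concurrently by a new driver tag allocation is left alone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_TAGS 8	/* hypothetical driver tag depth */

struct request {
	int id;
};

/* Stand-in for the shared tags->rqs[] array: one slot per driver tag. */
static _Atomic(struct request *) rqs[NR_TAGS];

/* Models the store done when a driver tag is assigned to a request. */
static void assign_tag(unsigned int tag, struct request *rq)
{
	assert(tag < NR_TAGS);
	atomic_store(&rqs[tag], rq);
}

/*
 * Before freeing the scheduler request pool [pool, pool + nr), clear every
 * slot that still points into it. The compare-and-swap only writes NULL if
 * the slot still holds the stale pointer; if another request was stored in
 * the meantime, the compare fails and that newer pointer is kept.
 * Returns the number of slots that were cleared.
 */
static unsigned int clear_stale_refs(struct request *pool, size_t nr)
{
	unsigned int cleared = 0;

	for (size_t i = 0; i < nr; i++) {
		for (unsigned int tag = 0; tag < NR_TAGS; tag++) {
			struct request *expected = &pool[i];

			if (atomic_compare_exchange_strong(&rqs[tag], &expected,
							   (struct request *)NULL))
				cleared++;
		}
	}
	return cleared;
}
```

In this model, calling clear_stale_refs(pool, nr) just before freeing the pool leaves slots that point at requests from other pools untouched. The nested loop costs nr * NR_TAGS compare-and-swap operations, which is probably tolerable on a slow path such as an elevator switch, but that is an assumption rather than something measured here.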