> > Hello Kashyap, > > On Wed, Dec 15, 2021 at 12:11:13AM +0530, Kashyap Desai wrote: > > + John Garry > > > > > blk-mq can't run allocating driver tag and updating ->rqs[tag] > > atomically, > > > meantime blk-mq doesn't clear ->rqs[tag] after the driver tag is > > released. > > > > > > So there is chance to iterating over one stale request just after > > > the > > tag is > > > allocated and before updating ->rqs[tag]. > > > > > > scsi_host_busy_iter() calls scsi_host_check_in_flight() to count > > > scsi > > in-flight > > > requests after scsi host is blocked, so no new scsi command can be > > marked as > > > SCMD_STATE_INFLIGHT. However, driver tag allocation still can be run > > > by > > blk- > > > mq core. One request is marked as SCMD_STATE_INFLIGHT, but this > > > request may have been kept in another slot of ->rqs[], meantime the > > > slot can be allocated out but ->rqs[] isn't updated yet. Then this > > > in-flight request > > is > > > counted twice as SCMD_STATE_INFLIGHT. This way causes trouble in > > handling > > > scsi error. > > > > Hi Ming, > > > > We found similar issue on RHEL8.5 (kernel does not have this patch in > > discussion.). Issue reproduced on 5.15 kernel as well. > > I understood this commit will fix specific race condition and avoid > > reading incorrect host_busy value. > > As per commit message - That incorrect host_busy will be just transient. > > If we read after some delay, correct host_busy count will be available. > > Right ? > > Yeah, any counter(include atomic counter) works in this way. > > But here it may be 'permanent' because one stale request pointer may stay in > one slot of ->rqs[] for long enough time if this slot isn't reused, meantime the > same request can be reallocated in case of real io scheduler. Maybe the > commit log should be improved a bit for making it explicit. Changing commit log description will help. > > > > > In my case (I am using shared host tag enabled driver), it is not race > > condition issue but stale rqs[] entries create permanent incorrect > > count of host_busy. > > Example - There are two pending IOs. This IOs are timed out. Bitmap of > > pending IO is tag#5 (actually belongs to hctx0), tag#10 (actually > > belongs to hctx1). Note - This is a shared bit map. > > If hctx0 has same address of the request at 5th and 10th index, we > > will > > It shouldn't be possible, since ->rqs[] is per-tags. If it is shared bit map, both > tag#5 and tag#10 are set, and both shared_tags->rqs[5] & shared_tags- > >rqs[10] should point to the updated requests(timed out). Updated pointers will be there for actual hctx. Below is possible and that is what causing problem in original issue. shared_tags->rqs[5] of hctx0 is having scmd = 0xAA (inflight command) shared_tags->rqs[10] of hctx0 is having scmd = 0xAA (inflight command) <- This is incorrect. While looping on hctx0 tags[], bitmap = 10 this entry is also found which is actually outstanding on hctx1. shared_tags->rqs[10] of hctx1 is having scmd = 0xBB (inflight command) Issue noticed by me is the exact same issue described @ below - https://lore.kernel.org/linux-scsi/fe5cf6c4-ce5e-4a0f-f4ab-5c10539492cb@hu awei.com/ Issue is only exposed to shared host tagset. I got the required information. Thanks. Kashyap > > > count total 2 inflight commands instead of 1 from hctx0 context + From > > hctx1 context, we will count 1 inflight command = Total is 3. > > Even though we read after some delay, host_busy will be incorrect. We > > expect host_busy = 2 but it will return 3. > > > > This patch fix my issue explained above for shared host-tag case. I > > am confused reading the commit message. You may not have intentionally > > fix the issue as I explained but indirectly it fixes my issue. Am I correct ? > > > > What was an issue reported by Luojiaxiang ? I am interested to know if > > issue reported by Luojiaxiang had shared host tagset enabled ? > > https://lore.kernel.org/linux-scsi/fe5cf6c4-ce5e-4a0f-f4ab- > 5c10539492cb@xxxxxxxxxx/ I check this. It is same issue as what I am seeing on Broadcom controller only if shared host tagset is enabled.
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature