On 2021-02-05 21:51, pragalla@xxxxxxxxxxxxxx wrote:
On 2021-02-05 21:37, John Garry wrote:
- bouncing jianchao.w.wang@xxxxxxxxxx
Some time ago you replied the following to an email from me with a
suggestion for a fix: "Please let me consider it a bit more." Are you
still working on a fix?
Unfortunately I have not had a chance, sorry. But I can look again.
So I have only seen KASAN use-after-frees myself, but never an actual
oops. IIRC, someone did report an oops.
Hi John,
@Pradeep, do you have a reliable re-creator? I noticed the timeout
handler stackframe in your mail, so I guess not. However, as an
experiment, could you test:
https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@xxxxxxxxxx/
Yes, I don't have a reliable re-creator. The oops was noticed as part
of stability testing and was not triggered intentionally. It was seen
a couple of times.
Please share the steps (if any) to hit this easily or to exercise this
path more frequently.
Meanwhile, I will go with the usual stability procedure. I will
update the results here later.
Hi John,
We ran the stability test with the above patch
(https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@xxxxxxxxxx/)
while switching the I/O schedulers in between, for ~88 hours on 2
devices, and we didn't notice any crash or issue.
Do you have a full kernel log for your crash?
Yes. Attaching the full kernel dmesg log.
So there are different flavors of this issue, and you reported a crash
from blk_mq_queue_tag_busy_iter().
If you check:
https://lore.kernel.org/linux-block/76190c94-c5c1-9553-5509-9969fc323544@xxxxxxxxxx/
You can see how I artificially trigger an issue in
blk_mq_queue_tag_busy_iter().
Sure, I will go through the recreation steps. Thanks.
This should fix the common issue. But there is no final solution yet to
the issues discussed under patch 2/2, which are more exotic.
BTW, is this the same Pradeep who reported:
https://lore.kernel.org/linux-block/1606402925-24420-1-git-send-email-ppvk@xxxxxxxxxxxxxx/
Thanks,
John
Thanks and Regards,
Pradeep