On 12/11/21 20:51, Geert Uytterhoeven wrote:
[...]
BTW, today I just found that the hang in blk_mq_freeze_queue_wait() is
caused by commit 900e080752025f00, and the following patch can fix it:
- blk-mq: don't grab ->q_usage_counter in blk_mq_sched_bio_merge
https://lore.kernel.org/linux-block/20211111085650.GA476@xxxxxx/T/#m759b88fda094a65ebf29bc81b780967cdaf9cf28
Maybe you can try the above patch.
Thanks! I have applied both patches, but it doesn't make a difference.
Thanks for your test!
Can you try the following patch?
[...]
That's definitely a real fix, akin to the other pre-enter variants, this
one just post checks. Geert, can you give this a whirl?
With both of
    blk-mq: don't grab ->q_usage_counter in blk_mq_sched_bio_merge
    blk-mq: rename blk_attempt_bio_merge
applied, and the version above, I no longer saw the error, but the
boot sometimes hangs after:
    ext3 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
I don't know how easy that is to trigger: it hung on my first try, but
on the second and third tries it booted fully into old Debian userspace.
Ming, would you mind sending this as a real patch?
The above patch may not be enough, since submit_bio_checks() is done in
case of using a cached request, so how about the following patch (untested)?
Worked fine in five subsequent boots. Thanks!
Tested-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
For good measure: block-5.16-2021-11-13 tested fine running IO stress
tests on a mix of IDE and SCSI disks, only one of those supporting DPO/FUA.
Using
    WARN_ON(rq->cmd_flags & REQ_FUA && !sdkp->DPOFUA);
in sd.c:sd_setup_read_write_cmnd(), nothing was seen in the logs.
Cheers,
Michael
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds