On Wed, Aug 17, 2022, at 10:53 AM, Ming Lei wrote:
> On Wed, Aug 17, 2022 at 10:34:38AM -0400, Chris Murphy wrote:
>>
>>
>> On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote:
>>
>> > blk-mq debugfs log is usually helpful for io stall issue, care to post
>> > the blk-mq debugfs log:
>> >
>> > (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;)
>>
>> This is only sda
>> https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing
>
> From the log, there isn't any in-flight IO request.
>
> So please confirm that it is collected after the IO stall is triggered.

Yes, iotop reports no reads or writes at the time of collection. IO pressure 99% for auditd, systemd-journald, rsyslogd, and postgresql, with increasing pressure from all the qemu processes.

Keep in mind this is a raid10, so maybe it's enough for just one block device IO to stall and the whole thing stops? That's why I included all block devices.

> If yes, the issue may not be related with BFQ, and should be related
> with blk-cgroup code.

Problem happens with cgroup.disable=io, does this setting affect blk-cgroup?

-- 
Chris Murphy