Hi.

On 06.12.2019 17:17, Paolo Valente wrote:
> Simone (in CC) and I have worked a little bit on reproducing the I/O freeze you report. Simone made a small change in SCSI_debug, which makes the latter serve I/O with a highly varying random delay (100ms - 1s), about twice a second. Then, to generate some fluctuating and heavy I/O, he ran the comm_startup_lat.sh script of my S suite with SCSI_debug a few times. Unfortunately, he didn't succeed in reproducing the problem. If you want, we can send you a patch with his change for SCSI_debug. Any news on your side?
FWIW, I guess it's safe to rule out BFQ for the moment, since I've encountered a very similar issue without BFQ enabled.
Also, I think this might not be related to the block layer at all. I suspect there's some race between MADV_MERGEABLE and MADV_DONTNEED, since that is what's hammering the affected tasks and it is what I see in the call traces.
I'll investigate further and probably talk to MM people instead. Sorry for the noise.
-- Oleksandr Natalenko (post-factum)