On 8/24/23 00:29, Jaco Kroon wrote:
We're definitely seeing the same thing on another host using an ahci
controller. This seems to hint that it's not a firmware issue, as does
the fact that this happens much less frequently with the none scheduler.
That is unexpected. I don't think there is enough data available yet to
conclude whether these issues are identical or not?
I will make a plan to action the firmware updates on the raid controller
over the weekend regardless, just in order to eliminate that. I will
then revert to mq-deadline. Assuming this does NOT fix it, how would I
go about assessing if this is a controller firmware issue or a Linux
kernel issue?
If the root cause would be an issue in the mq-deadline scheduler or in
the core block layer then there would be many more reports about I/O
lockups. For this case I think that it's very likely that the root cause
is either the I/O controller driver or the I/O controller firmware.
Thanks,
Bart.