Hi,
On 2023/07/11 16:45, Bart Van Assche wrote:
> On 7/11/23 06:22, Jaco Kroon wrote:
>> So *suspected* mq-deadline bug?
>
> That seems unlikely to me. I have not yet seen mq-deadline causing an
> I/O lockup. I'm not claiming that it would be impossible that there is a
> bug in mq-deadline but it seems unlikely to me. However, I have seen it
> many times that an I/O lockup was caused by a buggy HBA driver and/or
> HBA firmware so I recommend to start with checking these thoroughly.

Care to share how or at least point me in a direction please?

>> Supermicro has responded with appropriate upgrade options which
>> we've not executed yet, but I'll make time for that over the coming
>> weekend, or perhaps I should wait a bit longer to give more time for
>> a similar lockup with the none scheduler?
>
> If this is possible, verifying whether the lockup can be reproduced
> without I/O scheduler sounds like a good idea to me.
If I had a reliable way to trigger this it would help a great deal.
Currently we're at 14 days uptime with scheduler set to none rather than
mq-deadline, previously we had to hard reboot approximately every 7 days
... again, how long is a piece of string? A lockup proves the presence
of an issue, but how long must it not lock up to prove the absence of one?
Ideas/Suggestions?
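To put a rough number on the "piece of string" question, here is a minimal
sketch in Python. It rests on an assumption that is not established anywhere
in this thread: that lockups arrive at random with a constant rate, so the
time between them is exponentially distributed with a mean of about 7 days.
The function names are made up for illustration only. Under that assumption,
14 clean days would still happen by luck roughly 13.5% of the time with the
bug present, and about 21 clean days are needed to bring that below 5%.

```python
import math


def survival_probability(mean_days_between_lockups: float, uptime_days: float) -> float:
    """Chance of seeing no lockup for uptime_days if the bug is still present.

    ASSUMPTION: lockups follow an exponential (memoryless) model with the
    given mean time between lockups. Real lockups may be workload-dependent.
    """
    return math.exp(-uptime_days / mean_days_between_lockups)


def days_for_confidence(mean_days_between_lockups: float, confidence: float) -> float:
    """Lockup-free uptime after which a still-present bug would have shown
    itself with the given probability (for example 0.95), same assumption."""
    return -mean_days_between_lockups * math.log(1.0 - confidence)


if __name__ == "__main__":
    mean_days = 7.0  # "approximately every 7 days" with mq-deadline
    for days in (14, 21, 28):
        p = survival_probability(mean_days, days)
        print(f"{days} clean days: {p:.3f} chance this is just luck")
    needed = days_for_confidence(mean_days, 0.95)
    print(f"clean days needed for 95% confidence: {needed:.1f}")
```

This never proves absence, it only shrinks the odds that the current uptime
is a coincidence, and the estimate is only as good as the 7 day figure and
the constant-rate assumption behind it.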
Kind regards,
Jaco