Hi Boris,
Yap, I think we're in agreement here. I believe the important question
is whether you need to get error information from multiple sources
together in order to do proper recovery, or whether doing it per error
source suffices.
And I think the actual use cases could/should dictate our
drivers/orchestrators design.
Thus my question: how are you guys planning on tying all that error info
the drivers report into the whole system design?
We have a daemon script that collects correctable/uncorrectable errors
from EDAC sysfs and reports them to an Amazon service, which allows us to
take action on specific error thresholds.
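
For illustration only, a minimal sketch of that kind of collector might
look like the following. It is not our actual script: the function names
and threshold values are hypothetical, the reporting step is left out,
and it assumes the usual per-controller ce_count/ue_count files under
/sys/devices/system/edac/mc.

```python
import glob
import os

# Standard EDAC memory-controller sysfs root; one mcN directory per controller.
EDAC_MC_ROOT = "/sys/devices/system/edac/mc"


def read_edac_counts(root=EDAC_MC_ROOT):
    """Return {"mcN": {"ce": int, "ue": int}} for every controller under root."""
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(root, "mc[0-9]*"))):
        entry = {}
        for key, fname in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                with open(os.path.join(mc_dir, fname)) as f:
                    entry[key] = int(f.read().strip())
            except (OSError, ValueError):
                # Missing or unreadable counter: treat as zero in this sketch.
                entry[key] = 0
        counts[os.path.basename(mc_dir)] = entry
    return counts


def over_threshold(counts, ce_limit=100, ue_limit=1):
    """Return controllers whose counters reached the (hypothetical) limits."""
    return sorted(
        mc for mc, c in counts.items()
        if c["ce"] >= ce_limit or c["ue"] >= ue_limit
    )


if __name__ == "__main__":
    current = read_edac_counts()
    for mc in over_threshold(current):
        # A real daemon would report to the monitoring service here.
        print(mc, current[mc])
```

Note that this handles each error source on its own, which is the
per-source case discussed above.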
Thanks,
Hanna