Thanks for the suggestion.
It seems reasonable; unfortunately, on an operational system
it means a lot of downtime,
but we end up there anyway.
Thanks
-Sev
Martial Herbaut wrote:
But we didn't actually lose power on the RAID or the hosts,
just on the connecting switches, so we lost all communication.
Presumably, in this situation the controller cache should have been
emptied. Is my reasoning correct here?
Correct. If your RAID has write-back (w/b) cache enabled but is battery
backed, you should be OK.
Beyond this, I'm not sure what else you can look at.
I don't mean to barge in, but I have seen similar corruption happen in
the past when the fabric went away momentarily (for example, unplugging and
replugging a fibre cable on a non-dualpath/failover setup) while the host
was not killed or rebooted. From memory, the corruption was not immediately
apparent and only became so later.
I think the best thing to do in that scenario is to force a reboot of
the host and then force an fsck, as opposed to continuing on and hoping for
the best.
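
A minimal sketch of that sequence, for illustration only. It only builds
and prints the commands; it does not run them. The device name /dev/sdb1
and the helper forced_fsck_plan are hypothetical, and the /forcefsck flag
file assumes a host whose sysvinit-style boot scripts honour it; other
setups may need a different mechanism.

```python
def forced_fsck_plan(device, can_unmount):
    """Return the commands for a forced check as argument lists.

    Nothing is executed here; the caller decides whether to run them.
    """
    if can_unmount:
        # Offline check: e2fsck -f forces a full check even when the
        # filesystem is marked clean.
        return [["umount", device], ["e2fsck", "-f", device]]
    # Otherwise request a check at the next boot and reboot.
    # Assumption: the boot scripts look for /forcefsck.
    return [["touch", "/forcefsck"], ["shutdown", "-r", "now"]]


# Hypothetical device name, used only for the printout.
for cmd in forced_fsck_plan("/dev/sdb1", can_unmount=False):
    print(" ".join(cmd))
```

Either path costs downtime, but it surfaces any damage right away
instead of letting it show up later.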
Martial Herbaut
---------------
Server101.com
_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
--
Sev Binello
Brookhaven National Laboratory
Upton, New York
631-344-5647
sev@xxxxxxx