But since it hit all of your machines, and at about the same time, I
tend to think that someone did something to these machines that caused
this issue, and it's not a 7.4.x problem.
I'm sure it is pilot error, and we're still trying to figure out exactly which pilot and what error.
Did you update / upgrade kernels, device drivers, hardware, etc.?
What is common to all these systems besides PostgreSQL? Was
there a power outage? Did all the machines have the same admin, who one
day had a brain cramp and did something stupid?
This occurred as part of an upgrade -- new OS, kernel, drivers.
Simply put, we need more info on how this happened.
We've recovered. There is root cause analysis going on. The question is whether I can use an argument about 8.0 vs. 7.4 reliability from this fiasco to help us get to 8.0.
I assume 8.0 actually is more reliable than 7.4?
Morris