On 02/09/09 11:33, Corey Kovacs wrote:
A colleague has a 5-node cluster with 4GB of RAM in each node. That's not enough for the cluster, and more RAM is on the way. The problem is that until the RAM arrives, there is a risk of the oom-killer firing (as he found out the other day) and leaving a node in a state where it is utterly useless but still looks healthy to the cluster. We could of course disable the oom-killer, but that's a workaround, not a fix. I am wondering whether the cluster can respond to the oom-killer firing by fencing the offending node and, if so, how others have done it. It seems like this should just be handled by the cluster, though. Maybe have cman put a message across the openais "bus" like, "Hey, losing my brain here, someone whack me"...
I suppose you could give cman a large value in /proc/<pid>/oom_adj (/proc/<pid>/oom_score itself is read-only) so that it is the first thing to be killed if the system runs out of memory. Once cman is gone the node drops out of the cluster membership, so it should be fenced by the other nodes ... provided they have enough memory to remain quorate!
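As a rough sketch of that idea (not something cman ships with), the snippet below writes the maximum adjustment for a given PID. The function name make_preferred_oom_victim and the proc_root parameter are made up for illustration. It assumes the kernel exposes either /proc/<pid>/oom_score_adj (later kernels, maximum 1000) or the older /proc/<pid>/oom_adj (maximum 15), and that you already know the PID of the daemon you want sacrificed first.

```python
import os


def make_preferred_oom_victim(pid, proc_root="/proc"):
    """Mark `pid` as the OOM killer's preferred victim.

    Returns the path of the file that was written. `proc_root` is only
    a parameter so the function can be tried against a scratch directory.
    """
    # (file name, maximum value): newer interface first, legacy one second.
    candidates = [("oom_score_adj", "1000"), ("oom_adj", "15")]
    for name, value in candidates:
        path = os.path.join(proc_root, str(pid), name)
        if os.path.exists(path):
            with open(path, "w") as f:
                f.write(value)
            return path
    raise FileNotFoundError("no OOM adjustment file found for pid %d" % pid)
```

You would call it as make_preferred_oom_victim(pid) with the cluster daemon's PID, typically as root, and again after every restart of the daemon since the setting belongs to the process. Bear in mind that this only biases the oom-killer's choice; it does not help if the node is thrashing without the oom-killer ever firing.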
Chrissie