On Fri, 2005-07-08 at 16:18 -0400, Eric Kerin wrote:
>
> Well, I was able to track it down, it's being caused by the throttle on
> the monitor operations for resources.
>
> Basically, any time a shared resource is referenced more than once, it
> will not get monitored for the 2nd+ time it's referenced.  This is
> because it keeps track of the last time the resource was checked at the
> resource level, and if it hasn't been more time than the amount of time
> the monitor attribute says is the interval, it doesn't run the monitor
> operation on it.
>
> So here's a patch that seems to fix it in my quick testing, but I'm not
> sure if it's the best way to fix the bug.  It copies the action list for
> the resource to the resource_node when a resource is referenced.  It
> then uses that copy of the action list when doing status checks.
>
>
> Perhaps a better way would be to make a copy of the struct for the
> shared resource_t any time it's referenced, rather than just using the
> same one for all resource_node_t.  I'm willing to write up this patch if
> you think it's a better course of action.

Both work.  For now, I think we should use this, as copying an entire
resource_t structure has the downside of complicating reconfiguration,
which is already rather ... complicated :)

Ideally, we'd just have private "last-check-time" and "last-check-level"
in the resource node structure and not put it in the resource action
structures.  This would require a little more work.

-- Lon

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster
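
For readers of the archive, here is a rough sketch of the throttle problem
described above and of the "private last-check-time" idea.  It is not the
rgmanager code and not the patch under discussion: only the type names
resource_t and resource_node_t come from the thread.  Every field name, the
action_t type, and both check functions are hypothetical, and the real
structures carry far more state (including the check level, omitted here).

```c
#include <time.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
typedef struct {
    const char *name;
    time_t interval;     /* monitor interval, in seconds */
    time_t shared_last;  /* last-check time kept in the shared action */
} action_t;

typedef struct {
    const char *name;
    action_t status;     /* one action, shared by every reference */
} resource_t;

typedef struct {
    resource_t *res;     /* may be referenced by several nodes */
    time_t last_check;   /* private, per-reference last-check time */
} resource_node_t;

/*
 * Throttle keyed on the shared action.  Once any reference has been
 * checked, every other reference to the same resource is skipped until
 * the interval expires.  Returns 1 if a check would run, 0 if throttled.
 */
int check_shared(resource_node_t *node, time_t now)
{
    action_t *act = &node->res->status;

    if (now - act->shared_last < act->interval)
        return 0;
    act->shared_last = now;
    return 1;
}

/*
 * Throttle keyed on the node.  Each reference keeps its own timestamp,
 * so a second reference to a shared resource is still checked.
 */
int check_private(resource_node_t *node, time_t now)
{
    if (now - node->last_check < node->res->status.interval)
        return 0;
    node->last_check = now;
    return 1;
}
```

With two nodes pointing at the same resource, a 30 second interval and all
timestamps starting at 0, check_shared() at time 1000 runs for the first
node and is throttled for the second, while check_private() runs for both.
The resource_t itself is never copied in this sketch, which is the property
that keeps reconfiguration simpler.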