On 09/07/2015 06:12 AM, Emmanuel Dreyfus wrote:
> I wrote a simple nagios plugin in C that calls gluster volume status to
> check that all bricks are online (is it of any interest to anyone other
> than me? What name would you expect for it? Does check_gfbricks look
> sane?)
>
> The thing periodically reported offline bricks and I did not understand
> why, until I realized that the peers all run the test at the same time,
> and hence may fail to lock the volume because another peer already holds
> the lock.
>
> It seems that a failed lock acquisition is reported as offline bricks
> for the peer. The simple workaround is to not check at the same time,
> but perhaps the reported data could be improved?

GlusterD doesn't report the status of bricks if it finds that another
transaction on the same volume is in progress on the cluster. You can
verify this by running concurrent volume status commands from different
peers. I would suggest you check the plugin and see why nagios is not
handling the negative case here.

~Atin
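
For what it's worth, the negative case could be handled along the lines of the rough sketch below. It is written in Python rather than C for brevity and is not the actual check_gfbricks code. It assumes that gluster volume status exits non-zero when it cannot take the volume lock, which should be verified on your version. The names classify, check_with_retry and run_status are hypothetical; run_status stands for whatever runs the command and counts offline bricks. The only firm part is the standard Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import random
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(exit_status, offline_bricks):
    """Map one `gluster volume status` run to a Nagios state.

    exit_status: exit status of the gluster command.
    offline_bricks: bricks parsed as offline (ignored if the command failed).
    """
    if exit_status != 0:
        # Assumption: a failed lock acquisition makes the command fail.
        # In that case brick state is unknown, not offline.
        return UNKNOWN
    return CRITICAL if offline_bricks > 0 else OK


def check_with_retry(run_status, attempts=3, max_delay=5.0, sleep=time.sleep):
    """Call run_status() up to `attempts` times, retrying only on UNKNOWN.

    run_status is a hypothetical callable returning
    (exit_status, offline_bricks). A random delay between tries makes it
    less likely that peers collide on the volume lock again.
    """
    state = UNKNOWN
    for attempt in range(attempts):
        exit_status, offline = run_status()
        state = classify(exit_status, offline)
        if state != UNKNOWN:
            break
        if attempt < attempts - 1:
            sleep(random.uniform(0, max_delay))
    return state
```

With this split, a peer that loses the race for the lock reports UNKNOWN (or succeeds on a later try) instead of raising a false alarm about offline bricks.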