On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> Thanks a lot Atin,
>
> The problem is that we are using a forked version of 3.6.1 which has
> been modified to work with ZFS (for snapshots), but we do not have the
> resources to port that work over to the later versions of gluster.
>
> Would you know of anyone who would be willing to take this on?

If you can cherry-pick the patches, apply them to your source, and
rebuild, I can point you to the patches, but you'd need to give me a
day's time as I have some other items on my plate to finish.

~Atin

> Regards,
> -Ram
>
> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
>
> > On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> > >
> > > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
> > >
> > > > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > > > > Hi,
> > > > >
> > > > > We're using gluster 3.6.1, and we periodically find that
> > > > > gluster commands fail, saying that the lock could not be
> > > > > acquired on one of the brick machines. The logs on that
> > > > > machine then say something like:
> > > > >
> > > > > [2016-06-15 08:17:03.076119] E
> > > > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management:
> > > > > Unable to acquire lock for vol2
> > > >
> > > > This is a possible case if concurrent volume operations are run.
> > > > Do you have any script which checks for volume status on an
> > > > interval from all the nodes? If so, then this is expected
> > > > behavior.
> > >
> > > Yes, I do have a couple of scripts that check on volume and quota
> > > status. Given this, I do get an "Another transaction is in
> > > progress..." message, which is ok. The problem is that sometimes I
> > > get the volume-lock-held message, which never goes away. This
> > > sometimes results in glusterd consuming a lot of memory and CPU,
> > > and the problem can only be fixed with a reboot.
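[Editor's note: every gluster CLI command above takes a cluster-wide transaction lock, so interval checks fired from cron can race one another. One way to keep the monitoring scripts on a node from overlapping is to serialize them behind a local file lock; a minimal sketch, where the lock path and the run_locked helper are illustrative names, not a gluster convention:]

```shell
#!/bin/sh
# Serialize periodic gluster health checks behind a local file lock so
# that cron jobs on overlapping schedules never run gluster commands
# concurrently on this node. LOCKFILE and run_locked are illustrative.
LOCKFILE="${LOCKFILE:-/tmp/gluster-mon.lock}"

run_locked() {
    # Block for up to 60 seconds waiting for any other check to finish,
    # then run the given command while holding the lock.
    flock -w 60 "$LOCKFILE" "$@"
}

# In the monitoring scripts this would wrap the actual checks, e.g.:
#   run_locked gluster volume status
#   run_locked gluster volume quota vol2 list
run_locked echo "lock acquired"
```

[Note that this only serializes commands issued from a single node; checks running "from all the nodes", as described above, would additionally need staggered schedules to avoid cross-node contention.]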
> > > The log files are huge, so I'm not sure if it's ok to attach them
> > > to an email.
> >
> > Ok, so this is known. We have fixed lots of stale lock issues in the
> > 3.7 branch, and some of them, if not all, were also backported to
> > the 3.6 branch. The issue is that you are using 3.6.1, which is
> > quite old. If you can upgrade to the latest version of 3.7, or at
> > worst of 3.6, I am confident that this will go away.
> >
> > ~Atin
> >
> > > > > After some time, glusterd then seems to give up and die..
> > > >
> > > > Do you mean glusterd shuts down or segfaults? If so, I am more
> > > > interested in analyzing this part. Could you provide us the
> > > > glusterd log and cmd_history log file, along with the core (in
> > > > case of SEGV), from all the nodes for further analysis?
> > >
> > > There is no segfault. glusterd just shuts down. As I said above,
> > > sometimes this happens, and sometimes it just continues to hog a
> > > lot of memory and CPU.
> > >
> > > > > Interestingly, I also find the following line in the beginning
> > > > > of etc-glusterfs-glusterd.vol.log, and I don't know if this
> > > > > has any significance to the issue:
> > > > >
> > > > > [2016-06-14 06:48:57.282290] I
> > > > > [glusterd-store.c:2063:glusterd_restore_op_version]
> > > > > 0-management: Detected new install. Setting op-version to
> > > > > maximum : 30600
> > >
> > > What does this line signify?

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
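[Editor's note on the final, unanswered question: the "Detected new install" message is logged when glusterd cannot find a stored operating version in its working directory at startup (the operating-version line in glusterd.info), so it treats the node as a fresh install and raises the op-version to the daemon's maximum, 30600 on 3.6.x. A sketch of checking what a node has stored; the stored_op_version helper is illustrative, not part of gluster, and /var/lib/glusterd is the usual default working directory:]

```shell
#!/bin/sh
# glusterd persists its cluster op-version in glusterd.info inside its
# working directory; when no operating-version entry exists at startup,
# it logs "Detected new install" and sets op-version to its maximum.
# stored_op_version is an illustrative helper, not part of gluster.
stored_op_version() {
    dir="${1:-/var/lib/glusterd}"
    if grep -q '^operating-version=' "$dir/glusterd.info" 2>/dev/null; then
        # Print just the stored numeric op-version.
        sed -n 's/^operating-version=//p' "$dir/glusterd.info"
    else
        echo "none stored"
    fi
}

stored_op_version
```

[Seeing that log line on a node that is not actually new would suggest the glusterd store was missing or recreated at that startup, though that is only an inference from the message, not something confirmed in this thread.]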