Sorry, I didn't think to look in the log file; now I can see I have bigger
problems. The last time I saw this it was because I had changed an IP address,
but this time all I did was reboot the server. I've checked all the files under
vols and everything looks good.

[2014-03-18 08:09:18.117040] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0
[2014-03-18 08:09:18.117074] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1
[2014-03-18 08:09:18.117087] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-2
[2014-03-18 08:09:18.117097] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-3
[2014-03-18 08:09:18.117107] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-4
[2014-03-18 08:09:18.117117] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-5
[2014-03-18 08:09:18.117128] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-6
[2014-03-18 08:09:18.117138] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-7
[2014-03-18 08:09:18.117148] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-8
[2014-03-18 08:09:18.117158] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-9
[2014-03-18 08:09:18.117168] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-10
[2014-03-18 08:09:18.117178] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-11
[2014-03-18 08:09:18.117196] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-12
[2014-03-18 08:09:18.117209] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-13
[2014-03-18 08:09:18.117219] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-14
[2014-03-18 08:09:18.117229] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-15

This is from another server:

[root@nas1 bricks]# gluster vol status
Status of volume: data
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick nas1-10g:/data1/gvol                      49152   Y       17331
Brick nas2-10g:/data5/gvol                      49160   Y       3933
Brick nas1-10g:/data2/gvol                      49153   Y       17340
Brick nas2-10g:/data6/gvol                      49161   Y       3942
Brick nas1-10g:/data3/gvol                      49154   Y       17350
Brick nas2-10g:/data7/gvol                      49162   Y       3951
Brick nas1-10g:/data4/gvol                      49155   Y       17360
Brick nas2-10g:/data8/gvol                      49163   Y       3960
Brick nas3-10g:/data9/gvol                      49156   Y       10076
Brick nas3-10g:/data10/gvol                     49157   Y       10085
Brick nas3-10g:/data11/gvol                     49158   Y       10094
Brick nas3-10g:/data12/gvol                     49159   Y       10108
Brick nas4-10g:/data13/gvol                     N/A     N       8879
Brick nas4-10g:/data14/gvol                     N/A     N       8884
Brick nas4-10g:/data15/gvol                     N/A     N       8888
Brick nas4-10g:/data16/gvol                     N/A     N       8892
NFS Server on localhost                         2049    Y       18725
NFS Server on nas3-10g                          2049    Y       11667
NFS Server on nas2-10g                          2049    Y       4980
NFS Server on nas4-10g                          N/A     N       N/A

There are no active volume tasks

Any ideas?

On Tue, 2014-03-18 at 12:39 +0530, Kaushal M wrote:
> The lock is an in-memory structure which isn't persisted. Restarting
> should reset the lock. You could possibly reset the lock by attaching
> gdb to the glusterd process.
>
> Since this is happening to you consistently, there is something else
> that is wrong. Could you please give more details on your cluster, and
> the glusterd logs of the misbehaving peer (if possible, of all the
> peers)? That would help in tracking it down.
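To add a bit more detail on the cluster: it's the 16-brick volume "data" shown
above, spread across four servers (nas1 to nas4), and only nas4 misbehaves. For
what it's worth, this is roughly the check I've been running on each peer when
I say the files in vols look good. It's only a rough sanity check, and it
assumes the default /var/lib/glusterd location, which is where my vols
directory lives:

# run on every peer; the checksum and the brick count should match everywhere
md5sum /var/lib/glusterd/vols/data/info
grep -c '^brick-' /var/lib/glusterd/vols/data/info
ls /var/lib/glusterd/vols/data/bricks/

And when I restarted the daemons earlier I only bounced the management daemon,
so the brick processes were left alone:

service glusterd restart   # sysvinit-style here; systemctl restart glusterd on systemd boxes
gluster peer status        # make sure the peer rejoined
gluster vol status data    # still only fails when run on nas4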
>
> On Tue, Mar 18, 2014 at 12:24 PM, Franco Broi <franco.broi@xxxxxxxxxx> wrote:
> >
> > Restarted the glusterd daemons on all 4 servers, still the same.
> >
> > It only and always fails on the same server, and it always works on the
> > other servers.
> >
> > I had to reboot the server in question this morning; perhaps it's got
> > itself into a funny state.
> >
> > Is the lock something that can be examined? And removed?
> >
> > On Tue, 2014-03-18 at 12:08 +0530, Kaushal M wrote:
> >> This mostly occurs when you run two gluster commands simultaneously.
> >> Gluster uses a lock on each peer to synchronize commands. Any command
> >> which needs to operate on multiple peers first acquires this lock and
> >> releases it after the operation is done. If a command cannot acquire
> >> the lock because another command holds it, it fails with the above
> >> error message.
> >>
> >> It sometimes happens that a command fails to release the lock on some
> >> peers. When this happens, all further commands which need the lock
> >> will fail with the same error. In that case your only option is to
> >> restart glusterd on the peers which hold the stale lock. This will not
> >> cause any downtime, as the brick processes are not affected by
> >> restarting glusterd.
> >>
> >> In your case, since you can run commands on the other nodes, most
> >> likely you are running commands simultaneously, or at least running a
> >> command before an old one finishes.
> >>
> >> ~kaushal
> >>
> >> On Tue, Mar 18, 2014 at 11:24 AM, Franco Broi <franco.broi@xxxxxxxxxx> wrote:
> >> >
> >> > What causes this error? And how do I get rid of it?
> >> >
> >> > [root@nas4 ~]# gluster vol status
> >> > Another transaction could be in progress. Please try again after sometime.
> >> >
> >> > Looks normal on any other server.
> >> >
> >> > _______________________________________________
> >> > Gluster-users mailing list
> >> > Gluster-users@xxxxxxxxxxx
> >> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users