root@chastcvtprd04:~# gluster peer status
Number of Peers: 1
Hostname: chglbcvtprd04
Uuid: bb12d7e7-ded5-4d32-b294-8f5011f70afb
State: Peer in Cluster (Connected)
Other names:
chglbcvtprd04.fpprod.corp
root@chastcvtprd04:~# glusterd statedump
root@chastcvtprd04:~# cat /var/log/glusterfs/statedump.log
[2016-10-20 14:57:04.636547] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterd: Started running glusterd version 3.8.5 (args: glusterd statedump)
[2016-10-20 14:57:04.636599] E [MSGID: 100007] [glusterfsd.c:578:create_fuse_mount] 0-glusterfsd: Not a client process, not performing mount operation
root@chastcvtprd04:~#
root@chglbcvtprd04:~# gluster peer status
Number of Peers: 1
Hostname: chastcvtprd04.fpprod.corp
Uuid: 82aef154-8444-46bb-9fd5-d7eaf4f0a6bc
State: Peer in Cluster (Connected)
Other names:
chastcvtprd04
root@chglbcvtprd04:~# glusterd statedump
root@chglbcvtprd04:~# cat /var/log/glusterfs/statedump.log
[2016-10-20 14:59:21.047747] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterd: Started running glusterd version 3.8.5 (args: glusterd statedump)
[2016-10-20 14:59:21.047785] E [MSGID: 100007] [glusterfsd.c:578:create_fuse_mount] 0-glusterfsd: Not a client process, not performing mount operation
root@chglbcvtprd04:~#
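From the statedump.log output above it looks like running "glusterd statedump" only started a second glusterd process (note the "args: glusterd statedump" in the log) rather than producing a dump. A statedump of the already-running glusterd is usually triggered by sending it SIGUSR1; the dump file is then written under /var/run/gluster by default (path assumed from stock packaging). A minimal sketch:
# kill -SIGUSR1 $(pidof glusterd)
# ls -lt /var/run/gluster/ | head
The newest glusterdump.<pid>.dump.* file there should be the glusterd statedump to share.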
I had a chance to look at the logs from both nodes. I could see the following log message repeated (on both nodes):
"Lock for Oracle_Legal_04 held by bb12d7e7-ded5-4d32-b294-8f5011f70afb"
This means the node with that UUID is the one holding the lock. However, from the logs you shared I cannot tell which node has this UUID. Could you please share the gluster peer status output? If you find the node that has the UUID mentioned above, could you take a glusterd statedump on it and share it with us?

On Tue, Oct 18, 2016 at 2:11 PM, Bernhard Duebi <bernhard@xxxxxx> wrote:

Hello,
I'm running gluster 3.8.5 on Ubuntu 16.04. I have 2 nodes which mirror
each other. There are 32 volumes and all have the same configuration:
Type: Replicate
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node01:/data/glusterfs/vol/disk/brick
Brick2: node02:/data/glusterfs/vol/disk/brick
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.readdir-ahead: on
nfs.disable: on
auth.allow: 127.0.0.1,10.11.12.21,10.11.12.22
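For reference, the "Options Reconfigured" values above are per-volume settings of the kind applied with gluster volume set; a hedged sketch using $vol as a placeholder volume name:
# gluster volume set $vol diagnostics.latency-measurement on
# gluster volume set $vol diagnostics.count-fop-hits on
# gluster volume set $vol performance.readdir-ahead on
# gluster volume set $vol nfs.disable on
# gluster volume set $vol auth.allow 127.0.0.1,10.11.12.21,10.11.12.22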
Nagios runs the following every 5 minutes for each volume:
# gluster volume heal $vol info
# gluster volume status $vol detail
Diamond runs every minute:
# gluster volume list
and then, for every volume:
# gluster volume profile $vol info cumulative --xml
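Put together, the per-minute collection amounts to something like the loop below (a hedged reconstruction, not the actual Diamond collector). With 32 volumes, these CLI calls can easily overlap with each other and with the 5-minute Nagios checks, which is relevant to the locking errors described next:
for vol in $(gluster volume list); do
    gluster volume profile "$vol" info cumulative --xml
done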
This was running fine with Gluster 3.7, but since I upgraded to 3.8.5 I see a lot of problems with locking. After a reboot of both machines everything is fine, but after a while gluster volume status gives me the following error:
Another transaction is in progress for $vol. Please try again after sometime
The problem is that the system never recovers; only rebooting the machines helps. OK, a restart of gluster would probably do too.
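Restarting only the management daemon should indeed be enough to clear a stale volume lock without a full reboot; on Ubuntu 16.04 with systemd the unit name depends on the packaging (assumed below), so one of the following should apply:
# systemctl restart glusterfs-server
# systemctl restart glusterd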
I attached the logfiles from both glusterd. Let me know if you need
more information.
Thanks
Bernhard
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users