Re: [Gluster-devel] Query on healing process

On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:

Ok, just to confirm: glusterd and the other brick processes are running after this node rebooted?
When you run the above command, you need to check /var/log/glusterfs/glfsheal-volname.log for errors. Setting client-log-level to DEBUG would give you more verbose messages.

Yes, glusterd and the other brick processes are running fine. I have checked the /var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG. Here are the logs from that file:

[2016-03-02 13:51:39.059440] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012] [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081] [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish] 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)

Not sure why ^^ occurs. You could try flushing iptables (iptables -F), restarting glusterd, and running the heal info command again.
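The suggested steps could be scripted roughly as follows (a sketch, to be run as root on the affected node; the `run` wrapper and `DRY_RUN` switch are additions for safety, not gluster tooling, and the service manager is assumed to be systemd):

```shell
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run iptables -F                            # flush firewall rules that may block port 24007
run systemctl restart glusterd             # restart the management daemon
run gluster volume heal c_glusterfs info   # retry the heal query
```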

No hint from the logs? I'll try your suggestion.

[2016-03-02 13:51:39.072663] E [MSGID: 104024] [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected) [Transport endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025] [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile servers [Transport endpoint is not connected]
# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.



And based on your observation I understand that this is not a split-brain problem. But is there any way to find the files that are not in split-brain yet are also not in sync?

`gluster volume heal c_glusterfs info split-brain`  should give you files that need heal.

Sorry, I meant 'gluster volume heal c_glusterfs info' should give you the files that need heal, and 'gluster volume heal c_glusterfs info split-brain' the list of files in split-brain.
The commands are detailed in https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md

Yes, I have tried this as well. It also gives "Number of entries: 0", meaning no healing is required, but the file /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is not in sync; the two bricks show different versions of this file.

You can see it in the getfattr command output as well.


# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000 // client-8 is the latest client in our case; the first 8 hex digits (00000006) indicate pending changelog entries for data
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000 // and here we can say that there is no split-brain, but the file is out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
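For reference, each trusted.afr.* value above packs three big-endian 32-bit counters of pending operations: data, metadata, and entry. A quick way to read the hex values printed by getfattr (a sketch; `decode_afr_xattr` is a hypothetical helper, not part of GlusterFS):

```python
def decode_afr_xattr(hex_value: str) -> dict:
    """Decode a 12-byte AFR changelog xattr into its three pending counters."""
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    assert len(raw) == 12, "AFR changelog xattr should be 12 bytes"
    return {
        "data": int.from_bytes(raw[0:4], "big"),      # pending data operations
        "metadata": int.from_bytes(raw[4:8], "big"),  # pending metadata operations
        "entry": int.from_bytes(raw[8:12], "big"),    # pending entry operations
    }

# The value shown for trusted.afr.c_glusterfs-client-8 above:
print(decode_afr_xattr("0x000000060000000000000000"))
# → {'data': 6, 'metadata': 0, 'entry': 0}
```

A non-zero data counter against a client, with all counters zero on the other brick, is consistent with a pending (not split-brain) heal on that file.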

 
Regards,
   Abhishek

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
