Re: Volume heal info not reporting files in split brain and core dumping, after upgrading to 3.7.0

Hi,
in case you need it, here is the detailed backtrace too.
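For reference, it was captured roughly like this (using the core file mentioned in my previous mail; the gdb logging commands are just one way to dump all threads to a file):

gdb /usr/sbin/glfsheal core.19430
(gdb) set logging file backtrace.txt
(gdb) set logging on
(gdb) thread apply all bt full
(gdb) set logging off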
Cheers,

Alessandro

Attachment: backtrace.txt.gz
Description: GNU Zip compressed data



On 29 May 2015, at 11:46, Alessandro De Salvo <Alessandro.DeSalvo@xxxxxxxxxxxxx> wrote:

Hi Pranith,
I’m definitely sure the log is correct, but you are also right that there is no sign of a crash (even checking with grep!).
However, I do see core dumps (e.g. core.19430) in /var/log/gluster, created every time I issue the heal info command.
From gdb I see this:


GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Reading symbols from /usr/sbin/glfsheal...Reading symbols from /usr/lib/debug/usr/sbin/glfsheal.debug...done.
done.
[New LWP 19430]
[New LWP 19431]
[New LWP 19434]
[New LWP 19436]
[New LWP 19433]
[New LWP 19437]
[New LWP 19432]
[New LWP 19435]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glfsheal adsnet-vm-01'.
Program terminated with signal 11, Segmentation fault.
#0  inode_unref (inode=0x7f7a1e27806c) at inode.c:499
499             table = inode->table;
(gdb) bt
#0  inode_unref (inode=0x7f7a1e27806c) at inode.c:499
#1  0x00007f7a265e8a61 in fini (this=<optimized out>) at qemu-block.c:1092
#2  0x00007f7a39a53791 in xlator_fini_rec (xl=0x7f7a2000b9a0) at xlator.c:463
#3  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000d450) at xlator.c:453
#4  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000e800) at xlator.c:453
#5  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a2000fbb0) at xlator.c:453
#6  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20010f80) at xlator.c:453
#7  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20012330) at xlator.c:453
#8  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a200136e0) at xlator.c:453
#9  0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20014b30) at xlator.c:453
#10 0x00007f7a39a53725 in xlator_fini_rec (xl=0x7f7a20015fc0) at xlator.c:453
#11 0x00007f7a39a54eea in xlator_tree_fini (xl=<optimized out>) at xlator.c:545
#12 0x00007f7a39a90b25 in glusterfs_graph_deactivate (graph=<optimized out>) at graph.c:340
#13 0x00007f7a38d50e3c in pub_glfs_fini (fs=fs@entry=0x7f7a3a6b6010) at glfs.c:1155
#14 0x00007f7a39f18ed4 in main (argc=<optimized out>, argv=<optimized out>) at glfs-heal.c:821
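
From the trace, the crash seems to happen while pub_glfs_fini tears down the qemu-block translator (frame #1, qemu-block.c), which I suppose is only loaded on this volume because of features.file-snapshot: on; that would also be consistent with heal info working fine on my other volumes. If it helps, the faulting frame can be inspected from the same gdb session, for example:

(gdb) frame 0
(gdb) print *inode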


Thanks,

Alessandro

On 29 May 2015, at 11:12, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:



On 05/29/2015 02:37 PM, Alessandro De Salvo wrote:
Hi Pranith,
many thanks for the help!
The volume info of the problematic volume is the following:

# gluster volume info adsnet-vm-01
 
Volume Name: adsnet-vm-01
Type: Replicate
Volume ID: f8f615df-3dde-4ea6-9bdb-29a1706e864c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gwads02.sta.adsnet.it:/gluster/vm01/data
Brick2: gwads03.sta.adsnet.it:/gluster/vm01/data
Options Reconfigured:
nfs.disable: true
features.barrier: disable
features.file-snapshot: on
server.allow-insecure: on
Are you sure the attached log is correct? I do not see any backtrace in the log file to indicate there is a crash :-(. Could you run "grep -i crash /var/log/glusterfs/*" to see if there is some other file with the crash? If that also fails, would it be possible for you to provide the backtrace of the core by opening it with gdb?

Pranith

The log is attached.
I just wanted to add that the heal info command works fine on other volumes hosted by the same machines, so it’s just this volume that is causing problems.
Thanks,

Alessandro




On 29 May 2015, at 10:50, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:



On 05/29/2015 02:18 PM, Pranith Kumar Karampuri wrote:


On 05/29/2015 02:13 PM, Alessandro De Salvo wrote:
Hi,
I'm facing a strange issue with split brain reporting.
I have upgraded to 3.7.0 on all servers hosting the volumes, after stopping all gluster processes as described in the twiki. The upgrade and the restart were fine, and the volumes are accessible.
However, I had two files in split brain that I had not healed before upgrading, so I tried a full heal with 3.7.0. The heal was launched correctly, but when I now perform a heal info there is no output, while the heal statistics say there are actually 2 files in split brain. In the logs I see something like this:

glustershd.log:
[2015-05-29 08:28:43.008373] I [afr-self-heal-entry.c:558:afr_selfheal_entry_do] 0-adsnet-gluster-01-replicate-0: performing entry selfheal on 7fd1262d-949b-402e-96c2-ae487c8d4e27
[2015-05-29 08:28:43.012690] W [client-rpc-fops.c:241:client3_3_mknod_cbk] 0-adsnet-gluster-01-client-1: remote operation failed: Invalid argument. Path: (null)
Hey, could you let us know the "gluster volume info" output? Please also let us know the backtrace printed in /var/log/glusterfs/glfsheal-<volname>.log.
Please attach /var/log/glusterfs/glfsheal-<volname>.log file to this thread so that I can take a look.

Pranith

Pranith


So, it seems like the files to be healed are not correctly identified, or at least their path is null.
Also, every time I issue a "gluster volume heal <volname> info" a core dump is generated in the log area.
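For completeness, the command sequence I am comparing is along these lines (the last one, if I read the docs correctly, should list the split-brain entries explicitly):

gluster volume heal <volname> full
gluster volume heal <volname> info
gluster volume heal <volname> statistics
gluster volume heal <volname> info split-brain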
All servers are using the latest CentOS 7.
Any idea why this might be happening and how to solve it?
Thanks,

   Alessandro





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
