On Mon, 2014-08-04 at 12:31 +0200, Niels de Vos wrote:
> On Mon, Aug 04, 2014 at 05:05:10PM +0800, Franco Broi wrote:
> >
> > A bit more background to this.
> >
> > I was running 3.4.3 on all the clients (120+ nodes) but I also have a
> > 3.5 volume which I wanted to mount on the same nodes. The 3.4.3 client
> > mounts of the 3.5 volume would sometimes hang on mount requiring a
> > volume stop/start to clear. I raised this issue on this list but it was
> > never resolved. I also tried to downgrade the 3.5 volume to 3.4 but that
> > also didn't work.
> >
> > I had a single client node running 3.5 and it was able to mount both
> > volumes so I decided to update everything on the client side.
> >
> > Middle of last week I did a glusterfs update from 3.4.3 to 3.5.1 and
> > everything appeared to be ok. The existing 3.4.3 mounts continued to
> > work and I was able to mount the 3.5 volume without any of the hanging
> > problems I was seeing before. Great, I thought.
> >
> > Today mount points started to fail, both for the 3.4 volume with the 3.4
> > client and for the 3.5 volume with the 3.5 client.
> >
> > I've been remounting the filesystems as they break but it's a pretty
> > unstable environment.
> >
> > BTW, is there some way to get gluster to write its core files somewhere
> > other than the root filesystem? If I could do that I might at least get
> > a complete core dump to run gdb on.
>
> You can set a sysctl with a path, for example:
>
> # mkdir /var/cores
> # mount /dev/local_vg/cores /var/cores
> # sysctl -w kernel.core_pattern=/var/cores/core

Thanks for that.

>
> I am not sure if the "mismatching layouts" can cause a segmentation
> fault. In any case, it would be good to get the extended attributes for
> the directories in question. The xattrs contain the hash-range (layout)
> on where the files should get located.
>
> For all bricks (replace the "..." with the path for the brick):
>
> # getfattr -m. -ehex -d .../promax_data/115_endurance/31fasttrackstk
>
> Please also include a "gluster volume info $VOLUME".

Please see attached.

>
> You should also file a bug for this, core dumping should definitely not
> happen.
>
> Thanks,
> Niels
>
> >
> > Cheers,
> >
> > On Mon, 2014-08-04 at 12:53 +0530, Pranith Kumar Karampuri wrote:
> > > CC dht folks
> > >
> > > Pranith
> > > On 08/04/2014 11:52 AM, Franco Broi wrote:
> > > > I've had a sudden spate of mount points failing with Transport endpoint
> > > > not connected and core dumps. The dumps are so large and my root
> > > > partitions so small that I haven't managed to get a decent traceback.
> > > >
> > > > BFD: Warning: //core.2351 is truncated: expected core file size >=
> > > > 165773312, found: 154107904.
> > > > [New Thread 2351]
> > > > [New Thread 2355]
> > > > [New Thread 2359]
> > > > [New Thread 2356]
> > > > [New Thread 2354]
> > > > [New Thread 2360]
> > > > [New Thread 2352]
> > > > Cannot access memory at address 0x1700000006
> > > > (gdb) where
> > > > #0 glusterfs_signals_setup (ctx=0x8b17c0) at glusterfsd.c:1715
> > > > Cannot access memory at address 0x7fffaa46b2e0
> > > >
> > > >
> > > > Log file is full of messages like this:
> > > >
> > > > [2014-08-04 06:10:11.160482] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160495] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160502] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160514] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160522] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160622] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160634] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > >
> > > >
> > > > I'm running 3.5.1 on the client side and 3.4.3 on the server.
> > > >
> > > > Any quick help much appreciated.
> > > >
> > > > Cheers,
> > > >
> > > > _______________________________________________
> > > > Gluster-users mailing list
> > > > Gluster-users@xxxxxxxxxxx
> > > > http://supercolony.gluster.org/mailman/listinfo/gluster-users
# file: data1/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000002e8ba2e845d1745b

# file: data2/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000d1745d14e8ba2e87

# file: data3/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000e8ba2e88ffffffff

# file: data4/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data5/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data6/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data7/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data8/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data10/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000001745d1742e8ba2e7

# file: data11/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x000000010000000045d1745c5d1745cf

# file: data12/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000005d1745d0745d1743

# file: data9/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000000000001745d173

# file: data13/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000745d17448ba2e8b7

# file: data14/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000008ba2e8b8a2e8ba2b

# file: data15/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000a2e8ba2cba2e8b9f

# file: data16/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000ba2e8ba0d1745d13

Volume Name: data
Type: Distribute
Volume ID: 11d03f34-cc91-469f-afc3-35005db0faef
Status: Started
Number of Bricks: 16
Transport-type: tcp
Bricks:
Brick1: nas1-10g:/data1/gvol
Brick2: nas2-10g:/data5/gvol
Brick3: nas1-10g:/data2/gvol
Brick4: nas2-10g:/data6/gvol
Brick5: nas1-10g:/data3/gvol
Brick6: nas2-10g:/data7/gvol
Brick7: nas1-10g:/data4/gvol
Brick8: nas2-10g:/data8/gvol
Brick9: nas3-10g:/data9/gvol
Brick10: nas3-10g:/data10/gvol
Brick11: nas3-10g:/data11/gvol
Brick12: nas3-10g:/data12/gvol
Brick13: nas4-10g:/data13/gvol
Brick14: nas4-10g:/data14/gvol
Brick15: nas4-10g:/data15/gvol
Brick16: nas4-10g:/data16/gvol
Options Reconfigured:
nfs.export-volumes: on
nfs.disable: off
cluster.min-free-disk: 5%
network.frame-timeout: 10800
cluster.readdir-optimize: off
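
For anyone reading the getfattr output above, here is a rough Python sketch of how the trusted.glusterfs.dht values might be decoded into hash ranges. It assumes each value is four big-endian 32-bit words with the last two being the start and stop of the range. That layout is an assumption about the on-disk format in these releases, not something stated in this thread. The helper names decode_layout and check_coverage are made up for illustration.

```python
import struct

# trusted.glusterfs.dht values copied from the getfattr output above,
# keyed by brick directory. None means the brick returned no dht xattr.
XATTRS = {
    "data1": "0x00000001000000002e8ba2e845d1745b",
    "data2": "0x0000000100000000d1745d14e8ba2e87",
    "data3": "0x0000000100000000e8ba2e88ffffffff",
    "data4": None,
    "data5": None,
    "data6": None,
    "data7": None,
    "data8": None,
    "data9": "0x0000000100000000000000001745d173",
    "data10": "0x00000001000000001745d1742e8ba2e7",
    "data11": "0x000000010000000045d1745c5d1745cf",
    "data12": "0x00000001000000005d1745d0745d1743",
    "data13": "0x0000000100000000745d17448ba2e8b7",
    "data14": "0x00000001000000008ba2e8b8a2e8ba2b",
    "data15": "0x0000000100000000a2e8ba2cba2e8b9f",
    "data16": "0x0000000100000000ba2e8ba0d1745d13",
}


def decode_layout(hexval):
    """Return (start, stop) from a hex xattr string.

    ASSUMPTION: the value is four big-endian uint32 words and the
    last two are the start and stop of the brick's hash range.
    """
    raw = bytes.fromhex(hexval[2:])  # drop the leading "0x"
    _word0, _word1, start, stop = struct.unpack(">4I", raw)
    return start, stop


def check_coverage(xattrs):
    """Sort the decoded ranges and report gaps and bricks without a layout."""
    ranges = sorted(
        (decode_layout(v), brick) for brick, v in xattrs.items() if v is not None
    )
    missing = [brick for brick, v in xattrs.items() if v is None]
    gaps = []
    expected = 0
    for (start, stop), brick in ranges:
        if start != expected:
            gaps.append((expected, start, brick))
        expected = stop + 1
    # Complete means no gaps and the last range ends at 0xffffffff.
    complete = not gaps and expected == 2 ** 32
    return ranges, missing, gaps, complete


if __name__ == "__main__":
    ranges, missing, gaps, complete = check_coverage(XATTRS)
    for (start, stop), brick in ranges:
        print(f"{brick:7s} {start:08x} - {stop:08x}")
    print("bricks without a dht xattr:", ", ".join(missing))
    print("ranges tile the 32-bit hash space:", complete)
```

Under that assumption, the eleven bricks that carry a layout appear to tile the hash space end to end, and data4 through data8 carry no layout at all for this directory. That would at least be consistent with the "disk layout missing" messages in the log, though it does not by itself explain the crash.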