On Tue, Aug 12, 2014 at 08:24:44AM +0800, Franco Broi wrote:
> Did a major update yesterday to 3.5.2 on all servers and I'm happy to
> report that it went smoothly and everything seems to be working well. I
> also did an update of ZOL to 0.6.3 running on linux-3.14.16, and the
> result of all the updates is a definite improvement in the speed of ls
> for the fuse client, good enough for us to switch back from using gNFS
> for interactive applications.
>
> All my mountpoints look good too. No more crashes.
>
> Thanks for all the good work; hopefully you won't be hearing much from
> me for a while!

Thanks for the update, Franco!

Niels

> Cheers,
>
> On Wed, 2014-08-06 at 08:54 +0800, Franco Broi wrote:
> > I think all the mounts that have failed were mounted with 3.4.3 prior
> > to the update. Not sure why they continued to work for several days
> > before failing, but remounting them with 3.5 appears to fix the
> > problem. Running fusermount -zu makes them eventually exit with a
> > core dump.
> >
> > So no more live updates!
> >
> > Cheers,
> >
> > On Tue, 2014-08-05 at 14:24 +0800, Franco Broi wrote:
> > > On Mon, 2014-08-04 at 12:31 +0200, Niels de Vos wrote:
> > > > On Mon, Aug 04, 2014 at 05:05:10PM +0800, Franco Broi wrote:
> > > > > A bit more background to this.
> > > > >
> > > > > I was running 3.4.3 on all the clients (120+ nodes), but I also
> > > > > have a 3.5 volume which I wanted to mount on the same nodes.
> > > > > The 3.4.3 client mounts of the 3.5 volume would sometimes hang
> > > > > on mount, requiring a volume stop/start to clear. I raised this
> > > > > issue on this list but it was never resolved. I also tried to
> > > > > downgrade the 3.5 volume to 3.4, but that didn't work either.
> > > > >
> > > > > I had a single client node running 3.5 and it was able to mount
> > > > > both volumes, so I decided to update everything on the client
> > > > > side.
> > > > >
> > > > > In the middle of last week I did a glusterfs update from 3.4.3
> > > > > to 3.5.1 and everything appeared to be OK. The existing 3.4.3
> > > > > mounts continued to work and I was able to mount the 3.5 volume
> > > > > without any of the hanging problems I was seeing before. Great,
> > > > > I thought.
> > > > >
> > > > > Today mount points started to fail, both for the 3.4 volume
> > > > > with the 3.4 client and for the 3.5 volume with the 3.5 client.
> > > > >
> > > > > I've been remounting the filesystems as they break, but it's a
> > > > > pretty unstable environment.
> > > > >
> > > > > BTW, is there some way to get gluster to write its core files
> > > > > somewhere other than the root filesystem? If I could do that I
> > > > > might at least get a complete core dump to run gdb on.
> > > >
> > > > You can set a sysctl with a path, for example:
> > > >
> > > > # mkdir /var/cores
> > > > # mount /dev/local_vg/cores /var/cores
> > > > # sysctl -w kernel.core_pattern=/var/cores/core
> > >
> > > Thanks for that.
> > >
> > > > I am not sure if the "mismatching layouts" can cause a
> > > > segmentation fault. In any case, it would be good to get the
> > > > extended attributes for the directories in question. The xattrs
> > > > contain the hash range (layout) that determines where the files
> > > > should get located.
> > > >
> > > > For all bricks (replace the "..." with the path for the brick):
> > > >
> > > > # getfattr -m. -ehex -d .../promax_data/115_endurance/31fasttrackstk
> > > >
> > > > Please also include a "gluster volume info $VOLUME".
> > >
> > > Please see attached.
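
An aside for anyone reading along and decoding that getfattr output: on
each brick, a directory's layout lives in the trusted.glusterfs.dht
extended attribute. As a rough sketch (the brick path below is made up,
and the field breakdown reflects the usual on-disk format for these
releases as I understand it, so double-check against your build), the
hex value packs four 32-bit big-endian fields: entry count, hash type,
range start, and range stop:

    # getfattr -m. -ehex -d /data/brick1/promax_data/115_endurance/31fasttrackstk
    # file: data/brick1/promax_data/115_endurance/31fasttrackstk
    trusted.glusterfs.dht=0x0000000100000000000000003fffffff
    #                       cnt=1   type=0  start=0x00000000 stop=0x3fffffff

Across all bricks, the start/stop ranges should tile 0x00000000 through
0xffffffff with no gaps or overlaps; a brick where the attribute is
absent altogether would match the "disk layout missing" messages in the
client log. If that is what the attached output shows, a fix-layout
rebalance should rewrite the directory layouts without migrating data:

    # gluster volume rebalance $VOLUME fix-layout start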
> > > > You should also file a bug for this; core dumping should
> > > > definitely not happen.
> > > >
> > > > Thanks,
> > > > Niels
> > > >
> > > > > Cheers,
> > > > >
> > > > > On Mon, 2014-08-04 at 12:53 +0530, Pranith Kumar Karampuri wrote:
> > > > > > CC dht folks
> > > > > >
> > > > > > Pranith
> > > > > > On 08/04/2014 11:52 AM, Franco Broi wrote:
> > > > > > > I've had a sudden spate of mount points failing with
> > > > > > > "Transport endpoint not connected" and core dumps. The
> > > > > > > dumps are so large and my root partitions so small that I
> > > > > > > haven't managed to get a decent traceback.
> > > > > > >
> > > > > > > BFD: Warning: //core.2351 is truncated: expected core file
> > > > > > > size >= 165773312, found: 154107904.
> > > > > > > [New Thread 2351]
> > > > > > > [New Thread 2355]
> > > > > > > [New Thread 2359]
> > > > > > > [New Thread 2356]
> > > > > > > [New Thread 2354]
> > > > > > > [New Thread 2360]
> > > > > > > [New Thread 2352]
> > > > > > > Cannot access memory at address 0x1700000006
> > > > > > > (gdb) where
> > > > > > > #0  glusterfs_signals_setup (ctx=0x8b17c0) at glusterfsd.c:1715
> > > > > > > Cannot access memory at address 0x7fffaa46b2e0
> > > > > > >
> > > > > > > The log file is full of messages like this:
> > > > > > >
> > > > > > > [2014-08-04 06:10:11.160482] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > > > > [2014-08-04 06:10:11.160495] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > > > > [2014-08-04 06:10:11.160502] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > > > > [2014-08-04 06:10:11.160514] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > > > > [2014-08-04 06:10:11.160522] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > > > > [2014-08-04 06:10:11.160622] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > > > > [2014-08-04 06:10:11.160634] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > > > >
> > > > > > > I'm running 3.5.1 on the client side and 3.4.3 on the server.
> > > > > > >
> > > > > > > Any quick help much appreciated.
> > > > > > >
> > > > > > > Cheers,
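
A follow-up note on the truncated core above: besides pointing
kernel.core_pattern at a filesystem with enough space, it can help to
lift the core size limit in the environment that starts the client, and
to add the standard %e/%p/%t specifiers so successive dumps do not
overwrite or truncate one another. A minimal sketch:

    # ulimit -c unlimited
    # sysctl -w kernel.core_pattern=/var/cores/core.%e.%p.%t

Once a complete core is captured, a full per-thread backtrace is the
most useful thing to attach to the bug report (the binary path assumes
a default install of the fuse client, the core filename below is
illustrative, and installing your distribution's glusterfs debuginfo
package, where one exists, makes the trace readable):

    # gdb /usr/sbin/glusterfs /var/cores/core.glusterfs.2351.1407129011
    (gdb) thread apply all bt full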