Yes, you should file a bug to track this issue and to share information.
Also, I would like to see the logs from /var/log/messages, and especially the mount log (named mnt.log or similar).
Here are the points I would like to bring to your notice:
1 - Are you sure that all the bricks are up?
2 - Are there any connection issues?
3 - It is possible that a bug caused a crash. Please check whether a core dump was created when you did the mount and saw the ENOTCONN error.
4 - I am not very familiar with armhf and have not run glusterfs on this hardware, so we need to see whether anything in the code prevents glusterfs from running on this architecture and setup.
5 - Please provide the output of gluster v info and gluster v status for the volume in the BZ.
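The information requested above could be gathered in one pass with something like the sketch below. The volume name testvol1 comes from the original post; the output directory, the collect helper, and the exact log paths are assumptions (on Ubuntu the system log is usually /var/log/syslog rather than /var/log/messages, and the client log for a mount on /mnt is assumed to be /var/log/glusterfs/mnt.log), so adjust them to the actual setup.

```shell
#!/bin/sh
# Sketch only: collect the details requested for the bug report.
VOL="${VOL:-testvol1}"            # volume name from the original post
OUT="${OUT:-./gluster-bz-info}"   # hypothetical output directory
mkdir -p "$OUT"

# Run a command and save its stdout and stderr to $OUT/<name>.txt.
# A failing or missing command is recorded in the file instead of aborting.
collect() {
    name="$1"; shift
    "$@" > "$OUT/$name.txt" 2>&1 || echo "command failed: $*" >> "$OUT/$name.txt"
}

collect volume-info    gluster volume info "$VOL"
collect volume-status  gluster volume status "$VOL"
collect peer-status    gluster peer status

# Log locations are assumptions; change them to match the system.
collect system-log     tail -n 500 /var/log/messages
collect mount-log      tail -n 500 /var/log/glusterfs/mnt.log
```

Run it as root on the node where the mount failed and attach the resulting directory to the BZ.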
---
Ashish
From: "Fox" <foxxz.net@xxxxxxxxx>
To: gluster-users@xxxxxxxxxxx
Sent: Friday, August 3, 2018 9:51:30 AM
Subject: Disperse volumes on armhf
Just wondering if anyone else is running into the same behavior with disperse volumes described below and what I might be able to do about it.
I am using Ubuntu 18.04 LTS on Odroid HC-2 hardware (armhf) and have installed gluster 4.1.2 via PPA. I have 12 member nodes, each with a single brick. I can successfully create a working volume with the command:
gluster volume create testvol1 disperse 12 redundancy 4 gluster01:/exports/sda/brick1/testvol1 gluster02:/exports/sda/brick1/testvol1 gluster03:/exports/sda/brick1/testvol1 gluster04:/exports/sda/brick1/testvol1 gluster05:/exports/sda/brick1/testvol1 gluster06:/exports/sda/brick1/testvol1 gluster07:/exports/sda/brick1/testvol1 gluster08:/exports/sda/brick1/testvol1 gluster09:/exports/sda/brick1/testvol1 gluster10:/exports/sda/brick1/testvol1 gluster11:/exports/sda/brick1/testvol1 gluster12:/exports/sda/brick1/testvol1
And start the volume:
gluster volume start testvol1
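As a quick sanity check on the layout above, the arithmetic behind "disperse 12 redundancy 4" can be sketched as follows. It relies on the documented rule that redundancy must be greater than 0 and the total brick count must be greater than 2 * redundancy; the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: work out what "disperse 12 redundancy 4" means for the volume.
DISPERSE=12     # total bricks in the disperse set
REDUNDANCY=4    # bricks that may be lost without losing data

DATA=$((DISPERSE - REDUNDANCY))           # bricks' worth of usable data
USABLE_PCT=$((100 * DATA / DISPERSE))     # integer percentage, rounded down

# Rule from the Gluster documentation: 0 < redundancy and 2 * redundancy < bricks.
if [ "$REDUNDANCY" -gt 0 ] && [ $((2 * REDUNDANCY)) -lt "$DISPERSE" ]; then
    VALID=yes
else
    VALID=no
fi

echo "data bricks:        $DATA"
echo "tolerated failures: $REDUNDANCY"
echo "usable capacity:    about ${USABLE_PCT}% of raw space"
echo "valid layout:       $VALID"
```

So this volume stores 8 bricks' worth of data, tolerates up to 4 bricks being down, and the layout itself is a valid one.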
When I mount the volume on an x86-64 system, it performs as expected.
When I mount the same volume on an armhf system (such as one of the cluster members), I can create directories, but when I try to create a file I get an error and the file system unmounts/crashes:
root@gluster01:~# mount -t glusterfs gluster01:/testvol1 /mnt
root@gluster01:~# cd /mnt
root@gluster01:/mnt# ls
root@gluster01:/mnt# mkdir test
root@gluster01:/mnt# cd test
root@gluster01:/mnt/test# cp /root/notes.txt ./
cp: failed to close './notes.txt': Software caused connection abort
root@gluster01:/mnt/test# ls
ls: cannot open directory '.': Transport endpoint is not connected
I get many of these in the glusterfsd.log:
The message "W [MSGID: 101088] [common-utils.c:4316:gf_backtrace_save] 0-management: Failed to save the backtrace." repeated 100 times between [2018-08-03 04:06:39.904166] and [2018-08-03 04:06:57.521895]
Furthermore, if a cluster member drops out (reboots, loses connection, etc.) and needs healing, the self-heal daemon logs messages similar to the one above and cannot heal: there is no disk activity (verified via iotop) despite very high CPU usage, and the volume heal info command indicates that the volume needs healing.
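To see whether the heal backlog is shrinking at all, the per-brick counts reported by the heal info command can be totalled. The helper below is a minimal sketch: count_heal_entries is a hypothetical name, and it assumes the usual "Number of entries: N" lines in the heal info output.

```shell
#!/bin/sh
# Sketch: sum the "Number of entries: N" lines from heal info output on stdin.
# Non-numeric values (for example "-" for an unreachable brick) count as 0.
count_heal_entries() {
    awk -F': ' '/^Number of entries:/ { n += $2 } END { print n + 0 }'
}

# Usage (run on a cluster member, volume name from this post):
#   gluster volume heal testvol1 info | count_heal_entries
```

Running this a few minutes apart shows whether the total is dropping; a number that never changes while CPU usage stays high matches the stuck behaviour described above.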
I tested all of the above in virtual environments using x86-64 VMs and could self heal as expected.
Again this only happens when using disperse volumes. Should I be filing a bug report instead?
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users