The last few days have been tense because an R3 3.8.5 Gluster cluster that I built has been plagued by problems.

The first symptom was a continuous stream of messages like this in the client logs:

[2016-12-17 15:55:02.047508] E [MSGID: 108009] [afr-open.c:187:afr_openfd_fix_open_cbk] 0-hisap-prod-1-replicate-0: Failed to open /home/galaxy/HISAP/java/lib/java/jre1.7.0_51/jre/lib/rt.jar on subvolume hisap-prod-1-client-2 [Transport endpoint is not connected]

followed by very frequent peer disconnections/reconnections and a continuous stream of files to be healed on several volumes.

The problem was traced back to a flaky X540-T2 10GbE NIC embedded in the motherboard of one of the peers, which was incapable of keeping the correct 10 Gbit speed negotiation with the switch. The motherboard was replaced on that peer, and then the volumes healed quickly to complete health.

All of this happened while the users kept running heavy-duty bioinformatics applications (NGS data analysis) on top of Gluster. No user noticed ANYTHING, despite a major hardware problem and the off-lining of a peer.

This is a RESILIENT system, in my book.

Gluster people, despite the constant stream of problems and requests for help that you see on the ML and IRC, rest assured that you are building a nice piece of software, at least IMHO. Keep up the good work and Merry Christmas.

Ivan Rossi