Hi Dennis.

That is the first issue we saw, and it has nothing to do with the AMD processors (those relate only to the second issue we saw). So the fix in the patch

https://github.com/ceph/ceph/pull/10027

should work for you.

In our case we went for the full compilation for our own specific reasons, but you should only need to recompile the ceph-fuse client.

If you want a temporary solution while this is not fixed in Jewel, just deploy ceph-fuse using an Infernalis client. That is how we did it during the 3 weeks we were debugging our issues.

Cheers
Goncalo

________________________________________
From: Dennis Kramer (DBS) [dennis@xxxxxxxxx]
Sent: 30 August 2016 20:59
To: Goncalo Borges; ceph-users@xxxxxxxxxxxxxx
Subject: Re: ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2

Hi Goncalo,

Thank you for providing the info below. I'm getting the exact same errors:

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x2ae88e) [0x5647a76f488e]
 2: (()+0x113d0) [0x7f7d14c393d0]
 3: (Client::get_root_ino()+0x10) [0x5647a75eb730]
 4: (CephFuse::Handle::make_fake_ino(inodeno_t, snapid_t)+0x175) [0x5647a75e9595]
 5: (()+0x1a3eb1) [0x5647a75e9eb1]
 6: (()+0x14ef5) [0x7f7d15283ef5]
 7: (()+0x15679) [0x7f7d15284679]
 8: (()+0x11e38) [0x7f7d15280e38]
 9: (()+0x76fa) [0x7f7d14c2f6fa]
 10: (clone()+0x6d) [0x7f7d1351ab5d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

After reading your thread I wasn't sure if your solution would work in our environment, since we don't use the AMD procs you mentioned, though the segfaults look identical in debugging.

Have you recompiled ceph completely for your cluster or just the MDS server?

On 08/25/2016 02:45 AM, Goncalo Borges wrote:
> Hi Dennis...
>
> We use ceph-fuse in 10.2.2 and we saw two main issues with it immediately after
> upgrading from Infernalis to Jewel.
>
> In our case, we are enabling ceph-fuse in a heavily used Linux cluster, and our
> users complained about the mount points becoming unavailable some time after
> their applications start up.
>
> First we saw
>
> https://github.com/ceph/ceph/pull/10027
>
> and once that was fixed, we saw
>
> http://tracker.ceph.com/issues/16610
>
>
> There is a long ML thread with the subject 'ceph-fuse segfaults ( jewel 10.2.2)'
> on the topic. At the end, RH staff proposed some patches which we applied (we
> recompile ceph ourselves) and which resolved the issues we saw.
>
> You should run ceph-fuse in debug mode to actually check what segfaults you may
> have, and whether it is a similar problem. You can do that by mounting ceph-fuse
> with nohup and the '-d' option. Something like:
>
> nohup ceph-fuse --id mount_user -k <path to your key> -m <mon ip>:6789 -d -r /cephfs /coepp/cephfs > /path/to/some/log 2>&1 &
>
> If you want an even bigger log level, you should set 'debug client = 20' in your
> /etc/ceph/ceph.conf before mounting.
>
>
> Cheers
> Goncalo
>
> On 08/24/2016 10:28 PM, Dennis Kramer (DT) wrote:
>> Hi all,
>>
>> Running ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) on
>> Ubuntu 16.04 LTS.
>>
>> Currently I have the weirdest thing. I have a bunch of Linux clients, mostly
>> Debian based (Ubuntu/Mint). They all use version 10.2.2 of ceph-fuse. I have
>> been running cephfs since Hammer without any issues, but upgraded last week to
>> Jewel and now my clients get:
>> "Transport endpoint is not connected".
>>
>> It seems the error only arises when the client is using the GUI when they
>> browse through the ceph-fuse mount; some use nemo, some nautilus. The error
>> doesn't show up immediately; sometimes the client can browse through the share
>> for some time before they are kicked out with the error.
>>
>> But when I strictly use the shell to browse the ceph-fuse mount in the CLI it
>> works without any issues, when I try to use the GUI browser on the same
>> client, the error shows and I get kicked out of the ceph-fuse mount until I
>> remount.
>>
>> Any suggestions?
>>
>> With regards,
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Goncalo Borges
> Research Computing
> ARC Centre of Excellence for Particle Physics at the Terascale
> School of Physics A28 | University of Sydney, NSW 2006
> T: +61 2 93511937
>

--
Kramer M.D.
Infrastructure Engineer
........................................................................
Nederlands Forensisch Instituut
Digitale Technologie & Biometrie
Laan van Ypenburg 6 | 2497 GB | Den Haag
Postbus 24044 | 2490 AA | Den Haag
........................................................................
T 070 888 64 30
M 06 29 62 12 02
d.kramer@xxxxxxxxxxxxxx / dennis@xxxxxxxxx
PGP publickey: http://www.holmes.nl/dennis.asc
www.forensischinstituut.nl
........................................................................
Nederlands Forensisch Instituut. In feiten het beste.
........................................................................
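
For anyone comparing their own ceph-fuse debug log against the backtrace quoted in this thread, here is a minimal Python sketch. It is not part of Ceph: the function names and the choice of "signature" frames (Client::get_root_ino and make_fake_ino, taken from the trace above) are assumptions made for illustration. A match only suggests the crash resembles the first issue discussed here; it does not prove it is the same bug.

```python
import re

# Backtrace frames in the log quoted in this thread look like:
#   3: (Client::get_root_ino()+0x10) [0x5647a75eb730]
FRAME_RE = re.compile(
    r"^\s*(\d+):\s+\((.*)\+0x[0-9a-fA-F]+\)\s+\[0x[0-9a-fA-F]+\]\s*$"
)

# Assumption: these two symbols, seen in the trace above, are treated as the
# signature of the first issue. This is a heuristic, not an official marker.
FIRST_ISSUE_HINTS = ("Client::get_root_ino", "make_fake_ino")


def parse_frames(log_text):
    """Return (frame_number, symbol) pairs for every backtrace line found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def looks_like_first_issue(log_text):
    """True if every hint symbol appears in at least one parsed frame."""
    symbols = [symbol for _, symbol in parse_frames(log_text)]
    return all(any(hint in s for s in symbols) for hint in FIRST_ISSUE_HINTS)
```

To use it, read the file that the ceph-fuse output was redirected to (the /path/to/some/log placeholder in the command quoted above) and pass its contents to looks_like_first_issue().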