Re: ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2

> On 30 August 2016 at 12:59, "Dennis Kramer (DBS)" <dennis@xxxxxxxxx> wrote:
> 
> 
> Hi Goncalo,
> 
> Thank you for providing the info below. I'm getting exactly the same errors:
> ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
>  1: (()+0x2ae88e) [0x5647a76f488e]
>  2: (()+0x113d0) [0x7f7d14c393d0]
>  3: (Client::get_root_ino()+0x10) [0x5647a75eb730]
>  4: (CephFuse::Handle::make_fake_ino(inodeno_t, snapid_t)+0x175) [0x5647a75e9595]
>  5: (()+0x1a3eb1) [0x5647a75e9eb1]
>  6: (()+0x14ef5) [0x7f7d15283ef5]
>  7: (()+0x15679) [0x7f7d15284679]
>  8: (()+0x11e38) [0x7f7d15280e38]
>  9: (()+0x76fa) [0x7f7d14c2f6fa]
>  10: (clone()+0x6d) [0x7f7d1351ab5d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> 
> After reading your thread, I wasn't sure whether your solution would work in
> our environment, since we don't use the AMD processors you mentioned, although
> the segfaults look identical in the debug output.
> 
> Did you recompile Ceph for the whole cluster, or only for the MDS?
> 

This is purely a client-side crash, so you only need to recompile ceph-fuse for the clients. The MDS does not need to be patched.
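
To confirm that a client is hitting this same crash before rebuilding, one option is to search a captured ceph-fuse log for the two named frames from the backtrace quoted above. This is only a minimal sketch: the function name has_get_root_ino_crash is hypothetical, and it assumes the log was written to a file as in Goncalo's example further down.

```shell
# Hypothetical helper, not part of Ceph: returns 0 only if the given log
# file contains both named frames from the backtrace in this thread.
has_get_root_ino_crash() {
    log="$1"
    # -F matches fixed strings, so the parentheses and colons are literal.
    grep -qF 'Client::get_root_ino()' "$log" &&
        grep -qF 'CephFuse::Handle::make_fake_ino' "$log"
}
```

It could be called as `has_get_root_ino_crash /path/to/some/log && echo "same crash"`, where the path is a placeholder. A match only shows that the same frames appear in the log; it does not prove the root cause is identical.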

Wido

> 
> On 08/25/2016 02:45 AM, Goncalo Borges wrote:
> > Hi Dennis...
> > 
> > We use ceph-fuse in 10.2.2 and we saw two main issues with it immediately after 
> > upgrading from Infernalis to Jewel.
> > 
> > In our case, we are enabling ceph-fuse in a heavily used Linux cluster, and our 
> > users complained about the mount points becoming unavailable some time after 
> > their applications start up.
> > 
> > First we saw
> > 
> > https://github.com/ceph/ceph/pull/10027
> > 
> > and once that was fixed, we saw
> > 
> > http://tracker.ceph.com/issues/16610
> > 
> > 
> > There is a long ML thread with the subject 'ceph-fuse segfaults ( jewel 10.2.2)' 
> > on the topic. At the end, RH staff proposed some patches which we applied (we 
> > recompile ceph ourselves) and which resolved the issues we saw.
> > 
> > You should run ceph-fuse in debug mode to check which segfaults you are 
> > actually hitting and whether it is a similar problem. You can do that by 
> > mounting ceph-fuse with nohup and the '-d' option. Something like:
> > 
> > nohup ceph-fuse --id mount_user -k <path to your key> -m <mon ip>:6789 -d -r /cephfs /coepp/cephfs > /path/to/some/log 2>&1 &
> > 
> > If you want even more verbose logging, set 'debug client = 20' in your 
> > /etc/ceph/ceph.conf before mounting.
> > 
> > 
> > Cheers
> > Goncalo
> > 
> > On 08/24/2016 10:28 PM, Dennis Kramer (DT) wrote:
> >> Hi all,
> >>
> >> Running ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) on 
> >> Ubuntu 16.04LTS.
> >>
> >> I'm currently seeing the weirdest thing. I have a bunch of Linux clients, 
> >> mostly Debian-based (Ubuntu/Mint), all running ceph-fuse 10.2.2. I have 
> >> been running CephFS since Hammer without any issues, but I upgraded to 
> >> Jewel last week and now my clients get:
> >> "Transport endpoint is not connected".
> >>
> >> The error seems to arise only when a client browses the ceph-fuse mount 
> >> through a GUI file manager; some use Nemo, some Nautilus. It doesn't show 
> >> up immediately: sometimes the client can browse the share for a while 
> >> before being kicked out with the error.
> >>
> >> When I browse the ceph-fuse mount strictly from the shell, it works without 
> >> any issues. When I use the GUI file manager on the same client, the error 
> >> appears and I am kicked out of the ceph-fuse mount until I remount.
> >>
> >> Any suggestions?
> >>
> >> With regards,
> >>
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> > -- 
> > Goncalo Borges
> > Research Computing
> > ARC Centre of Excellence for Particle Physics at the Terascale
> > School of Physics A28 | University of Sydney, NSW  2006
> > T: +61 2 93511937
> > 
> 
> -- 
> Kramer M.D.
> Infrastructure Engineer
> 
> ........................................................................
> Nederlands Forensisch Instituut
> Digitale Technologie & Biometrie
> Laan van Ypenburg 6 | 2497 GB | Den Haag
> Postbus 24044 | 2490 AA | Den Haag
> ........................................................................
> T 070 888 64 30
> M 06 29 62 12 02
> d.kramer@xxxxxxxxxxxxxx / dennis@xxxxxxxxx
> PGP publickey: http://www.holmes.nl/dennis.asc
> www.forensischinstituut.nl
> ........................................................................
> Nederlands Forensisch Instituut. In feiten het beste.
> ........................................................................