Re: ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2


 



On Tue, Aug 30, 2016 at 10:46 AM, Dennis Kramer (DBS) <dennis@xxxxxxxxx> wrote:
>
>
> On 08/29/2016 08:31 PM, Gregory Farnum wrote:
>> On Sat, Aug 27, 2016 at 3:01 AM, Francois Lafont
>> <francois.lafont.1978@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I had exactly the same error in my production ceph client node with
>>> Jewel 10.2.1 in my case.
>>>
>>> In the client node :
>>> - Ubuntu 14.04
>>> - kernel 3.13.0-92-generic
>>> - ceph 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>>> - cephfs via _ceph-fuse_
>>>
>>> In the cluster node :
>>> - Ubuntu 14.04
>>> - kernel 3.13.0-92-generic
>>> - ceph 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>>>
>>> It was during the execution of a very basic Python (2.7.6) script which
>>> makes some os.makedirs(...) and os.chown(...).
>>>
>>> Just in case, the logs are below. I'm sorry they are not verbose at all
>>> and so probably useless for you...
>>>
>>> Which settings should I put in my client and cluster configuration to
>>> have relevant logs if the same error happens again?
>>>
>>> Regards.
>>> François Lafont
>>>
>>> Here are the logs:
>>>
>>> 1. In the client node: http://francois-lafont.ac-versailles.fr/misc/ceph-client.cephfs.log.1.gz
>>
>> Ha, yep, that's one of the bugs Giancolo found:
>>
>>  ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>>  1: (()+0x299152) [0x7f91398dc152]
>>  2: (()+0x10330) [0x7f9138bbb330]
>>  3: (Client::get_root_ino()+0x10) [0x7f91397df6c0]
>>  4: (CephFuse::Handle::make_fake_ino(inodeno_t, snapid_t)+0x175)
>> [0x7f91397dd3d5]
>>  5: (()+0x19ac09) [0x7f91397ddc09]
>>  6: (()+0x14b45) [0x7f91391f7b45]
>>  7: (()+0x1522b) [0x7f91391f822b]
>>  8: (()+0x11e49) [0x7f91391f4e49]
>>  9: (()+0x8184) [0x7f9138bb3184]
>>  10: (clone()+0x6d) [0x7f913752237d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>>
>> So that'll be in the next Jewel release if it's not already fixed in 10.2.2.
>> -Greg
>
> Hi Greg,
>
> Any news on when the fix is getting backported? This is a serious problem
> for our production environment at the moment, and we would rather not
> compile ceph from source if possible.

Looks like the locking fix (https://github.com/ceph/ceph/pull/10027)
didn't have an associated tracker ticket, so it got missed for
backporting.

Opened PR for Jewel backport here: https://github.com/ceph/ceph/pull/10921
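
Until the backport lands, a script like the one Francois described
(os.makedirs plus os.chown on the ceph-fuse mount) could at least
report the dead mount explicitly instead of failing with a bare
OSError. A minimal sketch, assuming Python 3 and assuming the crashed
ceph-fuse process shows up to the caller as errno ENOTCONN (the errno
behind "Transport endpoint is not connected"). The names
StaleFuseMount, is_stale_fuse_mount and make_owned_dir are
hypothetical, not part of Ceph or of the original script:

```python
import errno
import os


class StaleFuseMount(Exception):
    """Hypothetical exception: the FUSE daemon behind the path is gone."""


def is_stale_fuse_mount(exc):
    # ENOTCONN is what the kernel returns for a FUSE mount whose
    # userspace daemon has exited ("Transport endpoint is not connected").
    return isinstance(exc, OSError) and exc.errno == errno.ENOTCONN


def make_owned_dir(path, uid, gid):
    """Create path (and parents) and chown the leaf directory.

    Raises StaleFuseMount if either call fails with ENOTCONN; any
    other OSError is passed through unchanged.
    """
    try:
        os.makedirs(path, exist_ok=True)
        os.chown(path, uid, gid)
    except OSError as exc:
        if is_stale_fuse_mount(exc):
            raise StaleFuseMount(
                "mount serving %r looks dead; it needs to be remounted" % path
            ) from exc
        raise
```

This only detects the condition; it does not work around the crash
itself, and the mount still has to be remounted afterwards.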

John

> With regards,
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com