0.80.5-1precise Not Able to Map RBD & CephFS

ilya.dryomov@xxxxxxxxxxx (Ilya Dryomov) · Sat, 2 Aug 2014 22:20:21 +0400

On Sat, Aug 2, 2014 at 10:03 PM, Christopher O'Connell
<cjo at sendfaster.com> wrote:
> On Sat, Aug 2, 2014 at 6:27 AM, Ilya Dryomov <ilya.dryomov at inktank.com>
> wrote:
>>
>> On Sat, Aug 2, 2014 at 1:41 AM, Christopher O'Connell
>> <cjo at sendfaster.com> wrote:
>> > So I've been having a seemingly similar problem and while trying to follow
>> > the steps in this thread, things have gone very south for me.
>>
>> Show me where in this thread I have said to set tunables to optimal ;)
>> optimal (== firefly for firefly) is actually the opposite of what you
>> are going to need.
>
>
> So what should tunables be set to? Optimal?

Ordinarily yes, but not if you are going to use older kernels.  In that
case you'd want "default" or "legacy".
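
As a rough sketch of how that switch could be scripted: the profile is
changed with "ceph osd crush tunables <profile>".  The wrapper function
and the RUN variable below are only my illustration, not part of Ceph,
and changing tunables on a live cluster can trigger data movement, so
the sketch only prints the command unless RUN=1 is set.

```shell
# Hypothetical helper: print (or, with RUN=1, execute) the command that
# switches the cluster's CRUSH tunables to a named profile.
set_tunables_profile() {
    profile="$1"
    # Accept only the profile names known to firefly-era releases.
    case "$profile" in
        legacy|argonaut|bobtail|firefly|optimal|default) ;;
        *) echo "unknown tunables profile: $profile" >&2; return 1 ;;
    esac
    if [ "${RUN:-0}" = "1" ]; then
        ceph osd crush tunables "$profile"
    else
        # Dry run: show what would be executed.
        echo "ceph osd crush tunables $profile"
    fi
}

# Dry run for an old kernel client.
set_tunables_profile legacy
```

"ceph osd crush show-tunables" should show what the cluster is
currently using before you change anything.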

>
>>
>>
>> >
>> > Kernel on OSDs and MONs: 2.6.32-431.20.3.0.1.el6.centos.plus.x86_64 #1 SMP Wed Jul 16 21:27:52 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > Kernel on RBD host: 3.2.0-61-generic #93-Ubuntu SMP Fri May 2 21:31:50 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > All are running 0.80.5
>>
>> Is this a new firefly cluster or was it created before firefly
>> (specifically before v0.78) and then upgraded?
>
>
> It was created before 0.78 and upgraded. It has also been expanded several
> times.
>
>>
>>
>> >
>> > I updated the tunables as per this article
>> >
>> > http://cephnotes.ksperis.com/blog/2014/01/16/set-tunables-optimal-on-ceph-crushmap
>> >
>> > Here's what's happening:
>> >
>> > 1) On the rbd client node, trying to map rbd produces
>> > $ sudo rbd --conf /etc/ceph/mia1.conf --keyring /etc/ceph/mia1.client.admin.keyring map poolname
>> >
>> > rbd: add failed: (5) Input/output error
>> >
>> > Dmesg:
>> >
>> > [331172.147289] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331172.154059] libceph: mon0 10.103.11.132:6789 missing required protocol features
>> > [331182.176604] libceph: mon1 10.103.11.141:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331182.183535] libceph: mon1 10.103.11.141:6789 missing required protocol features
>> > [331192.192630] libceph: mon2 10.103.11.152:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331192.199810] libceph: mon2 10.103.11.152:6789 missing required protocol features
>> > [331202.209324] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331202.216957] libceph: mon0 10.103.11.132:6789 missing required protocol features
>> > [331212.224540] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331212.232276] libceph: mon0 10.103.11.132:6789 missing required protocol features
>> > [331222.240605] libceph: mon2 10.103.11.152:6789 feature set mismatch, my 2 < server's 20042040002, missing 20042040000
>> > [331222.248660] libceph: mon2 10.103.11.152:6789 missing required protocol features
>> >
>> > However, running
>> > $ sudo rbd --conf /etc/ceph/mia1.conf --keyring /etc/ceph/mia1.client.admin.keyring ls
>> > poolname
>> >
>> > works fine and shows the expected pool name.
>> >
>> > 2) On the monitor where I ran the command to update the tunables, I can no
>> > longer run the ceph console:
>> > $ ceph -c /etc/ceph/mia1.conf --keyring /etc/ceph/mia1.client.admin.keyring
>> > 2014-08-01 17:32:05.026960 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
>> > 10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42360 s=1 pgs=0 cs=0 l=1
>> > c=0x7f21900286a0).connect protocol feature mismatch, my fffffffff < peer
>> > 20fffffffff missing 20000000000
>> > 2014-08-01 17:32:05.027024 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
>> > 10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42360 s=1 pgs=0 cs=0 l=1
>> > c=0x7f21900286a0).fault
>> > 2014-08-01 17:32:05.027544 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
>> > 10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42361 s=1 pgs=0 cs=0 l=1
>> > c=0x7f21900286a0).connect protocol feature mismatch, my fffffffff < peer
>> > 20fffffffff missing 20000000000
>> >
>> > and it just keeps spitting out a similar message. However I *can* run the
>> > ceph console and execute basic commands (status, at the very least) from
>> > other nodes.
>>
>> What does ceph -s from those other nodes say?  Check versions of all
>> monitors with
>>
>> ceph daemon mon.<id> version
>
>
> So with some suggestions from people on IRC last night, it seems that
> several nodes didn't get librados upgraded, but still had 0.72. I'm not
> entirely sure how this happened, but I had to use yum-transaction to sort
> out the fact that python-librados went away for 0.80, and it's quite
> possible that I made a mistake and didn't upgrade these libraries. After
> manually getting all of the libraries up to date the problems went away.
>
>>
>>
>> >
>> > At this point, I'm reluctant to continue without some advice from someone
>> > else. I can certainly try upgrading the kernel on the rbd client, but I'm
>> > worried I may just make things worse.
>>
>> Upgrading the kernel won't make things worse, it's just a client.  I'm
>> pretty sure we can make this work with 3.2, but if you actually plan on
>> using krbd for anything serious, I'd recommend an upgrade to 3.14.
>>
>> 3.13 will do too, if you don't plan on having any erasure pools in your
>> cluster.
>
>
> I went ahead and upgraded to 3.15 and it sorted out the problems with the
> client.

I hate to tell you this, but due to a subtle change in the kernel's
low-level primitives, rbd in 3.15 is prone to deadlocks.  This will be
fixed in future 3.15 stable releases, but a couple of people have
already run into these deadlocks, and they are easy to reproduce under
higher-than-average load, so you might want to downgrade to 3.14 and do

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

to make "optimal" work with 3.14.
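
For context, the "missing" masks in the logs quoted above can be
decoded into feature bit numbers.  A small sketch, assuming a POSIX
shell with 64-bit arithmetic; feature_bits is a hypothetical helper,
and the bit-to-name mapping in the comments is from my reading of
ceph_features.h, so double-check it against your source tree:

```shell
# Hypothetical helper: list the set bit positions of a hex feature mask,
# lowest bit first, separated by spaces.
feature_bits() {
    mask=$((0x$1))
    bit=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$bit"
        fi
        mask=$((mask >> 1))
        bit=$((bit + 1))
    done
    echo "$out"
}

# Mask reported by the 3.2 kernel client (libceph lines in dmesg).
# Assumed names: 18 = CRUSH_TUNABLES, 25 = CRUSH_TUNABLES2,
# 30 = OSDHASHPSPOOL, 41 = CRUSH_TUNABLES3.
feature_bits 20042040000   # prints: 18 25 30 41

# Mask reported by the stale 0.72 userspace library on the monitor node.
feature_bits 20000000000   # prints: 41
```

If that mapping is right, bit 41 is the one tied to chooseleaf_vary_r,
which would explain why setting it to 0 lets a 3.14 client connect
while the rest of "optimal" stays in place.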

Thanks,

                Ilya