0.80.5-1precise Not Able to Map RBD & CephFS

So I've been having a seemingly similar problem, and while trying to follow
the steps in this thread, things have gone very south for me.

Kernel on OSDs and MONs: 2.6.32-431.20.3.0.1.el6.centos.plus.x86_64 #1 SMP
Wed Jul 16 21:27:52 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Kernel on RBD host: 3.2.0-61-generic #93-Ubuntu SMP Fri May 2 21:31:50 UTC
2014 x86_64 x86_64 x86_64 GNU/Linux

All nodes are running Ceph 0.80.5.

I updated the tunables as per this article:
http://cephnotes.ksperis.com/blog/2014/01/16/set-tunables-optimal-on-ceph-crushmap

Here's what's happening:

1) On the RBD client node, trying to map an image produces:
$ sudo rbd --conf /etc/ceph/mia1.conf --keyring
/etc/ceph/mia1.client.admin.keyring map poolname
rbd: add failed: (5) Input/output error

Dmesg:

[331172.147289] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331172.154059] libceph: mon0 10.103.11.132:6789 missing required protocol
features
[331182.176604] libceph: mon1 10.103.11.141:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331182.183535] libceph: mon1 10.103.11.141:6789 missing required protocol
features
[331192.192630] libceph: mon2 10.103.11.152:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331192.199810] libceph: mon2 10.103.11.152:6789 missing required protocol
features
[331202.209324] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331202.216957] libceph: mon0 10.103.11.132:6789 missing required protocol
features
[331212.224540] libceph: mon0 10.103.11.132:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331212.232276] libceph: mon0 10.103.11.132:6789 missing required protocol
features
[331222.240605] libceph: mon2 10.103.11.152:6789 feature set mismatch, my 2
< server's 20042040002, missing 20042040000
[331222.248660] libceph: mon2 10.103.11.152:6789 missing required protocol
features

However, running
$ sudo rbd --conf /etc/ceph/mia1.conf --keyring
/etc/ceph/mia1.client.admin.keyring ls
poolname

works fine and shows the expected pool name.
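For what it's worth, the "missing 20042040000" value in those dmesg lines is a bitmask of the protocol feature bits the client kernel lacks. A quick sketch to see which bit positions are set (mapping bit numbers to feature names requires checking Ceph's feature table for your version, so treat any such mapping as an assumption):

```shell
# Decode the "missing 20042040000" bitmask from dmesg into individual
# feature bit numbers. Per Ceph's feature table these appear to be
# CRUSH-tunables-related bits, but verify against your source tree.
missing=$((0x20042040000))
bits=""
b=0
while [ "$b" -lt 64 ]; do
    if [ $(( (missing >> b) & 1 )) -eq 1 ]; then
        bits="$bits $b"
    fi
    b=$((b + 1))
done
echo "missing feature bits:$bits"   # bits 18, 25, 30 and 41
```

That at least confirms the mismatch is about the newer CRUSH tunables the 3.2 kernel doesn't understand, rather than something wrong with the cluster itself.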

2) On the monitor where I ran the command to update the tunables, I can no
longer run the ceph console:
$ ceph -c /etc/ceph/mia1.conf --keyring /etc/ceph/mia1.client.admin.keyring
2014-08-01 17:32:05.026960 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42360 s=1 pgs=0 cs=0 l=1
c=0x7f21900286a0).connect protocol feature mismatch, my fffffffff < peer
20fffffffff missing 20000000000
2014-08-01 17:32:05.027024 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42360 s=1 pgs=0 cs=0 l=1
c=0x7f21900286a0).fault
2014-08-01 17:32:05.027544 7f21943d2700  0 -- 10.103.11.132:0/1030058 >>
10.103.11.141:6789/0 pipe(0x7f2190028440 sd=3 :42361 s=1 pgs=0 cs=0 l=1
c=0x7f21900286a0).connect protocol feature mismatch, my fffffffff < peer
20fffffffff missing 20000000000

and it just keeps spitting out a similar message. However, I *can* run the
ceph console and execute basic commands (status, at the very least) from
other nodes.

At this point, I'm reluctant to continue without some advice from someone
else. I can certainly try upgrading the kernel on the RBD client, but I'm
worried I may just make things worse.
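In case it helps anyone who ends up in the same spot: since the symptom is an old client kernel rejecting newer tunables rather than a broken cluster, one possible way out (assuming you can tolerate the data movement it triggers and don't need the newer CRUSH behaviour) is to revert the tunables from a node where the ceph CLI still works:

```shell
# Sketch only: revert CRUSH tunables to a profile that old kernels
# understand. This triggers data rebalancing, so expect recovery I/O.
ceph -c /etc/ceph/mia1.conf \
     --keyring /etc/ceph/mia1.client.admin.keyring \
     osd crush tunables legacy

# Then confirm the client can map again:
sudo rbd --conf /etc/ceph/mia1.conf \
     --keyring /etc/ceph/mia1.client.admin.keyring map poolname
```

The alternative, of course, is to upgrade the client kernel to one that supports the optimal profile, as discussed below.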

All the best,

~ Christopher


On Fri, Aug 1, 2014 at 1:34 PM, Larry Liu <larryliugml at gmail.com> wrote:

> Hi Ilya,  thank you sooooo much! I didn't know my crush map was all messed
> up. Now all is working! I guess it would have worked even without upgrading
> the kernel from 3.2 to 3.13.
>
>
> On Aug 1, 2014, at 12:48 PM, Ilya Dryomov <ilya.dryomov at inktank.com>
> wrote:
>
> > On Fri, Aug 1, 2014 at 10:32 PM, Larry Liu <larryliugml at gmail.com>
> wrote:
> >> crushmap file is attached. I'm running kernel 3.13.0-29-generic after
> >> another person suggested it, but the kernel upgrade didn't fix anything
> >> for me. Thanks!
> >
> > So there are two problems.  First, you either have erasure-coded pools
> > or had them in the past.  Unfortunately, there is currently a bug that
> > prevents the kernel client from working in these circumstances even if
> > you are pointing it at "normal" replicated pools, such as rbd.  Your
> > options are to either upgrade to kernel 3.14 or remove all erasure-coded
> > pools and the erasure rule.
> >
> > ceph osd pool delete foo
> > ceph osd pool delete bar
> > ceph osd crush rule rm erasure-code
> >
> > Regardless of whether you upgrade to 3.14 or choose to get rid of your
> > erasure pools, you'll also have to run
> >
> > ceph osd getcrushmap -o /tmp/crush
> > crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
> > ceph osd setcrushmap -i /tmp/crush.new
> >
> > to take care of the second problem.
> >
> > Thanks,
> >
> >                Ilya
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: crush
Type: application/octet-stream
Size: 781 bytes
Desc: not available
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140801/94acc811/attachment.obj>

