Re: Issues with fresh 0.93 OSD adding to existing cluster

I think I might have found the issue....

Something is wrong with my crush map.

I was just attempting to modify it:

microserver-1:~ #  ceph osd getcrushmap -o /tmp/cm
got crush map from osdmap epoch 3937
microserver-1:~ # crushtool -d /tmp/cm -o /tmp/cm.txt
microserver-1:~ # vim /tmp/cm.txt 
microserver-1:~ # crushtool -c /tmp/cm.txt -o /tmp/cm.new
microserver-1:~ # ceph osd setcrushmap -i /tmp/cm.new 
Error EINVAL: Failed to parse crushmap: buffer::end_of_buffer
microserver-1:~ # crushtool -c /tmp/cm.txt -o /tmp/cm.new
microserver-1:~ # ceph osd setcrushmap -i /tmp/cm.new 
Error EPERM: Failed to parse crushmap: error running crushmap through crushtool: (1) Operation not permitted

It looks as though something is missing from, or broken in, my crush map. This cluster has been around for at least two years and has been upgraded through each new Ceph release.
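
One way to narrow this down might be to check whether the map survives a decompile/compile round trip before handing it to the monitors. The following is only a sketch under assumptions: `crush_roundtrip` and the `CRUSHTOOL` variable are hypothetical names introduced here, and it uses nothing beyond the `crushtool -d` and `crushtool -c` invocations shown above. A mismatch is a hint that something is being lost in translation, not proof of where the fault lies.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): decompile a compiled CRUSH map,
# recompile the text, decompile the result again, and compare both texts.
# CRUSHTOOL is an assumed override so another binary can be substituted.
CRUSHTOOL="${CRUSHTOOL:-crushtool}"

crush_roundtrip() {
    src="$1"
    tmp="$(mktemp -d)" || return 1
    # Each step only runs if the previous one succeeded.
    "$CRUSHTOOL" -d "$src" -o "$tmp/a.txt" &&
    "$CRUSHTOOL" -c "$tmp/a.txt" -o "$tmp/b.bin" &&
    "$CRUSHTOOL" -d "$tmp/b.bin" -o "$tmp/b.txt" &&
    cmp -s "$tmp/a.txt" "$tmp/b.txt"
    rc=$?
    rm -rf "$tmp"
    return "$rc"
}
```

With the file from the session above this would be called as `crush_roundtrip /tmp/cm`, and a non-zero exit status means either a crushtool step failed or the two decompiled texts differ.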




-----Original Message-----
From: Malcolm Haak 
Sent: Wednesday, 18 March 2015 12:53 PM
To: Malcolm Haak; Joao Eduardo Luis; ceph-users@xxxxxxxxxxxxxx
Subject: RE:  Issues with fresh 0.93 OSD adding to existing cluster

Sorry to bump this one, but I have more hardware coming and I still cannot add another OSD to my cluster.

Does anybody have any clues?

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Malcolm Haak
Sent: Friday, 13 March 2015 10:05 AM
To: Joao Eduardo Luis; ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Issues with fresh 0.93 OSD adding to existing cluster

Sorry about this,

I sent this at 1AM last night and went to bed. I didn't realise the log was far too long and that the email had been blocked.

I've reattached all the requested files and trimmed the body of the email. 

Thank you again for looking at this.

-----Original Message-----
From: Malcolm Haak
Sent: Friday, 13 March 2015 1:38 AM
To: 'Joao Eduardo Luis'; ceph-users@xxxxxxxxxxxxxx
Subject: RE:  Issues with fresh 0.93 OSD adding to existing cluster

Ok,

So, I've been doing things in the meantime, and as a result the OSD is now requesting 3008 and 3009 instead of 2758/9. I've included the problem OSD's log file.

I've also attached all the osdmaps as requested.

Regards

Malcolm Haak

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Joao Eduardo Luis
Sent: Friday, 13 March 2015 1:02 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Issues with fresh 0.93 OSD adding to existing cluster

On 03/12/2015 05:16 AM, Malcolm Haak wrote:
> Sorry about all the unrelated grep issues..
>
> So I've rebuilt and reinstalled and it's still broken.
>
> On the working node, even with the new packages, everything works.
> On the new broken node, I've added a mon and it works. But I still cannot start an OSD on the new node.
>
> What else do you need from me? I'll get logs and run any number of tests.
>
> I've got data in this cluster already, and it's full, so I need to expand it. I've already got the hardware.
>
> Thanks in advance for even having a look.

Sam mentioned to me on IRC that the next step would be to grab the offending osdmaps.  Easiest way for that will be to stop a monitor and run 'ceph-monstore-tool' in order to obtain the full maps, and then use 'ceph-kvstore-tool' to obtain incrementals.

Given the osd is crashing on version 2759, the following would be best:

(Assuming you have stopped a given monitor with id FOO, whose store is sitting at default path /var/lib/ceph/mon/ceph-FOO)

ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2758 --out /tmp/osdmap.full.2758

ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2759 --out /tmp/osdmap.full.2759

(please note the '--' between 'osdmap' and '--version', as that is required for the tool to do its thing)

and then

ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2758 out /tmp/osdmap.inc.2758

ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2759 out /tmp/osdmap.inc.2759
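
If more than a couple of epochs are needed, the same four commands can be generated rather than typed. This is a minimal sketch, not something shipped with Ceph: `osdmap_dump_cmds` is a hypothetical helper name, and the monitor id and epoch list are placeholders. It only prints the commands in the form given above, so they can be reviewed and run by hand once the monitor is stopped.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the full and incremental osdmap
# extraction commands for each epoch given on the command line.
# Usage: osdmap_dump_cmds MON_ID EPOCH [EPOCH...]
osdmap_dump_cmds() {
    mon_id="$1"
    shift
    # Default monitor store path, as assumed in the message above.
    store="/var/lib/ceph/mon/ceph-$mon_id"
    for epoch in "$@"; do
        echo "ceph-monstore-tool $store get osdmap -- --version $epoch --out /tmp/osdmap.full.$epoch"
        echo "ceph-kvstore-tool $store/store.db get osdmap $epoch out /tmp/osdmap.inc.$epoch"
    done
}

# Example: the two epochs discussed above, for a monitor with id FOO.
osdmap_dump_cmds FOO 2758 2759
```

Printing instead of executing is deliberate, since the tools expect the monitor to be stopped first and that step is best done manually.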

Cheers!

   -Joao


>
>
> -----Original Message-----
> From: Samuel Just [mailto:sjust@xxxxxxxxxx]
> Sent: Wednesday, 11 March 2015 1:41 AM
> To: Malcolm Haak; jluis@xxxxxxxxxx
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Issues with fresh 0.93 OSD adding to existing cluster
>
> Joao, it looks like map 2759 is causing trouble, how would he get the 
> full and incremental maps for that out of the mons?
> -Sam
>
> On Tue, 2015-03-10 at 14:12 +0000, Malcolm Haak wrote:
>> Hi Samuel,
>>
>> The sha1? I'm going to admit ignorance as to what you are looking for. They are all running the same release if that is what you are asking.
>> Same tarball built into rpms using rpmbuild on both nodes...
>> Only difference being that the other node has been upgraded and the problem node is fresh.
>>
>> I've added the requested config. Here is the command line output:
>>
>> microserver-1:/etc # /etc/init.d/ceph start osd.3 
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
