Hi Sage,

Thanks, after thrashing it became a little bit better,
but it is not yet healthy.

ceph -s: http://pastebin.com/vD28FJ4A
ceph osd dump: http://pastebin.com/37FLNxd7
ceph pg dump: http://pastebin.com/pccdg20j

(osd.0 and osd.1 are not running. I issued some "osd in" commands.
osd.4 is running but marked down/out: what is "autoout"?)

After thrashing several times (maybe I thrashed it too much?),
the OSD cluster really churned a lot, as shown in
ceph -w: http://pastebin.com/fjeqrhxp

I thought the OSDs' osdmap epoch was around 4900 (judging from
data/current/meta), but it took 6 or 7 runs of the osd thrash command
before it seemed to work on something, and the epoch reached over 10000.
Now I see "deep scrub ok" from time to time in ceph -w.
But the PGs are still in the 'creating' state, and nothing actually
seems to be created.

I removed and re-created the pools, because the number of PGs was
incorrect, and that changed the pool IDs from 0,1,2 to 3,4,5.
Is this causing the problem?

By the way, the MDS crashes with the cluster in this state.
ceph-mds.2.log: http://pastebin.com/Ruf5YB8d

Any suggestion is really appreciated.
Thanks.

regards,
Yasu

From: Sage Weil <sage@xxxxxxxxxxx>
Subject: Re: OSDMap problem: osd does not exist.
Date: Wed, 18 Sep 2013 19:58:16 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1309181956020.23507@xxxxxxxxxxxxxxxxxx>

> Hey,
>
> On Wed, 18 Sep 2013, Yasuhiro Ohara wrote:
>>
>> Hi,
>>
>> My OSDs are not joining the cluster correctly,
>> because the nonce they assume and the one they receive from the peer
>> are different. It says "wrong node" because the entity_id_t peer_addr
>> (i.e., the combination of the IP address, port number, and the nonce)
>> is different.
>>
>> Now, my questions are:
>> 1. Are the nonces of OSD peer addrs kept in the osdmap?
>> 2. (If so) can I modify the nonce value?
>>
>> More generally, how can I fix the cluster if I blew away the mon data?
>>
>> Below I'd like to summarize what I did.
>> - I tried to upgrade from 0.57 to 0.67.3.
>> - The mon protocol is different, and the mon data format also seemed
>>   different (changed to use leveldb?), so I restarted all mons.
>> - The mon data upgrade did not go well because of a full disk,
>>   but I didn't notice the cause and stupidly tried to start the mon
>>   from scratch, rebuilding the mon data (mon --mkfs). (I solved the
>>   full disk problem later.)
>> - Now there are no OSDs existing in the cluster (i.e., in the osdmap).
>> - I added OSD configurations using "ceph osd create".
>> - The OSDs still do not recognize each other; they do not become peers.
>> - (The OSDs still seem to hold the previous PG data, and loading it
>>   works fine. So I assume I can still recover the data.)
>>
>> Does anyone have any advice on this?
>> I'm planning to modify the source code, for lack of any other choice,
>> so that the OSDs ignore nonce values :(
>
> The nonce value is important; you can't just ignore it. If the addr in
> the osdmap isn't changing, it is probably because the mon thinks the
> latest osdmap is N and the OSDs think the latest is much greater than N.
> I would look in the osd data/current/meta directory and see what the
> newest osdmap epoch is, compare that to 'ceph osd dump', and then do
> 'ceph osd thrash N' to make it churn through a bunch of maps to get to
> an epoch that is greater than what the OSDs see. Once that happens, the
> osd boot messages will properly update the cluster osdmap with their new
> addr and things should start up. Until then, the OSDs will just sit and
> wait for a map newer than what they have, which will never come...
>
> sage
>
>>
>> Thanks in advance.
>>
>> regards,
>> Yasu
>>
>> From: Yasuhiro Ohara <yasu@xxxxxxxxxxxx>
>> Subject: Re: OSDMap problem: osd does not exist.
>> Date: Thu, 12 Sep 2013 09:45:51 -0700 (PDT)
>> Message-ID: <20130912.094551.06710597.yasu@xxxxxxxxxxxx>
>>
>> >
>> > Hi Joao,
>> >
>> > Thank you for the response.
>> > I meant "ceph-mon -i X --mkfs".
>> >
>> > Actually I did it on 3 nodes. On the other 2 mon nodes, the original
>> > mon data were left in place, but currently all 5 nodes run ceph-mon
>> > again. Should I not have done that?
>> >
>> > regards,
>> > Yasu
>> >
>> > From: Joao Eduardo Luis <joao.luis@xxxxxxxxxxx>
>> > Subject: Re: OSDMap problem: osd does not exist.
>> > Date: Thu, 12 Sep 2013 11:35:40 +0100
>> > Message-ID: <523198FC.8050602@xxxxxxxxxxx>
>> >
>> >> On 09/12/2013 09:35 AM, Yasuhiro Ohara wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> recently I tried to upgrade from 0.57 to 0.67.3, hit the changes
>> >>> to the mon protocol, and so I updated all 5 of the mons.
>> >>> After upgrading the mons (and while debugging other problems),
>> >>> I removed and re-created the mon filesystem from scratch.
>> >>
>> >> What do you mean by this? Did you recreate the file system on all
>> >> 5 monitors? Did you back up any of your previous mon data directories?
>> >>
>> >> -Joao
>> >>
>> >> --
>> >> Joao Eduardo Luis
>> >> Software Engineer | http://inktank.com | http://ceph.com
>> >> _______________________________________________
>> >> ceph-users mailing list
>> >> ceph-users@xxxxxxxxxxxxxx
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
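
For reference, the epoch comparison described in Sage's quoted reply
(newest osdmap epoch under the OSD's data/current/meta versus the epoch
reported by 'ceph osd dump') could be sketched roughly as below. This is
only an illustration under assumptions, not a tested recovery tool: it
assumes the object file names under meta contain "osdmap.<epoch>", that
the dump output has a line of the form "epoch <number>", and that the N
in 'ceph osd thrash N' is the number of epochs to churn through. The
function names and the safety margin are made up for this example.

```python
import os
import re

# Assumption: object files under data/current/meta carry "osdmap.<epoch>"
# somewhere in their names (full and incremental maps alike).
_FILE_EPOCH_RE = re.compile(r"osdmap\.(\d+)")
# Assumption: 'ceph osd dump' output has a line of the form "epoch <number>".
_DUMP_EPOCH_RE = re.compile(r"^epoch (\d+)", re.MULTILINE)


def newest_osdmap_epoch(meta_dir):
    """Return the highest osdmap epoch seen in file names under meta_dir, or None."""
    newest = None
    for _root, _dirs, files in os.walk(meta_dir):
        for name in files:
            match = _FILE_EPOCH_RE.search(name)
            if match:
                epoch = int(match.group(1))
                if newest is None or epoch > newest:
                    newest = epoch
    return newest


def parse_dump_epoch(dump_text):
    """Return the epoch from saved 'ceph osd dump' output, or None if absent."""
    match = _DUMP_EPOCH_RE.search(dump_text)
    return int(match.group(1)) if match else None


def epochs_to_thrash(osd_epoch, mon_epoch, margin=10):
    """Epochs the mon must advance to get past the OSDs' newest epoch.

    Returns 0 when the mon is already ahead. The margin is an arbitrary
    cushion chosen for this sketch.
    """
    gap = osd_epoch - mon_epoch
    return gap + margin if gap >= 0 else 0
```

With an OSD-side epoch of about 4900 and a freshly rebuilt mon at a low
epoch, epochs_to_thrash() gives a figure slightly larger than the gap; the
script deliberately only computes the number and does not run any ceph
command itself.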