2011/7/18 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 18 Jul 2011, Christian Brunner wrote:
>> >> >> $ ceph pg dump -o - | grep crashed
>> >> >> pg_stat objects mip degr unf kb bytes log disklog state v reported up acting last_scrub
>> >> >> 1.1ac 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'576 [3,13] [3,13] 0'0 2011-07-13 17:04:30.221618
>> >> >> 0.1ad 0 0 0 0 0 0 198 198 crashed+peering 3067'1194 5869'515 [3,13] [3,13] 3067'1194 2011-07-13 17:04:29.221726
>> >> >> 2.1ab 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'576 [3,13] [3,13] 0'0 2011-07-13 17:04:31.222145
>> >> >> 1.6c 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'577 [3,13] [3,13] 0'0 2011-07-13 17:05:35.237286
>> >> >> 0.6d 0 0 0 0 0 0 198 198 crashed+peering 3067'636 5869'516 [3,13] [3,13] 3067'636 2011-07-13 17:05:34.237024
>> >> >> 2.6b 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'577 [3,13] [3,13] 0'0 2011-07-13 17:05:37.238474
>> >
>> > Strange, none of these PGs show up in those logs. Can you do
>> >
>> > ceph pg map 1.1ac
>> >
>> > for each PG and see where the current CRUSH map thinks they should be
>> > stored? That would be the node to look for them on. You may also want
>> > to look for $osd_data/current/$pgid_head on all the OSDs to see where
>> > the copies are.
>> >
>> > The location in the pg dump (from the monitors' PGMap) is just the
>> > last reported location. The primary for each PG normally sends stats
>> > updates several times a minute for PGs that are touched (and less
>> > frequently for those that are not). So it's not necessarily bad that
>> > it doesn't match... but it is strange that no surviving copy is
>> > reporting updated information.
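The per-PG check Sage suggests can be looped over all six stuck PGs. A minimal dry-run sketch (it only prints the commands; pipe the output to sh on a node with monitor access to actually run them):

```shell
# The six PGs stuck in crashed+peering, taken from the pg dump above.
pgs="1.1ac 0.1ad 2.1ab 1.6c 0.6d 2.6b"

# Print a "ceph pg map" command for each PG. Running these shows where
# the current CRUSH map places each PG (up/acting sets).
for pg in $pgs; do
    echo "ceph pg map $pg"
done
```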
>> pg dump matches the data from pg map:
>>
>> 2011-07-18 09:41:02.340371 mon <- [pg,map,1.1ac]
>> 2011-07-18 09:41:02.410063 mon0 -> 'osdmap e6517 pg 1.1ac (1.1ac) -> up [3,13] acting [3,13]' (0)
>> 2011-07-18 09:41:02.434859 mon <- [pg,map,0.1ad]
>> 2011-07-18 09:41:02.435546 mon1 -> 'osdmap e6517 pg 0.1ad (0.1ad) -> up [3,13] acting [3,13]' (0)
>> 2011-07-18 09:41:02.442316 mon <- [pg,map,2.1ab]
>> 2011-07-18 09:41:02.442839 mon1 -> 'osdmap e6517 pg 2.1ab (2.1ab) -> up [3,13] acting [3,13]' (0)
>> 2011-07-18 09:41:02.449131 mon <- [pg,map,1.6c]
>> 2011-07-18 09:41:02.449679 mon2 -> 'osdmap e6517 pg 1.6c (1.6c) -> up [3,13] acting [3,13]' (0)
>> 2011-07-18 09:41:02.455090 mon <- [pg,map,0.6d]
>> 2011-07-18 09:41:02.455429 mon0 -> 'osdmap e6517 pg 0.6d (0.6d) -> up [3,13] acting [3,13]' (0)
>> 2011-07-18 09:41:02.461530 mon <- [pg,map,2.6b]
>> 2011-07-18 09:41:02.462012 mon2 -> 'osdmap e6517 pg 2.6b (2.6b) -> up [3,13] acting [3,13]' (0)
>>
>> I've also looked at the filesystem: the $pgid_head directories exist
>> neither on osd003 nor on osd013.

> Does it exist on any other nodes?

No, it doesn't exist on any node.

> Did the osd crash you mentioned happen at the end (when you started
> seeing these 6 pgs misbehave), or did it recover fully after that, and
> only do this after a later OSD was reformatted?

The crash happened during the rebuild. The rebuild finished with these 6
PGs in state "crashed+peering". Everything else was fine (no degraded
objects).

My suspicion is that these PGs were skipped in the rebuild from osd013
to osd003 because of the crash. After that I reformatted osd013, which
might explain why these PGs are missing on osd013, too.

Is there a way to create a PG manually?

Thanks,
Christian
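The cluster-wide filesystem check discussed above (looking for surviving $pgid_head copies on every node) can be scripted. A dry-run sketch that prints one listing command per host and PG; the hostnames and the osd data path are placeholders for this cluster's actual values:

```shell
# PGs to look for; hosts are examples -- extend to every OSD host.
pgs="1.1ac 0.1ad 2.1ab 1.6c 0.6d 2.6b"
hosts="osd003 osd013"

# Print an ssh command per host/PG that would list any surviving
# *_head directory. Replace /path/to/osd_data with the real $osd_data
# location before piping the output to sh.
for host in $hosts; do
    for pg in $pgs; do
        echo "ssh $host ls -d /path/to/osd_data/current/${pg}_head"
    done
done
```

Any host where the listing succeeds still holds a copy of that PG's data.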