$ ceph pg dump -o - | grep crashed
pg_stat objects mip degr unf kb bytes log disklog state v reported up acting last_scrub
1.1ac 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'576 [3,13] [3,13] 0'0 2011-07-13 17:04:30.221618
0.1ad 0 0 0 0 0 0 198 198 crashed+peering 3067'1194 5869'515 [3,13] [3,13] 3067'1194 2011-07-13 17:04:29.221726
2.1ab 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'576 [3,13] [3,13] 0'0 2011-07-13 17:04:31.222145
1.6c 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'577 [3,13] [3,13] 0'0 2011-07-13 17:05:35.237286
0.6d 0 0 0 0 0 0 198 198 crashed+peering 3067'636 5869'516 [3,13] [3,13] 3067'636 2011-07-13 17:05:34.237024
2.6b 0 0 0 0 0 0 0 0 crashed+peering 0'0 5869'577 [3,13] [3,13] 0'0 2011-07-13 17:05:37.238474
Strange, none of these PGs show up in those logs. Can you do
ceph pg map 1.1ac
for each PG and see where the current CRUSH map thinks they should be
stored? That would be the node to look for them on. You may also want to
look for $osd_data/current/${pgid}_head on all the OSDs to see where the
copies are.
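The two checks above could be scripted roughly as follows. This is only a sketch: the PG ids are the six from the pg dump output, the function names map_pgs and check_pg_dirs are made up for illustration, and the OSD data directory passed to check_pg_dirs is whatever 'osd data' points to on each node (not stated in this thread).

```shell
#!/bin/sh
# PG ids taken from the pg dump output in this thread.
PGIDS="1.1ac 0.1ad 2.1ab 1.6c 0.6d 2.6b"

# Ask the monitors where the current CRUSH map places each PG.
# (hypothetical helper name; it only wraps 'ceph pg map')
map_pgs() {
    for pg in $PGIDS; do
        ceph pg map "$pg"
    done
}

# Report whether <osd_data>/current/<pgid>_head exists for each PG.
# $1 is the OSD data directory on this node (an assumption; adjust
# it to your own 'osd data' setting).
check_pg_dirs() {
    osd_data=$1
    for pg in $PGIDS; do
        if [ -d "$osd_data/current/${pg}_head" ]; then
            echo "$pg present"
        else
            echo "$pg missing"
        fi
    done
}
```

Run map_pgs once from any node with a working ceph client, and check_pg_dirs with the local data directory on every OSD host, not just the ones listed in up/acting.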
The location in the pg dump (from the monitors' PGMap) is just the last
reported location. Primaries for each PG normally send stats updates
several times a minute for each PG that is touched (and less frequently
for those that are not). So it's not necessarily bad that it doesn't
match... but it is strange that no surviving copy is reporting updated
information.
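One way to see how old the last report is would be to compare the "reported" column with the current osdmap epoch. The sketch below assumes the part of "reported" before the apostrophe is the epoch of the last stats update, and that each PG record is on a single line as ceph prints it (field 10 = state, field 12 = reported). The function name stale_pgs is made up for illustration.

```shell
#!/bin/sh
# Print "<pgid> <reported epoch>" for crashed PGs whose last stats report
# is older than the given osdmap epoch.
# Assumption: "reported" has the form epoch'seq.
stale_pgs() {
    current_epoch=$1
    awk -v cur="$current_epoch" '
        $10 ~ /crashed/ {
            # \047 is the single quote separating epoch and seq
            split($12, r, "\047")
            if (r[1] + 0 < cur + 0) print $1, r[1]
        }'
}
```

Usage would be something like: ceph pg dump -o - | stale_pgs 6517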
pg dump matches the data from pg map:
2011-07-18 09:41:02.340371 mon<- [pg,map,1.1ac]
2011-07-18 09:41:02.410063 mon0 -> 'osdmap e6517 pg 1.1ac (1.1ac) -> up [3,13] acting [3,13]' (0)
2011-07-18 09:41:02.434859 mon<- [pg,map,0.1ad]
2011-07-18 09:41:02.435546 mon1 -> 'osdmap e6517 pg 0.1ad (0.1ad) -> up [3,13] acting [3,13]' (0)
2011-07-18 09:41:02.442316 mon<- [pg,map,2.1ab]
2011-07-18 09:41:02.442839 mon1 -> 'osdmap e6517 pg 2.1ab (2.1ab) -> up [3,13] acting [3,13]' (0)
2011-07-18 09:41:02.449131 mon<- [pg,map,1.6c]
2011-07-18 09:41:02.449679 mon2 -> 'osdmap e6517 pg 1.6c (1.6c) -> up [3,13] acting [3,13]' (0)
2011-07-18 09:41:02.455090 mon<- [pg,map,0.6d]
2011-07-18 09:41:02.455429 mon0 -> 'osdmap e6517 pg 0.6d (0.6d) -> up [3,13] acting [3,13]' (0)
2011-07-18 09:41:02.461530 mon<- [pg,map,2.6b]
2011-07-18 09:41:02.462012 mon2 -> 'osdmap e6517 pg 2.6b (2.6b) -> up [3,13] acting [3,13]' (0)
I've also looked at the filesystem: the ${pgid}_head directories exist
on neither osd003 nor osd013.