On Tue, 26 Jul 2011, Christian Brunner wrote:
> OK, I've solved this by myself.
>
> Since I knew that there is replication between
>
> osd001 and osd005,
>
> as well as
>
> osd001 and osd015,
> osd001 and osd012,
>
> I decided to take osd005, osd012 and osd015 offline. After that ceph
> started to rebuild the PGs on other nodes.

At the same time, you mean? Or did you just restart them?

The usual way to debug these situations is:

 - Identify a stuck pg.
 - Figure out which osds it maps to (here, [15,1]).
 - Turn up logging on those nodes:

     ceph osd tell 15 injectargs '--debug-osd 20 --debug-ms 1'
     ceph osd tell 1 injectargs '--debug-osd 20 --debug-ms 1'

 - Restart peering by toggling the primary (the first osd in the list, 15):

     ceph osd down 15

 - Send us the resulting logs (for all nodes).

Even better if you also include in this the other (old) osds that still have
data for the pg (osd1 in your case).

We definitely want to fix the core issue, so any help gathering the logs
would be appreciated!

It's also possible that the above will 'fix' it, because the peering issue is
hard to hit. In that case, cranking up the debug level after the initial
crash, but before you restart everything, might be a good idea.

Thanks!
sage

>
> Everything is fine now.
>
> Regards,
> Christian
>
> 2011/7/26 Christian Brunner <chb@xxxxxx>:
> > Another kernel crash, another invalid ceph state...
> >
> > A memory allocation failure in the kernel (ixgbe) of one OSD server
> > led to a domino effect in our ceph cluster with "0 up, 0 in".
> >
> > When I restarted the cluster, everything came up again. But I still
> > have 6 peering PGs:
> >
> > pg v5898472: 3712 pgs: 3706 active+clean, 6 peering; 745 GB data,
> > 775 GB used, 57642 GB / 59615 GB avail
> >
> > # ceph pg dump -o - | grep peering
> > 0.190   22   0  0  0  90112    92274688    200  200  peering  6500'1256    7167'1063   [15,1]  [15,1]  6500'1256    2011-07-22 11:22:55.798745
> > 3.18d   385  0  0  0  1529498  1566204928  300  300  peering  7013'134376  7167'20162  [15,1]  [15,1]  6933'132427  2011-07-22 11:22:56.488471
> > 0.4c    9    0  0  0  36864    37748736    200  200  peering  6500'673     7163'1095   [12,1]  [12,1]  6500'673     2011-07-22 11:22:20.226119
> > 3.49    171  0  0  0  671467   687580272   295  295  peering  7013'10276   7163'2879   [12,1]  [12,1]  6933'9455    2011-07-22 11:22:20.701854
> > 0.35e   6    0  0  0  24576    25165824    200  200  peering  6500'628     7163'1142   [12,1]  [12,1]  6500'628     2011-07-22 11:22:19.267804
> > 3.35b   198  0  0  0  791800   810803200   297  297  peering  7013'66727   7163'5759   [12,1]  [12,1]  6933'65715   2011-07-22 11:22:20.035265
> >
> > "ceph pg map" is consistent with "ceph pg dump":
> >
> > # ceph pg map 0.190
> > 2011-07-26 08:46:19.330981 mon <- [pg,map,0.190]
> > 2011-07-26 08:46:19.331981 mon1 -> 'osdmap e7273 pg 0.190 (0.190) -> up [15,1] acting [15,1]' (0)
> >
> > But directories for the PGs are present on multiple nodes (for example
> > on osd005 for 0.190):
> >
> > /ceph/osd.001/current/0.190_head
> > /ceph/osd.001/snap_1650435/0.190_head
> > /ceph/osd.001/snap_1650445/0.190_head
> > /ceph/osd.005/current/0.190_head
> > /ceph/osd.005/snap_1572317/0.190_head
> > /ceph/osd.005/snap_1572323/0.190_head
> > /ceph/osd.015/current/0.190_head
> > /ceph/osd.015/snap_1467152/0.190_head
> >
> > Any hint on how to proceed would be great.
> >
> > Thanks,
> > Christian
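
To recap, the debug sequence above collected into a single shell sketch. The
pg id 0.190 and osds 15/1 are just the values from the dump in this thread;
substitute whatever your own 'ceph pg dump' reports as stuck. The log
location is an assumption about a default setup.

    # find the stuck pgs and the osds they map to
    ceph pg dump -o - | grep peering
    ceph pg map 0.190      # prints something like: up [15,1] acting [15,1]

    # crank up logging on those osds
    ceph osd tell 15 injectargs '--debug-osd 20 --debug-ms 1'
    ceph osd tell 1 injectargs '--debug-osd 20 --debug-ms 1'

    # restart peering by toggling the primary (the first osd in the set)
    ceph osd down 15

    # then gather the osd logs from all involved nodes (by default they
    # land under /var/log/ceph/) and send them along

All of the commands are taken verbatim from the messages above.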