Hi Sage Thanks for your pointers in troubleshooting. Your assumption is correct. Here are the findings for osd.58 OSD maps from epoch 289500 to 289568 ARE PRESENT , starting from 289569 OSD maps are MISSING. The last available OSD map is 350946. All the in-between OSD maps are missing. I believe this would be the problem for rest other OSDs as well. The up gradation was not intentional and we do not uses CephFS , so i think we can go ahead for downgrade back to Firefly major release. It would be nice if you can suggest the downgrade procedure. (i have not done it before) Thanks again for your time and suggestion. - Karan Singh - On 25 May 2014, at 18:41, Sage Weil <sage at inktank.com> wrote: > On Sun, 25 May 2014, Sage Weil wrote: >> Hi Karan, >> >> Can you confirm that the next several map(s) (289569, 2895670, etc.) exist >> on those OSDs? >> >> cd /var/lib/ceph/osd/ceph-58/current/meta >> find . | grep 28957 > > Assuming those maps are in fact missing, then I see the bug. > wip-map-advance has a fix that ought to get past it. You can clear out > the map cache size config options if you like; they aren't needed with the > fix. You can grab packages for that branch from gitbuilder.ceph.com. > > Also, note that you are running a dev version that is newer than firefly, > not firefly. Was this intentional? > > We'd need to verify we didn't change any encodings before confirming, but > most likely you are better off downgrading to the firefly release as it > will be much more stable. I think if you aren't using CephFS then it is a > safe downgrade, but don't have time to confirm right now. :) > > sage > > > >> >> sage >> >> On Sun, 25 May 2014, Karan Singh wrote: >> >>> Hi >>> >>> This problem occurred somewhere between 16 - 18th May . Before this cluster >>> was healthy. >>> >>> # As far as i remember correctly last thing i performed was cluster up >>> gradation to 0.80-rc-1 and creating several cinder volumes after a day. >>> >>> # The cluster is not too big , consists of 165 OSD , total capacity 450 TB , >>> used capacity is very low. >>> >>> # I tried osd map cache size = 75000 for a single osd (osd.58) . >>> >>> * Memory utilisation on this node spiked UP. >>> * After monitoring for couple of hours , i found osd loading and decoding >>> operation is not able to go beyond map epoch 289XXX while the latest >>> osd map epoch is 350946. OSD.58 is continuously trying but not able to >>> cross epoch 289XXX. >>> * This is the same problem with rest 32 OSDs , they are not able to cross >>> the epoch number and repeating in the same loop. >>> >>> >>> Here are some output from osd.58 , for a specific epoch number where its >>> getting stuccoed. >>> >>> [root at storage0106-ib ceph]# grep 289568 ceph-osd.58.log > /tmp/osd.58 >>> >>> [root at storage0106-ib tmp]# tail -100 osd.58 >>> 2014-05-25 15:10:41.222579 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) >>> [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.222628 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) >>> [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.222650 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.38c( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) >>> [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.222665 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.38c( empty >>> local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] >>> r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.222676 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) >>> [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.243297 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) >>> [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.255742 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) >>> [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.255773 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[0.1f4( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) >>> [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.255788 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.1f4( empty >>> local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] >>> r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.255799 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) >>> [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.263083 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) >>> [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.277054 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) >>> [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.277094 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.71e( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) >>> [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.277109 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.71e( empty >>> local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] >>> r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.277121 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( >>> empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) >>> [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.282697 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.282753 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.282774 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.282789 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.71d( empty >>> local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] >>> r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.282800 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.302274 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( >>> empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) >>> [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.302328 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( >>> empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) >>> [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.302350 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.f9( >>> empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) >>> [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.302365 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.f9( empty >>> local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] >>> r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.302376 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( >>> empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) >>> [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.340166 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( >>> empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) >>> [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.343261 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( >>> empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) >>> [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.343304 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( >>> empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) >>> [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.343325 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.6c( >>> empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) >>> [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.343339 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.6c( empty >>> local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] >>> r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.343351 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( >>> empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) >>> [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.347767 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( >>> empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) >>> [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.347800 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.626( >>> empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) >>> [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.347815 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.626( empty >>> local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] >>> r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.347828 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( >>> empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) >>> [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.363524 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( >>> empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) >>> [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.363556 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( >>> empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) >>> [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.365664 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.3b5( >>> empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) >>> [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.365680 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.3b5( empty >>> local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] >>> r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.365691 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( >>> empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) >>> [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.396342 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( >>> empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) >>> [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.402783 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( >>> empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) >>> [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.402816 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.604( >>> empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) >>> [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.402831 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.604( empty >>> local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] >>> r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.402842 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( >>> empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) >>> [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.405597 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( >>> empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) >>> [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.405733 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( >>> empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) >>> [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.405750 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.21d( >>> empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) >>> [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.405764 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.21d( empty >>> local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] >>> r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.405775 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( >>> empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) >>> [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.426717 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( >>> empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) >>> [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.426757 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( >>> empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) >>> [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.426772 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.782( >>> empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) >>> [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.426786 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.782( empty >>> local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] >>> r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.426797 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( >>> empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) >>> [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.449454 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( >>> empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) >>> [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.449509 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( >>> empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) >>> [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.449526 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.7b2( >>> empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) >>> [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.449547 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.7b2( empty >>> local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] >>> r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.449559 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( >>> empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) >>> [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.468461 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( >>> empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) >>> [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.468511 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( >>> empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) >>> [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.468533 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1d0( >>> empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) >>> [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.468560 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.1d0( empty >>> local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) >>> [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.468573 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( >>> empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) >>> [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.495508 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.495594 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.495618 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.495633 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.71d( empty >>> local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] >>> r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.495645 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( >>> empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) >>> [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.508936 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( >>> empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) >>> [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.508975 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( >>> empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) >>> [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.508991 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.512( >>> empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) >>> [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.509005 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.512( empty >>> local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] >>> r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.509016 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( >>> empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) >>> [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.541973 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( >>> empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) >>> [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.542031 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( >>> empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) >>> [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.542049 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1ee( >>> empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) >>> [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.542064 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.1ee( empty >>> local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) >>> [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.542075 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( >>> empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) >>> [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.553850 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v >>> 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c >>> 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 >>> pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.553892 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v >>> 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c >>> 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 >>> pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] take_waiters >>> 2014-05-25 15:10:41.553915 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[3.7( v >>> 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c >>> 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 >>> pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.553930 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[3.7( v 289523'6912 >>> (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 >>> 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 >>> crt=286267'6814 lcod 0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.553949 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v >>> 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c >>> 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 >>> pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.578314 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( >>> empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) >>> [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.589983 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( >>> empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) >>> [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.590025 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1d8( >>> empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) >>> [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.590042 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.1d8( empty >>> local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] >>> r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.590055 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( >>> empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) >>> [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.594791 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( >>> empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) >>> [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.607898 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( >>> empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) >>> [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.607931 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[0.185( >>> empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) >>> [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.607946 7fbc0b54c700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[0.185( empty >>> local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] >>> r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.607959 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( >>> empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) >>> [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] null >>> 2014-05-25 15:10:41.614492 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( >>> empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) >>> [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 15:10:41.614528 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( >>> empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) >>> [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] >>> take_waiters >>> 2014-05-25 15:10:41.614548 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.35d( >>> empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) >>> [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is 289568 while >>> current is 289568 >>> 2014-05-25 15:10:41.614574 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue pg[2.35d( empty >>> local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] >>> r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] >>> 2014-05-25 15:10:41.614586 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( >>> empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) >>> [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] null >>> [root at storage0106-ib tmp]# >>> >>> >>> >>> >>> >>> >>> - Karan Singh - >>> >>> From: Sage Weil <sage at inktank.com> >>> Subject: Re: 70+ OSD are DOWN and not coming up >>> Date: 25 May 2014 01:22:13 GMT+3 >>> To: Karan Singh <karan.singh at csc.fi> >>> Cc: Ceph Community <ceph-community at lists.ceph.com> >>> >>> On Sun, 25 May 2014, Karan Singh wrote: >>> Hello Sage >>> >>> i tried your procedure for this problem. Its been over 72 >>> hours the >>> problematic OSDs are still trying loading and decoding the >>> correct OSD map >>> epoch. All the 32 OSDS have the same problem . OSD service >>> is started , but >>> OSD is down in cluster and they are still performing >>> loading and decoding >>> operation. >>> >>> To troubleshoot this issue , i have kept debug osd = 20 , >>> and till now all >>> these OSDS have generated 7 TB of logs ( I do not have >>> space problem for >>> logs as logs are kept on a 60TB common NFS share ). >>> >>> Now after comparing the logs , i found that loading and >>> decoding operation >>> is repeating itself , its not at all reaching to the Final >>> correct OSD map >>> epoch. Here are some OSD logs for your reference. >>> >>> >>> http://pastebin.com/aSBY4XYm >>> >>> >>> >>> [root at storage0106-ib ceph]# tail ceph-osd.58.log >>> 2014-05-25 00:18:16.756357 7fbfa35b7700 20 osd.58 350946 >>> get_map 289768 - >>> >>> >>> In this case the PG is looking back ~60000 maps. I'm confused how it >>> got >>> into this state, but setting that aside for the moment, if your maps >>> aren't too large (your cluster isn't huge so that seems possible), you >>> can >>> set 'osd map cache size = 75000' so that the OSD only loads these >>> into >>> memory once. Note that this will eat a significant amount of RAM, so >>> you >>> may need to do this one OSD at a time. Which might be a good idea >>> anyway >>> to test that this works on a single OSD. >>> >>> Was the cluster healthy before this all started? What actually >>> happened >>> to get things into this state? >>> >>> sage >>> >>> >>> >>> loading and decoding 0x10f16200 >>> 2014-05-25 00:18:16.756464 7fbfa2bb6700 20 osd.58 350946 >>> get_map 289569 - >>> loading and decoding 0x10f15e00 >>> 2014-05-25 00:18:16.756457 7fbfa35b7700 10 osd.58 >>> pg_epoch: 289568 pg[2.429( >>> empty local-les=289539 n=0 ec=1 les/c 289539/289539 >>> 289567/289567/289538) >>> [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 >>> inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 00:18:16.756518 7fbfa35b7700 10 osd.58 >>> pg_epoch: 289568 pg[2.429( >>> empty local-les=289539 n=0 ec=1 les/c 289539/289539 >>> 289567/289567/289538) >>> [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 >>> inactive NOTIFY] >>> take_waiters >>> 2014-05-25 00:18:16.756536 7fbfa35b7700 20 osd.58 >>> pg_epoch: 289568 pg[2.429( >>> empty local-les=289539 n=0 ec=1 les/c 289539/289539 >>> 289567/289567/289538) >>> [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 >>> inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is >>> 289568 while >>> current is 289568 >>> 2014-05-25 00:18:16.756556 7fbfa35b7700 10 osd.58 350946 >>> advance_pg advanced >>> by max 200 past min epoch 289568 ... will requeue >>> pg[2.429( empty >>> local-les=289539 n=0 ec=1 les/c 289539/289539 >>> 289567/289567/289538) [52,65] >>> r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive >>> NOTIFY] >>> 2014-05-25 00:18:16.756580 7fbfa35b7700 10 osd.58 >>> pg_epoch: 289568 pg[2.429( >>> empty local-les=289539 n=0 ec=1 les/c 289539/289539 >>> 289567/289567/289538) >>> [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 >>> inactive NOTIFY] null >>> 2014-05-25 00:18:16.756599 7fbfa35b7700 10 log is not >>> dirty >>> 2014-05-25 00:18:16.756600 7fbfa2bb6700 20 osd.58 350946 >>> get_map 289570 - >>> loading and decoding 0x10f15e00 >>> 2014-05-25 00:18:16.756695 7fbfa2bb6700 20 osd.58 350946 >>> get_map 289571 - >>> loading and decoding 0x10f15e00 >>> >>> >>> >>> [root at storage0102-ib ceph]# tail ceph-osd.7.log >>> 2014-05-25 00:07:07.547447 7f7d1649b700 20 osd.7 350946 >>> get_map 299068 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.547542 7f7d1649b700 20 osd.7 350946 >>> get_map 299069 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.547636 7f7d1649b700 20 osd.7 350946 >>> get_map 299070 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.547730 7f7d1649b700 20 osd.7 350946 >>> get_map 299071 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.547825 7f7d1649b700 20 osd.7 350946 >>> get_map 299072 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.547920 7f7d1649b700 20 osd.7 350946 >>> get_map 299073 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.548014 7f7d1649b700 20 osd.7 350946 >>> get_map 299074 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.548116 7f7d1649b700 20 osd.7 350946 >>> get_map 299075 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.548211 7f7d1649b700 20 osd.7 350946 >>> get_map 299076 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:07.548305 7f7d1649b700 20 osd.7 350946 >>> get_map 299077 - >>> loading and decoding 0x35ec200 >>> [root at storage0102-ib ceph]# tail ceph-osd.7.log >>> 2014-05-25 00:07:08.764574 7f7d1649b700 20 osd.7 350946 >>> get_map 299070 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:08.764565 7f7d15a9a700 10 osd.7 pg_epoch: >>> 298887 pg[0.64a( >>> empty local-les=298850 n=0 ec=1 les/c 298850/298850 >>> 298887/298887/298887) >>> [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 >>> inactive NOTIFY] >>> handle_activate_map >>> 2014-05-25 00:07:08.764609 7f7d15a9a700 10 osd.7 pg_epoch: >>> 298887 pg[0.64a( >>> empty local-les=298850 n=0 ec=1 les/c 298850/298850 >>> 298887/298887/298887) >>> [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 >>> inactive NOTIFY] >>> take_waiters >>> 2014-05-25 00:07:08.764630 7f7d15a9a700 20 osd.7 pg_epoch: >>> 298887 pg[0.64a( >>> empty local-les=298850 n=0 ec=1 les/c 298850/298850 >>> 298887/298887/298887) >>> [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 >>> inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is >>> 298887 while >>> current is 298887 >>> 2014-05-25 00:07:08.764643 7f7d15a9a700 10 osd.7 350946 >>> advance_pg advanced >>> by max 200 past min epoch 298887 ... will requeue >>> pg[0.64a( empty >>> local-les=298850 n=0 ec=1 les/c 298850/298850 >>> 298887/298887/298887) [113,86] >>> r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive >>> NOTIFY] >>> 2014-05-25 00:07:08.764654 7f7d15a9a700 10 osd.7 pg_epoch: >>> 298887 pg[0.64a( >>> empty local-les=298850 n=0 ec=1 les/c 298850/298850 >>> 298887/298887/298887) >>> [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 >>> inactive NOTIFY] null >>> 2014-05-25 00:07:08.764675 7f7d15a9a700 10 log is not >>> dirty >>> 2014-05-25 00:07:08.764712 7f7d1649b700 20 osd.7 350946 >>> get_map 299071 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:08.764814 7f7d1649b700 20 osd.7 350946 >>> get_map 299072 - >>> loading and decoding 0x35ec200 >>> 2014-05-25 00:07:08.764915 7f7d1649b700 20 osd.7 350946 >>> get_map 299073 - >>> loading and decoding 0x35ec200 >>> >>> >>> >>> >>> 2014-05-25 00:16:44.291343 7f38c9551700 20 osd.93 350946 >>> get_map 118210 - >>> loading and decoding 0x35e2c00 >>> 2014-05-25 00:16:44.294767 7f38c8d50700 10 osd.93 >>> pg_epoch: 118065 pg[2.752( >>> empty local-les=118029 n=0 ec=1 les/c 118029/118029 >>> 118065/118065/117839) >>> [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 >>> inactive NOTIFY] >>> take_waiters >>> 2014-05-25 00:16:44.294829 7f38c8d50700 20 osd.93 >>> pg_epoch: 118065 pg[2.752( >>> empty local-les=118029 n=0 ec=1 les/c 118029/118029 >>> 118065/118065/117839) >>> [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 >>> inactive NOTIFY] >>> handle_activate_map: Not dirtying info: last_persisted is >>> 118065 while >>> current is 118065 >>> 2014-05-25 00:16:44.294920 7f38c8d50700 10 osd.93 350946 >>> advance_pg advanced >>> by max 200 past min epoch 118065 ... will requeue >>> pg[2.752( empty >>> local-les=118029 n=0 ec=1 les/c 118029/118029 >>> 118065/118065/117839) [14,126] >>> r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 inactive >>> NOTIFY] >>> 2014-05-25 00:16:44.294929 7f38c8d50700 10 osd.93 >>> pg_epoch: 118065 pg[2.752( >>> empty local-les=118029 n=0 ec=1 les/c 118029/118029 >>> 118065/118065/117839) >>> [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 >>> inactive NOTIFY] null >>> 2014-05-25 00:16:44.294954 7f38c8d50700 10 log is not >>> dirty >>> 2014-05-25 00:16:44.295162 7f38c9551700 20 osd.93 350946 >>> get_map 118211 - >>> loading and decoding 0x35e2c00 >>> 2014-05-25 00:16:44.295205 7f38c9551700 20 osd.93 350946 >>> get_map 118212 - >>> loading and decoding 0x35e2c00 >>> 2014-05-25 00:16:44.295242 7f38c9551700 20 osd.93 350946 >>> get_map 118213 - >>> loading and decoding 0x35e2c00 >>> 2014-05-25 00:16:44.295277 7f38c9551700 20 osd.93 350946 >>> get_map 118214 - >>> loading and decoding 0x35e2c00 >>> 2014-05-25 00:16:44.295312 7f38c9551700 20 osd.93 350946 >>> get_map 118215 - >>> loading and decoding 0x35e2c00 >>> >>> >>> >>> Can you suggest what next can be done here for these 32 >>> OSDS , your support >>> will be highly appreciated to make cluster healthy again. >>> >>> >>> - Karan - >>> >>> On 22 May 2014, at 20:08, Karan Singh <karan.singh at csc.fi> >>> wrote: >>> >>> Hello Sage >>> * I have followed your advice that you have mentioned >>> below , but >>> still OSDS are doing coding and decoding and cluster >>> is >>> unhealthy. >>> >>> [root at storage0112-ib ceph]# ceph -s >>> cluster 009d3518-e60d-4f74-a26d-c08c1976263c >>> health HEALTH_WARN 1709 pgs degraded; 275 pgs >>> incomplete; 52 pgs >>> stale; 275 pgs stuck inactive; 52 pgs stuck stale; 3143 >>> pgs stuck >>> unclean; recovery 543/3642 objects degraded (14.909%); >>> 54/165 in osds >>> are down; noup,nodown,noout flag(s) set >>> monmap e4: 4 monsat{storage0101-ib=192.168.100.101:6789/0,storage0110-ib=192.168.100.110:678 >>> 9/ >>> 0,storage0114-ib=192.168.100.114:6789/0,storage0115-ib=192.168.100.115:6789 >>> >>> /0}, election epoch 22, quorum 0,1,2,3 >>> storage0101-ib,storage0110-ib,storage0114-ib,storage0115-ib >>> mdsmap e163: 1/1/1 up {0=storage0102=up:active} >>> osdmap e358481: 165 osds: 111 up, 165 in >>> flags noup,nodown,noout >>> pgmap v606405: 4544 pgs, 6 pools, 6101 MB data, 1214 >>> objects >>> 3493 GB used, 357 TB / 361 TB avail >>> 543/3642 objects degraded (14.909%) >>> 6 stale+active+degraded+remapped >>> 1390 active+clean >>> 11 stale+incomplete >>> 21 stale+active+remapped >>> 600 active+degraded+remapped >>> 214 incomplete >>> 1138 active+remapped >>> 2 stale+active+degraded >>> 1 stale+remapped+incomplete >>> 1101 active+degraded >>> 11 stale+active+clean >>> 49 remapped+incomplete >>> >>> >>> * I have tried restarting those OSDs which were not >>> running OR were >>> dead , but they are now throwing a new error while >>> starting up. >>> (http://pastebin.com/v8F0v2Jz) >>> >>> 1. [root at storage0107-ib ceph]# service ceph start osd >>> 2. === osd.106 === >>> 3. Starting Ceph osd.106 on storage0107-ib...already >>> running >>> 4. === osd.143 === >>> 5. Starting Ceph osd.143 on storage0107-ib...already >>> running >>> 6. === osd.59 === >>> 7. Starting Ceph osd.59 on storage0107-ib...already >>> running >>> 8. === osd.94 === >>> 9. Starting Ceph osd.94 on storage0107-ib...already >>> running >>> 10. === osd.119 === >>> 11. Starting Ceph osd.119 on storage0107-ib...already >>> running >>> 12. === osd.82 === >>> 13. failed: 'timeout 30 /usr/bin/ceph -c >>> /etc/ceph/ceph.conf >>> --name=osd.82 >>> --keyring=/var/lib/ceph/osd/ceph-82/keyring osd >>> crush create-or-move -- 82 2.73 host=storage0107-ib >>> root=default' >>> >>> >>> After a continuous attempt of restarting them , it gets >>> restarts >>> (usually after 6-7 manual attempts) >>> >>> Can you advice me on this new error , there is absolutely >>> no >>> communication problem between all the cluster nodes , even >>> no resource >>> crunch. >>> >>> 1. failed: 'timeout 30 /usr/bin/ceph -c >>> /etc/ceph/ceph.conf >>> --name=osd.82 >>> --keyring=/var/lib/ceph/osd/ceph-82/keyring osd >>> crush create-or-move -- 82 2.73 host=storage0107-ib >>> root=default' >>> >>> I have applied a crontab which will do ?service ceph >>> start? every 30 >>> minutes , Since OSD service is not getting started logs >>> are not >>> getting created , no i don?t have any idea whats going on >>> here. >>> >>> * For other osds coding and decoding operations is going >>> on , i am >>> still waiting so that each osd reaches a correct osd >>> map eopoc >>> >>> >>> I will keep you posted with the status , but please help >>> in getting >>> this new error troubleshooted. >>> >>> >>> Regards >>> Karan >>> >>> >>> >>> >>> On 22 May 2014, at 10:26, Craig Lewis >>> <clewis at centraldesktop.com> >>> wrote: >>> >>> On 5/21/14 21:15 , Sage Weil wrote: >>> >>> On Wed, 21 May 2014, Craig Lewis wrote: >>> >>> If you do this over IRC, can you please post a summary to >>> the mailling >>> list? >>> >>> I believe I'm having this issue as well. >>> >>> In the other case, we found that some of the OSDs were >>> behind processing >>> maps (by several thousand epochs). The trick here to give >>> them a chance >>> to catch up is >>> >>> ceph osd set noup >>> ceph osd set nodown >>> ceph osd set noout >>> >>> and wait for them to stop spinning on the CPU. You can >>> check which map >>> each OSD is on with >>> >>> ceph daemon osd.NNN status >>> >>> to see which epoch they are on and compare that to >>> >>> ceph osd stat >>> >>> Once they are within 100 or less epochs, >>> >>> ceph osd unset noup >>> >>> and let them all start up. >>> >>> We haven't determined whether the original problem was >>> caused by this or >>> the other way around; we'll see once they are all caught >>> up. >>> >>> sage >>> >>> >>> I was seeing the CPU spinning too, so I think it is the >>> same >>> issue. Thanks for the explanation! I've been pulling my >>> hair >>> out for weeks. >>> >>> >>> I can give you a data point for the "how". My problems >>> started >>> with a kswapd problem on 12.04.04 (kernel 3.5.0-46-generic >>> #70~precise1-Ubuntu). kswapd was consuming 100% CPU, and >>> it was >>> blocking the ceph-osd processes. Once I prevented kswapd >>> from >>> doing that, my OSDs couldn't recover. noout and nodown >>> didn't >>> help; the OSDs would suicide and restart. >>> >>> >>> Upgrading to Ubuntu 14.04 seems to have helped. The >>> cluster >>> isn't all clear yet, but it's getting better. The cluster >>> is >>> finally healthy after 2 weeks of incomplete and stale. >>> It's >>> still unresponsive, but it's making progress. I am still >>> seeing >>> OSD's consuming 100% CPU, but only the OSDs that are >>> actively >>> deep-scrubing. Once the deep-scrub finishes, the OSD >>> starts >>> behaving again. They seem to be slowly getting better, >>> which >>> matches up with your explanation. >>> >>> >>> I'll go ahead at set noup. I don't think it's necessary >>> at this >>> point, but it's not going to hurt. >>> >>> I'm running Emperor, and looks like osd status isn't >>> supported. >>> Not a big deal though. Deep-scrub has made it through >>> half of >>> the PGs in the last 36 hours, so I'll just watch for >>> another day >>> or two. This is a slave cluster, so I have that luxury. >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Craig Lewis >>> Senior Systems Engineer >>> Office +1.714.602.1309 >>> Email clewis at centraldesktop.com >>> >>> Central Desktop. Work together in ways you never thought >>> possible. >>> Connect with us Website | Twitter | Facebook | >>> LinkedIn >>> | Blog >>> >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users at lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140525/481563ac/attachment.htm>