On Wed, 21 May 2014, Craig Lewis wrote:
> If you do this over IRC, can you please post a summary to the mailing
> list?
>
> I believe I'm having this issue as well.

In the other case, we found that some of the OSDs were behind in
processing maps (by several thousand epochs). The trick to give them a
chance to catch up is

 ceph osd set noup
 ceph osd set nodown
 ceph osd set noout

and then wait for them to stop spinning on the CPU. You can check which
map epoch each OSD is on with

 ceph daemon osd.NNN status

and compare that to the cluster's current epoch from

 ceph osd stat

Once they are all within 100 epochs or fewer,

 ceph osd unset noup

and let them all start up.

We haven't determined whether the original problem was caused by this or
the other way around; we'll see once they are all caught up.

sage
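
The epoch comparison above can be tedious by hand with many OSDs, so here is a rough Python sketch of it. It is only an illustration, not a tested tool: the helper names are made up, and it assumes that `ceph daemon osd.NNN status` prints JSON with a `newest_map` field, which may differ between versions. The cluster epoch is passed in by hand (read it from `ceph osd stat`) rather than parsed.

```python
import json
import subprocess

MAX_LAG = 100  # "within 100 epochs", the threshold suggested in the message


def osd_newest_epoch(status_json):
    """Return the newest map epoch from `ceph daemon osd.N status` output.

    Assumption: the output is JSON with a 'newest_map' field.
    """
    return int(json.loads(status_json)["newest_map"])


def lagging_osds(cluster_epoch, osd_epochs, max_lag=MAX_LAG):
    """Return {osd_id: lag} for OSDs more than max_lag epochs behind."""
    return {
        osd: cluster_epoch - epoch
        for osd, epoch in sorted(osd_epochs.items())
        if cluster_epoch - epoch > max_lag
    }


def query_osd_epoch(osd_id):
    """Ask one OSD for its epoch (hypothetical helper).

    `ceph daemon` talks to the local admin socket, so this has to run on
    the host where that OSD lives.
    """
    out = subprocess.check_output(["ceph", "daemon", "osd.%d" % osd_id, "status"])
    return osd_newest_epoch(out.decode())


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real cluster.
    epochs = {1: 4950, 2: 3000, 3: 4900}
    print(lagging_osds(5000, epochs))  # prints {2: 2000}
```

When `lagging_osds` comes back empty for every OSD, that corresponds to the point at which `ceph osd unset noup` is suggested above.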