No, removing the snapshots didn't solve my problem. I eventually traced
it to XFS deadlocks caused by this setting:

    [osd]
    "osd mkfs options xfs": "-l size=1024m -n size=64k -i size=2048 -s size=4096"

Changing it to just "-s size=4096" and reformatting all OSDs solved the
problem.

Since then, I ran into http://tracker.ceph.com/issues/5699. Snapshots are
off until I've deployed Firefly.

On Wed, Sep 17, 2014 at 8:09 AM, Florian Haas <florian at hastexo.com> wrote:

> Hi Craig,
>
> just dug this up in the list archives.
>
> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis <clewis at centraldesktop.com> wrote:
> > In the interest of removing variables, I removed all snapshots on all
> > pools, then restarted all ceph daemons at the same time. This brought
> > up osd.8 as well.
>
> So just to summarize this: your 100% CPU problem at the time went away
> after you removed all snapshots, and the actual cause of the issue was
> never found?
>
> I am seeing a similar issue now, and have filed
> http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost
> again. Can you take a look at that issue and let me know if anything
> in the description sounds familiar?
>
> You mentioned in a later message in the same thread that you would
> keep your snapshot script running and "repeat the experiment". Did the
> situation change in any way after that? Did the issue come back? Or
> did you just stop using snapshots altogether?
>
> Cheers,
> Florian
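
For anyone reading this in the archives: the mkfs change described at the top of this message might look roughly like the fragment below. This is only a sketch, and it assumes the option is set directly in ceph.conf in plain INI form rather than through a deployment tool, which the quoted key/value form in the message may indicate. The commented-out line shows the options that were in use when the deadlocks appeared.

```ini
[osd]
# Previous setting, associated with the XFS deadlocks described above:
# osd mkfs options xfs = -l size=1024m -n size=64k -i size=2048 -s size=4096

# Reduced setting reported to resolve the problem:
osd mkfs options xfs = -s size=4096
```

These options are only applied when an OSD's filesystem is created, so changing them has no effect on existing OSDs until they are reformatted, as noted in the message.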