On Wed, Sep 17, 2014 at 5:42 PM, Dan Van Der Ster
<daniel.vanderster at cern.ch> wrote:
> From: Florian Haas <florian at hastexo.com>
> Sent: Sep 17, 2014 5:33 PM
> To: Dan Van Der Ster
> Cc: Craig Lewis <clewis at centraldesktop.com>; ceph-users at lists.ceph.com
> Subject: Re: RGW hung, 2 OSDs using 100% CPU
>
> On Wed, Sep 17, 2014 at 5:24 PM, Dan Van Der Ster
> <daniel.vanderster at cern.ch> wrote:
>> Hi Florian,
>>
>>> On 17 Sep 2014, at 17:09, Florian Haas <florian at hastexo.com> wrote:
>>>
>>> Hi Craig,
>>>
>>> just dug this up in the list archives.
>>>
>>> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis <clewis at centraldesktop.com>
>>> wrote:
>>>> In the interest of removing variables, I removed all snapshots on all
>>>> pools, then restarted all ceph daemons at the same time. This brought
>>>> up osd.8 as well.
>>>
>>> So just to summarize this: your 100% CPU problem at the time went away
>>> after you removed all snapshots, and the actual cause of the issue was
>>> never found?
>>>
>>> I am seeing a similar issue now, and have filed
>>> http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost
>>> again. Can you take a look at that issue and let me know if anything
>>> in the description sounds familiar?
>>
>> Could your ticket be related to the snap trimming issue I've finally
>> narrowed down in the past couple of days?
>>
>> http://tracker.ceph.com/issues/9487
>>
>> Bump up debug_osd to 20, then check the log during one of your incidents.
>> If it is busy logging the snap_trimmer messages, then it's the same issue.
>> (The issue is that rbd pools have many purged_snaps, but sometimes after
>> backfilling a PG the purged_snaps list is lost, and thus the snap trimmer
>> becomes very busy whilst re-trimming thousands of snaps. During that time
>> (a few minutes on my cluster) the OSD is blocked.)
>
> That sounds promising, thank you! debug_osd=10 should actually be
> sufficient, as those snap_trim messages get logged at that level. :)
>
> Do I understand your issue report correctly in that you have found
> setting osd_snap_trim_sleep to be ineffective, because it's being
> applied when iterating from PG to PG, rather than from snap to snap?
> If so, then I'm guessing that can hardly be intentional...
>
> Cheers,
> Florian
>
> Hi,
> (Sorry for top posting, mobile now).

I've taken the liberty to reformat. :)

> That's exactly what I observe -- one sleep per PG. The problem is that the
> sleep can't simply be moved, since AFAICT the whole PG is locked for the
> duration of the trimmer. So the options I proposed are to limit the number
> of snaps trimmed per call to e.g. 16, or to fix the loss of purged_snaps
> after backfilling. Actually, probably both of those are needed. But a real
> dev would know better.

Okay. Certainly worth a try. Thanks again! I'll let you know when I know
more.

Cheers,
Florian
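
P.S. For anyone digging this thread out of the archives later: the OSD debug
level can be raised at runtime, no restart needed. A minimal example, assuming
osd.8 is the busy daemon and the default log location (adjust the id and path
for your cluster):

    # raise the debug level on the affected OSD
    ceph tell osd.8 injectargs '--debug-osd 10'

    # during the incident, look for snap trimmer activity in its log
    grep snap_trim /var/log/ceph/ceph-osd.8.log

    # put it back afterwards (0/5 is the usual default)
    ceph tell osd.8 injectargs '--debug-osd 0/5'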
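
P.P.S. To make the "one sleep per PG" point a bit more concrete, here is a
rough C++ sketch of the behaviour described above. All of the names (PG,
trim_all, trim_some, snaps_to_trim) are invented for illustration -- this is
not the actual Ceph trimmer code, just the shape of the problem and of the
"cap the snaps trimmed per call" idea:

    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    // Hypothetical stand-in for a placement group.
    struct PG {
        std::mutex pg_lock;                   // stands in for the PG lock
        std::vector<unsigned> snaps_to_trim;  // thousands of entries once
                                              // purged_snaps has been lost
        void trim_one_snap(unsigned snap) {
            std::printf("trimming snap %u\n", snap);  // real trimming does I/O
        }
    };

    // Behaviour as described in the thread: osd_snap_trim_sleep is applied
    // once per PG, so a PG that has to re-trim thousands of snaps is still
    // worked through in one go while its lock is held.
    void trim_all(std::vector<PG>& pgs, double snap_trim_sleep) {
        for (auto& pg : pgs) {
            {
                std::lock_guard<std::mutex> l(pg.pg_lock);  // PG blocked here
                for (unsigned s : pg.snaps_to_trim)
                    pg.trim_one_snap(s);       // no throttling inside the PG
                pg.snaps_to_trim.clear();
            }
            std::this_thread::sleep_for(       // the sleep only separates PGs
                std::chrono::duration<double>(snap_trim_sleep));
        }
    }

    // Dan's first suggestion, sketched: trim at most N snaps per call so the
    // PG lock is released between batches; a later pass finishes the rest.
    void trim_some(PG& pg, std::size_t max_per_call = 16) {
        std::lock_guard<std::mutex> l(pg.pg_lock);
        std::size_t n = std::min(max_per_call, pg.snaps_to_trim.size());
        for (std::size_t i = 0; i < n; ++i)
            pg.trim_one_snap(pg.snaps_to_trim[i]);
        pg.snaps_to_trim.erase(pg.snaps_to_trim.begin(),
                               pg.snaps_to_trim.begin() +
                                   static_cast<std::ptrdiff_t>(n));
    }

    int main() {
        std::vector<PG> pgs(1);
        for (unsigned s = 0; s < 40; ++s)
            pgs[0].snaps_to_trim.push_back(s);
        trim_some(pgs[0]);      // trims only 16, then releases the PG lock
        trim_all(pgs, 0.05);    // trims whatever is left, one sleep per PG
        return 0;
    }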