Re: [Octopus] OSD overloading


 



I am not sure whether any change in Octopus makes this worse, but on
Nautilus we are also seeing that the RocksDB overhead during snaptrim is
huge. We work around it by throttling the snaptrim speed to a minimum as
well as throttling deep-scrub; see
https://www.spinics.net/lists/dev-ceph/msg01277.html for details.
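
For reference, that throttling boils down to a handful of OSD settings; the
values below are purely illustrative (the exact knobs and numbers we used are
in the linked thread):

    ceph config set osd osd_snap_trim_sleep 5                # sleep between snap trim ops
    ceph config set osd osd_pg_max_concurrent_snap_trims 1   # fewer concurrent trims per PG
    ceph config set osd osd_max_scrubs 1                     # at most one scrub per OSD
    ceph config set osd osd_scrub_sleep 0.5                  # sleep between scrub chunks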

We were expecting that Octopus getting rid of the removed_snaps key in the
OSDMap might improve things.

Igor Fedotov <ifedotov@xxxxxxx> wrote on Monday, April 13, 2020 at 6:20 PM:

> Given the symptoms, the high CPU usage within RocksDB and the corresponding
> slowdown were presumably caused by RocksDB fragmentation.
>
> A temporary workaround would be to do a manual DB compaction using
> ceph-kvstore-tool's compact command.
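>
> For reference, on an OSD with data in the default location that is roughly
> the following (the OSD has to be stopped first; the id is a placeholder):
>
>     systemctl stop ceph-osd@<id>
>     ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
>     systemctl start ceph-osd@<id>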
>
>
> Thanks,
>
> Igor
>
> On 4/13/2020 1:01 AM, Jack wrote:
> > Yep I am
> >
> > The issue is solved now .. and by solved, brace yourselves, I mean I had
> > to recreate all OSDs
> >
> > And since the cluster would not heal itself (because of the original
> > issue), I had to drop every rados pool, stop all OSDs, and destroy &
> > recreate them..
> > Yeah, well, hum
> >
> > There is definitely an underlying issue there
> > Those OSDs were created under Luminous and have been upgraded ever since
> >
> > I have no more clues about the bug
> > Sadly, there is only so much downtime I can afford on this cluster
> >
> > Anyway ..
> >
> > On 4/9/20 4:51 AM, Ashley Merrick wrote:
> >> Are you sure you're not being hit by:
> >>
> >>
> >>
> >> ceph config set osd bluestore_fsck_quick_fix_on_mount false @ https://docs.ceph.com/docs/master/releases/octopus/
> >>
> >> Have all your OSDs successfully completed the fsck?
> >>
> >>
> >>
> >> The reason I say that is I can see "20 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats"
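> >>
> >> If the on-mount conversion was disabled, my understanding is that the
> >> legacy omap stats can still be fixed up later per OSD, with the OSD
> >> stopped, along the lines of (default data dir, id is a placeholder):
> >>
> >>     ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>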
> >>
> >>
> >>
> >>
> >>
> >> ---- On Thu, 09 Apr 2020 02:15:02 +0800 Jack <ceph@xxxxxxxxxxxxxx> wrote ----
> >>
> >>
> >>
> >> Just to confirm this does not get better:
> >>
> >> root@backup1:~# ceph status
> >>   cluster:
> >>   id:     9cd41f0f-936d-4b59-8e5d-9b679dae9140
> >>   health: HEALTH_WARN
> >>   20 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats
> >>   4/50952060 objects unfound (0.000%)
> >>   nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
> >>   1 osds down
> >>   3 nearfull osd(s)
> >>   Reduced data availability: 826 pgs inactive, 616 pgs down, 185 pgs peering, 158 pgs stale
> >>   Low space hindering backfill (add storage if this doesn't resolve itself): 93 pgs backfill_toofull
> >>   Degraded data redundancy: 13285415/101904120 objects degraded (13.037%), 706 pgs degraded, 696 pgs undersized
> >>   989 pgs not deep-scrubbed in time
> >>   378 pgs not scrubbed in time
> >>   10 pool(s) nearfull
> >>   2216 slow ops, oldest one blocked for 13905 sec, daemons [osd.1,osd.11,osd.20,osd.24,osd.25,osd.29,osd.31,osd.37,osd.4,osd.5]... have slow ops.
> >>
> >>   services:
> >>   mon: 1 daemons, quorum backup1 (age 8d)
> >>   mgr: backup1(active, since 8d)
> >>   osd: 37 osds: 26 up (since 9m), 27 in (since 2h); 626 remapped pgs
> >>   flags nobackfill,norecover,noscrub,nodeep-scrub
> >>   rgw: 1 daemon active (backup1.odiso.net)
> >>
> >>   task status:
> >>
> >>   data:
> >>   pools:   10 pools, 2785 pgs
> >>   objects: 50.95M objects, 92 TiB
> >>   usage:   121 TiB used, 39 TiB / 160 TiB avail
> >>   pgs:     29.659% pgs not active
> >>   13285415/101904120 objects degraded (13.037%)
> >>   433992/101904120 objects misplaced (0.426%)
> >>   4/50952060 objects unfound (0.000%)
> >>   840 active+clean+snaptrim_wait
> >>   536 down
> >>   490 active+undersized+degraded+remapped+backfilling
> >>   326 active+clean
> >>   113 peering
> >>   88  active+undersized+degraded
> >>   83  active+undersized+degraded+remapped+backfill_toofull
> >>   79  stale+down
> >>   63  stale+peering
> >>   51  active+clean+snaptrim
> >>   24  activating
> >>   22  active+recovering+degraded
> >>   19  active+remapped+backfilling
> >>   13  stale+active+undersized+degraded
> >>   9   remapped+peering
> >>   9   active+undersized+remapped+backfilling
> >>   9   active+undersized+degraded+remapped+backfill_wait+backfill_toofull
> >>   2   stale+active+clean+snaptrim
> >>   2   active+undersized
> >>   1   stale+active+clean+snaptrim_wait
> >>   1   active+remapped+backfill_toofull
> >>   1   active+clean+snaptrim_wait+laggy
> >>   1   active+recovering+undersized+remapped
> >>   1   down+remapped
> >>   1   activating+undersized+degraded+remapped
> >>   1   active+recovering+laggy
> >>
> >> On 4/8/20 3:27 PM, Jack wrote:
> >>> The CPU is used by userspace, not kernelspace
> >>>
> >>> Here is the perf top output, see attachment
> >>>
> >>> RocksDB eats everything :/
> >>>
> >>>
> >>> On 4/8/20 3:14 PM, Paul Emmerich wrote:
> >>>> What's the CPU busy with while spinning at 100%?
> >>>>
> >>>> Check "perf top" for a quick overview
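> >>>>
> >>>> e.g. system-wide, or narrowed to one OSD process (pid is a placeholder):
> >>>>
> >>>>     perf top
> >>>>     perf top -p <pid-of-ceph-osd>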
> >>>>
> >>>>
> >>>> Paul
> >>>>
> >>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



