On Mon, Sep 19, 2016 at 8:25 PM, Daniel Schneller
<daniel.schneller@xxxxxxxxxxxxxxxx> wrote:
> Hello!
>
> We are observing a somewhat strange IO pattern on our OSDs.
>
> The cluster is running Hammer 0.94.1, 48 OSDs, 4 TB spinners, xfs,
> colocated journals.

I think you should upgrade to a newer Hammer point release first.

> Over periods of days on end we see groups of 3 OSDs being busy with
> lots and lots of small writes for several minutes at a time.
> Once one group calms down, another group begins. Might be easier to
> understand in a graph:
>
> https://public.centerdevice.de/3e62a18d-dd01-477e-b52b-f65d181e2920
>
> (this shows a limited time range to make the individual lines
> discernible)
>
> Initial attempts to correlate this to client activity with small writes
> turned out to be wrong -- not really surprising, because both VM RBD
> activity and RGW object storage should show much more evenly spread
> patterns across all OSDs.
>
> Using sysdig I figured out that it seems to be LevelDB activity:
>
> [16:58:42 B|daniel.schneller@node02] ~
> ➜ sudo sysdig -p "%12user.name %6proc.pid %12proc.name %3fd.num %fd.typechar %fd.name" "evt.type=write and proc.pid=8215"
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
> ... (*lots and lots* more writes to 763308.log) ...
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> root 8215 ceph-osd 153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
> ... (*lots and lots* more writes to 763311.ldb) ...
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 18 f /var/lib/ceph/osd/ceph-14/current/omap/MANIFEST-171304
> root 8215 ceph-osd 18 f /var/lib/ceph/osd/ceph-14/current/omap/MANIFEST-171304
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 15 f /var/lib/ceph/osd/ceph-14/current/omap/LOG
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> root 8215 ceph-osd 103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
> ... (*lots and lots* more writes to 763310.log) ...
>
> This correlates to the patterns in the graph for the given OSDs. If I
> understand this correctly, it looks like LevelDB compaction -- however,
> if that is the case, why would it happen in groups of only three OSDs
> at a time, and why would it hit a single OSD in such short succession?
> See this single-OSD graph of the same time range as before:
>
> https://public.centerdevice.de/ab5f417d-43af-435d-aad0-7becff2b9acb
>
> Are there any regular / event-based maintenance tasks that are guaranteed
> to run on only n (=3) OSDs at a time?

Is the cluster healthy the whole time? Normally you should not see spiking
like this. It may also be related to scrubbing or deep-scrubbing, although
those should not generate this much write traffic. How long does each write
burst last? Running perf top against the busy ceph-osd PID may help to dig
into what the daemon is doing while a spike is happening. A few rough
command sketches follow below.
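To rule scrubbing in or out, you could check whether scrub activity lines
up with the spikes. A minimal sketch, run from a mon/admin node; the log
path below is just the default cluster log location and may differ on your
setup:

    # PGs currently scrubbing, with their acting OSD sets -- compare the
    # acting sets against the three busy OSDs from your graph
    ceph pg dump pgs_brief | grep -i scrub

    # scrub / deep-scrub completions from the cluster log on a monitor,
    # to correlate their timestamps with the spikes
    grep -iE '(deep-)?scrub' /var/log/ceph/ceph.log | tail -50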
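To see how long a burst lasts and how much data it writes, you already have
sysdig in place; instead of printing every event you could aggregate per
file. This assumes the stock topfiles_bytes chisel is available in your
sysdig install, and reuses PID 8215 / osd.14 from your trace:

    # read+write bytes per file under this OSD's omap directory; let it run
    # across one busy period, then stop it with Ctrl-C to get the summary
    sudo sysdig -c topfiles_bytes "proc.pid=8215 and fd.name contains /ceph-14/current/omap/"

    # rough on-disk size of the omap (LevelDB) store, per OSD
    du -sh /var/lib/ceph/osd/ceph-*/current/omap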
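For the perf top idea, something along these lines while one of the spikes
is ongoing (PID 8215 is taken from your sysdig trace; substitute whichever
ceph-osd is busy at that moment):

    # live view of where the OSD spends CPU during a spike; with debug
    # symbols installed, compaction shows up as leveldb::* functions
    sudo perf top -p 8215

    # or resolve the PID from the OSD id -- the command-line layout differs
    # between init systems, so double-check the match with ps first
    sudo perf top -p "$(pgrep -f 'ceph-osd.*-i 14')"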
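On smoothing this out: the LevelDB/omap tunables differ between releases,
so I would first check what your 0.94.1 build actually exposes.
leveldb_compact_on_mount is the option I have in mind, but I am not certain
it exists under that exact name in 0.94.1, hence the grep; osd.14 is just
the example from your trace:

    # list the LevelDB tunables this OSD build knows about
    sudo ceph daemon osd.14 config show | grep -i leveldb

    # if a compact-on-mount option is listed, enabling it in ceph.conf under
    # [osd] and restarting one OSD at a time forces a full compaction of the
    # omap store at startup
    sudo ceph daemon osd.14 config get leveldb_compact_on_mount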
> Can I do anything to smooth this out or reduce it somehow?
>
> Thanks,
> Daniel
>
> --
> Daniel Schneller
> Principal Cloud Engineer
>
> CenterDevice GmbH
> https://www.centerdevice.de

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com