Re: high memory usage in osd_pglog

Hi Kalle,

We are not using EC. The cluster is on 15.2.5; it was upgraded from Mimic in
July. What is odd is that the pg_log counts reported by a dump are much lower
than what we see in the osd mempool stats.
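
For reference, this is roughly how I lined the two numbers up. It is only a
sketch: it assumes the per-PG log length shows up as log_size in
"ceph pg dump --format json", that the mempool output uses the flat layout
quoted further down, and osd.0 is just a stand-in for any OSD (the daemon
command has to run on the host that carries it).

    import json
    import subprocess

    def run_json(cmd):
        # run a ceph CLI command and parse its JSON output
        out = subprocess.run(cmd, capture_output=True, check=True, text=True).stdout
        return json.loads(out)

    # per-PG log lengths as the mon reports them (the LOG column of "ceph pg dump")
    dump = run_json(["ceph", "pg", "dump", "--format", "json"])
    pg_stats = dump.get("pg_map", dump).get("pg_stats", [])
    reported = sum(pg.get("log_size", 0) for pg in pg_stats)

    # mempool counters for a single OSD; run on the host that carries osd.0
    pools = run_json(["ceph", "daemon", "osd.0", "dump_mempools"])["mempool"]

    print("sum of log_size over all PGs:", reported)
    print("osd.0 osd_pglog_items       :", pools.get("osd_pglog_items"))
    print("osd.0 osd_pglog_bytes       :", pools.get("osd_pglog_bytes"))

On our cluster the mempool item count comes out orders of magnitude larger
than the summed log_size, which is the discrepancy I mean.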

Regards,

Rob

On Thu, Nov 26, 2020 at 12:11 AM Kalle Happonen <kalle.happonen@xxxxxx>
wrote:

> Hi Robert,
> This sounds very much like a big problem we had 2 weeks back.
>
>
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/EWPPEMPAJQT6GGYSHM7GIM3BZWS2PSUY/
>
> Are you running EC? Which version are you running? It would fit our
> narrative if you use EC and recently updated to 14.2.11+
>
> For some reason this memory use started growing a day after we updated to
> 14.2.13. Another case I read about was on 14.2.11, I think. We don't know
> whether the pg_logs just hadn't really been used before, or whether each
> entry grew much larger after the update for some reason. We don't see this
> in our replicated pools.
>
> We significantly reduced the pg_log limit from the default of 3000 down to
> 500. If your cluster is still up and the pgs are healthy, this should be
> doable online. Sadly we couldn't sustain the memory usage, and OSD processes
> started to get OOM killed. We had to trim these logs offline, which
> unfortunately affected our production.
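>
> In case it helps, the online change really is just lowering the pg_log
> length limits. A rough sketch of what that amounts to, assuming
> osd_min_pg_log_entries / osd_max_pg_log_entries are the knobs in question
> and 500 is the target value:
>
>     import subprocess
>
>     # lower the pg_log length limits cluster-wide; running OSDs trim
>     # gradually as new writes come in
>     for opt in ("osd_min_pg_log_entries", "osd_max_pg_log_entries"):
>         subprocess.run(["ceph", "config", "set", "osd", opt, "500"], check=True)
>
> You can check what a given OSD actually picked up with something like
> "ceph config show osd.0 | grep pg_log". The offline route, for what it's
> worth, is ceph-objectstore-tool's trim-pg-log op with the OSD stopped.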
>
> Cheers,
> Kalle
>
>
> ----- Original Message -----
> > From: "Robert Brooks" <robert.brooks@xxxxxxxxxx>
> > To: "ceph-users" <ceph-users@xxxxxxx>
> > Sent: Wednesday, 25 November, 2020 20:23:05
> > Subject:  high memory usage in osd_pglog
>
> > We are seeing very high osd_pglog usage in the mempools of our Ceph OSDs.
> > For example...
> >
> >    "mempool": {
> >        "bloom_filter_bytes": 0,
> >        "bloom_filter_items": 0,
> >        "bluestore_alloc_bytes": 41857200,
> >        "bluestore_alloc_items": 523215,
> >        "bluestore_cache_data_bytes": 50876416,
> >        "bluestore_cache_data_items": 1326,
> >        "bluestore_cache_onode_bytes": 6814080,
> >        "bluestore_cache_onode_items": 13104,
> >        "bluestore_cache_other_bytes": 57793850,
> >        "bluestore_cache_other_items": 2599669,
> >        "bluestore_fsck_bytes": 0,
> >        "bluestore_fsck_items": 0,
> >        "bluestore_txc_bytes": 29904,
> >        "bluestore_txc_items": 42,
> >        "bluestore_writing_deferred_bytes": 733191,
> >        "bluestore_writing_deferred_items": 96,
> >        "bluestore_writing_bytes": 0,
> >        "bluestore_writing_items": 0,
> >        "bluefs_bytes": 101400,
> >        "bluefs_items": 1885,
> >        "buffer_anon_bytes": 21505818,
> >        "buffer_anon_items": 14949,
> >        "buffer_meta_bytes": 1161512,
> >        "buffer_meta_items": 13199,
> >        "osd_bytes": 1962920,
> >        "osd_items": 167,
> >        "osd_mapbl_bytes": 825079,
> >        "osd_mapbl_items": 17,
> >        "osd_pglog_bytes": 14099381936,
> >        "osd_pglog_items": 134285429,
> >        "osdmap_bytes": 734616,
> >        "osdmap_items": 26508,
> >        "osdmap_mapping_bytes": 0,
> >        "osdmap_mapping_items": 0,
> >        "pgmap_bytes": 0,
> >        "pgmap_items": 0,
> >        "mds_co_bytes": 0,
> >        "mds_co_items": 0,
> >        "unittest_1_bytes": 0,
> >        "unittest_1_items": 0,
> >        "unittest_2_bytes": 0,
> >        "unittest_2_items": 0
> >    },
> >
> > Here roughly 14 GB is taken up by pg_log. The cluster has 106 OSDs and 2432
> > placement groups.
> >
> > The pg_log counts reported for the placement groups are much lower than
> > 134285429 entries in total.
> >
> > Top counts are...
> >
> > 1486 1.41c
> > 883 7.3
> > 834 7.f
> > 683 7.13
> > 669 7.a
> > 623 7.5
> > 565 7.8
> > 560 7.1c
> > 546 7.16
> > 544 7.19
> >
> > Summing the counts across all placement groups gives 21594 pg log entries.
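> >
> > For scale, the arithmetic on the numbers above works out roughly as
> > follows:
> >
> >     items = 134285429        # osd_pglog_items from the mempool dump
> >     nbytes = 14099381936     # osd_pglog_bytes
> >     reported = 21594         # summed pg log count from the pg dump
> >
> >     print(nbytes / items)    # ~105 bytes per tracked entry
> >     print(items / reported)  # ~6200x more entries tracked than reported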
> >
> > Overall the performance of the cluster is poor, OSD memory usage is high
> > (20-30 GB resident), and with a moderate workload we are seeing iowait on
> > OSD hosts. The memory allocated to caches appears to be low, I believe
> > because osd_pglog is taking most of the available memory.
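> >
> > As a rough sanity check on that last point, summing the *_bytes fields in
> > the mempool dump above shows osd_pglog is nearly all of the tracked memory
> > (a sketch; it assumes the dump above has been saved as mempool.json with
> > its top-level "mempool" key):
> >
> >     import json
> >
> >     # mempool dump from above, saved locally as mempool.json
> >     pools = json.load(open("mempool.json"))["mempool"]
> >     total = sum(v for k, v in pools.items() if k.endswith("_bytes"))
> >     print(pools["osd_pglog_bytes"] / total)   # ~0.99 on the figures above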
> >
> > Regards,
> >
> > Rob
> >
>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


