Large OMAP Objects & Pubsub

Hi All,

I'm looking for advice on an issue my clusters have been suffering from. I realize there is a lot of text below — thanks in advance for your consideration.

The cluster has a health warning of "32 large omap objects", which has persisted for several months.

The cluster appears functional and there are no indications of a performance problem at the client for now (no slow ops; everything seems to work fine). It is a multisite cluster with CephFS and S3 in use, as well as pubsub, running Ceph 15.2.13.

We have been running automated client load tests against this system every day for a year or longer. The key counts of the large OMAP objects in question are growing; I've monitored this over a period of several months. Intuitively, I gather this means that at some point I will hit performance problems as a result.
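For context, this is roughly how I've been collecting the key counts — a minimal sketch (pool and object names are taken from the health output below; it only uses `rados listomapkeys`, so it's read-only, though listing millions of keys can be slow):

```shell
# Sketch: count omap keys on each datalog shard object in the pubsub
# log pool, largest first. Read-only, but slow on very large objects.
POOL=siteApubsub.rgw.log
for obj in $(rados -p "$POOL" ls | grep '^data_log\.'); do
    printf '%s %s\n' "$obj" "$(rados -p "$POOL" listomapkeys "$obj" | wc -l)"
done | sort -k2 -rn
```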

The large OMAP objects are split across two pools: siteApubsub.rgw.log and siteApubsub.rgw.buckets.index. My client is responsible for processing the pubsub queue, and it appears to be doing that successfully: there are no objects in the pubsub data pool, as shown in the details below.

I've been keeping a spreadsheet to track the growth of these. Assuming I can't attach a file to the mailing list, I've uploaded an image of it here: https://imgur.com/a/gAtAcvp. The data shows constant growth of all of these objects over the last couple of months. It also includes the names of the objects, which fall into two categories:

  *   16 instances of objects with names like: 9:03d18f4d:::data_log.47:head
  *   16 instances of objects with names like: 13:0118e6b8:::.dir.4f442377-4b71-4c6a-aaa9-ba945d7694f8.84778.1.15:head
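For what it's worth, my understanding (please correct me if wrong) is that the `data_log.N` objects are RGW datalog shards used by multisite sync, and the `.dir.<marker>.N` objects are bucket index shards. Assuming that naming convention holds, the bucket instance marker can be recovered from the object name like this:

```shell
# Sketch: recover the bucket instance marker from a ".dir." index
# object name (assumes ".dir.<marker>.<shard>" naming for sharded
# indexes; non-sharded indexes have no trailing shard number).
OBJ=".dir.4f442377-4b71-4c6a-aaa9-ba945d7694f8.84778.1.15"
MARKER=${OBJ#.dir.}   # drop the leading ".dir." prefix
MARKER=${MARKER%.*}   # drop the trailing shard number
echo "$MARKER"
# The marker can then be cross-referenced against "radosgw-admin
# bucket stats" output, which reports a "marker"/"id" per bucket.
```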

Please find output of a few Ceph commands below giving details of the cluster.

  *   I'm really keen to understand this better and would be more than happy to share additional diags.
  *   I'd like to understand what I need to do to remove these large OMAP objects and prevent future build-ups, so I don't need to worry about the stability of this system.
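In case it helps frame an answer: the approach I've been considering is sketched below. This is my own reading of the Octopus docs, not something I've run yet, so please sanity-check both the approach and the exact flags before I go ahead.

```shell
# 1. Confirm multisite sync is caught up; stale sync would explain the
#    datalog shards retaining entries.
radosgw-admin sync status
radosgw-admin datalog status

# 2. If all zones are caught up, trim old datalog entries. The marker
#    is a placeholder here; I'd take it from "datalog status" output:
# radosgw-admin datalog trim --shard-id=47 --end-marker=<marker>

# 3. For the bucket index shards, resharding would spread the keys
#    across more RADOS objects. My understanding is that resharding in
#    a multisite deployment on Octopus needs special care, so I'd treat
#    this as a last resort:
# radosgw-admin bucket reshard --bucket=<bucket> --num-shards=<n>
```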

Thanks,
Alex


$ ceph -s
  cluster:
    id:     0b91b8be-3e01-4240-bea5-df01c7e53b7c
    health: HEALTH_WARN
            32 large omap objects

  services:
    mon: 3 daemons, quorum albans_sc0,albans_sc1,albans_sc2 (age 6w)
    mgr: albans_sc2(active, since 6w), standbys: albans_sc1, albans_sc0
    mds: cephfs:1 {0=albans_sc2=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 6w), 3 in (since 10M)
    rgw: 6 daemons active (albans_sc0.pubsub, albans_sc0.rgw0, albans_sc1.pubsub, albans_sc1.rgw0, albans_sc2.pubsub, albans_sc2.rgw0)

  task status:

  data:
    pools:   14 pools, 137 pgs
    objects: 4.52M objects, 160 GiB
    usage:   536 GiB used, 514 GiB / 1.0 TiB avail
    pgs:     137 active+clean

  io:
    client:   28 MiB/s rd, 1.2 MiB/s wr, 673 op/s rd, 189 op/s wr


$ ceph health detail
HEALTH_WARN 32 large omap objects
[WRN] LARGE_OMAP_OBJECTS: 32 large omap objects
    16 large objects found in pool 'siteApubsub.rgw.log'
    16 large objects found in pool 'siteApubsub.rgw.buckets.index'
    Search the cluster log for 'Large omap object found' for more details.

$ ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
ssd    1.0 TiB  514 GiB  496 GiB   536 GiB      51.07
TOTAL  1.0 TiB  514 GiB  496 GiB   536 GiB      51.07

--- POOLS ---
POOL                           ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics           1    1      0 B        0      0 B      0    153 GiB
cephfs_data                     2   32  135 GiB    1.99M  415 GiB  47.50    153 GiB
cephfs_metadata                 3   32  3.3 GiB    2.09M  9.8 GiB   2.09    153 GiB
siteA.rgw.buckets.data          4   32   24 GiB  438.62k   80 GiB  14.88    153 GiB
.rgw.root                       5    4   19 KiB       29  1.3 MiB      0    153 GiB
siteA.rgw.log                   6    4   79 MiB      799  247 MiB   0.05    153 GiB
siteA.rgw.control               7    4      0 B        8      0 B      0    153 GiB
siteA.rgw.meta                  8    4   13 KiB       37  1.6 MiB      0    153 GiB
siteApubsub.rgw.log             9    4  1.9 GiB      789  5.7 GiB   1.22    153 GiB
siteA.rgw.buckets.index        10    4  456 MiB       31  1.3 GiB   0.29    153 GiB
siteApubsub.rgw.control        11    4      0 B        8      0 B      0    153 GiB
siteApubsub.rgw.meta           12    4   11 KiB       40  1.7 MiB      0    153 GiB
siteApubsub.rgw.buckets.index  13    4  2.0 GiB       47  6.1 GiB   1.31    153 GiB
siteApubsub.rgw.buckets.data   14    4      0 B        0      0 B      0    153 GiB





_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


