Large OSD omap directories (LevelDBs)

Hello Ceph folk,


We have a Ceph cluster (info at the bottom) with some odd omap directory sizes in our OSDs.
We're looking at 1439 OSDs, where the most common omap sizes are 15-40 MB.
However, a quick sampling reveals some outliers: among roughly the 100 largest omaps, sizes climb to a few hundred MB, then into single-digit GB, and then jump sharply for the last 10 or so:

14G    /var/lib/ceph/osd/ceph-769/current/omap
35G    /var/lib/ceph/osd/ceph-1278/current/omap
48G    /var/lib/ceph/osd/ceph-899/current/omap
49G    /var/lib/ceph/osd/ceph-27/current/omap
57G    /var/lib/ceph/osd/ceph-230/current/omap
58G    /var/lib/ceph/osd/ceph-343/current/omap
58G    /var/lib/ceph/osd/ceph-948/current/omap
60G    /var/lib/ceph/osd/ceph-470/current/omap
66G    /var/lib/ceph/osd/ceph-348/current/omap
67G    /var/lib/ceph/osd/ceph-980/current/omap
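
For reference, the listing above was collected with something along the lines of the sketch below; it assumes filestore OSDs under the default /var/lib/ceph/osd/ceph-*/current path and password-less ssh to the OSD hosts, and "osd-hosts" is just a placeholder for our host list file:

    for host in $(cat osd-hosts); do
        # each du line is "<size in MB><tab><omap path>"
        ssh "$host" 'du -sm /var/lib/ceph/osd/ceph-*/current/omap' \
            | awk -v h="$host" -v OFS='\t' '{print $1, h ":" $2}'
    done | sort -n | tail -100   # the 100 largest omaps across the cluster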


Any omap that's 500 MB when most are around 25 MB is worrying, but 67 GB is extremely worrying; something doesn't seem right. The 67 GB omap has 37k .sst files, and the oldest file in there is from Feb 21st.
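
For what it's worth, those numbers came from simple read-only checks on the OSD host (ceph-980's path shown; adjust the OSD id), roughly:

    OMAP=/var/lib/ceph/osd/ceph-980/current/omap
    find "$OMAP" -name '*.sst' | wc -l   # count of LevelDB .sst files (~37k here)
    ls -ltr "$OMAP" | head               # oldest files first (oldest from Feb 21st)
    # The next step would presumably be ceph-kvstore-tool against a *stopped*
    # OSD's omap dir (e.g. "ceph-kvstore-tool leveldb $OMAP list") to see which
    # key prefixes dominate, but we haven't gone down that road yet.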

Has anyone seen this before and can point me in the right direction to start digging?

Cluster info:

ceph version 11.2.0

     monmap e8: 5 mons at {...}
            election epoch 1850, quorum 0,1,2,3,4 ceph-mon1,ceph-mon2,ceph-mon3,ceph-mon4,ceph-mon5
        mgr active: ceph-mon4 standbys: ceph-mon3, ceph-mon2, ceph-mon1, ceph-mon5
     osdmap e27138: 1439 osds: 1439 up, 1439 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v10911626: 5120 pgs, 21 pools, 1834 TB data, 61535 kobjects
            2525 TB used, 5312 TB / 7837 TB avail
                5087 active+clean
                  17 active+clean+scrubbing
                  16 active+clean+scrubbing+deep

Most of the pools are small 64-PG pools for RGW metadata. The data lives in 3 pools with 1024 PGs and another 2 with 512 PGs (which lines up with the 5120 PGs in the pgmap above: 16 x 64 + 3 x 1024 + 2 x 512 = 5120). The data pools all use EC 8+3; the auxiliary ones are replicated.

Our data is written into the pools via the libradosstriper interface, which adds some xattrs so the data can be read back (stripe count, stripe size, stripe unit size, and the original pre-striping size), and the client also adds a couple of checksum-related attributes.
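
If it helps, this is the kind of per-object spot check we can do with the stock rados CLI to see what actually ends up on the objects; the pool and object names below are placeholders, not real ones from our cluster:

    # xattrs added by libradosstriper (stripe layout, original size, ...)
    rados -p somepool listxattr someobject.0000000000000000
    # omap keys on the same object; mainly relevant to the replicated pools,
    # since the EC pools don't take omap
    rados -p somepool listomapkeys someobject.0000000000000000 | wc -l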



Thanks,

George
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


