Re: Large RocksDB (db_slow_bytes) on OSD which is marked as out

Could you please run:  ceph daemon osd.<id> calc_objectstore_db_histogram

and share the output?


On 8/31/2020 4:33 PM, Wido den Hollander wrote:


On 31/08/2020 12:31, Igor Fedotov wrote:
Hi Wido,

The 'b' prefix relates to the freelist manager, which tracks all the free extents of the main device in a bitmap. Its records have a fixed size, hence you can easily estimate the overall size of this type of data.
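For example, with the default 4 KiB bitmap block size and bluestore_freelist_blocks_per_key = 128 (check the actual values on your cluster), one 'b' record covers 512 KiB of the main device. A hypothetical 8 TB device would then need roughly:

    8 TB / 512 KiB               ~= 15.3 million 'b' keys
    15.3M keys * ~30 bytes each  ~= a few hundred MB

so even on a large device the 'b' records alone should stay far below what you are seeing.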


Yes, so I figured.

But I doubt it takes that much space. I presume the DB just lacks proper compaction, which would have happened eventually, but it looks like you interrupted the process by taking the OSD offline.

Maybe try a manual compaction with ceph-kvstore-tool?
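For example, offline with the OSD stopped (adjust the id; the path is the usual OSD data directory):

    systemctl stop ceph-osd@<id>
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
    systemctl start ceph-osd@<id>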


This cluster is suffering from a lot of spillover, so we tested by marking one OSD as out.

After being marked out it still had this large DB. A compaction didn't help; the RocksDB database stayed just as large.

New OSDs coming into the cluster aren't suffering from this; their RocksDB is only a couple of MB in size.

Old OSDs installed with Luminous and now upgraded to Nautilus are suffering from this.

It seems like garbage data stays behind in RocksDB and is never cleaned up.

Wido


Thanks,

Igor



On 8/31/2020 10:57 AM, Wido den Hollander wrote:
Hello,

On a Nautilus 14.2.8 cluster I am seeing large RocksDB databases with many slow DB bytes in use.

To investigate this further I marked one OSD as out and waited for all the backfilling to complete.

Once the backfilling was complete I exported the BlueFS and investigated the RocksDB using 'ceph-kvstore-tool'. This resulted in 22 GB of data.
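(Roughly like this, with the OSD stopped; the paths are just examples:

    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-<id> \
        --out-dir /tmp/osd-<id>-export bluefs-export

The RocksDB then sits in /tmp/osd-<id>-export/db; if there are db.slow or db.wal directories in the export, their contents may need to be moved into db before ceph-kvstore-tool will open it.)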

Listing all the keys in the RocksDB shows me there are 747,000 keys in the DB. A small portion are osdmaps, but the vast majority are keys prefixed with 'b'.
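(Counted against the exported copy; assuming 'list' prints the prefix as the first whitespace-separated field:

    ceph-kvstore-tool rocksdb /tmp/osd-<id>-export/db list \
        | awk '{print $1}' | sort | uniq -c | sort -rn
)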

I dumped the stats of the RocksDB and this shows me:

L1: 1/0: 439.32 KB
L2: 1/0: 2.65 MB
L3: 5/0: 14.36 MB
L4: 127/0: 7.22 GB
L5: 217/0: 13.73 GB
Sum: 351/0: 20.98 GB
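(If your build of ceph-kvstore-tool has the 'stats' subcommand, the above can be produced with:

    ceph-kvstore-tool rocksdb /tmp/osd-<id>-export/db stats
)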

So there is almost 21GB of data in this RocksDB database. Why? Where is this coming from?

Throughout this cluster, OSDs are suffering from many slow DB bytes used, and I can't figure out why.
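The per-OSD numbers come from the BlueFS perf counters, e.g.:

    ceph daemon osd.<id> perf dump bluefs | egrep 'db_used_bytes|slow_used_bytes'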

Has anybody seen this or has a clue on what is going on?

I have an external copy of this RocksDB database to do investigations on.

Thank you,

Wido
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx