Slow Requests when deep scrubbing PGs that hold Bucket Index

Hi,

I'm using ceph primarily for block storage (which works quite well) and as an object gateway using the S3 API.

Here is some info about my system:
Ceph: 12.2.4, OS: Ubuntu 18.04
OSD: Bluestore
6 servers in total, about 60 OSDs, 2TB SSDs each, no HDDs, CFQ scheduler
20 GBit private network
20 GBit public network
Block storage and object storage runs on separate disks

Main use case:
Saving small (30KB - 2MB) objects in rgw buckets. 
- dynamic bucket index resharding is disabled for now, but I keep the number of index entries per shard at about 100k.
- data pool: EC4+2
- index pool: replicated (3)
- atm around 500k objects in each bucket

My problem:
Sometimes, I get "slow request" warnings like so:
"[WRN] Health check update: 7 slow requests are blocked > 32 sec (REQUEST_SLOW)"

It turned out that these warnings appear whenever specific PGs are being deep scrubbed.
After further investigation, I found that these PGs hold the bucket index of the RADOS gateway.
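In case anyone wants to check the same thing on their cluster, this is roughly how I mapped the warnings to the index PGs (pool and bucket names below are from my setup, adjust as needed):

# get the bucket's ID (index shard objects are named .dir.<bucket_id>.<shard>)
radosgw-admin metadata get bucket:mybucket

# list the index shard objects in the index pool
rados -p default.rgw.buckets.index ls | grep '^\.dir\.'

# map one shard object to its PG and acting OSDs
ceph osd map default.rgw.buckets.index .dir.<bucket_id>.0

The PGs printed by "ceph osd map" were exactly the ones whose deep scrubs coincided with the slow request warnings.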

I already tried some configuration changes like:
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 0'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
ceph tell osd.* injectargs '--osd_scrub_sleep 1'
ceph tell osd.* injectargs '--osd_deep_scrub_stride 1048576'
ceph tell osd.* injectargs '--osd_scrub_chunk_max 1'
ceph tell osd.* injectargs '--osd_scrub_chunk_min 1'

These changes mitigated the effects considerably, but the problem is still there.
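I'm also considering confining scrubs to off-peak hours so the remaining impact at least hits us when load is low. Something along these lines (the hours are just an example, and I'd persist it in ceph.conf so it survives OSD restarts):

ceph tell osd.* injectargs '--osd_scrub_begin_hour 1'
ceph tell osd.* injectargs '--osd_scrub_end_hour 5'

# and in ceph.conf:
[osd]
osd scrub begin hour = 1
osd scrub end hour = 5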

Does anybody else have this issue?

I have a few questions to better understand what's going on:

As far as I know, the bucket index is stored as omap data in RocksDB, and the (empty) objects in the index pool are just handles for that omap data. Is that correct?

How does a deep scrub affect rocksdb?
Does the index pool even need deep scrubbing or could I just disable it?
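If disabling it per pool is a reasonable option, I believe there are per-pool flags for this (pool name is from my setup), e.g.:

ceph osd pool set default.rgw.buckets.index nodeep-scrub 1

though I'm not sure whether skipping deep scrubs on the index pool is actually safe, which is why I'm asking.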

Also:

Does it make sense to create more index shards to get the objects per shard down to let's say 50k or 20k?

Right now, I have about 500k objects per bucket. I want to increase that number to a couple of hundred million objects. Do you see any problems with that, provided that the bucket index is sharded appropriately?
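For reference, at 100k entries per shard a bucket with 200 million objects would need around 2000 shards. With dynamic resharding disabled, I assume I'd have to do this offline (bucket name and shard count are just examples):

radosgw-admin bucket reshard --bucket=mybucket --num-shards=2000

Is resharding at that scale something people do in practice, or should the data be split across multiple buckets instead?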
 
Any help is appreciated. Let me know if you need anything like logs, configs, etc.

Thanks! 

Christian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
