On Wed, Mar 16, 2016 at 06:36:33AM +0000, Pavan Rallabhandi wrote:
> I find this to be discussed here before, but couldn't find any solution
> hence the mail. In RGW, for a bucket holding objects in the range of ~
> millions, one can find it to take for ever to delete the bucket (via
> radosgw-admin). I understand the gc (and its parameters) that would
> reclaim the space eventually, but am looking more at the bucket deletion
> options that can possibly speed up the operation.

This ties in well with a mail I had sitting in my drafts but never got
around to sending.

While doing some rough benchmarking on bucket index sharding, I ran into
some terrible performance for deletion of non-existent keys. Shards did
NOT alleviate this performance issue, but did help elsewhere. The numbers
given below are for unsharded buckets; relatively empty buckets perform
worse when sharded, before performance picks up again.

Test methodology (a rough sketch of the measurement loop is appended at
the end of this mail):
- Fire single DELETE key ops at the RGW; not using multi-object delete.
- I measured the time taken for each delete, and report the 99th
  percentile here (1% of operations took longer than this).
- I took at least 1K samples for bucket sizes up to and including 10k
  keys per bucket. For 50k keys/bucket I capped it at the first 100
  samples instead of waiting 10 hours for the run to complete.
- The DELETE operations were run single-threaded, with no concurrency.

Test environments:
Both clusters were running Hammer 0.94.5 on Ubuntu precise; the hardware
is a long way from being new; there are no SSDs, and the journal is the
first partition on each OSD's disk. The test source host was unloaded,
and approximately 1ms of latency away from the RGWs.

Cluster 1 (Congress, ~1350 OSDs; production cluster; haproxy of 10 RGWs)
#keys-in-bucket    time per single key delete
0                  6.899ms
10                 7.507ms
100                13.573ms
1000               327.936ms
10000              4825.597ms
50000              33802.497ms
100000             did-not-finish

Cluster 2 (Benjamin, ~50 OSDs; test cluster, practically idle; haproxy of 2 RGWs)
#keys-in-bucket    time per single key delete
0                  4.825ms
10                 6.749ms
100                6.146ms
1000               6.816ms
10000              1233.727ms
50000              64262.764ms
100000             did-not-finish

The cases marked did-not-finish are where the RGW seems to time out the
operation, even with the client having an unlimited timeout. This also
occurred when connecting directly to CivetWeb rather than going through
HAProxy.

I'm not sure why the 100-key case on the second cluster seems to have
been faster than the 10-key case, but I'm willing to put it down to
statistical noise. The huge increase at the end, and the operation never
returning at all for the 100k-key bucket, are concerning.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
E-Mail   : robbat2@xxxxxxxxxx
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
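
P.S. For anyone who wants to reproduce this, here is a minimal sketch of
the measurement loop described above. It is not the exact script used to
produce the numbers; boto3, the endpoint, bucket name, and credentials are
all placeholders to be filled in for your own setup.

#!/usr/bin/env python3
# Sketch only: single-threaded DELETEs of non-existent keys against an RGW
# S3 endpoint, reporting the 99th-percentile latency. All names below
# (ENDPOINT, BUCKET, credentials) are placeholders, not values from the test.
import time
import boto3

ENDPOINT = "http://rgw.example.com:7480"   # placeholder RGW endpoint
BUCKET = "benchmark-bucket"                # placeholder pre-created bucket
SAMPLES = 1000                             # at least 1K samples per bucket size

s3 = boto3.client(
    "s3",
    endpoint_url=ENDPOINT,
    aws_access_key_id="ACCESS_KEY",        # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

latencies = []
for i in range(SAMPLES):
    key = "nonexistent-%06d" % i           # keys that were never created
    start = time.monotonic()
    # Single-object DELETE, one at a time, no concurrency.
    s3.delete_object(Bucket=BUCKET, Key=key)
    latencies.append(time.monotonic() - start)

latencies.sort()
p99 = latencies[int(len(latencies) * 0.99) - 1]
print("99th percentile delete latency: %.3f ms" % (p99 * 1000.0))

Run it once per bucket size (0, 10, 100, ... keys pre-loaded) to get a
table like the ones above; deleting a non-existent key returns success in
S3, so the loop measures only the index-side cost of the operation.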