On Wed, Mar 5, 2014 at 12:26 AM, Christopher O'Connell <cjo@xxxxxxxxxxxxxx> wrote:
> Hello,
>
> There are several older discussions regarding RGW performance with high
> volumes of small files.
>
> I'm planning on running some tests on our test cluster to benchmark this
> performance, but before I do I wanted to ask several questions, to make sure
> that my test is valid.
>
> 1) Does firefly have any meaningful performance increase in this regard? I
> took a look at the commit history for src/rgw and I didn't see anything that
> appeared to change it, but if it does, then I'll perform my test on firefly.

There was a minor fix to the rgw cache that switched from a regular mutex
to a read-write lock.

> 2) The best practice seems to be sharding across multiple buckets. Other than
> the small overhead for bucket metadata, is there any downside to sharding
> to many buckets (e.g. 1024 buckets) instead of to just a few (e.g. 16)?

I don't see any downside to that.

> 3) Having a bucket with a huge number of items (e.g. 50 million) should only
> affect performance of that bucket, correct? Or will loading the large map to
> perform operations on it potentially affect other requests through the RGW
> by eating all of the memory?

The bucket index is not sharded and the leveldb backend is shared, so it
would affect other objects / indexes as well.

Yehuda
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
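
For reference, the client-side bucket sharding raised in question 2 could look roughly like the sketch below. This is only an illustration, not something RGW provides: the function name shard_bucket, the "data" prefix, and the choice of SHA-256 are all assumptions made for the example. The idea is simply to hash each object key and use the result to pick one of N bucket names, so that no single bucket index grows to tens of millions of entries.

```python
import hashlib


def shard_bucket(object_key: str, num_buckets: int = 1024, prefix: str = "data") -> str:
    """Map an object key to one of num_buckets bucket names.

    Hypothetical naming scheme: "<prefix>-<zero-padded index>".
    The same key always maps to the same bucket.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    # Hash the key so that keys spread roughly evenly across buckets.
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % num_buckets
    return f"{prefix}-{index:04d}"


if __name__ == "__main__":
    for key in ("images/cat.jpg", "images/dog.jpg", "logs/2014-03-05.txt"):
        print(key, "->", shard_bucket(key, num_buckets=16))
```

Note that the bucket count has to be fixed up front with a scheme like this: changing num_buckets later changes where existing keys map, so objects would need to be moved or looked up under both layouts.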