Very low performance with Ceph Kraken (11.2), radosgw, and an erasure-coded pool

Hi
I have a 7-node Ceph cluster built with Kraken. HW details: each node has 5 x 1TB drives and a single SSD partitioned to provide a Ceph journal for each of the 5 drives.
Network is 10GigE. Each node has 16 CPUs (Intel Haswell family).

I also set up a radosgw instance on each of the 7 nodes, so 7 gateways in total.

Now, if I upload a single 1.5GB test file via s3cmd to the ceph cluster, into an erasure-coded pool -

pool 53 'default.rgw.buckets.data' erasure size 7 min_size 5 crush_ruleset 1 object_hash rjenkins pg_num 256 pgp_num 256 last_change 808 flags hashpspool stripe_width 4160
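
For reference, size 7 / min_size 5 and stripe_width 4160 correspond to a k=5, m=2 erasure-code profile. A pool like that would be created along these lines (the profile name and failure domain here are only illustrative, not necessarily what was used):

    ceph osd erasure-code-profile set ec-k5-m2 k=5 m=2 ruleset-failure-domain=host
    ceph osd pool create default.rgw.buckets.data 256 256 erasure ec-k5-m2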

then the data transfer speed is very low, averaging about 100MB/s. I understand that EC pools are slower than replicated pools, but this seems extremely low.

If I create a new replicated pool, stop the radosgw processes, rename the replicated pool to default.rgw.buckets.data, and retransfer the file (after clearing all system caches), the transfer rate is much higher, about 600MB/s. If I run it in parallel, uploading 6 different 1.5GB files to 6 of the radosgws simultaneously, then monitoring via ceph -w (or the ceph-dash tool I found online) shows total transfer speeds of around 2.5-3GB/s (about 400-500MB/s per radosgw).
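
In case it matters, the replicated-pool test was essentially a pool swap along these lines (pool name, pg counts, and the systemd target are approximate, not exact):

    ceph osd pool create data.rep 256 256 replicated
    systemctl stop ceph-radosgw.target          # on every node
    ceph osd pool rename default.rgw.buckets.data default.rgw.buckets.data.ec
    ceph osd pool rename data.rep default.rgw.buckets.data
    sync; echo 3 > /proc/sys/vm/drop_caches     # as root, on every node
    systemctl start ceph-radosgw.target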

Switching back to the same test - uploading 6 different 1.5GB files to 6 radosgws - but with the erasure-coded pool as the backend, throughput drops again to a cumulative 150MB/s (across all 6!).
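
The parallel test itself is just six concurrent s3cmd uploads, one per gateway; the host names, port, and bucket below are placeholders:

    for i in 1 2 3 4 5 6; do
        s3cmd --host=node$i:7480 --host-bucket=node$i:7480 put testfile$i s3://testbucket/ &
    done
    wait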

Why is s3 performance so much lower with the erasure-coded pool? iostat shows the drives are not even close to maxed out (utilization is around 20% at most).

Is there some tuning that needs to be done, or is something else missing here? I've also experimented with changing pg_num/pgp_num, setting them to 4096/8192, and performance with the EC pool is still much, much lower than with the replicated pool.
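
(The pg_num/pgp_num changes were applied the usual way, e.g. for the 4096 case:

    ceph osd pool set default.rgw.buckets.data pg_num 4096
    ceph osd pool set default.rgw.buckets.data pgp_num 4096
)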


Also, there seems to be another bug - sometimes the size of some objects shows up as 512k instead of 1.5GB (s3cmd ls s3://bucket/testfile1). However, "s3cmd info" or "radosgw-admin object stat --bucket=bucket --object=testfile1 | grep obj_size" shows the proper 1.5GB size.

Thanks,
Fani
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
