On 01/30/2017 05:56 PM, Martin Millnert wrote:
Hi, we're running RGW with the bucket indexes placed on fast pools and the data placed on slow pools. Write throughput through RGW is roughly 10% of what rados bench achieves against the same pools, and I suspect we're hitting some locking/synchronization behavior, because parallelization doesn't really speed it up. I've seen the sharding option and the indexless option, but neither of these seems like /the/ fix to me. My limited knowledge of the RGW code makes me guess it's due to the indexes, and possibly the additional metadata that RGW keeps up to date. Assuming the index objects must be downloaded, rewritten, and re-uploaded to the index pool by RGW (and that this involves locking?), the thought I've had for a while now is: how hard would it be to abstract these index operations and add support for actually using PostgreSQL as a backend?
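For reference, the two options I mean are roughly these (option and placement names from memory, so please double-check against the docs for your release):

    # ceph.conf: pre-shard new bucket indexes across N rados objects
    # (0, the default, means a single index object per bucket)
    rgw_override_bucket_index_max_shards = 8

    # zone placement target that skips the bucket index entirely
    # (buckets placed there can no longer be listed)
    radosgw-admin zone placement modify --rgw-zone=default \
        --placement-id=indexless-placement --index-type=indexless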
The indexes are definitely something to watch! It would be very interesting to see how your write benchmark does with indexless buckets. Another thing we've seen in the past: when the bucket indexes are slow (in one case because the bucket index pool only had 8 PGs!), RGW ends up spending most of its time waiting on bucket index updates. You can grab a couple of GDB stack traces of the RGW process to see if that's happening in your case.
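Something along these lines is usually enough (it briefly pauses the process; the pid lookup assumes the daemon binary is named radosgw):

    # dump backtraces of every thread in the running radosgw process
    gdb -p $(pidof radosgw) --batch -ex 'thread apply all bt' > rgw_bt.txt

If most worker threads are sitting in the bucket index update path (cls_rgw calls) rather than in data writes, the index pool is your bottleneck.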
I don't mean to speak ill of a completely self-contained data management system, etc., but the (more) right(*) (optional) tool for the job has some quality to it too. :-) I'd be willing to take a look at implementing it, I guess. Thoughts?
I'd first figure out whether it's actually the bucket indexes. Like you said, there's more metadata associated with RGW, so at least some of what you're seeing could be metadata getting pushed out of the inode and into separate extents. Are you writing lots of really small objects, or fewer larger ones?
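One quick way to tell whether the slowdown is below RGW (small-object costs on the OSDs) or inside RGW (index updates) is to run rados bench directly against the data pool at the object sizes you're actually writing; the pool name here is the default, adjust to yours:

    # 60s of 4 KiB writes with 16 concurrent ops, straight to the data pool
    rados bench -p default.rgw.buckets.data 60 write -b 4096 -t 16 --no-cleanup
    # same test with 4 MiB objects for comparison
    rados bench -p default.rgw.buckets.data 60 write -b 4194304 -t 16 --no-cleanup
    rados -p default.rgw.buckets.data cleanup

If the small-object numbers already crater without RGW in the picture, the problem is below RGW rather than in the index updates.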
Best, Martin

(*) Latency/HA/etc. tradeoffs may vary based on use case and requirements.