Hi,

we're running RGW with the indexes placed on fast pools and the data placed on slow pools. The write throughput is approximately 10% of what rados bench achieves, and I guess we're hitting some locking/synchronization effect, because parallelisation doesn't really speed it up.

I've seen the sharding option and the indexless option, but neither of these seems like /the/ fix to me. My limited knowledge of the RGW code makes me guess it's due to the indexes, and possibly even additional metadata that RGW keeps up-to-date?

Assuming that the index objects must be downloaded, rewritten, and re-uploaded to the index pools by RGW (and that this requires locking?), the thought that I've had for a while now is: how hard would it be to abstract these index operations and add support for actually using PostgreSQL as a backend?

I don't mean to say bad words vis-à-vis a completely self-contained data management system, etc., but having the (more) right(*), optional tool for the job has some quality to it too. :-)

I'd be willing to take a look at it, I guess.

Thoughts?

Best,
Martin

(*) Latency/HA/etc. tradeoffs may vary based on use case and requirements.
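
P.S. To make the idea a bit more concrete, here is a rough sketch of what "abstracting the index operations" could look like. Everything in it is hypothetical: `BucketIndex`, `SqlBucketIndex`, `list_keys`, and the `bucket_index` table are made-up names, not anything from the RGW code, and SQLite from the Python standard library stands in for PostgreSQL only so that the snippet is self-contained.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List, Tuple


class BucketIndex(ABC):
    """Hypothetical index interface; NOT an actual RGW API."""

    @abstractmethod
    def put(self, bucket: str, key: str, size: int, etag: str) -> None:
        """Insert or overwrite the index entry for one object."""

    @abstractmethod
    def delete(self, bucket: str, key: str) -> None:
        """Remove the index entry for one object."""

    @abstractmethod
    def list_keys(self, bucket: str, prefix: str = "",
                  limit: int = 1000) -> List[Tuple[str, int, str]]:
        """Return (key, size, etag) tuples in key order."""


class SqlBucketIndex(BucketIndex):
    """SQL-backed sketch: one row per object, so a write touches one row
    instead of rewriting a whole index object (assumption, see above)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS bucket_index ("
                " bucket TEXT NOT NULL, key TEXT NOT NULL,"
                " size INTEGER NOT NULL, etag TEXT NOT NULL,"
                " PRIMARY KEY (bucket, key))"
            )

    def put(self, bucket: str, key: str, size: int, etag: str) -> None:
        # SQLite-specific upsert; PostgreSQL would use
        # INSERT ... ON CONFLICT (bucket, key) DO UPDATE instead.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO bucket_index (bucket, key, size, etag)"
                " VALUES (?, ?, ?, ?)",
                (bucket, key, size, etag),
            )

    def delete(self, bucket: str, key: str) -> None:
        with self.conn:
            self.conn.execute(
                "DELETE FROM bucket_index WHERE bucket = ? AND key = ?",
                (bucket, key),
            )

    def list_keys(self, bucket: str, prefix: str = "",
                  limit: int = 1000) -> List[Tuple[str, int, str]]:
        # substr() comparison avoids having to escape LIKE wildcards.
        cur = self.conn.execute(
            "SELECT key, size, etag FROM bucket_index"
            " WHERE bucket = ? AND substr(key, 1, length(?)) = ?"
            " ORDER BY key LIMIT ?",
            (bucket, prefix, prefix, limit),
        )
        return cur.fetchall()


if __name__ == "__main__":
    index = SqlBucketIndex(sqlite3.connect(":memory:"))
    index.put("photos", "2020/a.jpg", 1024, "etag-a")
    index.put("photos", "2020/b.jpg", 2048, "etag-b")
    print(index.list_keys("photos", prefix="2020/"))
```

The point is only the shape of the interface: if RGW's index updates could be funnelled through something like this, the backend behind it would become swappable. Whether the real index code paths can be cut along these lines is exactly what I don't know.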