RGW Indexing-in-PostgreSQL?

Martin Millnert <martin@xxxxxxxxxxx> · Tue, 31 Jan 2017 00:56:29 +0100

Hi,

we're running RGW with indexes placed on fast pools, whereas the data is
placed on slow pools. Write throughput is approximately 10% of what rados
bench achieves, and I guess we're hitting some locking/synchronization
behaviour, because parallelising the writes doesn't really speed it up.
I've seen the sharding option, and the indexless option, but neither one
of these seems like /the/ fix to me.
My limited knowledge of the RGW code makes me guess it's due to the
indexes, and possibly even additional metadata that RGW keeps
up-to-date?

Assuming that the index objects must be downloaded, rewritten, and
re-uploaded to the index pools by RGW (and that this requires
locking?), the thought that I've had for a while now is:
How hard is it to abstract these index operations and add support for
actually using PostgreSQL as a backend?
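
A minimal sketch of what such an abstraction could look like, purely to
make the idea concrete. Everything here is hypothetical: BucketIndex,
SqlBucketIndex, and the bucket_index table layout are invented for
illustration and are not taken from the RGW code. Python's sqlite3 stands
in for PostgreSQL so the example runs without a server. The point is only
that each index update becomes a single-row upsert in its own transaction.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List, Tuple

Entry = Tuple[str, int, str]  # (object key, size in bytes, etag)


class BucketIndex(ABC):
    """Hypothetical index interface; NOT RGW's actual API."""

    @abstractmethod
    def put(self, bucket: str, key: str, size: int, etag: str) -> None: ...

    @abstractmethod
    def delete(self, bucket: str, key: str) -> None: ...

    @abstractmethod
    def list_keys(self, bucket: str, prefix: str = "",
                  limit: int = 1000) -> List[Entry]: ...


class SqlBucketIndex(BucketIndex):
    """SQL-backed index; sqlite3 is an assumed stand-in for PostgreSQL."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS bucket_index ("
                " bucket TEXT NOT NULL, obj_key TEXT NOT NULL,"
                " size INTEGER NOT NULL, etag TEXT NOT NULL,"
                " PRIMARY KEY (bucket, obj_key))"
            )

    def put(self, bucket: str, key: str, size: int, etag: str) -> None:
        # One row per object: insert it, or overwrite the existing row.
        with self.conn:
            self.conn.execute(
                "INSERT INTO bucket_index (bucket, obj_key, size, etag)"
                " VALUES (?, ?, ?, ?)"
                " ON CONFLICT (bucket, obj_key)"
                " DO UPDATE SET size = excluded.size, etag = excluded.etag",
                (bucket, key, size, etag),
            )

    def delete(self, bucket: str, key: str) -> None:
        with self.conn:
            self.conn.execute(
                "DELETE FROM bucket_index WHERE bucket = ? AND obj_key = ?",
                (bucket, key),
            )

    def list_keys(self, bucket: str, prefix: str = "",
                  limit: int = 1000) -> List[Entry]:
        # Prefix match via substr() to avoid LIKE escaping issues.
        cur = self.conn.execute(
            "SELECT obj_key, size, etag FROM bucket_index"
            " WHERE bucket = ? AND substr(obj_key, 1, ?) = ?"
            " ORDER BY obj_key LIMIT ?",
            (bucket, len(prefix), prefix, limit),
        )
        return cur.fetchall()
```

Usage would be along the lines of
SqlBucketIndex(sqlite3.connect(":memory:")).put("mybucket", "a/1", 10, "etag1").
Whether RGW's index operations actually map this cleanly onto an interface
like this is exactly the open question.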

I don't mean to speak ill of a completely self-contained data
management system, etc., but using an (optional) tool that is a better(*)
fit for the job has some merit too. :-)

I'd be willing to take a look at it, I guess.

Thoughts?

Best,
Martin

(* Latency/HA/etc tradeoffs may vary based on use-case and
requirements.)