On Mon, Jun 2, 2014 at 2:47 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:
On Mon, Jun 2, 2014 at 12:52 PM, Kevin Goess <kgoess@xxxxxxxxxxx> wrote:
> So my conclusion is that for now, the best way to scale read-only queries for a sharded master is to
> implement map-reduce at the application level.

That's the conclusion I would expect. It's the price you pay for sharding, it's part of the deal.

But it's also the benefit you get from sharding. Once your read traffic grows to the point that it's too much for a single host, you're going to have to re-shard it all again *anyway*. The whole point of sharding is that it allows you to grow outside the capacities of a single host.

I am not sure I am following you completely. I can replicate the read-only slaves almost as much as I want (with chained replication), so why would I be limited to a single host? You would have a point concerning database size, but in my case, the main reason I need to shard is because of the amount of writes.

Not sure if this will work for you, but I'm sharing a similar scenario in case it helps.
An extension I wrote provides logical replication similar to what you've probably seen in other tools: https://github.com/omniti-labs/mimeo
A client of ours had a table sharded by UUID across 512 clusters but needed that data pulled into a single cluster for reporting purposes. The tables also had a timestamp column that was set on each insert/update, so the incremental replication method could be used to pull data from all clusters into a single cluster. The single reporting cluster then just had an inheritance setup, with an empty parent table over all the child tables that the data was pulled into.
Yes, it was a lot of setup since each of the 512 tables had to be set up individually. But once it was set up it worked surprisingly well. And it's honestly a use case I had never foreseen for the extension.
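
To make the idea more concrete, here is a rough Python sketch of the pattern described above: pull rows whose timestamp is newer than the last one seen from each shard into its own child table on the reporting side. This is not mimeo's code or API. It uses in-memory SQLite databases as stand-ins for the clusters, and the table names (events, events_shard_N), the events_all view and all function names are made up for illustration. SQLite has no table inheritance, so a UNION ALL view plays the role of the empty parent table.

```python
import sqlite3

# Hypothetical schema: updated_at is the column set on every insert/update.
SCHEMA = "(id TEXT PRIMARY KEY, payload TEXT, updated_at INTEGER)"


def make_shard(rows):
    """Create an in-memory database standing in for one sharded cluster."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE events {SCHEMA}")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def make_reporting(n_shards):
    """Create the reporting database with one child table per shard.

    The events_all view is a stand-in for an empty parent table that
    all child tables inherit from.
    """
    conn = sqlite3.connect(":memory:")
    children = [f"events_shard_{i}" for i in range(n_shards)]
    for child in children:
        conn.execute(f"CREATE TABLE {child} {SCHEMA}")
    union = " UNION ALL ".join(f"SELECT * FROM {c}" for c in children)
    conn.execute(f"CREATE VIEW events_all AS {union}")
    conn.commit()
    return conn, children


def pull_incremental(shard, reporting, child, last_seen):
    """Copy rows changed after last_seen into the child table.

    Returns the new high-water mark for this shard.
    """
    rows = shard.execute(
        "SELECT id, payload, updated_at FROM events "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Updated rows replace the older copy with the same primary key.
    reporting.executemany(
        f"INSERT OR REPLACE INTO {child} VALUES (?, ?, ?)", rows
    )
    reporting.commit()
    return rows[-1][2] if rows else last_seen


def sync_all(shards, reporting, children, marks):
    """Run one incremental pull per shard, updating marks in place."""
    for i, (shard, child) in enumerate(zip(shards, children)):
        marks[i] = pull_incremental(shard, reporting, child, marks[i])
    return marks
```

Calling sync_all repeatedly with the marks it returns only moves the rows that changed since the previous run, and reporting queries go against events_all. A real setup also has to deal with rows that share the boundary timestamp and with deletes, both of which this sketch ignores.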