Re: Merge a sharded master into a single read-only slave

Keith Fiske <keith@xxxxxxxxxx> · Wed, 4 Jun 2014 13:50:31 -0400

On Mon, Jun 2, 2014 at 2:47 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:

On Mon, Jun 2, 2014 at 12:52 PM, Kevin Goess <kgoess@xxxxxxxxxxx> wrote:

> So my conclusion is that for now, the best way to scale read-only queries for a sharded master is to 

> implement map-reduce at the application level. 

That's the conclusion I would expect. It's the price you pay for sharding, it's part of the deal.  

But it's also the benefit you get from sharding.  Once your read traffic grows to the point that it's too much for a single host, you're going to have to re-shard it all again *anyway*.  The whole point of sharding is that it allows you to grow outside the capacities of a single host.

I am not sure I am following you completely. I can replicate the read-only slaves almost as much as I want (with chained replication), so why would I be limited to a single host ? You would have a point concerning database size, but in my case, the main reason I need to shard is because of the amount of writes.

Not sure if this will work for you, but sharing a similar scenario in case it may work for you.

An extension I wrote provides similar logical replication as you've probably seen in other tools. 

https://github.com/omniti-labs/mimeo 
One
 difference is it has several methods for this replication, one being 
incremental based on either time or serial ID. Since incremental 
replication requires just read-only access on the 
source databases, it causes no extra write overhead as most logical 
replication solutions do (triggers writing to queue tables). 

A 
client of ours had a table sharded by UUID to 512 clusters but needed 
that data pulled to a single cluster for reporting purposes. The tables 
also had a timestamp column that was set on each insert/update, so the 
incremental replication method was able to be used here to pull data 
from all clusters to a single cluster. The single reporting cluster then
 just had an inheritance table set up with an empty parent table pointing
 to all the child tables that pulled data into them.

Yes, it was a
 lot of setup since each of the 512 tables has to be set up 
individually. But once it was set up it worked surprisingly well. And 
it's honestly a use case I had never foreseen for the extension. 

Not sure if this would work in your case, but maybe it can at least give you an idea of what can be done.

--
Keith Fiske

Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com