On 7/1/2015 3:08 AM, ben.play wrote:
In fact, the cron job will:
-> select about 10,000 rows from a big table (>100 GB of data); each user
has about 10 rows
-> run each row through an algorithm
-> after each row, update a few parameters for the user (add some points,
for example)
-> then insert a row into another table to record each transaction for the
user.
All of these updates and inserts are performed ONLY by the cron job ...
Therefore ... the merge can be done easily: nothing else can update this
new data.
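In code, the job looks roughly like this (a sketch in Python with
psycopg2; the table and column names here are simplified placeholders,
not the real schema):

import psycopg2

def examine(payload):
    # stand-in for the real per-row algorithm; returns points earned
    return len(payload) % 10

conn = psycopg2.connect("dbname=app")
with conn:                               # one transaction for the whole batch
    with conn.cursor() as cur:
        cur.execute("SELECT id, user_id, payload FROM events LIMIT 10000")
        for row_id, user_id, payload in cur.fetchall():
            points = examine(payload)
            cur.execute("UPDATE users SET points = points + %s WHERE id = %s",
                        (points, user_id))
            cur.execute("INSERT INTO user_transactions (user_id, event_id, points)"
                        " VALUES (%s, %s, %s)",
                        (user_id, row_id, points))
conn.close()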
But ... how do big companies like Facebook or YouTube run calculations
like this on (a) dedicated server(s) without impacting users?
that sort of batch processing is not normally done in database-centric
systems; rather, databases are usually updated continuously in realtime,
as the events come in, via transactions.
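for example, the same points update and log insert can run in one short
transaction at the moment the event arrives, instead of being saved up
for a nightly batch (a sketch, reusing the invented python/psycopg2
names from the sketch quoted above):

import psycopg2

def examine(payload):
    return len(payload) % 10            # same stand-in scoring as above

def handle_event(conn, user_id, payload):
    points = examine(payload)
    with conn:                          # one short transaction per event
        with conn.cursor() as cur:
            cur.execute("UPDATE users SET points = points + %s WHERE id = %s",
                        (points, user_id))
            cur.execute("INSERT INTO user_transactions (user_id, points)"
                        " VALUES (%s, %s)",
                        (user_id, points))

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=app")
    handle_event(conn, 42, "example payload")
    conn.close()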
your cron task is undoubtedly single-threaded, which means it runs on
one core only, so the whole system ends up waiting on a single task
crunching massive amounts of data while your other processor cores have
nothing to do.
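one way to put those idle cores to work, assuming each user's rows
really are independent, is to split the batch across worker processes,
each with its own connection (again just a sketch with the same
invented names, using python's multiprocessing):

from multiprocessing import Pool
import psycopg2

def examine(payload):
    return len(payload) % 10            # same stand-in scoring as before

def process_users(user_ids):
    # each worker opens its own connection: psycopg2 connections
    # cannot be shared across processes. partitioning the work by
    # user also keeps two workers from updating the same users row.
    conn = psycopg2.connect("dbname=app")
    with conn:
        with conn.cursor() as cur:
            for uid in user_ids:
                cur.execute("SELECT id, payload FROM events WHERE user_id = %s",
                            (uid,))
                for row_id, payload in cur.fetchall():
                    points = examine(payload)
                    cur.execute("UPDATE users SET points = points + %s"
                                " WHERE id = %s", (points, uid))
                    cur.execute("INSERT INTO user_transactions"
                                " (user_id, event_id, points)"
                                " VALUES (%s, %s, %s)",
                                (uid, row_id, points))
    conn.close()

if __name__ == "__main__":
    user_ids = list(range(1, 1001))              # users with pending rows
    chunks = [user_ids[i::8] for i in range(8)]  # one slice per worker
    with Pool(8) as pool:
        pool.map(process_users, chunks)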
it sounds to me like whoever designed this system didn't have a solid
grasp of transactional database processing.
--
john r pierce, recycling bits in santa cruz