Re: PGSQL 9.3 - Materialized View - multithreading

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



- http://wiki.postgresql.org/wiki/Performance_Optimization
- run it on the most powerful machine you can find
- get some more memory
- get a big (512-1TB) SSD drive
- avoid recalculating the same things over and over. if your views have many similar elements, then calculate those first into a partial result, then build the final views from the partial result.
- make sure your source tables are fully indexed and have good statistics
- run all the views once with \timing and keep track of how long they took. Fix the slow ones.

G


On 07 Apr 2014, at 15:56, Nicolas Paris <niparisco@xxxxxxxxx> wrote:

> Excellent.
> 
> Maybe the last sub-question :
> 
> Those 3600 mat views do have indexes. 
> I guess I will get better performances in dropping indexes first, then refresh, then re-creating indexes.
> 
> Are there other way to improve performances (like mat views storage parameters), because this routines will be at night, and need to be finished quickly.
> 
> Thanks
> 
> Nicolas PARIS
> 
> 
> 2014-04-07 14:59 GMT+02:00 Graeme B. Bell <grb@xxxxxxxxxxxxxxxxx>:
> 
> Hi again Nick.
> 
> Glad it helped.
> 
> Generally, I would expect that doing all the A's first, then all the B's, and so on, would be fastest since you can re-use the data from cache.
> 
> Concurrency when reading isn't generally a problem. Lots of things can read at the same time and it will be nice and fast.
> It's concurrent writes or concurrent read/write of the same data item that causes problems with locking. That shouldn't be happening here, judging by your description.
> 
> If possible, try to make sure nothing is modifying those source tables A/B/C/D/E/F when you are doing your view refresh.
> 
> Graeme.
> 
> On 07 Apr 2014, at 14:49, Nicolas Paris <niparisco@xxxxxxxxx> wrote:
> 
> > Hello,
> > Thanks for this clear explanation !
> >
> > Then I have a sub-question :
> > Supposed I have 3600 materialised views say 600 mat views from 6 main table. (A,B,C,D,E,F are repetead 600 times with some differences)
> > Is it faster to :
> > 1) parallel refresh  600 time A, then 600 time B etc,
> > OR
> > 2) parallel refresh  600 time A,B,C,D,E,F
> >
> > I guess 1) is faster because they are 600 access to same table loaded in memory ? But do parallel access to the same table implies concurency
> >  and bad performance ?
> >
> > Thanks
> >
> > Nicolas PARIS
> >
> >
> > 2014-04-07 12:29 GMT+02:00 Graeme B. Bell <grb@xxxxxxxxxxxxxxxxx>:
> > On 04 Apr 2014, at 18:29, Nicolas Paris <niparisco@xxxxxxxxx> wrote:
> >
> > > Hello,
> > >
> > > My question is about multiprocess and materialized View.
> > > http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html
> > > I (will) have something like 3600 materialised views, and I would like to know the way to refresh them in a multithread way
> > > (anderstand 8 cpu cores -> 8 refresh process  in the same time)
> >
> > Hi Nick,
> >
> > out of DB solution:
> >
> > 1. Produce a text file which contains the 3600 refresh commands you want to run in parallel. You can do that with select and format() if you don't have a list already.
> >
> > 2. I'm going to simulate your 3600 'refresh' commands here with some select and sleep statements that finish at unknown times.
> >
> > (In BASH):
> >   for i in {1..3600} ; do echo "echo \"select pg_sleep(1+random()::int*10); select $i\" | psql mydb" ; done > 3600commands
> >
> > 3. Install Gnu Parallel     and type:
> >
> > parallel < 3600commands
> >
> > 4. Parallel will automatically work out the appropriate number of cores/threads for your CPUs, or you can control it manually with -j.
> > It will also give you a live progress report if you use --progress.
> > e.g. this command balances 8 jobs at a time, prints a dynamic progress report and dumps stdout to /dev/null
> >
> > parallel -j 8 --progress  < 3600commands > /dev/null
> >
> > 5. If you want to make debugging easier use the parameter --tag to tag output for each command.
> >
> > Of course it would be much more elegant if someone implemented something like Gnu Parallel inside postgres or psql ... :-)
> >
> > Hope this helps & have a nice day,
> >
> > Graeme.
> >
> >
> >
> >
> >
> >
> 
> 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance





[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux