On 14/03/15 13:12, Tomas Vondra wrote:
On 14.3.2015 00:28, Vivekanand Joshi wrote:
Hi Guys,
So here is the full information attached as well as in the link
provided below:
http://pgsql.privatepaste.com/41207bea45
I can provide new information as well.
Thanks.
We still don't have EXPLAIN ANALYZE - how long was the query running (I
assume it got killed at some point)? It's really difficult to give you
any advices because we don't know where the problem is.
If EXPLAIN ANALYZE really takes too long (say, it does not complete
after an hour / over night), you'll have to break the query into parts
and first tweak those independently.
For example in the first message you mentioned that select from the
S_V_D_CAMPAIGN_HIERARCHY view takes ~9 minutes, so start with that. Give
us EXPLAIN ANALYZE for that query.
Few more comments:
(1) You're using CTEs - be aware that CTEs are not just aliases, but
impact planning / optimization, and in some cases may prevent
proper optimization. Try replacing them with plain views.
(2) Varadharajan Mukundan already recommended you to create index on
s_f_promotion_history.send_dt. Have you tried that? You may also
try creating an index on all the columns needed by the query, so
that "Index Only Scan" is possible.
(3) There are probably additional indexes that might be useful here.
What I'd try is adding indexes on all columns that are either a
foreign key or used in a WHERE condition. This might be an
overkill in some cases, but let's see.
(4) I suspect many of the relations referenced in the views are not
actually needed in the query, i.e. the join is performed but
then it's just discarded because those columns are not used.
Try to simplify the views as much has possible - remove all the
tables that are not really necessary to run the query. If two
queries need different tables, maybe defining two views is
a better approach.
(5) The vmstat / iostat data are pretty useless - what you provided are
averages since the machine was started, but we need a few samples
collected when the query is running. I.e. start the query, and then
give us a few samples from these commands:
iostat -x -k 1
vmstat 1
Would like to see if queries of these type can actually run in
postgres server?
Why not? We're running DWH applications on tens/hundreds of GBs.
If yes, what would be the minimum requirements for hardware? We would
like to migrate our whole solution on PostgreSQL as we can spend on
hardware as much as we can but working on a proprietary appliance is
becoming very difficult for us.
That's difficult to say, because we really don't know where the problem
is and how much the queries can be optimized.
I notice that no one appears to have suggested the default setting in
postgresql.conf - these need changing as they are initially set up for
small machines, and to let PostgreSQL take anywhere near full advantage
of a box have large amounts of RAM, you need to change some of the
configuration settings!
For example 'temp_buffers' (default 8MB) and 'maintenance_work_mem'
(default 16MB) should be drastically increased, and there are other
settings that need changing. The precise values depend on many factors,
but the initial values set by default are definitely far too small for
your usage.
Am assuming that you are looking at PostgreSQL 9.4.
Cheers,
Gavin
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance