Hello,
isn't the first test superfluous here ?
> where extract('day' from
message_copies.msg_date_rec) = 17
> and date_trunc('day',
message_copies.msg_date_rec) = '2012-07-17'
> Here is the index:
>
> CREATE INDEX idx_message_copies_wk2_date_src_pos_partial
> ON feed_all_y2012m07.message_copies_wk2
> USING btree
> (date_trunc('day'::text, msg_date_rec),
> src_id,
> (((pos_georef1::text || pos_georef2::text) ||
pos_georef3::text) || pos_georef4::text))
> TABLESPACE archive
> WHERE (((pos_georef1::text || pos_georef2::text) ||
pos_georef3::text) || pos_georef4::text) IS NOT NULL
> OR NOT (((pos_georef1::text || pos_georef2::text) ||
pos_georef3::text) || pos_georef4::text) = ''::text;
the georef test can be simplified using coalesce:
> and (message_copies.pos_georef1 ||
message_copies.pos_georef2 || message_copies.pos_georef3 ||
message_copies.pos_georef4) <> ''
> and not (message_copies.pos_georef1 ||
message_copies.pos_georef2 || message_copies.pos_georef3 ||
message_copies.pos_georef4) is null
=>
and coaesce (
(message_copies.pos_georef1 || message_copies.pos_georef2
|| message_copies.pos_georef3 || message_copies.pos_georef4),
'') <> ''
In order to avoid this test at query time you might add a
boolean column message_copies.pos.has_georef,
and keep it up to date with a before insert or update
trigger. This will allow to shorten your index definition and
simplify the planner task a little bit.
Moreover it will fasten your query in cases when the index
don't get used.
As Tom already mentioned it, it may make sense not to
concatenate the georef within the index, but keep them
separated, or even keep them in different indexes.
Which is the best depend on the other queries running against
this table
HTH,
Marc Mamin
-----Original Message-----
From: pgsql-performance-owner@xxxxxxxxxxxxxx on behalf of
Ioannis Anagnostopoulos
Sent: Sat 7/21/2012 1:56 AM
To: Tom Lane
Cc: Claudio Freire; pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: A very long running query....
On 21/07/2012 00:10, Tom Lane wrote:
> Claudio Freire <klaussfreire@xxxxxxxxx> writes:
>> Looking at this:
>> " -> Index Scan using
>> idx_message_copies_wk2_date_src_pos_partial on
message_copies_wk2
>> message_copies (cost=0.00..19057.93 rows=52
width=32) (actual
>> time=62.124..5486270.845 rows=387524 loops=1)"
>> " Index Cond:
((date_trunc('day'::text,
>> msg_date_rec) = '2012-07-17 00:00:00'::timestamp
without time zone)
>> AND (src_id = 1))"
>> " Filter:
((date_part('day'::text,
>> msg_date_rec) = 17::double precision) AND (NOT
(((((pos_georef1)::text
>> || (pos_georef2)::text) || (pos_georef3)::text) ||
>> (pos_georef4)::text) IS NULL)) AND
(((((pos_georef1)::text ||
>> (pos_georef2)::text) || (pos_georef3)::text) ||
(pos_georef4)::text)
>> <> ''::text))"
> I think the real problem is that the planner has no hope
of doing
> anything very accurate with such an unwieldy filter
condition. I'd look
> at ways of making the filter conditions simpler, perhaps
by recasting
> the data representation. In particular, that's a
horridly bad way of
> asking whether some columns are empty, which I gather is
the intent.
> If you really want to do it just like that, creating an
index on the
> concatenation _expression_ would guide ANALYZE to collect
some stats about
> it, but it would probably be a lot more efficient to put
together an AND
> or OR of tests on the individual columns.
>
> regards, tom lane
So what you suggest is to forget all together the
concatenation of the
georef1/2/3/4 and instead alter my query with something like:
georef1 is not null and not georeg1 = ''....etc for georef2 3
and 4
That would require to alter my index and have the four georef
columns
separately in it and not as a concatenation and so on for the
partial
index part. And a final thing, you seem to imply that the
indexes are
used by the analyser to collect statistics even if they are
not used. So
an index serves not only as a way to speed up targeted queries
but also
to provide better statistics to the analyzer?
Kind Regards
Yiannis
--
Sent via pgsql-performance mailing list
(pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
No because it is used to select a partition. Otherwise it will go
through the whole hierarchy...