Job: On Fri, Dec 30, 2016 at 1:01 PM, Alban Hertroys <haramrae@xxxxxxxxx> wrote: >> On 30 Dec 2016, at 11:42, Job <Job@xxxxxxxxxxxxxxxxxxxx> wrote: ... >> The index applied on the timestamp field is a btree("timestamp") ... >> select domain, sum(accessi) as c_count from TABLE where action='1' AND profile IN ('PROFILE_CODE') AND timestamp::date BETWEEN '2016-12-27' AND '2016-12-30' AND timestamp::time BETWEEN '00:00:00' AND '23:59:59' GROUP BY domain ORDER BY c_count DESC LIMIT 101 ... >> Last question: the table is partitioned. I need to manually create index for every sub-tables or there is a way to create on every sub-tables once? I think Alban missed this ( or I missed his response ). Yes, you need to create the indexes for the partitions. When in doubt, create a partition and \d+ it. IIRC you can do 'create partition LIKE master INCLUDING indexes INHERITS(master)', but you'll have to test. Anyway, this is normally not too useful as the master table is normally indexless and kept empty. I normally script the partition creation, and I woill recommend doing that too. Now onto the BETWEEN PROBLEM: > It's usually more efficient to cast the constants you're comparing to, than to cast a field value for each record in the set. The exception to that is when you have an index on the casted field. > In your case, since you're casting to date and time separately, and whole days even, it's probably more efficient to combine that into: > … AND timestamp BETWEEN '2016-12-27 00:00:00'::timestamp with time zone AND '2016-12-30 23:59:59'::timestamp with time zone ... Alban is correct here, in both counts. - DO NOT CAST THE COLUMNS, cast the constants ( unless you are building a specialized index, you can build an index in cast(timestamp as date), and it would be useful if you did a lot of queries ONLY ON DATES ). - DO NOT USE CLOSED INTERVALS for real number-like columns ( remember I told you timestamps are a point in the time line, so real-like ). The man problems strives from the fact tht you cannot cover the real line with non-overlapping CLOSED intervals, BETWEEN uses closed intervals and subtle problems permeate from this fact. Math is a harsh mistress. ( Even when working with integer-like numbers half-open intervals are normally the best way to go in the not so short term, but between seems so nice and natural and reads so well that even I use it where I should not ) > But even then, you're excluding items that fall in the second between the end date and the next day. The new range types are useful there, for example: > > … AND timestamp <@ '[2016-12-27 00:00:00, 2016-12-31 00:00:00)'::tsrange And this is how it is solved with those new-fangled interval thingies ( I've been keying (ts>=xxx and ts <yyy), parens included, for so long that I never remember those, but they are nearly the same. Probably the optimizer splits it anyway. > The above isn't entirely correct, as tsrange uses timestamp without time zone, but you get the gist. Or use >=, < those work. > However, if those time ranges can have other values than '[00:00. 23:59]', then you probably need 2 indexes on that timestamp column; one cast to date and one to time. Otherwise, you end up creating timestamp range filters for each day in the range in the query (which could still be the better approach). Even if they have other values, single index on timestamp column is the way to go if you only select single intervals. I mean, Xmas morning ( data between 25 and 25 and time between 8:00 and 12:59 can easiliy be selected by the interval [20161225T080000, 20161225T130000), but all the mornings in december can not ( although a query with ts>='20160101' and ts <'20170101' and ts:time >='08:00' and ts:time<'13:00' should work quite well, the first two condition guide to an index scan and the rest is done with a filtering ). Francisco Olarte. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general