Hi Daniel: On Sat, May 23, 2015 at 8:37 PM, Daniel Begin <jfd553@xxxxxxxxxxx> wrote: > Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed. > Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary... > Using a constant id: > All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...) > Using a range of ids: > Surprisingly again, all the queries I tried took longer on the partitioned table! > Using a list of ids from a select clause: > More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one! > Guess what, I will get back to my old fat table unless someone tells me I missed something obvious! > > Note: Tables/indexes description, queries and execution plans are below. > > > > > > > Tables/indexes description ---------------------------------------------------------------------------------- > The original table has 3870130000 records. Primary key/index on each partition queries are > ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version); > CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id); > > The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are > ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version); > CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id); > Where xx is the partition's number suffix It is missing here, but I supose you did not forget to add the exclusion check constraint. > constant id ------------------------------------------------------------------------------------------------------- > select * from oldtable where id=123456789; > "Index Scan using oldtable_idversion_pk on oldtable (cost=0.70..4437.15 rows=1682 width=66)" > " Index Cond: (id = 123456789::bigint)" > --Total query runtime: 62 ms. 1 rows retrieved > select * from newtable where id=123456789; > "Append (cost=0.00..20.19 rows=5 width=66)" > " -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)" > " Filter: (id = 123456789::bigint)" > " -> Index Scan using newtable72_idversion_pk on newtable_72 (cost=0.56..20.19 rows=4 width=66)" > " Index Cond: (id = 123456789::bigint)" > --Total query runtime: 156 ms. 1 rows retrieved This is not surprissing, partitions never help much against that type of queries, with 87 partitions index are not going to be much shallower and the optimizer has a lot more of work to do. Just a couple points. When doing a lot of sequential queries on nearby ids ( on the same partition ) it will help ( better data locality, possibility of having more relevant pages in the cache ). Also, how did you do your timinngs? ( you should repeat each query 2 or 3 times, to see if the times go down a lot due to caching and, if you can, time a couple of explains ( plain, not analyze ) to see how much time the planner takes. As an example, the explain I sent you in my previous message takes between 20 and 30 milliseconds and I only have 17 live partitions ( I routinely un-inherit and move to another schema partitions older than a two years, to help the planner, and re-inherit them if needed ) ). Also, your queries seem to be very, very slow for a single indexed fetch, if I do this on my tables ( similar query to what I sent, modified to hit just 1 partition ): select count(*) from carrier_cdrs where setup between '20150107T123456' and '20150107T222222'; It takes 40 ms in the first go, drops down to 27 after that, and I have ~15 ms RTT to the server ( which is more or less the time reported when I do a plain 'select 1' ). I mean, I suspect your measurements are not good, as they seem too slow. Also, when I use explain analyze on the previous query the server reports 13 ms, which is more or less one RTT less ( as the server starts measuring after receiving the query and ends before sending the reply ). Another thing, how are you getting the explain results ? I would urge you to use explain analyze, as, apart of seeming too slow, the stimations seem to be way off ( explain analyze sends you the estimated and actual results, so it is easy to see, and executes everything on the server, so result transmission time, which is not of use as it has to be the same for every method of calculating the same correct result, is not reported ). For my query I get this: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=1229.36..1229.37 rows=1 width=0) (actual time=13.794..13.794 rows=1 loops=1) -> Append (cost=0.00..1169.73 rows=23851 width=0) (actual time=0.014..12.069 rows=24016 loops=1) -> Seq Scan on carrier_cdrs (cost=0.00..0.00 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=1) Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp with time zone) AND (setup <= '2015-01-07 22:22:22+01'::timestamp with time zone)) -> Index Only Scan using idx_carrier_cdrs_201501_setup on carrier_cdrs_201501 (cost=0.42..1169.73 rows=23850 width=0) (actual time=0.014..10.162 rows=24016 loops=1) Index Cond: ((setup >= '2015-01-07 12:34:56+01'::timestamp with time zone) AND (setup <= '2015-01-07 22:22:22+01'::timestamp with time zone)) Heap Fetches: 24016 Total runtime: 13.819 ms (8 rows) As you can see this reports the time for all the nodes, and the estimated and actual numbers of rows in every step, so things are easier to check ( like how the planner estimated 23850 rows which where really 24016, a .69% error, indicating good estimation ). > I got similar results for multiple records... > select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000); > "Index Scan using oldtable_idversion_pk on oldtable (cost=0.71..17739.18 rows=6726 width=66)" > " Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))" > --Total query runtime: 187 ms. 4 rows retrieved This seems to slow too, also, estimated rows 6726 for an unique index with 4 values seems totally bad. > select * from newtable where id IN(10000000,1000000000,2000000000,3000000000); > "Append (cost=0.00..933.40 rows=223 width=66)" > " -> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)" > " Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))" > " -> Index Scan using newtable01_idversion_pk on newtable_01 (cost=0.57..622.78 rows=156 width=66)" > " Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))" > ... ********* How many partitions where omitted here? ( if they were two the problem of redudant checking of id is minor, if they where 85 it means constratint exclusion is not working for in ) > " -> Index Scan using newtable85_idversion_pk on newtable_85 (cost=0.57..53.37 rows=9 width=66)" > " Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))" > --Total query runtime: 421 ms. 4 rows retrieved And here it is doing strange things ( fetching every value in every partition > range of ids ------------------------------------------------------------------------------------------------------- > select * from oldtable where id between 1522999949 and 1523000049; > "Index Scan using oldtable_idversion_pk on oldtable (cost=0.70..383.51 rows=144 width=66)" > " Index Cond: ((id >= 1522999949) AND (id <= 1523000049))" > Total query runtime: 47 ms. 53 rows retrieved. As I was telling you above. It is not normal to take 62 ms for a single id and just 47 for a 53 range. And this is referring to the old table case. I keep thinking you are trashing the caches, and your measurements are not good, you are measuring cache effects ( as I said before, the easiest ways to discount them is to repeat every query three or more times in sucession, using explain analyze to discard network effects, and see where they stabilize ) > Using an index, not the primary key ------------------------------------------------------------------------------ Those are going to be specially bad. If you partition and then do indexed queries against a non partitioned field WITHOUT extra conditions which narrow the result to a couple of them, it will always take a lot of time. The server needs to scan every index ( as your query is basically a union all of every partition ). Think on it this way, if I have one big phone call table, partitioned by months, indexed by timestamp and number ( 2 indexes ) , queries for a TS range go fast, as they go to the relevant partitions. Queries for a number with a some-days timestamp range also go fast, as it has to index scan one or two partitions for the number, but queries for just the number need to index scan every partition, and they are going to be slow, as they need to read some pages of every number index. > --Subset provides 58 group_id pointing to 5978 records in the concerned tables > select * from oldtable where group_id IN (select * from subset) If you need to do this queries, and group id is uncorrelated wit id, you are not going to get good results from partitioning. Regards. Francisco Olarte. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general