On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote:
> Hi
>
> I was about to partition a large (?) approx 3TB of data, 2B rows, into
> partitioned tables broken up by YYYYMM...
>
> Now I have been reading about limiting the number of partitions, since
> otherwise it could slow down the planner.
>
> My reasoning for limiting to YYYYMM was that most of the requests would be
> monthly based.
>
> Should I be making the partitioning based on YYYY instead and have lots
> more indexes?
>
> If I have an index on the timestamp field, will it help to limit to YYYYMM?

The major advantages of partitions are enumerated here:

https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-partitioning-overview

For your case, it may be that seq scans of an entire monthly partition turn
out to be very advantageous compared with an index scan (or a seq scan of the
entire 3TB of data). Also, DROPping the oldest partition every month is
commonly much more efficient than DELETEing its rows.

There are warnings like these:

|All constraints on all partitions of the master table are examined during
|constraint exclusion, so large numbers of partitions are likely to increase
|query planning time considerably. Partitioning using these techniques will work
|well with up to perhaps a hundred partitions; don't try to use many thousands
|of partitions.

Unless you have hundreds of years of data, I don't think that would be a
problem. For us, having hundreds of partitions hasn't been an issue (planning
time is insignificant for our analytic report queries). But there is an
overhead to partitions, and at some point the cost becomes significant.

(Actually, one cost which *did* hit us, while experimenting with *daily*
partition granularity for every table, was probably due to very large
pg_statistic and pg_attribute catalog tables, which no longer fit in the
buffer cache.)
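As a minimal sketch of the monthly layout (using PostgreSQL 10 declarative
partitioning; the table and column names here are made up for illustration):

    CREATE TABLE events (
        event_time  timestamptz NOT NULL,
        payload     text
    ) PARTITION BY RANGE (event_time);

    -- One partition per YYYYMM:
    CREATE TABLE events_201707 PARTITION OF events
        FOR VALUES FROM ('2017-07-01') TO ('2017-08-01');
    CREATE TABLE events_201708 PARTITION OF events
        FOR VALUES FROM ('2017-08-01') TO ('2017-09-01');

    -- Retiring the oldest month is a cheap metadata operation,
    -- unlike a DELETE, which must touch (and later vacuum) every row:
    DROP TABLE events_201707;

With monthly granularity, queries constrained to a month touch exactly one
partition, and retention is a DROP rather than a bulk DELETE.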
Justin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general