Re: Separate 100 M spatial data in 100 tables VS one big table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/5/24 13:47, Marc Millas wrote:
> Salut Kimaidou,
> why not a partitioned table with the department a partitioning Key ?
> each year just detach the obsolete data, department by
> department (ie.detach the partition, almost instantaneous) and drop or keep
> the obsolete data.
> No delete, quite easy to maintain. For each global index, Postgres will
> create one index per each partition. and detach them when you detach a
> department partition.
> so when importing, first create an appropriate table, load the data, and
> attach it to the main partitioned table. Postgres will
> automatically recreate all necessary indexes.
> 

Yes, a table partitioned like this is certainly a valid option - and
it's much better than the view with a UNION of all the per-department
tables. The optimizer has very little insight into the view, which
limits how it can optimize queries. For example if the query has a
condition like

   WHERE department = 'X'

with the declarative partitioning the planner can eliminate all other
partitions (and just ignore them), while with the view it will have to
scan all of them.

But is partitioning a good choice? Who knows - it makes some operations
simpler (e.g. you can detach/drop a partition instead of deleting the
rows), but it also makes other operations less efficient. For example a
query that can't eliminate partitions has to do more stuff during execution.

So to answer this we'd need to know how often stuff like bulk deletes /
reloads happen, what queries will be executed, and so on. Both options
(non-partitioned and partitioned table) are valid, but you have to try.

Also, partitioned table may not support / allow some features - for
example unique keys that don't contain the partition key. We're
improving this in every release, but there will always be a gap.

I personally would start with non-partitioned table, because that's the
simplest option. And once I get a better idea how often the reloads
happen, I'd consider if that's something worth the extra complexity of
partitioning the data. If it happens only occasionally (a couple times a
year), it probably is not. You'll just delete the data and reuse the
space for new data.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux