I have two partitioning questions I am hoping somebody can help me with.
I have a fairly busy metric(ish) table. It gets a few million records per day, the data is transactional for a while but then settles down and is used for analytical purposes later.
When a metric is reported both the UTC time and the local times are stored along with the other data belonging to the metric. I want to partition this table to both make it faster to query and also to spread out the writes. Ideally the partitions would be based on the UTC timestamp and the sending location. For example
metrics_location_XXXXX_2015_01_01
First problem with this approach is that there could be tens of thousands of locations so this is going to result hundreds of thousands of tables. I know there are no upper limits to how many tables there are but I am thinking this might really get me into trouble later.
Second and possibly more vexing problem is that often the local time is queried. Ideally I would like to put three constraints on the child tables. Location id, UTC timestamp and the local time but obviously the local timestamps would overlap with other locations in the same timezone Even if I was to only partition by UTC the local timestamps would overlap between tables.
So the questions in a nutshell are.
1. Should I be worried about having possibly hundreds of thousands of shards.
2. Is PG smart enough to handle overlapping constraints on table and limit it's querying to only those tables that have the correct time constraint.
Thanks.