On Wed, 7 Sept 2022 at 13:33, Levi Aul <levi@xxxxxxxxxxxxxx> wrote:
> To be clear, this isn't a bug report. There is no bug—everything is working exactly as it should. The partitions are not being pruned because the workload consists of OLAP aggregations that fetch a small number of rows spread across all partitions in the set, relying for speed on an index that isn't prefixed with the partitioning key (nor can it be.)

The -hackers mailing list would probably be a better place to discuss design ideas for new features. -general is more for general help with using the software, not hacking on it.

The main reason individual partitions need to be locked is that they can still be referenced by queries directly, as if they were just normal tables. To get around that we'd either need to have the locking groups, as you describe, or remove the ability to access a partition directly rather than through the top-level partitioned table. The ship has probably sailed on the latter, but it could probably be done as an opt-in feature if the former was too difficult or impractical.

FWIW, I'm not quite seeing why you need "sealed" partitions for the group locking idea. I understand the other parts you mentioned about conversion to a table AM which is more optimized for non-transactional workloads, but that seems like a different problem that you're mixing in, and it adds complexity to the whole thing. If that's true, then it might be better not to mix that in and confuse / complicate your explanation of the problem and proposed solution.

I'd suggest posting to -hackers and stating that your queries can't make use of partition pruning, that currently all partitions are being locked, and that you believe this is a bottleneck. Include some examples of perf output to show how large the locking overhead is. Extra points for hacking up some crude code that skips obtaining the partition locks, to show what the performance could be if we didn't lock all the partitions.
That'll help show you have a worthy cause. FWIW, I'm surprised that executor startup/shutdown for a plan which accesses a large number of partitions is not drowning out the locking overheads. As far as I knew, this problem was only visible when run-time partition pruning removed the large majority of the Append/MergeAppend subnodes and made executor startup/shutdown significantly faster.

David