On Tue, Jun 3, 2008 at 12:04 PM, Jeff Davis <pgsql@xxxxxxxxxxx> wrote: > On Tue, 2008-06-03 at 09:43 -0500, Mason Hale wrote: >> I've been working on partitioning a rather large dataset into multiple >> tables. One limitation I've run into the lack of cross-partition-table >> unique indexes. In my case I need to guarantee the uniqueness of a >> two-column pair across all partitions -- and this value is not used to >> partition the tables. The table is partitioned based on a insert date >> timestamp. > > You're looking for a constraint across tables. > Yes, for this particular case. But I'm also interested in speeding up cross-partition queries whether it is for a uniqueness check or not. This uniqueness check is just one (important) instance of a cross-partition query. >> To check the uniqueness of this value I've added an insert/update >> trigger to search for matches in the other partitions. This trigger is >> adding significant overhead to inserts and updates. > > Do you lock all of the tables before doing the check? If not, then you > have a race condition. > Yes, I was concerned about that. > It's possible this index strategy will be better for your case. > However, I think what you really want is some kind of multi-table > primary key. Have you considered storing the key in its own two-column > table with a UNIQUE index and having the partitions reference it? Thanks for the suggestion -- I'll explore maintaining the compound key in its own non-partitioned table. I was trying to avoid any application-layer code changes. I guess I can still accomplish that by updating this table via an insert/update trigger. But to reiterate, having bloom filter-based index would allow constant time determination of whether a given partition *may* contain the data. This would be very useful for large partitioned data-sets, especially in (very common) cases where performance is critical. This feature would also be useful for applications where data is partitioned (aka 'federated') across multiple servers.