
Re: Smaller multiple tables or one large table?

Will the processes know that I have n tables whose definitions are constrained on their primary keys? I am thinking of a table constraint specifying that the primary key of that table falls within some boundary. That way a single process can spawn one thread per table and leave the thread management to the OS. Assuming it is well behaved, this should use every ounce of resource I throw at it: instead of sequentially going through one large table, it would sequentially go through 1 of n short tables in parallel with k other tables. The results would have to be aggregated, but with a large enough table the cost of aggregation would pale in comparison to the run time of the query, which is now split between several smaller tables.

Each table would be defined with a constraint restricting its primary key to a range between two bounds. A view would then be created to manage all of the small tables, with triggers handling insert and update operations. Selects would have to be view specific, but that is really cheap compared to updates. That should have the additional benefit of an update only hitting the specific table(s) it affects.
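The range constraints described above can be sketched as generated DDL. This is only an illustration under stated assumptions: it swaps the view for PostgreSQL's inheritance-based layout, where each child table inherits from a parent and carries a CHECK constraint on the key, and the names `partition_ddl`, `big_table`, and `id` are hypothetical.

```python
def partition_ddl(parent, key, bounds):
    """Return CREATE TABLE statements for range-constrained children of `parent`.

    Each child inherits from `parent` and carries a CHECK constraint that
    limits `key` to the half-open range [lo, hi). `parent` and `key` are
    assumed to be trusted identifiers and `bounds` a sorted list of integers.
    """
    statements = []
    for n, (lo, hi) in enumerate(zip(bounds[:-1], bounds[1:])):
        statements.append(
            f"CREATE TABLE {parent}_p{n} ("
            f"CHECK ({key} >= {lo} AND {key} < {hi})"
            f") INHERITS ({parent});"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical example: two children of one million keys each.
    for stmt in partition_ddl("big_table", "id", [0, 1000000, 2000000]):
        print(stmt)
```

With this layout a query against the parent sees the rows of every child, and when constraint exclusion is enabled the planner can skip children whose CHECK constraint rules out the WHERE clause. Inserts still need to be routed to the right child, for example by a trigger on the parent.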

Basically, I don't see how this particular configuration breaks, and I am wondering whether PostgreSQL already has the ability to do this, as it seems very useful for managing very large data sets.

Thanks,
~Ben

On Fri, Jun 15, 2012 at 2:42 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
On 06/15/12 11:34 AM, Benedict Holland wrote:
I am on postgres 9.0. I don't know the answer to what should be a fairly straightforward question. I have several static tables which are very large (on the order of 14 million rows and about 10 GB). They are all linked together through foreign keys and indexed on the columns which are queried and used most often. While they are more or less static, update operations do occur. This is not on a super fast computer. It has 2 cores with 8 GB of RAM, so I am not expecting queries against them to be very fast, but I am wondering in a structural sense if I should be dividing up the tables into 1 million row tables through constraints and a view. I could see the potential speedup being quite large if PostgreSQL would split each query into n table chunks running on k cores and then aggregate all of the data for display or operation. Is there any documentation on how to make PostgreSQL do this, and is it worth it?

postgres won't do that, one query is one process. your application could conceivably run multiple threads, each with a separate postgres connection, and execute multiple queries in parallel, but it would have to do any aggregation of the results itself.
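A minimal sketch of that application-side approach, assuming `connect` is any DB-API 2.0 connection factory (for instance a small wrapper around `psycopg2.connect`) and that the table has an integer key column named `id`. The function names, the `id` column, and the choice of `count(*)` as the query are hypothetical and only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_in_range(connect, table, lo, hi, placeholder="%s"):
    """Count rows with lo <= id < hi, using a connection of its own."""
    conn = connect()  # one connection per worker, never shared across threads
    try:
        cur = conn.cursor()
        # `table` must be a trusted identifier; only the bounds are bound
        # as query parameters.
        cur.execute(
            f"SELECT count(*) FROM {table} "
            f"WHERE id >= {placeholder} AND id < {placeholder}",
            (lo, hi),
        )
        return cur.fetchone()[0]
    finally:
        conn.close()


def parallel_count(connect, table, bounds, placeholder="%s", workers=2):
    """Run one range query per thread and add up the partial counts."""
    ranges = list(zip(bounds[:-1], bounds[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda r: count_in_range(connect, table, r[0], r[1], placeholder),
            ranges,
        )
        # The aggregation step happens here, in the application.
        return sum(partials)
```

The default placeholder `%s` matches psycopg2's parameter style; other drivers may need a different one. Summing counts is the easy case, since aggregates such as averages or ORDER BY with LIMIT need more careful merging of the partial results.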



Also, is there a benefit to having one large table or many small tables as far as indexes go?

small tables only help if you can query the specific table you 'know' has your data. for instance, if you have time based data and you put a month in each table, and you know that a query only needs to look at the current month, you just query that one month's table.
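The month-per-table case can be sketched as a small routing helper in the application. The naming scheme (`<base>_YYYY_MM`) and the names `table_for_month` and `events` are assumptions made for illustration, not something the thread specifies.

```python
from datetime import date


def table_for_month(base, day):
    """Return the name of the monthly table that holds rows for `day`.

    Assumes one table per calendar month, named <base>_YYYY_MM.
    """
    return f"{base}_{day.year:04d}_{day.month:02d}"


if __name__ == "__main__":
    # A query that only needs June 2012 data touches just this table.
    print(table_for_month("events", date(2012, 6, 15)))
```

The benefit comes from the application knowing the month up front; a query that cannot name its month still has to look at every table.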



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

