Re: Thousands of tables versus on table?

Scott Marlowe <smarlowe@xxxxxxxxxxxxxxxxx> · Tue, 05 Jun 2007 14:34:08 -0500

Gregory Stark wrote:
"Scott Marlowe" <smarlowe@xxxxxxxxxxxxxxxxx> writes:

Sorry, I think I initially read your response as "Postgres doesn't really get
any faster by breaking the tables up" without the "like that" part.

Well breaking up the tables like that or partitioning, either way should be
about equivalent really. Breaking up the tables and doing it in the
application should perform even better but it does make the schema less
flexible and harder to do non-partition based queries and so on.

True, but we can break it up by something other than the company name on 
the survey, in this instance, and might find it far easier to manage by, 
say, date range, company ID range, etc...
Plus with a few hand rolled bash or perl scripts we can maintain our 
database and keep all the logic of partitioning out of our app.  Which 
would allow developers not wholly conversant in our partitioning scheme 
to participate in development without the fear of them putting data in 
the wrong place.
Where the win in partitioning comes in is in being able to disappear some of
the data entirely. By making part of the index key implicit in the choice of
partition you get away with a key that's half as large. And in some cases you
can get away with using a different key entirely which wouldn't otherwise have
been feasible to index. In some cases you can even do sequential scans whereas
in an unpartitioned table you would have to use an index (or scan the entire
table).

Yeah, I found that out recently while I benchmarking a 12,000,000 row 
geometric data set.  Breaking it into 400 or so partitions resulted in 
no need for indexes and response times of 0.2 or so seconds, where 
before that I'd been in the 1.5 to 3 second range.