Re: 15,000 tables

On 12/1/05, Michael Riess <mlriess@xxxxxx> wrote:
> > Michael Riess <mlriess@xxxxxx> writes:
> >>> On 12/1/05, Michael Riess <mlriess@xxxxxx> wrote:
> >>>> we are currently running a postgres server (upgraded to 8.1) which
> >>>> has one large database with approx. 15,000 tables. Unfortunately
> >>>> performance suffers from that, because the internal tables
> >>>> (especially the one that holds the attribute info) get too large.
> >>>>
> >>>> (We NEED that many tables, so please don't recommend reducing them)
> >>>>
> >>> Have you ANALYZEd your database? VACUUMing?
> >> Of course ... before 8.1 we routinely did a vacuum full analyze each
> >> night. As of 8.1 we use autovacuum.
> >
> > VACUUM FULL was probably always overkill, unless "always" includes
> > versions prior to 7.3...
>
> Well, we tried switching to daily VACUUM ANALYZE and weekly VACUUM FULL,
> but the database got considerably slower near the end of the week.
>
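If the weekly VACUUM FULL is the part that hurts, it may be enough to
run plain VACUUM ANALYZE more aggressively instead; unlike VACUUM FULL
it takes no exclusive locks, and it works on the system catalogs too.
A minimal sketch (the catalog table is just the obvious candidate here):

    -- plain VACUUM reclaims space without the exclusive lock
    -- that VACUUM FULL needs
    VACUUM ANALYZE;

    -- the bloating catalog table can be vacuumed explicitly as well
    VACUUM ANALYZE pg_catalog.pg_attribute;
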
> >
> >>> BTW, are you using some kind of weird ERP? I have one that treats
> >>> Informix as a fool and doesn't let me get at Informix's full
> >>> potential... maybe the same is true in your case...
> >> No. Our database contains tables for web content management
> >> systems. The server hosts approx. 500 cms applications, and each of
> >> them has approx. 30 tables.
> >>
> >> That's why I'm asking if it would be better to have 500 databases with 30
> >> tables each. In previous Postgres versions this led to even worse
> >> performance ...
> >
> > This has the feeling of fitting with Alan Perlis' dictum below...
> >
> > Supposing you have 500 databases, each with 30 tables, each with 4
> > indices, then you'll find you have, on disk...
> >
> > # of files = 500 x 30 x 5 = 75000 files
> >
> > If each is regularly being accessed, that's bits of 75000 files
> > getting shoved through OS and shared memory caches.  Oh, yes, and
> > you'll also have regular participation of some of the pg_catalog
> > files, with ~500 instances of THOSE, multiplied some number of ways...
> >
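You can check how many on-disk relations one of those databases really
carries straight from pg_class; a quick sketch (each table, index, and
TOAST table gets at least one file under $PGDATA/base):

    SELECT relkind, count(*)
      FROM pg_class
     WHERE relkind IN ('r', 'i', 't')  -- tables, indexes, TOAST tables
     GROUP BY relkind;
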
>
> Not all of the tables are frequently accessed. In fact I would estimate
> that only 20% are actually used ... but there is no way to determine if
> or when a table will be used. I thought about a way to "swap out" tables
> which have not been used for a couple of days ... maybe I'll do just
> that. But it would be cumbersome ... I had hoped that an unused table
> does not hurt performance. But of course the internal tables which
> contain the meta info get too large.
>
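If the statistics collector is enabled, the access counters can at
least identify tables that have never been scanned since the last
stats reset, and 8.1's pg_relation_size() will show how big that
attribute catalog has grown; a rough sketch:

    -- tables with no recorded scans: candidates for "swapping out"
    SELECT relname
      FROM pg_stat_user_tables
     WHERE seq_scan = 0 AND idx_scan = 0;

    -- on-disk size (in bytes) of the catalog holding the attribute info
    SELECT pg_relation_size('pg_catalog.pg_attribute');
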
> > An application with 15000 frequently accessed tables doesn't strike me
> > as being something that can possibly turn out well.  You have, in
> > effect, more tables than (arguably) bloated ERP systems like SAP R/3;
> > it only has a few thousand tables, and since many are module-specific,
> > and nobody ever implements *all* the modules, it is likely only a few
> > hundred that are "hot spots."  No 15,000 there.
>
> I think that my system conforms to the 80/20 rule ...
>

How many disks do you have? I imagine you can put the tables forming
one logical database in a tablespace and have the tables spread across
various disks...
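
A minimal sketch, assuming a second disk mounted at /mnt/disk2 (the
path and table name are only examples):

    -- the location must be an existing, empty directory owned by
    -- the postgres user (tablespaces are available since 8.0)
    CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

    -- move one hypothetical CMS table onto the new disk
    ALTER TABLE cms_app1_pages SET TABLESPACE disk2;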


--
Atentamente,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

