Re: How to best use 32 15k.7 300GB drives?

There is a process in Oracle which essentially allows you to do the equivalent of a CLUSTER in Postgres, but without locking the table, so updates can continue throughout the process. It requires a bit of manual setup and fiddling (at least in Oracle 10g), but something along those lines would probably scratch a lot of people's itches in this area. Of course, it's not trivial at all to implement :-(
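For the archives, the moving part is Oracle's DBMS_REDEFINITION package. A minimal sketch from memory, so treat the details as approximate; SCOTT, ORDERS and ORDERS_INTERIM are placeholder names, and I'm glossing over error handling:

    -- raises an error if the table can't be redefined online
    -- (by default it needs a primary key)
    BEGIN
      DBMS_REDEFINITION.CAN_REDEF_TABLE('SCOTT', 'ORDERS');
    END;
    /

    -- empty interim table with the desired physical layout
    CREATE TABLE scott.orders_interim AS
      SELECT * FROM scott.orders WHERE 1 = 0;

    DECLARE
      num_errors PLS_INTEGER;
    BEGIN
      -- start copying rows; DML against ORDERS continues and is tracked
      DBMS_REDEFINITION.START_REDEF_TABLE('SCOTT', 'ORDERS', 'ORDERS_INTERIM');
      -- clone indexes, triggers, constraints and grants onto the interim table
      DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS('SCOTT', 'ORDERS', 'ORDERS_INTERIM',
                                              num_errors => num_errors);
      -- swap the two tables; only a brief lock right at the end
      DBMS_REDEFINITION.FINISH_REDEF_TABLE('SCOTT', 'ORDERS', 'ORDERS_INTERIM');
    END;
    /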

The Oracle equivalent of "too many dead rows" is "too many chained rows" and that's where I've seen it used.

Cheers
Dave

2011/2/3 Robert Haas <robertmhaas@xxxxxxxxx>
2011/1/30 Віталій Тимчишин (Vitalii Tymchyshyn) <tivv00@xxxxxxxxx>:
> I was thinking a table file could be deleted if it has not a single
> live row, and that this could be done by vacuum. In this case, a
> vacuum on a table that was fully updated recently could be almost as
> good as CLUSTER: any scan would skip such non-existing files really
> fast, and almost no disk space would be wasted.

VACUUM actually already does something along these lines. If there
are one or more entirely-free pages at the end of a table, VACUUM
will truncate them away. In the degenerate case where ALL pages are
entirely free, this results in zeroing out the file.
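You can watch that happen with a quick test (table name made up for
the example):

    CREATE TABLE trunc_demo AS
      SELECT g AS id FROM generate_series(1, 100000) g;
    SELECT pg_relation_size('trunc_demo');  -- a few MB

    -- all the dead rows end up at the *end* of the heap
    DELETE FROM trunc_demo WHERE id > 1000;
    VACUUM trunc_demo;
    SELECT pg_relation_size('trunc_demo');  -- tail truncated, file shrinks

Flip the DELETE around (WHERE id <= 99000) and the file barely
shrinks at all, because the free pages are no longer at the end,
which is exactly the problem described next.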

The problem with this is that it rarely does much. Consider a table
with 1,000,000 pages, 50% of which contain live rows. On average, how
many pages will this algorithm truncate away? Answer: if the pages
containing live rows are randomly distributed, approximately one.
(Proof: There is a 50% chance that the last page will contain live
rows. If so, we can't truncate anything. If not, we can truncate one
page, and maybe more. Now the chances of the next page being free are
499,999 in 999,999, or roughly one-half. So we have an almost-25%
chance of being able to truncate at least two pages. And so on. So
you get roughly 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + ... = 1 page.)
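(Stated as a sum, under the same approximation that each successive
page from the end is free with probability about 1/2, and using the
fact that the expectation of a nonnegative integer variable is the
sum of its tail probabilities:

    E[\text{pages truncated}]
      = \sum_{k \ge 1} P(\text{last } k \text{ pages all free})
      \approx \sum_{k \ge 1} \left(\tfrac{1}{2}\right)^{k} = 1.)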

Your idea of having a set of heaps rather than a single heap is an
interesting one, but it's pretty much catering to the very specific
case of a full-table update. I think the code changes needed would be
far too invasive to seriously contemplate doing it just for that one
case - although it is an important case that I would like to see us
improve. Tom Lane previously objected to the idea of on-line table
compaction on the grounds that people's apps might break if CTIDs
changed under them, but I think a brawl between all the people who
want on-line table compaction and all the people who want to avoid
unexpected CTID changes would be pretty short. A bigger problem - or
at least another problem - is that moving tuples this way is
cumbersome and expensive. You basically have to move some tuples
(inserting new index entries for them), vacuum away the old index
entries (requiring a full scan of every index), and then repeat as
many times as necessary to shrink the table. This is not exactly a
smooth maintenance procedure, or one that can be done without
significant disruption, but AFAIK nobody's come up with a better idea
yet.
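(For anyone who wants to try the cumbersome version by hand, the
usual recipe looks roughly like this - a sketch only, assuming a
table t with a column you can harmlessly rewrite, and ignoring the
complication that a HOT update may keep the new version on the same
page when there's room:

    -- force new versions of the tuples on the highest-numbered pages;
    -- the new copies land in free space earlier in the heap via the FSM
    UPDATE t SET id = id
    WHERE ctid IN (SELECT ctid FROM t ORDER BY ctid DESC LIMIT 1000);

    -- reclaim the old versions and their index entries (scans every
    -- index), then truncate the now-empty pages at the end of the table
    VACUUM t;

    -- repeat until pg_relation_size('t') stops shrinking

Each round is exactly the move/vacuum/repeat loop described above.)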

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
