On 07/11/2012 01:22 PM, Daniel Farina wrote:
On Tue, Jul 10, 2012 at 5:37 PM, Craig Ringer <ringerc@xxxxxxxxxxxxx> wrote:
Hi
After seeing a few discussions here and on Stack Overflow I've put together
a quick explanation of why "DELETE FROM table;" may be faster than "TRUNCATE
table" for people doing unit testing on lots of tiny tables, people who're
doing this so often they care how long it takes.
I'd love it if a few folks who know the guts were to take a look and verify
its correctness:
I haven't said this before, but think it every time someone asks me
about this, so I'll say it now:
This is a papercut that should be solved with improved mechanics.
TRUNCATE should simply be very nearly the fastest way to remove data
from a table while retaining its type information, and if that means
doing DELETE without triggers when the table is small, then it should.
The only person who could thwart me is someone who badly wants their
128K table to be exactly 8 or 0K, which seems unlikely given the 5MB
of catalog anyway.
Does that sound reasonable? As in, would anyone object if TRUNCATE
learned this behavior?
Yep, I'd object. It's more complicated and less predictable. Also, as I
strongly and repeatedly highlighted in my post, DELETE FROM table; does
a different job to TRUNCATE. You'd at minimum need the effect of DELETE
followed by a VACUUM on the table and its indexes to be acceptable and
avoid the risk of rapid table + index bloat - and that'd be lots slower
than a TRUNCATE. You could be clever and lock the table then DELETE and
set xmax at the same time I guess, but I suspect that'd be a bit of work
and still wouldn't take care of the indexes.
It's also too complicated, not least because AFAIK util commands and
CRUD commands go through very different paths in PostgreSQL.
I guess you could propose and post a prototype patch for a new command
that tried to empty the table via whatever method it thought would be
fastest. Such a new command wouldn't be bound by the accepted and
expected rules followed by TRUNCATE so it could vary its behaviour based
on the table, doing a real truncate on big tables and a
delete-then-vaccum on small tables. I suspect you'd land up writing the
fairly complicated code for the potentially multi-table
delete-and-vaccum yourself.
Honestly, though, it might be much better to start with "how can
TRUNCATE of empty or near-empty tables be made faster?" and start
examining where the time goes.
--
Craig Ringer
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance