Re: DELETE vs TRUNCATE explanation

Craig Ringer <ringerc@xxxxxxxxxxxxx> · Thu, 12 Jul 2012 09:41:54 +0800

On 07/12/2012 06:51 AM, Daniel Farina wrote:
15x slower.  This is a Macbook Air with full disk encryption and SSD
disk with fsync off, e.g. a very typical developer configuration.
Don't use full disk encryption for throwaway test data if you care about 
how long those tests take. It's a lot like tuning the engine in your car 
while ignoring the fact that the handbrake is jammed on and you're 
dragging a parachute. Use a ramdisk or un-encrypted partition, something 
that doesn't take three weeks to fsync().

That said, this performance gap makes me wonder if TRUNCATE is forcing 
metadata synchronisation even with fsync=off, causing the incredibly 
glacially awesomely slow disk access of your average FDE system to kick 
in, possibly even once per table or even once per file (index, table, 
toast, etc). If so, it may be worth:

- Allowing TRUNCATE to skip synchronization when fsync=off. Pg is 
already allowed to eat all your data if it feels like it in this 
configuration, so there's no point flushing filesystem metadata to make 
sure files are really swapped.

- When fsync=on, trying to flush all changes to all files out at once 
rather than once per file as it could be doing (haven't checked) right 
now. How to do this without also flushing all other pending I/O on the 
whole system (with a global "sync()") would be somewhat OS/filesystem 
dependent, unfortunately.

You could help progress this issue constructively by doing some 
profiling on your system, tracing Pg's system calls, and determining 
what exactly it's doing with DELETE vs TRUNCATE and where the time goes. 
On Linux you'd use OProfile for this and on Solaris you'd use DTrace. 
Dunno what facilities Mac OS X has but there must be something similar.

Once you've determined why it's slow, you have a useful starting point 
for making it faster, first for test systems with fsync=off then, once 
that's tracked down, maybe for robust systems with fsync=on.

I've seen way more than one complaint, and I'm quite sure there are
thousands of man hours (or more) spent on people who don't even know
to complain about such atrocious performance (or maybe it's so bad
that most people run a web search and find out, probably being left
really annoyed from having to yak shave as a result).
I suspect you're right - as DB based unit testing becomes more 
commonplace this is turning up a lot more. As DB unit tests were first 
really popular in the ruby/rails crowd they've probably seen the most 
pain, but as someone who doesn't move in those circles I wouldn't have 
known. They certainly don't seem to have been making noise about it 
here, and I've only recently seen some SO questions about it.

Perhaps a doc fix is all we need (TRUNCATE is constant-time on large
tables, but can be very slow compared to DELETE on small tables), but
I completely and enthusiastically reject any notion from people
calling this "contrived" or an "edge case," because people writing
software against PostgreSQL that have unit tests have this use case
constantly, often dozens or even hundreds of times a day.
I have to agree with this - it may have been an edge case in the past, 
but it's becoming mainstream and is worth being aware of.

That said, the group of people who care about this most are not well 
represented as active contributors to PostgreSQL. I'd love it if you 
could help start to change that by stepping in and taking a little time 
to profile exactly what's going on with your system so we can learn 
what, exactly, is slow.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance