On Wed, Jul 11, 2012 at 7:05 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > Daniel Farina <daniel@xxxxxxxxxx> writes: >> TRUNCATE should simply be very nearly the fastest way to remove data >> from a table while retaining its type information, and if that means >> doing DELETE without triggers when the table is small, then it should. >> The only person who could thwart me is someone who badly wants their >> 128K table to be exactly 8 or 0K, which seems unlikely given the 5MB >> of catalog anyway. > >> Does that sound reasonable? As in, would anyone object if TRUNCATE >> learned this behavior? > > Yes, I will push back on that. > > (1) We don't need the extra complexity. Well, a "need" is justified by the gains, no? It seems like this follows from the thoughts presented afterwards, so I'll discuss those. > (2) I don't believe that you know where the performance crossover point > would be (according to what metric, anyway?). Nope. I don't. But an exact crossover is a level of precision I don't really need, because here are where things stand on a completely unremarkable test suite on the closest project to me that meets the "regular web-app" profile case: With en-masse DELETE: rake 41.89s user 3.08s system 76% cpu 58.629 total With TRUNCATE: rake 49.86s user 2.93s system 5% cpu 15:17.88 total 15x slower. This is a Macbook Air with full disk encryption and SSD disk with fsync off, e.g. a very typical developer configuration. This is a rather small schema -- probably a half a dozen tables, and probably about a dozen indexes. This application is entirely unremarkable in its test-database workload: it wants to load a few records, do a few things, and then clear those handful of records. > (3) The performance of the truncation itself should not be viewed in > isolation; subsequent behavior also needs to be considered. An example > of possible degradation is that index bloat would no longer be > guaranteed to be cleaned up over a series of repeated truncations. > (You might argue that if the table is small then the indexes couldn't > be very bloated, but I don't think that holds up over a long series.) I'm not entirely convinced to the mechanism, it was simply the most obvious one, but I bet a one that is better in every respect is also possible. It did occur to me that bloat might be a sticky point. > IOW, I think it's fine as-is. I'd certainly wish to see many more > than one complainant before we expend effort in this area. I've seen way more than one complaint, and I'm quite sure there are thousands of man hours (or more) spent on people who don't even know to complain about such atrocious performance (or maybe it's so bad that most people run a web search and find out, probably being left really annoyed from having to yak shave as a result). In spite of how familiar I am with Postgres and its mailing lists, I have glossed over this for a long time, just thinking "wow, that really sucks" and only now -- by serendipity of having skimmed this post -- have seen fit to complain on behalf of quite a few rounds of dispensing workaround advice to other people. It's only when this was brought to the fore of my mind did I stop to consider how much wasted time I've seen in people trying to figure this out over and over again (granted, they tend to remember after the first time). Perhaps a doc fix is all we need (TRUNCATE is constant-time on large tables, but can be very slow compared to DELETE on small tables), but I completely and enthusiastically reject any notion from people calling this "contrived" or an "edge case," because people writing software against PostgreSQL that have unit tests have this use case constantly, often dozens or even hundreds of times a day. What I don't know is how many people figure out that they should use DELETE instead, and after how long. Even though the teams I work with are very familiar with many of the finer points of Postgres, doing some probing for the first time took a little while. If we're going to live with it, I contest that we should own it as a real and substantial weakness for development productivity, and not sweep it under the rug as some "contrived" or "corner" case. -- fdr -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance