
Re: Delete/update with limit

nagy@xxxxxxxxxxxxxx (Csaba Nagy) writes:
> In postgres we're currently not chunking, due to the fact that the code
> to do it is simply overly contorted and inefficient compared to the
> other DBs we use. At least all the solutions we could think of to do the
> chunking in a safe way while the inserts are running in parallel,
> without disturbing them, have invariably resulted in overly complicated
> code compared to the simple delete with limit + delete trigger solution
> we have for the other DBs.
>
> Now I don't hold out much hope of convincing anybody that a limit on
> the delete/update commands has valid usage scenarios. But can anybody
> help me find a good way to process such a buffer table chunk-wise,
> where insert speed is the highest priority (thus no indexes and a
> minimum of fields), batch processing should still work fine at large
> table sizes, the inserts must not be impacted at all, and each batch
> should finish quickly to avoid long-running transactions? Because I
> can't really think of one... other than our scheme of delete with
> limit + trigger + private temp table.

All that comes to mind is to put a SERIAL primary key on the table,
which shouldn't be *too* terribly expensive an overhead, assuming
there is reasonably complex processing going on; you then do something
like:

- select ID from the incoming table, order by ID, limit 500, to grab a
  list of IDs;

- delete from the table for that set of IDs.
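
Spelled out as SQL, that amounts to something like the following (a
minimal sketch: the table definition and the "payload" column are made
up, and the literal ID list in the DELETE stands in for whatever step 1
returned):

-- Hypothetical buffer table: only the SERIAL key is added on top of
-- the minimal set of fields the inserts need.
CREATE TABLE incoming_table (
    id      SERIAL PRIMARY KEY,
    payload TEXT NOT NULL
);

-- Step 1: grab a chunk of 500 IDs, in insertion order.
SELECT id FROM incoming_table ORDER BY id LIMIT 500;

-- Step 2: process those rows, then delete exactly that set.
DELETE FROM incoming_table WHERE id IN (101, 102, 103);  -- the IDs from step 1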

Actually, is there any particular reason why you couldn't simply have
your "batch processing" loop look like:

Loop Forever
  DELETE from incoming_table;
  VACUUM incoming_table;
End Loop;

???

The alternative that I suggested amounts to:

Loop Forever
  DELETE from incoming_table
   where id in (select id from incoming_table limit 500);
  VACUUM incoming_table;
End Loop;
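
Written out as actual SQL, one pass of that loop might look like this
(a sketch, reusing the hypothetical table from above: the ORDER BY
makes the chunk deterministic, the RETURNING clause needs PostgreSQL
8.2 or later and hands the deleted rows back to the batch processor,
and the VACUUM has to stay outside the transaction since it cannot run
inside a transaction block):

BEGIN;
-- Delete the oldest 500 rows and return them for processing in one step.
DELETE FROM incoming_table
 WHERE id IN (SELECT id FROM incoming_table ORDER BY id LIMIT 500)
RETURNING id, payload;
COMMIT;

-- VACUUM cannot run inside a transaction block.
VACUUM incoming_table;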

I realize you're concerned that maintaining the index will be too
costly; but without actual benchmarking, it is not obvious that it is
*in fact* too costly.
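
If you want a quick number on that, something along these lines would
do (a rough sketch using psql's \timing and generate_series; table
names and row counts are arbitrary):

\timing

-- The same minimal table, with and without the SERIAL key.
CREATE TABLE incoming_noidx (payload TEXT);
CREATE TABLE incoming_idx   (id SERIAL PRIMARY KEY, payload TEXT);

-- Compare bulk insert times; the difference is the SERIAL + index overhead.
INSERT INTO incoming_noidx
  SELECT 'row ' || g::text FROM generate_series(1, 100000) AS g;
INSERT INTO incoming_idx (payload)
  SELECT 'row ' || g::text FROM generate_series(1, 100000) AS g;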

I'm pretty sure of one countervailing consideration: VACUUMing the
table carries a cost of its own, and it is quite possible that the
cost of maintaining the index would be noise against that.
-- 
"cbbrowne","@","acm.org"
http://cbbrowne.com/info/lisp.html
When a man talks dirty to a woman, it's sexual harassment. When a woman
talks dirty to a man, it's 3.95 per minute.

