2009/11/5 Matthew Wakeling <matthew@xxxxxxxxxxx>
> On Thu, 5 Nov 2009, Grzegorz Jaśkiewicz wrote:
>> If it is an insert of some sort, then divide it up. If it is a query that runs
>> over data, use limits, and do it in small batches. Overall, a divide-and-conquer
>> approach works in these scenarios.
>
> Unfortunately, dividing the work up can cause a much greater load, which would
> make things worse. Inserting in smaller chunks and committing more frequently can
> reduce performance. Splitting a query up with LIMIT and OFFSET just multiplies the
> number of times the query has to be run. Each time, the query is evaluated, the
> first <offset> rows are thrown away, and the next <limit> rows are returned, which
> wastes a huge amount of time.
>
> If you are inserting data, then use COPY FROM STDIN, and you can throttle the data
> stream. When you are querying, declare a cursor, and fetch from it at a throttled
> rate.
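For the query side, fetching at a throttled rate could look something like this.
Rough, untested sketch only: psycopg2 and the names "mydb" and "big_table" are
placeholders, not anything from this thread.

import time
import psycopg2

conn = psycopg2.connect("dbname=mydb")

# A named cursor is server-side, so the result set streams to the
# client in batches instead of being materialised all at once.
cur = conn.cursor(name="throttled_scan")
cur.execute("SELECT * FROM big_table")

while True:
    rows = cur.fetchmany(1000)   # batch size: one knob to tune
    if not rows:
        break
    # ... do the real work on this batch here ...
    time.sleep(0.1)              # the throttle: pause between batches

cur.close()
conn.close()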
as with everything, you have to find the right balance. I think he is looking for low impact, not speed, so he has to trade one for the other. Find a batch size that is small enough, but not too small, because as you said, otherwise it will have too much impact.
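For the insert side, COPY FROM STDIN can be throttled by wrapping the input file in
something that sleeps between reads. Again just a sketch with made-up names
(data.csv, big_table), assuming psycopg2; the delay per chunk is the balance knob:

import time
import psycopg2

class ThrottledFile:
    """Wrap a file so COPY pulls its data in rate-limited chunks."""
    def __init__(self, f, delay=0.05):
        self.f = f
        self.delay = delay          # seconds to pause per chunk read

    def read(self, size=-1):
        time.sleep(self.delay)      # throttle before handing over data
        return self.f.read(size)

    def readline(self):
        return self.f.readline()

conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()
with open("data.csv") as f:
    cur.copy_expert("COPY big_table FROM STDIN WITH CSV", ThrottledFile(f))
conn.commit()
conn.close()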
--
GJ