On Mon, Apr 13, 2015 at 7:37 PM, Jim Nasby <Jim.Nasby@xxxxxxxxxxxxxx> wrote:
I wonder if what I need to do, considering that I update a lot of "the same" rows as I process this queue, is to create a temp table, update the rows there, and then update the actual tables once at the end...
On 4/13/15 7:01 PM, Pawel Veselov wrote:
Cursors tend to make things slow. Avoid them if you can.
Is there an alternative to iterating over a number of rows, where a
direct update query is not an option?
I really doubt that either the actual processing logic, including use of
types has anything to do with my problem. This is based on the fact that
out of the tables that are being changed, only one is exhibiting the
problem. All of the involved tables have nearly the same structure, and
have the same logical operations performed on them. I thought may be the
"bad" table is slow because it was first in the list, and Postgres was
caching the functions results, but I moved things around, and pattern is
the same.
I'm guessing that you're essentially processing a queue. Take a look at http://www.postgresql.org/message-id/552C750F.2010401@BlueTreble.com for some ideas. Basically, not only do cursors have non-trivial overhead, doing a ton of single-row queries is going to have a non-trivial overhead itself.
Thank you for the pointers. PgQ sounds interesting, it has to be remote for RDS (I use RDS), but I'll try implementing a solution based on it.
However, for all the times that is being spent during this update, the breakdown is:
update "total" table: 10.773033
update "hourly" table: 00.179711
update "daily" table: 01.082467
update some other table (actually, it has cardinality similar to "total" table): 00.168287
clean the queue table: 00.021733
overhead: 00.014922
The "overhead" is time taken to run the whole procedure, minus all these other times that have been counted.
(some notes about the "daily" table below)
Keep in mind that the estimated cost is not terribly useful; it's the actual times that matter.As for your specific question, I suggest you modify the plpgsql
function so that it's doing an EXPLAIN ANALYZE on the slow table.
EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
would, with a single column of type text. So you just need to do
something with that output. The easiest thing would be to replace
this in your function:
UPDATE slow_table SET ...
to this (untested)
RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...
and change the function so it returns SETOF text instead of whatever
it returns now.
Thank you, that made it a lot easier to see into what's really going on.
But the outcome is somewhat the same. The "bad" table analysis shows a
very high cost, and thousands of rows, where the table contains only 24
rows. This time, however, the actual run time is shown, and one can see
where the time is spent (I was using just a sum of clock_time()s around
the update statements to see where the problem is).
r_agrio_hourly - "good", r_agrio_total - "bad".
Update on r_agrio_hourly (cost=0.42..970.32 rows=250 width=329)
(actual time=2.248..2.248 rows=0 loops=1)
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly
(cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207
rows=1 loops=1)
Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
(device_type = 3::numeric) AND (placement = 2::numeric))
Total runtime: 2.281 ms
Update on r_agrio_total (cost=0.42..45052.56 rows=12068 width=321)
(actual time=106.766..106.766 rows=0 loops=1)
-> Index Scan using u_r_agrio_total on r_agrio_total
(cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
rows=1 loops=1)
Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
Total runtime: 106.793 ms
I suspect what's happening here is a combination of things. First, the hourly table is basically living in cache, but the total table is not. That means that when you go to find a row in the total table you're actually hitting the disk instead of pulling the data from memory.
Second, you may have a lot of dead rows in the total table. I suspect this because of the very large amount of time the index scan is taking. Unless you're running on an old 10MB MFM drive you'd be pretty hard pressed for even 2 IO operations (one for the index leaf page and one for the heap page) to take 32ms. I suspect the index scan is having to read many dead rows in before it finds a live one, and incurring multiple IOs. Swiching to EXPLAIN (analyze, buffers) would help confirm that.
That looks most likely to me as well. Most of the updates in a single "batch", for the total table would be on the same record, while for hourly table it's a lot less. Logically, the tables contain identical data, except that hourly table breaks it down per hour, and total table contains the data for all times. The daily table contains the same data per day.
So, if I compared the tables, the "total" table has the smallest amount of rows, followed by daily, and then by hourly. However, total table gets the most amount of updates on the same row, followed by daily followed by hourly. It seems to me that the more of these inline updates are done on rows, the worse this particular situation is (as you can see from the above, the daily table is also already over a second long to get updated)
Here is the explain analyze output with buffers:
Update on r_agrio_hourly (cost=0.42..986.10 rows=254 width=329) (actual time=5.945..5.945 rows=0 loops=1)
Buffers: shared hit=827 read=431 dirtied=17
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly (cost=0.42..986.10 rows=254 width=329) (actual time=2.220..2.645 rows=1 loops=1)
Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type = 3::numeric) AND (placement = 2::numeric))
Buffers: shared hit=402 read=14 dirtied=11
Total runtime: 6.005 ms
Update on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=320.494..320.494 rows=0 loops=1)
Buffers: shared hit=170298 read=79791 dirtied=6 written=41
-> Index Scan using tag_r_agrio_total on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=0.977..233.330 rows=1 loops=1)
Index Cond: (tagid = 1002::numeric)
Filter: ((unitid = 1002::numeric) AND (placement = 2::numeric) AND (device_type = 3::numeric))
Rows Removed by Filter: 7
Buffers: shared hit=26817 read=59973 written=41
Total runtime: 320.530 ms
Update on r_agrio_hourly (cost=0.42..989.90 rows=255 width=329) (actual time=2.605..2.605 rows=0 loops=1)
Buffers: shared hit=817 read=429 dirtied=16
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly (cost=0.42..989.90 rows=255 width=329) (actual time=1.606..1.644 rows=1 loops=1)
Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type = 1::numeric) AND (placement = 2::numeric))
Buffers: shared hit=133 read=283 dirtied=12
Total runtime: 2.634 ms
Update on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=283.720..283.720 rows=0 loops=1)
Buffers: shared hit=196398 read=53955 dirtied=5 written=18
-> Index Scan using tag_r_agrio_total on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=0.868..210.946 rows=1 loops=1)
Index Cond: (tagid = 1002::numeric)
Filter: ((unitid = 1002::numeric) AND (placement = 2::numeric) AND (device_type = 1::numeric))
Rows Removed by Filter: 7
Buffers: shared hit=39758 read=47033 written=18
Total runtime: 283.757 ms
Update on r_agrio_hourly (cost=0.42..1020.91 rows=263 width=329) (actual time=1.850..1.850 rows=0 loops=1)
Buffers: shared hit=1211 read=35 dirtied=15
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly (cost=0.42..1020.91 rows=263 width=329) (actual time=0.976..1.087 rows=1 loops=1)
Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type = 2::numeric) AND (placement = 2::numeric))
Buffers: shared hit=399 read=17 dirtied=11
Total runtime: 1.878 ms
Update on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=287.481..287.481 rows=0 loops=1)
Buffers: shared hit=181548 read=68207 dirtied=4 written=24
-> Index Scan using tag_r_agrio_total on r_agrio_total (cost=0.42..6.49 rows=1 width=321) (actual time=0.785..206.947 rows=1 loops=1)
Index Cond: (tagid = 1002::numeric)
Filter: ((unitid = 1002::numeric) AND (placement = 2::numeric) AND (device_type = 2::numeric))
Rows Removed by Filter: 7
Buffers: shared hit=31168 read=55623 written=24
Total runtime: 287.506 ms
Third, I think something odd is happening with the update itself. I'm pretty sure that the index scan itself is visiting the heap pages, so each page should be in shared buffers by the time each tuple hits the update node. That makes me wonder what on earth is taking 60ms to update the tuple. I suspect it's going into either finding a free buffer to put the new tuple on, or waiting to try and extend the relation. Selecting ctid from the freshly updated rows and comparing the first number to the total number of pages in the heap would show if the new tuples are all ending up at the end of the heap.
before:
ctid
-----------
(25536,8)
(77292,7)
(77292,6)
(77292,5)
(77292,4)
(77292,3)
(77292,2)
(77292,1)
after:
ctid
-----------
(77314,2)
(77314,1)
(77313,7)
(77313,6)
(77313,5)
(77313,4)
(77313,8)
(77312,9)
But, I'm sorry, I don't know how to determine "total number of pages in the heap"...