
Re: explain analyze showed improved results without changes, why?

On 2016-12-23 10:35:26 -0600, Chris Richards wrote:
> Howdy. I noticed a significant problem with a query on one of my tables. I
> tried recreating the problem and wasn't able to do so on a different install,
> so a few days later I ran the same query on the problem table. Lo and
> behold, there wasn't a problem anymore. I'm at a loss as to why.
[...]
>     "blocks_off_sz_idx" btree (off, sz)
> 
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                         QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..587.83 rows=1 width=100) (actual time=4814.579..4814.579 rows=1 loops=1)
>    ->  LockRows  (cost=0.43..1358633.99 rows=2313 width=100) (actual time=4814.577..4814.577 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  (cost=0.43..1358610.86 rows=2313 width=100) (actual time=4813.498..4814.384 rows=2 loops=1)
>                Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 6935023
>  Total runtime: 4814.619 ms
> (6 rows)

This scans the table in ascending (off, sz) index order, applying the
filter to each row, until it finds one that matches. Apparently, at the
time of the query, 6935023 rows preceded the first matching row in that
order.
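
A rough way to see how much of the index such a scan has to walk (just a
sketch that reuses the filter from the quoted query; the count should be
close to "Rows Removed by Filter"):

    -- Count the rows that precede the first matching row in (off, sz) order;
    -- this is roughly the work the index scan does before LIMIT 1 can stop.
    WITH first_match AS (
        SELECT off, sz
        FROM blocks
        WHERE cloudidx = 98038 AND off + sz >= 0 AND state = 'seeded'
        ORDER BY off, sz
        LIMIT 1
    )
    SELECT count(*) AS rows_before_first_match
    FROM blocks b, first_match f
    WHERE (b.off, b.sz) < (f.off, f.sz);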

[...]
> And here's the second. Notice that even though there are more rows, it was much
> faster, and the "rows removed by filter" dropped by several orders of
> magnitude.
> 
> 
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                    QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..584.57 rows=1 width=100) (actual time=0.071..0.071 rows=1 loops=1)
>    ->  LockRows  (cost=0.43..1390825.21 rows=2381 width=100) (actual time=0.070..0.070 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  (cost=0.43..1390801.40 rows=2381 width=100) (actual time=0.055..0.055 rows=1 loops=1)
>                Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 26
>  Total runtime: 0.114 ms
> (6 rows)

The plan here is exactly the same, but only 26 rows are discarded. My
guess is that between those two queries a row matching the query was
inserted with a very low (off, sz) value, so now the query can return
after checking only a handful of rows.
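
Something along these lines would have that effect (hypothetical values;
the real table presumably has more columns than shown):

    -- A matching row near the very start of (off, sz) order is found almost
    -- immediately, so the index scan discards only a handful of rows.
    INSERT INTO blocks (cloudidx, off, sz, state)
    VALUES (98038, 0, 4096, 'seeded');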

LIMIT, EXISTS, etc. are awful when you want predictable performance. You
may be lucky and the rows you are looking for are right at the start of
the scan, or you may be unlucky and have to scan through the whole table
to find them. The optimizer (usually) doesn't have enough information
and assumes they are spread randomly through the table.
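
If this particular lookup has to be predictable, an index whose leading
columns match the WHERE clause lets the scan start at the first
qualifying row instead of filtering its way there. A sketch (the index
name is made up):

    -- Equality on cloudidx plus the partial predicate on state let the
    -- planner jump straight to the qualifying rows, already ordered by off;
    -- only the (off + sz >= 0) condition remains as a filter.
    CREATE INDEX blocks_cloudidx_off_sz_idx
        ON blocks (cloudidx, off, sz)
        WHERE state = 'seeded';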

        hp

-- 
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@xxxxxx         | -- John Gordon
__/   | http://www.hjp.at/ |    http://downlode.org/Etext/alicebob.html
