On Sat, Jun 10, 2017 at 11:10 PM, David G. Johnston <david.g.johnston@xxxxxxxxx> wrote:
Specifically, the notes section.
That seems to fit the behavior. Thanks; I wasn't aware of that feature of prepared statements. I changed the Python code to do EXPLAIN ANALYZE EXECUTE rather than EXECUTE, and I do indeed see a change in plan after the fifth iteration:
(first five iterations)
Index Only Scan using test_pkey on test (cost=0.29..476.29 rows=9999 width=4) (actual time=0.058..2.439 rows=10000 loops=1)
Index Cond: (col1 = 'xyz'::text)
Filter: (col2 ~~ '%'::text)
Heap Fetches: 0
Execution time: 2.957 ms
(iterations 6+)
Sort (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784 rows=10000 loops=1)
Sort Key: col2
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..204.00 rows=50 width=4) (actual time=0.014..2.100 rows=10000 loops=1)
Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms
So the problem here may be that the cost estimate for the generic execution plan is way off in the case of a LIKE bind variable that matches a large number of rows. I did make sure to have the Java code do a VACUUM ANALYZE after doing its inserts, just to eliminate lack of statistics as a possible explanation. Maybe the incorrect row count estimate (50 instead of 10000) is causing it to think the quicksort will be a lot cheaper than it ends up being with the actual rows?
Interesting that the Java version switches to the suboptimal plan after 9 iterations rather than 5. I don't know how to get the JDBC driver to do an EXPLAIN on a prepared statement, so I can't confirm that the same thing is happening there, but it seems plausible. Happy to try that if there's a way to do it.
-Steve