"David G. Johnston" <david.g.johnston@xxxxxxxxx> writes: > On Fri, Feb 17, 2017 at 3:49 PM, Tomas Vondra <tomas.vondra@xxxxxxxxxxxxxxx> > wrote: >> That may seem a bit strange, but I'd bet it finds the short value in some >> statistic (MCV, histogram) ans so can provide very accurate estimate. > I'm not seeing how any of the statistic columns would capture a value that > doesn't actually appear in the table...(actual ... row=0) I think it's the other way around. It found '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea' in the stats, concluded (accurately) that there would be five matches, and on the strength of that decided that a seqscan over this very tiny table would be faster than an indexscan. In the other case, the short string exists neither in the table nor the stats, and the default estimate is turning out to be that there's a single match, for which it likes the indexscan solution. This is all pretty unsurprising if '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea' is in the most-common-values list. Anything that's *not* in that list is going to get a smaller rowcount estimate. (I don't think that the string length, per se, has anything to do with it.) I'm not sure what performance problem the OP was looking to solve, but expecting experiments on toy-sized tables to give the same plans as you get on large tables is a standard mistake when learning to work with the PG planner. Also, if toy-sized tables are all you've got, meaning the whole database can be expected to stay RAM-resident at all times, it'd be a good idea to reduce random_page_cost to reflect that. The default planner cost settings are meant for data that's mostly on spinning rust. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance