On Wed, May 19, 2010 at 8:06 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Wed, May 19, 2010 at 8:04 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
>> On Wed, May 19, 2010 at 7:46 PM, Matthew Wakeling <matthew@xxxxxxxxxxx> wrote:
>>> On Wed, 19 May 2010, Scott Marlowe wrote:
>>>>>
>>>>> It's apparently estimating (wrongly) that the merge join won't have to
>>>>> scan very much of "files" before it can stop because it finds an eid
>>>>> value larger than any eid in the other table.  So the issue here is an
>>>>> inexact stats value for the max eid.
>>>
>>> I wondered if it could be something like that, but I rejected that idea, as
>>> it obviously wasn't the real world case, and statistics should at least get
>>> that right, if they are up to date.
>>>
>>>> I changed stats target to 1000 for that field and still get the bad plan.
>>>
>>> What do the stats say the max values are?
>>
>> 5277063,5423043,13843899 (I think).
>>
>> # select count(distinct eid) from files;
>>  count
>> -------
>>    365
>> (1 row)
>>
>> # select count(*) from files;
>>   count
>> ---------
>>  3793748
>
> A followup: of those rows,
>
> select count(*) from files where eid is null;
>   count
> ---------
>  3793215
>
> are null.

So, Tom, do you think it's possible that the planner isn't noticing
all those nulls and thinks it'll just take a row or two to get to the
value it needs to join on?
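
To put numbers on how lopsided the column is, here is a rough sketch of the arithmetic using only the three counts quoted above. Python is used purely as a calculator, and the variable names are illustrative, not anything from the schema:

```python
# Counts copied from the psql output quoted in this thread.
total_rows = 3793748      # select count(*) from files
null_eid_rows = 3793215   # select count(*) from files where eid is null
distinct_eids = 365       # select count(distinct eid) from files

# Rows that can actually take part in a join on eid.
non_null_rows = total_rows - null_eid_rows

# Share of the table whose eid is NULL.
null_fraction = null_eid_rows / total_rows

# Average number of non-null rows per distinct eid value.
rows_per_eid = non_null_rows / distinct_eids

print(f"non-null eid rows: {non_null_rows}")
print(f"null fraction:     {null_fraction:.5f}")
print(f"rows per eid:      {rows_per_eid:.2f}")
```

So only 533 of the roughly 3.8 million rows have a non-null eid. The planner's own view of that fraction is the null_frac column in pg_stats for files.eid, which would be worth comparing against the figure computed here.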