On 10/09/2013 05:57 PM, Brian Wong wrote:
But from a user's perspective, why would it ever make sense that by
adding an additional where clause, it actually brings in more data
into the picture? If I have query returning 100 rows. Adding an
additional where clause should only cut down the number of rows, not
increase it.
...
It may be a bit surprising at first when one does not consider they are
looking at a view, not at a table, but for the planner to do its job of
returning the data you want efficiently it must be allowed to optimize
your request by rewriting it into a logically equivalent form. For
example, given something like ...where eventtime >
abstime(12334554321)... the planner will detect that abstime(1234554321)
will always return the same result and will calculate that value once,
not redo it once for every row.
The idea that you are just reducing the number of rows doesn't mean that
doing things in that order is efficient. Imagine that you had a view
that returned the records of all males from a table of all people in the
US that includes an indexed birthday field as MMDD. This view would
return a bit over 155-million rows - roughly 50% of the approximately
310-million US population. You then query that view to find men who have
a birthday today. It is obviously suboptimal to first retrieve
155-million records from disk then evaluate those for the appropriate
birthday when you can, instead, start by grabbing 1/365th or less than
1-million rows which can be efficiently returned using the index then
evaluate those records for male/female.
It's sort of like sending someone to the store with a shopping list -
you have no expectation that they will fill the cart in the order you
put things on the list, just that they will come back with the items on
the list.
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general