> You would benefit from adding the age column to view_customer, or at
> least consider having some view which contains all the columns you'll > ever need from those tables and if you need special views with only a > subset of columns due to some software doing "select * from > viewname;", then you could just create some. Joining to the same table > again seems like a bit of a waste of effort for the planner and > executor. I would argue that the anti pattern would be the software that insist on using "select * from viewname;" from a view that has calculated columns that you do not care for. I recommend introducing both lightweight views and heavyweight views, so you can join up probably for what you need. My example is fabricated trying to simplify things, but I seem to
create more confusion than clarity in my example. My point was only to see if anything could be added to the fabricated execution path. I agree that the listed example does not make sense. So I will try and give some more context to real use cases. Imagine an invoice entity where you have one relation for invoice
base data and a relation for invoice_line. The invoice has some invoice_id, customer_id, due_date, paid_date and invoice_line contains each line with a invoice_id, display_name, amount. A view (view_invoice_with_amount) where you calculate the total. so a query could be SELECT c.customer_id, i.invoice_amount_total FROM view_customer c JOIN view_invoice_with_amount i ON c.customer_id = i.customer_id WHERE c.first_name = 'John'; If you ever need to filter by invoice_amount_total, it might be necesary denormalize the relations and cache the amount in the invoice table. > I'd assume customer_id is the PRIMARY KEY of customer and
> is unique. This is a continuation of the previous example, maybe I should have included it all to make it more clear. But customer_id is declared as a primary key. > It's not all that clear what your view is doing here. Confusingly
> there's a Sort in the plan, yet nothing in the query asked for that, > so I guess that the view must have an ORDER BY. If you get rid of that > the planner would likely use an index on product (customer_id) to > parameterise the nested loop, at least, it likely would, if you have > one. The view is defined in the original post. What I was trying to illustrate was a DISTINCT ON clause to prioritize multiple products pr customer to a somewhat "main" product for the customer. The ORDER BY on product_id would in this case then map the first product a customer gets to its "main" product. It could also be the most valuable product or newest ordered active product etc. It is just some way of mapping one to many relation to a one to one. Again the example is simplified and fabricated and maybe looses its power to explain its intents. > It's pretty bad practice to have ORDER BY in views. I kinda wish we > didn't even allow it, but that ship sailed many years ago... It is required by DISTINCT ON and as soon as you go into reporting, datawarehouse then it gets difficult to avoid these along with group by. Instead of writing each query from the ground up you get a huge benefit by factorizing each query into meaningful entities that can stand alone and make sense by themself, and from these build up the query to answer your questions. That way you gain lots of re-use of code and definition doesn't change between queries. The down side is it leaves alot of work to the planner. It's a trade off between optimization, readability and simplicity. I hope I make more sense now.
- Kim
|