I have a view to generate a list of instructors and a count of their > future classes. > > "instructors" is a link table between "class" and "person". > > CREATE VIEW future_instructor_counts > AS > SELECT person.id AS person_id, > first_name, > last_name, > count(instructors.class) AS class_count > > FROM class, instructors, person > > WHERE class.id = instructors.class AND > person.id = instructors.person > AND class_time > now() > > GROUP BY person_id, first_name, last_name; The trick is to do the data aggregation separately, then JOIN in whatever other fields you want. Something like this: CREATE VIEW future_instructor_counts AS SELECT * FROM (SELECT person.id AS person_id, first_name, last_name) personinfo INNER JOIN (SELECT class.id FROM class WHERE class_time > now() ) classes INNER JOIN (SELECT id, count(class) AS class_count FROM instructors GROUP BY id) classcount ON personinfo.person_id = instructors.id AND classes.id = instructors.id In many cases when using aggregate functions you get just the fields you need from the agg function (typically an id plus the aggregate result) and JOIN with other tables (or even the same table) to get other info such as first_name, last_name, etc. Otherwise, if you GROUP BY additional fields so you can get them in the output, you may be making the db do additional work. > 1) With an aggregate function in the query, is there any way to remove > the "AND class_time > now()" so that timestamp can be passed in the > select? That is, I'd like to be able to do this? > > select * from instructor_counts where class_time > now(); > > But class_time is not part of the VIEW so that's not valid. No problem, just make it a part of the view. See the classes section below. CREATE VIEW future_instructor_counts AS SELECT * FROM (SELECT person.id AS person_id, first_name, last_name) personinfo INNER JOIN -- Add class_time field! (SELECT class.id, class_time FROM class WHERE class_time > now() ) classes INNER JOIN (SELECT id, count(class) AS class_count FROM instructors GROUP BY id) classcount ON personinfo.person_id = instructors.id AND classes.id = instructors.id [Disclaimer: I've not tested this code at all. It could help if you sent table definitions and maybe even dummy data via insert commands.] > And if it was included then I don't have an aggregate function any more - no > more grouping. If you do the agg function separately like this that isn't an issue. You join tables to get whatever fields you'd like to have in your output. > 2) I think I'm missing something obvious. I know that I need to > specify all my non-aggregate columns in the "GROUP BY", but I don't > under stand why. Really, the results are just grouped only by > person.id so why the need to specify the other columns. > > And if you don't specify all the columns then Postgresql reports: > > ERROR: column "person.id" must appear in the GROUP BY > clause or be used in an aggregate function > > Is there a reason Postgresql doesn't just add the column > automatically? It does in other cases (like a missing table in a > join). As I mention above, if you GROUP BY additional fields just to get them in the output, you may be making the db do additional work. I seem to remember that in a later SQL standard (ie, after SQL-99 but I could be wrong) I believe it allows you to specify additional fields in SELECT that are not in the GROUP BY clause. But PG isn't there yet. -Roger -----Original Message----- From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general-owner@xxxxxxxxxxxxxx]On Behalf Of Bill Moseley Sent: Friday, September 16, 2005 11:30 AM To: pgsql-general@xxxxxxxxxxxxxx Subject: Setting WHERE on a VIEW with aggregate function. I have a view to generate a list of instructors and a count of their future classes. "instructors" is a link table between "class" and "person". CREATE VIEW future_instructor_counts AS SELECT person.id AS person_id, first_name, last_name, count(instructors.class) AS class_count FROM class, instructors, person WHERE class.id = instructors.class AND person.id = instructors.person AND class_time > now() GROUP BY person_id, first_name, last_name; I have two very basic SQL questions: 1) With an aggregate function in the query, is there any way to remove the "AND class_time > now()" so that timestamp can be passed in the select? That is, I'd like to be able to do this? select * from instructor_counts where class_time > now(); But class_time is not part of the VIEW so that's not valid. And if it was included then I don't have an aggregate function any more - no more grouping. 2) I think I'm missing something obvious. I know that I need to specify all my non-aggregate columns in the "GROUP BY", but I don't under stand why. Really, the results are just grouped only by person.id so why the need to specify the other columns. And if you don't specify all the columns then Postgresql reports: ERROR: column "person.id" must appear in the GROUP BY clause or be used in an aggregate function Is there a reason Postgresql doesn't just add the column automatically? It does in other cases (like a missing table in a join). Thanks -- Bill Moseley moseley@xxxxxxxx ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings