Re: Setting WHERE on a VIEW with aggregate function.

> I have a view to generate a list of instructors and a count of their
> future classes.
> 
> "instructors" is a link table between "class" and "person".
> 
> CREATE VIEW future_instructor_counts
>     AS
>         SELECT  person.id AS person_id,
>                 first_name,
>                 last_name,
>                 count(instructors.class) AS class_count
> 
>           FROM  class, instructors, person
> 
>          WHERE  class.id    = instructors.class AND
>                 person.id   = instructors.person
>                 AND class_time > now()
> 
>       GROUP BY  person_id, first_name, last_name;

The trick is to do the data aggregation separately, then JOIN in whatever other fields you want.

Something like this:

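CREATE VIEW future_instructor_counts
    AS
        SELECT  personinfo.person_id,
                personinfo.first_name,
                personinfo.last_name,
                classcount.class_count
          FROM

        (SELECT person.id AS person_id,
                first_name,
                last_name
           FROM person) personinfo

        INNER JOIN

        -- Aggregate on its own: one row per instructor, future classes only
        (SELECT instructors.person AS person_id,
                count(instructors.class) AS class_count
           FROM instructors
                INNER JOIN class ON class.id = instructors.class
          WHERE class.class_time > now()
          GROUP BY instructors.person) classcount

        ON personinfo.person_id = classcount.person_id;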
            
In many cases when using aggregate functions, you select just the fields the aggregate needs (typically an id plus the aggregate result) and then JOIN with other tables (or even the same table) to get other info such as first_name, last_name, etc.

Otherwise, if you GROUP BY additional fields just so you can get them in the output, you may be making the db do additional work, since it has to group on every one of those fields rather than on the id alone.
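
Here is a small runnable sketch of that pattern. It is not PostgreSQL: it uses Python's built-in sqlite3 module as a stand-in so it can be run anywhere, the table definitions and sample rows are made up for illustration, and a bound parameter takes the place of now().

```python
import sqlite3

# Assumed schema and dummy data; the real table definitions were not posted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE class (id INTEGER PRIMARY KEY, class_time TEXT);
    CREATE TABLE instructors (person INTEGER, class INTEGER);

    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO class VALUES (10, '2005-01-01'), (11, '2006-01-01'), (12, '2006-06-01');
    INSERT INTO instructors VALUES (1, 10), (1, 11), (1, 12), (2, 10), (2, 12);
""")

# Aggregate in a subquery (id plus count only), then join to person for the names.
query = """
    SELECT p.id, p.first_name, p.last_name, c.class_count
      FROM person p
           INNER JOIN (SELECT i.person, count(i.class) AS class_count
                         FROM instructors i
                              INNER JOIN class ON class.id = i.class
                        WHERE class.class_time > ?
                        GROUP BY i.person) c
                   ON c.person = p.id
     ORDER BY p.id
"""
rows = conn.execute(query, ("2005-09-16",)).fetchall()
print(rows)
```

The outer query never groups on first_name or last_name; they come along through the join on the id.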

> 1) With an aggregate function in the query, is there any way to remove
> the "AND class_time > now()" so that timestamp can be passed in the
> select?  That is, I'd like to be able to do this?
> 
>     select * from instructor_counts where class_time > now();
> 
> But class_time is not part of the VIEW so that's not valid.

No problem, just make it a part of the view. See the classes section below.

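CREATE VIEW future_instructor_counts
    AS
        SELECT  personinfo.person_id,
                personinfo.first_name,
                personinfo.last_name,
                classes.class_time,
                classcount.class_count
          FROM

        (SELECT person.id AS person_id,
                first_name,
                last_name
           FROM person) personinfo

        INNER JOIN instructors
                ON instructors.person = personinfo.person_id

        INNER JOIN

        -- Add class_time field!
        (SELECT class.id, class_time FROM class
          WHERE class_time > now()) classes
                ON classes.id = instructors.class

        INNER JOIN

        (SELECT instructors.person AS person_id,
                count(instructors.class) AS class_count
           FROM instructors
                INNER JOIN class ON class.id = instructors.class
          WHERE class.class_time > now()
          GROUP BY instructors.person) classcount
                ON classcount.person_id = personinfo.person_id;

Note that class_count is still computed inside the classcount subquery, so a WHERE on class_time against the view narrows the rows you get back but does not change the count.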

[Disclaimer: I've not tested this code at all. It could help if you sent table definitions and maybe even dummy
data via insert commands.]

>  And if it was included then I don't have an aggregate function any more - no
> more grouping.

If you do the agg function separately like this that isn't an issue. You join tables to get whatever fields you'd like to have in your output.
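
If the cutoff really has to be supplied by the caller, one alternative is to leave both the time filter and the count out of the view and do them in the query that selects from it. The following is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than on PostgreSQL, the schema and rows are invented, and the view name instructor_classes and the helper instructor_counts() are hypothetical.

```python
import sqlite3

# Assumed schema and dummy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE class (id INTEGER PRIMARY KEY, class_time TEXT);
    CREATE TABLE instructors (person INTEGER, class INTEGER);

    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO class VALUES (10, '2005-01-01'), (11, '2006-01-01'), (12, '2006-06-01');
    INSERT INTO instructors VALUES (1, 10), (1, 11), (1, 12), (2, 10), (2, 12);

    -- Hypothetical view: one row per instructor per class, no filter, no count.
    CREATE VIEW instructor_classes AS
        SELECT p.id AS person_id,
               p.first_name AS first_name,
               p.last_name AS last_name,
               class.class_time AS class_time
          FROM person p
               INNER JOIN instructors i ON i.person = p.id
               INNER JOIN class ON class.id = i.class;
""")

def instructor_counts(conn, cutoff):
    """Count each instructor's classes after `cutoff`, chosen at query time."""
    return conn.execute("""
        SELECT person_id, first_name, last_name, count(*) AS class_count
          FROM instructor_classes
         WHERE class_time > ?
         GROUP BY person_id, first_name, last_name
         ORDER BY person_id
    """, (cutoff,)).fetchall()

early = instructor_counts(conn, "2005-09-16")
late = instructor_counts(conn, "2006-03-01")
print(early)
print(late)
```

The trade-off is that the grouping moves out of the view and into every query that uses it.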
 
> 2) I think I'm missing something obvious.  I know that I need to
> specify all my non-aggregate columns in the "GROUP BY", but I don't
> understand why.  Really, the results are just grouped only by
> person.id so why the need to specify the other columns.
> 
> And if you don't specify all the columns then Postgresql reports:
> 
>   ERROR:  column "person.id" must appear in the GROUP BY 
>             clause or be used in an aggregate function
> 
> Is there a reason Postgresql doesn't just add the column
> automatically?  It does in other cases (like a missing table in a
> join).

As I mention above, if you GROUP BY additional fields just to get them in the output, you may be making the db do additional work.

I seem to remember that a later SQL standard (i.e., after SQL-99, but I could be wrong) allows you to specify additional fields in SELECT that are not in the GROUP BY clause. But PG isn't there yet.
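
To illustrate the point about the extra GROUP BY fields: assuming person.id is the primary key, adding first_name and last_name to the GROUP BY cannot split any group, so grouping on all three columns and grouping on the id alone (then joining for the names) should give the same rows. A quick check of that, again using SQLite via Python's sqlite3 module as a stand-in for PostgreSQL, with an invented schema and data:

```python
import sqlite3

# Assumed schema and dummy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE instructors (person INTEGER, class INTEGER);

    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO instructors VALUES (1, 10), (1, 11), (1, 12), (2, 10), (2, 12);
""")

# Variant 1: every non-aggregate output column is listed in GROUP BY.
wide = conn.execute("""
    SELECT p.id, p.first_name, p.last_name, count(i.class) AS class_count
      FROM person p
           INNER JOIN instructors i ON i.person = p.id
     GROUP BY p.id, p.first_name, p.last_name
     ORDER BY p.id
""").fetchall()

# Variant 2: group on the key alone in a subquery, then join for the names.
narrow = conn.execute("""
    SELECT p.id, p.first_name, p.last_name, c.class_count
      FROM person p
           INNER JOIN (SELECT person, count(class) AS class_count
                         FROM instructors
                        GROUP BY person) c
                   ON c.person = p.id
     ORDER BY p.id
""").fetchall()

print(wide)
print(narrow)
```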

-Roger

-----Original Message-----
From: pgsql-general-owner@xxxxxxxxxxxxxx
[mailto:pgsql-general-owner@xxxxxxxxxxxxxx]On Behalf Of Bill Moseley
Sent: Friday, September 16, 2005 11:30 AM
To: pgsql-general@xxxxxxxxxxxxxx
Subject:  Setting WHERE on a VIEW with aggregate function.


I have a view to generate a list of instructors and a count of their
future classes.

"instructors" is a link table between "class" and "person".

CREATE VIEW future_instructor_counts
    AS
        SELECT  person.id AS person_id,
                first_name,
                last_name,
                count(instructors.class) AS class_count

          FROM  class, instructors, person

         WHERE  class.id    = instructors.class AND
                person.id   = instructors.person
                AND class_time > now()

      GROUP BY  person_id, first_name, last_name;


I have two very basic SQL questions:

1) With an aggregate function in the query, is there any way to remove
the "AND class_time > now()" so that timestamp can be passed in the
select?  That is, I'd like to be able to do this?

    select * from instructor_counts where class_time > now();

But class_time is not part of the VIEW so that's not valid.  And if it
was included then I don't have an aggregate function any more - no
more grouping.


2) I think I'm missing something obvious.  I know that I need to
specify all my non-aggregate columns in the "GROUP BY", but I don't
understand why.  Really, the results are just grouped only by
person.id so why the need to specify the other columns.

And if you don't specify all the columns then Postgresql reports:

  ERROR:  column "person.id" must appear in the GROUP BY 
            clause or be used in an aggregate function

Is there a reason Postgresql doesn't just add the column
automatically?  It does in other cases (like a missing table in a
join).

Thanks


-- 
Bill Moseley
moseley@xxxxxxxx


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
