Re: Selecting distinct records

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]



Dave,
    How about a job that runs once that cleans the whole thing?
Or:
    [Warning, pseudo PHPish code]

    $rs_sid = select distinct (sessionid) from "logs";
    while ($s = pg_fetch_object($rs_sid)
    {
        $rs_stuff = select * from "logs" where .... and "sessionid" = $s->id
        for (loop of $rs_stuff)
        {
            $total+=$rs_stuff->sessiontime
        }
        echo "Session $s->id has $total seconds used";
    }

This method involves a select for each session (might be slow on millions of
records :) ).  This might work once or twice but I don't see it as a
suitable solution if this has to happen on a daily basis.

    I know that the latest PostgreSQL does nested selects just fine (I've
got one paticular statement that has three nested selects ) and it's
performance isn't as bad as you'd think, well at least after the query
analyzer has seen it once and it gets cached.

/B

----- Original Message -----
From: "Dave [Hawk-Systems]" <dave@xxxxxxxxxxxxxxxx>
To: "David Busby" <busby@xxxxxxxx>; <pgsql-php@xxxxxxxxxxxxxx>
Sent: Thursday, November 21, 2002 13:26
Subject: RE: Selecting distinct records


> >    Try:
> >select distinct on (sessionid) sum(sessiontime) from logs
> >where name='joeblowuser' and datetime > 1036040400
> >group by sessionid.
>
> sorry, I wasn't clear...  the problem arises when we need to sort the data
for
> output.  postgres demands that the distinct items be first in the ordering
> process, which would not allow display by session, name etc...
>
> the obvious result would be to nest the select statements but can't seem
to get
> that to work either (working with version 7.0.3)
>
> eg/  select * from (select distinct on (sessionid) from logs where....) as
> tempname order by sessiondate desc
>
> gives me an error on the second select, so not sure that is a workable
solution
> either.
>
> >sum() is an aggerate (sp?) function, that means it munges a field
> >(sessiontime) from multiple records into one field in one record.  Since
you
> >are also selecting sessionid (from mulitple records) you need to munge it
> >some how, that munge is accomplished via 'group by'.  From your previous
> >e-mail it seems that (IMHO) the real problem is that duplicates are
getting
> >inserted via external hardware interaction, this select might be a
bandage
> >on a wound whose true size isn't known...
>
> agreed, and that problem has been corrected, but we are dealing with close
to a
> million records which have these duplicates strewn about within...  rather
> annoying, looking for a bandaid in teh select to avoid intensive
post-select
> processing of the output.
>
> Dave
>



[Index of Archives]     [Postgresql General]     [Postgresql Admin]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Backpacking]     [Postgresql Jobs]

  Powered by Linux