Search Postgresql Archives

Re: Select unique set of record, distinct on, please help!!!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 16, 2008 at 10:13 AM, worker <tzhai2007@xxxxxxxxx> wrote:
> Dear all,
>     I am still new to the PostGres DB, I 've been using other DB for
> quite a while, so SQL is OK with me. I am running into problem trying
> to condense duplicate records in a table into one record. Naturally, I
> think about 'DISTINCT ON' in select, but I just couldn't figure out
> the proper syntax to use it correctly. So, please help.
>
>    My table is as follow:
>   Column    |         Type                |
> Modifiers
> -----------------+-------------------------------
> +-------------------------------------------------------
>   tag_id      | character varying(15) |
>  enddate     | date                         |
>  storetcodei | integer                     |
>      gtlt        | character(1)             |
>    value       | real                          |
>    autoid      | integer                     | not null default
> nextval('results_autoid_seq'::regclass)
>
>
>    The main issue is that a lot of records have the same 'tag_id' and
> 'enddate', which is redundant. So I was trying to get rid of all the
> duplicates, but just leave one copy in the database, simple you say,
> right? NOT!
>
>    I used many variants of the following query, but it seems psql is
> not doing the right thing, it always gotten rid of all duplicate, not
> leaving even one record from all the duplicates, in other db, all I
> need is 'unique', so please could someone just give me a SELECT INOT
> or CREATE TABLE AS statements that can rid of the duplicate but leave
> one copy in the table.
>
> create table results1 as select distinct on (tag_id,enddate) * from
> resultsall;

What exactly happens that's wrong.  I've used something like that
before and it should work.

Another possibility is to do it in place.  With transactions you can
always try something, check it, and roll back as needed.

let's say you have an id field that IS unique, we'll call it id (you
can create one if you don't have one)

alter table results1 add id int;
create temp sequence myseq;
update results1 set id=nextval('myseq');
select =id from results1 r1 join results r2 on (r1.tag_id=r2.tag and
r1.enddate=r2.enddate and r1.id > r2.id);

That query should return all the max ids of the matching rows.  The
id>id is the "order by" here.  You could replace it with some other
comparison for ordering.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux