Re: Need some help with a query (uniq -c)

Kenichiro Tanaka <ketanaka@xxxxxxxxxxxxxx> · Tue, 13 Apr 2010 20:12:40 +0900

Hello.

I try with "With Query".
http://www.postgresql.org/docs/8.4/static/queries-with.html

#We can use "With Queries" >  v8.4
#That'll only work if the time values are contiguous, but there's probably a
#similar trick for non-contiguous ,too.

--create data
drop table foo;
create table foo( time int,message text);
insert into foo values(1,'a');
insert into foo values(2,'b');
insert into foo values(3,'b');
insert into foo values(4,'c');
insert into foo values(5,'a');
insert into foo values(6,'c');
insert into foo values(7,'c');
insert into foo values(8,'a');
insert into foo values(9,'a');
insert into foo values(10,'a');

--begin Answer
with recursive r as (
select foo.time,foo.message,1  as dummy from foo
union all
select foo.time,foo.message,r.dummy+1 from foo , r
where foo.time=r.time-1 and foo.message=r.message
)
,rr as (
select foo.time,foo.message,'OLID' as flag  from foo
union all
select foo.time,foo.message,'DUP' as flag from foo , rr
where foo.time-1=rr.time-2 and foo.message=rr.message
)
select time min,time+max(dummy)-1 max,message,max(dummy) counts
from r where time not in (select distinct (time+1) times from rr
where flag='DUP') group by time,message order by time;

--result
postgres(# where flag='DUP') group by time,message order by time;
 min | max | message | counts
-----+-----+---------+--------
   1 |   1 | a       |      1
   2 |   3 | b       |      2
   4 |   4 | c       |      1
   5 |   5 | a       |      1
   6 |   7 | c       |      2
   8 |  10 | a       |      3
(6 rows)

--end

But I think some one can provide more simple SQL.

Thank you.
On Apr 12, 2010, at 11:31 AM, Scott Marlowe wrote:

On Mon, Apr 12, 2010 at 12:22 PM, A B<gentosaker@xxxxxxxxx>  wrote:

Hello!

I have a table (think of it as a table of log messages)

time |  message
-----------------------
1      |   a
2      |   b
3      |   b
4      |  b
5      |  a

the three 'b' are the same message, so I would like to write a query
that would give me a result that is similar to what the unix command
"uniq -c" would give:

first |  message | last | count
--------------------------------------
1     |     a              |   1   |     1
2     |     b              |   4   |     3<--- here it squeezes
similar consecutive messages into a single row
5     |     a              |   5   |     1

How do I write such a command?

Pretty straight ahead:

select min(t), message, max(t), count(*) from table group by message.

That was my first though too, but it combines everything not just adjacent messages.

Something like this, maybe

select t1.message, t1.time as first, t2.time as last, t2.time-t1.time+1 as count
         from foo as t1, foo as t2
     where t1.time<= t2.time and t1.message = t2.message
         and not exists
             (select * from foo as t3
              where (t3.time between t1.time and t2.time and t3.message<>  t1.message)
              or (t3.time = t2.time + 1 and t3.message = t1.message)
              or (t3.time = t1.time - 1 and t3.message = t1.message));

  message | first | last | count
---------+-------+------+-------
  a       |     1 |    1 |     1
  b       |     2 |    4 |     3
  a       |     5 |    5 |     1

That'll only work if the time values are contiguous, but there's probably a
similar trick for non-contiguous.

Cheers,
   Steve

--
================================================
Kenichiro Tanaka
K.K.Ashisuto
http://www.ashisuto.co.jp/english/index.html
================================================

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general