Search Postgresql Archives

getting the ranks out of items with SHARED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

in postgresql you have several possibilites to get the rank of items. A thread 
earlier this year shows correlated subqueries (not very performant) and other 
tricks and techniques to solve the ranking problem:

http://archives.postgresql.org/pgsql-general/2005-05/msg00157.php

The possibility to use a SHARED variable in plperl can be another nice way to 
get a rank of items. ( good example showing SHARED in use is at 
http://www.varlena.com/varlena/GeneralBits/114.php)

So i wrote two functions for my purpose:

CREATE OR REPLACE function ranking(int4) RETURNS int4 LANGUAGE 'plperl' AS $$
  my %this;
 
  $this{'punkte'}  = shift;
  $this{'ranking'} = $_SHARED{'prev'}{'ranking'}; # defaults to 0
  $this{'count'}   = $_SHARED{'prev'}{'count'} + 1; # defaults to 1
  
  $this{'ranking'} = $this{'count'} unless
    $this{'punkte'} == $_SHARED{'prev'}{'punkte'};
  
  $_SHARED{'prev'} = \%this; 
  return $this{'ranking'};
$$;

CREATE OR REPLACE FUNCTION reset_ranking() RETURNS void LANGUAGE 'plperl' AS 
$$
  $_SHARED{'prev'} = undef;
$$;


Nice Thing: the function drops rankings which other ranking solutions in the 
given thread can't. Like this:

rank | points
-------------
1    | 10
2    |  9
2    |  9
4    |  8
5    |  7

It drops rank 3 because we have to entries for second rank.

It would be even nice if you can write a genric ranking() function which takes 
anyarray as an argument, but as far as i know you can't pass an "anyarray" to 
a plperl function, right?


Now i can do the following in plpsql Function which updates a caching table 
for me and it works fine:

PERFORM reset_ranking();        

CREATE TEMP TABLE ranking AS 
SELECT 
  *,
  ranking(r1.gc_gesamtpunkte) AS rank
FROM (
  SELECT 
    mg_name, 
    gc_gesamtpunkte
    FROM temp_gc
    ORDER BY gc_gesamtpunkte DESC, mg_name ASC
) AS r1
;
    
EXECUTE '
      UPDATE temp_gc 
      SET gc_rank = ranking.rank
      FROM ranking  
      WHERE temp_gc.mg_name = ranking.mg_name;    
';
    
DROP TABLE ranking;
 

Problems arrise when you try to do the select and update step together without 
any temporary table in between:

PERFORM reset_ranking();        
    
UPDATE temp_gc SET gc_rank = ranking.rank
FROM (
  SELECT 
    *,
    ranking(r1.gc_gesamtpunkte) AS rank
  FROM (
    SELECT 
      mg_name, 
      gc_gesamtpunkte
    FROM temp_gc
    ORDER BY gc_gesamtpunkte DESC, mg_name ASC
  ) AS r1
) AS ranking
WHERE temp_gc.mg_name = ranking.mg_name;    
';

I have a guess, what happens here: The order of the subselect statement is 
dropped by the optimizer because the optimizer doesn't see the "side-effect" 
of the ranking function. that's ok because using such functions isn't SQLish, 
i guess.

Is there a way to FORCE the optimizer to keep things orders like the sql 
statement author wanted it? 

kind regards,
janning


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux