Adrian Klaver wrote:
On 12/05/2015 08:08 AM, Berend Tober wrote:
/*
Deletion Challenge
I want to delete all but the most recent transaction, per person, from a
table that records a transaction history because at some point the
transaction history grows large enough to adversely effect performance,
and also becomes less relevant for retention.
...
test=> delete from cash_journal where ARRAY[click, cash_journal_id] NOT in (select max(ARRAY[click,cash_journal_id]) from cash_journal group by fairian_id);
DELETE 7
test=> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
click | cash_journal_id | fairian_id | debit | credit | balance | description
-------+-----------------+------------+-------+--------+---------+----------------------------------
412 | 1 | 7 | 5 | | 14 | Sold food quantity 7 units.
37 | 7 | 8 | 8 | | 8 | Ratified contract f1abd670358e03
37 | 9 | 9 | 7 | | 7 | Ratified contract 1574bddb75c78a
36 | 14 | 18 | 0 | 0 | 0 | initial cash balance
413 | 1 | 25 | | 995 | 0 | Redeemed bond 7719a1c782a1ba
(5 rows)
Nice.
The idea of a NOT IN query had occurred to me briefly, but I failed to pursue it because at some
point in the distant past I had gained the impression that NOT IN queries were not computationally
efficient. During one round of testing I had like a million rows. I'll have to run some EXPLAIN
query testing with a larger data sample for comparison. Thanks!
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general