Search Postgresql Archives

Re: Deletion Challenge

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/05/2015 08:08 AM, Berend Tober wrote:
> /*
> 
> Deletion Challenge
> 
> I want to delete all but the most recent transaction, per person, from a
> table that records a transaction history because at some point the
> transaction history grows large enough to adversely effect performance,
> and also becomes less relevant for retention.
> 
> I have devised a way to accomplish this, but it is a 'two-stage'
> approach: that is, it requires two delete statements. I would like to
> know if there is a way to do it in a single statement.
> 
> Bonus challenge: Same question, except preserving the most recent N, for
> N > 1, rows for each person so that a short history is retained after
> the deletion.
> 
> I have included below an annotated test case and my current solution for
> the N = 1 case.
> 
> */
> 
> DROP TABLE IF EXISTS cash_journal;
> 
> 
> CREATE TABLE cash_journal (
>      click bigint NOT NULL,
>      cash_journal_id bigint NOT NULL,
>      fairian_id bigint NOT NULL,
>      debit double precision,
>      credit double precision,
>      balance real DEFAULT 0,
>      description text
> );
> 
> COMMENT ON COLUMN cash_journal.click        IS 'Time of transaction.';
> COMMENT ON COLUMN cash_journal.cash_journal_id    IS 'Sequence of 
> transaction within current click.';
> COMMENT ON COLUMN cash_journal.fairian_id    IS 'Fairian account 
> effected.';
> COMMENT ON COLUMN cash_journal.debit        IS 'Account balance increase 
> amount.';
> COMMENT ON COLUMN cash_journal.credit        IS 'Account balance 
> decrease amount.';
> COMMENT ON COLUMN cash_journal.balance        IS 'Account balance, per 
> Fairian running total.';
> COMMENT ON COLUMN cash_journal.description    IS 'Transaction 
> description.';
> 
> /*
> 
> Below is some sample data, listed in the click/sequence order that the
> data would actually be entered. That is, the 'click' column represents
> advancing time, and within each click, transactions are sequenced by the
> 'cash_journal_id' column. Note there are some missing cash_journal_id
> sequence numbers. This is an artifact of having presented here only
> an illustrative sample. Generally, within each click, the sequence
> would start at one and increment uniformly by one for each new row
> in the same click, and then reset to one for the next click. The
> missing increments in the sample data should not make any difference
> in the solution.
> 
> The 'balance' column is a per-player running total, which is a
> deliberate denormalization. It is calculated in a before insert trigger
> by starting with the per-player previous balance, and then adding
> the new row debit, if any, and subtracting the new row credit, if any.
> 
> Note, not all Fairians will have a transaction in every click, but any
> number of Fairians may have multiple transactions in any click.
> 
> */
> 
> copy cash_journal 
> (click,cash_journal_id,fairian_id,debit,credit,balance,description) from 
> stdin;
> 36    3    7    0    0    0    Initial cash balance
> 36    4    8    0    0    0    Initial cash balance
> 36    5    9    0    0    0    Initial cash balance
> 36    14    18    0    0    0    initial cash balance
> 37    5    7    9    \N    9    Ratified contract fa35e192121eab
> 37    7    8    8    \N    8    Ratified contract f1abd670358e03
> 37    9    9    7    \N    7    Ratified contract 1574bddb75c78a
> 411    1    25    0    0    0    Initial cash balance
> 411    2    25    1000    \N    1000    Issued bond 7719a1c782a1ba
> 412    1    7    5    \N    14    Sold food quantity 7 units.
> 412    2    25    \N    5    995    Bought food quantity 7 units.
> 413    1    25    \N    995    0    Redeemed bond 7719a1c782a1ba
> \.
> 
> 
> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
> 
> /*
> 
> The sample starting data is shown here in order by Fairian so that it is
> perhaps easier to see what is happening for each player. Note that the
> result of the deletion should be the last row for each player.
> 
>   click | cash_journal_id | fairian_id | debit | credit | balance 
> |           description
> -------+-----------------+------------+-------+--------+---------+---------------------------------- 
> 
>      36 |               3 |          7 |     0 |      0 |       0 | 
> Initial cash balance
>      37 |               5 |          7 |     9 |        |       9 | 
> Ratified contract fa35e192121eab
>     412 |               1 |          7 |     5 |        |      14 | Sold 
> food quantity 7 units.
>      36 |               4 |          8 |     0 |      0 |       0 | 
> Initial cash balance
>      37 |               7 |          8 |     8 |        |       8 | 
> Ratified contract f1abd670358e03
>      36 |               5 |          9 |     0 |      0 |       0 | 
> Initial cash balance
>      37 |               9 |          9 |     7 |        |       7 | 
> Ratified contract 1574bddb75c78a
>      36 |              14 |         18 |     0 |      0 |       0 | 
> initial cash balance
>     411 |               1 |         25 |     0 |      0 |       0 | 
> Initial cash balance
>     411 |               2 |         25 |  1000 |        |    1000 | 
> Issued bond 7719a1c782a1ba
>     412 |               2 |         25 |       |      5 |     995 | 
> Bought food quantity 7 units.
>     413 |               1 |         25 |       |    995 |       0 | 
> Redeemed bond 7719a1c782a1ba
> (12 rows)
> 
> */
> 
> 
> /*
> 
> Here is the current, two-stage solution in use. Is there a way to do it
> with a single statement?
> 
> Can you create a solution that retains an arbitrarily specified number
> of rows per player?
> 
> */
> BEGIN;
> 
> WITH max_click AS (
>    SELECT
>      cash_journal.fairian_id,
>      max(cash_journal.click) AS click
>      FROM cash_journal
>      GROUP BY cash_journal.fairian_id
>      )
>    delete from cash_journal j
>      using max_click b
>      where j.fairian_id = b.fairian_id
>      and j.click        < b.click;
> 
> WITH max_journal_id AS (
>    SELECT
>      cash_journal.fairian_id,
>      cash_journal.click,
>      max(cash_journal.cash_journal_id) AS cash_journal_id
>      FROM cash_journal
>      GROUP BY cash_journal.fairian_id, cash_journal.click
>      )
>    delete from cash_journal j
>       using max_journal_id b
>       where j.fairian_id    = b.fairian_id
>       and j.click           = b.click
>       and j.cash_journal_id < b.cash_journal_id;
> 
> COMMIT;
> 
> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
> 
> /*
> 
>   click | cash_journal_id | fairian_id | debit | credit | balance 
> |           description
> -------+-----------------+------------+-------+--------+---------+---------------------------------- 
> 
>     412 |               1 |          7 |     5 |        |      14 | Sold 
> food quantity 7 units.
>      37 |               7 |          8 |     8 |        |       8 | 
> Ratified contract f1abd670358e03
>      37 |               9 |          9 |     7 |        |       7 | 
> Ratified contract 1574bddb75c78a
>      36 |              14 |         18 |     0 |      0 |       0 | 
> initial cash balance
>     413 |               1 |         25 |       |    995 |       0 | 
> Redeemed bond 7719a1c782a1ba
> (5 rows)
> 
> 
> */
> 
> 


test=> delete from cash_journal where ARRAY[click, cash_journal_id] NOT in (select max(ARRAY[click,cash_journal_id]) from cash_journal group by fairian_id);
DELETE 7

test=> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
 click | cash_journal_id | fairian_id | debit | credit | balance |           description            
-------+-----------------+------------+-------+--------+---------+----------------------------------
   412 |               1 |          7 |     5 |        |      14 | Sold food quantity 7 units.
    37 |               7 |          8 |     8 |        |       8 | Ratified contract f1abd670358e03
    37 |               9 |          9 |     7 |        |       7 | Ratified contract 1574bddb75c78a
    36 |              14 |         18 |     0 |      0 |       0 | initial cash balance
   413 |               1 |         25 |       |    995 |       0 | Redeemed bond 7719a1c782a1ba
(5 rows)


-- 
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux