Search Postgresql Archives

Deletion Challenge

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



/*

Deletion Challenge

I want to delete all but the most recent transaction, per person, from a
table that records a transaction history because at some point the
transaction history grows large enough to adversely effect performance,
and also becomes less relevant for retention.

I have devised a way to accomplish this, but it is a 'two-stage'
approach: that is, it requires two delete statements. I would like to
know if there is a way to do it in a single statement.

Bonus challenge: Same question, except preserving the most recent N, for
N > 1, rows for each person so that a short history is retained after
the deletion.

I have included below an annotated test case and my current solution for
the N = 1 case.

*/

DROP TABLE IF EXISTS cash_journal;


CREATE TABLE cash_journal (
    click bigint NOT NULL,
    cash_journal_id bigint NOT NULL,
    fairian_id bigint NOT NULL,
    debit double precision,
    credit double precision,
    balance real DEFAULT 0,
    description text
);

COMMENT ON COLUMN cash_journal.click		IS 'Time of transaction.';
COMMENT ON COLUMN cash_journal.cash_journal_id	IS 'Sequence of transaction within current click.';
COMMENT ON COLUMN cash_journal.fairian_id	IS 'Fairian account effected.';
COMMENT ON COLUMN cash_journal.debit		IS 'Account balance increase amount.';
COMMENT ON COLUMN cash_journal.credit		IS 'Account balance decrease amount.';
COMMENT ON COLUMN cash_journal.balance		IS 'Account balance, per Fairian running total.';
COMMENT ON COLUMN cash_journal.description	IS 'Transaction description.';

/*

Below is some sample data, listed in the click/sequence order that the
data would actually be entered. That is, the 'click' column represents
advancing time, and within each click, transactions are sequenced by the
'cash_journal_id' column. Note there are some missing cash_journal_id
sequence numbers. This is an artifact of having presented here only
an illustrative sample. Generally, within each click, the sequence
would start at one and increment uniformly by one for each new row
in the same click, and then reset to one for the next click. The
missing increments in the sample data should not make any difference
in the solution.

The 'balance' column is a per-player running total, which is a
deliberate denormalization. It is calculated in a before insert trigger
by starting with the per-player previous balance, and then adding
the new row debit, if any, and subtracting the new row credit, if any.

Note, not all Fairians will have a transaction in every click, but any
number of Fairians may have multiple transactions in any click.

*/

copy cash_journal (click,cash_journal_id,fairian_id,debit,credit,balance,description) from stdin;
36	3	7	0	0	0	Initial cash balance
36	4	8	0	0	0	Initial cash balance
36	5	9	0	0	0	Initial cash balance
36	14	18	0	0	0	initial cash balance
37	5	7	9	\N	9	Ratified contract fa35e192121eab
37	7	8	8	\N	8	Ratified contract f1abd670358e03
37	9	9	7	\N	7	Ratified contract 1574bddb75c78a
411	1	25	0	0	0	Initial cash balance
411	2	25	1000	\N	1000	Issued bond 7719a1c782a1ba
412	1	7	5	\N	14	Sold food quantity 7 units.
412	2	25	\N	5	995	Bought food quantity 7 units.
413	1	25	\N	995	0	Redeemed bond 7719a1c782a1ba
\.


SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;

/*

The sample starting data is shown here in order by Fairian so that it is
perhaps easier to see what is happening for each player. Note that the
result of the deletion should be the last row for each player.

 click | cash_journal_id | fairian_id | debit | credit | balance |           description
-------+-----------------+------------+-------+--------+---------+----------------------------------
    36 |               3 |          7 |     0 |      0 |       0 | Initial cash balance
    37 |               5 |          7 |     9 |        |       9 | Ratified contract fa35e192121eab
   412 |               1 |          7 |     5 |        |      14 | Sold food quantity 7 units.
    36 |               4 |          8 |     0 |      0 |       0 | Initial cash balance
    37 |               7 |          8 |     8 |        |       8 | Ratified contract f1abd670358e03
    36 |               5 |          9 |     0 |      0 |       0 | Initial cash balance
    37 |               9 |          9 |     7 |        |       7 | Ratified contract 1574bddb75c78a
    36 |              14 |         18 |     0 |      0 |       0 | initial cash balance
   411 |               1 |         25 |     0 |      0 |       0 | Initial cash balance
   411 |               2 |         25 |  1000 |        |    1000 | Issued bond 7719a1c782a1ba
   412 |               2 |         25 |       |      5 |     995 | Bought food quantity 7 units.
   413 |               1 |         25 |       |    995 |       0 | Redeemed bond 7719a1c782a1ba
(12 rows)

*/


/*

Here is the current, two-stage solution in use. Is there a way to do it
with a single statement?

Can you create a solution that retains an arbitrarily specified number
of rows per player?

*/
BEGIN;

WITH max_click AS (
  SELECT
    cash_journal.fairian_id,
    max(cash_journal.click) AS click
    FROM cash_journal
    GROUP BY cash_journal.fairian_id
    )
  delete from cash_journal j
    using max_click b
    where j.fairian_id = b.fairian_id
    and j.click        < b.click;

WITH max_journal_id AS (
  SELECT
    cash_journal.fairian_id,
    cash_journal.click,
    max(cash_journal.cash_journal_id) AS cash_journal_id
    FROM cash_journal
    GROUP BY cash_journal.fairian_id, cash_journal.click
    )
  delete from cash_journal j
     using max_journal_id b
     where j.fairian_id    = b.fairian_id
     and j.click           = b.click
     and j.cash_journal_id < b.cash_journal_id;

COMMIT;

SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;

/*

 click | cash_journal_id | fairian_id | debit | credit | balance |           description
-------+-----------------+------------+-------+--------+---------+----------------------------------
   412 |               1 |          7 |     5 |        |      14 | Sold food quantity 7 units.
    37 |               7 |          8 |     8 |        |       8 | Ratified contract f1abd670358e03
    37 |               9 |          9 |     7 |        |       7 | Ratified contract 1574bddb75c78a
    36 |              14 |         18 |     0 |      0 |       0 | initial cash balance
   413 |               1 |         25 |       |    995 |       0 | Redeemed bond 7719a1c782a1ba
(5 rows)


*/


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux