On 12/05/2015 08:08 AM, Berend Tober wrote: > /* > > Deletion Challenge > > I want to delete all but the most recent transaction, per person, from a > table that records a transaction history because at some point the > transaction history grows large enough to adversely effect performance, > and also becomes less relevant for retention. > > I have devised a way to accomplish this, but it is a 'two-stage' > approach: that is, it requires two delete statements. I would like to > know if there is a way to do it in a single statement. > > Bonus challenge: Same question, except preserving the most recent N, for > N > 1, rows for each person so that a short history is retained after > the deletion. > > I have included below an annotated test case and my current solution for > the N = 1 case. > > */ > > DROP TABLE IF EXISTS cash_journal; > > > CREATE TABLE cash_journal ( > click bigint NOT NULL, > cash_journal_id bigint NOT NULL, > fairian_id bigint NOT NULL, > debit double precision, > credit double precision, > balance real DEFAULT 0, > description text > ); > > COMMENT ON COLUMN cash_journal.click IS 'Time of transaction.'; > COMMENT ON COLUMN cash_journal.cash_journal_id IS 'Sequence of > transaction within current click.'; > COMMENT ON COLUMN cash_journal.fairian_id IS 'Fairian account > effected.'; > COMMENT ON COLUMN cash_journal.debit IS 'Account balance increase > amount.'; > COMMENT ON COLUMN cash_journal.credit IS 'Account balance > decrease amount.'; > COMMENT ON COLUMN cash_journal.balance IS 'Account balance, per > Fairian running total.'; > COMMENT ON COLUMN cash_journal.description IS 'Transaction > description.'; > > /* > > Below is some sample data, listed in the click/sequence order that the > data would actually be entered. That is, the 'click' column represents > advancing time, and within each click, transactions are sequenced by the > 'cash_journal_id' column. Note there are some missing cash_journal_id > sequence numbers. This is an artifact of having presented here only > an illustrative sample. Generally, within each click, the sequence > would start at one and increment uniformly by one for each new row > in the same click, and then reset to one for the next click. The > missing increments in the sample data should not make any difference > in the solution. > > The 'balance' column is a per-player running total, which is a > deliberate denormalization. It is calculated in a before insert trigger > by starting with the per-player previous balance, and then adding > the new row debit, if any, and subtracting the new row credit, if any. > > Note, not all Fairians will have a transaction in every click, but any > number of Fairians may have multiple transactions in any click. > > */ > > copy cash_journal > (click,cash_journal_id,fairian_id,debit,credit,balance,description) from > stdin; > 36 3 7 0 0 0 Initial cash balance > 36 4 8 0 0 0 Initial cash balance > 36 5 9 0 0 0 Initial cash balance > 36 14 18 0 0 0 initial cash balance > 37 5 7 9 \N 9 Ratified contract fa35e192121eab > 37 7 8 8 \N 8 Ratified contract f1abd670358e03 > 37 9 9 7 \N 7 Ratified contract 1574bddb75c78a > 411 1 25 0 0 0 Initial cash balance > 411 2 25 1000 \N 1000 Issued bond 7719a1c782a1ba > 412 1 7 5 \N 14 Sold food quantity 7 units. > 412 2 25 \N 5 995 Bought food quantity 7 units. > 413 1 25 \N 995 0 Redeemed bond 7719a1c782a1ba > \. > > > SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id; > > /* > > The sample starting data is shown here in order by Fairian so that it is > perhaps easier to see what is happening for each player. Note that the > result of the deletion should be the last row for each player. > > click | cash_journal_id | fairian_id | debit | credit | balance > | description > -------+-----------------+------------+-------+--------+---------+---------------------------------- > > 36 | 3 | 7 | 0 | 0 | 0 | > Initial cash balance > 37 | 5 | 7 | 9 | | 9 | > Ratified contract fa35e192121eab > 412 | 1 | 7 | 5 | | 14 | Sold > food quantity 7 units. > 36 | 4 | 8 | 0 | 0 | 0 | > Initial cash balance > 37 | 7 | 8 | 8 | | 8 | > Ratified contract f1abd670358e03 > 36 | 5 | 9 | 0 | 0 | 0 | > Initial cash balance > 37 | 9 | 9 | 7 | | 7 | > Ratified contract 1574bddb75c78a > 36 | 14 | 18 | 0 | 0 | 0 | > initial cash balance > 411 | 1 | 25 | 0 | 0 | 0 | > Initial cash balance > 411 | 2 | 25 | 1000 | | 1000 | > Issued bond 7719a1c782a1ba > 412 | 2 | 25 | | 5 | 995 | > Bought food quantity 7 units. > 413 | 1 | 25 | | 995 | 0 | > Redeemed bond 7719a1c782a1ba > (12 rows) > > */ > > > /* > > Here is the current, two-stage solution in use. Is there a way to do it > with a single statement? > > Can you create a solution that retains an arbitrarily specified number > of rows per player? > > */ > BEGIN; > > WITH max_click AS ( > SELECT > cash_journal.fairian_id, > max(cash_journal.click) AS click > FROM cash_journal > GROUP BY cash_journal.fairian_id > ) > delete from cash_journal j > using max_click b > where j.fairian_id = b.fairian_id > and j.click < b.click; > > WITH max_journal_id AS ( > SELECT > cash_journal.fairian_id, > cash_journal.click, > max(cash_journal.cash_journal_id) AS cash_journal_id > FROM cash_journal > GROUP BY cash_journal.fairian_id, cash_journal.click > ) > delete from cash_journal j > using max_journal_id b > where j.fairian_id = b.fairian_id > and j.click = b.click > and j.cash_journal_id < b.cash_journal_id; > > COMMIT; > > SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id; > > /* > > click | cash_journal_id | fairian_id | debit | credit | balance > | description > -------+-----------------+------------+-------+--------+---------+---------------------------------- > > 412 | 1 | 7 | 5 | | 14 | Sold > food quantity 7 units. > 37 | 7 | 8 | 8 | | 8 | > Ratified contract f1abd670358e03 > 37 | 9 | 9 | 7 | | 7 | > Ratified contract 1574bddb75c78a > 36 | 14 | 18 | 0 | 0 | 0 | > initial cash balance > 413 | 1 | 25 | | 995 | 0 | > Redeemed bond 7719a1c782a1ba > (5 rows) > > > */ > > test=> delete from cash_journal where ARRAY[click, cash_journal_id] NOT in (select max(ARRAY[click,cash_journal_id]) from cash_journal group by fairian_id); DELETE 7 test=> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id; click | cash_journal_id | fairian_id | debit | credit | balance | description -------+-----------------+------------+-------+--------+---------+---------------------------------- 412 | 1 | 7 | 5 | | 14 | Sold food quantity 7 units. 37 | 7 | 8 | 8 | | 8 | Ratified contract f1abd670358e03 37 | 9 | 9 | 7 | | 7 | Ratified contract 1574bddb75c78a 36 | 14 | 18 | 0 | 0 | 0 | initial cash balance 413 | 1 | 25 | | 995 | 0 | Redeemed bond 7719a1c782a1ba (5 rows) -- Adrian Klaver adrian.klaver@xxxxxxxxxxx -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general