Re: Strange deletion problem

Thanks to all who helped; I've eventually solved it. I set up a cron job to monitor the table every minute to help narrow down the time frame when the event happened - the thought of scanning 20+GB of log files for the 6hr period, should it happen again, didn't fill me with joy. I also set log_statement=all and changed log_line_prefix to give me a clue, and additionally added a trigger to log all changes to an audit_table in case I still missed it. I was using row-level triggers as I didn't realise that PostgreSQL had statement-level ones, so you learn something new every day.
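For anyone wanting to do the same, a row-level delete audit trigger along those lines might look like this (a rough sketch, not the actual code I used - the audit table layout and function name here are invented, and the columns are simplified to match the delete statement below):

```sql
-- Sketch only: capture each deleted row, with a timestamp, into an audit table.
create table audit_table (
    deleted_at  timestamptz default now(),
    retailer_id integer,
    product_id  integer
);

create or replace function audit_product_list_delete() returns trigger as $$
begin
    -- OLD holds the row being deleted in a row-level delete trigger
    insert into audit_table (retailer_id, product_id)
        values (old.retailer_id, old.product_id);
    return old;
end;
$$ language plpgsql;

create trigger product_list_audit
    after delete on product_list
    for each row
    execute procedure audit_product_list_delete();
```

With 15k rows going missing, a row-level trigger like this fires 15k times per batch delete, which is exactly why the statement-level variant suggested below is worth knowing about.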

It turns out one of the developers had, for some insane reason, put "delete from product_list" into an overnight batch job. Bizarrely this wasn't appearing in the logs with log_min_duration_statement=0, but it did show up with log_statement=all, which is something else I've learnt.

Right, off to shoot the developer in the leg as we speak...

John
----- Original Message ----- From: "robin" <robin@xxxxxxxxxx>
To: "John Lister" <john.lister-ps@xxxxxxxxxxxxx>
Cc: <pgsql-admin@xxxxxxxxxxxxxx>
Sent: Wednesday, March 31, 2010 8:00 AM
Subject: Re:  Strange deletion problem


You could create a statement-level delete trigger on the relevant table,
then get it to snapshot the contents of the pg_stat_activity view
(which will show all running queries) into some sort of log table.

If you look at the plpgsql documentation (part of the postgresql manual
for your release) there is a section on triggers and an example on
maintaining a summary table; this is sort of the pattern you want to adopt,
except that what you insert into the "summary" table will be something
like:

"insert into my_table_delete_log (select * from pg_stat_activity where
current_query not like '%<IDLE>%');"

If you want to run something similar at the row level instead, beware that
you may then see it trigger 15K times and thus you could end up with 15K
copies of the active queries ...

You might also want to add a sequence column to your log table so you can
be sure of the order in which the queries were inserted, although if
you're using statement-level logging, you should see distinct query and
backend start timestamps anyway.

Just an idea - never tried it myself - so probably best to test it out on
a test database of some sort first!

Cheers,
Robin


On Tue, 30 Mar 2010 22:51:46 +0100, "John Lister"
<john.lister-ps@xxxxxxxxxxxxx> wrote:
2010/3/30 John Lister <john.lister-ps@xxxxxxxxxxxxx>

    Hi, I have a table which is constantly updated throughout the day
    with no problems; I'm running PostgreSQL 8.3.8 on Ubuntu Karmic.
    However, within the last week, for some reason it is being emptied
    overnight and I can't work out why. I've set
    log_min_duration_statement to 0 so that PostgreSQL dumps out every
    query made to the db, but it hasn't brought anything up. During the
    period from when I know the table is ok to the next morning when it
    is empty, I can see entries in the logs for the expected 200 or so
    deletions, but during this period approx 15k rows are removed. The
    odd thing is that there is nothing else in the logs that references
    the table (I've looked for the obvious delete and truncate
    statements).

    The table is modified using JDBC prepared statements, so I see 3
    entries for each of the expected delete statements (parse, bind,
    execute), and the statement is as follows:

    delete from product_list where retailer_id=? and product_id=?


    Therefore I have a few questions:
    - Are there any other statements that could be causing the rows to
    be removed that I've missed?
    - Is there anything that could be deleting them without generating
    a log entry for the statement?
    - Is it possible that corruption of either the index or table is
    making my delete statements remove more rows?
    - Is it possibly a bug?



  Hi,
  - maybe there are some other settings for this database (they are
  not stored in the file); check the pg_settings view in the database
  - maybe you're looking in the wrong log file - they sometimes change,
  depending on the configuration
  - ensure that logging is done to a file, and to THIS file, because
  there are more logging settings than just log_min_duration_statement,
  and sometimes it can be messed up
  - try restarting the database and see if another file isn't created;
  I usually observe that after deleting the current log file, the
  database doesn't recreate it while logging, so the logs are not
  stored.

Cheers for replying. I've checked the config and nothing seems to be
amiss. As I'm running Ubuntu, the defaults seem to be to dump to stderr,
which is somehow redirected to the log file; there doesn't seem to be
any other log file in use, although it is possible the Ubuntu startup
scripts inject the logfile on startup?
Unfortunately I can't restart the database easily; whatever changed
seems to have happened on Friday without a restart, so I'm hoping I can
find and undo it...

John



--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
