Erik Jones wrote:
On Aug 27, 2007, at 4:44 PM, Kamil Srot wrote:
Also, in your original post you mentioned a "proprietal CMS
system". Is this proprietary to your company or one that you've
purchased? The fact that the same table going on multiple dbs all
being run by that CMS system certainly makes it worthy of suspicion.
This is software developed in our company... so I'm sure it's not
doing any schema manipulation. I'm actually the senior developer of this
project, by accident :-)
That doesn't mean that there's not some whacked out race condition
causing corruption. I'm just saying, keep exploring every option.
Saying, "I wrote and am in charge of that so I know it didn't do it"
is bad. You should at the very least have someone else, as well as
you, review any code that touches that table.
The strange thing is, all the projects are completely independent...
each has its own DB, folder with scripts, different data... just the DB
user is the same... so it's highly improbable that it would make 2 similar
errors in two distinct databases at nearly the same time...
Just the DB and your CMS. Given that setup, if your CMS is causing
this to happen on one DB, it actually is not unlikely that it would do
it to the others.
OK, sure, I cannot put my hand in the fire for it, but after searching
the sources for DDL commands, and also because of the nature of the
application, I simply do not believe this explanation. OK, I have the logs
on and I do log everything that goes to the server in a timestamped
logfile...
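
A pass over such a timestamped statement log for DDL could be sketched roughly as below. This is only a sketch under assumptions: the helper name find_ddl, the keyword list, and the sample lines (dates, user, and database names) are hypothetical, and the line layout is assumed to be "<prefix> LOG:  statement: <SQL>". It only catches statements whose first keyword is in the list, so DDL issued from inside functions would not show up.

```python
import re

# Assumed log layout: "<prefix> LOG:  statement: <SQL text>".
# The prefix (timestamp, user, database) is whatever the server is configured
# to write; nothing below depends on its exact shape.
# TRUNCATE is included because it also makes data vanish, although it is not
# schema manipulation in the strict sense.
DDL_RE = re.compile(
    r"statement:\s*(DROP|ALTER|CREATE|TRUNCATE)\b", re.IGNORECASE
)


def find_ddl(lines):
    """Return (line_number, text) pairs for log lines whose statement
    starts with one of the keywords above."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if DDL_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "2007-08-27 16:44:01 cms site1 LOG:  statement: SELECT 1",
        "2007-08-27 16:44:02 cms site1 LOG:  statement: drop table articles",
    ]
    for number, text in find_ddl(sample):
        print(number, text)
```

Run against the real logfile (for example via open(path) as the lines argument), an empty result would support the claim that the application issues no DDL, at least for the period the log covers.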
When this problem appeared for the first time, I clearly had the
wraparound problem... I did vacuum it and partially restored the
data... but in the meantime, I had commands like \dt showing all
relations twice... (some system catalog problem)... then I did a full
dump and restore along with an upgrade to the newest pgsql server
software... this duplication was gone and never appeared again.
From the above-mentioned duplication of relation names and what Tom wrote
recently (it doesn't seem like a wraparound problem), it looks like the
relation name is/gets corrupted in some way and this corruption is
internally carried over to another instance of a relation named the same
but in another database... but I know - it's too speculative.
Is it the same instance of your CMS managing each of the databases or
a separate instance per DB?
Every DB has its own web domain and folder structure with a copy of the
script files... nothing is shared.
Also, how often has this happened?
It happened, I think, four times... the first time about 4 months ago and
then 3 more times, about 20 to 30 days apart... the last time yesterday.
Then I noticed this problem one more time today in just one database (one
of the two affected yesterday).
Since wraparound has been ruled out, it's hard to say what else could
be the culprit or what to look at and do next without any more
specific details about the system at the time(s) this has happened.
What kind of monitoring do you have set up on your DBs?
Starting 2 hours ago, I have complete logs of all commands along with the
user and database they are applied to. For the past I have just the
general logs w/o timestamps and DB names... so hard to read...
Have you verified that the table's files are still on disk after
it's "disappeared"?
I do not have any idea how to do it... I wasn't able to access it using
any DML/DDL commands... I can try it on a binary backup of the damaged DB
if you'll guide me...
Thank you,
--
Kamil