Let me make it more degenerate; maybe you can help me make sense of whether this is a side issue.
I sat down and made a tool to run db_verify against all my systems via mcollective. It's a large suite of systems, and I'm getting a noticeable number of broken ones even just a few days after running a full patch via Spacewalk's direct-rpm method.
Cleaning them up, I noticed _more_ systems showing up as broken.
Then I noticed that db_verify's man page says it doesn't do proper locking. Nice!
So I turn on auditctl watching for anything touching /var/lib/rpm/*, run db_verify in a while-1 loop against all tables to see what happens, and after about a minute (so, 120 calls or so to db_verify later), it starts smoking, segfaulting while opening __db.002.
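For anyone who wants to try reproducing this, here is a rough Python sketch of the loop described above. It is not the exact tool I ran: the names list_tables, run_db_verify and verify_loop are made up for illustration, and calling db_verify with -h pointing at the rpmdb directory is an assumption about the invocation. The audit watch is set up separately and isn't shown. Given what happened here, this is probably something to run only on a scratch host.

```python
import os
import subprocess


def list_tables(dbdir):
    """Return the table files in dbdir, skipping the __db.NNN region files."""
    return sorted(
        name for name in os.listdir(dbdir)
        if not name.startswith("__db.")
        and not name.startswith(".")
        and os.path.isfile(os.path.join(dbdir, name))
    )


def run_db_verify(dbdir, table):
    """Run db_verify on one table and return its exit status.

    Assumption: db_verify is on PATH and is pointed at the rpmdb
    directory with -h. A negative return value means the process was
    killed by a signal (for example -11 for SIGSEGV).
    """
    return subprocess.call(["db_verify", "-h", dbdir, table])


def verify_loop(dbdir, run=run_db_verify, max_rounds=1000):
    """Verify every table repeatedly; stop at the first non-zero status.

    Returns (round_number, table, status) for the first failure, or
    None if max_rounds complete without one. Rounds are counted from 1.
    """
    for round_number in range(1, max_rounds + 1):
        for table in list_tables(dbdir):
            status = run(dbdir, table)
            if status != 0:
                return (round_number, table, status)
    return None
```

Calling verify_loop("/var/lib/rpm") runs the loop against the live database and returns the first table that fails along with its exit status, or None if every round comes back clean. The run parameter exists only so the loop can be exercised with a stand-in command.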
This whole story is giving me the proper horrors of someone who's seeing how the sausage is made for the first time.
It does raise two more questions:
1) Should I expect that db_verify does the opposite of what it's supposed to do now and again - that is, should it, all by its lonesome, occasionally ruin the __db files?
2) Is there a proper way that doesn't have this no-lock-ruins-everything risk for me to check on the current health of an RPM database?
On Mon, Feb 10, 2014 at 2:35 PM, Tristan Smith <triss@xxxxxxxxxxxxxxxxxx> wrote:
Hiya, folks.
I'm having a bit of a time in my CentOS 6 environment with what I'm guessing is some kind of knuckleheaded behavior in one or more of my utilities.
We have Spacewalk and Puppet working in general harmony, but I have a chronic issue with a significant percentage (call it... 10%) of my hosts turning up with rpmdb problems on a regular basis. Not the same hosts every time, either. There's some correlation I'm drawing to relatively idle systems, but it may be BS.
When yum tries to install on a borked system, I get error 12s; db_verify comes up with 'Cannot allocate memory' for Basenames (or sometimes just Packages). rpm --rebuilddb almost universally makes them okayish again, but not entirely; I'm enjoying lost dependencies here and there (yum check dependencies crying into its beer a lot, and I've got an xargs nightmare to re-install the missing packages).
Basically, I've got a handle on an ever-lengthening list of mitigation methods, but what I can't seem to isolate is whodunit. I have no idea what's reaching into the DB hamfisted and making a mess quite so often.
Does anyone have suggestions as to what in hell I should be doing to narrow down causes?