On Wed, 1 Nov 2006, Brian Long wrote:

> On Wed, 2006-11-01 at 08:58 -0500, Robert G. Brown wrote:
>> On Wed, 1 Nov 2006, seth vidal wrote:
>>
>>> rm -f /var/lib/rpm/__db*
>>> rpm --rebuilddb
>>
>> Yes this was exactly it.  yum -d 10 showed the hang was at the rpmdb
>> read, so I could figure it out.
>
> Robert, Seth,
>
> We've had so many issues with the futex issue that our auto-update
> script has a ton of logic in it before calling yum update.
>
> We run "/usr/lib/rpm/rpmdb_stat -c -h /var/lib/rpm | grep 'current
> locks' | cut -f 1" and if the number is not 0, we remove
> the /var/lib/rpm/__db* files and rebuild the RPM database.

In my case it was probably a self-inflicted wound.  One thing about yum
that periodically drives me bananas is that it doesn't (IMO) handle
Ctrl-C sanely.  So if I (for example) start a massive yum update on my
laptop and then have to go somewhere and need to shut it down, or
realize that I mistyped something and will have to wait three or four
minutes for yum to do its usual routine (find a repo, download any
changed xml stuff, and then actually execute the command) before I get
to re-enter the command CORRECTLY, I have to suspend and kill -9 or
wait anyway.

kill -9 of course in turn means that things don't always get cleaned
up -- last night obviously I interrupted under the latter circumstance
and left behind lockfiles, and it must have been on the very last yum
command I ran because it worked fine up to that point.

What I'd really like is:

a) Automated handling of locks -- a check for existing locks that bombs
with a message, and/or a flag that says "nuke any locks and rebuild the
rpmdb as needed".  As I said, GUI-level (non-sysadmin) users are NOT
going to be able to remove locks and rebuild the db, ever, so for yum
to be usable via a GUI by general users it basically can never lock up.
Really it should "never" lock up on the basis of this sort of state
even when run from a command line (so this is a bug, not just an
annoying feature), and especially not "silently", so that you have to
wait a really long time to see that it is hung and not just thinking or
waiting on a network resource.

b) Ctrl-C should cause yum to exit gracefully, no matter what it is
doing, at the earliest possible time, and with immediate messages
saying "yes, I got your command to quit, be patient while I clean up"
as it works itself free of locks, interrupts downloads, etc.  It should
NOT cause fallthrough on the repository being used for a download, as
it now does.  Every similar (non-interactive) application in the known
universe quits on Ctrl-C.

c) Some OTHER key sequence can cycle through the repos -- I'd suggest
Ctrl-N(ext)/Ctrl-P(revious) for Next/Previous repository, or some other
hook that can easily be connected to yumex.  That way slow repos, hung
repos, and slow networks can be skipped in real time without forcing a
yum quit, but one can still force a yum quit more easily than with
eight million Ctrl-C's or Ctrl-Z followed by kill -9.

> It's surprising the number of rpm --rebuilddb's we run on a daily basis
> across 7,000+ RHEL 3 and 4 hosts.  A bunch of fixes were put into place
> on RHEL 3 Update 5's rpm, so we've pushed it to all RHEL 3 boxes
> (including those not yet running Update 5+).  It reduced the issue, but
> we still have a few hundred rebuilds a day and we haven't been able
> to track it down.

Well, maybe the locks were NOT because of interrupting yum.  Maybe
there are bugs somewhere else.  Either way, yum can be made robust
against the locks, I think.  The thing one would have to worry about is
cron job A running yum, or rpm, at some time while a user or sysadmin B
is simultaneously trying to run yum, or rpm, from a command line or
other interface or script.
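
For concreteness, here is a rough sketch of the kind of pre-update
check Brian describes, in Python since that is what yum is written in.
It is only an illustration, not his script and not anything yum
actually does.  The paths and commands are the ones quoted above; the
function names are made up, and I'm assuming the rpmdb_stat output has
lines of the form count<TAB>description, which is what the "cut -f 1"
implies.  Doing nothing when the count can't be read is my choice, not
something from his description.

```python
import subprocess

RPMDB_DIR = "/var/lib/rpm"               # path quoted in Brian's message
RPMDB_STAT = "/usr/lib/rpm/rpmdb_stat"   # path quoted in Brian's message


def current_locks(stat_output):
    """Return the 'current locks' count from `rpmdb_stat -c` output.

    Roughly mirrors `grep 'current locks' | cut -f 1`: take the first
    tab-separated field of the first matching line.  Returns None if
    no line matches or the field is not a plain number.
    """
    for line in stat_output.splitlines():
        if "current locks" in line:
            field = line.split("\t", 1)[0].strip()
            return int(field) if field.isdigit() else None
    return None


def repair_commands(lock_count):
    """Commands to run before `yum update`; empty list if none needed."""
    if not lock_count:
        # 0 locks, or a count we could not read (assumption: when in
        # doubt, leave the database alone rather than nuke it).
        return []
    # Nonzero count: the remedy Seth suggested earlier in the thread.
    return [f"rm -f {RPMDB_DIR}/__db*", "rpm --rebuilddb"]


def check_and_repair(run=subprocess.run):
    """Run the check for real.  Needs root on an rpm-based system."""
    result = run([RPMDB_STAT, "-c", "-h", RPMDB_DIR],
                 capture_output=True, text=True)
    commands = repair_commands(current_locks(result.stdout))
    for command in commands:
        # shell=True so that the __db* glob is expanded by the shell.
        run(command, shell=True, check=True)
    return commands
```

A cron wrapper would call check_and_repair() and then go on to run yum
update.  Note that this does nothing about the race described above: if
another rpm or yum is legitimately running when the check fires, the
locks it sees are not stale, and removing the __db* files out from
under that process is exactly the sort of thing one has to worry about.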
yum has its own lock, which is good (it seems to work whenever I've
accidentally challenged it), but it sounds like it can maybe still get
into a race condition with rpm, which presumably has its own
independent locks.  Usually there is a pattern of conflict resolution
while requesting both locksets that can prevent this sort of race, if
it is what is leaving the locks behind.

   rgb

>
> /Brian/
>

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@xxxxxxxxxxxx