On Wed, Nov 01, 2006 at 11:27:25AM -0500, Brian Long alleged:
> On Wed, 2006-11-01 at 08:58 -0500, Robert G. Brown wrote:
> > On Wed, 1 Nov 2006, seth vidal wrote:
> > >
> > > rm -f /var/lib/rpm/__db*
> > > rpm --rebuilddb
> >
> > Yes, this was exactly it. yum -d 10 showed the hang was at the rpmdb
> > read, so I could figure it out.
>
> Robert, Seth,
>
> We've had so many problems with the futex issue that our auto-update
> script has a ton of logic in it before calling yum update.
>
> We run "/usr/lib/rpm/rpmdb_stat -c -h /var/lib/rpm | grep 'current locks' | cut -f 1"
> and if the number is not 0, we remove the /var/lib/rpm/__db* files
> and rebuild the RPM database.
>
> It's surprising the number of rpm --rebuilddb's we run on a daily basis
> across 7,000+ RHEL 3 and 4 hosts. A bunch of fixes were put into place
> on RHEL 3 Update 5's rpm, so we've pushed it to all RHEL 3 boxes
> (including those not yet running Update 5+). It reduced the issue, but
> we still have a few hundred rebuilds a day and we haven't been able
> to track it down.

Just to show that it isn't always that bad, I have nearly 2000 RHEL3 hosts
and I've only had to rebuilddb a few times. I think the only real cause
I've seen is an OOM condition that kills yum at exactly the wrong point.

--
Garrick Staples, Linux/HPCC Administrator
University of Southern California
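
For anyone who wants to try the pre-update check Brian describes, here is a minimal sketch of it as a script. It is not his actual auto-update logic, which the thread does not show. The function names (current_locks, needs_rebuild, main) and the --apply switch are made up for illustration, and the sketch assumes rpmdb_stat prints tab-separated "count, description" lines, as the quoted "cut -f 1" implies.

```shell
#!/bin/bash
# Sketch of the lock check quoted above: count current locks in the rpmdb
# and rebuild only when the count is non-zero.
# Assumption: rpmdb_stat prints "<count><TAB><description>" lines.

RPMDB_DIR=/var/lib/rpm
RPMDB_STAT=/usr/lib/rpm/rpmdb_stat

# Read "rpmdb_stat -c" output on stdin and print the current lock count.
# Prints nothing when no "current locks" line is present.
current_locks() {
    grep 'current locks' | cut -f 1 | head -n 1
}

# Succeed (exit 0) only when the argument is a non-zero number.
needs_rebuild() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: leave the db alone
    esac
    [ "$1" -ne 0 ]
}

# Hypothetical entry point; it only runs when the script gets --apply.
main() {
    local locks
    locks=$("$RPMDB_STAT" -c -h "$RPMDB_DIR" | current_locks)
    if needs_rebuild "$locks"; then
        # Same two steps as in the quoted fix.
        rm -f "$RPMDB_DIR"/__db*
        rpm --rebuilddb
    fi
}

if [ "${1:-}" = "--apply" ]; then
    main
fi
```

Run without arguments it only defines the functions and touches nothing; run as root with --apply it performs the check and, if needed, the rebuild. Treating an empty or non-numeric count as "do not rebuild" is a deliberately cautious choice for the sketch, so that a failed rpmdb_stat call never triggers removal of the __db* files.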