On 02/05/2013 12:32 AM, Richard W.M. Jones wrote:
On Mon, Feb 04, 2013 at 07:17:35PM +0200, Panu Matilainen wrote:
On 02/04/2013 07:01 PM, Richard W.M. Jones wrote:
On Mon, Feb 04, 2013 at 04:38:08PM +0000, Richard W.M. Jones wrote:
Cleanup : cpp-4.8.0-0.7.fc19.x86_64 215/262
Cleanup : gdb-7.5.50.20130118-2.fc19.x86_64 216/262
Cleanup : 1:findutils-4.5.10-7.fc19.x86_64 217/262
Cleanup : spice-server-0.12.2-2.fc19.x86_64 218/262
Cleanup : cracklib-2.8.22-2.fc19.x86_64 219/262
Cleanup : libvirt-daemon-driver-interface-1.0.1-6.fc19.x86_64 220/262
Cleanup : libvirt-daemon-driver-nodedev-1.0.1-6.fc19.x86_64 221/262
Cleanup : libvirt-daemon-driver-nwfilter-1.0.1-6.fc19.x86_64 222/262
Cleanup : libvirt-daemon-driver-secret-1.0.1-6.fc19.x86_64 223/262
Cleanup : libvirt-daemon-1.0.1-6.fc19.x86_64 224/262
Cleanup : libvirt-client-1.0.1-6.fc19.x86_64 225/262
Cleanup : cyrus-sasl-2.1.25-2.fc19.x86_64 226/262
Cleanup : openldap-2.4.33-3.fc19.x86_64 227/262
Cleanup : nss-tools-3.14.1-3.fc19.x86_64 228/262
Cleanup : nss-sysinit-3.14.1-3.fc19.x86_64 229/262
Cleanup : nss-3.14.1-3.fc19.x86_64 230/262
(and here it hangs, for at least 20 minutes)
So how odd is this? Suddenly it leaps back into life, after maybe
30-40 minutes.
Sounds like https://bugzilla.redhat.com/show_bug.cgi?id=860500
Yes, this looks similar.
It's possible that I ran a non-root yum command in another terminal.
A non-root yum/rpm/similar command wouldn't do it. Only processes running
as root can participate in the shared-environment locking (those __db.*
files); others use a "private environment", which pretty much amounts to
no locking at all.
So whatever it is that causes the jam is running as root, and equally
only a root process can unjam it. It could even be the same thing that
caused the jam running again: it's quite clearly something that runs
automatically in the background, more or less periodically,
occasionally exiting or crashing without freeing the rpmdb iterator it
holds. Whether it's time-based or triggered by some other "external"
event I dunno. And when it causes a jam, it's either already running when
yum is started, or has started after yum.
Rpm uses Berkeley DB's "Concurrent Data Store" model for its database.
This is a simple model which supposedly provides deadlock-free
operation without the caller having to bother with explicit locking, but
unfortunately this only works when all callers are well-behaved. Not
entirely unlike multitasking in Windows 3.x... All it takes is a single
buggy application forgetting to release its rpmdb iterators (or crashing
while holding them) to block a concurrent writer "forever". Stale locks
from no-longer-active processes are cleaned up automatically, but only on
rpmdb open, so a potentially long-running application like yum can get
stuck if the bad apple comes along after yum has started.
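
To make the failure mode concrete, here is a toy many-readers/one-writer
lock in Python. It is only a sketch of the behaviour described above, not
Berkeley DB's or rpm's actual implementation; the class name ToyCdsLock,
its methods, and the "buggy-app" owner are all made up for illustration.

```python
import threading


class ToyCdsLock:
    """Toy many-readers/one-writer lock (hypothetical, illustration only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = set()   # owners currently holding a read "iterator"
        self._writer = False    # True while a writer holds the lock

    def acquire_read(self, owner):
        with self._cond:
            # Readers only have to wait for an active writer.
            self._cond.wait_for(lambda: not self._writer)
            self._readers.add(owner)

    def release_read(self, owner):
        with self._cond:
            self._readers.discard(owner)
            self._cond.notify_all()

    def try_acquire_write(self, timeout):
        """Wait up to `timeout` seconds; return True if the write lock was taken."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and not self._readers, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


lock = ToyCdsLock()
lock.acquire_read("buggy-app")                 # iterator opened, never freed
blocked = not lock.try_acquire_write(timeout=0.1)

lock.release_read("buggy-app")                 # the reader finally lets go
unblocked = lock.try_acquire_write(timeout=0.1)
```

With the reader still registered, the writer times out; once the reader
releases, the same call succeeds. A real writer without a timeout would
simply sit there, which is what the yum hang looks like from the outside.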
Come to think of it, it should be possible to have rpm check for stale
locks when opening write cursors. That would at least help in some of the
cases (where the bad caller has already exited or died), but it'd still be
"vulnerable" to long-running processes hanging on to iterators.
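
The stale-lock idea can be sketched roughly as follows, assuming a POSIX
system and assuming the lock table records the PID of each holder. The
helpers pid_alive and drop_stale and the holders dictionary are
hypothetical names for illustration, not rpm or Berkeley DB APIs.

```python
import os
import subprocess
import sys


def pid_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)          # signal 0: existence check, nothing is sent
    except ProcessLookupError:
        return False             # no such process, so its locks are stale
    except PermissionError:
        return True              # exists, but belongs to another user
    return True


def drop_stale(holders):
    """Keep only the lock entries whose owning process is still alive."""
    return {pid: what for pid, what in holders.items() if pid_alive(pid)}


# Simulate a caller that died without releasing its iterator.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()

holders = {
    os.getpid(): "read iterator",
    child.pid: "read iterator (leaked by a dead process)",
}
live = drop_stale(holders)
```

Running such a check before a writer starts waiting would clear the entry
left by the dead child, but the entry for a process that is alive and just
never lets go stays put, which is the remaining weak spot noted above.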
- Panu -