vgmknodes --refresh blocks forever waiting on a semaphore

Joel Friedly <joelfriedly@gmail.com> · Thu, 19 Feb 2015 14:19:33 -0800

I'm trying to do some disk failure testing, and we're using LVM on top of raw disks.  After replacing the disk, the LV on the disk is unreadable until I run vgmknodes --refresh.  That command hangs forever, but I can kill -9 it.  Once the command has been run (and killed), everything works again and I can read the LV.

I've seen this twice, so I ran strace the second time.  The output is here:  https://gist.github.com/jfriedly/50fe9134c4bc616f9f90 (Ctrl-F for "425989" to find the semaphore).

On line 4250, LVM sets the semaphore's value to 1, then it immediately checks the semaphore's value and confirms that it's 1.

On line 4253, LVM increments the semaphore's value to 2, then it immediately checks the semaphore's value and confirms that it's 2.

On line 4295, LVM gets the semaphore's value and sees that it's 2, then it immediately decrements the value to 1 and then waits indefinitely for the value to hit 0.
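
For anyone who wants to replay that sequence outside LVM, here is a minimal sketch in C. It is not LVM's code: it only assumes the semaphore is an ordinary System V semaphore driven through semctl() and semop(), and it uses a private semaphore rather than the one in the strace. The function name replay_semaphore_sequence is made up for this example. The final wait-for-zero is issued with IPC_NOWAIT so the sketch reports that the call would block instead of hanging; it assumes Linux, where the caller has to define union semun.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller must define union semun for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Replays the set-to-1, increment, decrement, wait-for-zero sequence on a
 * private semaphore (hypothetical helper, not part of LVM).
 * Returns 1 if the wait-for-zero would block, 0 if it would return
 * immediately, and -1 on error. On a 0 or 1 return, *final_value holds the
 * semaphore's value at the time of the wait.
 */
int replay_semaphore_sequence(int *final_value)
{
    int result = -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    union semun arg = { .val = 1 };
    struct sembuf inc = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
    struct sembuf dec = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    /* sem_op == 0 means "wait until the value is 0"; IPC_NOWAIT turns the
     * wait into an immediate EAGAIN failure when the value is nonzero. */
    struct sembuf wait_zero = { .sem_num = 0, .sem_op = 0, .sem_flg = IPC_NOWAIT };

    if (semctl(semid, 0, SETVAL, arg) < 0)   /* value is now 1 */
        goto out;
    if (semop(semid, &inc, 1) < 0)           /* value is now 2 */
        goto out;
    if (semop(semid, &dec, 1) < 0)           /* value is now 1 */
        goto out;

    if (semop(semid, &wait_zero, 1) == 0)
        result = 0;                          /* value was already 0 */
    else if (errno == EAGAIN)
        result = 1;                          /* a blocking wait would hang */

    *final_value = semctl(semid, 0, GETVAL);

out:
    semctl(semid, 0, IPC_RMID);              /* remove the private semaphore */
    return result;
}
```

With nothing else decrementing the semaphore, the value is still 1 when the wait is issued, so a blocking wait-for-zero like the one in the strace can only return once some other process performs the last decrement.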

Is LVM expecting some other process to decrement the semaphore?  Is this a bug in vgmknodes --refresh?  Running without the refresh flag doesn't block forever, but it also doesn't make the LV readable.

System Info:

Ubuntu 12.04
Kernel 3.13.0-39-generic
LVM 2.02.95-4ubuntu1
The disk is part of a VG named "vg.nebula.alexandria" and it is dedicated to an LV called "alexandria.tlog".

Thanks for your help guys, and let me know if you need any more debugging info,
Joel