On Tue, Sep 10, 2013 at 04:39:56PM +0000, Prantis, Kelsey wrote:
> We have a cluster of 7 KVM vms on a host. The host OS is Fedora 18, and the guest OS is Centos 6.4. Installed kvm/qemu/kernel packages are as follows:
>
> qemu-system-x86-1.2.2-11.fc18.x86_64
> qemu-common-1.2.2-11.fc18.x86_64
> qemu-img-1.2.2-11.fc18.x86_64
> libvirt-daemon-driver-qemu-0.10.2.5-1.fc18.x86_64
> qemu-kvm-1.2.2-11.fc18.x86_64
> ipxe-roms-qemu-20120328-2.gitaac9718.fc18.noarch
> kernel-3.9.4-200.fc18.x86_64
>
> To 4 of the vms we have attached the same 5 lvs to be used as shared storage, with definitions like the below (disk1-disk5):
>
> <disk type='block' device='disk'>
>   <driver name='qemu' type='raw' />
>   <source dev='/dev/vg_00/disk1'/>
>   <target dev='sda' bus='scsi'/>
>   <shareable/>
>   <serial>disk1</serial>
>   <alias name='scsi0-0-0'/>
>   <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> </disk>
>
> Throughout the course of our automated test suite, our tests format the device with an ext4 file system and then immediately mount the file system to write a few files after the format completes. Most of the time this works great. However, some small percentage of the time it is failing on the mount command with "No such device".
>
> Unable to mount /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_disk1: No such device
>
>
> We know that the device does in fact exist and was operable, since the mkfs command just had completed successfully and without error, so I am not sure why suddenly it is returning "No such device" when trying to mount, and only a small percentage of the time. To prove that the device is in fact there, we've tried putting the mount into a retry-loop as a debug measure to show the device is eventually there, and without fail in one of the loop iterations the mount does complete successfully. It seems like there could possibly be some sort of race between closing the device after the mkfs and quickly opening it again for the mount?
>
> We've reproduced this both with directly attached devices, as above, as well as with iscsi devices.

This is weird because the symlinks in /dev/disk/by-*/ just point back to ../../sd*.

The "No such device" error message implies the device node exists on the file system but the kernel thinks a device for that major/minor number is not present.

I wonder if the output of "udevadm monitor" during the mkfs and mount steps shows devices appearing/disappearing? That might explain a race condition.

Can you share your script that runs mkfs and mounts the file system? At which point in the boot process does your script run?

Stefan
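
For anyone trying to reproduce this: the retry-loop debug measure described in the quoted message could look roughly like the sketch below. This is not the original poster's script, which was not shared in this thread. The function name `mount_with_retry`, its parameters, and the default attempt count and delay are all hypothetical. It only records which attempt succeeded and what mount printed on each failure, so the failure rate can be compared against "udevadm monitor" output captured separately.

```python
import subprocess
import time


def mount_with_retry(device, mountpoint, attempts=5, delay=0.5,
                     run=subprocess.run, sleep=time.sleep):
    """Retry `mount device mountpoint` until it succeeds.

    Hypothetical helper for debugging only. Returns a tuple
    (attempt_number, errors) where attempt_number is the 1-based attempt
    that succeeded and errors holds the stderr text of every failed
    attempt before it. Raises RuntimeError if all attempts fail.

    `run` and `sleep` are injectable so the loop can be exercised
    without root privileges or a real block device.
    """
    errors = []
    for attempt in range(1, attempts + 1):
        result = run(["mount", device, mountpoint],
                     capture_output=True, text=True)
        if result.returncode == 0:
            return attempt, errors
        # Keep the message (e.g. "No such device") for later inspection.
        errors.append(result.stderr.strip())
        sleep(delay)
    raise RuntimeError(
        "mount of %s failed after %d attempts: %r" % (device, attempts, errors)
    )
```

A call such as `mount_with_retry("/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_disk1", "/mnt/test")` (the mount point here is an assumed example) would need to run as root right after mkfs completes. If the returned attempt number is greater than 1, the collected error strings show what mount reported while the device was transiently unavailable.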