Re: Migrating a VM makes its gluster storage inaccessible

Hi James,

Thanks for the quick reply.

We are only using the FUSE-mounted paths at the moment, so libvirt/qemu simply know these files as /gluster/guest.raw, and the guests are not aware of libgluster.

Some version numbers:

Kernel: Ubuntu 3.8.0-35-generic (13.04, Raring)
Glusterfs: 3.4.1-ubuntu1~raring1
qemu: 1.4.0+dfsg-1expubuntu4
libvirt0: 1.0.2-0ubuntu11.13.04.4
The gluster bricks are on xfs.

Regards, Paul Boven.


On 01/21/2014 03:25 PM, James wrote:
Are you using the qemu gluster:// storage or are you using a fuse
mounted file path?

I would actually expect it to work with either, however I haven't had
a chance to test this yet.

It's probably also useful if you post your qemu versions...

James

On Tue, Jan 21, 2014 at 9:15 AM, Paul Boven <boven@xxxxxxx> wrote:
Hi everyone

We've been running glusterfs-3.4.0 on Ubuntu 13.04, using semiosis'
packages. We're using kvm (libvirt) to host guest installs, and thanks to
gluster and libvirt, we can live-migrate guests between the two hosts.

Recently I ran an apt-get update/upgrade to stay up-to-date with security
patches, and this also upgraded our glusterfs to the 3.4.1 version of the
packages.

Since this upgrade (which updated the gluster packages, but also the Ubuntu
kernel package), kvm live migration fails in a most unusual manner. The live
migration itself succeeds, but on the receiving machine, the vm-storage for
that guest becomes inaccessible. This in turn leaves the guest OS unable to
read or write its filesystem, with of course fairly disastrous consequences
for such a guest.

So before a migration, everything is running smoothly. The two cluster nodes
are 'cl0' and 'cl1', and we do the migration like this:

virsh migrate --live --persistent --undefinesource <guest> qemu+tls://cl1/system

The migration itself works, but as soon as you do the migration, the
/gluster/guest.raw file (which holds the filesystem for the guest) becomes
completely inaccessible: trying to read it (e.g. with dd or md5sum) results
in a 'permission denied' on the destination cluster node, whereas the file
is still perfectly fine on the machine that the migration originated from.
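
For anyone who wants to repeat the read test on both nodes, here is a minimal
sketch of the kind of check described above. The function name check_readable
is made up for illustration; it only wraps dd and reads the first 1 MiB, so it
shows whether the file can be opened and read, not whether the whole image is
intact.

```shell
# Hypothetical helper: report whether the start of a file can be read.
check_readable() {
    # Read only the first 1 MiB so the check stays fast on large images.
    if dd if="$1" of=/dev/null bs=1M count=1 2>/dev/null; then
        echo "readable: $1"
    else
        echo "NOT readable: $1"
        return 1
    fi
}
# Example call, to be run on each node: check_readable /gluster/guest.raw
```

Running this on cl0 and cl1 right after a migration should show the difference
between the two nodes if the problem is present.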

As soon as I stop the guest (virsh destroy), the /gluster/guest.raw file
becomes readable again and I can start up the guest on either server without
further issues. It does not affect any of the other files in /gluster/.

The problem seems to be in the gluster or fuse part, because once this error
condition is triggered, the /gluster/guest.raw cannot be read by any
application on the destination server. This situation is 100% reproducible:
every attempted live migration fails in this way.

Has anyone else experienced this? Is this a known or new bug?

We've done some troubleshooting already in the irc channel (thanks to
everyone for their help) but haven't found the smoking gun yet. I would
appreciate any help in debugging and resolving this.

Regards, Paul Boven.
--
Paul Boven <boven@xxxxxxx> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.nl
VLBI - It's a fringe science
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users

