Re: Random data corruption in VM, possibly caused by rbd

On 06/08/2012 06:55 AM, Sage Weil wrote:
On Fri, 8 Jun 2012, Oliver Francke wrote:
Hi Guido,

yeah, there is something weird going on. I just started setting up some
test VMs, freshly imported from running *.qcow2 images:
kernel panics with INIT, segfaults and other "funny" stuff.

I just added rbd_cache=true to my config and, voila, everything is
fast and up and running...
All my earlier testing was done with the cache enabled, since our errors all
came from rbd_writeback in former Ceph versions...

Are you guys able to reproduce the corruption with 'debug osd = 20' and
'debug ms = 1'?  Ideally we'd like to:

  - reproduce from a fresh vm, with osd logs
  - identify the bad file
  - map that file to a block offset (see
    http://ceph.com/qa/fiemap.[ch], linux_fiemap.h)
  - use that to identify the badness in the log
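
For the last two steps: once fiemap has produced a byte offset on the guest's
block device, that offset can be turned into the name of the RADOS object that
should hold it, which gives something concrete to search for in the OSD logs.
A minimal Python sketch, assuming an unstriped format 1 image with the default
4 MiB objects (order 22) and object names of the form
<block_name_prefix>.<12 hex digits>. The function name, the partition_start
parameter and the example prefix are hypothetical; the real prefix and order
should be taken from 'rbd info' for the image in question:

```python
def rbd_object_for_offset(fs_offset, block_name_prefix, order=22, partition_start=0):
    """Map a byte offset reported by fiemap to (object name, offset in object).

    fs_offset        -- physical byte offset on the guest filesystem's device
    partition_start  -- byte offset of that partition inside the RBD image
                        (0 if the filesystem sits directly on the disk)
    order            -- log2 of the object size; 22 (4 MiB) is assumed here
    """
    if fs_offset < 0 or partition_start < 0:
        raise ValueError("offsets must be non-negative")
    image_offset = partition_start + fs_offset
    object_size = 1 << order
    object_no = image_offset // object_size
    offset_in_object = image_offset % object_size
    # Assumed format 1 naming: prefix, a dot, then the object number as
    # 12 zero-padded hex digits. Verify against 'rados ls' before relying on it.
    return f"{block_name_prefix}.{object_no:012x}", offset_in_object


if __name__ == "__main__":
    # Hypothetical prefix and offsets, for illustration only.
    name, off = rbd_object_for_offset(5 * 1024 * 1024 + 100, "rb.0.1",
                                      partition_start=1024 * 1024)
    print(name, off)
```

The resulting object name can then be searched for in the logs written with
'debug osd = 20'.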

I suspect the cache is just masking the problem because it submits fewer
IOs...

The cache also doesn't do sparse reads. Is it still reproducible with
a fresh vm when you set filestore_fiemap_threshold = 0 for the osds,
and run without rbd caching?
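
The OSD-side settings requested above (the two debug levels and the fiemap
threshold) could be collected into one [osd] section. The following Python
sketch only generates such a fragment as text; the helper name is
hypothetical, and where the fragment goes and how the OSDs are restarted
depends on the local setup:

```python
import configparser
import io


def osd_debug_fragment(fiemap_threshold=0):
    """Return an [osd] config fragment with the settings asked for in this thread."""
    cfg = configparser.ConfigParser()
    cfg["osd"] = {
        "debug osd": "20",
        "debug ms": "1",
        "filestore fiemap threshold": str(fiemap_threshold),
    }
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(osd_debug_fragment())
```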

Josh

sage



Josh? Sage? Help?!

Oliver.

On 06/08/2012 02:55 PM, Guido Winkelmann wrote:
On Thursday, 7 June 2012, 12:48:05, you wrote:
On 06/07/2012 11:04 AM, Guido Winkelmann wrote:
Hi,

I'm using Ceph with RBD to provide network-transparent disk images for
KVM-based virtual servers. For the last two days, I've been hunting a weird,
elusive bug where data in the virtual machines gets corrupted in
weird ways. It usually manifests as files having some random data -
usually zeroes - at the start, before the actual contents that should be
in there.
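
A check for this particular symptom can be quite small. The sketch below is
not the iotester mentioned in this thread, just a hypothetical Python
illustration: it compares what was read back against what was written and
reports how many unexpected zero bytes sit at the start.

```python
def leading_zero_run(data: bytes) -> int:
    """Number of consecutive zero bytes at the start of data."""
    return len(data) - len(data.lstrip(b"\x00"))


def describe(actual: bytes, expected: bytes) -> str:
    """Classify a read-back buffer against the data that was written."""
    if actual == expected:
        return "ok"
    extra = leading_zero_run(actual) - leading_zero_run(expected)
    if extra > 0:
        return f"corrupt: {extra} unexpected zero byte(s) at start"
    return "corrupt: contents differ"


if __name__ == "__main__":
    print(describe(b"\x00\x00\x00lo", b"hello"))
```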
I definitely want to figure out what's going on with this.
A few questions:

Are you using rbd caching? If so, what settings?

In either case, does the corruption still occur if you
switch caching on/off? There are different I/O paths here,
and this might tell us if the problem is on the client side.
Okay, I've tried enabling rbd caching now, and so far, the problem appears to
be gone.

I am using libvirt for starting and managing the virtual machines, and what I
did was change the <source> element for the virtual disk from

<source protocol='rbd' name='rbd/name_of_image'>

to

<source protocol='rbd' name='rbd/name_of_image:rbd_cache=true'>

and then restart the VM.
(I found that in one of your mails on this list; there does not appear to be
any proper documentation on this...)
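
For switching the option on and off repeatedly while testing, the same change
to the name attribute can be scripted. A minimal Python sketch, assuming the
disk XML is available as a string; the function name and the surrounding
<disk> element in the example are illustrative, and getting the result back
into libvirt (for instance via virsh) is left out:

```python
import xml.etree.ElementTree as ET


def set_rbd_cache(disk_xml: str, enabled: bool) -> str:
    """Add or remove ':rbd_cache=true' on every rbd <source> name attribute."""
    root = ET.fromstring(disk_xml)
    for source in root.iter("source"):
        if source.get("protocol") != "rbd":
            continue
        # The name is 'pool/image' followed by optional ':key=value' options.
        image, *options = source.get("name", "").split(":")
        options = [o for o in options if not o.startswith("rbd_cache=")]
        if enabled:
            options.append("rbd_cache=true")
        source.set("name", ":".join([image, *options]))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml = ("<disk type='network' device='disk'>"
           "<source protocol='rbd' name='rbd/name_of_image'/>"
           "</disk>")
    print(set_rbd_cache(xml, True))
```

Calling it again with enabled=False restores the original name, which makes it
easy to alternate between the two I/O paths on the same VM definition.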

The iotester does not find any corruptions with these settings.

The VM is still horribly broken, but that's probably lingering filesystem
damage from yesterday. I'll try with a fresh image next.

I did not change anything else in the setup. In particular, the OSDs still use
btrfs. One of the OSDs has been restarted, though. I will run another test with
a VM without rbd caching, to make sure it wasn't by random chance restarting
that one OSD that made the real difference.

Enabling rbd caching did not appear to make any difference wrt performance, but
that's probably because my tests mostly create sustained sequential IO, for
which caches are generally not very helpful.

Enabling rbd caching is not a solution I particularly like, for two reasons:

1. In my setup, migrating VMs from one host to another is a normal part of
operation, and I still don't know how to prevent data corruption (in the form
of silently lost writes) when combining rbd caching and migration.

2. I'm not really looking into speeding up a single VM; I'm more
interested in just how many VMs I can run before performance starts degrading
for everyone, and I don't think rbd caching will help with that.

Regards,
	Guido

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--

Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: S.Grewing | J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh


