Re: Random data corruption in VM, possibly caused by rbd

On Saturday, 9 June 2012, 20:04:20, Sage Weil wrote:
> On Fri, 8 Jun 2012, Guido Winkelmann wrote:
> > On Friday, 8 June 2012, 07:50:36, Josh Durgin wrote:
> > > On 06/08/2012 06:55 AM, Sage Weil wrote:
> > > > On Fri, 8 Jun 2012, Oliver Francke wrote:
> > > >> Hi Guido,
> > > >> 
> > > > >> yeah, there is something weird going on. I just started to establish
> > > > >> some test-VM's. Freshly imported from running *.qcow2 images.
> > > >> Kernel panic with INIT, seg-faults and other "funny" stuff.
> > > >> 
> > > >> Just added the rbd_cache=true in my config, voila. All is
> > > >> fast-n-up-n-running...
> > > > >> All my testing was done with cache enabled... Since our errors all
> > > > >> came from rbd_writeback from former ceph-versions...
> > > > 
> > > > > Are you guys able to reproduce the corruption with 'debug osd = 20'
> > > > > and 'debug ms = 1'?  Ideally we'd like to:
> > > >   - reproduce from a fresh vm, with osd logs
> > > >   - identify the bad file
> > > > >   - map that file to a block offset (see
> > > > >     http://ceph.com/qa/fiemap.[ch], linux_fiemap.h)
> > > >   - use that to identify the badness in the log
> > > > 
> > > > > I suspect the cache is just masking the problem because it submits
> > > > > fewer IOs...
> > > 
> > > The cache also doesn't do sparse reads. Is it still reproducible with
> > > a fresh vm when you set filestore_fiemap_threshold = 0 for the osds,
> > > and run without rbd caching?
> > 
> > I have set filestore_fiemap_threshold = 0 on all osds and restarted them.
> > The problem is still there, and so bad I cannot even run this fiemap
> > utility that Sage posted. I guess I should have tried booting the VM from
> > a livecd instead...
> 
> Whoops,
> 
> 	filestore fiemap threshold = 0
> 
> doesn't turn it off, but
> 
> 	filestore fiemap = false

Okay, I changed "filestore fiemap threshold = 0" to "filestore fiemap = false" 
under [osd]. So far, the problem does not seem to resurface.
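
For reference, the change described above would look roughly like this in
ceph.conf. This is only a sketch: the [osd] section and both option spellings
come from this thread, while the comment and the layout are illustrative.

```ini
[osd]
	; previously: filestore fiemap threshold = 0
	; (per Sage's note above, that setting does not turn fiemap off)
	filestore fiemap = false
```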

	Guido
