Re: raid10 make_request failure during iozone benchmark upon btrfs

On Mon, 02 Jul 2012 03:58:57 +0100 Kerin Millar <kerframil@xxxxxxxxx> wrote:

> Hi Neil,
> 
> On 02/07/2012 03:52, NeilBrown wrote:
> > On Mon, 02 Jul 2012 03:34:16 +0100 Kerin Millar<kerframil@xxxxxxxxx>  wrote:
> >
> >> >  Hello,
> >> >
> >> >  I'm running a 4-way RAID-10 array with the f2 layout scheme on a 3.5-rc5
> > I thought I fixed this in 3.5-rc2.
> > Maybe there is another bug....
> >
> > Could you please double check that you are running a kernel with
> >
> > commit aba336bd1d46d6b0404b06f6915ed76150739057
> > Author: NeilBrown<neilb@xxxxxxx>
> > Date:   Thu May 31 15:39:11 2012 +1000
> >
> >      md: raid1/raid10: fix problem with merge_bvec_fn
> >
> > in it?
> 
> I am indeed. I searched the list beforehand and noticed the patch in
> question. Not sure which -rc it landed in but I checked my source tree
> and it's definitely in there.
> 
> Cheers,
> 
> --Kerin

Thanks.
Looking at it again I see that it is definitely a different bug; that patch
wouldn't affect it.

But I cannot see what could possibly be causing the problem.
You have a 256K chunk size, so requests should be limited to 512 sectors
aligned at a 512-sector boundary.
However, all the requests that are causing errors are 512 sectors long but
aligned on a 256-sector boundary (one that is not also a 512-sector
boundary).  This is wrong.
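
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code). The helper names chunk_sectors and fits_in_chunk are
hypothetical, and it assumes 512-byte sectors and a power-of-two chunk size.
It shows why a 512-sector request starting at a 256-sector offset must
cross a chunk boundary:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: sectors are 512 bytes, as in the block layer. */
#define SECTOR_SIZE 512u

/* Convert a chunk size in bytes to sectors: 256 KiB -> 512 sectors. */
uint64_t chunk_sectors(uint64_t chunk_bytes)
{
    return chunk_bytes / SECTOR_SIZE;
}

/*
 * Return true if the request [start, start + len), measured in sectors,
 * stays inside a single chunk. chunk_sects must be a power of two so
 * that the offset within the chunk can be taken with a mask.
 */
bool fits_in_chunk(uint64_t start, uint64_t len, uint64_t chunk_sects)
{
    assert(chunk_sects != 0 && (chunk_sects & (chunk_sects - 1)) == 0);

    uint64_t offset = start & (chunk_sects - 1); /* offset inside the chunk */
    return offset + len <= chunk_sects;
}
```

With a 512-sector chunk, fits_in_chunk(1024, 512, 512) holds, while
fits_in_chunk(256, 512, 512) does not: the request starts halfway into one
chunk and spills 256 sectors into the next.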

It could be that btrfs is submitting bad requests, but I think it always uses
bio_add_page, and bio_add_page appears to do the right thing.
It could be that dm-linear is causing the problem, but it seems to correctly
ask the underlying device for alignment, and reports that alignment to
bio_add_page.
It could be that md/raid10 is the problem, but I cannot find any fault in
raid10_mergeable_bvec - it performs much the same tests that the
raid10 make_request function does.
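
As a rough illustration of the kind of test involved, the following is a
userspace model only, not the actual raid10_mergeable_bvec. The name
mergeable_bytes and its parameters are hypothetical, and it assumes
512-byte sectors and a power-of-two chunk size. It computes how many more
bytes a request may take before it would cross a chunk boundary, clamping
a negative result to zero:

```c
#include <stdint.h>

/*
 * Hypothetical model: a request starts at sector `start` and already
 * holds `cur_sectors` sectors. Return the number of bytes that can
 * still be added without crossing a chunk boundary.
 */
int64_t mergeable_bytes(uint64_t start, uint64_t cur_sectors,
                        uint64_t chunk_sects)
{
    /* Sectors used within the chunk, counted from the chunk start. */
    uint64_t used = (start & (chunk_sects - 1)) + cur_sectors;

    /* Remaining room in bytes; negative if the request already overruns. */
    int64_t max = ((int64_t)chunk_sects - (int64_t)used) * 512;

    if (max < 0)
        max = 0; /* a negative value here means the request is already bad */
    return max;
}
```

In this model a request starting at sector 256 of a 512-sector chunk is
offered 131072 bytes (256 sectors) at most, so a submitter that honours the
limit could never build the 512-sector requests seen in the errors.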

So it is a mystery.

Is this failure repeatable?

If so, could you please insert
   WARN_ON_ONCE(1);
in drivers/md/raid10.c where it prints out the message: just after the
"bad_map:" label.

Also, in raid10_mergeable_bvec, insert 
   WARN_ON_ONCE(max < 0);
just before
		if (max < 0)
			/* bio_add cannot handle a negative return */
			max = 0;

and then see if either of those generates a warning, and post the full stack
trace if it does.

Thanks,
NeilBrown


