Re: [patch] latency problem in md driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Friday December 22, Lars.Ellenberg@xxxxxxxxxx wrote:
> 
> md raidX make_request functions strip off the BIO_RW_SYNC flag,
> this introducing additional latency.
> 
> below is a suggested patch for the raid1.c .
> other suggested solutions would be to let the bio_clone do its work,
> and not reassign thereby stripping off all flags.
> at most strip off known unwanted flags (the BARRIER flag).
> 
> similar pattern in the other raid versions.

Thanks.  I think your patch is appropriate.
I don't like the idea of passing down any flags by default.  If a flag
makes failure more likely (like BIO_RW_AHEAD or BIO_RW_FAILFAST) then
I *don't* want it passed down without making a conscious decision that
it is a good idea.

So yes, _SYNC should be passed down in raid1/raid10.  More interesting
stuff would be needed in raid456.
_FAILFAST should probably never be passed down (well, maybe in
multipath, but who uses md/multipath??)
_META ... what is that ?? I'm not passing it down until I know!!

I'll look into this after the holidays.  Meanwhile if you want to be
certain that this is in 2.6.20,
  - Fix the retry-read case as well (where ->bi_rw is assigned about
    10 lines from the end of raid1d()
  - Create a 'perfect patch',
  - add Acked-by: NeilBrown <neilb@xxxxxxx>
  - and post it to akpm@xxxxxxxx (and appropriate lists).

Thanks,
NeilBrown

> 
> cheers,
> 
> 	Lars
> 
> 
> --- /mnt/kernel-src/linux-2.6.19/drivers/md/raid1.c.orig	2006-12-11 10:06:17.661776243 +0100
> +++ /mnt/kernel-src/linux-2.6.19/drivers/md/raid1.c	2006-12-12 11:09:55.975762364 +0100
> @@ -776,6 +776,7 @@ static int make_request(request_queue_t 
>  	struct page **behind_pages = NULL;
>  	const int rw = bio_data_dir(bio);
>  	int do_barriers;
> +	int do_sync = bio_sync(bio);
>  
>  	/*
>  	 * Register the new request and wait if the reconstruction
> @@ -835,7 +836,7 @@ static int make_request(request_queue_t 
>  		read_bio->bi_sector = r1_bio->sector + mirror->rdev->data_offset;
>  		read_bio->bi_bdev = mirror->rdev->bdev;
>  		read_bio->bi_end_io = raid1_end_read_request;
> -		read_bio->bi_rw = READ;
> +		read_bio->bi_rw = READ | do_sync;
>  		read_bio->bi_private = r1_bio;
>  
>  		generic_make_request(read_bio);
> @@ -906,7 +907,7 @@ static int make_request(request_queue_t 
>  		mbio->bi_sector	= r1_bio->sector + conf->mirrors[i].rdev->data_offset;
>  		mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
>  		mbio->bi_end_io	= raid1_end_write_request;
> -		mbio->bi_rw = WRITE | do_barriers;
> +		mbio->bi_rw = WRITE | do_barriers | do_sync;
>  		mbio->bi_private = r1_bio;
>  
>  		if (behind_pages) {
> 
> 
> -- 
> : Lars Ellenberg                            Tel +43-1-8178292-55 :
> : LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
> : Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux