2011-06-09 (ë), 12:21 +1000, NeilBrown: > On Thu, 9 Jun 2011 03:04:33 +0900 Namhyung Kim <namhyung@xxxxxxxxx> wrote: > > > When performing a recovery, only first 2 slots in r10_bio are in use, > > for read and write respectively. However all of pages in the write bio > > are never used and just replaced to read bio's when the read completes. > > > > Get rid of those unused pages and share read pages properly. > > > > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxx> > > --- > > drivers/md/raid10.c | 13 ++++++++++--- > > 1 files changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c > > index a53779ffdf89..621594981339 100644 > > --- a/drivers/md/raid10.c > > +++ b/drivers/md/raid10.c > > @@ -116,6 +116,13 @@ static void * r10buf_pool_alloc(gfp_t gfp_flags, void *data) > > goto out_free_bio; > > r10_bio->devs[j].bio = bio; > > } > > + > > + /* > > + * We can share bv_page's during the recovery > > + */ > > + if (!test_bit(MD_RECOVERY_SYNC, &conf->mddev->recovery)) > > + nalloc--; > > + > > /* > > * Allocate RESYNC_PAGES data pages and attach them > > * where needed. > > @@ -1363,16 +1370,16 @@ static void recovery_request_write(mddev_t *mddev, r10bio_t *r10_bio) > > int i, d; > > struct bio *bio, *wbio; > > > > - > > - /* move the pages across to the second bio > > + /* > > + * share the pages with the first bio > > * and submit the write request > > */ > > bio = r10_bio->devs[0].bio; > > wbio = r10_bio->devs[1].bio; > > for (i=0; i < wbio->bi_vcnt; i++) { > > struct page *p = bio->bi_io_vec[i].bv_page; > > - bio->bi_io_vec[i].bv_page = wbio->bi_io_vec[i].bv_page; > > wbio->bi_io_vec[i].bv_page = p; > > + get_page(p); > > } > > d = r10_bio->devs[1].devnum; > > > > > Thanks. Interesting idea, but I don't think this code is safe. > > We end up calling bio_add_page on with an uninitialised 'page' (in > sync_request()). This could result on BIOVEC_PHYS_MERGEABLE doing funny > things. > Arrrh, right. I missed that part. > It would be OK to set up the two links to the one page in r10buf_pool_alloc, > so recovery_request_write doesn't need to do anything with pages. > That would probably be even more safe than the current code. > Agreed, will resend v2. Thanks for the review. -- Regards, Namhyung Kim -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html