On Mon, Aug 16, 2021 at 02:27:48PM +0800, Guoqing Jiang wrote:
> Hi Ming and Christoph,
>
> On 8/14/21 4:57 PM, Ming Lei wrote:
> > On Sat, Aug 14, 2021 at 08:55:21AM +0100, Christoph Hellwig wrote:
> > > On Fri, Aug 13, 2021 at 04:38:59PM +0800, Guoqing Jiang wrote:
> > > > Ok, thanks.
> > > >
> > > > > In general the size of a bio only depends on the number of vectors, not
> > > > > the total I/O size. But alloc_behind_master_bio allocates new backing
> > > > > pages using order 0 allocations, so in this exceptional case the total
> > > > > size does actually matter.
> > > > >
> > > > > While we're at it: this huge memory allocation looks really deadlock
> > > > > prone.
> > > >
> > > > Hmm, let me think more about it, or could you share your thought?
> > >
> > > Well, you'd need a mempool which can fit the max payload of a bio,
> > > that is BIO_MAX_VECS pages.
>
> IIUC, the behind bio is allocated from the bio_set (mddev->bio_set), which is
> allocated in md_run by calling bioset_init, so the mempool (bvec_pool) of this
> bio_set is created by biovec_init_pool, which uses the global biovec slabs.
> Do we really need another mempool? Or perhaps there is no potential deadlock
> in this case.
>
> > > FYI, this is what I'd do instead of this patch for now. We don't really
> > > need a vector per sector, just per page. So this limits the I/O
> > > size a little less.
> > >
> > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > index 3c44c4bb40fc..5b27d995302e 100644
> > > --- a/drivers/md/raid1.c
> > > +++ b/drivers/md/raid1.c
> > > @@ -1454,6 +1454,15 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> > >  		goto retry_write;
> > >  	}
> > >
> > > +	/*
> > > +	 * When using a bitmap, we may call alloc_behind_master_bio below.
> > > +	 * alloc_behind_master_bio allocates a copy of the data payload a page
> > > +	 * at a time and thus needs a new bio that can fit the whole payload
> > > +	 * of this bio in page sized chunks.
> > > +	 */
>
> Thanks for the above, will copy it accordingly. I will check if WriteMostly
> is set first, then check both the flag and the bitmap.
>
> > > +	if (bitmap)
> > > +		max_sectors = min_t(int, max_sectors, BIO_MAX_VECS * PAGE_SIZE);
> >
> > s/PAGE_SIZE/PAGE_SECTORS
>
> Agree.
>
> > > +
> > >  	if (max_sectors < bio_sectors(bio)) {
> > >  		struct bio *split = bio_split(bio, max_sectors,
> > >  					      GFP_NOIO, &conf->bio_split);
>
> > Here the limit is max single-page vectors, and the above way may not work,
> > such as:
> >
> > 0 ~ 254: each bvec's length is 512
> > 255: bvec's length is 8192
> >
> > the total length is just 512*255 + 8192 = 138752 bytes = 271 sectors, but it
> > still may need 257 bvecs, which can't be allocated via bio_alloc_bioset().
>
> Thanks for looking deeper! I guess it is because of how vcnt is calculated.
>
> > One solution is to add a queue limit of max_single_page_bvec, and let
> > blk_queue_split() handle it.
>
> The path (blk_queue_split -> blk_bio_segment_split -> bvec_split_segs)
> respects the max_segments limit. Do you mean introducing a
> max_single_page_bvec limit, then performing similar checking as for
> max_segments?

Yes, then the bio is guaranteed not to reach the max single-page bvec limit,
just like what __blk_queue_bounce() does. (A sketch of the bvec-counting
example above follows below.)


thanks,
Ming
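
---

A minimal user-space sketch of the counting example above, assuming 4 KiB
pages. struct vec and nr_single_page_segs() are hypothetical stand-ins for
struct bio_vec and the per-bvec accounting that bvec_split_segs() performs;
this is not kernel code:

/*
 * Show that capping max_sectors at BIO_MAX_VECS * PAGE_SECTORS does not
 * bound the number of single-page segments a bio may need.
 */
#include <stdio.h>

#define MY_PAGE_SIZE	4096
#define PAGE_SECTORS	(MY_PAGE_SIZE / 512)
#define BIO_MAX_VECS	256

struct vec {				/* stand-in for struct bio_vec */
	unsigned int offset;		/* offset within the first page */
	unsigned int len;		/* segment length in bytes */
};

/* number of single-page segments one multi-page bvec expands into */
static unsigned int nr_single_page_segs(const struct vec *v)
{
	unsigned int first = MY_PAGE_SIZE - v->offset % MY_PAGE_SIZE;

	if (v->len <= first)
		return 1;
	return 1 + (v->len - first + MY_PAGE_SIZE - 1) / MY_PAGE_SIZE;
}

int main(void)
{
	struct vec vecs[256];
	unsigned int i, bytes = 0, segs = 0;

	for (i = 0; i < 255; i++)	/* bvecs 0 ~ 254: 512 bytes each */
		vecs[i] = (struct vec){ .offset = 0, .len = 512 };
	vecs[255] = (struct vec){ .offset = 0, .len = 8192 };	/* bvec 255 */

	for (i = 0; i < 256; i++) {
		bytes += vecs[i].len;
		segs += nr_single_page_segs(&vecs[i]);
	}

	/*
	 * Prints: 138752 bytes = 271 sectors, 257 single-page segs.
	 * 271 < BIO_MAX_VECS * PAGE_SECTORS (2048), so the min_t() clamp
	 * in the patch never splits this bio, yet 257 > BIO_MAX_VECS.
	 */
	printf("%u bytes = %u sectors, %u single-page segs (BIO_MAX_VECS=%u)\n",
	       bytes, bytes / 512, segs, BIO_MAX_VECS);
	return 0;
}

The output shows the crux of the example: the payload is only 271 sectors,
well under BIO_MAX_VECS * PAGE_SECTORS = 2048, so the clamp in the patch never
splits this bio, yet expanding it page by page needs 257 single-page vectors,
more than BIO_MAX_VECS. The segment count is a property of the bvec layout,
not of the total size, which is why a dedicated max_single_page_bvec limit
enforced during splitting is attractive.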