On 2019/4/11 16:09, Ming Lei wrote: > On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote: >> >> >> On 2019/4/11 15:08, Ming Lei wrote: >>> Hi Gao Xiang, >>> >>> On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote: >>>> Hi Ming, >>>> >>>> I found a erofs issue after commit 07173c3ec276 >>>> ("block: enable multipage bvecs") is merged. It seems that >>>> it tries to merge more physical continuous pages in one iovec. >>>> >>>> However it breaks the current erofs_read_raw_page logic since it uses >>>> nr_iovecs of bio_alloc to limit the maximum number of physical >>> >>> I believe you can do the limit outside easily, such as by checking >>> how many pages have been added to the bio. >> >> Yes, I agree that. However, I noticed that bio_add_page is exported as >> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well. > > I don't think it is a good behaviour to use bio->bi_max_vecs to limit > max allowed page, you may see the idea from the naming simply... I hoped to simplify all these page limits to one number which is the minimum number among them all then... > > If there were other such drivers, we may fix it easily, and the following > patch should fix your issue: > > diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c > index 526e0dbea5b5..8878f2f2593e 100644 > --- a/drivers/staging/erofs/data.c > +++ b/drivers/staging/erofs/data.c > @@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio, > *last_block = current_block; > > /* shift in advance in case of it followed by too many gaps */ > - if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) { > + if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) { > /* err should reassign to 0 after submitting */ > err = 0; > goto submit_bio_out; Yeah, it is fine, thanks for your suggestion :) Thanks, Gao Xiang >> >> Is there no way to take old behavior into consideration in the block layer as well? > > One new interface, such as bio_add_page_limit(), may be helpful for > addressing this issue, but not sure if it is necessary, since it is > more reasonable to put the logic in user side. > >> >>> >>>> continuous blocks as well. It was practicable since the old >>>> __bio_try_merge_page only tries to merge in the same page. >>>> it is a kAPI behavior change which also affects bio_alloc... >>>> >>>> ... >>>> 231 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW); >>>> 232 if (unlikely(err)) >>>> 233 goto err_out; >>>> ... >>>> 284 /* max # of continuous pages */ >>>> 285 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE)) >>>> 286 nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE); >>>> 287 if (nblocks > BIO_MAX_PAGES) >>>> 288 nblocks = BIO_MAX_PAGES; >>>> 289 >>>> 290 bio = erofs_grab_bio(sb, blknr, nblocks, sb, >>>> 291 read_endio, false); >>> >>> Previously this bio is allowed to add at most 'nblocks' pages, however, >>> now we are allowed to add at most 'nblocks' io vecs, instead of pages. >>> >>>> 292 if (IS_ERR(bio)) { >>>> 293 err = PTR_ERR(bio); >>>> 294 bio = NULL; >>>> 295 goto err_out; >>>> 296 } >>>> 297 } >>>> 298 >>>> 299 err = bio_add_page(bio, page, PAGE_SIZE, 0); >>>> 300 /* out of the extent or bio is full */ >>>> 301 if (err < PAGE_SIZE) >>>> 302 goto submit_bio_retry; >>>> ... >>>> >>>> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could >>>> read more beyond what erofs_map_blocks assigns, and out-of-bound data could >>>> be read and it breaks tail-end inline determination. >>> >>> I don't understand why, could you explain a bit why erofs reads more? >> >> In current erofs, file could be break in two parts (non tail-end and tail-end blocks). >> Considering this on-disk layout, >> ________________________________________________________________ >> | non tail-end data | meta data | >> |____________________________________|_(inode) + non-inline data_| >> ^ ^ >> | | >> bio start \-- what nr_iovecs indicates as well: >> the end of the non tail-end data. >> >> Before this commit, it will stop just before the next meta data. However, >> if the new bio merging behavior is introduced, it could add pages more than >> nr_iovecs, thus meta data will be read as the normal data... > > I'd suggest you to not re-use bio->bi_max_vecs for this purpose. > >> >> >>> >>> The amount depends on how many pages you added to the bio. >>> >>>> >>>> I can change the logic in erofs. However, out of curiosity, I have no idea >>>> if some other places also are designed like this. >>>> >>>> IMO, it's better to provide a total count which indicates how many real >>>> pages have been added in this bio. some thoughts? >>> >>> As I mentioned, you may count how many pages added to bio, or you still >>> can get the number via bio_segments(bio). >> >> It is unsuitable to introduce bio_segments for each bio_add_page... > > It isn't necessary, see the above patch. > > > Thanks, > Ming >