On 1/9/23 3:24?PM, David Howells wrote: > Would you be okay with me flipping the logic of BIO_NO_PAGE_REF, so I end up > with: > > static void bio_release_page(struct bio *bio, struct page *page) > { > if (bio_flagged(bio, BIO_PAGE_PINNED)) > unpin_user_page(page); > if (bio_flagged(bio, BIO_PAGE_REFFED)) > put_page(page); > } > > See attached patch. I think it makes more sense to have NO_REF check, to be honest, as that means the general path doesn't have to set that flag. But I don't feel too strongly about that part. > diff --git a/block/bio.c b/block/bio.c > index 5f96fcae3f75..5b9f9fc62345 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -243,6 +243,11 @@ static void bio_free(struct bio *bio) > * Users of this function have their own bio allocation. Subsequently, > * they must remember to pair any call to bio_init() with bio_uninit() > * when IO has completed, or when the bio is released. > + * > + * We set the initial assumption that pages attached to the bio will be > + * released with put_page() by setting BIO_PAGE_REFFED, but this should be set > + * to BIO_PAGE_PINNED if the page should be unpinned instead; if the pages > + * should not be put or unpinned, these flags should be cleared. > */ > void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table, > unsigned short max_vecs, blk_opf_t opf) > @@ -274,6 +279,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table, > #ifdef CONFIG_BLK_DEV_INTEGRITY > bio->bi_integrity = NULL; > #endif > + bio_set_flag(bio, BIO_PAGE_REFFED); This is first set to zero, then you set the flag. Why not just initialize it like that to begin with? > @@ -302,6 +308,8 @@ void bio_reset(struct bio *bio, struct block_device *bdev, blk_opf_t opf) > { > bio_uninit(bio); > memset(bio, 0, BIO_RESET_BYTES); > + bio_set_flag(bio, BIO_PAGE_REFFED); > + bio_clear_flag(bio, BIO_PAGE_PINNED); > atomic_set(&bio->__bi_remaining, 1); > bio->bi_bdev = bdev; > if (bio->bi_bdev) You just memset bi_flags here, surely we don't need to clear BIO_PAGE_PINNED after that? > @@ -814,6 +822,8 @@ static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp) > bio_set_flag(bio, BIO_CLONED); > bio->bi_ioprio = bio_src->bi_ioprio; > bio->bi_iter = bio_src->bi_iter; > + bio_clear_flag(bio, BIO_PAGE_REFFED); > + bio_clear_flag(bio, BIO_PAGE_PINNED); Maybe it would make sense to have a set/clear mask operation? Not strictly required for this patch, but probably worth checking after the fact. > @@ -1273,12 +1295,20 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > * result to ensure the bio's total size is correct. The remainder of > * the iov data will be picked up in the next bio iteration. > */ > - size = iov_iter_get_pages(iter, pages, > - UINT_MAX - bio->bi_iter.bi_size, > - nr_pages, &offset, gup_flags); > + size = iov_iter_extract_pages(iter, &pages, > + UINT_MAX - bio->bi_iter.bi_size, > + nr_pages, gup_flags, > + &offset, &cleanup_mode); > if (unlikely(size <= 0)) > return size ? size : -EFAULT; > > + bio_clear_flag(bio, BIO_PAGE_REFFED); > + bio_clear_flag(bio, BIO_PAGE_PINNED); > + if (cleanup_mode & FOLL_GET) > + bio_set_flag(bio, BIO_PAGE_REFFED); > + if (cleanup_mode & FOLL_PIN) > + bio_set_flag(bio, BIO_PAGE_PINNED); > + > nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE); The cleanup_mode pass-back isn't the prettiest thing in the world, and that's a lot of arguments. Maybe it'd be slightly better if we just have gup_flags be an output parameter too? Also not great to first clear both flags, then set them with the two added branches... > diff --git a/include/linux/bio.h b/include/linux/bio.h > index 22078a28d7cb..1c6f051f6ff2 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -482,7 +482,8 @@ void zero_fill_bio(struct bio *bio); > > static inline void bio_release_pages(struct bio *bio, bool mark_dirty) > { > - if (!bio_flagged(bio, BIO_NO_PAGE_REF)) > + if (bio_flagged(bio, BIO_PAGE_REFFED) || > + bio_flagged(bio, BIO_PAGE_PINNED)) > __bio_release_pages(bio, mark_dirty); > } Same here on a mask check, but perhaps it ends up generating the same code? -- Jens Axboe