In this function:

	void __bio_release_pages(struct bio *bio, bool mark_dirty)
	{
		unsigned int gup_flags = bio_to_gup_flags(bio);
		struct bvec_iter_all iter_all;
		struct bio_vec *bvec;

		bio_for_each_segment_all(bvec, bio, iter_all) {
			if (mark_dirty && !PageCompound(bvec->bv_page))
				set_page_dirty_lock(bvec->bv_page);
	>>>>		page_put_unpin(bvec->bv_page, gup_flags);
		}
	}

that ought to be a call to bio_release_page(), but the optimiser doesn't
want to inline it :-/

I found the only way I can get the compiler to properly inline it without
it repeating the calculations is to renumber the FOLL_* constants down and
then make bio_release_page() something like:

	static inline __attribute__((always_inline))
	void bio_release_page(struct bio *bio, struct page *page)
	{
		page_put_unpin(page,
			       ((bio->bi_flags & (1 << BIO_PAGE_REFFED)) ? FOLL_GET : 0) |
			       ((bio->bi_flags & (1 << BIO_PAGE_PINNED)) ? FOLL_PIN : 0));
	}

I guess the compiler optimiser isn't perfect yet :-)

David
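
For illustration only, a minimal userspace sketch of why renumbering the FOLL_* constants down might help. It assumes the renumbering makes the FOLL_* bits coincide with the BIO_PAGE_* bit positions, which the message above does not actually state. All DEMO_* names and values are hypothetical stand-ins, not the kernel's definitions. Under that assumption the pair of conditionals is equivalent to a single mask, which is presumably easier for the optimiser to fold:

```c
/* Hypothetical stand-ins; NOT the kernel's real flag numbers or values. */
enum {
	DEMO_BIO_PAGE_REFFED = 0,	/* bit number within bi_flags */
	DEMO_BIO_PAGE_PINNED = 1,	/* bit number within bi_flags */
};

/* Assumed to sit on the same bits as the two flags above. */
#define DEMO_FOLL_GET	(1u << 0)
#define DEMO_FOLL_PIN	(1u << 1)

struct demo_bio {
	unsigned int bi_flags;
};

/* Conditional form, mirroring the shape of bio_release_page() above. */
static inline unsigned int demo_gup_flags_cond(const struct demo_bio *bio)
{
	return ((bio->bi_flags & (1u << DEMO_BIO_PAGE_REFFED)) ? DEMO_FOLL_GET : 0) |
	       ((bio->bi_flags & (1u << DEMO_BIO_PAGE_PINNED)) ? DEMO_FOLL_PIN : 0);
}

/* Mask form: only valid because the demo bit positions coincide. */
static inline unsigned int demo_gup_flags_mask(const struct demo_bio *bio)
{
	return bio->bi_flags & (DEMO_FOLL_GET | DEMO_FOLL_PIN);
}

/* Returns 1 if both forms agree for every combination of the low four bits. */
static int demo_forms_agree(void)
{
	for (unsigned int flags = 0; flags < 16; flags++) {
		struct demo_bio bio = { .bi_flags = flags };

		if (demo_gup_flags_cond(&bio) != demo_gup_flags_mask(&bio))
			return 0;
	}
	return 1;
}
```

Whether a given compiler actually reduces the conditional form to the mask is not something this sketch proves; comparing the generated assembly for the two functions would be the way to check.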