On Tue, Aug 06, 2019 at 12:05:51AM +0000, Damien Le Moal wrote: > On 2019/08/06 7:05, Damien Le Moal wrote: > > On 2019/08/06 6:59, Damien Le Moal wrote: > >> On 2019/08/06 6:28, Jens Axboe wrote: > >>> On 8/5/19 2:27 PM, Damien Le Moal wrote: > >>>> On 2019/08/06 6:26, Jens Axboe wrote: > >>>>>> In any case, looking again at this code, it looks like there is a > >>>>>> problem with dio->size being incremented early, even for fragments > >>>>>> that get BLK_QC_T_EAGAIN, because dio->size is being used in > >>>>>> blkdev_bio_end_io(). So an incorrect size can be reported to user > >>>>>> space in that case on completion (e.g. large asynchronous no-wait dio > >>>>>> that cannot be issued in one go). > >>>>>> > >>>>>> So maybe something like this ? (completely untested) > >>>>> > >>>>> I think that looks pretty good, I like not double accounting with > >>>>> this_size and dio->size, and we retain the old style ordering for the > >>>>> ret value. > >>>> > >>>> Do you want a proper patch with real testing backup ? I can send that > >>>> later today. > >>> > >>> Yeah that'd be great, I like your approach better. > >>> > >> > >> Looking again, I think this is not it yet: dio->size is being referenced after > >> submit_bio(), so blkdev_bio_end_io() may see the old value if the bio completes > >> before dio->size increment. So the use-after-free is still there. And since > >> blkdev_bio_end_io() processes completion to user space only when dio->ref > >> becomes 0, adding an atomic_inc/dec(&dio->ref) over the loop would not help and > >> does not cover the single BIO case. Any idea how to address this one ? > >> > > > > May be add a bio_get/put() over the 2 places that do submit_bio() would work, > > for all cases (single/multi BIO, sync & async). E.g.: > > > > + bio_get(bio); > > qc = submit_bio(bio); > > if (qc == BLK_QC_T_EAGAIN) { > > if (!dio->size) > > ret = -EAGAIN; > > + bio_put(bio); > > goto error; > > } > > dio->size += bio_size; > > + bio_put(bio); > > > > Thoughts ? > > > > That does not work since the reference to dio->size in blkdev_bio_end_io() > depends on atomic_dec_and_test(&dio->ref) which counts the BIO fragments for the > dio (+1 for async multi-bio case). So completion of the last bio can still > reference the old value of dio->size. Didn't we fix this same use-after-free in iomap_dio_rw() in commit 4ea899ead278 ("iomap: fix a use after free in iomap_dio_rw")? Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx