On 2019/08/06 6:28, Jens Axboe wrote: > On 8/5/19 2:27 PM, Damien Le Moal wrote: >> On 2019/08/06 6:26, Jens Axboe wrote: >>>> In any case, looking again at this code, it looks like there is a >>>> problem with dio->size being incremented early, even for fragments >>>> that get BLK_QC_T_EAGAIN, because dio->size is being used in >>>> blkdev_bio_end_io(). So an incorrect size can be reported to user >>>> space in that case on completion (e.g. large asynchronous no-wait dio >>>> that cannot be issued in one go). >>>> >>>> So maybe something like this ? (completely untested) >>> >>> I think that looks pretty good, I like not double accounting with >>> this_size and dio->size, and we retain the old style ordering for the >>> ret value. >> >> Do you want a proper patch with real testing backup ? I can send that >> later today. > > Yeah that'd be great, I like your approach better. > Looking again, I think this is not it yet: dio->size is being referenced after submit_bio(), so blkdev_bio_end_io() may see the old value if the bio completes before dio->size increment. So the use-after-free is still there. And since blkdev_bio_end_io() processes completion to user space only when dio->ref becomes 0, adding an atomic_inc/dec(&dio->ref) over the loop would not help and does not cover the single BIO case. Any idea how to address this one ? -- Damien Le Moal Western Digital Research