On Wed, Aug 21, 2019 at 04:30:41PM -0700, Christoph Hellwig wrote: > On Wed, Aug 21, 2019 at 09:39:04AM -0400, Brian Foster wrote: > > > @@ -36,9 +57,12 @@ xfs_rw_bdev( > > > unsigned int off = offset_in_page(data); > > > unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); > > > > > > - while (bio_add_page(bio, page, len, off) != len) { > > > + while ((ret = xfs_bio_add_page(bio, page, len, off)) != len) { > > > struct bio *prev = bio; > > > > > > + if (ret < 0) > > > + goto submit; > > > + > > > > Hmm.. is submitting the bio really the right thing to do if we get here > > and have failed to add any pages to the bio? If we're already seeing > > weird behavior for bios with unaligned data memory, this seems like a > > recipe for similar weirdness. We'd also end up doing a partial write in > > scenarios where we already know we're returning an error. Perhaps we > > should create an error path or use a check similar to what is already in > > xfs_buf_ioapply_map() (though I'm not a fan of submitting a partial I/O > > when we already know we're going to return an error) to call bio_endio() > > to undo any chaining. > > It is not the right thing to do. Calling bio_endio after setting > an error is the right thing to do (modulo any other cleanup needed). bio_endio() doesn't wait for completion or all the IO in progress. In fact, if we have chained bios still in progress, it does absolutely nothing: void bio_endio(struct bio *bio) { again: >>>>> if (!bio_remaining_done(bio)) return; and so we return still with the memory we've put into the buffers in active use by the chained bios under IO. On error, we'll free the allocated buffer immediately, and that means we've got a use-after-free as the bios in progress still have references to it. If it's heap memory we are using here, then that's a memory corruption (read) or kernel memory leak (write) vector. So we have to wait for IO completion before we return from this function, and AFAICT, the only way to do that is to call submit_bio_wait() on the parent of the bio chain to wait for all child bios to drop their references and call bio_endio() on the parent of the chain.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx