On 7/25/18 2:15 PM, Martin Wilck wrote:
> Hello Jens, Ming, Jan, and all others,
>
> the following patches have been verified by a customer to fix a silent data
> corruption which he has been seeing since "72ecad2 block: support a full bio
> worth of IO for simplified bdev direct-io".
>
> The patches are based on our observation that the corruption is only
> observed if the __blkdev_direct_IO_simple() code path is executed,
> and if that happens, "short writes" are observed in this code path,
> which causes a fallback to buffered IO, while the application continues
> submitting direct IO requests.
>
> Following Ming's suggestion, I've changed the patch set such that
> bio_iov_iter_get_pages() now always returns as many pages as possible.
> This simplifies the patch set a lot. Except for
> __blkdev_direct_IO_simple(), all callers of bio_iov_iter_get_pages()
> call it in a loop, and expect to get just some pages. Therefore I
> have made bio_iov_iter_get_pages() return success if it can pin some
> pages, even if MM returns an error on the way. Error is returned only
> if no pages at all could be pinned. This also avoids the need for
> cleanup code in the helper - callers will submit the bio with the
> allocated pages, and clean up later as appropriate.

Thanks everyone involved in this, I've queued it up for 4.18.

-- 
Jens Axboe
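
The return-value convention described in the quoted cover letter (pin as many pages as possible, report success if at least one page was pinned, return an error only if none were) can be modelled in user space roughly as below. This is only a sketch and not the kernel code: `struct model_bio`, `model_pin_pages()` and `model_get_pages()` are hypothetical names invented for illustration, and the "MM" step is simulated by a counter of pinnable pages.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio: a fixed number of page slots. */
struct model_bio {
    size_t max_pages;  /* capacity of the bio */
    size_t nr_pages;   /* slots filled so far */
};

/*
 * Hypothetical stand-in for the MM pinning step. Pins at most `batch`
 * pages per call out of `*avail` pinnable pages. Returns the number of
 * pages pinned, or -EFAULT if nothing can be pinned.
 */
static long model_pin_pages(size_t *avail, size_t batch, size_t want)
{
    size_t n = want < batch ? want : batch;

    if (*avail == 0 || n == 0)
        return -EFAULT;
    if (n > *avail)
        n = *avail;
    *avail -= n;
    return (long)n;
}

/*
 * Keep pinning until the request is satisfied, the bio is full, or the
 * pinning step fails. Success (0) is reported if at least one page was
 * added during this call; the error is passed on only if none were.
 */
static int model_get_pages(struct model_bio *bio, size_t *avail,
                           size_t *remaining, size_t batch)
{
    size_t start = bio->nr_pages;
    int err = 0;

    while (*remaining > 0 && bio->nr_pages < bio->max_pages) {
        size_t room = bio->max_pages - bio->nr_pages;
        size_t want = *remaining < room ? *remaining : room;
        long got = model_pin_pages(avail, batch, want);

        if (got < 0) {
            err = (int)got;  /* remember the error, stop pinning */
            break;
        }
        bio->nr_pages += (size_t)got;
        *remaining -= (size_t)got;
    }
    return bio->nr_pages > start ? 0 : err;
}
```

In this model a caller that does not loop gets a full bio whenever enough pages can be pinned, while a caller that loops can submit whatever was pinned and see the error on its next call, once no further progress is possible.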