Re: Silent data corruption in blkdev_direct_IO()

On 7/12/18 5:29 PM, Ming Lei wrote:
> On Thu, Jul 12, 2018 at 10:36 PM, Hannes Reinecke <hare@xxxxxxx> wrote:
>> Hi Jens, Christoph,
>>
>> we're currently hunting down a silent data corruption occurring due to
>> commit 72ecad22d9f1 ("block: support a full bio worth of IO for
>> simplified bdev direct-io").
>>
>> While the whole thing is still hazy on the details, the one thing we've
>> found is that reverting that patch fixes the data corruption.
>>
>> And looking closer, I've found this:
>>
>> static ssize_t
>> blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>> {
>>         int nr_pages;
>>
>>         nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
>>         if (!nr_pages)
>>                 return 0;
>>         if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
>>                 return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>
>>         return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
>> }
>>
>> When checking the call path
>> __blkdev_direct_IO()->bio_alloc_bioset()->bvec_alloc()
>> I found that bvec_alloc() will fail if nr_pages > BIO_MAX_PAGES.
>>
>> So why is there the check for 'nr_pages <= BIO_MAX_PAGES' ?
>> It's not that we can handle it in __blkdev_direct_IO() ...
>>
>> Thanks for any clarification.
> 
> Maybe you can try the following patch from Christoph to see if it makes a
> difference:
> 
> https://marc.info/?l=linux-kernel&m=153013977816825&w=2

That's not a bad idea.

-- 
Jens Axboe



