[Bug 208827] [fio io_uring] io_uring write data crc32c verify failed

https://bugzilla.kernel.org/show_bug.cgi?id=208827

--- Comment #22 from Jens Axboe (axboe@xxxxxxxxx) ---
On 8/11/20 4:09 PM, Dave Chinner wrote:
> On Tue, Aug 11, 2020 at 04:56:37PM -0400, Jeff Moyer wrote:
>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>
>>> So it seems to me like the file state is consistent, at least after the
>>> run, and that this seems more likely to be a fio issue with short
>>> read handling.
>>
>> Any idea why there was a short read, though?
> 
> Yes. See:
> 
>
> https://lore.kernel.org/linux-xfs/20200807024211.GG2114@xxxxxxxxxxxxxxxxxxx/T/#maf3bd9325fb3ac0773089ca58609a2cea0386ddf
> 
> It's a race between the readahead io completion marking pages
> uptodate and unlocking them, and the io_uring worker function
> getting woken on the first page being unlocked and running the
> buffered read before the entire readahead IO completion has unlocked
> all the pages in the IO.
> 
> Basically, io_uring is re-running the IOCB_NOWAIT|IOCB_WAITQ IO when
> there are still pages locked under IO. This will happen much more
> frequently the larger the buffered read (these are only 64kB) and
> the readahead windows are opened.
> 
> Essentially, the io_uring buffered read needs to wait until _all_
> pages in the IO are marked up to date and unlocked, not just the
> first one. And not just the last one, either - readahead can be
> broken into multiple bios (because it spans extents) and there is no
> guarantee of order of completion of the readahead bios given by the
> readahead code....

Yes, it would ideally wait, or at least trigger on the last one. I'll
see if I can improve that. In all of my testing, the number of
triggered short reads is minimal. For the verify case that we just ran,
we're talking 8-12 ios out of 820 thousand, or roughly 0.001% of them.
So nothing that makes a performance difference in practical terms,
though it would be nice to not hand back short reads if we can avoid
it. Not for performance reasons, but for usage reasons.
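
To illustrate the usage side: a consumer that needs a full buffer has to
treat a short read as partial progress and re-issue the remainder. The
following is only a minimal sketch of that pattern using plain synchronous
pread(), not io_uring and not fio's actual code. The name read_full and the
injectable pread parameter are hypothetical, introduced here for
illustration.

```python
import os


def read_full(fd, length, offset, pread=os.pread):
    """Read `length` bytes starting at `offset`, retrying after short reads.

    Returns fewer than `length` bytes only if end-of-file is hit first.
    `pread` is injectable (hypothetical parameter) so short reads can be
    simulated in tests; it defaults to os.pread.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = pread(fd, remaining, offset)
        if not chunk:
            # An empty result means EOF: there is nothing left to retry.
            break
        chunks.append(chunk)
        # A short read is partial progress: advance and ask for the rest.
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With a loop like this, a verify pass only compares checksums once the whole
block has been read, so an occasional short read costs an extra submission
rather than a spurious verify failure.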

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


