On 2020/10/13 8:01 PM, Matthew Wilcox wrote:
On Tue, Oct 13, 2020 at 01:13:48PM +0800, Hao_Xu wrote:
On 2020/10/13 5:13 AM, Matthew Wilcox wrote:
This one's pretty unlikely, but there's a case in buffered reads where
an IOCB_WAITQ read can end up sleeping.
generic_file_buffered_read():
        page = find_get_page(mapping, index);
...
        if (!PageUptodate(page)) {
...
                if (iocb->ki_flags & IOCB_WAITQ) {
...
                        error = wait_on_page_locked_async(page,
                                                          iocb->ki_waitq);
wait_on_page_locked_async():
        if (!PageLocked(page))
                return 0;
(back to generic_file_buffered_read):
                if (!mapping->a_ops->is_partially_uptodate(page,
                                                offset, iter->count))
                        goto page_not_up_to_date_locked;
page_not_up_to_date_locked:
        if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
                unlock_page(page);
                put_page(page);
                goto would_block;
        }
...
        error = mapping->a_ops->readpage(filp, page);
(will unlock page on I/O completion)
        if (!PageUptodate(page)) {
                error = lock_page_killable(page);
So if we have IOCB_WAITQ set but IOCB_NOWAIT clear, we'll call ->readpage()
and wait for the I/O to complete. I can't quite figure out if this is
intentional -- I think not; if I understand the semantics right, we
should be returning -EIOCBQUEUED and punting to an I/O thread to
kick off the I/O and wait.
I think the right fix is to return -EIOCBQUEUED from
wait_on_page_locked_async() if the page isn't locked. ie this:
@@ -1258,7 +1258,7 @@ static int wait_on_page_locked_async(struct page *page,
 				     struct wait_page_queue *wait)
 {
 	if (!PageLocked(page))
-		return 0;
+		return -EIOCBQUEUED;
 	return __wait_on_page_locked_async(compound_head(page), wait, false);
 }
But as I said, I'm not sure what the semantics are supposed to be.
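For readers following the thread, the difference between the two return values can be sketched as a small userspace model. This is only an illustration under loud assumptions: `struct toy_page`, `enum toy_outcome` and every `toy_*` function are hypothetical stand-ins, not kernel code, and the model collapses the real control flow (the partial-uptodate check, the race where the page is unlocked while the waiter is being queued) into a single decision. `EIOCBQUEUED` is defined locally because it is a kernel-internal errno that userspace headers do not provide.

```c
#include <stdbool.h>

/* Kernel-internal errno (include/linux/errno.h); not exported to userspace. */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif

/* Hypothetical stand-in for struct page: only the two flags that matter here. */
struct toy_page {
    bool locked;
    bool uptodate;
};

/* Hypothetical summary of what an IOCB_WAITQ read (IOCB_NOWAIT clear) ends up doing. */
enum toy_outcome {
    TOY_READ_OK,       /* page was already uptodate */
    TOY_PUNTED,        /* -EIOCBQUEUED: retried later from a context that may block */
    TOY_SLEPT_INLINE,  /* fell through to ->readpage() and lock_page_killable() */
};

/* Behaviour quoted above: an unlocked page means "nothing to wait for". */
static int toy_wait_async_current(const struct toy_page *page)
{
    if (!page->locked)
        return 0;
    return -EIOCBQUEUED; /* models the waiter being queued on a locked page */
}

/* Behaviour with the proposed one-line change: never report "done" here. */
static int toy_wait_async_proposed(const struct toy_page *page)
{
    if (!page->locked)
        return -EIOCBQUEUED;
    return toy_wait_async_current(page);
}

/* Simplified model of the !PageUptodate branch of the buffered read path. */
static enum toy_outcome toy_buffered_read(const struct toy_page *page,
                                          int (*wait_fn)(const struct toy_page *))
{
    if (page->uptodate)
        return TOY_READ_OK;
    if (wait_fn(page) == -EIOCBQUEUED)
        return TOY_PUNTED;
    /* wait_fn() returned 0 but the page is still not uptodate. */
    return TOY_SLEPT_INLINE;
}
```

In this model, a page that is neither locked nor uptodate yields `TOY_SLEPT_INLINE` with `toy_wait_async_current` and `TOY_PUNTED` with `toy_wait_async_proposed`, which is the case the message above is concerned with.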
Hi Matthew,
Which kernel version are you using? I believe I've fixed this case in
commit c8d317aa1887b40b188ec3aaa6e9e524333caed1.
Ah, I don't have that commit in my tree.
Nevertheless, there is still a problem. The ->readpage implementation
is not required to execute asynchronously. For example, it may enter
page reclaim by using GFP_KERNEL. Indeed, I feel it is better if it
works synchronously as it can then report the actual error from an I/O
instead of the almost-meaningless -EIO.
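The error-reporting point can be illustrated with a rough userspace sketch. Everything here is an assumption made for illustration: `struct toy_page` and the `toy_*` functions are hypothetical, the I/O "completes" immediately instead of later, and `-ETIMEDOUT` is just an arbitrary example of a specific failure. The model only shows why a caller that can inspect nothing but the uptodate flag ends up reporting `-EIO`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the uptodate flag is modelled. */
struct toy_page {
    bool uptodate;
};

/*
 * Asynchronous style: the call only submits the I/O and returns 0.
 * The completion path can record success or failure in the page flag,
 * but the specific error code is lost to the caller.
 */
static int toy_readpage_async(struct toy_page *page, int io_error)
{
    page->uptodate = (io_error == 0);
    return 0;
}

/* Synchronous style: the call waits for the I/O and returns its result. */
static int toy_readpage_sync(struct toy_page *page, int io_error)
{
    page->uptodate = (io_error == 0);
    return io_error;
}

/* Simplified caller: trust the return value, then fall back to the flag. */
static int toy_read(struct toy_page *page,
                    int (*readpage)(struct toy_page *, int),
                    int io_error)
{
    int error = readpage(page, io_error);

    if (error)
        return error;   /* the specific error, if readpage reported one */
    if (!page->uptodate)
        return -EIO;    /* all that can be said after the fact */
    return 0;
}
```

With a simulated failure of `-ETIMEDOUT`, `toy_read` returns `-EIO` through `toy_readpage_async` and `-ETIMEDOUT` through `toy_readpage_sync`; both return 0 when the simulated I/O succeeds.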
This patch series documents 12 filesystems which implement ->readpage
in a synchronous way today (for at least some cases) and converts iomap
to be synchronous (making two more filesystems synchronous).
https://lore.kernel.org/linux-fsdevel/20201009143104.22673-1-willy@xxxxxxxxxxxxx/
Thanks, Matthew. I wasn't aware of this before; thank you for sharing the
information. It's really kind of you. I'll look into it soon.