On 2019/4/9 9:14 PM, Michael S. Tsirkin wrote:
On Tue, Apr 09, 2019 at 12:16:47PM +0800, Jason Wang wrote:
We set the dirty bit by setting up kmaps and accessing them through a
kernel virtual address. This may result in aliases in virtually tagged
caches, which require a dcache flush afterwards.
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Cc: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
This is like saying "everyone with vhost needs this".
In practice it might only affect some architectures.
For the archs that do not need dcache flushing, the function is just a nop.
Which ones?
More than 10 archs have ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE defined, so I
just cc'd the maintainers of some of the more affected ones.
You want to Cc the relevant maintainers
who understand this...
Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
I am not sure this is a good idea.
The region in question is supposed to be accessed
by userspace at the same time, through atomic operations.
How do we know userspace didn't access it just before?
get_user_pages() will call flush_anon_page() to make sure the
userspace write is visible to the kernel.
Is that an issue at all given we use
atomics for access? Documentation/core-api/cachetlb.rst does
not mention atomics.
Which architectures are affected?
Assuming atomics actually do need a flush, then don't we need
a flush in the other direction too? How are atomics
supposed to work at all?
It's an issue of visibility; an atomic operation is just one of the
possible operations. If we can eventually make the writes visible to each
other, there will be no issue.
It looks to me like we could still end up with an alias if userspace is
accessing the dirty log between get_user_pages_fast() and
flush_dcache_page(). But flush_dcache_page() can guarantee that what the
kernel wrote eventually becomes visible to userspace, though some bits
cleared by userspace might still be there. We may end up with more dirty
pages noticed by userspace, which should be harmless.
I really think we need new APIs along the lines of
set_bit_to_user.
Can we simply do:
get_user()
set bit
put_user()
instead?
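
For illustration only, the three-step sequence proposed above could be modeled in userspace roughly as follows. This is a sketch under assumptions, not kernel code: model_get_user(), model_put_user() and model_set_bit_to_user() are hypothetical stand-ins operating on an ordinary buffer, and the byte-wise, least-significant-bit-first layout is assumed for the example. Note that the sequence is a plain read-modify-write, so unlike set_bit() it is not atomic with respect to a concurrent writer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for get_user()/put_user(): plain loads and stores
 * on an ordinary buffer. The real kernel helpers can also fault and return
 * -EFAULT; that is not modeled here. */
static int model_get_user(uint8_t *val, const uint8_t *uaddr)
{
    *val = *uaddr;
    return 0;
}

static int model_put_user(uint8_t val, uint8_t *uaddr)
{
    *uaddr = val;
    return 0;
}

/* Set bit nr in the dirty log with the get/set/put sequence from the mail.
 * Assumed layout: 8 bits per byte, least significant bit first. */
static int model_set_bit_to_user(int nr, uint8_t *dirty_log)
{
    uint8_t byte;
    int r;

    assert(nr >= 0);

    r = model_get_user(&byte, dirty_log + nr / 8);   /* get_user() */
    if (r)
        return r;

    byte |= (uint8_t)(1u << (nr % 8));               /* set bit */

    return model_put_user(byte, dirty_log + nr / 8); /* put_user() */
}
```

Calling model_set_bit_to_user(10, buf) on a zeroed two-byte buffer sets bit 2 of buf[1] and leaves buf[0] untouched.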
---
drivers/vhost/vhost.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 351af88231ad..34a1cedbc5ba 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1711,6 +1711,7 @@ static int set_bit_to_user(int nr, void __user *addr)
base = kmap_atomic(page);
set_bit(bit, base);
kunmap_atomic(base);
+ flush_dcache_page(page);
set_page_dirty_lock(page);
put_page(page);
return 0;
Ignoring the question of whether this actually helps, I doubt
flush_dcache_page is appropriate here. Pls take a look at
Documentation/core-api/cachetlb.rst as well as the actual
implementation.
I think you meant flush_kernel_dcache_page, and IIUC it must happen
before kunmap, not after (while you still have the va locked).
Looks like you're right.
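
If that reading is right, the tail of set_bit_to_user() would flush before unmapping. The following is only a userspace mock of that ordering: every stub_* function is a hypothetical stand-in that records its name rather than touching any cache, struct page is a dummy, and set_page_dirty_lock()/put_page() are omitted.

```c
#include <assert.h>
#include <string.h>

/* Userspace stubs, NOT the kernel functions: each one only records its
 * name so that the call order can be inspected afterwards. */
#define MAX_CALLS 8
static const char *call_log[MAX_CALLS];
static int n_calls;

static void record(const char *name)
{
    assert(n_calls < MAX_CALLS);
    call_log[n_calls++] = name;
}

/* Position of a recorded call, or -1 if it was never made. */
static int call_index(const char *name)
{
    for (int i = 0; i < n_calls; i++)
        if (strcmp(call_log[i], name) == 0)
            return i;
    return -1;
}

struct page { unsigned long word; };   /* dummy page: one word of data */

static void *stub_kmap_atomic(struct page *page)
{
    record("kmap_atomic");
    return &page->word;
}

static void stub_set_bit(int bit, void *base)
{
    record("set_bit");
    *(unsigned long *)base |= 1UL << bit;
}

static void stub_flush_kernel_dcache_page(struct page *page)
{
    (void)page;
    record("flush_kernel_dcache_page");
}

static void stub_kunmap_atomic(void *base)
{
    (void)base;
    record("kunmap_atomic");
}

/* Suggested ordering: flush while the kernel mapping is still in place. */
static void set_bit_and_flush(struct page *page, int bit)
{
    void *base = stub_kmap_atomic(page);

    stub_set_bit(bit, base);
    stub_flush_kernel_dcache_page(page);
    stub_kunmap_atomic(base);
}
```

After one call to set_bit_and_flush(), call_index() can be used to confirm that the flush was recorded after set_bit and before kunmap_atomic.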
Thanks
--
2.19.1