On 4 Dec 2024, at 13:16, Zi Yan wrote: > On 4 Dec 2024, at 13:13, Zi Yan wrote: > >> On 4 Dec 2024, at 12:46, Vlastimil Babka wrote: >> >>> On 12/4/24 18:33, Zi Yan wrote: >>>> On 4 Dec 2024, at 11:29, Matthew Wilcox wrote: >>>> >>>>> On Wed, Dec 04, 2024 at 11:16:51AM -0500, Zi Yan wrote: >>>>>>> So maybe the clearing done as part of page allocator isn't enough here. >>>>>>> >>>>>> Basically, mips needs to flush data cache if kmap address is aliased to >>>>> >>>>> People use "aliased" in contronym ways. Do you mean "has a >>>>> non-congruent alias" or "has a congruent alias"? >>>>> >>>>>> userspace address. This means when mips has THP on, the patch below >>>>>> is not enough to fix the issue. >>>>>> >>>>>> In post_alloc_hook(), it does not make sense to pass userspace address >>>>>> in to determine whether to flush dcache or not. >>>>>> >>>>>> One way to fix it is to add something like arch_userpage_post_alloc() >>>>>> to flush dcache if kmap address is aliased to userspace address. >>>>>> But my questions are that >>>>>> 1) if kmap address will always be the same for two separate kmap_local() calls, >>>>> >>>>> No. It just takes the next address in the stack. >>>> >>>> Hmm, if kmap_local() gives different addresses, wouldn’t init_on_alloc be >>>> causing issues before my patch? In the page allocator, the page is zeroed >>>> from one kmap address without flush, then clear_user_highpage() clears >>>> it again with another kmap address with flush. After returning to userspace, >>>> the user application works on the page but when the cache line used by >>>> init_on_alloc is written back (with 0s) at eviction, user data is corrupted. >>>> Am I missing anything? Or all arch with cache aliasing never enables >>>> init_on_alloc? >>> >>> Maybe the arch also defines some hooks like arch_kmap_local_post_unmap() ? >> >> But this does not solve the possible init_on_alloc issue, since init_on_alloc >> is done in mm/page_alloc.c without userspace address and has no knowledge of >> whether the zeroed page will be used in userspace nor the cache line will >> be the same as the userspace page cache line. If dcache is flushed >> unconditionally for kmap_local, that could degrade performance. >> >>> As for the fix, could it rely on e.g. __HAVE_ARCH_COPY_USER_HIGHPAGE instead >>> of CONFIG_MIPS? That affects more arches, I don't know if we broke only mips >>> or others too. >> >> Yes, this is much better, since this issue affects any arch with cache aliasing. >> Let me update my fix. Thanks. > > I notice that arm64 has __HAVE_ARCH_COPY_USER_HIGHPAGE defined, so I will > need to look for an alternative. It turns out sh, sparc, arm, xtensa, nios2, m68k, parisc, csky, and powerpc all have cache flush operations in clear_user_page() compared to clear_page() and arc clears PG_dc_clean bit in addition to clear_page(). So __GFP_ZERO cannot simply replace clear_user_page()/clear_user_highpage(). I can add ARCH_REQUIRE_CLEAR_USER_PAGE for these arch and use it to decide whether clear_user_page()/clear_user_highpage() needs to be used regardless of the presence of init_on_alloc. I also wonder if INIT_ON_ALLOC_DEFAULT_ON works on these arch or not. Best Regards, Yan, Zi