Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access

On 2024/10/15 8:27 PM, Huacai Chen wrote:
On Tue, Oct 15, 2024 at 10:54 AM maobibo <maobibo@xxxxxxxxxxx> wrote:



On 2024/10/14 2:31 PM, Huacai Chen wrote:
Hi, Bibo,

On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@xxxxxxxxxxx> wrote:

A spurious fault can occur if memory is accessed right after the
pte is set. For user address space, the pte is set in kernel space
and the memory is accessed in user space, so there is plenty of
time for synchronization and no barrier is needed. For kernel
address space, however, memory may be accessed right after the
pte is set.

Here flush_cache_vmap/flush_cache_vmap_early is used for
synchronization.

Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
---
   arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
   1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
index f8754d08a31a..53be231319ef 100644
--- a/arch/loongarch/include/asm/cacheflush.h
+++ b/arch/loongarch/include/asm/cacheflush.h
@@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
   #define flush_cache_dup_mm(mm)                         do { } while (0)
   #define flush_cache_range(vma, start, end)             do { } while (0)
   #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
-#define flush_cache_vmap(start, end)                   do { } while (0)
   #define flush_cache_vunmap(start, end)                 do { } while (0)
   #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
   #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
   #define flush_dcache_mmap_unlock(mapping)              do { } while (0)

+/*
+ * It is possible for a kernel virtual mapping access to return a spurious
+ * fault if it's accessed right after the pte is set. The page fault handler
+ * does not expect this type of fault. flush_cache_vmap is not exactly the
+ * right place to put this, but it seems to work well enough.
+ */
+static inline void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+       smp_mb();
+}
+#define flush_cache_vmap flush_cache_vmap
+#define flush_cache_vmap_early flush_cache_vmap
From the history of flush_cache_vmap_early(), it seems only archs with
a "virtual cache" (VIVT or VIPT) need this API, so on LoongArch it can
be a no-op here.
OK, flush_cache_vmap_early() also needs smp_mb().


Here is a use of flush_cache_vmap_early() in linux/mm/percpu.c: it
maps the page and accesses it immediately. Do you think it should be
a no-op on LoongArch?

rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
                      unit_pages);
if (rc < 0)
      panic("failed to map percpu area, err=%d\n", rc);
flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);
/* copy static data */
memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
}
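
As a rough illustration of the map-then-access pattern in the excerpt above, here is a minimal userspace sketch. It is not kernel code: sketch_map_and_copy() and sketch_barrier_calls are hypothetical names, the compiler builtin __atomic_thread_fence() only stands in for smp_mb(), and the #ifndef fallback is an assumed shape for how generic code might supply a no-op default when an arch does not define the hook.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter so the sketch can show that the hook really ran. */
static int sketch_barrier_calls;

/* Arch-side hook: a full fence, standing in for smp_mb() in the patch. */
static inline void flush_cache_vmap(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	sketch_barrier_calls++;
}
/* Self-referential defines let other headers see the hooks are provided. */
#define flush_cache_vmap flush_cache_vmap
#define flush_cache_vmap_early flush_cache_vmap

/* Assumed generic fallback: a no-op, used only if the hook is undefined. */
#ifndef flush_cache_vmap_early
#define flush_cache_vmap_early(start, end) do { } while (0)
#endif

/* Map-then-access, mirroring the shape of the percpu.c excerpt. */
static size_t sketch_map_and_copy(char *unit, size_t unit_size,
				  const char *src, size_t len)
{
	assert(unit != NULL && src != NULL);
	if (len > unit_size)
		return 0;
	/* A real kernel would install the ptes for 'unit' at this point. */
	flush_cache_vmap_early((unsigned long)unit,
			       (unsigned long)unit + unit_size);
	/* First access to the freshly "mapped" range. */
	memcpy(unit, src, len);
	return len;
}
```

If flush_cache_vmap_early() were left as a no-op, nothing in this sequence would sit between installing the mapping and the memcpy(), which is the case the question above is about.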



And I still think flush_cache_vunmap() should be a smp_mb(). A
smp_mb() in flush_cache_vmap() prevents subsequent accesses from being
reordered before pte_set(), and a smp_mb() in flush_cache_vunmap()
smp_mb() in flush_cache_vmap() does not prevent reordering. It is there to
flush the pipeline and let the page table walker HW sync with the data cache.

For the following example.
    rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
                    VM_MAP | VM_USERMAP, PAGE_KERNEL);
    if (rb) {
<<<<<<<<<<< * The if (rb) check prevents reordering here. Otherwise any
kmalloc/vmap/vmalloc call followed by a memory access would have a
reordering issue. *
        kmemleak_not_leak(pages);
        rb->pages = pages;
        rb->nr_pages = nr_pages;
        return rb;
    }

prevents preceding accesses from being reordered after pte_clear(). This
Can you give an example of such a use of flush_cache_vunmap()? Then
we can continue to discuss it; otherwise it is just guessing.
Since we cannot reach a consensus, and the flush_cache_* API looks very
strange for this purpose (yes, I know PowerPC does it like this, but
ARM64 doesn't), I prefer to still use the ARM64 method, which means adding
a dbar in set_pte(). Of course the performance will be a little worse,
but still better than the old version, and it is more robust.
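
To make the "barrier in set_pte()" idea concrete, here is a minimal userspace sketch. It is only an illustration under stated assumptions, not the LoongArch or ARM64 implementation: sketch_pte_t, sketch_set_pte(), sketch_dbar() and the SKETCH_PTE_* bit positions are all hypothetical, and a compiler builtin fence stands in for the dbar instruction. Restricting the barrier to valid kernel entries follows the patch description, which says user mappings need no barrier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t val; } sketch_pte_t;

/* Hypothetical bit layout, chosen only for this sketch. */
#define SKETCH_PTE_VALID  (1ULL << 0)
#define SKETCH_PTE_GLOBAL (1ULL << 6)	/* stands in for "kernel mapping" */

/* Hypothetical counter so callers can observe when the barrier ran. */
static int sketch_dbar_calls;

/* Stand-in for a LoongArch dbar: a full fence via a compiler builtin. */
static inline void sketch_dbar(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	sketch_dbar_calls++;
}

static inline void sketch_set_pte(sketch_pte_t *ptep, sketch_pte_t pte)
{
	assert(ptep != NULL);

	/* Publish the entry with a single store. */
	__atomic_store_n(&ptep->val, pte.val, __ATOMIC_RELAXED);

	/*
	 * Only a valid kernel entry may be accessed right after the
	 * store, so only that case pays for the barrier.
	 */
	if ((pte.val & SKETCH_PTE_VALID) && (pte.val & SKETCH_PTE_GLOBAL))
		sketch_dbar();
}
```

With this shape every caller that installs a kernel mapping gets the barrier without having to remember a separate flush_cache_* call, at the cost of one barrier per pte written.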

I know you are very busy, so if you have no time you don't need to
send V3, I can just do a small modification on the 3rd patch.
No, I will send V3 myself. I will drop this patch from the patchset, since in actual testing vmalloc_test works well even without it on a 3C5000 dual-way machine. Also, the weak function kernel_pte_init will be replaced with an inline function, rebased on

https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-define-general-function-pxd_init.patch

I dislike the copy-paste method without further understanding :(. I also copy and paste code, but at least I try my best to understand it.

Regards
Bibo Mao


Huacai


Regards
Bibo Mao
potential problem may not show up in experiments, but it is needed in
theory.

Huacai

+
   #define cache_op(op, addr)                                             \
          __asm__ __volatile__(                                           \
          "       cacop   %0, %1                                  \n"     \
--
2.39.3








