Re: [External] Re: [patch 079/147] fs/proc/kcore.c: add mmap interface

Feng Zhou <zhoufeng.zf@xxxxxxxxxxxxx> · Thu, 9 Sep 2021 17:56:55 +0800

    在 2021/9/9 上午2:13, Linus Torvalds 写道:

      On Tue, Sep 7, 2021 at 7:57 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

        After looking at the kcore code, it is found that kcore does not implement
mmap, resulting in frequent context switching triggered by read.
Therefore, we want to add mmap interface to optimize performance.  Since
vmalloc and module areas will change with allocation and release,
consistency cannot be guaranteed, so mmap interface only maps KCORE_TEXT
and KCORE_RAM.

      Honestly, I still hate this patch.

The last time people wanted to speed up /dev/kcore accesses, it was
all for black-hat reasons and speeding up kernel attacks.

And this code just makes me nervous even aside from that, because I do
not understand what the heck it's doing.

        +       if (kern_addr_valid(start)) {
+               if (m->type == KCORE_RAM)
+                       pfn = __pa(start) >> PAGE_SHIFT;
+               else if (m->type == KCORE_TEXT)
+                       pfn = __pa_symbol(start) >> PAGE_SHIFT;

      Why is "__pa(start)" right in one situation, and "__pa_symbol(start)"
in another.

    Hi, Linus
    The use here is refer to "read_kcore" in fs/proc/kcore.c.
    list_for_each_entry(m, &kclist_head, list) {
    ...
    if (m->type == KCORE_RAM || m->type == KCORE_REMAP)
        phdr->p_paddr = __pa(m->addr);
    else if (m->type == KCORE_TEXT)
        phdr->p_paddr = __pa_symbol(m->addr);
    ...
}
    Ensure consistency of usage.

So this just makes me go "this is all confusing, dangerous, and the
use-case is dubious".

Mapping kernel memory is dangerous. The use-cases for it are dubious.
The patch isn't obvious.

    Kcore mmap kernel memory just has read permission.

    +	if (vma->vm_flags & (VM_WRITE | VM_EXEC)) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC);
+	vma->vm_flags |= VM_MIXEDMAP;
+	vma->vm_ops = &kcore_mmap_ops;

Compared to the read interface, kcore mmap has no increased risk, just 
reduce context switching.

All of that screams "I'll skip this".

           Linus