[Crash-utility] Re: [PATCH 2/2] Add vmemmap support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




在 2023/12/1 22:34, Shijie Huang 写道:

在 2023/12/1 16:08, HAGIO KAZUHITO(萩尾 一仁) 写道:
On 2023/11/30 17:26, Shijie Huang wrote:
Hi Kazu,

在 2023/11/30 15:29, HAGIO KAZUHITO(萩尾 一仁) 写道:
On 2023/11/30 16:21, HAGIO KAZUHITO(萩尾 一仁) wrote:
thanks for the patch.

On 2023/11/28 12:41, Huang Shijie wrote:
If the kernel exports the vmmemap then we can use that symbol in
crash to optimize access. vmmemap is just an array of page structs
after all.

This patch tries to get the "vmemmap" from the vmcore file.
If we can use the "vmemmap", use the vmemmap_is_page_ptr to replace
the generic_is_page_ptr().
I recalled that I also did a similar thing for x86_64 [1] in the past..
so it's not so slow on x86_64 and it has valid section check too.
Maybe your vmlinux is not so big as mine(441M)...
It's not so small :)

crash> files | grep -m1 vmlinux
    5 ffff97c21fa2cf00 ffff97c2400e5200 ffff97c0f5cf5ec8 REG
/usr/lib/debug/usr/lib/modules/4.18.0-305.el8.x86_64/vmlinux
crash> !ls -lh /usr/lib/debug/usr/lib/modules/4.18.0-305.el8.x86_64/vmlinux
-rwxr-xr-x. 3 root root 826M Apr 29  2021
/usr/lib/debug/usr/lib/modules/4.18.0-305.el8.x86_64/vmlinux
crash> files -n ffff97c0f5cf5ec8 | ts.py
16:42:30.902 START
16:42:30.902      INODE        NRPAGES
16:42:30.902 ffff97c0f5cf5ec8   211283
16:42:30.902
16:42:31.106      NODE           PAGES
16:42:31.106       0            211283
16:42:31.106 END (ELAPS 0:00:00.204)

(This crash has your "files -n" v2 patch and is_page_ptr() check at the
beginning of summary_inode_page)

[1]
https://github.com/crash-utility/crash/commit/4141373d9de3fea29f5d2b58f60e44bc132726c0

I don't object to add an entry to vmcoreinfo as a way of getting the
vmemmap value.  Seems like arm64 has another value from VMEMMAP_START.

but there are some concerns in crash:

- machdep->flags already has VMEMMAP flag.  It's used like "if
(IS_SPARSEMEM() && (machdep->flags & VMEMMAP)" e.g. [1].  Adding the new
flag is very confusing and inconsistent.

- Do all architectures have VMEMMAP_VADDR/END values in crash with
CONFIG_SPARSEMEM_VMEMMAP configured?
yes. your concern are right. I found some arch does not support the
VMEMMAP_VADDR/END.


So it is better to add this into ARM64 first. (I use the arm64 machine
all the time..)


we cannot test all archs, so this kind of arch-independent change makes me nervous.  I would like to suggest an arch-specific use first.  e.g.
how about making x86_64_is_page_ptr more generic and use it?

I will add arm64 version code. I am not familar with x86, and I do not
have x86 machine for the testing.

so it's better to not change it.
OK, I see.


What happens if you use it on arm64?  maybe the valid section can check it with VMEMMAP_VADDR without the vmemmap value.  Just an idea though.
hmm sorry, probably this does not work, I missed this.

+        pfn = (addr - vt->vmemmap) / size;
The code is converted by the linux code:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/memory_model.h?h=v6.7-rc3#n38

#define __page_to_pfn(page) (unsigned long)((page) - vmemmap)


So it's correct.


but can VMEMMAP_VADDR and phys_offset can be used to get vmemmap value
on arm64?
On arm64, I can get vmemmap by parsing "memstart_addr".

But I cannot know if CONFIG_SPARSEMEM_VMEMMAP is enabled.
on x86_64, vmemmap_populate is used:

                          if (symbol_exists("vmemmap_populate"))
                                  machdep->flags |= VMEMMAP;

on arm64, it's set unconditionally, is it ok?

arm64_init(int when)
{
...
          case PRE_GDB:
...
                  machdep->stacksize = ARM64_STACK_SIZE;
                  machdep->flags |= VMEMMAP;


So if the kernel can export "vmemmap", everything become easy.

All the archs supporting the VMEMMAP can benifit from it, such as arm64,
ppc,loongarch,riscv..
Yes but we're not sure whether the users of the other archs have the
same issue (e.g. x86_64 is fast enough) and it's hard to test all archs.

I suddenly realised that the reason why x86_64 is fast is because the x86 has already implement

the vmemmap (or part of the vmemmap).

I read the x86 code, and find this:

https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/pgtable_64.h#L257

           #define vmemmap ((struct page *)VMEMMAP_START)


In crash code, the x86_64_is_page_ptr():

--
1812:x86_64_is_page_ptr(ulong addr, physaddr_t *phys)
1813-{
1814-   ulong pfn, nr;
1815-
1816-   if (IS_SPARSEMEM() && (machdep->flags & VMEMMAP) &&
1817-       (addr >= VMEMMAP_VADDR && addr <= VMEMMAP_END) &&
1818-       !((addr - VMEMMAP_VADDR) % SIZE(page))) {
1819-
1820-           pfn = (addr - VMEMMAP_VADDR) / SIZE(page);
1821-           nr = pfn_to_section_nr(pfn);
1822-           if (valid_section_nr(nr)) {
1823-                   if (phys)
1824-                           *phys = PTOB(pfn);
1825-                   return TRUE;
1826-           }
1827-   }
1828-   return FALSE;
1829-}


In x86, the VMEMMAP_VADDR is equal to "vmemmap". Since the current kernel does not export "vmemmap",

x86 cannot use the vmemmap too, so it uses the VMEMMAP_VADDR in the code.

In logic, the x86_64_is_page_ptr() is wrong, it gets the correct result by accident.


If the "vmemmap" is exported by kernel, the x86_64_is_page_ptr() and arm64_vmemmap_is_page_ptr() can merge into

one function, and rename it to "vmemmap_is_page_ptr()", and move it to a common place, such as memory.c.


Thanks

Huang Shijie



--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux