----- Original Message ----- > > > ----- Original Message ----- > > Hi Dave, > > > > Thank you for your comment! > > > > On 2/13/2018 2:33 PM, Dave Anderson wrote: > > > > > > Hi Kasuhito, > > > > > > I am planning on releasing crash-7.2.1 today, so this will have to be deferred > > > to 7.2.2. > > > > > > Because of your questions about ppc64, possible backwards-compatibility issues, > > > or potential future changes to page.flags usage, this permanent change to the > > > is_page_ptr() function solely for the purposes of SLUB's count_partial() function > > > makes me nervous. > > > > Sorry, my explanation was not enough. Please let me supplement a little. > > As far as I know, the count_partial() is the function which could receive the > > effect of this patch most. But is_page_ptr() could be called many times also > > through the other functions, so this will improve them, too. Moreover, the > > is_page_ptr()'s loop could loop NR_MEM_SECTIONS() times, which is 33554432 > > on x86_64(5level) if I understand correctly, in the worst case. > > > > So I thought that ideally we should improve the is_page_ptr() itself if possible, > > rather than find the callers which could call it many times and change them. > > Also, I am willing to drop the definition of VMEMMAP_VADDR for ppc64 this > > time. > > OK, I understand your point. But what concerns me is that the function's > purpose is to absolutely identify whether the incoming page structure address > is a correct page structure address. But if an invalid address gets passed > into is_page_ptr(), your patch would take the invalid address, calculate an > invalid "nr", and continue from there, right? Another suggestion/question -- if is_page_ptr() is called with a NULL phys argument (as is done most of the time), could it skip the "if IS_SPARSEMEM()" section at the top, and still utilize the part at the bottom, where it walks through the vt->node_table[x] array? I'm not sure about the "ppend" calculation though -- even if there are holes in the node's address space, is it still a contiguous chunk of page structure addresses per-node? > > Dave > > > > > > > > > There is really no compelling reason that count_partial() absolutely > > > *must* use > > > is_page_ptr(), and so I'm thinking that perhaps you could come up with a > > > less > > > heavy-handed method for simply testing whether a page.lru entry points to > > > another > > > vmemmap'd page. Something along the lines of adding this for > > > vmemmap-enabled kernels: > > > > > > #define IN_VMEMMAP_RANGE(page) ((page >= VMEMMAP_VADDR) && (page <= > > > VMEMMAP_END)) > > > > > > and then have count_partial() replace the is_page_ptr() call with another > > > slub function that does something like this for vmemmap-enabled kernels: > > > > > > (IN_VMMEMAP_RANGE(next) && accessible(next)) > > > > > > Or instead of accessible(), it could read "next" as a list_head with > > > RETURN_ON_ERROR, > > > and verify that next->prev points back to the current list_head. > > > > > > Non-vmemmap-enabled kernels could still use is_page_ptr(). > > > > > > What do you think of doing something like that? > > > > Given possible compatibility issues you said, I think that the way you > > suggested > > might well be enough for now. I'll try a method like the above. > > > > Thanks, > > Kazuhito Hagio > > > > > > > > Dave > > > > > > > > > > > > > > > ----- Original Message ----- > > >> Hi, > > >> > > >> The "kmem -[sS]" commands can take several minutes to complete with > > >> the following conditions: > > >> * The system has a lot of memory sections with CONFIG_SPARSEMEM. > > >> * The kernel uses SLUB and it has a very long partial slab list. > > >> > > >> crash> kmem -s dentry | awk '{print strftime("%T"), $0}' > > >> 10:18:34 CACHE NAME OBJSIZE ALLOCATED > > >> TOTAL > > >> SLABS SSIZE > > >> 10:19:41 ffff88017fc78a00 dentry 192 9038949 > > >> 10045728 > > >> 239184 8k > > >> crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo $SECONDS' > > >> 334 > > >> > > >> One of the causes is that is_page_ptr() in count_partial() checks if > > >> a given slub page address is a page struct by searching all memory > > >> sections linearly for the one which includes it. > > >> > > >> nr_mem_sections = NR_MEM_SECTIONS(); > > >> for (nr = 0; nr < nr_mem_sections ; nr++) { > > >> if ((sec_addr = valid_section_nr(nr))) { > > >> ... > > >> > > >> With CONFIG_SPARSEMEM{_VMEMMAP}, we can calculate the memory section > > >> which includes a page struct with its page.flags, or its address and > > >> VMEMMAP_VADDR. With this patch doing so, the computation amount can be > > >> significantly reduced in that case. > > >> > > >> crash> kmem -s dentry | awk '{print strftime("%T"), $0}' > > >> 10:34:55 CACHE NAME OBJSIZE ALLOCATED > > >> TOTAL > > >> SLABS SSIZE > > >> 10:34:55 ffff88017fc78a00 dentry 192 9038949 > > >> 10045728 > > >> 239184 8k > > >> crash> kmem -S dentry | bash -c 'cat >/dev/null ; echo $SECONDS' > > >> 2 > > >> > > >> This patch uses VMEMMAP_VADDR. It is not defined on PPC64, but it looks > > >> like PPC64 supports VMEMMAP flag and machdep->machspec->vmemmap_base is > > >> it, so this patch also defines it for PPC64. This might need some help > > >> from PPC folks. > > >> > > >> Signed-off-by: Kazuhito Hagio <k-hagio@xxxxxxxxxxxxx> > > >> --- > > >> defs.h | 2 ++ > > >> memory.c | 15 +++++++++++++++ > > >> 2 files changed, 17 insertions(+) > > >> > > >> diff --git a/defs.h b/defs.h > > >> index aa17792..84e68ca 100644 > > >> --- a/defs.h > > >> +++ b/defs.h > > >> @@ -3861,6 +3861,8 @@ struct efi_memory_desc_t { > > >> #define IS_VMALLOC_ADDR(X) machdep->machspec->is_vmaddr(X) > > >> #define KERNELBASE machdep->pageoffset > > >> > > >> +#define VMEMMAP_VADDR (machdep->machspec->vmemmap_base) > > >> + > > >> #define PGDIR_SHIFT (machdep->pageshift + (machdep->pageshift -3) + > > >> (machdep->pageshift - 2)) > > >> #define PMD_SHIFT (machdep->pageshift + (machdep->pageshift - 3)) > > >> > > >> diff --git a/memory.c b/memory.c > > >> index 0df8ecc..0696763 100644 > > >> --- a/memory.c > > >> +++ b/memory.c > > >> @@ -13348,10 +13348,25 @@ is_page_ptr(ulong addr, physaddr_t *phys) > > >> ulong nr_mem_sections; > > >> ulong coded_mem_map, mem_map, end_mem_map; > > >> physaddr_t section_paddr; > > >> +#ifdef VMEMMAP_VADDR > > >> + ulong flags; > > >> +#endif > > >> > > >> if (IS_SPARSEMEM()) { > > >> nr_mem_sections = NR_MEM_SECTIONS(); > > >> +#ifdef VMEMMAP_VADDR > > >> + nr = nr_mem_sections; > > >> + if (machdep->flags & VMEMMAP) > > >> + nr = pfn_to_section_nr((addr - VMEMMAP_VADDR) / SIZE(page)); > > >> + else if (readmem(addr + OFFSET(page_flags), KVADDR, &flags, > > >> + sizeof(ulong), "page.flags", RETURN_ON_ERROR|QUIET)) > > >> + nr = (flags >> (SIZE(page_flags)*8 - SECTIONS_SHIFT()) > > >> + & ((1UL << SECTIONS_SHIFT()) - 1)); > > >> + > > >> + if (nr < nr_mem_sections) { > > >> +#else > > >> for (nr = 0; nr < nr_mem_sections ; nr++) { > > >> +#endif > > >> if ((sec_addr = valid_section_nr(nr))) { > > >> coded_mem_map = > > >> section_mem_map_addr(sec_addr); > > >> mem_map = sparse_decode_mem_map(coded_mem_map, > > >> nr); > > >> -- > > >> 1.8.3.1 > > >> > > >> -- > > >> Crash-utility mailing list > > >> Crash-utility@xxxxxxxxxx > > >> https://www.redhat.com/mailman/listinfo/crash-utility > > >> > > > > > > -- > > > Crash-utility mailing list > > > Crash-utility@xxxxxxxxxx > > > https://www.redhat.com/mailman/listinfo/crash-utility > > > > > > > -- > > Crash-utility mailing list > > Crash-utility@xxxxxxxxxx > > https://www.redhat.com/mailman/listinfo/crash-utility > > > > -- > Crash-utility mailing list > Crash-utility@xxxxxxxxxx > https://www.redhat.com/mailman/listinfo/crash-utility > -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility