----- Original Message ----- > > -----Original Message----- > > From: Dave Anderson <anderson@xxxxxxxxxx> > > Sent: Wednesday, June 5, 2019 2:56 PM > > To: Nuno Das Neves <Nuno.Das@xxxxxxxxxxxxx> > > Cc: Nuno Das Neves <Nuno.Das@xxxxxxxxxxxxx> > > Subject: Fwd: [PATCH] Fix unsigned signed comparison > > causing segfault for small VMCOREINFO notes > > > > > > > > ----- Forwarded Message ----- > > From: "Dave Anderson" <anderson@xxxxxxxxxx> > > To: "Discussion list for crash utility usage, maintenance and development" > > <crash-utility@xxxxxxxxxx> > > Sent: Wednesday, June 5, 2019 4:18:02 PM > > Subject: Re: [PATCH] Fix unsigned signed comparison causing > > segfault for small VMCOREINFO notes > > > > > > > > ----- Original Message ----- > > > Hi, > > > > > > This is a fix for a signed/unsigned comparison bug in vmcoreinfo_read_string. > > > The bug causes a segmentation fault if size_vmcoreinfo + 1 is smaller than > > > the length of the key string passed in. > > > > I suppose that's true, but can you describe the instance where that actually happened? > > I'm debugging some issues with vm2core > (https://github.com/Azure/azure-linux-utils/tree/master/vm2core), a tool for > converting Hyper-V snapshots to elf core files that can be analyzed with Crash. > There was a problem where versions of Crash newer than 7.1.5 crashed with elf > files generated by vm2core due to this issue. > > > Can you show the actual note contents as shown by "help -D"? > > Elf64_Nhdr: > n_namesz: 11 ("VMCOREINFO") > n_descsz: 10 > n_type: 0 (unused) > VMCOREINFO > > So it seems the note doesn't really contain valid data as defined in > Documentation/kdump/vmcoreinfo.txt. > > I've done some additional testing and it appears this note isn't needed by > Crash at all - I can simply patch vm2core so it doesn't add the note. > So, I suppose this fix isn't required to solve my particular issue. I don't think you should avoid adding the VMCOREINFO note contents from the kernel. While the vast majority of VMCOREINFO items are only used by makedumpfile(8), several items are used by the crash utility. For example, the x86_64 requires things like the kernel's phys_offset value (at least if it's non-zero), either the relocated "_stext" symbol value or KERNELOFFSET value for KASLR kernels, possibly the KERNEL_IMAGE_SIZE, and to be able to recognize 5-level page tables. > > However, could the VMCOREINFO note legitimately contain a single field e.g. "PAGE_SIZE=4096"? > If so, this is just 14 characters; 14 + 1 < > strlen("NUMBER(KERNEL_IMAGE_SIZE)") - this string is used as the argument to > vmcoreinfo_read_string on line 202 of x86_64.c. No, it would never contains a single field. Here's a typical dump on a 5.0.0 x86_64 kernel: crash> help -D ... [ cut ] ... n_namesz: 11 ("VMCOREINFO") n_descsz: 1980 n_type: 0 (unused) OSRELEASE=5.0.0+ PAGESIZE=4096 SYMBOL(init_uts_ns)=ffffffffa9615540 SYMBOL(node_online_map)=ffffffffa976e068 SYMBOL(swapper_pg_dir)=ffffffffa960e000 SYMBOL(_stext)=ffffffffa8200000 SYMBOL(vmap_area_list)=ffffffffa9666230 SYMBOL(mem_section)=ffff90adbbff8000 LENGTH(mem_section)=2048 SIZE(mem_section)=16 OFFSET(mem_section.section_mem_map)=0 SIZE(page)=64 SIZE(pglist_data)=14016 SIZE(zone)=1280 SIZE(free_area)=72 SIZE(list_head)=16 SIZE(nodemask_t)=8 OFFSET(page.flags)=0 OFFSET(page._refcount)=52 OFFSET(page.mapping)=24 OFFSET(page.lru)=8 OFFSET(page._mapcount)=48 OFFSET(page.private)=40 OFFSET(page.compound_dtor)=16 OFFSET(page.compound_order)=17 OFFSET(page.compound_head)=8 OFFSET(pglist_data.node_zones)=0 OFFSET(pglist_data.nr_zones)=13344 OFFSET(pglist_data.node_start_pfn)=13352 OFFSET(pglist_data.node_spanned_pages)=13368 OFFSET(pglist_data.node_id)=13376 OFFSET(zone.free_area)=192 OFFSET(zone.vm_stat)=1088 OFFSET(zone.spanned_pages)=112 OFFSET(free_area.free_list)=0 OFFSET(list_head.next)=0 OFFSET(list_head.prev)=8 OFFSET(vmap_area.va_start)=0 OFFSET(vmap_area.list)=48 LENGTH(zone.free_area)=11 SYMBOL(log_buf)=ffffffffa964b250 SYMBOL(log_buf_len)=ffffffffa964b24c SYMBOL(log_first_idx)=ffffffffa9d71668 SYMBOL(clear_idx)=ffffffffa9d71634 SYMBOL(log_next_idx)=ffffffffa9d71658 SIZE(printk_log)=16 OFFSET(printk_log.ts_nsec)=0 OFFSET(printk_log.len)=8 OFFSET(printk_log.text_len)=10 OFFSET(printk_log.dict_len)=12 LENGTH(free_area.free_list)=4 NUMBER(NR_FREE_PAGES)=0 NUMBER(PG_lru)=4 NUMBER(PG_private)=13 NUMBER(PG_swapcache)=10 NUMBER(PG_swapbacked)=19 NUMBER(PG_slab)=9 NUMBER(PG_head_mask)=65536 NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129 NUMBER(HUGETLB_PAGE_DTOR)=2 NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257 NUMBER(phys_base)=-497025024 SYMBOL(init_top_pgt)=ffffffffa960e000 NUMBER(pgtable_l5_enabled)=0 SYMBOL(node_data)=ffffffffa976d680 LENGTH(node_data)=64 KERNELOFFSET=27200000 NUMBER(KERNEL_IMAGE_SIZE)=1073741824 NUMBER(sme_mask)=0 CRASHTIME=1555324290 I would guess the problem is how much awareness your vm2core facility has concerning the target kernel, e.g., can it can find the physical memory containing the VMCOREINFO data. On the target system, the physical address and a page-granularity size can be located here: $ cat /sys/kernel/vmcoreinfo 2a21197c0 1000 $ The kernel symbols of the data is "vmcoreinfo_data", and the actual size of the note is "vmcoreinfo_size". Or in 4.19 and later kernels, /proc/kcore now has the VMCOREINFO note appended. The KVM developers went through the same process as you are, and they eventually implemented a mechanism where they pass the vmcoreinfo data from the target kernel to the KVM host that is doing a "virsh dump". In any case, even though it's theoretically impossible to have a zero-length or very small VMCOREINFO note, your patch does makes logical sense, so I will apply it. Thanks, Dave > Regards, > Nuno > > > > > Thanks, > > Dave > > > > > > > > > > Signed-off-by: Nuno Das Neves <nudasnev@xxxxxxxxxxxxx> > > > --- > > > netdump.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/netdump.c b/netdump.c > > > index 40f9cde..d257ecd 100644 > > > --- a/netdump.c > > > +++ b/netdump.c > > > @@ -1838,7 +1838,7 @@ vmcoreinfo_read_string(const char *key) > > > return NULL; > > > > > > /* the '+ 1' is the equal sign */ > > > - for (i = 0; i < (size_vmcoreinfo - key_length + 1); i++) { > > > + for (i = 0; i < (int)(size_vmcoreinfo - key_length + 1); i++) { > > > /* > > > * We must also check if we're at the beginning of VMCOREINFO > > > * or the separating newline is there, and of course if we > > > -- > > > 1.8.3.1 > > > > > > -- > > > Crash-utility mailing list > > > Crash-utility@xxxxxxxxxx > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.redhat.com%2Fmailman%2Flistinfo%2Fcrash- > > utility&data=02%7C01%7CNuno.Das%40microsoft.com%7C3a91428c54b34ed7855d08d6ea008afe%7C72f988bf86f141af91ab2d7cd0 > > 11db47%7C1%7C0%7C636953685382744097&sdata=BVID2b0QYkDmBip0bzPjX%2Fl4vltJtf78JrTE7pnqSIM%3D&reserved=0 > > > > -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility