RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())

Hi,

> -----Original Message-----
> >  > > There are some other failure cases with non-null data, so maybe there's >1 bug here.
> >  > > I've not seen an obvious pattern to this. eg...
> >  > >
> >  > > https://pastebin.com/2uM4sBCF
> >  > >
> >  >
> >  > As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> >  > (i.e. num_loads_dumpfile > 65535):
> >
> > Oh, good catch.  These are 256GB machines, so after discarding
> > everything, that explains why we end up with so many sections.
> > This also explains why it sometimes works I think, when the discarding
> > manages to get the total nr headers <64k.

I could also reproduce this issue on a system with 192GB of memory.
The note data was in fact overwritten by the program headers, as shown below.
-----
num_loads_dumpfile=76318                # more than 64k
ehdr64.e_phnum=10783                    # overflowed
note.p_offset=0x93708 .p_filesz=0x2958  # The note data is at 0x93708
note cd_header->offset=0x40
...
    head->off=     90040 load.p_addr= 44552e000 .p_off=  ed270060 ...
                   ^^^^^ # these headers overwrote the note data.
    head->off=     a0040 load.p_addr= 445630000 .p_off=  ed272060 ...
...
The dumpfile is saved to dump.Ed25.devel.

makedumpfile Completed.

# readelf -a dump.Ed25.devel 
...
  Number of program headers:         10783
...
Displaying notes found at file offset 0x00093708 with length 0x00002958:
  Owner                 Data size       Description
                       0x00000007       Unknown note type: (0xdbce6060)
   description data: 00 00 7a 39 fffffff2 ffffff8a ffffffff
# ../crash vmlinux dump.Ed25.devel

WARNING: possibly corrupt Elf64_Nhdr: n_namesz: 4185522176 n_descsz: 3 n_type: f4000
...
WARNING: cannot read linux_banner string
crash: vmlinux and dump.Ed25.devel do not match!
-----
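
As a rough check, the numbers in the log above are consistent with a plain 16-bit wrap of the header count. The sketch below assumes the standard ELF64 sizes (64 bytes for Elf64_Ehdr, 56 bytes for Elf64_Phdr), one PT_NOTE header in addition to the loads, and that the note offset is derived from the wrapped e_phnum. The function names are hypothetical and are not taken from makedumpfile.

```python
# Standard ELF64 structure sizes (assumed from the ELF64 layout, not from makedumpfile).
EHDR_SIZE = 64   # sizeof(Elf64_Ehdr)
PHDR_SIZE = 56   # sizeof(Elf64_Phdr)

def truncated_phnum(num_loads, num_notes=1):
    """e_phnum is a 16-bit field, so the header count wraps modulo 65536."""
    return (num_loads + num_notes) & 0xFFFF

def phdr_table_end(phnum):
    """File offset just past a program header table that holds phnum entries."""
    return EHDR_SIZE + phnum * PHDR_SIZE

num_loads = 76318                          # num_loads_dumpfile from the log
e_phnum = truncated_phnum(num_loads)       # value that ends up in the ELF header
note_offset = phdr_table_end(e_phnum)      # note position if sized from e_phnum
real_end = phdr_table_end(num_loads + 1)   # where the written headers really end

print(e_phnum, hex(note_offset), hex(real_end))
# prints: 10783 0x93708 0x413708
```

Under these assumptions the note sits at 0x93708 while the header table runs to 0x413708, so every header written past 0x93708 lands on top of the note data.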

> I think this will be the one of the causes, and had a look at how
> we can fix it.  If you get a vmcore where this pattern occurs,
> you can try this tree:
> https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
> 
> Then, the crash utility also needs a patch to support a dumpfile
> that has more than 64k program headers:
> https://github.com/k-hagio/crash/tree/support-extended-elf

These trees seem to work well, though they need more testing and tweaks.
-----
# readelf -a dump.Ed25.test
...
  Number of program headers:         65535 (76319)  <<-- note + loads
...
Displaying notes found at file offset 0x00413748 with length 0x00002958:
  Owner                 Data size       Description
  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
...
# ../crash-test vmlinux dump.Ed25.test

crash-test> help -D
vmcore_data: 
                  flags: c0 (KDUMP_LOCAL|KDUMP_ELF64) 
                   ndfd: 3
                    ofp: 3141560
            header_size: 4284576
   num_pt_load_segments: 76318   <<-- loads
     pt_load_segment[0]:
-----
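
The "65535 (76319)" in the readelf output matches the ELF extended numbering convention: when there are 0xffff or more program headers, e_phnum holds PN_XNUM (0xffff) and the real count is stored in sh_info of section header 0. A minimal sketch of how a reader could recover the count follows. It assumes a little-endian ELF64 file, and make_image() is a hypothetical helper that builds a header-only test image; it is not how the trees above are implemented.

```python
import struct

PN_XNUM = 0xFFFF
EHDR_FMT = "<16sHHIQQQIHHHHHH"   # Elf64_Ehdr, little-endian, 64 bytes
SHDR_FMT = "<IIQQQQIIQQ"         # Elf64_Shdr, little-endian, 64 bytes

def real_phnum(image):
    """Return the program header count, following the PN_XNUM escape."""
    ehdr = struct.unpack_from(EHDR_FMT, image, 0)
    e_shoff, e_phnum = ehdr[6], ehdr[10]
    if e_phnum != PN_XNUM:
        return e_phnum
    shdr0 = struct.unpack_from(SHDR_FMT, image, e_shoff)
    return shdr0[7]              # sh_info of section header 0

def make_image(phnum):
    """Hypothetical helper: an ELF header plus one section header, no segments."""
    extended = phnum >= PN_XNUM
    ident = b"\x7fELF" + bytes([2, 1, 1])        # ELFCLASS64, ELFDATA2LSB, EV_CURRENT
    ehdr = struct.pack(EHDR_FMT, ident, 4, 62, 1, 0, 128,
                       64 if extended else 0,    # e_shoff
                       0, 64, 56,
                       PN_XNUM if extended else phnum,
                       64, 1 if extended else 0, 0)
    shdr0 = struct.pack(SHDR_FMT, 0, 0, 0, 0, 0, 0, 0,
                        phnum if extended else 0,  # sh_info carries the real count
                        0, 0)
    return ehdr + shdr0

print(real_phnum(make_image(76319)), real_phnum(make_image(10783)))
# prints: 76319 10783
```

A reader that ignores the escape value would see 65535 headers and miss the remainder, which is presumably why the crash utility needs a matching change.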

This issue can probably occur on any system with a large amount of
memory, so I'm going to proceed with those patches.

Thanks,
Kazu





