RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report()) (Kazuhito Hagio)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




----- Original Message -----
> Date: Thu, 7 Nov 2019 16:12:06 +0000
> From: Kazuhito Hagio <k-hagio@xxxxxxxxxxxxx>
> To: Dave Jones <davej@xxxxxxxxxxxxxxxxx>
> Cc: "kexec@xxxxxxxxxxxxxxxxxxx" <kexec@xxxxxxxxxxxxxxxxxxx>
> Subject: RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
> Message-ID: <4AE2DC15AC0B8543882A74EA0D43DBEC035949A4@xxxxxxxxxxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset="iso-2022-jp"
> 
> Hi,
> 
> > -----Original Message-----
> > >  > > There are some other failure cases with non-null data, so maybe
> > >  > > there's >1 bug here.
> > >  > > I've not seen an obvious pattern to this. eg...
> > >  > >
> > >  > > https://pastebin.com/2uM4sBCF
> > >  > >
> > >  >
> > >  > As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> > >  > (i.e. num_loads_dumpfile > 65535):
> > >
> > > Oh, good catch.  These are 256GB machines, so after discarding
> > > everything, that explains why we end up with so many sections.
> > > This also explains why it sometimes works I think, when the discarding
> > > manages to get the total nr headers <64k.
> 
> I also could reproduce this issue on a system with 192GB memory.
> The note was actually overwritten by the following program headers.
> -----
> num_loads_dumpfile=76318                # more than 64k
> ehdr64.e_phnum=10783                    # overflowed
> note.p_offset=0x93708 .p_filesz=0x2958  # The note data is at 0x93708
> note cd_header->offset=0x40
> ...
>     head->off=     90040 load.p_addr= 44552e000 .p_off=  ed270060 ...
>                    ^^^^^ # these headers overwrote the note data.
>     head->off=     a0040 load.p_addr= 445630000 .p_off=  ed272060 ...
> ...
> The dumpfile is saved to dump.Ed25.devel.
> 
> makedumpfile Completed.
> 
> # readelf -a dump.Ed25.devel
> ...
>   Number of program headers:         10783
> ...
> Displaying notes found at file offset 0x00093708 with length 0x00002958:
>   Owner                 Data size       Description
>                        0x00000007       Unknown note type: (0xdbce6060)
>    description data: 00 00 7a 39 fffffff2 ffffff8a ffffffff
> # ../crash vmlinux dump.Ed25.devel
> 
> WARNING: possibly corrupt Elf64_Nhdr: n_namesz: 4185522176 n_descsz: 3
> n_type: f4000
> ...
> WARNING: cannot read linux_banner string
> crash: vmlinux and dump.Ed25.devel do not match!
> -----
> 
> > I think this will be the one of the causes, and had a look at how
> > we can fix it.  If you get a vmcore where this pattern occurs,
> > you can try this tree:
> > https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
> > 
> > Then, the crash utility also needs a patch to support a dumpfile
> > that has more than 64k program headers:
> > https://github.com/k-hagio/crash/tree/support-extended-elf
> 
> These trees look to work well, though need more tests and tweaks.
> -----
> # readelf -a dump.Ed25.test
> ...
>   Number of program headers:         65535 (76319)  <<-- note + loads
> ...
> Displaying notes found at file offset 0x00413748 with length 0x00002958:
>   Owner                 Data size       Description
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
> ...
> # ../crash-test vmlinux dump.Ed25.test
> 
> crash-test> help -D
> vmcore_data:
>                   flags: c0 (KDUMP_LOCAL|KDUMP_ELF64)
>                    ndfd: 3
>                     ofp: 3141560
>             header_size: 4284576
>    num_pt_load_segments: 76318   <<-- loads
>      pt_load_segment[0]:
> -----
> 
> It is possible that the issue occurs on general systems if they have
> large memory, so I'm going to proceed with those patches.

Hi Kazu,

Do you want me to go ahead with the crash utility patch?  It looks
safe enough to apply, and I did test it to make sure there were no
ill-effects with sample ELF dumpfiles.

Thanks,
  Dave


_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux