----- Original Message -----
>
> OK, now I'm getting confused...
>

The more I look at this patch, the more confused I get...

During initialization, the ELF notes contained in the dumpfile header are scanned, and if an NT_PRSTATUS note is seen, a pointer to its location in the dumpfile is saved in dd->nt_prstatus_percpu[num], and the number of valid notes is kept in dd->num_prstatus_notes.

If dd->num_prstatus_notes is equal to the online cpu count, then it is presumed that there is a one-to-one relationship, where the cpu number can be used as the index into the dd->nt_prstatus_percpu[num] array.

If the number of notes is not equal to the number of online cpus, then the "mapping" function is called, where if a cpu is found to be offline, its (incorrectly) associated entry in the dd->nt_prstatus_percpu[num] array is "pushed up" to the next higher entry.

But dd->num_prstatus_notes does not seem to get incremented to reflect that move, so it seems like diskdump_get_prstatus_percpu() can possibly return NULL when there actually is a relevant NT_PRSTATUS note.

That seems to be a bug (?), but it's not particularly important, because for x86 and x86_64, the data in the NT_PRSTATUS notes is only used as the starting point for backtraces if the PC/SP pair cannot be determined otherwise, and it can be determined otherwise virtually all of the time. So the registers found in the NT_PRSTATUS notes are pretty much useless...

Now, to complicate matters, your patch does not look at the NT_PRSTATUS notes in the dumpfile header, but instead looks at the base kernel's original notes, verifies their contents, and correlates what's found there against what was found in the dumpfile?

So I don't understand what you are attempting to do -- what is the difference between the notes that are copied into the dumpfile vs. what you are looking at in the base kernel?

I'm also wondering what would happen in your case if there were a combination of "lost" notes *and* offline cpus?
How would that work?

So at this point I really don't want to add this patch at all because it touches common code, and I don't want to risk breaking the other arches. Nobody has ever reported any "lost" cpus so far, probably because the kdump facility uses NMIs to shut down the non-panicking cpus. This is such a highly unlikely corner case that it does not even seem worth addressing for fear of breaking something else.

I didn't look into why you ran into a segmentation violation, but since the PPC code path would be:

  ...
  back_trace()
  get_diskdump_regs()
  get_diskdump_regs_ppc()

perhaps you can rework your patch so that it is segregated to PPC only?

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
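
P.S. To spell out the dd->num_prstatus_notes concern from above, here is a rough toy model in Python. It is not the crash source: map_notes_to_cpus(), get_prstatus(), the update_count switch and the note values are all made up for illustration. It also assumes that the lookup bounds-checks the cpu number against the note count, and it only models the "push up" step itself, not the check that decides whether the mapping function gets called.

```python
# Toy model of the bookkeeping described above. This is NOT the crash source:
# every name below is hypothetical and exists only for illustration.

def map_notes_to_cpus(notes, online, update_count):
    """Spread the notes over cpu slots, skipping offline cpus.

    notes        -- NT_PRSTATUS notes in dumpfile order
    online       -- online[cpu] is True if that cpu was online
    update_count -- whether to grow the note count as entries move up
    Returns (per_cpu, count).
    """
    per_cpu = [None] * len(online)
    count = len(notes)              # count as recorded during the header scan
    remaining = iter(notes)
    for cpu, is_online in enumerate(online):
        if not is_online:
            continue                # offline cpu: its slot stays empty
        note = next(remaining, None)
        if note is None:
            break                   # ran out of notes
        per_cpu[cpu] = note
        if update_count:
            count = max(count, cpu + 1)
    return per_cpu, count


def get_prstatus(per_cpu, count, cpu):
    """Assumed lookup: bounds-check the cpu number against the note count."""
    if cpu < 0 or cpu >= count:
        return None
    return per_cpu[cpu]


# Three cpus, cpu 1 offline, two notes in the dumpfile.
online = [True, False, True]
notes = ["note-A", "note-B"]

# Count left untouched: cpu 2 has a note, but the lookup rejects it.
per_cpu, count = map_notes_to_cpus(notes, online, update_count=False)
stale = get_prstatus(per_cpu, count, 2)

# Count grown to cover the highest populated slot: the note is found.
per_cpu_fixed, count_fixed = map_notes_to_cpus(notes, online, update_count=True)
fixed = get_prstatus(per_cpu_fixed, count_fixed, 2)

print(per_cpu, count, stale)
print(per_cpu_fixed, count_fixed, fixed)
```

In this model the note for cpu 2 ends up in slot 2 either way; the only difference is whether the count was grown to cover that slot, which is what decides if the lookup hands the note back or returns nothing.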