----- Original Message -----
> On Sun, Feb 03, 2013 at 10:16:37PM -0500, Luc Chouinard wrote:
> >
> > On Feb 1, 2013, at 6:55 PM, "Mika Westerberg"
> > <mika.westerberg@xxxxxx> wrote:
> >
> > > On Fri, Feb 01, 2013 at 11:37:31PM +0200, Mika Westerberg wrote:
> > >> It is important and we should definitely make sure it is available for ARM
> > >> developers. However, I'm working for Intel now so I don't have suitable
> > >> development environment handy.
> > >
> > > And this means that I do have an environment but it takes some time to setup
> > > in order to get the dump. That's why I would like to ask other ARM developers
> > > to try makedumpfile, should they have their environment in a better shape.
> > >
> > > makedumpfile is very easy to use and it really helps to save some space as you
> > > can instruct it to drop all unnecessary pages, like those userspace pages.
> > >
> > > --
> > > Crash-utility mailing list
> > > Crash-utility@xxxxxxxxxx
> > > https://www.redhat.com/mailman/listinfo/crash-utility
> >
> > I will try and get one such core this week.
> > My setup is easy to get up and going again.
>
> Great :)
>
> > Anything specific needed on the makedumpfile run? Default dump level
> > etc....?
>
> Well, at least you could try to drop the unnecessary stuff like userspace
> pages. Maybe just specify dump level of 31 and make sure it is compressed (-c
> option).

Much appreciated, Luc...

And on that note, my description of the ARM header bug is sketchy at best.
Here are the details, showing a sample 32-bit x86 header in comparison to
what's seen in the 32-bit ARM headers I've got on hand.

The compressed kdump has a generic header at the beginning of the dumpfile
at offset 0.
The generic header has a "header_version" that defines how much stuff is
contained within the kdump_sub_header, which is located at the beginning
of the second page of the dump:

  struct kdump_sub_header {
          unsigned long   phys_base;
          int             dump_level;         /* header_version 1 and later */
          int             split;              /* header_version 2 and later */
          unsigned long   start_pfn;          /* header_version 2 and later */
          unsigned long   end_pfn;            /* header_version 2 and later */
          off_t           offset_vmcoreinfo;  /* header_version 3 and later */
          unsigned long   size_vmcoreinfo;    /* header_version 3 and later */
          off_t           offset_note;        /* header_version 4 and later */
          unsigned long   size_note;          /* header_version 4 and later */
          off_t           offset_eraseinfo;   /* header_version 5 and later */
          unsigned long   size_eraseinfo;     /* header_version 5 and later */
  };

The kdump_sub_header is followed by copies of the original ELF notes taken
from the /proc/vmcore file. So with header version 3 dumpfiles, there is
only the singular vmcoreinfo ELF note. Header version 4 and later dumpfiles
also contain other ELF notes and eraseinfo.

The variably-sized ELF notes consist of this 3-word header:

  typedef struct {
          Elf32_Word n_namesz;    /* Length of the note's name. */
          Elf32_Word n_descsz;    /* Length of the note's descriptor. */
          Elf32_Word n_type;      /* Type of the note. */
  } Elf32_Nhdr;

followed by the note's name string (which is of size n_namesz), and then
the note's data (which is of size n_descsz).

So taking a 32-bit x86 compressed kdump as an example, here is the
kdump_sub_header at dumpfile offset 0x1000 (4k page), where the data up to
and including the size_vmcoreinfo field looks like this:

  crash> rd -f 0x1000 8
      1000:  00000000 0000001f 00000000 00000000   ................
      1010:  00000000 00001564 00000000 000004c6   ....d...........
  crash>

so the phys_base is 00000000, the dump_level is 0x1f (31), and the next
three fields (split, start_pfn and end_pfn) are unused. The
offset_vmcoreinfo is 64-bits, so it is 0000000000001564, and the
size_vmcoreinfo is 4c6.

Per design, the offset_vmcoreinfo offset points to the actual data, although
the 3-word (12-byte) Elf32_Nhdr and its name string (padded to 12 bytes, so
24 bytes in all) do precede it in the dumpfile. So looking at the dumpfile
just before the offset_vmcoreinfo value of 0x1564, we see the Elf32_Nhdr,
followed by its name string, and then the vmcoreinfo data starting with the
"OSRELEASE=" string:

  crash> rd -f 154c 40
      154c:  0000000b 000004c6 00000000 4f434d56   ............VMCO
      155c:  4e494552 00004f46 4552534f 5341454c   REINFO..OSRELEAS
      156c:  2e323d45 32332e36 3931312d 366c652e   E=2.6.32-119.el6
      157c:  3836692e 41500a36 49534547 343d455a   .i686.PAGESIZE=4
      158c:  0a363930 424d5953 69284c4f 5f74696e   096.SYMBOL(init_
      159c:  5f737475 3d29736e 66393063 30343362   uts_ns)=c09fb340
      15ac:  4d59530a 284c4f42 65646f6e 6c6e6f5f   .SYMBOL(node_onl
      15bc:  5f656e69 2970616d 6130633d 36356135   ine_map)=c0a5a56
      15cc:  59530a34 4c4f424d 61777328 72657070   4.SYMBOL(swapper
      15dc:  5f67705f 29726964 3930633d 30303366   _pg_dir)=c09f300
  crash>

i.e., the n_namesz is 0xb (the string length of "VMCOREINFO" plus the NUL
terminator), the n_descsz is the vmcoreinfo data size of 4c6, and the
n_type is 0. Then comes the note's "VMCOREINFO" name string, followed by
the actual 4c6 bytes of vmcoreinfo data.

But here's what is seen in an ARM compressed kdump, where there are two
inconsistencies, one in the kdump_sub_header itself, and one with respect
to the vmcoreinfo note:

  crash> rd -f 1000 40
      1000:  80000000 0000001f 00000000 00000000   ................
      1010:  00000000 00000000 00001028 00000000   ........(.......
      1020:  0000046d 00000000 4552534f 5341454c   m.......OSRELEAS
      1030:  2e323d45 38332e36 3263722d 3230302d   E=2.6.38-rc2-002
      1040:  672d3437 33306631 2d633432 74726964   74-g1f0324c-dirt
      1050:  41500a79 49534547 343d455a 0a363930   y.PAGESIZE=4096.
      1060:  424d5953 69284c4f 5f74696e 5f737475   SYMBOL(init_uts_
      1070:  3d29736e 39353063 38363330 4d59530a   ns)=c0590368.SYM
      1080:  284c4f42 65646f6e 6c6e6f5f 5f656e69   BOL(node_online_
      1090:  2970616d 3530633d 32636338 59530a63   map)=c058cc2c.SY
  crash>

The phys_base is 80000000, the dump_level is 1f, and the split, start_pfn
and end_pfn values are all 0. So far so good. But the 64-bit
offset_vmcoreinfo does not start immediately at offset 1014 like the x86
example. Instead there's another 00000000 field there, effectively pushing
the offset_vmcoreinfo up by 4 bytes to offset 1018.

Now, the "pushed-up" offset_vmcoreinfo value of 0000000000001028 does
correctly point to the beginning of the actual vmcoreinfo data at 0x1028.
However, note that the associated Elf32_Nhdr and its "VMCOREINFO" name
string (24 bytes in all) that should precede the actual vmcoreinfo data
were not copied to the dumpfile. There is a single 0 field at offset
1024 -- I'm not sure what that is supposed to be.

And again, the sample ARM dumpfiles that I've got were created back in the
header_version 3 timeframe, and perhaps things have been resolved in later
header_versions, so that's what would be interesting to see first. So if by
chance the "crash --osrelease vmcore" option works OK, then perhaps one or
both of the issues have been addressed.

When you get everything set up, let us know -- this should be fairly easy
to debug.

Thanks again,
  Dave