----- Original Message -----
> On Fri, Mar 22, 2013 at 09:51:39AM -0400, Dave Anderson wrote:
> >
> >
> > ----- Original Message -----
> > > On Thu, Mar 21, 2013 at 03:02:54PM -0400, Dave Anderson wrote:
> > > > If for some reason you can't get them, I can make them available
> > > > to you.
> > > > And Lei Wen can also give you a sample dumpfile from his environment.
> > >
> > > Got them from Luc.
> > >
> > > > > Are you able to access module symbols on ARM dump (the one
> > > > > that Luc provided)?
> > > > > Or is it failing completely?
> > > >
> > > > I *think* so...
> > > >
> > > > This module text disassembly looks right:
> > > >
> > > > crash> dis usbnet_suspend
> > > > 0xbf000ae8 <usbnet_suspend>:      push    {r3, r4, r5, lr}
> > > > 0xbf000aec <usbnet_suspend+4>:    add     r0, r0, #32
> > > > 0xbf000af0 <usbnet_suspend+8>:    mov     r5, r1
> > > > 0xbf000af4 <usbnet_suspend+12>:   bl      0xc01b8264 <dev_get_drvdata>
> > > > 0xbf000af8 <usbnet_suspend+16>:   ldrb    r3, [r0, #36]   ; 0x24
> > > > 0xbf000afc <usbnet_suspend+20>:   mov     r4, r0
> > > > 0xbf000b00 <usbnet_suspend+24>:   add     r2, r3, #1
> > > > 0xbf000b04 <usbnet_suspend+28>:   cmp     r3, #0
> > > > 0xbf000b08 <usbnet_suspend+32>:   strb    r2, [r0, #36]   ; 0x24
> > > > 0xbf000b0c <usbnet_suspend+36>:   bne     0xbf000bdc <usbnet_suspend+244>
> > > > 0xbf000b10 <usbnet_suspend+40>:   mrs     r3, CPSR
> > > > 0xbf000b14 <usbnet_suspend+44>:   orr     r3, r3, #128    ; 0x80
> > > > 0xbf000b18 <usbnet_suspend+48>:   msr     CPSR_c, r3
> > > > 0xbf000b1c <usbnet_suspend+52>:   mov     r0, #1
> > > > 0xbf000b20 <usbnet_suspend+56>:   bl      0xc0015f40 <add_preempt_count>
> > > > 0xbf000b24 <usbnet_suspend+60>:   ldr     r3, [r4, #200]  ; 0xc8
> > > > 0xbf000b28 <usbnet_suspend+64>:   cmp     r3, #0
> > > > 0xbf000b2c <usbnet_suspend+68>:   beq     0xbf000b70 <usbnet_suspend+136>
> > > > 0xbf000b30 <usbnet_suspend+72>:   tst     r5, #1024       ; 0x400
> > > > 0xbf000b34 <usbnet_suspend+76>:   beq     0xbf000b70 <usbnet_suspend+136>
> > > > 0xbf000b38 <usbnet_suspend+80>:   mrs     r3, CPSR
> > > > ...
> > > >
> > > > This (r) data looks OK:
> > > >
> > > > crash> p smsc95xx_netdev_ops
> > > > smsc95xx_netdev_ops = $8 = {
> > > >   ndo_init = 0,
> > > >   ndo_uninit = 0,
> > > >   ndo_open = 0xbf000514 <usbnet_open>,
> > > >   ndo_stop = 0xbf000bec <usbnet_stop>,
> > > >   ndo_start_xmit = 0xbf001a60 <usbnet_start_xmit>,
> > > >   ndo_select_queue = 0,
> > > >   ndo_change_rx_flags = 0,
> > > >   ndo_set_rx_mode = 0,
> > > >   ndo_set_multicast_list = 0xbf008abc <smsc95xx_set_multicast>,
> > > >   ndo_set_mac_address = 0xc025d854 <eth_mac_addr>,
> > > >   ndo_validate_addr = 0xc025d6f8 <eth_validate_addr>,
> > > >   ndo_do_ioctl = 0xbf00926c <smsc95xx_ioctl>,
> > > >   ndo_set_config = 0,
> > > >   ndo_change_mtu = 0xbf000de0 <usbnet_change_mtu>,
> > > >   ndo_neigh_setup = 0,
> > > >   ndo_tx_timeout = 0xbf000d4c <usbnet_tx_timeout>,
> > > >   ndo_get_stats64 = 0,
> > > >   ndo_get_stats = 0,
> > > >   ndo_vlan_rx_add_vid = 0,
> > > >   ndo_vlan_rx_kill_vid = 0,
> > > >   ndo_set_vf_mac = 0,
> > > >   ndo_set_vf_vlan = 0,
> > > >   ndo_set_vf_tx_rate = 0,
> > > >   ndo_get_vf_config = 0,
> > > >   ndo_set_vf_port = 0,
> > > >   ndo_get_vf_port = 0,
> > > >   ndo_setup_tc = 0,
> > > >   ndo_add_slave = 0,
> > > >   ndo_del_slave = 0,
> > > >   ndo_fix_features = 0,
> > > > crash>
> > >
> > > I'm able to see the same.
> > >
> > > Setting suitable debug level reveals:
> > >
> > > bf00f040 (bf00f000): scsi_wait_scan  syms: 0  gplsyms: 0   ksyms: 1
> > > bf00a1f8 (bf008000): smsc95xx        syms: 0  gplsyms: 0   ksyms: 60
> > > bf002a40 (bf000000): usbnet          syms: 0  gplsyms: 24  ksyms: 65
> > >
> > > The ksyms comes from KALLSYMS and by default it only includes text and
> > > inittext symbols. This explains why Lei is not able to see data etc.
> > > symbols when he runs 'sym -m <module>'.
> > >
> > > So I believe crash on ARM works as it should in this case.
> >
> > I note that the symbols exported by ARM modules prior to mod -[sS]
> > contains a bunch of "$d" and "$a" symbols.
> > The ARM arm_verify_symbol() function rejects symbols of that type, but
> > that is only called if the "mod -[sS]" function is run.
> >
> > In other words, this is the flow during session initialization:
> >
> >   module_init()
> >     store_module_symbols_v2() -> symbols from KALLSYMS + in-kernel module struct
> >
> > And if "mod -[sS]" is done, it goes like this:
> >
> >   cmd_mod()
> >     do_module_cmd()
> >       load_module_symbols()
> >         store_load_module_symbols() -> symbols from module.ko file
> >           machdep->verify_symbol()
> >
> > So the "$d" and "$a" are there from the initialization-time onward.
> >
> > But since store_module_symbols_v2() has never called machdep->verify_symbol()
> > I'm a bit hesitant to make it do so for all architectures without knowing the
> > consequences. But it certainly seems legitimate in the
> > "machine_type("ARM")" case.
>
> Indeed. However, I'm a bit concerned because there is this check:
>
>	if (STREQ(name, "swapper_pg_dir"))
>		machdep->flags |= KSYMS_START;
>
>	if (!name || !strlen(name) || !(machdep->flags & KSYMS_START))
>		return FALSE;
>
> so if the KSYMS_START is not yet set (is that possible?) we might reject a
> valid symbol from a module.
>
> >
> > > > But the user-space vtop is clearly wrong:
> > > >
> > > > crash> vm
> > > > PID: 1495   TASK: c1ef1380  CPU: 0   COMMAND: "bash"
> > > >    MM       PGD      RSS    TOTAL_VM
> > > > c30cd1e0  c1de4000  1484k    2940k
> > > >   VMA       START      END    FLAGS  FILE
> > > > c1e9ae90      8000    c2000 8001875  /bin/bash
> > > > c1e9aee8     c9000    ce000 8101877  /bin/bash
> > > > c1e9af40     ce000    d3000  100077
> > > > c2fc27b0   1247000  1268000  100077
> > > > c2fc2650  4001c000 4001d000  100077
> > > > c1e9af98  40038000 40055000 8000875  /lib/ld-linux.so.3
> > > > c2fc20d0  4005c000 4005d000 8100875  /lib/ld-linux.so.3
> > > > c2fc2758  4005d000 4005e000 8100877  /lib/ld-linux.so.3
> > > > ...
> > > >
> > > >
> > > > crash> vtop 8000
> > > > VIRTUAL   PHYSICAL
> > > > 8000      8000
> > > >
> > > > PAGE DIRECTORY: c1de4000
> > > >   PGD: c1de4000 => 412
> > > >   PMD: c1de4000 => 412
> > > >  PAGE: 0  (1MB)
> > > >
> > > >
> > > >   VMA       START      END    FLAGS  FILE
> > > > c1e9ae90      8000    c2000 8001875  /bin/bash
> > > >
> > > > crash> vtop 4005d000
> > > > VIRTUAL   PHYSICAL
> > > > 4005d000  4005d000
> > > >
> > > > PAGE DIRECTORY: c1de4000
> > > >   PGD: c1de5000 => 40000412
> > > >   PMD: c1de5000 => 40000412
> > > >  PAGE: 40000000  (1MB)
> > > >
> > > >
> > > >   VMA       START      END    FLAGS  FILE
> > > > c2fc2758  4005d000 4005e000 8100877  /lib/ld-linux.so.3
> > >
> > > This is actually a known issue on ARM (just remembered that). When the
> > > crash happens it identity maps the whole address space of the running
> > > process. This has been fixed by upstream commit:
> > >
> > > commit 2c8951ab0c337cb198236df07ad55f9dd4892c26
> > > Author: Will Deacon <will.deacon@xxxxxxx>
> > > Date:   Wed Jun 8 15:53:34 2011 +0100
> > >
> > >     ARM: idmap: use idmap_pgd when setting up mm for reboot
> > >
> > >     For soft-rebooting a system, it is necessary to map the MMU-off code
> > >     with an identity mapping so that execution can continue safely once
> > >     the MMU has been switched off.
> > >
> > >     Currently, switch_mm_for_reboot takes out a 1:1 mapping from 0x0 to
> > >     TASK_SIZE during reboot in the hope that the reset code lives at a
> > >     physical address corresponding to a userspace virtual address.
> > >
> > >     This patch modifies the code so that we switch to the idmap_pgd
> > >     tables, which contain a 1:1 mapping of the cpu_reset code. This has
> > >     the advantage of only remapping the code that we need and also means
> > >     we don't need to worry about allocating a pgd from an atomic context
> > >     in the case that the physical address of the cpu_reset code aliases
> > >     with the virtual space used by the kernel.
> > >
> > > It went in for 3.2 and Luc's kernel is v3.1.1 which explains this.
> > >
> > > If you select any other task vtop should work fine. For example
> > > cron daemon:
> > >
> > > crash> vm
> > > PID: 316    TASK: c2a7c160  CPU: 0   COMMAND: "crond"
> > >    MM       PGD      RSS    TOTAL_VM
> > > c30cd060  c0a70000   836k    2916k
> > >   VMA       START      END    FLAGS  FILE
> > > c1cdd860      8000    15000 8001875  /usr/sbin/crond
> > > c1cddcd8     1c000    1d000 8101875  /usr/sbin/crond
> > > c1d7d758     1d000    1e000 8101877  /usr/sbin/crond
> > > c1cddd88     1e000    9e000  100077
> > > c1d7d5a0    9a4000   9c5000  100077
> > > ...
> > >
> > > crash> vtop 8000
> > > VIRTUAL   PHYSICAL
> > > 8000      c1030000
> > >
> > > PAGE DIRECTORY: c0a70000
> > >   PGD: c0a70000 => c2b3d831
> > >   PMD: c0a70000 => c2b3d831
> > >   PTE: c2b3d020 => c103018f
> > >
> > >  PAGE: c1030000
> > >
> > >   PTE     PHYSICAL  FLAGS
> > > c103018f  c1030000  (PRESENT|YOUNG|EXEC)
> > >
> > >   VMA       START      END    FLAGS  FILE
> > > c1cdd860      8000    15000 8001875  /usr/sbin/crond
> > >
> > >   PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
> > > c047d600  c1030000  c09b1590        0   2 228
> > >
> >
> > OK good, that explains that...
> >
> > Is it something that can be worked-around, or is the original pgd
> > lost forever? If it is not recoverable, then maybe the user-space
> > vtop should recognize that the bait-and-switch has occurred and
> > fail?
>
> In this case the original PGD is lost forever. But we can certainly detect
> that and bail out instead of confusing our users. Maybe something like the
> patch below?
>
> Note that I have not tested it on 3.2+ dump (I have none) but it works on the
> dumps I have.
>
> Per, Jan, any comments on this?
>
> diff --git a/arm.c b/arm.c
> index a3a7c23..03f63e6 100644
> --- a/arm.c
> +++ b/arm.c
> @@ -265,6 +265,10 @@ arm_init(int when)
>  		    STRUCT_EXISTS("pteval_t"))
>  			machdep->flags |= PGTABLE_V2;
>
> +		if (THIS_KERNEL_VERSION >= LINUX(3,2,0) ||
> +		    symbol_exists("idmap_pgd"))
> +			machdep->flags |= IDMAP_PGD;
> +
>  		machdep->section_size_bits = _SECTION_SIZE_BITS;
>  		machdep->max_physmem_bits = _MAX_PHYSMEM_BITS;
>
> @@ -352,6 +356,8 @@ arm_dump_machdep_table(ulong arg)
>  		fprintf(fp, "%sPHYS_BASE", others++ ? "|" : "");
>  	if (machdep->flags & PGTABLE_V2)
>  		fprintf(fp, "%sPGTABLE_V2", others++ ? "|" : "");
> +	if (machdep->flags & IDMAP_PGD)
> +		fprintf(fp, "%sIDMAP_PGD", others++ ? "|" : "");
>  	fprintf(fp, ")\n");
>
>  	fprintf(fp, " kvbase: %lx\n", machdep->kvbase);
> @@ -1042,6 +1048,15 @@ arm_uvtop(struct task_context *tc, ulong uvaddr, physaddr_t *paddr, int verbose)
>  	if (!tc)
>  		error(FATAL, "current context invalid\n");
>
> +	/*
> +	 * Before idmap_pgd was introduced with upstream commit 2c8951ab0c
> +	 * (ARM: idmap: use idmap_pgd when setting up mm for reboot), the
> +	 * panic task pgd was overwritten by soft reboot code, so we can't do
> +	 * any vtop translations.
> +	 */
> +	if (!(machdep->flags & IDMAP_PGD) && tc->task == tt->panic_task)
> +		error(FATAL, "panic task pgd is trashed by soft reboot code\n");
> +
>  	*paddr = 0;
>
>  	if (is_kernel_thread(tc->task) && IS_KVADDR(uvaddr)) {
> diff --git a/defs.h b/defs.h
> index 1f693c3..8b8b9f3 100755
> --- a/defs.h
> +++ b/defs.h
> @@ -4649,6 +4649,7 @@ struct arm_pt_regs {
>  #define KSYMS_START	(0x1)
>  #define PHYS_BASE	(0x2)
>  #define PGTABLE_V2	(0x4)
> +#define IDMAP_PGD	(0x8)
>
>  struct machine_specific {
>  	ulong phys_base;
>

Unless NAK'd by Per or Jan, then consider it queued for crash-6.1.5.

Thanks,
  Dave
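
For reference, on the "$d" and "$a" module symbols discussed earlier in the thread: these are ARM ELF mapping symbols ("$a", "$t", "$d", optionally followed by ".<suffix>"). The following is only a minimal sketch of how such names could be filtered out of a symbol list at initialization time. Both is_arm_mapping_symbol() and drop_mapping_symbols() are hypothetical names made up for this illustration; this is not the code of arm_verify_symbol() or store_module_symbols_v2(), and it deliberately leaves out the KSYMS_START check quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from crash: returns true for ARM ELF
 * mapping symbols, i.e. "$a", "$d" or "$t", either bare or followed
 * by ".<suffix>" (for example "$d.12").
 */
static bool is_arm_mapping_symbol(const char *name)
{
	if (name == NULL || name[0] != '$')
		return false;
	if (name[1] != 'a' && name[1] != 'd' && name[1] != 't')
		return false;
	return name[2] == '\0' || name[2] == '.';
}

/*
 * Hypothetical filter: compacts names[] in place so that only the
 * non-mapping symbols remain at the front, and returns how many
 * entries were kept.
 */
static size_t drop_mapping_symbols(const char **names, size_t count)
{
	size_t kept = 0;

	for (size_t i = 0; i < count; i++) {
		if (!is_arm_mapping_symbol(names[i]))
			names[kept++] = names[i];
	}
	return kept;
}
```

Given a list such as { "$a", "usbnet_open", "$d", "usbnet_stop" }, drop_mapping_symbols() would keep the two usbnet entries. Matching on the name alone sidesteps the question of whether KSYMS_START has been set yet, which is the concern raised above about calling machdep->verify_symbol() directly.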