On 03/23/2016 11:26 AM, Lucas Stach wrote:

Hi Lucas,

> On Wednesday, 23.03.2016, at 07:13 -0300, Fabio Estevam wrote:
>> On Tue, Mar 22, 2016 at 1:36 PM, Roberto Fichera <kernel@xxxxxxxxxxxxx> wrote:
>>> Hi All,
>>>
>>> I'm getting a kernel panic with a kernel v4.4.x when enabling interrupt for 2 cards behind a PCIe-to-PCI bridge
>>> not sharing the same IRQ. If they don't share the same IRQ most often I'm getting a message claiming the emmc
>>> is stuck. I've disabled MSI from PCI config because the cards are not getting any interrupt delivered.
> Is the interrupt-map on your PCIe-to-PCI bridge correct? If it isn't the
> CPU may drown in unhandled IRQ storms.

Can you please be more specific? Are you talking about INTA/B/C/D mapping?

>
>> It is always a good idea to put the maintainers on Cc.

Ok!

>> Adding Lucas and Richard.
>>
> Thanks Fabio.
>
> I do _not_ regularly look at the Linux-PCI list. Roberto, if you want
> the right people to respond, please add them to CC.
>
> Regards,
> Lucas
>
>>> The driver's probe routine seems ok to me:
>>>
>>>     dt = (struct devtype *)(ent->driver_data);
>>>     dev_info(&pdev->dev, "probe called for b4xx...\n");
>>>
>>>     if ((ret = pci_enable_device(pdev)))
>>>         goto err_out_disable_pdev;
>>>
>>>     if ((ret = pci_request_regions(pdev, dt->desc))) {
>>>         dev_err(&pdev->dev, "Unable to request regions!\n");
>>>         goto err_out_disable_pdev;
>>>     }
>>>
>>>     if (!pdev->irq) { /* we better have an IRQ */
>>>         dev_err(&pdev->dev, "Device has no associated IRQ?\n");
>>>         ret = -EIO;
>>>         goto err_out_release_regions;
>>>     }
>>>
>>>     ...
>>>
>>>     if (request_irq(pdev->irq, b4xxp_interrupt, IRQF_SHARED, "b4xxp", b4)) {
>>>         dev_err(&b4->pdev->dev, "Unable to request IRQ %d\n",
>>>             pdev->irq);
>>>         ret = -EIO;
>>>         goto err_out_del_from_card_array;
>>>     }
>>>
>>>     /* initialize the tasklet structure */
>>>     /* TODO: perhaps only one tasklet for any number of cards in the system... don't need one per card I don't think. */
>>>     tasklet_init(&b4->b4xxp_tlet, b4xxp_bottom_half, (unsigned long)b4);
>>>
>>>
>>> Here is a typical crash in case of unshared IRQs:
>>>
>>> [ 2.748244] wcb4xxp 0000:02:00.0: probe called for b4xx...
>>> [ 2.753847] pci 0000:01:00.0: enabling device (0140 -> 0143)
>>> [ 2.759627] wcb4xxp 0000:02:00.0: enabling device (0140 -> 0143)
>>> [ 2.766209] wcb4xxp 0000:02:00.0: Identified OpenVox B400P (controller rev 1) at fee01000, *IRQ 290*
>>> [ 2.802498] wcb4xxp 0000:02:00.0: NOTE: hardware echo cancellation has been disabled
>>> [ 2.810853] wcb4xxp 0000:02:00.0: Port 1: NT mode
>>> [ 2.815573] wcb4xxp 0000:02:00.0: Port 1: NT mode
>>> [ 2.820327] wcb4xxp 0000:02:00.0: Port 2: NT mode
>>> [ 2.825060] wcb4xxp 0000:02:00.0: Port 2: NT mode
>>> [ 2.829825] wcb4xxp 0000:02:00.0: Port 3: TE mode
>>> [ 2.834632] wcb4xxp 0000:02:00.0: Port 3: TE mode
>>> [ 2.839421] wcb4xxp 0000:02:00.0: Port 4: TE mode
>>> [ 2.844146] wcb4xxp 0000:02:00.0: Port 4: TE mode
>>> [ 2.877825] wcb4xxp 0000:02:00.0: Did not do the highestorder stuff
>>> [ 2.887252] wcb4xxp 0000:02:02.0: probe called for b4xx...
>>> [ 2.892842] wcb4xxp 0000:02:02.0: enabling device (0140 -> 0143)
>>> [ 2.900029] wcb4xxp 0000:02:02.0: Identified OpenVox B400P (controller rev 1) at fee01008, *IRQ 291*
>>> [ 2.938753] wcb4xxp 0000:02:02.0: Port 1: NT mode
>>> [ 2.943588] wcb4xxp 0000:02:02.0: Port 1: NT mode
>>> [ 2.948341] wcb4xxp 0000:02:02.0: Port 2: NT mode
>>> [ 2.953180] wcb4xxp 0000:02:02.0: Port 2: NT mode
>>> [ 2.957931] wcb4xxp 0000:02:02.0: Port 3: TE mode
>>> [ 2.962776] wcb4xxp 0000:02:02.0: Port 3: TE mode
>>> [ 2.967521] wcb4xxp 0000:02:02.0: Port 4: TE mode
>>> [ 2.972355] wcb4xxp 0000:02:02.0: Port 4: TE mode
>>> [ 3.074940] wcb4xxp 0000:02:02.0: Did not do the highestorder stuff
>>> [ 3.135775] random: nonblocking pool is initialized
>>> [ 17.049407] mmc0: Timeout waiting for hardware interrupt.
>>> [ 29.849303] INFO: rcu_sched self-detected stall on CPU
>>> [ 29.854473] 0-...: (1 GPs behind) idle=d8d/2/0 softirq=91/95 fqs=2600
>>> [ 29.861093] (t=2601 jiffies g=-207 c=-208 q=1005)
>>> [ 29.866005] Task dump for CPU 0:
>>> [ 29.869238] swapper/0 R running 0 0 0 0x00000002
>>> [ 29.875637] Backtrace:
>>> [ 29.878126] [<80014320>] (dump_backtrace) from [<80014514>] (show_stack+0x18/0x1c)
>>> [ 29.885701] r7:806f9a80 r6:8073324e r5:00000000 r4:806f3278
>>> [ 29.891439] [<800144fc>] (show_stack) from [<800545a8>] (sched_show_task+0x11c/0x230)
>>> [ 29.899279] [<8005448c>] (sched_show_task) from [<80056d00>] (dump_cpu_task+0x34/0x44)
>>> [ 29.907201] r6:80060193 r5:806f9a80 r4:00000000
>>> [ 29.911881] [<80056ccc>] (dump_cpu_task) from [<80085a20>] (rcu_dump_cpu_stacks+0x8c/0xd0)
>>> [ 29.920150] r5:806f9a80 r4:00000000
>>> [ 29.923769] [<80085994>] (rcu_dump_cpu_stacks) from [<80089be0>] (rcu_check_callbacks+0x488/0x7b0)
>>> [ 29.932732] r9:806f9a80 r8:806f0a34 r7:6e8a9000 r6:806f08c4 r5:806ed6c0 r4:eef966c0
>>> [ 29.940574] [<80089758>] (rcu_check_callbacks) from [<8008c980>] (update_process_times+0x40/0x6c)
>>> [ 29.949450] r10:8009eca8 r9:eef92d0c r8:eef92d00 r7:00000006 r6:f29a98a6 r5:00000000
>>> [ 29.957369] r4:806f3278
>>> [ 29.959932] [<8008c940>] (update_process_times) from [<8009eca4>] (tick_sched_handle+0x50/0x54)
>>> [ 29.968633] r5:806efe00 r4:eef92fb8
>>> [ 29.972250] [<8009ec54>] (tick_sched_handle) from [<8009ed08>] (tick_sched_timer+0x60/0xac)
>>> [ 29.980612] [<8009eca8>] (tick_sched_timer) from [<8008d780>] (__hrtimer_run_queues+0xc0/0x1e0)
>>> [ 29.989313] r7:00000000 r6:807042e8 r5:eef92fb8 r4:eef92c80
>>> [ 29.995050] [<8008d6c0>] (__hrtimer_run_queues) from [<8008db38>] (hrtimer_interrupt+0xbc/0x214)
>>> [ 30.003838] r10:eef92d38 r9:eef92d58 r8:eef92cc0 r7:eef92d78 r6:ffffffff r5:00000003
>>> [ 30.011756] r4:eef92c80
>>> [ 30.014320] [<8008da7c>] (hrtimer_interrupt) from [<80018454>] (twd_handler+0x34/0x48)
>>> [ 30.022240] r10:804f4fc0 r9:ee808000 r8:00000010 r7:ee81ed00 r6:807044a4 r5:ee81f000
>>> [ 30.030159] r4:00000001
>>> [ 30.032722] [<80018420>] (twd_handler) from [<8007fbc8>] (handle_percpu_devid_irq+0x8c/0xac)
>>> [ 30.041163] r5:ee81f000 r4:eef98c40
>>> [ 30.044785] [<8007fb3c>] (handle_percpu_devid_irq) from [<8007b548>] (generic_handle_irq+0x28/0x3c)
>>> [ 30.053834] r9:ee808000 r8:00000001 r7:806eff00 r6:806f0a34 r5:00000010 r4:806ea5fc
>>> [ 30.061672] [<8007b520>] (generic_handle_irq) from [<8007b864>] (__handle_domain_irq+0x6c/0xe8)
>>> [ 30.070381] [<8007b7f8>] (__handle_domain_irq) from [<800095d8>] (gic_handle_irq+0x48/0x94)
>>> [ 30.078735] r9:f4001100 r8:806f0ba4 r7:f4000100 r6:80704480 r5:806efe00 r4:f400010c
>>> [ 30.086574] [<80009590>] (gic_handle_irq) from [<800150b8>] (__irq_svc+0x58/0x78)
>>> [ 30.094062] Exception stack(0x806efe00 to 0x806efe48)
>>> [ 30.099126] fe00: 00000001 00000000 00000000 806f3278 00000082 00000000 806ee000 00000000
>>> [ 30.107311] fe20: 00000001 ee808000 804f4fc0 806efe94 806efe20 806efe50 8006ec8c 8002d7f0
>>> [ 30.115493] fe40: 60060113 ffffffff
>>> [ 30.118986] r9:ee808000 r8:00000001 r7:806efe34 r6:ffffffff r5:60060113 r4:8002d7f0
>>> [ 30.126828] [<8002d730>] (__do_softirq) from [<8002dd20>] (irq_exit+0xc4/0x138)
>>> [ 30.134141] r10:804f4fc0 r9:ee808000 r8:00000001 r7:00000000 r6:806f0a34 r5:00000000
>>> [ 30.142059] r4:806ea5fc
>>> [ 30.144618] [<8002dc5c>] (irq_exit) from [<8007b86c>] (__handle_domain_irq+0x74/0xe8)
>>> [ 30.152451] r5:00000000 r4:806ea5fc
>>> [ 30.156067] [<8007b7f8>] (__handle_domain_irq) from [<800095d8>] (gic_handle_irq+0x48/0x94)
>>> [ 30.164421] r9:f4001100 r8:806f0ba4 r7:f4000100 r6:80704480 r5:806eff00 r4:f400010c
>>> [ 30.172258] [<80009590>] (gic_handle_irq) from [<800150b8>] (__irq_svc+0x58/0x78)
>>> [ 30.179744] Exception stack(0x806eff00 to 0x806eff48)
>>> [ 30.184805] ff00: 00000001 00000001 00000000 806f3278 806ee000 806f0908 00000000 806f08bc
>>> [ 30.192989] ff20: 806e93e4 00000001 804f4fc0 806eff5c 806eff20 806eff50 8006ed10 80010820
>>> [ 30.201171] ff40: 20060013 ffffffff
>>> [ 30.204663] r9:00000001 r8:806e93e4 r7:806eff34 r6:ffffffff r5:20060013 r4:80010820
>>> [ 30.212501] [<800107f8>] (arch_cpu_idle) from [<800696b8>] (default_idle_call+0x28/0x38)
>>> [ 30.220602] [<80069690>] (default_idle_call) from [<800697d8>] (cpu_startup_entry+0x110/0x1b0)
>>> [ 30.229226] [<800696c8>] (cpu_startup_entry) from [<804ea900>] (rest_init+0x12c/0x16c)
>>> [ 30.237147] r7:806f0800 r4:807331cc
>>> [ 30.240773] [<804ea7d4>] (rest_init) from [<8069fcc8>] (start_kernel+0x360/0x3d4)
>>> [ 30.248259] r5:ffffffff r4:807337cc
>>> [ 30.251878] [<8069f968>] (start_kernel) from [<1000807c>] (0x1000807c)
>>> [ 30.258414] INFO: rcu_sched detected stalls on CPUs/tasks:
>>> [ 30.263929] 0-...: (1 GPs behind) idle=d8d/2/0 softirq=91/95 fqs=2602
>>> [ 30.270550] (detected by 2, t=2602 jiffies, g=-207, c=-208, q=1005)
>>> [ 30.276930] Task dump for CPU 0:
>>> [ 30.280163] swapper/0 R running 0 0 0 0x00000002
>>> [ 30.286557] Backtrace:
>>> [ 30.289028] Backtrace aborted due to bad frame pointer <806eff44>
>>>
>>> root@voneus-janas-imx6q:~# lspci -v
>>> 00:00.0 PCI bridge: Synopsys, Inc. Device abcd (rev 01) (prog-if 00 [Normal decode])
>>>     Flags: bus master, fast devsel, latency 0, IRQ 290
>>>     Memory at 01000000 (32-bit, non-prefetchable) [size=1M]
>>>     Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>>>     I/O behind bridge: 00001000-00001fff
>>>     Memory behind bridge: 01100000-011fffff
>>>     [virtual] Expansion ROM at 01200000 [disabled] [size=64K]
>>>     Capabilities: [40] Power Management version 3
>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>>     Capabilities: [70] Express Root Port (Slot-), MSI 00
>>>     Capabilities: [100] Advanced Error Reporting
>>>     Capabilities: [140] Virtual Channel
>>>     Kernel driver in use: pcieport
>>>
>>> 01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge (prog-if 00 [Normal decode])
>>>     Flags: bus master, fast devsel, latency 0
>>>     Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
>>>     I/O behind bridge: 00001000-00001fff
>>>     Memory behind bridge: 01100000-011fffff
>>>     Capabilities: [40] Subsystem: Device 0000:0000
>>>     Capabilities: [48] Power Management version 3
>>>     Capabilities: [50] MSI: Enable- Count=1/16 Maskable- 64bit+
>>>     Capabilities: [70] Express PCI-Express to PCI/PCI-X Bridge, MSI 00
>>>     Capabilities: [100] Advanced Error Reporting
>>>
>>> 02:00.0 ISDN controller: Cologne Chip Designs GmbH ISDN network Controller [HFC-4S] (rev 01)
>>>     Subsystem: Cologne Chip Designs GmbH HFC-4S [OpenVox B200P / B400P]
>>>     Flags: medium devsel, IRQ 290
>>>     I/O ports at 1000 [size=8]
>>>     Memory at 01100000 (32-bit, non-prefetchable) [size=4K]
>>>     Capabilities: [40] Power Management version 2
>>>     Kernel driver in use: wcb4xxp
>>>     Kernel modules: wcb4xxp
>>>
>>> 02:04.0 ISDN controller: Cologne Chip Designs GmbH ISDN network Controller [HFC-4S] (rev 01)
>>>     Subsystem: Cologne Chip Designs GmbH HFC-4S [OpenVox B200P / B400P]
>>>     Flags: medium devsel, IRQ 290
>>>     I/O ports at 1008 [size=8]
>>>     Memory at 01101000 (32-bit, non-prefetchable) [size=4K]
>>>     Capabilities: [40] Power Management version 2
>>>     Kernel driver in use: wcb4xxp
>>>     Kernel modules: wcb4xxp
>>>
>>> Any idea?
>>>
>>> Cheers,
>>> Roberto Fichera.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
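
On the INTA/B/C/D question above: a conventional PCI-to-PCI bridge rotates ("swizzles") the INTx pin of a device on its secondary bus by the device number, and the interrupt-map is then expected to translate the resulting pin on the upstream side. A minimal sketch of that arithmetic in plain C is below. It uses the same formula as the kernel's pci_swizzle_interrupt_pin(); the names swizzle_pin() and pin_name() are made up for illustration, and it assumes both cards assert INTA, which the thread does not state. Whether the XIO2001 wiring on this board follows the standard rotation is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* INTx pins as encoded in the PCI Interrupt Pin register: 1 = INTA ... 4 = INTD. */
enum intx_pin { INTA = 1, INTB = 2, INTC = 3, INTD = 4 };

/*
 * Conventional PCI-to-PCI bridge swizzle (hypothetical helper name):
 * the pin a device asserts on the secondary bus is rotated by its
 * device (slot) number to give the pin seen on the bridge's primary side.
 */
static uint8_t swizzle_pin(uint8_t slot, uint8_t pin)
{
    assert(pin >= INTA && pin <= INTD);
    return (uint8_t)(((pin - 1 + slot) % 4) + 1);
}

/* Hypothetical helper: printable name for a pin value, "?" if out of range. */
static const char *pin_name(uint8_t pin)
{
    static const char *const names[] = { "?", "INTA", "INTB", "INTC", "INTD" };
    return (pin >= INTA && pin <= INTD) ? names[pin] : names[0];
}
```

Under those assumptions, swizzle_pin(0, INTA) and swizzle_pin(4, INTA) both yield INTA, which would fit 02:00.0 and 02:04.0 reporting the same IRQ in the lspci output, while a card in device slot 2 would come out as INTC and so need its own interrupt-map entry.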