Yeah, this is mostly copy-pasted from the sboot code and would need
some cleaning up.
I've been playing a little more with other bits of the hardware,
writing some test fw from scratch, mostly without using the builtin
ROM (except for interrupts).

Oleksij Rempel wrote:
> On 08.06.2017 at 00:39, Tobias Diedrich wrote:
> > Oleksij Rempel wrote:
> >> On 07.06.2017 at 02:12, Tobias Diedrich wrote:
> >>> Oleksij Rempel wrote:
> >>>> Yes, this is "normal" problem. The firmware has no error handler for PCI
> >>>> bus related exceptions. So if we filed to read PCI bus first time, we
> >>>> have choice to Ooops and stall or Ooops and reboot ASAP. So we reboot
> >>>> and provide an kernel "firmware panic!" message.
> >>>> Every one who can or will to fix this, is welcome.
> >>>>
> >>>>> *****
> >>>>> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath: firmware panic!
> >>>>> exccause: 0x0000000d; pc: 0x0090ae81; badvaddr: 0x10ff4038.
> >>> [...]
> >>>
> >>>> memdmp 50ae78 50ae88
> >>>
> >>> 50ae78: 6c10 0412 6aa2 0c02 0088 20c0 2008 1940 l...j..........@
> >>>
> >>> [...copy to bin...]
> >>> $ bin/objdump -b binary -m xtensa -D /tmp/memdump.bin
> >>> [..]
> >>>    0:   6c1004      entry   a1, 32
> >>>    3:   126aa2      l32r    a2, 0xfffdaa8c
> >>>    6:   0c0200      memw
> >>>    9:   8820        l32i.n  a8, a2, 0 <----------Exception cause PC still points at load
> >>>    b:   c020        movi.n  a2, 0
> >>>    d:   081940      extui   a9, a8, 1, 1
> >>>
> >>> Judging from that it should be fairly simple to at least implement
> >>> some sort of retry, possible after triggering a PCIe link retrain?
> >>
> >> I assume, yes.
> >>
> >>> There are some related PCIe root complex registers that may point to
> >>> what exactly failed if they were dumped.
> >>>
> >>> The root complex registers live at 0x00040000 and I think match the
> >>> registers described for the root complex in the AR9344 datasheet.
> >>
> >> Suddenly I don't have ar7010 docs to tell..
> >>
> >>> PCIE_INT_MASK would map to 0x40050 and has a bit for SYS_ERR:
> >>> "A system error. The RC Core asserts CFG_SYS_ERR_RC if any device in
> >>> the hierarchy reports any of the following errors and the associated
> >>> enable bit is set in the Root Control register: ERR_COR, ERR_FATAL,
> >>> ERR_NONFATAL."
> >>>
> >>> AFAICS link retrain can be done by setting bit3 (INIT_RST,
> >>> "Application request to initiate a training reset") in
> >>> PCIE_APP (0x40000).
> >>>
> >>> See sboot/magpie_1_1/sboot/cmnos/eeprom/src/cmnos_eeprom.c (which
> >>> flips some bits in the RC to enable the PCIe bus for reading the
> >>> EEPROM).
> >>>
> >>> The root complex pci configuration space is at 0x20000 which could
> >>> have further error details:
> >>>> memdmp 20000 20200
> >>>
> >>> 020000: a02a 168c 0010 0006 0000 0001 0001 0000 .*..............
> >>> 020010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020030: 0000 0000 0000 0040 0000 0000 0000 01ff .......@........
> >>> 020040: 5bc3 5001 0000 0000 0000 0000 0000 0000 [.P.............
> >>> 020050: 0080 7005 0000 0000 0000 0000 0000 0000 ..p.............
> >>> 020060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020070: 0042 0010 0000 8701 0000 2010 0013 4411 .B............D.
> >>> 020080: 3011 0000 0000 0000 00c0 03c0 0000 0000 0...............
> >>> 020090: 0000 0000 0000 0010 0000 0000 0000 0000 ................
> >>> 0200a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0200b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0200c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0200d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0200e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0200f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020100: 1401 0001 0000 0000 0000 0000 0006 2030 ...............0
> >>> 020110: 0000 0000 0000 2000 0000 00a0 0000 0000 ................
> >>> 020120: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020130: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020140: 0001 0002 0000 0000 0000 0000 0000 0000 ................
> >>> 020150: 0000 0000 8000 00ff 0000 0000 0000 0000 ................
> >>> 020160: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020170: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 020190: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>> 0201f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> >>>
> >>> Transformed into something suitable for feeding into lspci -F:
> >>>
> >>> 00:00.0 Description filled in by lspci
> >>> 00: 8c 16 2a a0 06 00 10 00 01 00 00 00 00 00 01 00
> >>> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> 30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
> >>> 40: 01 50 c3 5b 00 00 00 00 00 00 00 00 00 00 00 00
> >>> 50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> 70: 10 00 42 00 01 87 00 00 10 20 00 00 11 44 13 00
> >>> 80: 00 00 11 30 00 00 00 00 c0 03 c0 00 00 00 00 00
> >>> 90: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
> >>> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>
> >>> $ lspci -F /tmp/hexdump -vvv
> >>> 00:00.0 Non-VGA unclassified device: Qualcomm Atheros Device a02a (rev 01)
> >>> !!! Invalid class 0000 for header type 01
> >>> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> >>> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >>> 	Latency: 0
> >>> 	Interrupt: pin A routed to IRQ 255
> >>> 	Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
> >>> 	I/O behind bridge: 00000000-00000fff
> >>> 	Memory behind bridge: 00000000-000fffff
> >>> 	Prefetchable memory behind bridge: 00000000-000fffff
> >>> 	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
> >>> 	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
> >>> 		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> >>> 	Capabilities: [40] Power Management version 3
> >>> 		Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
> >>> 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> >>> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
> >>> 		Address: 0000000000000000 Data: 0000
> >>> 	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
> >>> 		DevCap: MaxPayload 256 bytes, PhantFunc 0
> >>> 			ExtTag- RBE+
> >>> 		DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> >>> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
> >>> 			MaxPayload 128 bytes, MaxReadReq 512 bytes
> >>> 		DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> >>> 		LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <1us, L1 <64us
> >>> 			ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
> >>> 		LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> >>> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >>> 		LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> >>> 		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> >>> 		RootCap: CRSVisible-
> >>> 		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> >>> 		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
> >>> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
> >>> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> >>> 			Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> >>> 			Compliance De-emphasis: -6dB
> >>> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> >>> 			EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >>>
> >>
> >> Looks promising :)
> >>
> >
> > POC seems to work, though this may additionally need to restore wifi
> > state as well, no guarantees there.
>
> This probably will be next topic. Can you address some comments in the
> review and create a pull request in the github repo?
>
> >
> >> str 40018 3
> > 00040018 : 00000003
> >>
> > Retry(1) failed PCIe access @0x10ff4038
> > Before: int_mask=0 app=ffc1 reset=0
> > After: int_mask=0 app=ffc1 reset=7
> > wlan int status=0
> >
> >> str 40018 3
> > 00040018 : 00000003
> >>
> > Retry(1) failed PCIe access @0x10ff4038
> > Before: int_mask=0 app=ffc1 reset=0
> > After: int_mask=0 app=ffc1 reset=7
> > wlan int status=0
> >>
> >
> >
> > diff --git a/target_firmware/magpie_fw_dev/target/init/app_start.c b/target_firmware/magpie_fw_dev/target/init/app_start.c
> > index 8fa9c8b..fea62c1 100644
> > --- a/target_firmware/magpie_fw_dev/target/init/app_start.c
> > +++ b/target_firmware/magpie_fw_dev/target/init/app_start.c
> > @@ -137,6 +137,13 @@ void __section(boot) __noreturn __visible app_start(void)
> >
> >  	A_PRINTF(" A_WDT_INIT()\n\r");
> >
> > +#if defined(PROJECT_MAGPIE)
>
> please, use /**/ style comments.
>
> > +	// For some reason needs to be called again here for the
> > +	// exception handlers to work properly, at least on the XBOX
> > +	// adapter.
> > +	fatal_exception_func();
> > +#endif
> > +
> >  #if defined(PROJECT_K2)
> >  	save_cmnos_printf = fw_cmnos_printf;
> >  #endif
> > diff --git a/target_firmware/magpie_fw_dev/target/init/init.c b/target_firmware/magpie_fw_dev/target/init/init.c
> > index 7484c05..cad2519 100755
> > --- a/target_firmware/magpie_fw_dev/target/init/init.c
> > +++ b/target_firmware/magpie_fw_dev/target/init/init.c
> > @@ -212,6 +212,78 @@ LOCAL void zfGenWrongEpidEvent(uint32_t epid)
> >  	mUSB_EP3_XFER_DONE();
> >  }
> >
> > +static void
> > +AR7010_pcie_reset(void)
> > +{
> > +#define PCIE_RC_ACCESS_DELAY 20
> > +
> > +#define PCI_RC_RESET_BIT BIT6
> > +#define PCI_RC_PHY_RESET_BIT BIT7
> > +#define PCI_RC_PLL_RESET_BIT BIT8
> > +#define PCI_RC_PHY_SHIFT_RESET_BIT BIT10
> > +
> > +#define HAL_WORD_REG_WRITE(addr, val) do { *((uint32_t*)(addr)) = val; } while (0)
> > +#define HAL_WORD_REG_READ(addr) (*((uint32_t*)(addr)))
>
> we already have iowrite32* ioread32* functions, why do we need more?
>
> > +#define CMD_PCI_RC_RESET_ON() HAL_WORD_REG_WRITE(MAGPIE_REG_RST_RESET_ADDR, \
> > +	(HAL_WORD_REG_READ(MAGPIE_REG_RST_RESET_ADDR)| \
> > +	(PCI_RC_PHY_SHIFT_RESET_BIT|PCI_RC_PLL_RESET_BIT|PCI_RC_PHY_RESET_BIT|PCI_RC_RESET_BIT)))
> > +
> > +#define CMD_PCI_RC_RESET_CLR() HAL_WORD_REG_WRITE(MAGPIE_REG_RST_RESET_ADDR, \
> > +	(HAL_WORD_REG_READ(MAGPIE_REG_RST_RESET_ADDR)& \
> > +	(~(PCI_RC_PHY_SHIFT_RESET_BIT|PCI_RC_PLL_RESET_BIT|PCI_RC_PHY_RESET_BIT|PCI_RC_RESET_BIT))))
> > +
> > +	int i;
> > +
> > +	CMD_PCI_RC_RESET_ON();
> > +	A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +
> > +	/* dereset the reset */
> > +	CMD_PCI_RC_RESET_CLR();
> > +	A_DELAY_USECS(500);
> > +
> > +	/* 7. set bus master and memory space enable */
> > +	DEBUG_SYSTEM_STATE = (DEBUG_SYSTEM_STATE&(~0xff)) | 0x45;
> > +	HAL_WORD_REG_WRITE(0x00020004, (HAL_WORD_REG_READ(0x00020004)|(BIT1|BIT2)));
> > +	A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +
> > +	/* 7.5. asser pcie_ep reset */
> > +	HAL_WORD_REG_WRITE(0x00040018, (HAL_WORD_REG_READ(0x00040018) & ~(0x1 << 2)));
> > +	A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +
> > +	/* 7.5. de-asser pcie_ep reset */
> > +	HAL_WORD_REG_WRITE(0x00040018, (HAL_WORD_REG_READ(0x00040018)|(0x1 << 2)));
> > +	A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +
> > +	/* 8. set app_ltssm_enable */
> > +	DEBUG_SYSTEM_STATE = (DEBUG_SYSTEM_STATE&(~0xff)) | 0x46;
> > +	HAL_WORD_REG_WRITE(0x00040000, (HAL_WORD_REG_READ(0x00040000)|0xffc1));
> > +
> > +	/*!
> > +	 * Receive control (PCIE_RESET),
> > +	 *  0x40018, BIT0: LINK_UP, PHY Link up -PHY Link up/down indicator
> > +	 *  in case the link up is not ready and we access the 0x14000000,
> > +	 *  vmc will hang here
> > +	 */
> > +
> > +	/* poll 0x40018/bit0 (1000 times) until it turns to 1 */
> > +	i = 10000;
> > +	while(i-->0)
> > +	{
> > +		uint32_t reg_value = HAL_WORD_REG_READ(0x00040018);
> > +		if( reg_value & BIT0 )
> > +			break;
> > +		A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +	}
> > +
> > +	HAL_WORD_REG_WRITE(0x14000004, (HAL_WORD_REG_READ(0x14000004)|0x116));
> > +	A_DELAY_USECS(PCIE_RC_ACCESS_DELAY);
> > +
> > +	HAL_WORD_REG_WRITE(0x14000010, (HAL_WORD_REG_READ(0x14000010)|EEPROM_CTRL_BASE));
> > +}
> > +
> > +static int exception_retries = 0;
> > +
> >  void
> >  AR6002_fatal_exception_handler_patch(CPU_exception_frame_t *exc_frame)
> >  {
> > @@ -226,6 +298,32 @@ AR6002_fatal_exception_handler_patch(CPU_exception_frame_t *exc_frame)
> >  	dump.pc = exc_frame->xt_pc;
> >  	dump.assline = 0;
>
> i would prefer to put it in to separate function. may be, complete pci
> code in a separate file?
>
> > +	if (dump.badvaddr >= 0x10000000 &&
> > +	    dump.badvaddr < 0x18000000) {
>
> if (!bla)
> 	return;
>
> > +		// Exception while accessing PCIe memory space.
> > +		volatile uint32_t *pcie_app = (uint32_t*) 0x40000;
> > +		volatile uint32_t *pcie_reset = (uint32_t*) 0x40018;
> > +		volatile uint32_t *pcie_int_mask = (uint32_t*) 0x40050;
>
> magic values should be replaced.
>
> > +		// Maybe retry.
> > +		if (++exception_retries < 2) {
>
> if (!bla)
> 	return;
>
> > +			A_PRINTF("\nRetry(%d) failed PCIe access @0x%x\n",
> > +				 exception_retries, dump.badvaddr);
> > +			A_PRINTF("Before: int_mask=%x app=%x reset=%x\n", *pcie_int_mask, *pcie_app, *pcie_reset);
> > +
> > +			AR7010_pcie_reset();
> > +
> > +			A_PRINTF("After: int_mask=%x app=%x reset=%x\n", *pcie_int_mask, *pcie_app, *pcie_reset);
> > +
> > +			// This should recurse if we failed to recover.
> > +			A_PRINTF("wlan int status=%x\n", HAL_WORD_REG_READ(0x10ff4038));
> > +
> > +			// Reset retry counter.
> > +			exception_retries = 0;
> > +			return;
> > +		}
> > +	}
> > +
> >  	zfGenExceptionEvent(dump.exc_frame.xt_exccause, dump.pc, dump.badvaddr);
> >
> >  #if SYSTEM_MODULE_PRINT
>
> I'm exciting to see it mainline. Thank you for your work!
>
> --
> Regards,
> Oleksij
>
> _______________________________________________
> ath9k_htc_fw mailing list
> ath9k_htc_fw@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/ath9k_htc_fw

-- 
Tobias
PGP: http://8ef7ddba.uguu.de
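
P.S. For the cleanup, the review comments (early returns, a separate
function, no magic values) could be folded into something like the
host-side sketch below. It only models the retry bookkeeping from the
patch: the constant names, pcie_try_recover() and the reset callback are
hypothetical and not part of the firmware tree, and the callback stands
in for AR7010_pcie_reset() plus the probing read of 0x10ff4038 (which on
the real hardware would fault and re-enter the handler rather than
return false).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the magic values used in the patch. */
#define PCIE_MEM_BASE     0x10000000u /* first address of the PCIe window */
#define PCIE_MEM_END      0x18000000u /* first address past the window */
#define PCIE_MAX_RETRIES  1           /* same bound as "++retries < 2" */

static int exception_retries;

/* Stand-in for "reset the link, then probe the device once". */
typedef bool (*pcie_reset_fn)(void);

static bool is_pcie_addr(uint32_t badvaddr)
{
	return badvaddr >= PCIE_MEM_BASE && badvaddr < PCIE_MEM_END;
}

/*
 * Returns true if the faulting access may be retried, false if the
 * caller should fall through to the normal firmware panic path.
 */
static bool pcie_try_recover(uint32_t badvaddr, pcie_reset_fn reset)
{
	if (!is_pcie_addr(badvaddr))
		return false;

	if (++exception_retries > PCIE_MAX_RETRIES)
		return false;

	if (!reset())
		return false; /* counter stays raised, next fault panics */

	exception_retries = 0;
	return true;
}

/* Fake reset callbacks, only for trying the logic on a host. */
static bool fake_reset_ok(void)   { return true; }
static bool fake_reset_fail(void) { return false; }
```

With this shape the exception handler itself shrinks to a single
"if (pcie_try_recover(dump.badvaddr, ...)) return;" ahead of
zfGenExceptionEvent().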
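
On the memdmp-to-lspci step quoted above: comparing the two dumps in
this thread, each pair of 16-bit groups printed by memdmp appears to be
one 32-bit word shown most-significant byte first, while lspci -F wants
the config space as little-endian bytes. A minimal sketch of that
conversion for one 16-byte row follows; the function names are made up
for illustration and the byte order is inferred from the dumps, not from
any documentation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Convert the eight 16-bit groups of one memdmp row into the 16 bytes
 * lspci -F expects. Groups 2*w and 2*w+1 form one 32-bit word; its
 * bytes are emitted least-significant first.
 */
static void memdmp_row_to_bytes(const uint16_t groups[8], uint8_t out[16])
{
	for (size_t w = 0; w < 4; w++) {
		uint16_t hi = groups[2 * w];     /* upper half of the word */
		uint16_t lo = groups[2 * w + 1]; /* lower half of the word */

		out[4 * w + 0] = (uint8_t)(lo & 0xff);
		out[4 * w + 1] = (uint8_t)(lo >> 8);
		out[4 * w + 2] = (uint8_t)(hi & 0xff);
		out[4 * w + 3] = (uint8_t)(hi >> 8);
	}
}

/* Format one row as "oo: b0 b1 ... bf"; buf should hold 52 bytes. */
static int format_lspci_row(unsigned offset, const uint8_t bytes[16],
			    char *buf, size_t len)
{
	int n = snprintf(buf, len, "%02x:", offset);

	for (size_t i = 0; i < 16 && n > 0 && (size_t)n < len; i++)
		n += snprintf(buf + n, len - (size_t)n, " %02x", bytes[i]);
	return n;
}
```

Feeding it the 020000 row (a02a 168c 0010 0006 0000 0001 0001 0000)
gives the "00:" line of the lspci input shown in the quoted mail.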