> > On 25.10.2019 13:46, Lorenzo Bianconi wrote: > >> On 25.10.2019 01:07, Lorenzo Bianconi wrote: > >>>> On 24.10.2019 23:54, Lorenzo Bianconi wrote: > >>>>>> On 24.10.2019 00:23, Lorenzo Bianconi wrote: > >>>>>>> On same device (e.g. U7612E-H1) PCIE_ASPM causes continuous mcu hangs and > >>>>>>> instability and so let's disable PCIE_ASPM by default. This patch has > >>>>>>> been successfully tested on U7612E-H1 mini-pice card > >>>>>>> > >>>>>>> Signed-off-by: Felix Fietkau <nbd@xxxxxxxx> > >>>>>>> Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx> > >>>>>>> --- > >>>>>>> drivers/net/wireless/mediatek/mt76/mmio.c | 47 +++++++++++++++++++ > >>>>>>> drivers/net/wireless/mediatek/mt76/mt76.h | 1 + > >>>>>>> .../net/wireless/mediatek/mt76/mt76x2/pci.c | 2 + > >>>>>>> 3 files changed, 50 insertions(+) > >>>>>>> > >>>>> > >>>>> [...] > >>>>> > >>>>>>> + > >>>>>>> + if (parent) > >>>>>>> + pcie_capability_clear_word(parent, PCI_EXP_LNKCTL, > >>>>>>> + aspm_conf); > >>>>>> > >>>>>> + linux-pci mailing list > >>>>> > >>>>> Hi Heiner, > >>>>> > >>>>>> > >>>>>> All this seems to be legacy code copied from e1000e. > >>>>>> Fiddling with the low-level PCI(e) registers should be left to the > >>>>>> PCI core. It shouldn't be needed here, a simple call to > >>>>>> pci_disable_link_state() should be sufficient. Note that this function > >>>>>> has a return value meanwhile that you can check instead of reading > >>>>>> back low-level registers. > >>>>> > >>>>> ack, I will add it to v2 > >>>>> > >>>>>> If BIOS forbids that OS changes ASPM settings, then this should be > >>>>>> respected (like PCI core does). Instead the network chip may provide > >>>>>> the option to configure whether it activates certain ASPM (sub-)states > >>>>>> or not. We went through a similar exercise with the r8169 driver, > >>>>>> you can check how it's done there. > >>>>> > >>>>> looking at the vendor sdk (at least in the version I currently have) there are > >>>>> no particular ASPM configurations, it just optionally disables it writing directly > >>>>> in pci registers. > >>>>> Moreover there are multiple drivers that are currently using this approach: > >>>>> - ath9k in ath_pci_aspm_init() > >>>>> - tg3 in tg3_chip_reset() > >>>>> - e1000e in __e1000e_disable_aspm() > >>>>> - r8169 in rtl_enable_clock_request()/rtl_disable_clock_request() > >>>>> > >>>> All these drivers include quite some legacy code. I can mainly speak for r8169: > >>>> First versions of the driver are almost as old as Linux. And even though I > >>>> refactored most of the driver still some legacy code for older chip versions > >>>> (like the two functions you mentioned) is included. > >>>> > >>>>> Is disabling the ASPM for the system the only option to make this minipcie > >>>>> work? > >>>>> > >>>> > >>>> No. What we do in r8169: > >>>> > >>>> - call pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1) > >>>> - If it returns 0, then ASPM (including the L1 sub-states) is disabled. > >>>> - If it returns an errno, then disabling ASPM failed (most likely due to > >>>> BIOS forbidding ASPM changes - pci_disable_link_state will spit out > >>>> a related warning). In this case r8169 configures the chip to not initiate > >>>> transitions to L0s/L1 (the other end of the link may still try to enter > >>>> ASPM states). See rtl_hw_aspm_clkreq_enable(). That's sufficient > >>>> to avoid the ASPM-related problems with certain versions of this chip. > >>>> Maybe your HW provides similar functionality. > >>> > >>> yep, I looked at rtl_hw_aspm_clkreq_enable. This is more or less what I did but > >>> unfortunately there is no specific code or documentation I can use for mt76x2e. > >>> So as last chance I decided to disable ASPM directly (in this way the chip is > >>> working fine). > >>> Do you think a kernel parameter to disable ASPM directly would be acceptable? > >>> > >> Module parameters are not the preferred approach, even though some maintainers > >> may consider it acceptable. I think it should be ok if you disable ASPM per > >> default. Who wants ASPM can enable the individual states via brand-new > >> sysfs attributes (provided BIOS allows OS to control ASPM). > >> However changing ASPM settings via direct register writes may cause > >> inconsistencies between PCI core and actual settings. > >> I'm not sure whether there's any general best practice how to deal with the > >> scenario that a device misbehaves with ASPM enabled and OS isn't allowed to > >> change ASPM settings. > >> Maybe the PCI guys can advise on these points. > > > > Hi Heiner, > > > > I reviewed the mtk sdk and it seems mt7662/mt7612/mt7602 series does not > > have hw pcie ps support (not sure if it just not implemented or so). In my > > scenario without disabling ASPM the card does not work at all, so I guess we > > can proceed with current approach and then try to understand if we can do > > something better. What do you think? > > > > If there's no proper way to deal with the issue, then you may have to go with > the current approach. Just what you could do is calling pci_disable_link_state() > first and do the direct register modification limbo only as a fallback if > function returns an errno or CONFIG_PCIEASPM isn't defined. Hi Heiner, ack, will do in v2 > > It may make sense to change the return value of the pci_disable_link_state > dummy if CONFIG_PCIEASPM isn't defined: from 0 to e.g. -EOPNOTSUPP > Then in the driver it wouldn't be needed to check CONFIG_PCIEASPM. > > Interestingly in mt76x2/pci.c there's the following: > > /* Fix up ASPM configuration */ > > /* RG_SSUSB_G1_CDR_BIR_LTR = 0x9 */ > mt76_rmw_field(dev, 0x15a10, 0x1f << 16, 0x9); > > /* RG_SSUSB_G1_CDR_BIC_LTR = 0xf */ > mt76_rmw_field(dev, 0x15a0c, 0xf << 28, 0xf); > > /* RG_SSUSB_CDR_BR_PE1D = 0x3 */ > mt76_rmw_field(dev, 0x15c58, 0x3 << 6, 0x3); > > Would be helpful to know what this does and whether it may need to be adjusted. > At least in your case this fix attempt doesn't seem to be sufficient. unfortunately not. Regards, Lorenzo > > > > @Ryder, Sean: do you have any hint on this topic? > > > > Regards, > > Lorenzo > > > >> > >>> Regards, > >>> Lorenzo > >>> > >> Heiner > >> > >>>> > >>>>> Regards, > >>>>> Lorenzo > >>>>> > >>>> Heiner > >>>> > >>>>>> > >>>>>>> +} > >>>>>>> +EXPORT_SYMBOL_GPL(mt76_mmio_disable_aspm); > >>>>>>> + > >>>>>>> void mt76_mmio_init(struct mt76_dev *dev, void __iomem *regs) > >>>>>>> { > >>>>>>> static const struct mt76_bus_ops mt76_mmio_ops = { > >>>>>>> diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h > >>>>>>> index 570c159515a0..962812b6247d 100644 > >>>>>>> --- a/drivers/net/wireless/mediatek/mt76/mt76.h > >>>>>>> +++ b/drivers/net/wireless/mediatek/mt76/mt76.h > >>>>>>> @@ -578,6 +578,7 @@ bool __mt76_poll_msec(struct mt76_dev *dev, u32 offset, u32 mask, u32 val, > >>>>>>> #define mt76_poll_msec(dev, ...) __mt76_poll_msec(&((dev)->mt76), __VA_ARGS__) > >>>>>>> > >>>>>>> void mt76_mmio_init(struct mt76_dev *dev, void __iomem *regs); > >>>>>>> +void mt76_mmio_disable_aspm(struct pci_dev *pdev); > >>>>>>> > >>>>>>> static inline u16 mt76_chip(struct mt76_dev *dev) > >>>>>>> { > >>>>>>> diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/pci.c b/drivers/net/wireless/mediatek/mt76/mt76x2/pci.c > >>>>>>> index 73c3104f8858..264bef87e5c7 100644 > >>>>>>> --- a/drivers/net/wireless/mediatek/mt76/mt76x2/pci.c > >>>>>>> +++ b/drivers/net/wireless/mediatek/mt76/mt76x2/pci.c > >>>>>>> @@ -81,6 +81,8 @@ mt76pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > >>>>>>> /* RG_SSUSB_CDR_BR_PE1D = 0x3 */ > >>>>>>> mt76_rmw_field(dev, 0x15c58, 0x3 << 6, 0x3); > >>>>>>> > >>>>>>> + mt76_mmio_disable_aspm(pdev); > >>>>>>> + > >>>>>>> return 0; > >>>>>>> > >>>>>>> error: > >>>>>>> > >>>>>> > >>>> > >> >