On 24/06/19 4:08 PM, Chris Packham wrote:
> Hi Thomas,
>
> On 21/06/19 6:17 PM, Thomas Petazzoni wrote:
>> Hello Chris,
>>
>> On Fri, 21 Jun 2019 04:03:27 +0000
>> Chris Packham <Chris.Packham@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> I'm in the process of updating the kernel version used on our products
>>> from 4.4 -> 5.1.
>>>
>>> We have one product that uses a Kirkwood CPU, IDT PCI bridge and Marvell
>>> Switch ASIC. The Switch ASIC presents as multiple PCI devices.
>>>
>>> The hardware setup looks like this
>>>                                       __________
>>> [ Kirkwood ] --- [ IDT 5T5 ] ---+--- |          |
>>>                                 +--- |  Switch  |
>>>                                 +--- |          |
>>>                                 +--- |__________|
>>>
>>> On the 4.4 based kernel things are fine
>>>
>>> [root@awplus flash]# lspci -t
>>> -[0000:00]---01.0-[01-06]----00.0-[02-06]--+-02.0-[03]----00.0
>>>                                            +-03.0-[04]----00.0
>>>                                            +-04.0-[05]----00.0
>>>                                            \-05.0-[06]----00.0
>>>
>>> But on the 5.1 based kernel things get a little weird
>>>
>>> [root@awplus flash]# lspci -t
>>> -[0000:00]---01.0-[01-06]--+-00.0-[02-06]--
>>>                            +-01.0
>>>                            +-02.0-[02-06]--
>>>                            +-03.0-[02-06]--
>>>                            +-04.0-[02-06]--
>>>                            +-05.0-[02-06]--
>>>                            +-06.0-[02-06]--
>>>                            +-07.0-[02-06]--
>>>                            +-08.0-[02-06]--
>>>                            +-09.0-[02-06]--
>>>                            +-0a.0-[02-06]--
>>>                            +-0b.0-[02-06]--
>>>                            +-0c.0-[02-06]--
>>>                            +-0d.0-[02-06]--
>>>                            +-0e.0-[02-06]--
>>>                            +-0f.0-[02-06]--
>>>                            +-10.0-[02-06]--
>>>                            +-11.0-[02-06]--
>>>                            +-12.0-[02-06]--
>>>                            +-13.0-[02-06]--
>>>                            +-14.0-[02-06]--
>>>                            +-15.0-[02-06]--
>>>                            +-16.0-[02-06]--
>>>                            +-17.0-[02-06]--
>>>                            +-18.0-[02-06]--
>>>                            +-19.0-[02-06]--
>>>                            +-1a.0-[02-06]--
>>>                            +-1b.0-[02-06]--
>>>                            +-1c.0-[02-06]--
>>>                            +-1d.0-[02-06]--
>>>                            +-1e.0-[02-06]--
>>>                            \-1f.0-[02-06]--+-02.0-[03]----00.0
>>>                                            +-03.0-[04]----00.0
>>>                                            +-04.0-[05]----00.0
>>>                                            \-05.0-[06]----00.0
>>>
>>>
>>> I'll start bisecting to see where things started going wrong. I just
>>> wondered if this rings any bells for anyone.
>>
>> I am almost sure that the culprit is
>> 1f08673eef1236f7d02d93fcf596bb8531ef0d12 ("PCI: mvebu: Convert to PCI
>> emulated bridge config space").
>
> The problem seems to pre-date this commit. I've gone back as far as 4.18
> and the problem still exists (in fact there are more duplicate devices).
> I'll keep going back (unfortunately due to our platform being out of
> tree it's not a simple bisect).
>
>> I still think it makes sense to share the bridge emulation code between
>> the mvebu and aardvark drivers, but this sharing has required making
>> the code very different, with lots of subtle differences in behavior in
>> how registers are emulated.
>
> Agreed. Bugs love to hide in duplicated code.
>
> I will admit to being ignorant about the need for an emulated bridge. I
> know it has something to do with the type of transaction used for the
> downstream devices. I also know that these systems won't work without an
> emulated bridge.
>
>> Unfortunately, I don't have access to one of these complicated PCI
>> setups with a HW switch on the way, so I couldn't test this kind of
>> setup.
>>
>> Do you mind helping with figuring out what the issues are? That would
>> be really nice.
>
> No problem. As I said I'll keep going to find a point where behaviour
> turns bad for me. I suspect we might find other problems along the way.
>

Some progress. Our defconfig had CONFIG_CMDLINE="pci=pcie_scan_all" in
it. This dated back to before we were using a devicetree with our
kirkwood platforms. At some point this started having an effect on the
emulated bridge.
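
For anyone wanting to check whether a stale pci=pcie_scan_all is still reaching their kernel, something along these lines might help. It is only a rough sketch: the helper names pci_options() and has_pcie_scan_all() are made up for illustration, and it assumes the effective command line (CONFIG_CMDLINE plus whatever the bootloader passes) can be read from /proc/cmdline on the target.

```python
# Rough sketch: look for "pcie_scan_all" among the pci= kernel parameters.
# pci_options() and has_pcie_scan_all() are hypothetical helper names.

def pci_options(cmdline):
    """Collect the comma-separated options from every pci= parameter."""
    opts = []
    for token in cmdline.split():
        if token.startswith("pci="):
            # pci= takes a comma-separated list, e.g. pci=nomsi,pcie_scan_all
            opts.extend(o for o in token[len("pci="):].split(",") if o)
    return opts


def has_pcie_scan_all(cmdline):
    """True if pcie_scan_all is among the pci= options on this command line."""
    return "pcie_scan_all" in pci_options(cmdline)


if __name__ == "__main__":
    # Assumption: /proc/cmdline holds the command line the kernel booted with.
    with open("/proc/cmdline") as f:
        cmdline = f.read().strip()
    print("pci options:", pci_options(cmdline))
    print("pcie_scan_all set:", has_pcie_scan_all(cmdline))
```

Run on the target, it prints the pci= options the kernel actually booted with, which makes it easier to spot an option inherited from an old defconfig.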