On Fri, Mar 10, 2017 at 12:20:54PM +0800, Shawn Lin wrote: > On 2017/3/10 11:22, Shawn Lin wrote: > >On 2017/3/10 10:46, Brian Norris wrote: > >>Currently, if we try to unbind the platform device, the remove will > >>succeed, but the removal won't undo most of the registration, leaving > >>partially-configured PCI devices in the system. > >> > >>This allows, for example, a simple 'lspci' to crash the system, as it > >>will try to touch the freed (via devm_*) driver structures. > >> > >>So let's implement device remove(). > >> > > > >As this patchset seems to be merged together so I think the following > >warning will be ok? if my git-am robot only pick your patch 1->compile-> > >patch 2->compile->patch 3 then > > > >drivers/pci/host/pcie-rockchip.c: In function 'rockchip_pcie_remove': > >drivers/pci/host/pcie-rockchip.c:1435:2: error: implicit declaration of > >function 'pci_unmap_iospace' [-Werror=implicit-function-declaration] > > pci_unmap_iospace(rockchip->io); I'm not sure what you're doing here, but when I test patches 1-3, this all compiles fine for me. Maybe you're testing an old kernel that does not have pci_unmap_iospace()? > >but I guess you may need to move your patch 4 ahead of patch 3? No. Patch 4 is only necessary for building modules that can use those functions; your PCIe driver doesn't build as a module until patch 5. I'm going to guess that you're testing your hacky vendor tree, and not pure upstream. > Well, I am not sure if something is wrong here. > > But when booting up the system for the first time, we got > [ 0.527263] PCI host bridge /pcie@f8000000 ranges: > [ 0.527293] MEM 0xfa000000..0xfa5fffff -> 0xfa000000 > [ 0.527308] IO 0xfa600000..0xfa6fffff -> 0xfa600000 > [ 0.527544] rockchip-pcie f8000000.pcie: PCI host bridge to bus 0000:0 > > so the hierarchy(lspci -t) looks like: > lspci -t > -[0000:00]---00.0-[01]----00.0 > > and lspci > 0000:00:00.0 Class 0604: Device 1d87:0100 > 0001:01:00.0 Class 0108: Device 8086:f1a5 (rev 03) > > but if I did unbind and bind, the bus number is different. > > lspci > 0001:00:00.0 Class 0604: Device 1d87:0100 > 0001:01:00.0 Class 0108: Device 8086:f1a5 (rev 03) > > lspci -t > -+-[0001:00]---00.0-[01]----00.0 > \-[0000:00]- > > This hierarchy looks wrong to me. That looks like it's somewhat an artifact of lspci's tree view, which manufactures the 0000:00 root. I might comment on your "RFT" patch too but... it does touch on the problem (that the domain numbers don't get reused), but I don't think it's a good idea. What if we add another domain after the Rockchip PCIe domain? You'll clobber all the domain allocations the first time you remove the Rockchip one. Instead, if we want to actually stabilize this indexing across hotplug, we need the core PCI code to take care of this for us. e.g., maybe use one of the existing indexing ID mechanisms in the kernel, like IDR? Anyway, other than the bad lspci -t output, is there any actual bug here? I didn't think the domain numbers were actually so special. Brian