On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > > Hi all,
> > >
> > > The theme of this set is decoupling the "AMD node" concept from the
> > > legacy northbridge support.
> > >
> > > Additionally, AMD System Management Network (SMN) access code is
> > > decoupled and expanded too.
> > >
> > > Patches 1-3 begin reducing the scope of AMD_NB.
> > >
> > > Patches 4-9 begin moving generic AMD node support out of AMD_NB.
> > >
> > > Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
> > >
> > > Patch 14 has HSMP reuse SMN functionality.
> > >
> > > Patches 15-16 address userspace access to SMN.
> > >
> > > I say "begin" above because there is more to do here. Ultimately, AMD_NB
> > > should only be needed for code used on legacy systems with northbridges.
> > > Also, any and all SMN users in the kernel need to be updated to use the
> > > central SMN code. Local solutions should be avoided.
> >
> > Glad to see many of the PCI device IDs going away; thanks for working
> > on that!
> >
> > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > opinion, and using them means we have to encode topology details in
> > the kernel. But this still seems like a big improvement.
>
> Thanks for the feedback. Hopefully, we'll come to some improved
> solution. :)
>
> Can you please elaborate on your concern? Is it about saying "thing X is
> always at SBDF A:B:C.D" or something else?

"Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
nothing about the actual functionality of the device. A PCI
Vendor/Device ID or a PNP ID identifies the device programming model
independent of its geographical location.
Inferring the functionality and programming model from the location is
a maintenance issue because hardware may change the address.

PCI bus numbers are under software control, so in general it's not
safe to rely on them, although in this case these devices are probably
on root buses where the bus number is either fixed or determined by
BIOS configuration of the host bridge.

I don't like the pci_get_*() functions because they break the driver
model. The usual .probe() model binds a device to a driver, which
essentially means the driver owns the device and its resources, and
the driver doesn't have to worry about other code interfering.

Unlike pci_get_*(), the .probe()/.remove() model automatically handles
hotplug without extra things like notifiers in the driver. Hotplug
may not be an issue in this particular case, but it requires specific
platform knowledge to be sure. Some platforms do support CPU and PCI
host bridge hotplug.

Thanks again for doing all this work. It's a huge improvement
already!

Bjorn
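
To make the location-versus-ID point concrete, here is a minimal
userspace sketch. It is only a toy model, not kernel code: the
toy_pci_dev struct, both lookup helpers, and the ID values are
hypothetical and chosen purely for illustration. The first helper
mimics a location-based lookup in the spirit of
pci_get_domain_bus_and_slot(); the second mimics the kind of
Vendor/Device ID match a driver's ID table expresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs for illustration only; not real hardware IDs. */
#define TOY_VENDOR_ID 0x1234
#define TOY_DEVICE_ID 0xabcd

/* Toy model of a PCI function: where it sits and what it is. */
struct toy_pci_dev {
	uint16_t domain;
	uint8_t bus, slot, func;	/* geographical address */
	uint16_t vendor, device;	/* identifies the programming model */
};

/* Location-based lookup: finds whatever happens to sit at the address. */
static const struct toy_pci_dev *
toy_find_by_location(const struct toy_pci_dev *devs, size_t n,
		     uint16_t domain, uint8_t bus, uint8_t slot, uint8_t func)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].domain == domain && devs[i].bus == bus &&
		    devs[i].slot == slot && devs[i].func == func)
			return &devs[i];
	}
	return NULL;
}

/* ID-based match: finds the device by what it is, wherever it sits. */
static const struct toy_pci_dev *
toy_find_by_id(const struct toy_pci_dev *devs, size_t n,
	       uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].vendor == vendor && devs[i].device == device)
			return &devs[i];
	}
	return NULL;
}
```

If a later hardware generation moves the same function from, say,
slot 0x18 to slot 0x19 (made-up numbers), toy_find_by_location() with
the old address returns NULL or, worse, an unrelated device, while
toy_find_by_id() keeps working unchanged. The sketch does not model
driver binding or hotplug, which are what .probe()/.remove() add on
top of ID matching.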