On 10/24/2024 12:46, Bjorn Helgaas wrote:
On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
Hi all,
The theme of this set is decoupling the "AMD node" concept from the
legacy northbridge support.
Additionally, AMD System Management Network (SMN) access code is
decoupled and expanded too.
Patches 1-3 begin reducing the scope of AMD_NB.
Patches 4-9 begin moving generic AMD node support out of AMD_NB.
Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
Patch 14 has HSMP reuse SMN functionality.
Patches 15-16 address userspace access to SMN.
I say "begin" above because there is more to do here. Ultimately, AMD_NB
should only be needed for code used on legacy systems with northbridges.
Also, any and all SMN users in the kernel need to be updated to use the
central SMN code. Local solutions should be avoided.
Glad to see many of the PCI device IDs going away; thanks for working
on that!
The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
ideal since all those pci_get_*() interfaces are kind of ugly in my
opinion, and using them means we have to encode topology details in
the kernel. But this still seems like a big improvement.
Thanks for the feedback. Hopefully, we'll come to some improved
solution. :)
Can you please elaborate on your concern? Is it about saying "thing X is
always at SBDF A:B:C.D" or something else?
"Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
nothing about the actual functionality of the device. A PCI
Vendor/Device ID or a PNP ID identifies the device programming model
independent of its geographical location. Inferring the functionality
and programming model from the location is a maintenance issue because
hardware may change the address.
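
To make the distinction concrete, here is a minimal userspace sketch (not
kernel code). The struct, the helper names, and the device ID 0xabcd are
all made up for illustration. It only models the idea that a lookup keyed
on location stops working when a function moves, while a lookup keyed on
Vendor/Device ID does not:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a PCI function; not a kernel structure. */
struct model_pci_dev {
	uint16_t domain;
	uint8_t bus, slot, func;
	uint16_t vendor, device;
};

/* Lookup by location: encodes "thing X is always at SBDF A:B:C.D". */
static const struct model_pci_dev *
find_by_sbdf(const struct model_pci_dev *devs, size_t n,
	     uint16_t domain, uint8_t bus, uint8_t slot, uint8_t func)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].domain == domain && devs[i].bus == bus &&
		    devs[i].slot == slot && devs[i].func == func)
			return &devs[i];
	}
	return NULL;
}

/* Lookup by ID: independent of where the function sits in the topology. */
static const struct model_pci_dev *
find_by_id(const struct model_pci_dev *devs, size_t n,
	   uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].vendor == vendor && devs[i].device == device)
			return &devs[i];
	}
	return NULL;
}
```

If a later hardware generation moves the function from slot 0x18 to slot
0x19, find_by_sbdf() with the old address returns NULL, while find_by_id()
still finds it.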
PCI bus numbers are under software control, so in general it's not
safe to rely on them, although in this case these devices are probably
on root buses where the bus number is either fixed or determined by
BIOS configuration of the host bridge.
I don't like the pci_get_*() functions because they break the driver
model. The usual .probe() model binds a device to a driver, which
essentially means the driver owns the device and its resources, and
the driver doesn't have to worry about other code interfering.
Are you suggesting that perhaps we should be introducing amd_smn (patch
10) as a PCI driver that binds "to the root device" instead?
If we made this change, I would wonder if it comes up early enough,
particularly considering quirk_clear_strap_no_soft_reset_dev2_f0() uses
the SMN symbols from a PCI final fixup, which runs before a driver
attaches (pci_bus_add_device()).
There are some areas that do discovery (for example amd_node_get_root()
in patch 6/16).
I think we should aspire to do as much discovery as possible, but I don't
know that we can get TOTALLY away from some hardcoded topology information.
Unlike pci_get_*(), the .probe()/.remove() model automatically handles
hotplug without extra things like notifiers in the driver. Hotplug
may not be an issue in this particular case, but it requires specific
platform knowledge to be sure. Some platforms do support CPU and PCI
host bridge hotplug.
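
As a rough illustration of why the .probe()/.remove() model copes with
hotplug, here is a small userspace sketch. It is a toy model, not the
kernel driver core: model_bus_add(), model_bus_remove(), the structs, and
the device ID 0xabcd are all hypothetical. The point it models is that
the "bus" side calls into the driver on every add and remove, so the
driver's state tracks the device without any notifier of its own:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_driver;

/* Hypothetical device model: IDs plus the driver currently bound to it. */
struct model_dev {
	uint16_t vendor, device;
	bool present;
	const struct model_driver *drv;
};

/* Hypothetical driver model: an ID to match and probe/remove callbacks. */
struct model_driver {
	uint16_t vendor, device;
	int (*probe)(struct model_dev *dev);
	void (*remove)(struct model_dev *dev);
};

/* "Bus" side: on device arrival, bind the driver if the IDs match. */
static void model_bus_add(struct model_dev *dev,
			  const struct model_driver *drv)
{
	dev->present = true;
	if (drv->vendor == dev->vendor && drv->device == dev->device &&
	    drv->probe(dev) == 0)
		dev->drv = drv;
}

/* "Bus" side: on device removal, unbind before the device goes away. */
static void model_bus_remove(struct model_dev *dev)
{
	if (dev->drv) {
		dev->drv->remove(dev);
		dev->drv = NULL;
	}
	dev->present = false;
}

/* Example driver: only tracks how many devices it currently owns. */
static int bound_count;

static int example_probe(struct model_dev *dev)
{
	(void)dev;
	bound_count++;
	return 0;
}

static void example_remove(struct model_dev *dev)
{
	(void)dev;
	bound_count--;
}

static const struct model_driver example_driver = {
	.vendor = 0x1022,
	.device = 0xabcd,	/* made-up device ID */
	.probe = example_probe,
	.remove = example_remove,
};
```

A pointer cached from a one-time lookup has no equivalent of the remove
callback, which is where the extra notifier code would come in.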
Yeah, hotplug won't matter for these.
Thanks again for doing all this work. It's a huge improvement
already!