On Mon, May 15, 2023 at 02:59:42PM +0300, Ilpo Järvinen wrote:
> On Sun, 14 May 2023, Lukas Wunner wrote:
> > On Fri, May 12, 2023 at 11:25:32AM +0300, Ilpo Järvinen wrote:
> > > On Thu, 11 May 2023, Lukas Wunner wrote:
> > > > On Thu, May 11, 2023 at 10:55:06AM -0500, Bjorn Helgaas wrote:
> > > > > I didn't see the prior discussion with Lukas, so maybe this was
> > > > > answered there, but is there any reason not to add locking to
> > > > > pcie_capability_clear_and_set_word() and friends directly?
> > > > >
> > > > > It would be nice to avoid having to decide whether to use the locked
> > > > > or unlocked versions.
> > > >
> > > > I think we definitely want to also offer lockless accessors which
> > > > can be used in hotpaths such as interrupt handlers if the accessed
> > > > registers don't need any locking (e.g. because there are no concurrent
> > > > accesses).
> > > > ...
> All PCI_EXP_SLTSTA ones looked not real RMW but ACK bits type of writes

PCI_EXP_SLTSTA, PCI_EXP_LNKSTA, etc are typically RW1C and do not need
the usual RMW locking (which I think is what you were saying).

> > ...
> > What I think is unnecessary and counterproductive is to add wholesale
> > locking of any access to the PCI Express Capability Structure.
> >
> > It's fine to have a single spinlock, but I'd suggest only using it
> > for registers which are actually accessed concurrently by multiple
> > places in the kernel.
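
To make the RW1C vs. RMW distinction concrete, here is a rough userspace
sketch. It is not kernel code: struct model_reg and every helper below
are hypothetical names made up for illustration, and the "race" is
simulated by interleaving the two paths by hand rather than with real
concurrency.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit register; not a kernel structure. */
struct model_reg {
	uint16_t val;
};

static uint16_t reg_read(const struct model_reg *r)
{
	return r->val;
}

/* Plain RW semantics: a write replaces the whole register value. */
static void rw_write(struct model_reg *r, uint16_t v)
{
	r->val = v;
}

/* RW1C semantics: writing 1 clears that bit, writing 0 leaves it alone. */
static void rw1c_write(struct model_reg *r, uint16_t v)
{
	r->val &= (uint16_t)~v;
}

/* Two RMW updates interleaved by hand: both paths read before either writes. */
static uint16_t interleaved_rmw(uint16_t initial, uint16_t set_a, uint16_t set_b)
{
	struct model_reg r = { initial };
	uint16_t a = reg_read(&r);	/* path A reads */
	uint16_t b = reg_read(&r);	/* path B reads before A has written */

	rw_write(&r, a | set_a);	/* path A writes its update */
	rw_write(&r, b | set_b);	/* path B writes a stale value, A's bits are lost */
	return reg_read(&r);
}

/* The same two updates run back to back, as a lock around each RMW would force. */
static uint16_t serialized_rmw(uint16_t initial, uint16_t set_a, uint16_t set_b)
{
	struct model_reg r = { initial };

	rw_write(&r, reg_read(&r) | set_a);
	rw_write(&r, reg_read(&r) | set_b);
	return reg_read(&r);
}

/* Two RW1C acks: each write touches only its own bits and needs no prior read. */
static uint16_t rw1c_acks(uint16_t initial, uint16_t ack_a, uint16_t ack_b)
{
	struct model_reg r = { initial };

	rw1c_write(&r, ack_a);
	rw1c_write(&r, ack_b);
	return reg_read(&r);
}
```

In this model the interleaved RMW drops path A's bit while the
serialized one keeps both, and the RW1C acks never disturb bits they
did not write, which is why they don't need the RMW lock.
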
>
> While it does feel entirely unnecessary layer of complexity to me, it would
> be possible to rename the original pcie_capability_clear_and_set_word() to
> pcie_capability_clear_and_set_word_unlocked() and add this into
> include/linux/pci.h:
>
> static inline int pcie_capability_clear_and_set_word(struct pci_dev *dev,
> 					int pos, u16 clear, u16 set)
> {
> 	if (pos == PCI_EXP_LNKCTL || pos == PCI_EXP_LNKCTL2 ||
> 	    pos == PCI_EXP_RTCTL)
> 		pcie_capability_clear_and_set_word_locked(...);
> 	else
> 		pcie_capability_clear_and_set_word_unlocked(...);
> }
>
> It would keep the interface exactly the same but protect only a selectable
> set of registers. As pos is always a constant, the compiler should be able
> to optimize all the dead code away.
>
> Would that be ok then?

Sounds like you have a pretty strong opinion, Lukas, but I guess I
don't really understand the value of having locked and unlocked
variants of RMW accessors.

Config accesses are relatively slow and I don't think they're used in
performance-sensitive paths.  I would expect the lock to be
uncontended and cheap relative to the config access itself, but I have
no actual numbers to back up my speculation.

Is the performance win worth the extra complexity in callers?

Bjorn