On 20/01/22, 11:56 AM, "Greg KH" <gregkh@xxxxxxxxxxxxxxxxxxx> wrote: Hi Greg, Thanks for the comments >On Tue, Jan 18, 2022 at 09:16:01AM -0800, Vikash Bansal wrote: >> In the current implementation, the PCI capability list is parsed from >> the beginning to find each capability, which results in a large number >> of redundant PCI reads. >> >> Instead, we can parse the complete list just once, store it in the >> pci_dev structure, and get the offset of each capability directly from >> the pci_dev structure. >> >> This implementation improves pci devices initialization time by ~2-3% in >> case of bare metal and 7-8% in case of VM running on ESXi. > >What is that in terms of "wall clock" time? % is hard to know here, and >of course it will depend on the PCI bus speed, right? > In terms of "wall clock" time: For bare-metal it reduced from 270ms to 261ms And for VM it reduced from 201ms to 184ms. >> It also adds a memory overhead of 20bytes (value of PCI_CAP_ID_MAX) per >> PCI device. >> >> Signed-off-by: Vikash Bansal <bvikas@xxxxxxxxxx> >> --- >> drivers/pci/pci.c | 43 ++++++++++++++++++++++++++++++++++++------- >> drivers/pci/probe.c | 5 +++++ >> include/linux/pci.h | 2 ++ >> 3 files changed, 43 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index 3d2fb394986a..8e024db30262 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -468,6 +468,41 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus, >> return 0; >> } >> >> + >> +/** >> + * pci_find_all_capabilities - Read all capabilities >> + * @dev: the PCI device >> + * >> + * Read all capabilities and store offsets in cap_off >> + * array in pci_dev structure. >> + */ >> +void pci_find_all_capabilities(struct pci_dev *dev) >> +{ >> + int ttl = PCI_FIND_CAP_TTL; >> + u16 ent; >> + u8 pos; >> + u8 id; >> + >> + pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type); >> + if (!pos) >> + return; >> + pci_bus_read_config_byte(dev->bus, dev->devfn, pos, &pos); >> + while (ttl--) { >> + if (pos < 0x40) > >What is this magic value of 0x40? > 0x40 is the start address of capability list. This code is copied from function __pci_find_next_cap_ttl >> + break; >> + pos &= ~3; > >Why ~3? > Capability start address is 4 byte aligned. This code is also copied from __pci_find_next_cap_ttl. >> + pci_bus_read_config_word(dev->bus, dev->devfn, pos, &ent); >> + id = ent & 0xff; > >Do you really need the & if you are truncating it? > Yes, this is not really required. But again, this code is copied from __pci_find_next_cap_ttl. >> + if (id == 0xff) >> + break; >> + >> + /* Read first instance of capability */ >> + if (!(dev->cap_off[id])) >> + dev->cap_off[id] = pos; > >Shouldn't you have checked this before you read the value? > Yes, will move this code >> + pos = (ent >> 8); > >What about walking the list using __pci_find_next_cap() like before? >Why is this somehow the same as the old function? > __pci_find_next_cap() is used to find a given capability, It can't be used to walk through the list in this case. >> + } >> +} >> + >> /** >> * pci_find_capability - query for devices' capabilities >> * @dev: PCI device to query >> @@ -489,13 +524,7 @@ static u8 __pci_bus_find_cap_start(struct pci_bus *bus, >> */ >> u8 pci_find_capability(struct pci_dev *dev, int cap) >> { >> - u8 pos; >> - > - pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type); >> - if (pos) >> - pos = __pci_find_next_cap(dev->bus, dev->devfn, pos, cap); >> - >> - return pos; >> + return dev->cap_off[cap]; >> } >> EXPORT_SYMBOL(pci_find_capability); >> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >> index 087d3658f75c..bacab12cedbb 100644 >> --- a/drivers/pci/probe.c >> +++ b/drivers/pci/probe.c >> @@ -1839,6 +1839,11 @@ int pci_setup_device(struct pci_dev *dev) >> dev->hdr_type = hdr_type & 0x7f; >> dev->multifunction = !!(hdr_type & 0x80); >> dev->error_state = pci_channel_io_normal; >> + /* >> + * Read all capabilities and store offsets in cap_off >> + * array in pci_dev structure. >> + */ > >Comment is not needed if the function name is descriptive. > ok >> + pci_find_all_capabilities(dev); > >And it is, so no need for the comment. > >> set_pcie_port_type(dev); >> >> pci_set_of_node(dev); >> diff --git a/include/linux/pci.h b/include/linux/pci.h >> index 18a75c8e615c..d221c73e67f8 100644 >> --- a/include/linux/pci.h >> +++ b/include/linux/pci.h >> @@ -326,6 +326,7 @@ struct pci_dev { >> unsigned int class; /* 3 bytes: (base,sub,prog-if) */ >> u8 revision; /* PCI revision, low byte of class word */ >> u8 hdr_type; /* PCI header type (`multi' flag masked out) */ >> + u8 cap_off[PCI_CAP_ID_MAX]; /* Offsets of all pci capabilities */ > >Did you run 'pahole' to ensure you are not adding extra padding bytes >here? > >> #ifdef CONFIG_PCIEAER >> u16 aer_cap; /* AER capability offset */ >> struct aer_stats *aer_stats; /* AER stats for this device */ >> @@ -1128,6 +1129,7 @@ void pci_sort_breadthfirst(void); >> >> u8 pci_bus_find_capability(struct pci_bus *bus, unsigned int devfn, int cap); >> u8 pci_find_capability(struct pci_dev *dev, int cap); >> +void pci_find_all_capabilities(struct pci_dev *dev); > >Why is this now a global function and not one just local to the pci >core? Who else would ever need to call it? Will make pci_find_all_capabilitie local and move it to probe.c > >thanks, > >greg k-h