On Thu, 11 Jun 2009 15:52:55 -0600 Matthew Wilcox <matthew@xxxxxx> wrote: > On Thu, Jun 11, 2009 at 02:23:11PM -0700, Jesse Barnes wrote: > > +++ b/arch/x86/pci/Makefile > > @@ -13,5 +13,5 @@ obj-$(CONFIG_X86_VISWS) += visws.o > > > > obj-$(CONFIG_X86_NUMAQ) += numaq_32.o > > > > -obj-y += common.o early.o > > +obj-y += common.o early.o mrst.o > > We don't want a CONFIG_PCI_MOORESTOWN, like we have for > CONFIG_PCI_OLPC? Sure, I suppose so. > > +++ b/arch/x86/pci/init.c > > @@ -14,6 +14,10 @@ static __init int pci_arch_init(void) > > > > if (!(pci_probe & PCI_PROBE_NOEARLY)) > > pci_mmcfg_early_init(); > > +#ifdef CONFIG_PCI_MMCONFIG > > + pci_mmcfg_late_init(); > > + pci_mrst_init(); > > +#endif > > We should have empty stubs for these functions for the !MMCONFIG case. Good point, will fix. > > +++ b/arch/x86/pci/irq.c > > pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > > +#if defined(CONFIG_X86_IO_APIC) && defined(CONFIG_SFI) > > + /* For platforms only have IOAPIC, the PCI irq line is 1:1 > > mapped to > > + * IOAPIC RTE entries, so we just enable RTE for the > > device. > > + */ > > The comment is a little garbled. Is the assumption here that all SFI > platforms have an IOAPIC? I think that's the case, but I'll let Jacob answer that. > > - if (!acpi_disabled || pci_mmcfg_config_num || > > mcp55_checked) +#ifdef CONFIG_ACPI > > + > > + if (!acpi_disabled) > > + return NULL; > > +#endif > > + if (pci_mmcfg_config_num || mcp55_checked) > > return NULL; > > acpi_disabled definitely should be 0 if CONFIG_ACPI isn't set. Oh good, one less #ifdef. > > - if (!known_bridge) > > + if (!known_bridge) { > > +#ifdef CONFIG_SFI > > + sfi_table_parse(SFI_SIG_MCFG, sfi_parse_mcfg); > > +#else > > acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg); > > +#endif > > + } > > SFI and ACPI are mutually exclusive? I think this bit was addressed in another patch actually. And collapsed into a single function. I'll drop this hunk. > > @@ -12,6 +12,7 @@ > > #include <linux/pci.h> > > #include <linux/init.h> > > #include <linux/acpi.h> > > +#include <linux/sfi.h> > > #include <asm/e820.h> > > #include <asm/pci_x86.h> > > > > That's the only change to this file? Looks stray. Yep, messy commit. Will drop. > > diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c > > new file mode 100644 > > index 0000000..1d69381 > > --- /dev/null > > +++ b/arch/x86/pci/mrst.c > > @@ -0,0 +1,236 @@ > > +/* > > + * Moorestown PCI support > > + * Copyright (c) 2008 Intel Corporation > > + * Jesse Barnes <jesse.barnes@xxxxxxxxx> > > + * > > + * Moorestown has an interesting PCI implementation: > > + * - configuration space is memory mapped (as defined by MCFG) > > + * - Lincroft devices also have a real, type 1 configuration > > space > > + * - Early Lincroft silicon has a type 1 access bug that will > > cause > > + * a hang if non-existent devices are accessed > > + * - some devices have the "fixed BAR" capability, which means > > + * they can't be relocated or modified; check for that during > > + * BAR sizing > > + * > > + * So, we use the MCFG space for all reads and writes, but also > > send > > + * Lincroft writes to type 1 space. But only read/write if the > > device > > + * actually exists, otherwise return all 1s for reads and bit > > bucket > > + * the writes. > > + * > > + * Here we assume that type 1 and MCFG based access has already > > been set up, > > + * such that raw_pci_ops is type 1 and raw_pci_ext_ops is MCFG. > > How about using pci_direct_conf1 explicitly? We can make pci_mmcfg > non-static too. Hm, I'll look at that. I went back and forth on a couple of possibilities here. > > > + */ > > + > > +#include <linux/sched.h> > > +#include <linux/pci.h> > > +#include <linux/ioport.h> > > +#include <linux/init.h> > > +#include <linux/dmi.h> > > + > > +#include <asm/acpi.h> > > +#include <asm/segment.h> > > +#include <asm/io.h> > > +#include <asm/smp.h> > > +#include <asm/pci_x86.h> > > +/* > > + * Yuck, we really need to add PCIe capability functions to the > > core > > + */ > > +#define PCIE_CAP_OFFSET 0x100 > > We've used PCI_CFG_SPACE_SIZE for this value before now ... That would work, will fix when I move this to the core. > > +/* Fixed BAR fields */ > > +#define PCIE_VNDR_CAP_ID_FIXED_BAR 0x00 /* Fixed BAR (TBD) > > */ +#define PCI_FIXED_BAR_0_SIZE 0x04 > > +#define PCI_FIXED_BAR_1_SIZE 0x08 > > +#define PCI_FIXED_BAR_2_SIZE 0x0c > > +#define PCI_FIXED_BAR_3_SIZE 0x10 > > +#define PCI_FIXED_BAR_4_SIZE 0x14 > > +#define PCI_FIXED_BAR_5_SIZE 0x1c > > + > > +int fixed_bar_cap(struct pci_bus *bus, unsigned int devfn) > > +{ > > + int pos; > > + u32 pcie_cap = 0, cap_data; > > + > > + pos = PCIE_CAP_OFFSET; > > + while (pos) { > > + if (raw_pci_ext_ops->read(pci_domain_nr(bus), > > bus->number, > > + devfn, pos, 4, > > &pcie_cap)) > > + return 0; > > + > > + if (pcie_cap == 0xffffffff) > > + return 0; > > + > > + if (PCI_EXT_CAP_ID(pcie_cap) == > > PCI_EXT_CAP_ID_VNDR) { > > + raw_pci_ext_ops->read(pci_domain_nr(bus), > > bus->number, > > + devfn, pos + 4, 4, > > &cap_data); > > + if ((cap_data & 0xffff) == > > PCIE_VNDR_CAP_ID_FIXED_BAR) > > + return pos; > > + } > > + > > + pos = pcie_cap >> 20; > > + } > > + > > + return 0; > > +} > > Do we need to find this before we can call pci_find_ext_capability()? > Can we create pci_bus_find_ext_capability() analagous to > pci_bus_find_capability()? I hate to see this kind of functionality > hidden in an arch file somewhere. Agreed, I'll pull it out into generic code (as you can see by the comment it's something I should have done already). > > +static int pci_device_update_fixed(struct pci_bus *bus, unsigned > > int devfn, > > + int reg, int len, u32 val, int > > offset) +{ > > + u32 size; > > + unsigned int domain, busnum; > > + int bar = (reg - PCI_BASE_ADDRESS_0) >> 2; > > + > > + domain = pci_domain_nr(bus); > > + busnum = bus->number; > > + > > + if (val == ~0 && len == 4) { > > + unsigned long decode; > > + > > + raw_pci_ext_ops->read(domain, busnum, devfn, > > + offset + 8 + (bar * 4), 4, > > &size); + > > + /* Turn the size into a decode pattern for the > > sizing code */ > > + if (size) { > > + decode = size - 1; > > + decode |= decode >> 1; > > + decode |= decode >> 2; > > + decode |= decode >> 4; > > + decode |= decode >> 8; > > + decode |= decode >> 16; > > + decode++; > > + decode = ~(decode - 1); > > + } else { > > + decode = ~0; > > + } > > + > > + /* > > + * If val is all ones, the core code is trying to > > size the reg, > > + * so update the mmconfig space with the real size. > > + * > > + * Note: this assumes the fixed size we got is a > > power of two. > > + */ > > + return raw_pci_ext_ops->write(domain, busnum, > > devfn, reg, 4, > > + decode); > > + } > > + > > + /* This is some other kind of BAR write, so just do it. */ > > + return raw_pci_ext_ops->write(domain, busnum, devfn, reg, > > len, val); +} > > What if we made __pci_read_base() a __weak function, and you overrode > it that way? I'll look into that; might be simpler. > > diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h > > index 616bf8b..4b412ae 100644 > > --- a/include/linux/pci_regs.h > > +++ b/include/linux/pci_regs.h > > @@ -503,6 +503,7 @@ > > #define PCI_EXT_CAP_ID_PWR 4 > > #define PCI_EXT_CAP_ID_ARI 14 > > #define PCI_EXT_CAP_ID_SRIOV 16 > > +#define PCI_EXT_CAP_ID_VNDR 11 > > I think these should be kept in order by value ... Yep, will fix. Thanks a lot for the review. -- Jesse Barnes, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html