On Wed, Aug 14, 2013 at 8:48 PM, rui wang <ruiv.wang@xxxxxxxxx> wrote:
> On 8/15/13, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Wed, Aug 14, 2013 at 5:32 PM, Mauro Carvalho Chehab
>> <m.chehab@xxxxxxxxxxx> wrote:
>>>
>>> On Wed, 14 Aug 2013 16:04:06 -0600
>>> Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>>>
>>> > On Wed, Aug 14, 2013 at 3:45 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>>> > wrote:
>>> > > I'm working on a driver for the Sandy Bridge iMC SMBUS controller.
>>> > > It (mostly) lives on pci device 15, function 0. So my driver
>>> > > registers as an ordinary PCI driver for that device (8086.3ca8).
>>> > >
>>> > > The problem is that sb_edac also registers for 8086.3ca8. This means
>>> > > that the drivers conflict. The sb_edac driver is actually driving
>>> > > functionality that's split between eleven (!) different pci devices,
>>> > > none of which overlaps the stuff I'm driving.
>>> > >
>>> > > What's the Right Way (tm) to handle this? Should I just modify
>>> > > sb_edac to pick a different one of the 11 devices to probe? Is there
>>> > > some standard way to handle this in the PCI code?
>>>
>>> The sb-edac driver needs to see lots of different devices in order to
>>> work, as the memory controller registers are on different PCI IDs.
>>>
>>> So, there's not much that can be done there, I'm afraid.
>>
>> Would you be okay with changing the PCI ID that the sb_edac driver
>> binds to? That is, let it claim any of the other mess of devices it
>> talks to.
>>
>> Alternatively, I could shove the whole thing into the edac-sbridge module.
>>
>>>
>
> What you really need is just to access a few registers on that device,
> which doesn't sound reasonable enough to claim the entire device.
> Sb_edac is much more generic.
>
> What I can suggest is that you may still let sb_edac own the device
> and handle the register access needed by you. Your iMC-SMB, if large
> enough, can be in a separate module calling into sb_edac.
>

How does my module's modalias work?

The APIs I'd need are basically reads and writes of dwords, so
exporting functions for that seems mostly useless. Perhaps I should
just add the functionality to sb_edac directly, possibly controlled by
a config option. At least I can share the code to map bus number to
socket.

>
>>>
>>> Btw, we almost added a code there to also access the SMBUS, in order to
>>> enumerate the memories. We ended by not adding, because we were afraid
>>> or risking to race with BIOS. A race while access an I2C EEPROM memory
>>> can be very bad, as a read operation might be understood as write, thus
>>> destroying the DIMM configuration.
>>
>> The controller is, indeed, amazingly racy. But I have access to
>> (NDAed) reference code for this, and I need it for NVDIMM support,
>> which is a real thing that AFAICT Intel's trying to support, so this
>> stuff is really supposed to work. (The reference code performs
>> undocumented incantations that I can't (yet?) talk about.)
>>
>> Is there any way to get sb_edac to expose, for each DIMM, the physical
>> address range that the DIMM maps to (or at least the range that
>> contains all the interleaved bits of it). That would give me a nice
>> sanity check to validate that the firmware isn't lying about the
>> NVDIMM aperture.
>>
>> (Don't use my old i2c_imc submission as a valid sample -- it blows up
>> terribly when closed-loop thermal throttling is on.)
>>
>
> Closed Loop Thermal Throttling is where the BIOS constantly accesses
> the SMBus and where the race occurs. If you solved the problem with
> NDAed information, wouldn't it violate your NDA?

Not if I get permission first. One of the vendors involved is quite
cooperative :)

FWIW, I can't tell from the (public) docs whether it's BIOS accessing
SMBus, the chipset accessing SMBus, or both. I have three iMC
development machines.
One (Core i7 Extreme) has no TSODs at all and works fine.
One (Xeon E5) has TSODs, appears to probe SMBus in hardware, and also
works fine without magic. The third (a Supermicro board) has an
explicit CLTT BIOS setting and does not work if I just follow the
instructions in the public docs.

I don't actually have private docs, but I'm planning on converting the
reference code to something usable tomorrow to see how well it works.

--Andy