On 8/14/19 12:09 PM, Dan Williams wrote: > On Wed, Aug 14, 2019 at 7:34 AM Stephen Douthit > <stephend@xxxxxxxxxxxxxxx> wrote: >> >> On 8/13/19 6:07 PM, Dan Williams wrote: >>> On Tue, Aug 13, 2019 at 12:31 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: >>>> >>>> On Mon, Aug 12, 2019 at 12:31:35PM -0700, Dan Williams wrote: >>>>> It seems platforms / controllers that fail to run the option-rom >>>>> should be quirked by device-id, but the PCS register twiddling be >>>>> removed for everyone else. "Card BIOS" to me implies devices with an >>>>> Option-ROM BAR which I don't think modern devices have, so that might >>>>> be a simple way to try to phase out this quirk going forward without >>>>> regressing working setups that might be relying on this. >>>>> >>>>> Then again the driver is already depending on the number of enabled >>>>> ports to be reliable before PCS is written, and the current driver >>>>> does not attempt to enable ports that were not enabled previously. >>>>> That tells me that if the PCS quirk ever mattered it would have >>>>> already regressed when the driver switched from blindly writing 0xf to >>>>> only setting the bits that were already set in ->port_map. >>>> >>>> But how do we find that out? >>> >>> We can layer another assumption on top of Tejun's assumptions from >>> commit 49f290903935 "ahci: update PCS programming". The kernel >>> community has not received any regression reports from that change >>> which says: >>> >>> " >>> port_map is determined from PORTS_IMPL PCI register which is >>> implemented as write or write-once register. If the register isn't >>> programmed, ahci automatically generates it from number of ports, >>> which is good enough for PCS programming. ICH6/7M are probably the >>> only ones where non-contiguous enable bits are necessary && PORTS_IMPL >>> isn't programmed properly but they're proven to work reliably with 0xf >>> anyway. >>> " >>> >>> So the potential options I see are: >>> >>> 1/ Keep the current scheme, but limit it to cases where PORTS_IMPL is >>> less than 8 and assume this need to set the bits is unnecessary legacy >>> to carry forward >>> >>> 2/ Option1 + additionally use PORTS_IMPL as a gate to guess when the >>> PCS format might be different for values >= 8. >>> >>> I think the driver does not need to consider Option2 unless / until it >>> encounters a platform where firmware does not "do the right thing", >>> and given Denverton has been in the wild with the wrong PCS twiddling >>> it seems to suggest nothing needs to be done there. >>> >>>> A compromise to me seems that we just do the PCS quirk for all Intel >>>> devices explicitly listed in the PCI Ids based on new board_* values >>>> as long as they have the old PCS location, and assume anything new >>>> enough to have the new location won't need to quirk, given that it >>>> never properly worked. This might miss some intel devices that were >>>> supported with the class based catchall, though. >>> >>> I'd be more comfortable with PORT_IMPL as the deciding factor. >> >> Unfortunately we can't use CAP.NP or PORTS_IMPL for this heuristic. >> >> The problem is that BIOS should be setting the PORTS_IMPL bitmask to >> match which lanes have actually been connected on the board. So >> PORTS_IMPL can be < 8 even if the controller is new enough to >> potentially support > 8 and has the new config space layout. >> >> As proof here's the relevant ahci_print_info() output booting on a >> Denverton based box with 5 ports implemented: >> >> ahci 0000:00:14.0: AHCI 0001.0301 32 slots 5 ports 6 Gbps 0x7a impl SATA mode >> | \-PORTS_IMPL >> \-CAP.NP > > Ugh, ok, thanks for clarifying and my mistake for not realizing > Denverton already violates that heuristic. > > So now I'm trying to grok Christoph's suggestion. Are you saying that > all existing board-ids would assume old PCS format? That would mean > that Denverton gets the wrong format via board_ahci via the class code > matching? Maybe I'm not understanding the suggestion... My understanding of Christoph's suggestion was that we create a new board_ahci_* entry (e.g. board_ahci_intel_legacy) to match in ahci_pci_tbl against those devices using the original config space layout where PCS is @ 0x92. Once we have that we can conditionally run the PCS poke code in ahci_pci_reset_controller() only for devices with that new board_id. We assume that all newer devices added to the table will not use the legacy board_id and not need to do PCS pokes. Right now there are entries for devices with both the new and old config space layout in ahci_pci_tbl. That means someone at Intel would need to go through each entry and mark it as new or old. Your point about devices matching solely on the class code still applies. > Another option might be that controllers >= 1.3.1 will stop using the > PCS twiddling, and then we go add a new board id where / if it happens > to matter in practice. Can you get someone from the controller design team to give us a clear answer on a revision where this PCS change happened? It would be nice if we could just check PCI_REVISION_ID or something similar.