Re: [PATCH v7 07/17] cxl/pci: Map CXL PCIe Root Port and Downstream Switch Port RAS registers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bowman, Terry wrote:
> 
> 
> On 2/11/2025 7:23 PM, Dan Williams wrote:
> > Terry Bowman wrote:
> >> The CXL mem driver (cxl_mem) currently maps and caches a pointer to RAS
> >> registers for the endpoint's Root Port. The same needs to be done for
> >> each of the CXL Downstream Switch Ports and CXL Root Ports found between
> >> the endpoint and CXL Host Bridge.
> >>
> >> Introduce cxl_init_ep_ports_aer() to be called for each CXL Port in the
> >> sub-topology between the endpoint and the CXL Host Bridge. This function
> >> will determine if there are CXL Downstream Switch Ports or CXL Root Ports
> >> associated with this Port. The same check will be added in the future for
> >> upstream switch ports.
> >>
> >> Move the RAS register map logic from cxl_dport_map_ras() into
> >> cxl_dport_init_ras_reporting(). This eliminates the need for the helper
> >> function, cxl_dport_map_ras().
> > Not sure about the motivation here...
> >
> >> cxl_init_ep_ports_aer() calls cxl_dport_init_ras_reporting() to map
> >> the RAS registers for CXL Downstream Switch Ports and CXL Root Ports.
> > Ok, makes sense...
> >
> >> cxl_dport_init_ras_reporting() must check for previously mapped registers
> >> before mapping. This is required because multiple Endpoints under a CXL
> >> switch may share an upstream CXL Root Port, CXL Downstream Switch Port,
> >> or CXL Downstream Switch Port. Ensure the RAS registers are only mapped
> >> once.
> > Sounds broken. Every device upstream-port only has one downstream port.
> >
> > A CXL switch config looks like this:
> >
> >            │             
> > ┌──────────┼────────────┐
> > │SWITCH   ┌┴─┐          │
> > │         │UP│          │
> > │         └─┬┘          │
> > │    ┌──────┼─────┐     │
> > │    │      │     │     │
> > │   ┌┴─┐  ┌─┴┐  ┌─┴┐    │
> > │   │DP│  │DP│  │DP│    │
> > │   └┬─┘  └─┬┘  └─┬┘    │
> > └────┼──────┼─────┼─────┘
> >     ┌┴─┐  ┌─┴┐  ┌─┴┐     
> >     │EP│  │EP│  │EP│     
> >     └──┘  └──┘  └──┘     
> >
> > ...so how can an endpoint ever find that its immediate parent downstream
> > port has already been mapped?
> 
> 
>             ┌─┴─┐
>             │RP1│
>             └─┬─┘
>   ┌───────────┼───────────┐
>   │SWITCH   ┌─┴─┐         │
>   │         │UP1│         │   RP1 - 0c:00.0
>   │         └─┬─┘         │   UP1 - 0d:00.0
>   │    ┌──────┼─────┐     │   DP1 - 0e:00.0
>   │    │      │     │     │
>   │  ┌─┴─┐  ┌─┴─┐ ┌─┴─┐   │
>   │  │DP1│  │DP2│ │DP3│   │
>   │  └─┬─┘  └─┬─┘ └─┬─┘   │
>   └────┼──────┼─────┼─────┘
>      ┌─┴─┐  ┌─┴─┐ ┌─┴─┐
>      │EP1│  │EP2│ │EP3│
>      └───┘  └───┘ └───┘
> 
> 
> It cant but the root RP and USP have duplicate calls for each EP in the example diagram.
> The function's purpose is to map RAS registers and cache the address. This reuses the
> same function for RP and DSP. The DSP will never be previously mapped as you indicated.

Are you talking about in the current code, which should have already
reported problems due to multiple overlapping mappings, or with the
proposed changes? Can you clarify the sequenece of calls that triggers
the multiple mappings of RP1?

Also, if EP1 and EP2 race to establish the RP1 mapping, then wouldn't
EP1 and EP2 also race to tear it down? What prevents EP2 from unmapping
RP1 if EP1 still needs it mapped?

I would prefer that rather than EP1 being responsible for mapping RP1
RAS, and a lock to prevent EP2 and EP3 from also repeating that, it
should be UP1 in cxl_switch_port_probe() taking responsibility for
mapping RP1 RAS.

One of the known problems with cxl_switch_port_probe() is that it
enumerates all dports regardless of attachment. That would be where I
would expect problems of dports already going through initialization
prematurely in advance of an endpoint showing up. However, that's a
different fix.

[..]
> >> +
> >> +	/* dport may have more than 1 downstream EP. Check if already mapped. */
> >> +	mutex_lock(&ras_init_mutex);
> > I suspect this lock and check got added to workaround "Failed to request
> > region" messages coming out of devm_cxl_iomap_block() in testing? Per
> > above, that's not "more than 1 downstream EPi", that's "failure to clean
> > up the last mapping for the next cxl_mem_probe() event of the same
> > endpoint".
> Synchronization was added to handle the concurrent accesses. I never observed
> issues due to the race condition for RP and USP but I confirmed through further
> testing it is a real potential issue for the RP and USP.

It is still not clear to me how this singleton lock helps when multiple
EPs are sharing a resource that both races init and shutdown.

> You recommended, in the next patch, to map USP RAS registers from cxl_endpoint_port_probe().
> Would you like the RP and DSP mapping to be called from cxl_endpoint_port_probe() as well? Terry

I think it is broken for an EP to be reaching through a switch to
initialize a shared resource at the RP level. Each level of the
hierarchy should take care of its immediate parent. We need this
bottom-up incremental arrangement due to the way that CXL hides
register blocks until CXL link up.




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux