On Thursday, 2024/5/9 at 8:21 PM, Jonathan Cameron wrote:
On Thu, 9 May 2024 19:24:28 +0800
Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx> wrote:
...
Yes. I think we are going to have to wait on architecture-specific clarifications
before any software-coherent use case can be guaranteed to work beyond the 3.1 ones:
temporal sharing (only one accessing host at a time) and read-only sharing, where
writes are dropped anyway, so clean writeback is irrelevant beyond possibly some
noise in logs (if they do get logged, it is considered so rare we don't care!).
Hi Jonathan,
Allow me to discuss further. As described in CXL 3.1:
```
Software-managed coherency schemes are complicated by any host or device
whose caching agents generate clean writebacks. A “No Clean Writebacks”
capability bit is available for a host in the CXL System Description
Structure (CSDS; see Section 9.18.1.6) or for a device in the DVSEC CXL
Capability2 register (see Section 8.1.3.7).
```
If we check and find that the "No Clean Writebacks" bit is set in both the
CSDS and the DVSEC, can we then assume that software cache-coherency is
feasible, as outlined below:
(1) Both the writer and reader ensure cache flushes. Since there are no
clean writebacks, there will be no background data writes.
(2) The writer writes data to shared memory and then executes a cache
flush. If we trust the "No Clean Writebacks" bit, we can assume that the
data in shared memory is coherent.
(3) Before reading the data, the reader performs cache invalidation.
Since there are no clean writebacks, this invalidation operation will
not destroy the data written by the writer. Therefore, the data read by
the reader should be the data written by the writer, and since the
writer's cache is clean, it will not write data to shared memory during
the reader's reading process. Additionally, data integrity can be ensured.
The first step for CBD should depend on hardware cache coherence, which
is clearer and more feasible. Here, I am just exploring the possibility
of software cache coherence, not insisting on implementing software
cache-coherency right away. :)
Yes, if a platform sets that bit, you 'should' be fine. What exact flush
is needed is architecture specific however and the DMA related ones
may not be sufficient. I'd keep an eye open for arch doc update from the
various vendors.
Also, the architecture that motivated that bit existing is a 'moderately
large' chip vendor so I'd go so far as to say adoption will be limited
unless they resolve that in a future implementation :)
Great, I think we've had a good discussion and reached a consensus on
this issue. The remaining aspect will depend on hardware updates. Thank
you for the information, that helps a lot.
Thanx
Jonathan
Thanx
CBD can initially support (3), and then transition to (1) when hardware
supports cache-coherency. If there's sufficient market demand, we can
also consider supporting (2).
I'd assume only (3) works. The others rely on assumptions I don't think
I guess you mean (1), the hardware cache-coherency way, right?
Indeed - oops!
Hardware coherency is the way to go, or a well-defined and clearly documented
description of how to play with the various host architectures.
Jonathan
:)
Thanx
you can rely on.
Fun fun fun,
Jonathan
How does this approach sound?
Thanx
J
Keep in mind that I don't think anybody has CXL 3 devices or CPUs yet, and
shared memory is not explicitly legal in CXL 2, so there are things a CPU
could do (or not do) in a CXL 2 environment that are not illegal because
they should not be observable in a no-shared-memory environment.
CBD is interesting work, though for some of the reasons above I'm somewhat
skeptical of shared memory as an IPC mechanism.
Regards,
John