On Thu, Feb 29, 2024 at 12:15:02AM +0530, Manivannan Sadhasivam wrote: > On Wed, Feb 28, 2024 at 11:39:07AM -0600, Bjorn Helgaas wrote: > > On Wed, Feb 28, 2024 at 10:44:12PM +0530, Manivannan Sadhasivam wrote: > > > On Wed, Feb 28, 2024 at 09:02:11AM -0600, Bjorn Helgaas wrote: > > > > On Wed, Feb 28, 2024 at 06:34:11PM +0530, Mrinmay Sarkar wrote: > > > > > On 2/24/2024 4:24 AM, Bjorn Helgaas wrote: > > > > > > On Fri, Feb 23, 2024 at 07:33:38PM +0530, Mrinmay Sarkar wrote: > > > > > > > Due to some hardware changes, SA8775P has set the > > > > > > > NO_SNOOP attribute in its TLP for all the PCIe > > > > > > > controllers. NO_SNOOP attribute when set, the requester > > > > > > > is indicating that there no cache coherency issues exit > > > > > > > for the addressed memory on the host i.e., memory is not > > > > > > > cached. But in reality, requester cannot assume this > > > > > > > unless there is a complete control/visibility over the > > > > > > > addressed memory on the host. > > > > > > > > > > > > Forgive my ignorance here. It sounds like the cache > > > > > > coherency issue would refer to system memory, so the > > > > > > relevant No Snoop attribute would be in DMA transactions, > > > > > > i.e., Memory Reads or Writes initiated by PCIe Endpoints. > > > > > > But it looks like this patch would affect TLPs initiated > > > > > > by the Root Complex, not those from Endpoints, so I'm > > > > > > confused about how this works. > > > > > > > > > > > > If this were in the qcom-ep driver, it would make sense > > > > > > that setting No Snoop in the TLPs initiated by the > > > > > > Endpoint could be a problem, but that doesn't seem to be > > > > > > what this patch is concerned with. > > > > > > > > > > I think in multiprocessor system cache coherency issue might > > > > > occur. and RC as well needs to snoop cache to avoid > > > > > coherency as it is not enable by default. > > > > > > > > My mental picture isn't detailed enough, so I'm still > > > > confused. We're talking about TLPs initiated by the RC. > > > > Normally these would be because a driver did a CPU load or > > > > store to a PCIe device MMIO space, not to system memory. > > > > > > Endpoint can expose its system memory as a BAR to the host. In > > > that case, the cache coherency issue would apply for TLPs > > > originating from RC as well. > > > > What PCIe transactions are involved here? So far I know about: > > > > RC initiates Memory Read Request (or Write) with NO_SNOOP==0 > > ... > > EP responds with Completion with Data (for Read) > > The memory on the endpoint may be cached (due to linear map and > such). So if the RC is initiating the MWd TLP with NO_SNOOP=1, then > there would be coherency issues because there is no guarantee that > the memory is not cached on the endpoint. So, not snooping the > caches and directly writing to the DDR would cause coherency issues > on the endpoint as well. I don't know what linear map is, but I'll take your word for it that endpoints are allowed to cache things internally. So I guess in the ideal world there might be a way for a driver to specify no-snoop for accesses to its device if it knows there is no caching. The commit log for this patch refers to caching on the *host*, though, and IIUC you're saying this patch clears NO_SNOOP on TLPs from the RC because of potential coherency issues on the *endpoint*. > > But I guess you're saying the EP would initiate other transactions in > > the middle related to snooping? I don't know what those are. > > > > > > But I guess you're suggesting the RC can initiate a TLP with a system > > > > memory address? And this TLP would be routed not to a Root Port or to > > > > downstream devices, but it would instead be kind of a loopback and be > > > > routed back up through the RC and maybe IOMMU, to system memory? > > > > > > > > I would have expected accesses like this to be routed directly to > > > > system memory without ever reaching the PCIe RC. > > > > > > > > > and we are enabling this feature for qcom-ep driver as well. > > > > > it is in patch2. > > > > > > > > > > Thanks > > > > > Mrinmay > > > > > > > > > > > > And worst case, if the memory is cached on the host, it may lead to > > > > > > > memory corruption issues. It should be noted that the caching of memory > > > > > > > on the host is not solely dependent on the NO_SNOOP attribute in TLP. > > > > > > > > > > > > > > So to avoid the corruption, this patch overrides the NO_SNOOP attribute > > > > > > > by setting the PCIE_PARF_NO_SNOOP_OVERIDE register. This patch is not > > > > > > > needed for other upstream supported platforms since they do not set > > > > > > > NO_SNOOP attribute by default. > > > > > > > > > > > > > > 8775 has IP version 1.34.0 so intruduce a new cfg(cfg_1_34_0) for this > > > > > > > platform. Assign enable_cache_snoop flag into struct qcom_pcie_cfg and > > > > > > > set it true in cfg_1_34_0 and enable cache snooping if this particular > > > > > > > flag is true. > > > > > > s/intruduce/introduce/ > > > > > > > > > > > > Bjorn > > > > > > -- > > > மணிவண்ணன் சதாசிவம் > > -- > மணிவண்ணன் சதாசிவம்