Re: [PATCH 2/8] iommu/arm-smmu-v3: Use S2FWB when available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 20, 2024 at 07:52:53PM +0000, Mostafa Saleh wrote:
> On Tue, Aug 20, 2024 at 09:01:02AM -0300, Jason Gunthorpe wrote:
> > On Tue, Aug 20, 2024 at 08:30:05AM +0000, Mostafa Saleh wrote:
> > > Hi Jason,
> > > 
> > > On Tue, Aug 06, 2024 at 08:41:15PM -0300, Jason Gunthorpe wrote:
> > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field
> > > > works. When S2FWB is supported and enabled the IOPTE will force cachable
> > > > access to IOMMU_CACHE memory and deny cachable access otherwise.
> > > > 
> > > > This is not especially meaningful for simple S2 domains, it apparently
> > > > doesn't even force PCI no-snoop access to be coherent.
> > > > 
> > > > However, when used with a nested S1, FWB has the effect of preventing the
> > > > guest from choosing a MemAttr that would cause ordinary DMA to bypass the
> > > > cache. Consistent with KVM we wish to deny the guest the ability to become
> > > > incoherent with cached memory the hypervisor believes is cachable so we
> > > > don't have to flush it.
> > > > 
> > > > Turn on S2FWB whenever the SMMU supports it and use it for all S2
> > > > mappings.
> > > 
> > > I have been looking into this recently from the KVM side as it will
> > > use FWB for the CPU stage-2 unconditionally for guests(if supported),
> > > however that breaks for non-coherent devices when assigned, and
> > > limiting assigned devices to be coherent seems too restrictive.
> > 
> > kvm's CPU S2 doesn't care about non-DMA-coherent devices though? That
> > concept is only relevant to the SMMU.
>
> Why not? That would be a problem if a device is not dma coherent,
> and the VM knows that and maps it’s DMA memory as non cacheable.
> But it would be overridden by FWB in stage-2 to be cacheable,
> it would lead to coherency issues.

Oh, from that perspective yes, but the entire point of S2FWB is that
VM's can not create non-coherent access so it is a bit nonsense to ask
for both S2FWB and try to assign a non-DMA coherent device.

> Yes, that also breaks (although I think this is an easier problem to
> solve)

Well, it is easy to solve, just don't use S2FWB and manually flush the
caches before the hypervisor touches any memory. :)

> What I mean is the master itself not the SMMU (the SID basically),
> so in that case the STE shouldn’t have FWB enabled.

That doesn't matter, those cases will not pass in IOMMU_CACHE and they
will work fine with S2FWB turned on.

> > Also bear in mind VFIO won't run unless ARM_SMMU_FEAT_COHERENCY is set
> > so we won't even get a chance to ask for a S2 domain.
> 
> Oh, I think that is only for the SMMU, not for the master, the
> SMMU can be coherent (for pte, ste …) but the master can still be
> non coherent. Looking at how VFIO uses it, that seems to be a bug?

If there are mixes of SMMU feature and dev_is_dma_coherent() then it
would be a bug yes..

I recall we started out trying to use dev_is_dma_coherent() but
Christoph explained it doesn't work that generally:

https://lore.kernel.org/kvm/20220406135150.GA21532@xxxxxx/

Seems we sort of gave up on it, too complicated. Robin had a nice
observation of the complexity:

    Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
    systems, there's the more interesting effectively-opposite scenario 
    where an SMMU bridges non-coherent devices to a coherent interconnect. 
    It's not something we take advantage of yet in Linux, and it can only be 
    properly described in ACPI, but there do exist situations where 
    IOMMU_CACHE is capable of making the device's traffic snoop, but 
    dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
    would still say non-coherent because they can't assume that the SMMU is 
    enabled and programmed in just the right way.

Anyhow, for the purposes of KVM and VFIO, devices that don't work with
IOMMU_CACHE are not allowed. From an API perspective
IOMMU_CAP_CACHE_COHERENCY is supposed to return if the struct device
can use IOMMU_CACHE.

The corner case where we have a ARM_SMMU_FEAT_COHERENCY SMMU but
somehow specific devices don't support IOMMU_CACHE is not properly
reflected in IOMMU_CAP_CACHE_COHERENCY. I don't know how to fix that,
and we've been ignoring it for a long time now :)

Jason




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux