Re: Direct mapping of devices that need cache coherent memory

Hi Magnus,

I think the check in kvm_is_mmio_pfn() should be modified if it is common
to have coherent memory which is marked as reserved to Linux.  However, I'm
not entirely sure which other solutions would work with all the other
architectures in KVM that use this check.
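
For anyone following along, the behaviour described in the quoted messages
can be modelled in a few lines of userspace C.  This is only a sketch of the
decision logic quoted further down in this thread, not kernel code: struct
pfn_info, model_is_mmio_pfn(), model_stage2_type() and check_model_cases()
are names made up for illustration, and the three booleans stand in for the
results of pfn_valid(), is_zero_pfn() and PageReserved().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-pfn facts the kernel check consults. */
struct pfn_info {
	bool valid;     /* models pfn_valid(): a struct page exists */
	bool zero_page; /* models is_zero_pfn(): the shared zero page */
	bool reserved;  /* models PageReserved(): e.g. a memreserve carve-out */
};

/* Hypothetical names for the two stage 2 memory types in the thread. */
enum s2_mem_type { S2_NORMAL, S2_DEVICE };

/* Same structure as the kvm_is_mmio_pfn() quoted below. */
static bool model_is_mmio_pfn(struct pfn_info p)
{
	if (p.valid)
		return !p.zero_page && p.reserved;

	return true;
}

/* Models the mem_type choice made in user_mem_abort(). */
static enum s2_mem_type model_stage2_type(struct pfn_info p)
{
	return model_is_mmio_pfn(p) ? S2_DEVICE : S2_NORMAL;
}

/* Walks through the three cases discussed in the thread. */
static void check_model_cases(void)
{
	struct pfn_info ordinary_ram = { .valid = true };
	struct pfn_info memreserved  = { .valid = true, .reserved = true };
	struct pfn_info not_in_linux = { .valid = false };

	assert(model_stage2_type(ordinary_ram) == S2_NORMAL);
	/* Coherent DRAM carved out with memreserve is treated as a device. */
	assert(model_stage2_type(memreserved) == S2_DEVICE);
	/* Removing the region from Linux entirely gives the same result. */
	assert(model_stage2_type(not_in_linux) == S2_DEVICE);
}
```

Both the memreserve case and the removed-from-Linux case come out as device
memory, so these three inputs alone cannot tell coherent reserved DRAM apart
from real MMIO.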

I'm cc'ing Paolo and Ard here (Ard recently looked into this area as
well), but it may be worth trying to summarize the problem a little more
concisely and ask the question at a broader scope, for example on
kvm@xxxxxxxxxxxxxxx.

Sorry not to be of more immediate help,
-Christoffer

On Fri, Nov 07, 2014 at 10:35:45AM +0100, Magnus Karlsson wrote:
> Mario,
> 
> That is exactly what I am using, which is good since it means I am not
> doing something completely silly. The problem I am having is that
> kvm_is_mmio_pfn() returns true for my networking device memory region
> because PageReserved() returns true: the physical memory region (which
> the Qemu driver mmaps) was carved out of DRAM using a memreserve in the
> device tree. This region is huge and needs to be contiguous.
> 
> Is it a valid assumption that all Reserved pages are supposed to be treated
> as non-coherent device pages (as it indirectly sets mem_type =
> PAGE_S2_DEVICE in user_mem_abort)? The git log comment for the if statement
> in user_mem_abort that causes my problems fits perfectly with what I want
> to achieve, though my mem region is cache coherent normal memory, not
> non-coherent device memory. Is there a way to communicate what type of
> memory I have mapped via VFIO or /dev/mem that this code could use?
> 
> git show b88657674
>    ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping
> 
>     A userspace process can map device MMIO memory via VFIO or /dev/mem,
>     e.g., for platform device passthrough support in QEMU.
> 
>     During early development, we found the PAGE_S2 memory type being used
>     for MMIO mappings.  This patch corrects that by using the more strongly
>     ordered memory type for device MMIO mappings: PAGE_S2_DEVICE.
> 
> 
> Thanks: Magnus
> 
> On Thu, Nov 6, 2014 at 9:36 PM, Mario Smarduch <m.smarduch@xxxxxxxxxxx>
> wrote:
> 
> > One approach may be to create a host driver to mmap your host
> > memory from qemu, use memory_region_init_ram_ptr(), and map that
> > into some guest address space. Similar to ivshmem except instead
> > of posix shared memory use your shared memory driver.
> > kvm_is_mmio_pfn() should return false on pfn
> > returned from gfn_to_pfn_prot().
> >
> > - Mario
> >
> >
> > On 11/06/2014 12:13 PM, Magnus Karlsson wrote:
> > > Mario,
> > >
> > > Yes. It is part of the regular DRAM that can be used as standard Linux
> > > memory or memory for the networking device. If you have 16 GB of DRAM
> > > memory you decide at boot time how much of it should be used by Linux
> > > and how much should be used by the networking device. It is all coherent.
> > >
> > > Magnus
> > >
> > > > On 6 Nov 2014 20:28, "Mario Smarduch" <m.smarduch@xxxxxxxxxxx> wrote:
> > >
> > >     Is the memory you're trying to pass into guest to communicate
> > >     with your network devices coherently part of onboard DRAM?
> > >
> > >     - Mario
> > >
> > >     On 11/06/2014 05:51 AM, Magnus Karlsson wrote:
> > >     > Hi,
> > >     >
> > >     > I have a problem with the cacheability memory attribute permissions
> > >     > that are inserted into the stage 2 page table by KVM for ARM. It
> > >     > would be great if I could get some advice. Maybe I am doing
> > >     > something wrong in the setup, or this kind of device has not been
> > >     > encountered before.
> > >     >
> > >     > So here we go. I have a board with an LSI Axxia AXM5516 SoC used for
> > >     > networking equipment. In addition to 16 A15 cores, it contains a lot
> > >     > of networking HW. The networking HW communicates with the cores
> > >     > through a part of the regular cache coherent DRAM memory. This
> > >     > region has to be contiguous and is today carved out using a
> > >     > memreserve in the device tree. Usually this region is several times
> > >     > larger than the amount of memory given to Linux.
> > >     >
> > >     > My problem starts when I would like to access this region from a
> > >     > guest. I have made a direct mapping of this region into the guest by
> > >     > adding a sysbus_mmio region into the machine model of Qemu (which
> > >     > will end up calling the KVM_SET_USER_MEMORY_REGION ioctl). And yes I
> > >     > will start to use VFIO as soon as it is accepted ;-). When the stage
> > >     > 2 translation entry for this region is about to get entered in
> > >     > arch/arm/kvm/mmu.c:user_mem_abort() I get to this line:
> > >     >
> > >     > if (kvm_is_mmio_pfn(pfn))
> > >     >        mem_type = PAGE_S2_DEVICE;
> > >     >
> > >     >
> > >     > bool kvm_is_mmio_pfn(pfn_t pfn)
> > >     > {
> > >     >       if (pfn_valid(pfn))
> > >     >                return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
> > >     >
> > >     >       return true;
> > >     > }
> > >     >
> > >     > For the device to work I need mem_type to continue to be PAGE_S2.
> > >     > But kvm_is_mmio_pfn() returns true as PageReserved() is true due to
> > >     > my memreserved region. I get an uncached device memory mapping as
> > >     > stage 2 overrides the mapping in stage 1 (and the cores do not see
> > >     > what the networking hardware wrote in the caches). If I instead
> > >     > remove the device memory completely from Linux, then pfn_valid()
> > >     > will be false and kvm_is_mmio_pfn() will still return true (but I
> > >     > would like all memory to be part of Linux, so not an idea I fancy).
> > >     > So what to do? Do I need to register the region differently with KVM
> > >     > (and Qemu), should I use the VFIO patches instead because they solve
> > >     > the problem, or does this code assume that all reserved pages are
> > >     > non-coherent device memory and it needs to be extended? Removing the
> > >     > code is of course not a solution even for me as there are plenty of
> > >     > other more normal devices on the SoC that are non-coherent.
> > >     >
> > >     > BTW, I cannot use the CMA (instead of memreserve) since it does not
> > >     > work with highmem.
> > >     >
> > >     > Thanks: Magnus
> > >     >
> > >     >
> > >     > --
> > >     >
> > >     > *Magnus Karlsson*
> > >     >
> > >     > Software Development Engineering Manager
> > >     >
> > >     > Avago Technologies (formerly LSI Logic)
> > >     >
> > >     > Box 1024, Knarrarnäsgatan 15
> > >     >
> > >     > SE-164 21 Kista, Sweden
> > >     >
> > >     > TEL +46 8 594 607 09
> > >     >
> > >     > FAX +46 8 594 607 10
> > >     >
> > >     > CELL +46 73 80 444 88
> > >     >
> > >     > magnus.karlsson@xxxxxxxxxxxxx
> > >     >
> > >     >
> > >     >
> > >     > _______________________________________________
> > >     > kvmarm mailing list
> > >     > kvmarm@xxxxxxxxxxxxxxxxxxxxx <mailto:kvmarm@xxxxxxxxxxxxxxxxxxxxx>
> > >     > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > >     >
> > >
> >
> >
> 
> 