Re: Direct mapping of devices that need cache coherent memory

Hi Magnus,
  Christoffer already explained the root cause, so I don't have much
to add.

I was under the impression that you could grab a huge chunk of memory
early during initialization, while memory is not yet fragmented.

But if the region is so big and contiguous that you have to reserve
it, then I don't know.

On 11/07/2014 01:35 AM, Magnus Karlsson wrote:
> Mario,
> 
> That is exactly what I am using, which is good since it means I am not
> doing something completely silly. The problem I am having is that
> kvm_is_mmio_pfn() returns true for my networking device memory region
> because PageReserved() returns true as the (in the Qemu driver mmaped)
> physical memory region was carved out of DRAM using a memreserve in the
> device tree. This region is huge and needs to be consecutive. 
> 
> Is it a valid assumption that all Reserved pages are supposed to be
> treated as non-coherent device pages (as it indirectly sets mem_type =
> PAGE_S2_DEVICE in user_mem_abort)?  

No. If I had to guess, PageReserved() has a double meaning here, and it
fails for your use case, which is legitimate. Another way to identify
device memory may be needed.
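
To make the double meaning concrete, here is a minimal user-space sketch
of the decision. It only mirrors the logic of the kvm_is_mmio_pfn()
quoted further down in this thread; struct mock_page, mock_is_mmio(),
mock_stage2_type() and the S2_NORMAL/S2_DEVICE enum are hypothetical
stand-ins made up for illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state the kernel consults. */
struct mock_page {
    bool valid;     /* models pfn_valid(): the pfn has a struct page */
    bool zero_page; /* models is_zero_pfn() */
    bool reserved;  /* models PageReserved() */
};

/* Stand-ins for PAGE_S2 and PAGE_S2_DEVICE. */
enum s2_mem_type { S2_NORMAL, S2_DEVICE };

/* Same decision as the kvm_is_mmio_pfn() quoted in this thread. */
static bool mock_is_mmio(const struct mock_page *p)
{
    assert(p != NULL);
    if (p->valid)
        return !p->zero_page && p->reserved;
    return true; /* no struct page at all: assumed to be MMIO */
}

/* Models the mem_type choice made in user_mem_abort(). */
static enum s2_mem_type mock_stage2_type(const struct mock_page *p)
{
    return mock_is_mmio(p) ? S2_DEVICE : S2_NORMAL;
}
```

With this model, a page described as { valid = true, reserved = true }
(DRAM carved out with memreserve) and a page with valid = false (a real
MMIO range) both come out as S2_DEVICE. The check cannot tell the two
apart, which is the problem you are hitting.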

> The git log comment for the if
> statement in user_mem_abort that causes my problems fits perfectly with
> what I want to achieve, though my mem region is cache coherent normal
> memory, not non-coherent device memory. Is there a way to communicate
> what type of memory I have mapped via VFIO or /dev/mem that this code
> could use? 

Not sure, but it doesn't look like it can. The stage 2 page fault
handler just defaults to S2_DEVICE in your case because of the page
type attributes.

There was a discussion some time back about relaxing stage 2 to PAGE_S2
(I think a patch was already posted) and letting stage 1 determine the
attributes. That would allow you to set your memory region cacheable.
But there was a security hole with the GIC, where guests could access
each other's GIC ranges. Perhaps just making the GIC range S2_DEVICE
would work, but the maintainers need to resolve these issues. You might
have to look for a workaround.
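
As a rough illustration of why the stage 2 setting matters: the two
stages combine so that the more restrictive memory type wins. The sketch
below is a simplified model under that assumption. It ignores
shareability and the inner/outer cacheability split, and the enum and
function names are hypothetical, not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified memory types, ordered from least to most restrictive. */
enum mem_type {
    MT_NORMAL_WB = 0, /* Normal, write-back cacheable */
    MT_NORMAL_NC = 1, /* Normal, non-cacheable */
    MT_DEVICE    = 2  /* Device */
};

/* The combined type is the more restrictive of the two stages. */
static enum mem_type combine_stages(enum mem_type stage1,
                                    enum mem_type stage2)
{
    assert(stage1 <= MT_DEVICE && stage2 <= MT_DEVICE);
    return stage1 > stage2 ? stage1 : stage2;
}

/* True only if the guest ends up with a cacheable Normal mapping. */
static bool guest_sees_cacheable(enum mem_type stage1,
                                 enum mem_type stage2)
{
    return combine_stages(stage1, stage2) == MT_NORMAL_WB;
}
```

In this model, guest_sees_cacheable(MT_NORMAL_WB, MT_DEVICE) is false:
once stage 2 says Device, nothing the guest puts in stage 1 makes the
region cacheable. With stage 2 relaxed to MT_NORMAL_WB, the result is
whatever stage 1 asks for.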

> 
> git show b88657674
>    ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping
>     
>     A userspace process can map device MMIO memory via VFIO or /dev/mem,
>     e.g., for platform device passthrough support in QEMU.
>     
>     During early development, we found the PAGE_S2 memory type being used
>     for MMIO mappings.  This patch corrects that by using the more strongly
>     ordered memory type for device MMIO mappings: PAGE_S2_DEVICE.
>   
> 
> Thanks: Magnus
> 
> On Thu, Nov 6, 2014 at 9:36 PM, Mario Smarduch <m.smarduch@xxxxxxxxxxx> wrote:
> 
>     One approach may be to create a host driver to mmap your host
>     memory from qemu, use memory_region_init_ram_ptr(), and map that
>     into some guest address space. Similar to ivshmem except instead
>     of posix shared memory use your shared memory driver.
>     kvm_is_mmio_pfn() should return false on pfn
>     returned from gfn_to_pfn_prot().
> 
>     - Mario
> 
> 
>     On 11/06/2014 12:13 PM, Magnus Karlsson wrote:
>     > Mario,
>     >
>     > Yes. It is part of the regular DRAM that can be used as standard Linux
>     > memory or memory for the networking device. If you have 16 GB of DRAM
>     > memory you decide at boot time how much of it should be used by Linux
>     > and how much should be used by the networking device. It is all coherent.
>     >
>     > Magnus
>     >
>     > On 6 Nov 2014 at 20:28, "Mario Smarduch" <m.smarduch@xxxxxxxxxxx> wrote:
>     >
>     >     Is the memory you're trying to pass into guest to communicate
>     >     with your network devices coherently part of onboard DRAM?
>     >
>     >     - Mario
>     >
>     >     On 11/06/2014 05:51 AM, Magnus Karlsson wrote:
>     >     > Hi,
>     >     >
>     >     > I have a problem with the cacheability memory attribute
>     >     > permissions that are inserted into the stage 2 page table by
>     >     > KVM for ARM. Would be great if I could get some advice. Maybe
>     >     > I am doing something wrong in the setup or this kind of device
>     >     > has not been encountered before.
>     >     >
>     >     > So here we go. I have a board with an LSI Axxia AXM5516 SoC
>     >     > used for networking equipment. In addition to 16 A15 cores, it
>     >     > contains a lot of networking HW. The networking HW communicates
>     >     > with the cores through the use of a part of the regular cache
>     >     > coherent DRAM memory. This region has to be consecutive and is
>     >     > today carved out using a memreserve in the device tree. Usually
>     >     > this region is several times larger than the amount of memory
>     >     > given to Linux.
>     >     >
>     >     > My problem starts when I would like to access this region from
>     >     > a guest. I have made a direct mapping of this region into the
>     >     > guest by adding a sysbus_mmio region into the machine model of
>     >     > Qemu (which will end up calling the KVM_SET_USER_MEMORY_REGION
>     >     > ioctl). And yes I will start to use VFIO as soon as it is
>     >     > accepted ;-). When the stage 2 translation entry for this
>     >     > region is about to get entered in
>     >     > arch/arm/kvm/mmu.c:user_mem_abort() I get to this line:
>     >     >
>     >     > if (kvm_is_mmio_pfn(pfn))
>     >     >        mem_type = PAGE_S2_DEVICE;
>     >     >
>     >     >
>     >     > bool kvm_is_mmio_pfn(pfn_t pfn)
>     >     > {
>     >     >       if (pfn_valid(pfn))
>     >     >                return !is_zero_pfn(pfn) &&
>     >     >                       PageReserved(pfn_to_page(pfn));
>     >     >
>     >     >       return true;
>     >     > }
>     >     >
>     >     > For the device to work I need mem_type to continue to be
>     >     > PAGE_S2. But kvm_is_mmio_pfn() returns true as PageReserved()
>     >     > is true due to my memreserved region. I get an uncached device
>     >     > memory mapping as stage 2 overrides the mapping in stage 1 (and
>     >     > the cores do not see what the networking hardware wrote in the
>     >     > caches). If I instead remove the device memory completely from
>     >     > Linux, then pfn_valid() will be false and kvm_is_mmio_pfn()
>     >     > will still return true (but I would like all memory to be part
>     >     > of Linux, so not an idea I fancy). So what to do? Do I need to
>     >     > register the region differently with KVM (and Qemu), should I
>     >     > use the VFIO patches instead because they solve the problem, or
>     >     > does this code assume that all reserved pages are non-coherent
>     >     > device memory and it needs to be extended? Removing the code is
>     >     > of course not a solution even for me as there are plenty of
>     >     > other more normal devices on the SoC that are non-coherent.
>     >     >
>     >     > BTW, I cannot use the CMA (instead of memreserve) since it does
>     >     > not work with highmem.
>     >     >
>     >     > Thanks: Magnus
>     >     >
>     >     >
>     >     > --
>     >     >
>     >     > *Magnus Karlsson*
>     >     >
>     >     > Software Development Engineering Manager
>     >     >
>     >     > Avago Technologies (formerly LSI Logic)
>     >     >
>     >     > Box 1024, Knarrarnäsgatan 15
>     >     >
>     >     > SE-164 21 Kista, Sweden
>     >     >
>     >     > TEL +46 8 594 607 09
>     >     >
>     >     > FAX +46 8 594 607 10
>     >     >
>     >     > CELL +46 73 80 444 88
>     >     >
>     >     > magnus.karlsson@xxxxxxxxxxxxx
>     >     >
>     >     >
>     >     >
>     >     > _______________________________________________
>     >     > kvmarm mailing list
>     >     > kvmarm@xxxxxxxxxxxxxxxxxxxxx
>     <mailto:kvmarm@xxxxxxxxxxxxxxxxxxxxx>
>     <mailto:kvmarm@xxxxxxxxxxxxxxxxxxxxx
>     <mailto:kvmarm@xxxxxxxxxxxxxxxxxxxxx>>
>     >     > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>     >     >
>     >
> 





