On Thu, Aug 15, 2013 at 10:23 PM, Christoffer Dall <christoffer.dall@xxxxxxxxxx> wrote:
> On Thu, Aug 15, 2013 at 04:37:27PM +0100, Marc Zyngier wrote:
>> On 2013-08-15 16:13, Anup Patel wrote:
>> > Hi Marc,
>> >
>> > On Thu, Aug 15, 2013 at 8:17 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> >> On 2013-08-15 14:31, Anup Patel wrote:
>> >>> On Thu, Aug 15, 2013 at 2:01 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> >>>> On 2013-08-15 07:26, Anup Patel wrote:
>> >>>>> Hi Marc,
>> >>>>>
>> >>>>> On Thu, Aug 15, 2013 at 10:22 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> >>>>>> Hi Anup,
>> >>>>>>
>> >>>>>> On 2013-08-14 15:22, Anup Patel wrote:
>> >>>>>>> On Wed, Aug 14, 2013 at 5:34 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> >>>>>>>> Hi Pranav,
>> >>>>>>>>
>> >>>>>>>> On 2013-08-14 12:47, Pranavkumar Sawargaonkar wrote:
>> >>>>>>>>> Systems with a large external L3-cache (a few MBs) might have
>> >>>>>>>>> dirty content belonging to a guest page in the L3-cache. To
>> >>>>>>>>> tackle this, we need to flush such dirty content from the
>> >>>>>>>>> d-cache, so that the guest sees the correct contents of a
>> >>>>>>>>> guest page when its MMU is disabled.
>> >>>>>>>>>
>> >>>>>>>>> This patch fixes coherent_icache_guest_page() for systems
>> >>>>>>>>> with an external L3-cache.
>> >>>>>>>>>
>> >>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@xxxxxxxxxx>
>> >>>>>>>>> Signed-off-by: Anup Patel <anup.patel@xxxxxxxxxx>
>> >>>>>>>>> ---
>> >>>>>>>>>  arch/arm64/include/asm/kvm_mmu.h | 2 ++
>> >>>>>>>>>  1 file changed, 2 insertions(+)
>> >>>>>>>>>
>> >>>>>>>>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> >>>>>>>>> index efe609c..5129038 100644
>> >>>>>>>>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> >>>>>>>>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> >>>>>>>>> @@ -123,6 +123,8 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
>> >>>>>>>>>  	if (!icache_is_aliasing()) {	/* PIPT */
>> >>>>>>>>>  		unsigned long hva = gfn_to_hva(kvm, gfn);
>> >>>>>>>>>  		flush_icache_range(hva, hva + PAGE_SIZE);
>> >>>>>>>>> +		/* Flush d-cache for systems with external caches. */
>> >>>>>>>>> +		__flush_dcache_area((void *) hva, PAGE_SIZE);
>> >>>>>>>>>  	} else if (!icache_is_aivivt()) {	/* non ASID-tagged VIVT */
>> >>>>>>>>>  		/* any kind of VIPT cache */
>> >>>>>>>>>  		__flush_icache_all();
>> >>>>>>>>
>> >>>>>>>> [adding Will to the discussion as we talked about this in the past]
>> >>>>>>>>
>> >>>>>>>> That's definitely an issue, but I'm not sure the fix is to hit
>> >>>>>>>> the data cache on each page mapping. It looks like overkill.
>> >>>>>>>>
>> >>>>>>>> Wouldn't it be enough to let userspace do the cache cleaning?
>> >>>>>>>> kvmtool knows which bits of the guest memory have been
>> >>>>>>>> touched, and can do a "DC CVAC" on this region.
>> >>>>>>>
>> >>>>>>> It seems a bit unnatural to have cache cleaning in user-space.
>> >>>>>>> I am sure other architectures don't have such cache cleaning in
>> >>>>>>> user-space for KVM.
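
(As an aside, for readers following the thread: a userspace clean to the
Point of Coherency along the lines Marc suggests might look like the
sketch below. This is illustrative only - the function name and the
CTR_EL0 line-size probe are mine, not actual kvmtool code. It assumes
Linux's usual SCTLR_EL1.UCI/UCT settings, which permit DC CVAC and
CTR_EL0 reads from EL0.)

#include <stdint.h>
#include <stddef.h>

/*
 * Clean a range of guest memory to the Point of Coherency, so that a
 * guest booting with its MMU (and thus data caches) off sees what the
 * loader wrote.
 */
static void clean_dcache_to_poc(void *addr, size_t size)
{
	uintptr_t start = (uintptr_t)addr;
	uintptr_t end = start + size;
	uint64_t ctr;
	uintptr_t line;

	/* CTR_EL0[19:16] holds log2(words) of the smallest D-cache line. */
	asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
	line = 4UL << ((ctr >> 16) & 0xf);

	/* Align down to a line boundary, then clean line by line. */
	for (start &= ~(line - 1); start < end; start += line)
		asm volatile("dc cvac, %0" : : "r" (start) : "memory");
	asm volatile("dsb sy" : : : "memory");
}

(A loader would call this on every guest RAM range it wrote, after
loading the kernel image and before the first KVM_RUN.)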
>> >>>>>>>> The alternative is to do it in the kernel before running any
>> >>>>>>>> vcpu - but that's not very nice either (we would have to clean
>> >>>>>>>> the whole of guest memory, which makes a full dcache clean
>> >>>>>>>> more appealing).
>> >>>>>>>
>> >>>>>>> Actually, cleaning the full d-cache by set/way upon the first
>> >>>>>>> run of a VCPU was our second option, but the current approach
>> >>>>>>> seemed very simple, hence we went for it.
>> >>>>>>>
>> >>>>>>> If more people vote for a full d-cache clean upon the first run
>> >>>>>>> of a VCPU, then we should revise this patch.
>> >>>>>>
>> >>>>>> Can you please give the attached patch a spin on your HW? I've
>> >>>>>> boot-tested it on a model, but of course I can't really verify
>> >>>>>> that it fixes your issue.
>> >>>>>>
>> >>>>>> As far as I can see, it should fix it without any additional
>> >>>>>> flushing.
>> >>>>>>
>> >>>>>> Please let me know how it goes.
>> >>>>>
>> >>>>> HCR_EL2.DC=1 means all memory accesses with the Stage1 MMU off
>> >>>>> are treated as "Normal Non-shareable, Inner Write-Back
>> >>>>> Write-Allocate, Outer Write-Back Write-Allocate memory".
>> >>>>>
>> >>>>> HCR_EL2.DC=0 means all memory accesses with the Stage1 MMU off
>> >>>>> are treated as "Strongly-ordered device memory".
>> >>>>>
>> >>>>> Now if a Guest/VM accesses hardware MMIO devices directly (such
>> >>>>> as the VGIC CPU interface) with its MMU off, then those MMIO
>> >>>>> devices will be accessed as normal memory when HCR_EL2.DC=1.
>> >>>>
>> >>>> I don't think so. Stage-2 still applies, and should force MMIO to
>> >>>> be accessed as device memory.
>> >>>>
>> >>>>> HCR_EL2.DC=1 makes sense only if we have all software-emulated
>> >>>>> devices for the Guest/VM, which is not true for KVM ARM or KVM
>> >>>>> ARM64 because we use the VGIC.
>> >>>>>
>> >>>>> IMHO, this patch enforces an incorrect memory attribute for the
>> >>>>> Guest/VM when its Stage1 MMU is off.
>> >>>>
>> >>>> See above. My understanding is that HCR.DC controls the default
>> >>>> output of Stage-1, and Stage-2 overrides still apply.
>> >>>
>> >>> You had mentioned that the PAGE_S2_DEVICE attribute was redundant
>> >>> and wanted the guest to decide the memory attribute. In other
>> >>> words, you did not want to enforce any memory attribute in Stage2.
>> >>>
>> >>> Please refer to this patch:
>> >>> https://patchwork.kernel.org/patch/2543201/
>> >>
>> >> That patch has never been merged. If you carry on following the
>> >> discussion, you will certainly notice it was dropped for a very good
>> >> reason:
>> >> https://lists.cs.columbia.edu/pipermail/kvmarm/2013-May/005827.html
>> >>
>> >> So Stage-2 memory attributes are used, they are not going away, and
>> >> they are essential to the patch I sent this morning.
>> >>
>> >> M.
>> >> --
>> >> Fast, cheap, reliable. Pick two.
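
(To make the Stage-1/Stage-2 combination above concrete: the
architecture combines the attributes from both stages such that the
more restrictive one wins. A toy model - plain C, purely illustrative,
not kernel code:)

#include <stdio.h>

/* Simplified three-level memory-type lattice: Device is the most
 * restrictive, Normal Write-Back the least. */
enum memtype { DEVICE = 0, NORMAL_NC = 1, NORMAL_WB = 2 };

/* Stage-1 output: with the guest MMU off, HCR_EL2.DC selects the
 * default attribute; with it on, the guest's page tables decide. */
static enum memtype stage1_output(int mmu_on, enum memtype s1_attr, int hcr_dc)
{
	if (mmu_on)
		return s1_attr;
	return hcr_dc ? NORMAL_WB : DEVICE;
}

/* Stage-1/Stage-2 combining: the more restrictive attribute wins. */
static enum memtype combine(enum memtype s1, enum memtype s2)
{
	return s1 < s2 ? s1 : s2;
}

int main(void)
{
	/* Guest MMU off, HCR_EL2.DC=1, Stage-2 maps the VGIC as Device:
	 * the access is still a Device access - Marc's point above. */
	enum memtype eff = combine(stage1_output(0, NORMAL_WB, 1), DEVICE);

	printf("effective type: %s\n", eff == DEVICE ? "Device" : "Normal");
	return 0;
}

(So a Stage-2 Device mapping keeps MMIO safe even with HCR_EL2.DC=1;
the problem case is guest RAM, which is mapped Normal at Stage-2, as
the example below shows.)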
>> >
>> > HCR_EL2.DC=1 will break the Guest/VM bootloader/firmware if the
>> > Guest/VM is given a DMA-capable device in pass-through mode. The
>> > reason is that bootloaders/firmware typically don't enable the MMU,
>> > and such a bootloader/firmware will program a pass-through
>> > DMA-capable device without any flushes to guest RAM (because it has
>> > not enabled the MMU).
>> >
>> > A good example of such a device would be a SATA AHCI controller
>> > given to a Guest/VM as direct access (using a SystemMMU), with the
>> > Guest bootloader/firmware accessing this SATA AHCI controller to
>> > load kernel images from a SATA disk. In this situation, we would
>> > have to hack the Guest bootloader/firmware AHCI driver to do
>> > explicit flushes to Guest RAM (because we have HCR_EL2.DC=1).
>>
>> OK. So *that* looks like a valid argument against HCR_EL2.DC==1. Out
>> of curiosity: is that a made-up example or something you actually
>> have?
>>
>> Back to square one:
>> Can you please benchmark the various cache cleaning options (global at
>> startup time, per-page on S2 translation fault, and user-space)?
>>
> Eh, why is this a more valid argument than the vgic? The device
> passthrough Stage-2 mappings would still have the Stage-2 memory
> attributes to configure that memory region as device memory. Why is it
> relevant if the device is DMA-capable in this context?
>
> -Christoffer

Most ARM bootloaders/firmware run with the MMU off; hence they will not
do explicit cache flushes when programming a DMA-capable device. Now if
HCR_EL2.DC=1, then Guest RAM updates done by the bootloader/firmware
will not be visible to the DMA-capable device.

--Anup
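
(For completeness, here is a rough sketch of the "global at startup
time" option from the benchmark list above: clean all of guest RAM once
before the first VCPU ever runs. The function name and the first-run
hook are hypothetical; gfn_to_hva(), kvm_for_each_memslot() and
__flush_dcache_area() are existing kernel helpers. SRCU locking, error
handling, and faulting-in of not-yet-resident pages are elided.)

#include <linux/kvm_host.h>
#include <asm/cacheflush.h>

/*
 * Clean every page of guest RAM to the Point of Coherency once, before
 * the first VCPU runs, so a guest booting with caches off sees what
 * its loader wrote into memory.
 */
static void kvm_clean_guest_ram(struct kvm *kvm)
{
	struct kvm_memslots *slots = kvm_memslots(kvm);
	struct kvm_memory_slot *memslot;
	gfn_t gfn;

	kvm_for_each_memslot(memslot, slots) {
		for (gfn = memslot->base_gfn;
		     gfn < memslot->base_gfn + memslot->npages; gfn++) {
			unsigned long hva = gfn_to_hva(kvm, gfn);

			/* Skip slots without a usable userspace mapping. */
			if (!kvm_is_error_hva(hva))
				__flush_dcache_area((void *)hva, PAGE_SIZE);
		}
	}
}

(This trades a one-time, O(guest RAM) cost at startup against the
per-page flush in the posted patch; the per-page-on-S2-fault variant
spreads the same work across Stage-2 translation faults instead.)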