On 11 September 2014 18:48, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: > On 11/09/14 16:26, Ard Biesheuvel wrote: >> On 11 September 2014 16:52, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: >>> On 11/09/14 15:41, Ard Biesheuvel wrote: >>>> Hello all, >>>> >>>> I spent most of the day chasing a particularly weird heisenbug in the >>>> QEMU+KVM+UEFI combo. >>>> The symptom was that UEFI init would hang on the first write to the >>>> second NOR flash (to initialize the variable store) but *only* when >>>> using the -bios option (instead of -pflash) and a boot image of >>>> exactly 64 MB in size. Note that this implies that the second NOR >>>> flash was not file backed. >>>> >>>> As it turns out, the choice of the -bios option and the size of the >>>> file affect whether KVM ends up using sections or pages to back the >>>> NOR flash, and in my failure case, it was using the latter. That >>>> resulted in KVM going down a code path where the memory backing the >>>> NOR was writable, which breaks the MMIO emulation, and resulted in the >>>> hang on init of the variable store. >>>> >>>> The patch below fixes it for me. >>>> >>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c >>>> index c68ec28f17c3..121abc6fef97 100644 >>>> --- a/arch/arm/kvm/mmu.c >>>> +++ b/arch/arm/kvm/mmu.c >>>> @@ -817,7 +817,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, >>>> phys_addr_t fault_ipa, >>>> pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable); >>>> if (is_error_pfn(pfn)) >>>> return -EFAULT; >>>> >>>> - if (kvm_is_mmio_pfn(pfn)) >>>> + if (writable && kvm_is_mmio_pfn(pfn)) >>>> mem_type = PAGE_S2_DEVICE; >>>> >>>> spin_lock(&kvm->mmu_lock); >>>> >>>> Here is the definition of kvm_is_mmio() for completeness. I am a bit >>>> out of my depth here, so perhaps someone else can shed some light on >>>> this? >>>> >>>> bool kvm_is_mmio_pfn(pfn_t pfn) >>>> { >>>> if (pfn_valid(pfn)) >>>> return PageReserved(pfn_to_page(pfn)); >>>> >>>> return true; >>>> } >>>> >>>> To me, it is particularly puzzling what PageReserved() has to do with >>>> anything, as I couldn't find any other uses of it under kvm/ >>>> >>> >>> My understanding is that kvm_is_mmio_pfn() is used for *devices* that >>> are mapped directly mapped (think device assignment). PageReserved() >>> would make sense there. >>> >>> Now, I'm not familiar with the whole QEMU setup, so maybe you could >>> describe how things are mapped, and what is supposed to happen? >>> >> >> When running QEMU using the -bios <file> option, the file is exposed >> to the guest as an emulated NOR flash at 0x0, so that you can boot >> from it directly. >> In my case, the NOR is used for the boot image itself, and for a >> non-volatile variable store at 0x400_0000, which is initialized by >> UEFI when it boots. >> >> In order for the NOR emulation to work, writes to the NOR need to >> trap, so that QEMU can take down the whole memory region, trapping all >> reads and writes, until a command is issued that puts it back into >> array mode, and the memory region is created again. In this mode, the >> guest reads to the NOR go straight to host RAM. (Or they should: the >> patch you merged today fixed and issue where reads were mistaken for >> writes and sent to QEMU instead) >> >> So when UEFI enters it non-volatile variable store driver, it first >> issues a read to the base of the second half of the NOR (0x400_0000), >> resulting in some host RAM to be pinned to back it up. However, for >> some reason, it is mapped as PAGE_S2_DEVICE, which is read-write, so >> when subsequently a write is issued to kick the NOR into command mode, >> it just gets sent to host RAM as well. >> >> Indeed, by the looks of it, kvm_is_mmio_pfn() is intended to map host >> device memory straight into the guest physical address space, but this >> is not what I am doing, so why is it returning 'true' here? > > That's the bit I do not understand. Any chance you can find out who sets > this bit in the page tables? > That is also the bit *I* don't understand, and that is why I am asking you guys :-) Let me first double check that PageReserved is actually the culprit here ... _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm