On Aug 12, 2014, at 12:36 AM, Jungseok Lee wrote:
> On Aug 12, 2014, at 12:07 AM, Christoffer Dall wrote:
>> On Mon, Aug 11, 2014 at 11:23:04PM +0900, Jungseok Lee wrote:
>>> On Aug 11, 2014, at 8:35 PM, Christoffer Dall wrote:
>>>> On Mon, Aug 11, 2014 at 12:24:35PM +0100, Richard W.M. Jones wrote:
>>>>> On Mon, Aug 11, 2014 at 01:20:46PM +0200, Christoffer Dall wrote:
>>>>>> On Sun, Aug 10, 2014 at 02:24:04PM +0100, Richard W.M. Jones wrote:
>>>>>>> kvm_alloc_stage2_pgd has to do an order 9 allocation, ie. 512
>>>>>>> contiguous pages I think.
>>>>>>>
>>>>>>> This often leads to problems running qemu when memory is relatively
>>>>>>> low -- eg. if you have one VM running, a healthy number of host
>>>>>>> applications, and perhaps "just" 4GB free; then you decide to run the
>>>>>>> libguestfs test suite.
>>>>>>>
>>>>>>> Any suggestions how to deal with this?
>>>>>>>
>>>>>> I'm not familiar with the libguestfs test suite, but are you saying you
>>>>>> have 4GB of free physical memory and when you start your first VM then
>>>>>> you get this error? That sounds unlikely to me.
>>>>>
>>>>> No, it runs hundreds of appliances (not all at the same time). Some
>>>>> fail.
>>>>>
>>>>> It seems to be a memory fragmentation issue, rather than the absolute
>>>>> free memory.
>>>>>
>>>> Ok, that's what I thought. You can probably hack around it by reducing
>>>> S2_PGD_ORDER to whatever is accessible by the VMs you wish to run (as
>>>> Jungseok also points out), but I'm afraid an upstream solution is
>>>> probably not ready before the next merge window opens, at least.
>>>
>>> In case of ARM64 KVM, it is possible to reduce S2_PGD_ORDER in the following way.
>>>
>>> --- a/arch/arm64/include/asm/kvm_mmu.h
>>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>>> @@ -62,7 +62,28 @@
>>>   * Align KVM with the kernel's view of physical memory. Should be
>>>   * 40bit IPA, with PGD being 8kB aligned in the 4KB page configuration.
>>>   */
>>> -#define KVM_PHYS_SHIFT PHYS_MASK_SHIFT
>>> +static inline int kvm_get_pa_range(void)
>>> +{
>>> +	int pa_range = read_cpuid(ID_AA64MMFR0_EL1) & 0xf;
>>> +
>>> +	switch (pa_range) {
>>> +	case 0:
>>> +		return 32;
>>> +	case 1:
>>> +		return 36;
>>> +	case 2:
>>> +		return 40;
>>> +	case 3:
>>> +		return 42;
>>> +	case 4:
>>> +		return 44;
>>> +	case 5:
>>> +		return 48;
>>> +	default:
>>> +		return -EINVAL;
>>> +	}
>>> +}
>>> +#define KVM_PHYS_SHIFT kvm_get_pa_range()
>>>  #define KVM_PHYS_SIZE (1UL << KVM_PHYS_SHIFT)
>>>  #define KVM_PHYS_MASK (KVM_PHYS_SIZE - 1UL)
>>>
>>> The code puts limitation on guest's address space which is at most
>>> host's physical address space. For example, if host runs on Cortex-A57,
>>> IPA is set to 44, not 48.
>>>
>>> If this approach looks reasonable, I will post it as 3.17-rc1 comes up.
>>> If not, please ignore it or use it as hack.
>>>
>>
>> Did you check what happens when handling a stage-2 translation fault due
>> to the input address being larger than the address space specified by
>> the T0SZ field?
>
> I will check it carefully.
>
>> My feeling is that this should only be included in a proper rework of
>> the supported guest physical address sizes.
>
> I agree. I just would like to figure out a right approach.
>
Thanks for the comment! As Christoffer points out, T0SZ field should be
considered together.

- Jungseok Lee
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
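
For anyone who wants to try the PARange decoding from the quoted patch
outside the kernel, here is a minimal userspace sketch. It is only an
illustration under a few assumptions: pa_range_to_bits(), phys_size() and
phys_mask() are hypothetical names that do not exist in the kernel, the
register value is passed in as a plain integer instead of being read with
read_cpuid(), and unknown encodings return -1 so that the caller has to
check the result before using it as a shift count. The table itself is the
one from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): decode the PARange field,
 * bits [3:0] of an ID_AA64MMFR0_EL1 value, into a physical address width
 * in bits. The table mirrors the switch statement in the quoted patch.
 * Returns -1 for encodings the table does not cover.
 */
int pa_range_to_bits(uint64_t mmfr0)
{
	static const int bits[] = { 32, 36, 40, 42, 44, 48 };
	unsigned int pa_range = mmfr0 & 0xf;

	if (pa_range >= sizeof(bits) / sizeof(bits[0]))
		return -1;
	return bits[pa_range];
}

/*
 * Hypothetical equivalents of KVM_PHYS_SIZE and KVM_PHYS_MASK for a given
 * shift. The caller must pass a valid shift (0..63), i.e. it must have
 * rejected a negative result from pa_range_to_bits() first.
 */
uint64_t phys_size(int shift)
{
	return UINT64_C(1) << shift;
}

uint64_t phys_mask(int shift)
{
	return phys_size(shift) - 1;
}
```

With a PARange field of 4, which is the encoding the patch maps to the
44 bits mentioned above for Cortex-A57, pa_range_to_bits() returns 44 and
phys_mask(44) covers the low 44 bits. Returning an error value that the
caller must check, rather than feeding it straight into a shift as the
KVM_PHYS_SHIFT macro in the patch would, is a choice made for this sketch
only.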