On Sat, 14 Jan 2023 at 06:49, Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote: > > On 2023/01/14 14:23, Richard Henderson wrote: > > On 1/8/23 22:22, Akihiko Odaki wrote: > >> libvirt uses "none" machine type to test KVM availability. Before this > >> change, QEMU used to pass 0 as machine type when calling KVM_CREATE_VM. > >> > >> The kernel documentation says: > >>> On arm64, the physical address size for a VM (IPA Size limit) is > >>> limited to 40bits by default. The limit can be configured if the host > >>> supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use > >>> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type > >>> identifier, where IPA_Bits is the maximum width of any physical > >>> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the > >>> machine type identifier. > >>> > >>> e.g, to configure a guest to use 48bit physical address size:: > >>> > >>> vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48)); > >>> > >>> The requested size (IPA_Bits) must be: > >>> > >>> == ========================================================= > >>> 0 Implies default size, 40bits (for backward compatibility) > >>> N Implies N bits, where N is a positive integer such that, > >>> 32 <= N <= Host_IPA_Limit > >>> == ========================================================= > >> > >>> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host > >>> and is dependent on the CPU capability and the kernel configuration. > >>> The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the > >>> KVM_CHECK_EXTENSION ioctl() at run-time. > >>> > >>> Creation of the VM will fail if the requested IPA size (whether it is > >>> implicit or explicit) is unsupported on the host. > >> https://docs.kernel.org/virt/kvm/api.html#kvm-create-vm > >> > >> So if Host_IPA_Limit < 40, such KVM_CREATE_VM will fail, and libvirt > >> incorrectly thinks KVM is not available. This actually happened on M2 > >> MacBook Air. > >> > >> Fix this by specifying 32 for IPA_Bits as any arm64 system should > >> support the value according to the documentation. > >> > >> Signed-off-by: Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> > >> --- > >> accel/kvm/kvm-all.c | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > >> index e86c33e0e6..776ac7efcc 100644 > >> --- a/accel/kvm/kvm-all.c > >> +++ b/accel/kvm/kvm-all.c > >> @@ -2294,7 +2294,11 @@ static int kvm_init(MachineState *ms) > >> KVMState *s; > >> const KVMCapabilityInfo *missing_cap; > >> int ret; > >> +#ifdef TARGET_AARCH64 > >> + int type = 32; > >> +#else > >> int type = 0; > >> +#endif > > > > No need for an ifdef. Down below we have, > > > > if (object_property_find(OBJECT(current_machine), "kvm-type")) { > > g_autofree char *kvm_type = > > object_property_get_str(OBJECT(current_machine), > > "kvm-type", > > &error_abort); > > type = mc->kvm_type(ms, kvm_type); > > } else if (mc->kvm_type) { > > type = mc->kvm_type(ms, NULL); > > } > > > > and the aarch64 -M virt machine provides virt_kvm_type as mc->kvm_type. > > > > How did you hit this? Are you trying to implement your own board model? > > > > Looking at this, I'm surprised this is a board hook and not a cpu hook. > > But I suppose the architecture specific 'type' can hide any number of > > sins. Anyway, if you are doing your own board model, I suggest > > arranging to share the virt board hook -- maybe moving it to > > target/arm/kvm.c in the process? > I hit this problem when I used libvirt; libvirt uses "none" machine type > to probe the availability of KVM and "none" machine type does not > provide kvm_type hook. > > As the implementation of "none" machine type is shared among different > architectures, we cannot remove ifdef by moving it to the hook. > > Although implementing the hook for "none" machine type is still > possible, I think the default type should provide the lowest common > denominator and "none" machine type shouldn't try to work around when > the type is wrong. Otherwise it doesn't make sense to provide the "default". Yes, the problem is that the 'none' board type is all architecture-independent code, and so is this kvm_init() code, so there's no obvious arm-specific place to say "pick the best IPA size that will work for this host". Perhaps we should create somewhere in here a target-arch specific hook: we already have ifdefs in this function for S390X and PPC (printing some special case error strings if the ioctl fails), so maybe a hook that does "take the type provided by the machine hook, if any, sanitize or reject it, do the ioctl call, print arch-specific help/error messages if relevant" ? Paolo, do you have an opinion? thanks -- PMM