On Thu, Mar 02, 2017 at 01:20:05PM +0100, Paolo Bonzini wrote:
> On 02/03/2017 12:39, James Hogan wrote:
> > It can't right now, though with relocation of the kernel now implemented
> > in MIPS Linux for KASLR, and hopes for a more generic EVA implementation
> > (which can require the kernel to be linked in a completely different
> > segment) it isn't completely infeasible.
>
> What about the other way round, sticking a minimal T&E stub in kernel
> space and running the kernel in userspace?  Would it be feasible or
> would it be as complex as KVM itself?

You mean have a fallback in the guest kernel to keep the kernel running
from userspace addresses in kernel mode, so it works both in VZ guests
and non-virtualized? Interesting idea.

I think it would involve a lot of complexity. It could forgo some of the
emulation of privileged instructions that KVM T&E does, since it's
running in kernel mode, but memory management would be more complex, and
invasive changes would be required to the kernel.

- Memory privilege protection is on the granularity of segments, so with
  the traditional segment layout all of USeg (0x00000000..0x7FFFFFFF) is
  accessible to user mode, so you'd still need to utilise ASIDs to
  separate the address spaces of actual user programs running in
  0x00000000..0x3FFFFFFF from the kernel code running in
  0x40000000..0x7FFFFFFF.

- USeg is always TLB mapped. That means any kernel code could trigger
  TLB exceptions, which breaks existing assumptions (e.g. normally from
  unmapped kernel segments you can disable interrupts and then
  manipulate the TLB, but that isn't safe if a TLB refill exception
  could happen at any time and clobber the TLB registers).

If in the future we manage to work around these issues and map the
kernel (for security/protection purposes), then it would be easier, but
then we'll likely already have the capability to fully relocate into a
different segment.

> > 1) QEMU, which I've implemented using the kvm_type machine callback.
> > This allows the KVM type to be specified with e.g.
> > "-machine malta,accel=kvm,kvm-type=TE"
> > Otherwise it defaults to using KVM_VM_MIPS_DEFAULT.
> >
> > When you try and load a kernel (which happens after kvm_init() has
> > already passed the kvm type into KVM_CREATE_VM) it will check that it
> > supports the current kernel type.
> >
> > 2) My kvm test application, which uses KVM_VM_MIPS_DEFAULT by default
> > and hackily maps itself into the guest physical address space to run C
> > code test cases.
>
> So this one would work for both TE and VZ because the guest is not a
> Linux kernel.

Yes, the test code is position independent and careful to avoid direct
references to any symbols. The GPA mappings are set up the same, but the
virtual addresses (PC, stack pointer etc) are set up slightly
differently depending on whether the VZ capability is present.

> I don't know... Instinctively I would think that it's easy to get
> KVM_VM_MIPS_DEFAULT wrong and place the VZ-and-fall-back-to-TE policy in
> userspace, but I can be convinced otherwise if the failure mode is good
> enough.

Yeah, I think I agree. It isn't really necessary to have that decision
making in the kernel, and to use a particular KVM type userspace needs
to be aware of it, so it can always figure out from capabilities which
one to use prior to KVM_CREATE_VM.

I suppose the exception is T&E. Userspace shouldn't assume that just
because VZ is available T&E isn't (even if that is the case right now).
It could always just try KVM_CREATE_VM with kvm type 0 and detect the
error I suppose, but capabilities are nicer.

Maybe I'll redefine KVM_CAP_MIPS_VZ a bit, such that the value returned
+ 1 is a bitmask of supported kvm types:

  has T&E = !!( (v + 1) & BIT(KVM_VM_MIPS_TE) )
  has VZ  = !!( (v + 1) & BIT(KVM_VM_MIPS_VZ) )

That way old kernels which return 0 are consistent, and other
implementations could be added if really necessary without confusing
userland (but fingers crossed it'll never ever be necessary).

> For example, what happens if you use KVM_SET_USER_MEMORY_REGION
> for a kernel address in TE mode?

That deals with physical addresses and user/kernel memory is
distinguished by the virtual address, so the KVM mode (T&E vs VZ)
doesn't make a difference here.

Cheers
James
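
For illustration, the "value + 1 is a bitmask" decoding proposed above
could look roughly like this in userspace. This is only a sketch: the
EX_* names and helpers are hypothetical, T&E being type 0 is implied by
the "kvm type 0" remark above, and VZ being type 1 is purely an
assumption. cap_value stands for whatever KVM_CHECK_EXTENSION returns
for KVM_CAP_MIPS_VZ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type numbers, for illustration only. */
enum {
    EX_KVM_VM_MIPS_TE = 0, /* implied by "kvm type 0" in the mail */
    EX_KVM_VM_MIPS_VZ = 1, /* assumed value */
};

/* Equivalent of the kernel's BIT() macro for a 32-bit mask. */
static uint32_t ex_bit(unsigned int n)
{
    assert(n < 32); /* shifting by >= 32 would be undefined */
    return UINT32_C(1) << n;
}

/*
 * Turn the capability value into a bitmask of supported kvm types.
 * A negative value (ioctl error) is treated as "nothing supported".
 * An old kernel returning 0 yields mask 1, i.e. T&E only.
 */
static uint32_t ex_supported_types(int cap_value)
{
    if (cap_value < 0)
        return 0;
    return (uint32_t)cap_value + 1u;
}

/* True if the given kvm type is advertised by the capability value. */
static bool ex_has_type(int cap_value, unsigned int type)
{
    return (ex_supported_types(cap_value) & ex_bit(type)) != 0;
}
```

With these assumed numbers, a return value of 0 means T&E only, 1 means
VZ only, and 2 means both, so userspace can pick a type before calling
KVM_CREATE_VM.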