Re: [PATCH] KVM: arm64: Add support for IDSR exits to userspace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-03-23 08:22, Lev Aronsky wrote:
On Sun, Mar 22, 2020 at 05:29:52PM +0000, Marc Zyngier wrote:
On 2020-03-22 14:20, Lev Aronsky wrote:
> On Sun, Mar 22, 2020 at 11:34:35AM +0000, Marc Zyngier wrote:
> > Hi Lev,
> >
> > Thanks for this.
> >
> > On 2020-03-22 09:36, aronsky@xxxxxxxxx wrote:
> > > From: Lev Aronsky <aronsky@xxxxxxxxx>
> > >
> > > As per ARM DDI 0487E.a, section D12.3.2, there can be various
> > > system registers that are IMPLEMENTATION DEFINED.
> > >
> > > So far, KVM would inject an undefined instruction into the guest,
> > > whenever access to an implementation defined system register (from
> > > here on referred to as IDSR) was trapped. This makes sense, since a
> > > general-purpose OS probably shouldn't rely on the existence of IDSRs.
> > >
> > > This patch introduces an option to give userspace a chance to handle
> > > these traps. This can be used to emulate specific architectures, and
> > > virtualize operating systems that rely on the existence of specific
> > > IDSRs.
> >
> > Do you have an example of such operating systems? Also, do you have
> > an example where userspace could actually do something useful about
> > such access, other than maybe treating it as RAZ/WI?
>
> I ran into the issue when working on our company's project, which aims
> to emulate Apple's iOS under QEMU. While emulation currently works
> nicely, we were looking to improve performance by using hardware
> virtualization, and that's when we ran into the issue of
> implementation-defined system registers, since iOS uses those. Frankly,
> in our case, we don't really know the purpose of those registers, as far
> as the iOS kernel is concerned. When emulating them, we treat them as
> simple 64-bit storage areas. It's possible that treating them as RAZ/WI
> would work, as well.

It's not really reassuring, is it? :-/

It's not perfect, but also not too bad - the system is booting as
expected. It might not be a 100% perfect emulation of a real device
without a proper implementation of those registers, but it's good enough
in our case.

Hum. OK, I guess.

> > > Similarly to the recently introduced NISV exits, this is an ABI to
> > > userspace, and comes with a matching new capability that allows the
> > > configuration of the behavior.
> >
> > These are different issues: one is a shortcoming of the architecture,
> > the other a shortcoming (or hyperspecialization) of the guest OS.
>
> You're correct. I mentioned the NISV exits because my code is modeled
> after them, and was hoping it would make following my changes easier.

It does. IT is just that wou did seem to conflate the two things, which
triggered an allergic reaction... ;-)

> > > This patch introduces KVM_CAP_ARM_IDSR_TO_USER, which can be used to
> > > enable userspace exits due to access to IDSRs. Additionally, it
> > > introduces a matching new exit reason, KVM_EXIT_ARM_IDSR, to report
> > > those events to userspace. Userspace can choose to emulate the access
> > > based on the architecture requirements, or refuse to emulate it and
> >
> > s/architecture/implementation/
>
> Sorry for the newbie question - but how do I change the description now?
> I will upload an updated patch, based on your following comments, but
> I'm not sure how to change the original commit description the correct
> way.

git commit --amend

Thanks. I know about amending commits - I was wondering about what to do
with the amended commit. As far as I understand, given there are
additional changes I'll have to make, I'd have to make a v2 of the
patch, but according to the guides I've seen, I would only add the
changelog of the newer version of the patch, so I wasn't sure whether to
make the correction in the original description of the patch or not.

In general, the commit log must be accurate, so you are free to change
it at any time to reflect what the patch does.


>
> > > let the kernel continue with the default behavior (injecting an
> > > undefined instruction exception into the guest).
> > >
> > > Signed-off-by: Lev Aronsky <aronsky@xxxxxxxxx>
> > > ---
> > >  arch/arm64/include/asm/kvm_coproc.h |  1 +
> > >  arch/arm64/include/asm/kvm_host.h   |  6 +++
> > >  arch/arm64/kvm/sys_regs.c           | 66 ++++++++++++++++++++++++++++-
> > >  include/uapi/linux/kvm.h            | 14 ++++++
> > >  tools/include/uapi/linux/kvm.h      | 14 ++++++
> > >  virt/kvm/arm/arm.c                  | 11 +++++
> > >  6 files changed, 111 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_coproc.h
> > > b/arch/arm64/include/asm/kvm_coproc.h
> > > index 0185ee8b8b5e..34b45efffe52 100644
> > > --- a/arch/arm64/include/asm/kvm_coproc.h
> > > +++ b/arch/arm64/include/asm/kvm_coproc.h
> > > @@ -33,6 +33,7 @@ int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct
> > > kvm_run *run);
> > >  int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > >  int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > >  int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > > +int kvm_handle_idsr_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > >
> > >  #define kvm_coproc_table_init kvm_sys_reg_table_init
> > >  void kvm_sys_reg_table_init(void);
> > > diff --git a/arch/arm64/include/asm/kvm_host.h
> > > b/arch/arm64/include/asm/kvm_host.h
> > > index d87aa609d2b6..951c7e6fec8b 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -91,6 +91,12 @@ struct kvm_arch {
> > >  	 * supported.
> > >  	 */
> > >  	bool return_nisv_io_abort_to_user;
> > > +
> > > +	/*
> > > +	 * If we encounter an access to an implementation-defined system
> > > +	 * register, report this to user space.
> > > +	 */
> > > +	bool return_idsr_to_user;
> >
> > If we are going to add flags like this, maybe we'd be better off
> > having
> > actual flags as an unsigned long.
>
> I don't mind either way - as I mentioned, I tried to follow the NISV
> exits style, and therefore used `bool`. Should I keep it, change it to
> `unsigned long`, or change both mine and the NISV flag to `unsigned
> long`?

Change return_nisv_io_abort_to_user to "unsigned long flags", add a new
RETURN_NISV_IO_ABORT_TO_USER flag and repaint all the current users
in a preliminary patch.

Will do. This means there will be two patchsets, right?

One patch-set, at least two patches (and a cover letter).

[...]

> I must admit that I haven't taken an in-depth look at 32-bit accesses,
> since it was out of scope of our project (which currently focuses only
> on newish versions of iOS). IMO, the patch is useful as is, even without
> the 32-bit support - but I can see how it looks like a job half-done.

The 32bit guest support definitely isn't optional. I appreciate that you
focus on 64bit, but having different behaviours between the two is not
really something I want to entertain.

Totally understandable - but I'm afraid that in this case, the patch
will have to wait, as we currently focus our R&D resources on further
advancing our project. I will do my best, however, to try and find some
time to work on it in the near future. Would booting a 32-bit ARM Linux
and using a kernel module to write/read implementation-defined registers
be an acceptable approach to test it? Also, just to make sure I
understand correctly - we're talking about AArch32 mode of arm64
processors, right? Not testing it on 32-bit ARM cores (which, as I
understand, are being retired in 5.7 anyway).

32bit-on-64bit is what I care about, yes. 32bit *host* is on its way to
retirement, as you found out. And yes, just booting any 32bit guest and
checking that you indeed trap access to an IMPDEF register would be enough.


> Unfortunately, we don't have a 32-bit setup that would allow us to
> properly test the code before submitting. I could copy the changes over,
> but submitting the patches without actually testing them seems
> irresponsible to me.

Well, you seem to at least have QEMU. This should be enough to test
*something*. Also, if you're testing this with KVM, what are you running
KVM on?

We're running it on an ARM cloud server (we were hoping to be able to
use SBCs for the project, but iOS uses 16K pages for kernel mode, and we found out (the hard way) that most older/cheaper ARM cores don't support
it (Cortex A76 being the first one to support it, IIRC).

I think there is more than just A76. A55 definitely has TGran16, as well as
A73, A75, A65 (I've stopped looking in the various TRMs). So definitely
in the realm of SBCs (I have a quad A55 on my desk, worth $70).


> Interestingly, EL0 access to implementation-defined registers currently
> results in an UNDEF, even though I expected it to be passed on to our
> handler (I saw this behavior with a custom system register we defined
> for direct communication with the hypervisor from a user-mode program we
> developed). I tried following the ARM documentation to figure out what
> could cause such a behavior, but so far I'm at a loss.

Here's your answer:

"When the value of HCR_EL2.TIDCP is 1, it is IMPLEMENTATION DEFINED whether any of this functionality accessed from EL0 is trapped to EL2. If it is not, then it is UNDEFINED, and any attempt to access it from EL0 generates an
exception that is taken to EL1."

Also, I don't really understand how you define a custom system register.
Unless you're writing the HW as well, of course.

We are using QEMU as the hypervisor. QEMU allows for definition of
arbitrary system registers (based on opc0/opc1/opc2/crm/crn), with
custom read/write callback functions. We have a custom machine for
iPhone emulation (you can take a look at our code at
https://github.com/alephsecurity/xnu-qemu-arm64, if you're interested),
so yeah - you could say we're writing the hardware, as well.

I'm pretty sure this wouldn't work with HW virtualization. I suspect
this would UNDEF directly on the CPU, leading to an exception being taken at EL1 without intervention of the hypervisor. Which makes sense as you'd
be executing an instruction that the CPU really doesn't implement.

> As far as the whole idea of supporting non-architectural guests... First
> of all, you're absolutely spot-on regarding your guess - as mentioned
> above, we're attempting to virtualize iOS. I believe that in the current
> state of the patch, given its opt-in nature and limited scope
> (implementation-defined registers), its benefits (the potential to
> successfully execute non-architetural guests) outweigh the negatives
> (extra code in the kernel, that's unnecessary to most of the guests
> under ARM/KVM). I would be happy to give more examples (other than iOS)
> that use these registers, but unfortunately, I'm not aware of any such
> systems (though I wouldn't be surprised if Samsung has some
> Exynos-specifc code, but that's just a shot in the dark).

Oh, absolutely *all* implementations have IMPDEF registers. You just
have to look at the TRM for any of the ARM implementations, which
document a number of them.

And as for having non-architectural code in the kernel, that's of course, as you've guessed, a total no-no-no. We made that mistake on 32bit, and
stopped quickly!

My worry is more that you'd need to start faking a lot more than just a
couple of IMPDEP sysregs. All the ID registers, for a start. Then find
ways to deal with the host's errata, that the guest has to know about
(despite the ID registers being faked). And make sure that no workaround
that happen in the guest result in anything unexpected.

Some of that is possible (the ID register bit, although it requires some very serious work), and you could cross your fingers hoping that no erratum
will get in the way. But it seems pretty fragile...

That's definitely something we're worried about, but so far our efforts
have been successful (as in, right now, with the above changes in the
kernel and matching updates to QEMU, our iOS image boots to user-mode
successfully). We are definitely skipping on some parts of the OS (for
example, anything related to TPM). It's possible that as we attempt to
improve the accuracy of our emulation/virtualization, we'll run into the
problems you're describing.

Fun! ;-)

I'm looking forward to your posting of the next version that would include 32bit supoport. Another thing I'd ask is a corresponding support patch for a userspace VMM of your choice (QEMU, kvmtool) so that I can at least exercise
the support.

Thanks,

        M.
--
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux