-----Original Message-----
From: Marc Zyngier <maz@xxxxxxxxxx>
Sent: Monday, January 18, 2021 9:26 PM
To: Jianyong Wu <Jianyong.Wu@xxxxxxx>
Cc: Justin He <Justin.He@xxxxxxx>; nd <nd@xxxxxxx>; will@xxxxxxxxxx;
kvmarm@xxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] arm64/kvm: correct the error report in
kvm_handle_guest_abort
On 2021-01-18 13:04, Jianyong Wu wrote:
> Hi Marc,
>
>> -----Original Message-----
>> From: kvmarm-bounces@xxxxxxxxxxxxxxxxxxxxx <kvmarm-
>> bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Jianyong Wu
>> Sent: Saturday, January 16, 2021 4:47 PM
>> To: Marc Zyngier <maz@xxxxxxxxxx>
>> Cc: Justin He <Justin.He@xxxxxxx>; nd <nd@xxxxxxx>; will@xxxxxxxxxx;
>> kvmarm@xxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [PATCH] arm64/kvm: correct the error report in
>> kvm_handle_guest_abort
>>
>> Hi Marc,
>>
>> > -----Original Message-----
>> > From: Marc Zyngier <maz@xxxxxxxxxx>
>> > Sent: Friday, January 15, 2021 7:21 PM
>> > To: Jianyong Wu <Jianyong.Wu@xxxxxxx>
>> > Cc: James Morse <James.Morse@xxxxxxx>; will@xxxxxxxxxx; Suzuki
>> Poulose
>> > <Suzuki.Poulose@xxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
>> > kvmarm@xxxxxxxxxxxxxxxxxxxxx; Steve Capper
<Steve.Capper@xxxxxxx>;
>> > Justin He <Justin.He@xxxxxxx>; nd <nd@xxxxxxx>
>> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
>> > kvm_handle_guest_abort
>> >
>> > On 2021-01-15 09:30, Jianyong Wu wrote:
>> > > Currently, error report when cache maintenance at read-only
>> > > memory range, like rom, is not clear enough and even not correct.
>> > > As the specific error is definitely known by kvm, it is obliged
>> > > to give it out.
>> > >
>> > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash
>> > > range from
>> > > 0 to 128M, error is reported by kvm as "Data abort outside
>> > > memslots with no valid syndrome info" which is not quite correct.
>> > >
>> > > Signed-off-by: Jianyong Wu <jianyong.wu@xxxxxxx>
>> > > ---
>> > > arch/arm64/kvm/mmu.c | 12 +++++++++---
>> > > 1 file changed, 9 insertions(+), 3 deletions(-)
>> > >
>> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
>> > > 7d2257cc5438..de66b7e38a5b 100644
>> > > --- a/arch/arm64/kvm/mmu.c
>> > > +++ b/arch/arm64/kvm/mmu.c
>> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct
kvm_vcpu
>> > > *vcpu)
>> > > * So let's assume that the guest is just being
>> > > * cautious, and skip the instruction.
>> > > */
>> > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
>> > {
>> > > - kvm_incr_pc(vcpu);
>> > > - ret = 1;
>> > > + if (kvm_vcpu_dabt_is_cm(vcpu)) {
>> > > + if (kvm_is_error_hva(hva)) {
>> > > + kvm_incr_pc(vcpu);
>> > > + ret = 1;
>> > > + goto out_unlock;
>> > > + }
>> > > +
>> > > + kvm_err("Do cache maintenance in the read-
only
>> > memory range\n");
>> >
>> > We don't scream on the console for guests bugs.
>> Ok
>>
>> >
>> > > + ret = -EFAULT;
>> > > goto out_unlock;
>> > > }
>> >
>> > And what is userspace going to do with that? To be honest, I'd
>> > rather not report it in any case:
>> >
>> > - either it isn't mapped, and there is no cache to clean/invalidate
>> > - or it is mapped read-only:
>> > - if it is a "DC IVAC", the guest should get the fault as per
>> > the ARM ARM. But I don't think we can identify the particular CMO
>> > at this stage, so actually performing an invalidation is the least
>> > bad thing to do.
>> >
>> > How about this (untested)?
>
> I have tested for this. It works that DC ops can pass on memory range
> for rom. But there is performance issue. It takes too long a time that
> do DC on rom range compared with on normal memory range. Here is
some
> data:
> Ops memory type size
> time
> dc civac rom memory 128M
> 6700ms;
> dc civac writable normal memory 128M
> 300ms;
>
> It's a single thread test and may be worse on multi thread. I'm not
> sure we can bear it. WDYT?
The problem is that the guest is invalidating one cache-line at a
time, but we
invalidate 128M at a time in your example.
I would say that I really don't care how slow it is. We cannot know
which
address the guest is trying to invalidate (as your commit message
shows,
there is no syndrome information available).
So it seems our only choices are:
- don't do any invalidation, which is likely to break the guest
- invalidate everything, always
Given that, I'd rather have a slow guest. Also, it very much looks
like no
existing SW does this, so I cannot say I care much.