Re: [PATCH 0/2] Workaround for Cavium ThunderX2 erratum 219

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 11, 2019 at 11:44:55AM +0100, Will Deacon wrote:
> Hi JC,
> 
> Thanks for posting this.
> 
> On Fri, Oct 11, 2019 at 10:35:21AM +0000, Jayachandran Chandrasekharan Nair wrote:
> > These two patches are based on the work by Marc Zyngier and addresses
> > Cavium ThunderX2 erratum 219.
> > 
> > This erratum (originally reported by ARM folks) is from an interesting
> > use of the prefetch instruction in the KPTI patchset. The prefetch
> > was done between a TTBR change and the corresponding ISB, and this
> > occasionally caused a crash on ThunderX2.
> > 
> > The first patch removes the troublesome prefetch for ThunderX2.
> > The second patch addresses the case where the issue can be triggered
> > from a guest kernel. The workaround in this case is to trap TTBR
> > accesses by setting HCR_EL2.TVM for guests and doing the system
> > register update from EL2 in a fast path.
> 
> FWIW, I was already planning to send the following to Linus:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=errata/tx2-219
> 
> so please base any changes on top of that branch.

Please consider taking my patchset as is, if you don't have
issues with patches.

> > Due to the nature of the erratum, the trap-and-emulate is only
> > needed when SMT is enabled.
> > 
> > The overhead of trap-and-emulate is expected to be negligible on most
> > workloads. A command line option kvm-arm.vm_msr_trap has been
> > provided to override trapping on guest TTBR updates.  This is to
> > address a very limited case where a user wants to run SMT enabled,
> > with a trustworthy guest kernel, and wants to avoid the performance
> > overhead associated with emulating the address translation register
> > changes.
> 
> Do you have any performance data to show the impact of the workaround on
> non-kpti guests? I don't think we can justify the inclusion of a cmdline
> option for this without figures showing that it's really necessary.
> Otherwise, the "very limited case" really is a niche scenario where the
> CONFIG option can simply be disabled.

In my view, you are switching the responsibility here. Even one use
case should be enough reason not to force a performance regression
that cannot be opted out of. You are expected to leave as much policy
as reasonable to the user with safe (and least astonishing) defaults.

A class of guest usecases involve running stock images (linux or
non-linux). The arm64 server ecosystem is still in development, so
we should allow someone evaluating a server system to turn off or
on options as much as possible without forcing a re-compile.

Also, the run-time option is generic enough that it can be switched
on/off for any platform, not just the one affected by this erratum.

So, I disagree with the queued patchset - but I will leave you to make
your call on which way to go.

JC
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux