On Tue, Apr 16, 2024 at 11:29 PM Alejandro Jimenez <alejandro.j.jimenez@xxxxxxxxxx> wrote: > > > > On 4/16/24 15:51, Paolo Bonzini wrote: > > On Tue, Apr 16, 2024 at 8:08 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > >> > >> On Thu, Feb 15, 2024, Alejandro Jimenez wrote: > >>> The goal of this RFC is to agree on a mechanism for querying the state (and > >>> related stats) of APICv/AVIC. I clearly have an AVIC bias when approaching this > >>> topic since that is the side that I have mostly looked at, and has the greater > >>> number of possible inhibits, but I believe the argument applies for both > >>> vendor's technologies. > >>> > >>> Currently, a user or monitoring app trying to determine if APICv is actually > >>> being used needs implementation-specific knowlegde in order to look for specific > >>> types of #VMEXIT (i.e. AVIC_INCOMPLETE_IPI/AVIC_NOACCEL), checking GALog events > >>> by watching /proc/interrupts for AMD-Vi*-GA, etc. There are existing tracepoints > >>> (e.g. kvm_apicv_accept_irq, kvm_avic_ga_log) that make this task easier, but > >>> tracefs is not viable in some scenarios. Adding kvm debugfs entries has similar > >>> downsides. Suravee has previously proposed a new IOCTL interface[0] to expose > >>> this information, but there has not been any development in that direction. > >>> Sean has mentioned a preference for using BPF to extract info from the current > >>> tracepoints, which would require reworking existing structs to access some > >>> desired data, but as far as I know there isn't any work done on that approach > >>> yet. > >>> > >>> Recently Joao mentioned another alternative: the binary stats framework that is > >>> already supported by kernel[1] and QEMU[2]. > >> > >> The hiccup with stats are that they are ABI, e.g. we can't (easily) ditch stats > >> once they're added, and KVM needs to maintain the exact behavior. > > > > Stats are not ABI---why would they be? They have an established > > meaning and it's not a good idea to change it, but it's not an > > absolute no-no(*); and removing them is not a problem at all. > > > > For example, if stats were ABI, there would be no need for the > > introspection mechanism, you could just use a struct like ethtool > > stats (which *are* ABO). > > > > Not everything makes a good stat but, if in doubt and it's cheap > > enough to collect it, go ahead and add it. Cheap collection is the > > important point, because tracepoints in a hot path can be so expensive > > as to slow down the guest substantially, at least in microbenchmarks. > > > > In this case I'm not sure _all_ inhibits makes sense and I certainly > > wouldn't want a bitmask, > > I wanted to be able to query enough info via stats (i.e. is APICv active, and if > not, why is it inhibited?) that is exposed via the other interfaces which are not > always available. That unfortunately requires a bit of "overloading" of > the stat as I mentioned earlier, but it remains cheap to collect and expose. > > What would be your preferred interface to provide the (complete) APICv state? > > but a generic APICv-enabled stat certainly > > makes sense, and perhaps another for a weirdly-configured local APIC. > > Can you expand on what you mean by "weirdly-configured"? Do you mean something > like virtual wire mode? I mean any of APICV_INHIBIT_REASON_HYPERV, APICV_INHIBIT_REASON_PHYSICAL_ID_ALIASED, APICV_INHIBIT_REASON_APIC_ID_MODIFIED, APICV_INHIBIT_REASON_APIC_BASE_MODIFIED, APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED, which in practice are always going to be APICV_INHIBIT_REASON_HYPERV on 99.99% of the gues ExtINT is visible via irq_window_exits; if you are not running nested, do not have the problematic configurations above. don't have debugging (BLOCKIRQ) or in-kernel PIT with reinjection, that's basically the only one that's left. Paolo > Alejandro > > > > > Paolo > > > > (*) you have to draw a line somewhere. New processor models may > > virtualize parts of the CPU in such a way that some stats become > > meaningless or just stay at zero. Should KVM not support those > > features because it is not possible anymore to introspect the guest > > through stat? > > > >> Tracepoints are explicitly not ABI, and so we can be much more permissive when it > >> comes to adding/expanding tracepoints, specifically because there are no guarantees > >> provided to userspace. > >> > > > > >