On 10/07/2017 17:32, alazar@xxxxxxxxxxxxxxx wrote: > Thanks for your review, Paolo. > Below are some answers. > Will have to chew on the others. I'm not sure what you think of removing KVMI_EVENT_ACTION_SET_REGS and more or less standardizing on actions SKIP/RETRY/ALLOW/CRASH. The main remaining issue seems to be map/unmap. Paolo > On Fri, 7 Jul 2017 18:52:45 +0200, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: >> On 07/07/2017 16:34, Adalbert Lazar wrote: >>> +:Returns: ↴ >> >> What is this character? :) > > I thought using a nice right-down arrow instead of a boring/inexpressive > ASCII char in order to avoid trailing spaces will be cool. At least it > was fun :) > >>> + struct kvmi_get_version_reply { >>> + __s32 err; >>> + __u32 version; >>> + }; >>> + >>> +Returns the introspection API version (the KVMI_VERSION constant) and the >>> +error code (zero). In case of an unlikely error, the version will have an >>> +undefined value. >> >> Would it make sense to return a few more information fields, for example >> the set of supported KVMI_CONTROL_EVENTS events? > > Sure. > >>> + struct kvmi_get_guest_info_reply { >>> + __s32 err; >>> + __u16 vcpu_count; >>> + __u16 padding; >>> + __u64 tsc_speed; >>> + }; >>> + >>> +Returns the number of online vcpus, and the TSC frequency in HZ, if supported >>> +by the architecture (otherwise is 0). >> >> Can the TSC frequency be specified only if the guest is using TSC >> scaling? Defining the TSC frequency on older hosts is a bit tricky. 0 >> would be the host. >> >> Maybe define the second padding to be >> >> __u16 arch; >> >> (0 = x86) followed by an arch-specific payload. > > Sure. > >>> +5. KVMI_SHUTDOWN_GUEST >>> +---------------------- >>> + >>> +Ungracefully shutdown the guest. >> >> Note that all KVM can do here is pass down a request asynchronously to >> userspace, failing further KVM_RUN ioctls. I'm suggesting below an >> alternative action. > > We were thinking about sending an "ungraceful" TERM signal :D > > Adding another reponse (exit-to-userspace-with-shutdown) on events, > as you suggested below, makes sense. > >>> +12. KVMI_SET_PAGE_ACCESS >>> +------------------------ >>> + >>> +Sets the spte flags (rwx - present, write & user) - access - for the specified >>> +vCPU and guest physical address. >> >> rwx or pwu? I suppose RWX. Maybe #define the constants in the >> documentation. > > RWX. > Will do. > >>> +17. KVMI_CONTROL_EVENTS >>> +----------------------- >>> + >>> +:Architectures: all >>> +:Versions: >= 1 >>> +:Parameters: ↴ >>> + >>> +:: >>> + >>> + struct kvmi_control_events { >>> + __u16 vcpu; >>> + __u16 padding; >>> + __u32 events; >>> + }; >>> + >>> +:Returns: ↴ >>> + >>> +:: >>> + >>> + struct kvmi_error_code { >>> + __s32 err; >>> + __u32 padding; >>> + }; >>> + >>> +Enables/disables vCPU introspection events, by setting/clearing one or more >>> +of the following bits (see 'Events' below) : >>> + >>> + KVMI_EVENT_CR >>> + KVMI_EVENT_MSR >>> + KVMI_EVENT_XSETBV >>> + KVMI_EVENT_BREAKPOINT >>> + KVMI_EVENT_USER_CALL >>> + KVMI_EVENT_PAGE_FAULT >>> + KVMI_EVENT_TRAP >>> + >>> +Trying to enable unsupported events (~KVMI_KNOWN_EVENTS) by the current >>> +architecture would fail and -EINVAL will be returned. >> >> I would prefer the interface to allow enable (set bits), disable (clear >> bits) in addition to set (what you have here) the events. > > Calling KVMI_CONTROL_EVENTS with event=KVMI_EVENT_CR|KVMI_EVENT_MSR will > enable CR and MSR events and disable all the other events. It functions like > a set and clear, in the same time. > > It will be better to have KVMI_ENABLE_CONTROL_EVENTS and > KVMI_DISABLE_CONTROL_EVENTS instead? > >> The return value could indicate the set of enabled events. > > Indeed. > >>> +Events >>> +------ >>> + >>> +All vcpu events are sent using the KVMI_EVENT_VCPU message id. No event will >>> +be sent unless enabled with a KVMI_CONTROL_EVENTS command. >>> + >>> +For x86, the message data starts with a common structure:: >>> + >>> + struct kvmi_event_x86 { >>> + __u16 vcpu; >>> + __u8 mode; >>> + __u8 padding1; >>> + __u32 event; >>> + struct kvm_regs regs; >>> + struct kvm_sregs sregs; >>> + struct { >>> + __u64 sysenter_cs; >>> + __u64 sysenter_esp; >>> + __u64 sysenter_eip; >>> + __u64 efer; >>> + __u64 star; >>> + __u64 lstar; >> >> cstar too, for AMD CPUs. > > Thanks. Will do. > >>> + } msrs; >>> + }; >>> + >>> +In order to help the introspection tool with the event analysis while >>> +avoiding unnecessary introspection commands, the message data holds some >>> +registers (kvm_regs, kvm_sregs and a couple of MSR-s) beside >>> +the vCPU id, its mode (in bytes) and the event (one of the flags set >>> +with the KVMI_CONTROL_EVENTS command). >>> + >>> +The replies to events also start with a common structure, having the >>> +KVMI_EVENT_VCPU_REPLY message id:: >>> + >>> + struct kvmi_event_x86_reply { >>> + struct kvm_regs regs; >> >> Put regs last? >> >>> + __u32 actions; >>> + __u32 padding; >>> + }; > > Right. > >>> +5. KVMI_EVENT_USER_CALL >>> +----------------------- >> >> Please rename this to KVMI_EVENT_HCALL or HYPERCALL or VMCALL. > > Sure. > >>> + >>> +:Architectures: x86 >>> +:Versions: >= 1 >>> +:Parameters: ↴ >>> + >>> +:: >>> + >>> + struct kvmi_event_x86; >>> + >>> +:Returns: ↴ >>> + >>> +:: >>> + >>> + struct kvmi_event_x86_reply; >>> + >>> +This event is sent on a user hypercall and the introspection has already >>> +already been enabled for this kind of event (KVMI_CONTROL_EVENTS). >>> + >>> +kvmi_event_x86 is sent to the introspector, which can respond with the >>> +KVMI_EVENT_ACTION_SET_REGS bit set in 'actions', instructing the host >>> +kernel to override the general purpose registers using the values from >>> +introspector (regs). >> >> Does KVMI_EVENT_ACTION_SET_REGS bypass the hypercall? Why is >> KVMI_EVENT_ACTION_ALLOW not allowed? As before, I'd prefer >> SKIP/RETRY/ALLOW/CRASH as the actions. > > We've used this as way for guest code to communicate with the introspector. > KVMI_EVENT_ACTION_ALLOW is implied here. > >>> --- /dev/null >>> +++ b/include/uapi/linux/kvmi.h > ... >> >> Errnos values are not portable, I'd prefer to have them defined >> explicitly in the header. > > We were thinking that the introspection tool will take care of this, > as with the endianness. > > Surely, it will be much more clear to have the errnos defined here. > > Thanks again, > > Adalbert >