On 2016-08-04 13:18, Mihai Donțu wrote:
> On Thu, 4 Aug 2016 10:50:30 +0200 Paolo Bonzini wrote:
>> On 04/08/2016 05:25, Stephen Pape wrote:
>>> My approach involves modifying the kernel driver to export a
>>> /dev/virt/ filesystem. I suppose I could do it all via /dev/kvm ioctls
>>> as well.
>>>
>>> My (relatively minor) patch allows processes besides the launching
>>> process to do things like map guest memory and read VCPU states for a
>>> VM. Soon, I'll be looking into adding support for handling events (cr3
>>> writes, int3 traps, etc.). Eventually, an event should come in, a
>>> program will handle it (while able to read memory/registers), and then
>>> resume the VCPU.
>>
>> I think the interface should be implemented entirely in userspace and it
>> should be *mostly* socket-based; I say mostly because I understand that
>> reading memory directly can be useful.
>
> We are working on something similar, but we're looking into making it
> entirely in kernel and possibly leveraging VirtIO, due to performance
> considerations (mostly caused by the overhead of hardware virtualization).
>
> The model we're aiming for is: on a KVM host, out of the N running VMs,
> one has special privileges allowing it to manipulate the memory and vCPU
> state of the others. We call that special VM an SVA (Security Virtual
> Appliance) and it uses a channel (much like the one found on Xen:
> evtchn) and a set of specific VMCALLs to:
>
>  * receive notifications from the host when a new VM is
>    created/destroyed
>  * manipulate the EPT of a specific VM
>  * manipulate the vCPU state of a specific VM (GPRs)
>  * manipulate the memory of a specific VM (insert code)
>
> We don't have much code in place at the moment, but we plan to post an
> RFC series in the near future.
>
> Obviously we've tried the userspace/QEMU approach since it would have
> made development _much_ easier, but it's simply not "performant" enough.

What was the bottleneck? VCPU state monitoring/manipulation, VM memory
access, or GPA-to-HPA (i.e. EPT on Intel) manipulation? I suppose that
information will be essential when you want to convince the maintainers
to add another kernel interface (in times when such interfaces are
rather being reduced than added).

Jan

> This whole KVM work is actually a "glue" to an introspection technology
> we have developed and which uses extensive hooking (via EPT) to monitor
> execution of the kernel and user-mode processes, all the while aiming
> to shave at most 20% off the performance of each VM (in a 100-VM
> setup).
>
>> So this is a lot like a mix of two interfaces:
>>
>> - a debugger interface which lets you read/write registers and set events
>>
>> - the vhost-user interface which lets you pass the memory map (a mapping
>>   between guest physical addresses and offsets in a file descriptor) from
>>   QEMU to another process.
>>
>> The gdb stub protocol seems limited for the kind of event you want to
>> trap, but there was already a GSoC project a few years ago that looked
>> at gdb protocol extensions. Jan, what was the outcome?
>>
>> In any case, I think there should be a separation between the ioctl KVM
>> API and the socket userspace API. By the way, most of the KVM API is
>> already there (e.g. reading/writing registers, breakpoints, etc.),
>> though you'll want to add events such as cr3 or idtr writes.
>>
>>> My question is, is this something the KVM group would be interested in
>>> bringing upstream? I'd definitely be willing to change my approach if
>>> necessary. If there's no interest, I'll just have to maintain my own
>>> patches.
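
For readers trying to picture the interface Stephen describes above, here is
a rough sketch of what a userspace consumer of such a /dev/virt node might
look like. The device path, ioctl numbers, and structures are invented for
illustration and are not taken from the actual patch; only the overall flow
(wait for an event, inspect vCPU state, resume the vCPU) follows the
description.

/*
 * Illustrative sketch only: the ioctl numbers and structures below are
 * hypothetical and do not correspond to the posted patch. They merely
 * mirror the described flow: wait for an event, inspect state, then
 * resume the vCPU.
 */
#include <fcntl.h>
#include <linux/ioctl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical event types for cr3 writes and int3 traps. */
enum virt_event_type {
	VIRT_EVENT_CR3_WRITE = 1,
	VIRT_EVENT_INT3      = 2,
};

/* Hypothetical event record delivered by the kernel side. */
struct virt_event {
	uint32_t type;   /* enum virt_event_type */
	uint32_t vcpu;   /* which vCPU triggered the event */
	uint64_t rip;    /* guest instruction pointer at the event */
	uint64_t data;   /* e.g. the new cr3 value */
};

/* Hypothetical ioctl numbers, for illustration only. */
#define VIRT_WAIT_EVENT  _IOR('V', 0x01, struct virt_event)
#define VIRT_RESUME_VCPU _IOW('V', 0x02, uint32_t)

int main(void)
{
	/* A per-VM node such as /dev/virt/<vm-id> is assumed here. */
	int fd = open("/dev/virt/0", O_RDWR);
	if (fd < 0) {
		perror("open");
		return EXIT_FAILURE;
	}

	for (;;) {
		struct virt_event ev;

		/* Block until the guest hits a monitored event. */
		if (ioctl(fd, VIRT_WAIT_EVENT, &ev) < 0) {
			perror("VIRT_WAIT_EVENT");
			break;
		}

		if (ev.type == VIRT_EVENT_CR3_WRITE)
			printf("vcpu %u: cr3 <- 0x%llx at rip 0x%llx\n",
			       ev.vcpu, (unsigned long long)ev.data,
			       (unsigned long long)ev.rip);

		/* Reading registers/memory for the handler would go here. */

		/* Let the guest continue. */
		if (ioctl(fd, VIRT_RESUME_VCPU, &ev.vcpu) < 0) {
			perror("VIRT_RESUME_VCPU");
			break;
		}
	}

	close(fd);
	return EXIT_SUCCESS;
}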
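
To make the SVA channel Mihai mentions a bit more concrete, below is a sketch
of how the appliance guest might issue its special VMCALLs. The hypercall
numbers and helpers are hypothetical; only the general mechanism (a VMCALL
with the call number and arguments in registers, dispatched by the host's
exit handler) reflects the description, and the register layout shown simply
mirrors the usual KVM hypercall convention.

/*
 * Rough illustration of how an SVA guest could issue introspection
 * VMCALLs. The hypercall numbers are invented for this sketch; an
 * actual series would define its own ABI. Register layout follows the
 * common KVM hypercall convention (nr in rax, arguments in rbx/rcx/rdx).
 */
#include <stdint.h>

/* Hypothetical hypercall numbers for the introspection channel. */
#define HC_INTRO_GET_GPRS 0x100 /* read a target vCPU's GPRs */
#define HC_INTRO_SET_EPT  0x101 /* change EPT permissions on a GPA */

static inline long sva_vmcall(unsigned long nr, unsigned long a0,
			      unsigned long a1, unsigned long a2)
{
	long ret;

	/* Trap to the host; the VMCALL exit handler dispatches on nr. */
	asm volatile("vmcall"
		     : "=a"(ret)
		     : "a"(nr), "b"(a0), "c"(a1), "d"(a2)
		     : "memory");
	return ret;
}

/* Example: make one guest page of VM 'vm_id' read/execute only. */
static inline long sva_protect_page(uint16_t vm_id, uint64_t gpa)
{
	return sva_vmcall(HC_INTRO_SET_EPT, vm_id, gpa, 0x5 /* r-x */);
}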
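
Paolo's vhost-user comparison boils down to handing another process a table
of guest-physical regions plus a file descriptor it can mmap. The structures
below are a simplified sketch in that spirit; they are not the actual
vhost-user wire format.

/*
 * Simplified sketch in the spirit of vhost-user's memory table. Each
 * region maps a range of guest-physical addresses to an offset in a
 * file descriptor passed over the socket, which the receiving process
 * can mmap() to read guest RAM directly.
 */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define MAX_MEM_REGIONS 8

struct mem_region {
	uint64_t guest_phys_addr; /* start of the region in GPA space */
	uint64_t size;            /* length of the region in bytes */
	uint64_t fd_offset;       /* offset into the shared memory fd */
	void *hva;                /* filled in locally after mmap() */
};

struct mem_table {
	uint32_t nregions;
	struct mem_region regions[MAX_MEM_REGIONS];
};

/* mmap() every region read-only; returns 0 on success, -1 on failure. */
static int map_guest_memory(int shared_fd, struct mem_table *table)
{
	for (uint32_t i = 0; i < table->nregions; i++) {
		struct mem_region *r = &table->regions[i];

		r->hva = mmap(NULL, r->size, PROT_READ, MAP_SHARED,
			      shared_fd, (off_t)r->fd_offset);
		if (r->hva == MAP_FAILED)
			return -1;
	}
	return 0;
}

/* Translate a guest-physical address to a local pointer, or NULL. */
static void *gpa_to_hva(const struct mem_table *table, uint64_t gpa)
{
	for (uint32_t i = 0; i < table->nregions; i++) {
		const struct mem_region *r = &table->regions[i];

		if (gpa >= r->guest_phys_addr &&
		    gpa - r->guest_phys_addr < r->size)
			return (char *)r->hva + (gpa - r->guest_phys_addr);
	}
	return NULL;
}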