Re: [PATCH/RFC 4/9] KVM: s390: Add MEMOP ioctls for reading/writing guest memory

On Mon, Mar 16, 2015 at 09:51:40AM +0100, Christian Borntraeger wrote:
> From: Thomas Huth <thuth@xxxxxxxxxxxxxxxxxx>
> 
> On s390, we've got to make sure to hold the IPTE lock while accessing
> logical memory. So let's add an ioctl for reading and writing logical
> memory to provide this feature for userspace, too.
> The maximum transfer size of this call is limited to 64kB to prevent
> the guest from triggering huge copy_from/to_user transfers. QEMU
> currently requests only one or two pages, so 16*4kB seems
> to be a reasonable limit here.
> 
> Signed-off-by: Thomas Huth <thuth@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> ---
>  Documentation/virtual/kvm/api.txt | 46 ++++++++++++++++++++++++
>  arch/s390/kvm/gaccess.c           | 22 ++++++++++++
>  arch/s390/kvm/gaccess.h           |  2 ++
>  arch/s390/kvm/kvm-s390.c          | 74 +++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h          | 21 +++++++++++
>  5 files changed, 165 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index ee47998e..f03178d 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2716,6 +2716,52 @@ The fields in each entry are defined as follows:
>     eax, ebx, ecx, edx: the values returned by the cpuid instruction for
>           this function/index combination
>  
> +4.89 KVM_S390_MEM_OP
> +
> +Capability: KVM_CAP_S390_MEM_OP
> +Architectures: s390
> +Type: vcpu ioctl
> +Parameters: struct kvm_s390_mem_op (in)
> +Returns: = 0 on success,
> +         < 0 on generic error (e.g. -EFAULT or -ENOMEM),
> +         > 0 if an exception occurred while walking the page tables
> +
> +Read or write data from/to the logical (virtual) memory of a VCPU.
> +
> +Parameters are specified via the following structure:
> +
> +struct kvm_s390_mem_op {
> +	__u64 gaddr;		/* the guest address */
> +	__u64 flags;		/* arch specific flags */
> +	__u32 size;		/* amount of bytes */
> +	__u32 op;		/* type of operation */
> +	__u64 buf;		/* buffer in userspace */
> +	__u8 ar;		/* the access register number */
> +	__u8 reserved[31];	/* should be set to 0 */
> +};
> +
> +The type of operation is specified in the "op" field. It is either
> +KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
> +KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space. The
> +KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
> +whether the corresponding memory access would create an access exception
> +(without touching the data in the memory at the destination). In case an
> +access exception occurred while walking the MMU tables of the guest, the
> +ioctl returns a positive error number to indicate the type of exception.
> +This exception is also raised directly at the corresponding VCPU if the
> +flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.
> +
> +The start address of the memory region has to be specified in the "gaddr"
> +field, and the length of the region in the "size" field. "buf" is the buffer
> +supplied by the userspace application where the read data should be written
> +to for KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written
> +is stored for a KVM_S390_MEMOP_LOGICAL_WRITE. "buf" is unused and can be NULL
> +when KVM_S390_MEMOP_F_CHECK_ONLY is specified. "ar" designates the access
> +register number to be used.
> +
> +The "reserved" field is meant for future extensions. It is not used by
> +KVM with the currently defined set of flags.
> +
>  5. The kvm_run structure
>  ------------------------
>  
> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
> index c230904..04a7c67 100644
> --- a/arch/s390/kvm/gaccess.c
> +++ b/arch/s390/kvm/gaccess.c
> @@ -697,6 +697,28 @@ int guest_translate_address(struct kvm_vcpu *vcpu, unsigned long gva,
>  }
>  
>  /**
> + * check_gva_range - test a range of guest virtual addresses for accessibility
> + */
> +int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva,
> +		    unsigned long length, int is_write)
> +{
> +	unsigned long gpa;
> +	unsigned long currlen;
> +	int rc = 0;
> +
> +	ipte_lock(vcpu);
> +	while (length > 0 && !rc) {
> +		currlen = min(length, PAGE_SIZE - (gva % PAGE_SIZE));
> +		rc = guest_translate_address(vcpu, gva, &gpa, is_write);
> +		gva += currlen;
> +		length -= currlen;
> +	}
> +	ipte_unlock(vcpu);
> +
> +	return rc;
> +}

What I was wondering is why you can't translate the address
in the kernel and let userspace perform the actual read/write?
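
For context, the quoted check_gva_range() loop walks the range one page
at a time and calls guest_translate_address() once per page. Below is a
minimal standalone sketch of just that chunking arithmetic. It is not the
kernel code: SKETCH_PAGE_SIZE, next_chunk_len() and count_translations()
are hypothetical names made up for illustration, a 4 kB page size is
assumed, and the translation step itself is left out.

```c
#include <assert.h>

/* Assumption for this sketch: 4 kB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: length of the next chunk starting at gva that
 * does not cross a page boundary. Mirrors the min() expression in the
 * quoted loop.
 */
static unsigned long next_chunk_len(unsigned long gva, unsigned long length)
{
	unsigned long to_page_end = SKETCH_PAGE_SIZE - (gva % SKETCH_PAGE_SIZE);

	return length < to_page_end ? length : to_page_end;
}

/*
 * Hypothetical helper: number of per-page steps the loop would take for
 * the range [gva, gva + length), assuming no step fails.
 */
static unsigned long count_translations(unsigned long gva, unsigned long length)
{
	unsigned long n = 0;

	while (length > 0) {
		unsigned long currlen = next_chunk_len(gva, length);

		/* Each step makes progress and never overshoots the range. */
		assert(currlen > 0 && currlen <= length);
		gva += currlen;
		length -= currlen;
		n++;
	}
	return n;
}
```

Under these assumptions, a page-aligned 64 kB range takes 16 steps and an
unaligned one takes 17, while a 0x20-byte access at 0x0ff0 already spans
two pages and so takes two steps.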
