On Fri, Aug 11, 2017 at 3:23 PM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > This commit adds a driver for the Virtual Box Guest PCI device used in > Virtual Box virtual machines. Enabling this driver will add support for > Virtual Box Guest integration features such as copy-and-paste, seamless > mode and OpenGL pass-through. > > This driver also offers vboxguest IPC functionality which is needed > for the vboxfs driver which offers folder sharing support. > > Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx> Since you are still working on coding style cleanups, I'll comment mainly on the ioctl interface for now. Overall it already looks much better than I expected from previous experience with the vbox source. Most of the issues I would normally comment on are already moot based on the assumption that we won't be able to make substantial changes to either the user space portion or the hypervisor side. > +/** > + * Inflate the balloon by one chunk. > + * > + * The caller owns the balloon mutex. > + * > + * @returns VBox status code > + * @param gdev The Guest extension device. > + * @param chunk_idx Index of the chunk. > + */ > +static int vbg_balloon_inflate(PVBOXGUESTDEVEXT gdev, u32 chunk_idx) > +{ > + VMMDevChangeMemBalloon *req = gdev->mem_balloon.change_req; > + struct page **pages; > + int i, rc; > + > + pages = kmalloc(sizeof(*pages) * VMMDEV_MEMORY_BALLOON_CHUNK_PAGES, > + GFP_KERNEL | __GFP_NOWARN); > + if (!pages) > + return VERR_NO_MEMORY; > + > + req->header.size = sizeof(*req); > + req->inflate = true; > + req->pages = VMMDEV_MEMORY_BALLOON_CHUNK_PAGES; The balloon section seems to be rather simplistic with its ioctl interface. Ideally this should use CONFIG_BALLOON_COMPACTION and an oom notifier like the virtio_balloon driver does. Yes, I realize that only one of the six (or more) balloon drivers we have in the kernel does it ;-) > +/** > + * Callback for heartbeat timer. > + */ > +static void vbg_heartbeat_timer(unsigned long data) > +{ > + PVBOXGUESTDEVEXT gdev = (PVBOXGUESTDEVEXT)data; > + > + vbg_req_perform(gdev, gdev->guest_heartbeat_req); > + mod_timer(&gdev->heartbeat_timer, > + msecs_to_jiffies(gdev->heartbeat_interval_ms)); > +} This looks like a watchdog and should use the drivers/watchdog subsystem interfaces. > +/** > + * vbg_query_host_version try get the host feature mask and version information > + * (vbg_host_version). > + * > + * @returns 0 or negative errno value (ignored). > + * @param gdev The Guest extension device. > + */ > +static int vbg_query_host_version(PVBOXGUESTDEVEXT gdev) > +{ > + VMMDevReqHostVersion *req; > + int rc, ret; > + > + req = vbg_req_alloc(sizeof(*req), VMMDevReq_GetHostVersion); > + if (!req) > + return -ENOMEM; While I understand you want to keep the ioctl interface, the version information looks like something that we should also have in sysfs in an appropriate location. > diff --git a/drivers/misc/vboxguest/vboxguest_linux.c b/drivers/misc/vboxguest/vboxguest_linux.c > new file mode 100644 > index 000000000000..8468c7139b98 > --- /dev/null > +++ b/drivers/misc/vboxguest/vboxguest_linux.c This looks like a fairly short file, and could be merged into the core file. > +/** The file_operations structures. */ > +static const struct file_operations vbg_misc_device_fops = { > + .owner = THIS_MODULE, > + .open = vbg_misc_device_open, > + .release = vbg_misc_device_close, > + .unlocked_ioctl = vgdrvLinuxIOCtl, > +}; > +static const struct file_operations vbg_misc_device_user_fops = { > + .owner = THIS_MODULE, > + .open = vbg_misc_device_user_open, > + .release = vbg_misc_device_close, > + .unlocked_ioctl = vgdrvLinuxIOCtl, > +}; These lack a .compat_ioctl callback. > +/** > + * Device I/O Control entry point. > + * > + * @returns -ENOMEM or -EFAULT for errors inside the ioctl callback; 0 > + * on success, or a positive VBox status code on vbox guest-device errors. > + * > + * @param pInode Associated inode pointer. > + * @param pFilp Associated file pointer. > + * @param uCmd The function specified to ioctl(). > + * @param ulArg The argument specified to ioctl(). > + */ > +static long vgdrvLinuxIOCtl(struct file *pFilp, unsigned int uCmd, > + unsigned long ulArg) > +{ > + PVBOXGUESTSESSION session = (PVBOXGUESTSESSION) pFilp->private_data; > + u32 cbData = _IOC_SIZE(uCmd); > + void *pvBufFree; > + void *pvBuf; > + int rc, ret = 0; > + u64 au64Buf[32 / sizeof(u64)]; > + > + /* > + * For small amounts of data being passed we use a stack based buffer > + * except for VMMREQUESTs where the data must not be on the stack. > + */ > + if (cbData <= sizeof(au64Buf) && > + VBOXGUEST_IOCTL_STRIP_SIZE(uCmd) != > + VBOXGUEST_IOCTL_STRIP_SIZE(VBOXGUEST_IOCTL_VMMREQUEST(0))) { > + pvBufFree = NULL; > + pvBuf = &au64Buf[0]; > + } else { > + /* __GFP_DMA32 for VBOXGUEST_IOCTL_VMMREQUEST */ > + pvBufFree = pvBuf = kmalloc(cbData, GFP_KERNEL | __GFP_DMA32); > + if (!pvBuf) > + return -ENOMEM; > + } > + if (copy_from_user(pvBuf, (void *)ulArg, cbData) == 0) { There should be a check for a maximum argument size. I'd also change the commands to not always do both read and write, but only whichever applies. This function would then do the copy_from_user/copy_to_user conditionally. > +static int hgcm_call_preprocess_linaddr(const HGCMFunctionParameter *src_parm, > + bool is_user, void **bounce_buf_ret, > + size_t *pcb_extra) > +{ > + void *buf, *bounce_buf; > + bool copy_in; > + u32 len; > + int ret; > + > + buf = (void *)src_parm->u.Pointer.u.linearAddr; > + len = src_parm->u.Pointer.size; > + copy_in = src_parm->type != VMMDevHGCMParmType_LinAddr_Out; > + > + if (!is_user) { > + if (WARN_ON(len > VBGLR0_MAX_HGCM_KERNEL_PARM)) > + return VERR_OUT_OF_RANGE; > + > + hgcm_call_inc_pcb_extra(buf, len, pcb_extra); > + return VINF_SUCCESS; > + } > + > + if (len > VBGLR0_MAX_HGCM_USER_PARM) > + return VERR_OUT_OF_RANGE; > + > + if (len <= PAGE_SIZE * 2) > + bounce_buf = kmalloc(len, GFP_KERNEL); > + else > + bounce_buf = vmalloc(len); > + we have kvmalloc() for this now > +#ifdef CONFIG_X86_64 > +int vbg_hgcm_call32(VBOXGUESTDEVEXT *gdev, VBoxGuestHGCMCallInfo * pCallInfo, > + u32 cbCallInfo, u32 timeout_ms, bool is_user) > +{ > + VBoxGuestHGCMCallInfo *pCallInfo64 = NULL; > + HGCMFunctionParameter *pParm64 = NULL; > + HGCMFunctionParameter32 *pParm32 = NULL; > + u32 cParms = pCallInfo->cParms; > + u32 iParm; > + int rc = VINF_SUCCESS; > + > + /* > + * The simple approach, allocate a temporary request and convert the parameters. > + */ > + pCallInfo64 = kzalloc(sizeof(*pCallInfo64) + > + cParms * sizeof(HGCMFunctionParameter), > + GFP_KERNEL); > + if (!pCallInfo64) > + return VERR_NO_MEMORY; > + > + *pCallInfo64 = *pCallInfo; > + pParm32 = VBOXGUEST_HGCM_CALL_PARMS32(pCallInfo); > + pParm64 = VBOXGUEST_HGCM_CALL_PARMS(pCallInfo64); > + for (iParm = 0; iParm < cParms; iParm++, pParm32++, pParm64++) { > + switch (pParm32->type) { > + case VMMDevHGCMParmType_32bit: > + pParm64->type = VMMDevHGCMParmType_32bit; > + pParm64->u.value32 = pParm32->u.value32; > + break; > + > + case VMMDevHGCMParmType_64bit: > + pParm64->type = VMMDevHGCMParmType_64bit; > + pParm64->u.value64 = pParm32->u.value64; > + break; > + > + case VMMDevHGCMParmType_LinAddr_Out: > + case VMMDevHGCMParmType_LinAddr: > + case VMMDevHGCMParmType_LinAddr_In: > + pParm64->type = pParm32->type; > + pParm64->u.Pointer.size = pParm32->u.Pointer.size; > + pParm64->u.Pointer.u.linearAddr = > + pParm32->u.Pointer.u.linearAddr; > + break; It would be good to clean this up and always use the same structure here. > diff --git a/include/linux/vbox_err.h b/include/linux/vbox_err.h > new file mode 100644 > index 000000000000..906ff7d2585d > --- /dev/null > +++ b/include/linux/vbox_err.h > @@ -0,0 +1,6 @@ > +#ifndef __VBOX_ERR_H__ > +#define __VBOX_ERR_H__ > + > +#include <uapi/linux/vbox_err.h> > + > +#endif > diff --git a/include/linux/vbox_ostypes.h b/include/linux/vbox_ostypes.h > new file mode 100644 > index 000000000000..ea2a391f135f > --- /dev/null > +++ b/include/linux/vbox_ostypes.h > @@ -0,0 +1,6 @@ > +#ifndef __VBOX_OSTYPES_H__ > +#define __VBOX_OSTYPES_H__ > + > +#include <uapi/linux/vbox_ostypes.h> > + > +#endif > diff --git a/include/linux/vboxguest.h b/include/linux/vboxguest.h > new file mode 100644 > index 000000000000..fca5d199a884 > --- /dev/null > +++ b/include/linux/vboxguest.h > @@ -0,0 +1,6 @@ > +#ifndef __VBOXGUEST_H__ > +#define __VBOXGUEST_H__ > + > +#include <uapi/linux/vboxguest.h> > + > +#endif This should not be needed, we add -I${srctree}/include/uapi/ to the include path already. > +/** > + * The layout of VMMDEV RAM region that contains information for guest. > + */ > +typedef struct VMMDevMemory { > + /** The size of this structure. */ > + u32 u32Size; > + /** The structure version. (VMMDEV_MEMORY_VERSION) */ > + u32 u32Version; > + > + union { > + struct { > + /** Flag telling that VMMDev has events pending. */ > + bool fHaveEvents; > + } V1_04; > + As this is a hardware interface, maybe use u32 instead of 'bool'. Strictly speaking, the ring buffer structures would even have to be __le32 and use byteorder specific read/write access, but that could perhaps be skipped here when you add a Kconfig dependency on !CONFIG_CPU_BIG_ENDIAN (which won't ever be set on x86). > diff --git a/include/uapi/linux/vbox_err.h b/include/uapi/linux/vbox_err.h > new file mode 100644 > index 000000000000..e6e7ba835e36 > --- /dev/null > +++ b/include/uapi/linux/vbox_err.h > diff --git a/include/uapi/linux/vbox_ostypes.h b/include/uapi/linux/vbox_ostypes.h > new file mode 100644 > index 000000000000..abe9a38ebfbd > --- /dev/null > +++ b/include/uapi/linux/vbox_ostypes.h > @@ -0,0 +1,158 @@ Some of these headers are not really ABI between the kernel and user space but are between the vbox host and guest, so include/uapi is maybe not the best place for them. Do we need all of this both in the kernel and in the header that actually defines the kernel/user interface? > +/** > + * @name VBoxGuest IOCTL codes and structures. > + * > + * The range 0..15 is for basic driver communication. > + * The range 16..31 is for HGCM communication. > + * The range 32..47 is reserved for future use. > + * The range 48..63 is for OS specific communication. > + * The 7th bit is reserved for future hacks. > + * The 8th bit is reserved for distinguishing between 32-bit and 64-bit > + * processes in future 64-bit guest additions. > + * @{ > + */ > +#if __BITS_PER_LONG == 64 > +#define VBOXGUEST_IOCTL_FLAG 128 > +#else > +#define VBOXGUEST_IOCTL_FLAG 0 > +#endif > +/** @} */ This seems misguided, we already know the difference between 32-bit and 64-bit tasks based on the entry point, and if the interface is properly designed then we don't care about this difference at all. > +#define VBOXGUEST_IOCTL_CODE_(function, size) \ > + _IOC(_IOC_READ|_IOC_WRITE, 'V', (function), (size)) > +#define VBOXGUEST_IOCTL_STRIP_SIZE(code) \ > + VBOXGUEST_IOCTL_CODE_(_IOC_NR((code)), 0) > + > +#define VBOXGUEST_IOCTL_CODE(function, size) \ > + VBOXGUEST_IOCTL_CODE_((function) | VBOXGUEST_IOCTL_FLAG, size) > +/* Define 32 bit codes to support 32 bit applications in 64 bit guest driver. */ > +#define VBOXGUEST_IOCTL_CODE_32(function, size) \ > +VBOXGUEST_IOCTL_CODE_(function, size) If the command numbers can be changed wihtout causing too many problems, I'd just do away with these wrappers and use _IOR/_IOW/_IORW as needed. > +/** Input and output buffers layout of the IOCTL_VBOXGUEST_WAITEVENT */ > +typedef struct VBoxGuestWaitEventInfo { > + /** timeout in milliseconds */ > + u32 u32TimeoutIn; > + /** events to wait for */ > + u32 u32EventMaskIn; > + /** result code */ > + u32 u32Result; > + /** events occurred */ > + u32 u32EventFlagsOut; > +} VBoxGuestWaitEventInfo; > +VBOXGUEST_ASSERT_SIZE(VBoxGuestWaitEventInfo, 16); 'u32' is not an approprioate type for a kernel header, use '__u32' instead. Arnd