On 09/20/2016 04:03 AM, Alex Williamson wrote: > On Tue, 20 Sep 2016 00:43:15 +0530 > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote: > >> On 9/20/2016 12:06 AM, Alex Williamson wrote: >>> On Mon, 19 Sep 2016 23:52:36 +0530 >>> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote: >>> >>>> On 8/26/2016 7:43 PM, Kirti Wankhede wrote: >>>>> * PGP Signed: 08/26/2016 at 07:15:44 AM, Decrypted >>>>> On 8/25/2016 2:52 PM, Dong Jia wrote: >>>>>> On Thu, 25 Aug 2016 09:23:53 +0530 >>>> >>>>>>> + >>>>>>> +static ssize_t vfio_mdev_read(void *device_data, char __user *buf, >>>>>>> + size_t count, loff_t *ppos) >>>>>>> +{ >>>>>>> + struct vfio_mdev *vmdev = device_data; >>>>>>> + struct mdev_device *mdev = vmdev->mdev; >>>>>>> + struct parent_device *parent = mdev->parent; >>>>>>> + unsigned int done = 0; >>>>>>> + int ret; >>>>>>> + >>>>>>> + if (!parent->ops->read) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + while (count) { >>>>>> Here, I have to say sorry to you guys for that I didn't notice the >>>>>> bad impact of this change to my patches during the v6 discussion. >>>>>> >>>>>> For vfio-ccw, I introduced an I/O region to input/output I/O >>>>>> instruction parameters and results for Qemu. The @count of these data >>>>>> currently is 140. So supporting arbitrary lengths in one shot here, and >>>>>> also in vfio_mdev_write, seems the better option for this case. >>>>>> >>>>>> I believe that if the pci drivers want to iterate in a 4 bytes step, you >>>>>> can do that in the parent read/write callbacks instead. >>>>>> >>>>>> What do you think? >>>>>> >>>>> >>>>> I would like to know Alex's thought on this. He raised concern with this >>>>> approach in v6 reviews: >>>>> "But I think this is exploitable, it lets the user make the kernel >>>>> allocate an arbitrarily sized buffer." >>>>> >>>> >>>> Read/write callbacks are for slow path, emulation of mmio region which >>>> are mainly device registers. I do feel it shouldn't support arbitrary >>>> lengths. >>>> Alex, I would like to know your thoughts. >>> >>> The exploit was that the mdev layer allocated a buffer and copied the >>> entire user buffer into kernel space before passing it to the vendor >>> driver. The solution is to pass the __user *buf to the vendor driver >>> and let them sanitize and split it however makes sense for their >>> device. We shouldn't be assuming naturally aligned, up to dword >>> accesses in the generic mdev layers. Those sorts of semantics are >>> defined by the device type. This is another case where making >>> the mdev layer as thin as possible is actually the best thing to >>> do to make it really device type agnostic. Thanks, >>> >> >> >> Alex, >> >> These were the comments on v6 patch: >> >>>>> Do we really need to support arbitrary lengths in one shot? Seems >>>>> like >>>>> we could just use a 4 or 8 byte variable on the stack and iterate >>>>> until >>>>> done. >>>>> >>>> >>>> We just want to pass the arguments to vendor driver as is here.Vendor >>>> driver could take care of that. >> >>> But I think this is exploitable, it lets the user make the kernel >>> allocate an arbitrarily sized buffer. >> >> As per above discussion in v7 version, this module don't allocated >> memory from heap. >> >> If vendor driver allocates arbitrary memory in kernel space through mdev >> module interface, isn't that would be exploit? > > Yep, my 4-8/byte chunking idea was too PCI specific. If a vendor > driver introduces an exploit, that's a bug in the vendor driver. I'm > not sure if we can or should attempt to guard against that. Ultimately > the vendor driver is either open source and we can inspect it for such > exploits or it's closed source, taints the kernel, and we hope for the > best. It might make a good unit test to perform substantially sized > reads/writes to the mdev device. Can't agree more! :-) > Perhaps the only sanity test we can > make in the core is to verify the access doesn't exceed the size of > the region as advertised by the vendor driver. Thanks, Even performing a lightweight sanity check, would require vfio-mdev to be able to decode the ppos into a particular region, that means information of all regions should be stored in the framework. I guess it is not your preferred way :) -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html