On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
>> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
>>>> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
>>>>> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
>>>>>> The size of the buffer being bounced is not checked if it happens
>>>>>> to be larger than the size of the mapped buffer. Because the size
>>>>>> can be controlled by a device, as is the case with virtio devices,
>>>>>> this can lead to memory corruption.
>>>>>>
>>>>>
>>>>> I'm really worried about all these hodge-podge hacks for untrusted
>>>>> hypervisors in the I/O stack. Instead of trying to harden protocols
>>>>> that are fundamentally not designed for this, how about coming
>>>>> up with a new paravirtualized I/O interface that is specifically
>>>>> designed for use with an untrusted hypervisor from the start?
>>>>
>>>> Your comment makes sense, but that would require the cooperation
>>>> of these vendors and the cloud providers to agree on something meaningful.
>>>> I am also not sure whether the end result would be better than hardening
>>>> this interface to catch corruption. There is already some validation in
>>>> the unmap path anyway.
>>>>
>>>> Another possibility is to move this hardening to the common virtio code,
>>>> but I think the code may become more complicated there since it would
>>>> require tracking both the dma_addr and length for each descriptor.
>>>
>>> Christoph,
>>>
>>> I've been wrestling with the same thing - this is specific to busted
>>> drivers.
>>> And in reality you could do the same thing with a hardware
>>> virtio device (see the example at http://thunderclap.io/) - where the
>>> mitigation is 'enable the IOMMU to do its job'.
>>>
>>> The AMD SEV documents speak about utilizing the IOMMU to do this (AMD
>>> SEV-SNP), and while that is great for the future, SEV without an IOMMU
>>> is here now.
>>>
>>> Coming full circle here, this issue can be exploited with virtio,
>>> but you could do the same with real hardware too if you hacked the
>>> firmware. So if you, say, used an Intel SR-IOV NIC that was compromised
>>> on an AMD SEV machine and plumbed it into the guest - the IOMMU inside
>>> of the guest would be the SWIOTLB code. A last line of defense against
>>> bad firmware, so to say.
>>>
>>> As such I am leaning towards taking this code, but I am worried
>>> about the performance hit .. though perhaps I shouldn't be, since if
>>> you are using swiotlb=force you are already taking a performance hit?
>>>
>>
>> I have not measured the performance degradation. This will hit all AMD SEV,
>> Intel TDX, and IBM Protected Virtualization VMs. I don't expect the hit to
>> be large since there are only a few added operations per hundreds of copied
>> bytes. I could try to measure the performance hit by running some benchmarks
>> with virtio-net/virtio-blk/virtio-rng.
>>
>> Earlier I said:
>>>> Another possibility is to move this hardening to the common virtio code,
>>>> but I think the code may become more complicated there since it would
>>>> require tracking both the dma_addr and length for each descriptor.
>>
>> Unfortunately, this doesn't make sense.
>> Even if there's validation for
>> the size in the common virtio layer, there will be some other device
>> which controls a dma_addr and length passed to dma_unmap* in the
>> corresponding driver. The device can target a specific dma-mapped private
>> buffer by changing the dma_addr and choosing a suitable length to
>> overwrite the buffers following it.
>>
>> So, instead of doing the check in every driver and taking a performance
>> hit even when swiotlb is not used, it's probably better to fix it in
>> swiotlb.
>>
>> @Tom Lendacky, do you think that it makes sense to harden swiotlb, or
>> might some other approach be better for the SEV features?
>
> I am not Tom, but this change seems the right way forward regardless of
> whether it is TDX, AMD SEV, or any other architecture that encrypts
> memory and uses SWIOTLB.

Sorry, I missed the @Tom before.

I'm with Konrad and believe it makes sense to add these checks. I'm not
sure if there would be a better approach for all confidential computing
technologies. SWIOTLB works nicely, but is limited because of the 32-bit
compatible memory location. Being able to have buffers above the 32-bit
limit would alleviate that, but that is support that would have to be
developed.

Thanks,
Tom

>
> Let me queue it up in the development branch and do some regression
> testing.