On Tue, Feb 02, 2021 at 04:34:09PM -0600, Tom Lendacky wrote:
> On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> > On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
> >> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
> >>> On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
> >>>> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
> >>>>> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
> >>>>>> The size of the buffer being bounced is not checked if it happens
> >>>>>> to be larger than the size of the mapped buffer. Because the size
> >>>>>> can be controlled by a device, as is the case with virtio devices,
> >>>>>> this can lead to memory corruption.
> >>>>>>
> >>>>>
> >>>>> I'm really worried about all these hodge-podge hacks for untrusted
> >>>>> hypervisors in the I/O stack. Instead of trying to harden protocols
> >>>>> that are fundamentally not designed for this, how about coming up
> >>>>> with a new paravirtualized I/O interface that is specifically
> >>>>> designed for use with an untrusted hypervisor from the start?
> >>>>
> >>>> Your comment makes sense, but that would require the cooperation of
> >>>> these vendors and the cloud providers to agree on something meaningful.
> >>>> I am also not sure whether the end result would be better than hardening
> >>>> this interface to catch corruption. There is already some validation in
> >>>> the unmap path anyway.
> >>>>
> >>>> Another possibility is to move this hardening to the common virtio code,
> >>>> but I think the code may become more complicated there since it would
> >>>> require tracking both the dma_addr and length for each descriptor.
> >>>
> >>> Christoph,
> >>>
> >>> I've been wrestling with the same thing - this is specific to busted
> >>> drivers. And in reality you could do the same thing with a hardware
> >>> virtio device (see the example at http://thunderclap.io/) - where the
> >>> mitigation is 'enable the IOMMU to do its job'.
> >>>
> >>> The AMD SEV documents speak about utilizing the IOMMU to do this
> >>> (AMD SEV-SNP).. and while that is great for the future, SEV without
> >>> an IOMMU is here now.
> >>>
> >>> Coming full circle here: this issue can be exploited with virtio, but
> >>> you could do the same with real hardware if you hacked the firmware.
> >>> So if you used, say, an Intel SR-IOV NIC that was compromised on an
> >>> AMD SEV machine and plumbed into the guest, the "IOMMU" inside the
> >>> guest would be the SWIOTLB code - the last line of defense against
> >>> bad firmware, so to speak.
> >>>
> >>> As such I am leaning towards taking this code, but I am worried about
> >>> the performance hit .. though perhaps I shouldn't be: if you are
> >>> using swiotlb=force already, you are taking a performance hit anyway?
> >>>
> >>
> >> I have not measured the performance degradation. This will hit all AMD SEV,
> >> Intel TDX, and IBM Protected Virtualization VMs. I don't expect the hit to
> >> be large since there are only a few added operations per hundreds of copied
> >> bytes. I could try to measure the performance hit by running some benchmarks
> >> with virtio-net/virtio-blk/virtio-rng.
> >>
> >> Earlier I said:
> >>>> Another possibility is to move this hardening to the common virtio code,
> >>>> but I think the code may become more complicated there since it would
> >>>> require tracking both the dma_addr and length for each descriptor.
> >>
> >> Unfortunately, this doesn't make sense. Even if there's validation for
> >> the size in the common virtio layer, there will be some other device
> >> which controls a dma_addr and length passed to dma_unmap* in the
> >> corresponding driver. The device can target a specific dma-mapped private
> >> buffer by changing the dma_addr and setting a length large enough to
> >> overwrite the buffers following it.
> >>
> >> So, instead of doing the check in every driver and paying the performance
> >> cost even when swiotlb is not used, it's probably better to fix it in
> >> swiotlb.
> >>
> >> @Tom Lendacky, do you think it makes sense to harden swiotlb, or would
> >> some other approach be better for the SEV features?
> >
> > I am not Tom, but this change seems the right way forward regardless of
> > whether it is TDX, AMD SEV, or any other architecture that encrypts
> > memory and uses SWIOTLB.
>
> Sorry, I missed the @Tom before. I'm with Konrad and believe it makes
> sense to add these checks.
>
> I'm not sure if there would be a better approach for all confidential
> computing technologies. SWIOTLB works nicely, but is limited because of
> the 32-bit compatible memory location. Being able to have buffers above
> the 32-bit limit would alleviate that, but that is support that would
> have to be developed.

Funny you mention that.. Dongli (CCed) is working on exactly that and
should be posting his patches in the next couple of days.

>
> Thanks,
> Tom
>
> >
> > Let me queue it up in the development branch and do some regression testing.
> >>
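
The check the thread converges on is easiest to see in isolation. The sketch
below is not the real swiotlb code or the posted patch - the structure and
function names are invented for illustration - but it shows the principle
under discussion: the size of the original buffer is recorded when the bounce
slot is mapped, and any later copy back into the private buffer is validated
against that recorded size, since the size seen on the sync/unmap path can be
device-controlled.

/*
 * Minimal model of the hardening idea (hypothetical names, not the
 * actual kernel code): remember how large the original buffer was
 * when a bounce slot is mapped, and refuse any later copy whose
 * device-supplied size exceeds that recorded size.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct bounce_slot {
	void   *bounce_buf;	/* shared (decrypted) bounce buffer */
	void   *orig_buf;	/* private buffer the driver mapped */
	size_t	orig_size;	/* size recorded at map time */
};

/* Sync/unmap path; 'size' may come from a malicious device. */
static bool bounce_from_device(struct bounce_slot *slot, size_t size)
{
	if (size > slot->orig_size) {
		fprintf(stderr,
			"swiotlb: buffer overflow detected (%zu > %zu)\n",
			size, slot->orig_size);
		return false;	/* reject instead of corrupting memory */
	}
	memcpy(slot->orig_buf, slot->bounce_buf, size);
	return true;
}

int main(void)
{
	char private_buf[64] = { 0 };
	char bounce_buf[256] = { 0 };
	struct bounce_slot slot = {
		.bounce_buf = bounce_buf,
		.orig_buf   = private_buf,
		.orig_size  = sizeof(private_buf),	/* recorded when mapped */
	};

	bounce_from_device(&slot, 64);	/* well-behaved device: copied */
	bounce_from_device(&slot, 200);	/* inflated size: rejected */
	return 0;
}

Placing this validation inside swiotlb, rather than in each driver or in the
common virtio code, means the cost is only paid when bounce buffering is
actually in use - which is the argument Martin makes above for fixing it in
swiotlb.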