On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
>> On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
>>>> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
>>>>> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
>>>>>> The size of the buffer being bounced is not checked if it happens
>>>>>> to be larger than the size of the mapped buffer. Because the size
>>>>>> can be controlled by a device, as is the case with virtio devices,
>>>>>> this can lead to memory corruption.
>>>>>>
>>>>>
>>>>> I'm really worried about all these hodge-podge hacks for untrusted
>>>>> hypervisors in the I/O stack. Instead of trying to harden protocols
>>>>> that are fundamentally not designed for this, how about coming
>>>>> up with a new paravirtualized I/O interface that is specifically
>>>>> designed for use with an untrusted hypervisor from the start?
>>>>
>>>> Your comment makes sense, but that would require the cooperation
>>>> of these vendors and the cloud providers to agree on something meaningful.
>>>> I am also not sure whether the end result would be better than hardening
>>>> this interface to catch corruption. There is already some validation in
>>>> the unmap path anyway.
>>>>
>>>> Another possibility is to move this hardening to the common virtio code,
>>>> but I think the code may become more complicated there since it would
>>>> require tracking both the dma_addr and length for each descriptor.
>>>
>>> Christoph,
>>>
>>> I've been wrestling with the same thing - this is specific to busted
>>> drivers.
>>> And in reality you could do the same thing with a hardware
>>> virtio device (see the example at http://thunderclap.io/) - where the
>>> mitigation is 'enable the IOMMU to do its job'.
>>>
>>> The AMD SEV documents speak about utilizing the IOMMU to do this (AMD
>>> SEV-SNP), and while that is great for the future, SEV without an IOMMU
>>> is here now.
>>>
>>> Coming full circle here, this issue can be exploited with virtio,
>>> but you could do the same with real hardware too if you hacked the
>>> firmware. So if you, say, used an Intel SR-IOV NIC that was compromised
>>> on an AMD SEV machine and plumbed it into the guest - the IOMMU inside
>>> of the guest would be the SWIOTLB code. A last line of defense against
>>> bad firmware, so to say.
>>>
>>> As such I am leaning towards taking this code, but I am worried
>>> about the performance hit .. though perhaps I shouldn't be, since if
>>> you are using swiotlb=force you are already taking a performance hit?
>>>
>>
>> I have not measured the performance degradation. This will hit all AMD SEV,
>> Intel TDX, and IBM Protected Virtualization VMs. I don't expect the hit to
>> be large since there are only a few added operations per hundreds of copied
>> bytes. I could try to measure the performance hit by running some benchmarks
>> with virtio-net/virtio-blk/virtio-rng.
>>
>> Earlier I said:
>>>> Another possibility is to move this hardening to the common virtio code,
>>>> but I think the code may become more complicated there since it would
>>>> require tracking both the dma_addr and length for each descriptor.
>>
>> Unfortunately, this doesn't make sense.
>> Even if there's validation for
>> the size in the common virtio layer, there will be some other device
>> which controls a dma_addr and length passed to dma_unmap* in the
>> corresponding driver. The device can target a specific dma-mapped private
>> buffer by changing the dma_addr and choosing a suitable length to
>> overwrite the buffers following it.
>>
>> So, instead of doing the check in every driver and taking a performance
>> hit even when swiotlb is not used, it's probably better to fix it in
>> swiotlb.
>>
>> @Tom Lendacky, do you think that it makes sense to harden swiotlb, or
>> might some other approach be better for the SEV features?
>
> I am not Tom, but this change seems the right way forward regardless of
> whether it is TDX, AMD SEV, or any other architecture that encrypts
> memory and uses SWIOTLB.

Sorry, I missed the @Tom before.

I'm with Konrad and believe it makes sense to add these checks. I'm not
sure if there would be a better approach for all confidential computing
technologies. SWIOTLB works nicely, but is limited because of the 32-bit
compatible memory location. Being able to have buffers above the 32-bit
limit would alleviate that, but that is support that would have to be
developed.

Thanks,
Tom

>
> Let me queue it up in the development branch and do some regression
> testing.