On 2023-09-06 15:44, Sean Christopherson wrote:
On Wed, Sep 06, 2023, Robin Murphy wrote:
On 2023-09-05 19:59, Sean Christopherson wrote:
And if the driver *doesn't* initialize the data, then the copy is at best pointless,
and possibly even worse than leaking stale swiotlb data.
Other than the overhead, done right it can't be any worse than if SWIOTLB
were not involved at all.
Yep.
Looking at commit ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE"),
IIUC the data leak was observed with a synthetic test "driver" that was developed
to verify a real data leak fixed by commit a45b599ad808 ("scsi: sg: allocate with
__GFP_ZERO in sg_build_indirect()"). Which basically proves my point: copying
from the source only adds value absent a bug in the owning driver.
Huh? IIUC the bug there was that the SCSI layer failed to sanitise
partially-written buffers. That bug was fixed, and the scrutiny therein
happened to reveal that SWIOTLB *also* had a lower-level problem with
partial writes, in that it was corrupting DMA-mapped memory which was not
updated by the device. Partial DMA writes are not in themselves indicative
of a bug, they may well be a valid and expected behaviour.
The problem is that the comment only talks about leaking data to userspace, and
doesn't say anything about data corruption or the "swiotlb needs to match hardware"
justification that Linus pointed out. I buy both of those arguments for copying
data from the original page, but the "may prevent leaking swiotlb content" is IMO
completely nonsensical, because if preventing leakage is the only goal, then
explicitly initializing the memory is better in every way.
If no one objects, I'll put together a patch to rewrite the comment in terms of
mimicking hardware and not corrupting the caller's data.
Sounds good to me. I guess the trouble is that as soon as a CVE is
involved it can then get hard to look past it, or want to risk appearing
to downplay it :)
IMO, rather than copying from the original memory, swiotlb_tbl_map_single() should
simply zero the original page(s) when establishing the mapping. That would harden
all usage of swiotlb and avoid the read-before-write behavior that is problematic
for KVM.
Depends on one's exact definition of "harden"... Corrupting memory with
zeros is less bad than corrupting memory with someone else's data if you
look at it from an information security point of view, but from a
not-corrupting-memory point of view it's definitely still corrupting memory
:/
Taking a step back, is there not an argument that if people care about
general KVM performance then they should maybe stop emulating obsolete PC
hardware from 30 years ago, and at least emulate obsolete PC hardware from
20 years ago that supports 64-bit DMA?
Heh, I don't think there's an argument per se, people most definitely shouldn't
be emulating old hardware if they care about performance. I already told Yan as
much.
Even non-virtualised, SWIOTLB is pretty horrible for I/O performance by its
very nature - avoiding it if at all possible should always be preferred.
Yeah. The main reason I didn't just sweep this under the rug is the confidential
VM use case, where SWIOTLB is used to bounce data from guest private memory into
shread buffers. There's also a good argument that anyone that cares about I/O
performance in confidential VMs should put in the effort to enlighten their device
drivers to use shared memory directly, but practically speaking that's easier said
than done.
Indeed a bunch of work has gone into SWIOTLB recently trying to make it
a bit more efficient for such cases where it can't be avoided, so it is
definitely still interesting to learn about impacts at other levels like
this. Maybe there's a bit of a get-out for confidential VMs though,
since presumably there's not much point COW-ing encrypted private
memory, so perhaps KVM might end up wanting to optimise that out and
thus happen to end up less sensitive to unavoidable SWIOTLB behaviour
anyway?
Cheers,
Robin.