On Wed, Sep 06, 2023, Robin Murphy wrote: > On 2023-09-05 19:59, Sean Christopherson wrote: > > And if the driver *doesn't* initialize the data, then the copy is at best pointless, > > and possibly even worse than leaking stale swiotlb data. > > Other than the overhead, done right it can't be any worse than if SWIOTLB > were not involved at all. Yep. > > Looking at commit ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE"), > > IIUC the data leak was observed with a synthetic test "driver" that was developed > > to verify a real data leak fixed by commit a45b599ad808 ("scsi: sg: allocate with > > __GFP_ZERO in sg_build_indirect()"). Which basically proves my point: copying > > from the source only adds value absent a bug in the owning driver. > > Huh? IIUC the bug there was that the SCSI layer failed to sanitise > partially-written buffers. That bug was fixed, and the scrutiny therein > happened to reveal that SWIOTLB *also* had a lower-level problem with > partial writes, in that it was corrupting DMA-mapped memory which was not > updated by the device. Partial DMA writes are not in themselves indicative > of a bug, they may well be a valid and expected behaviour. The problem is that the comment only talks about leaking data to userspace, and doesn't say anything about data corruption or the "swiotlb needs to match hardware" justification that Linus pointed out. I buy both of those arguments for copying data from the original page, but the "may prevent leaking swiotlb content" is IMO completely nonsensical, because if preventing leakage is the only goal, then explicitly initializing the memory is better in every way. If no one objects, I'll put together a patch to rewrite the comment in terms of mimicking hardware and not corrupting the caller's data. > > IMO, rather than copying from the original memory, swiotlb_tbl_map_single() should > > simply zero the original page(s) when establishing the mapping. That would harden > > all usage of swiotlb and avoid the read-before-write behavior that is problematic > > for KVM. > > Depends on one's exact definition of "harden"... Corrupting memory with > zeros is less bad than corrupting memory with someone else's data if you > look at it from an information security point of view, but from a > not-corrupting-memory point of view it's definitely still corrupting memory > :/ > > Taking a step back, is there not an argument that if people care about > general KVM performance then they should maybe stop emulating obsolete PC > hardware from 30 years ago, and at least emulate obsolete PC hardware from > 20 years ago that supports 64-bit DMA? Heh, I don't think there's an argument per se, people most definitely shouldn't be emulating old hardware if they care about performance. I already told Yan as much. > Even non-virtualised, SWIOTLB is pretty horrible for I/O performance by its > very nature - avoiding it if at all possible should always be preferred. Yeah. The main reason I didn't just sweep this under the rug is the confidential VM use case, where SWIOTLB is used to bounce data from guest private memory into shread buffers. There's also a good argument that anyone that cares about I/O performance in confidential VMs should put in the effort to enlighten their device drivers to use shared memory directly, but practically speaking that's easier said than done.