> From: Zhao, Yan Y <yan.y.zhao@xxxxxxxxx> > Sent: Monday, May 13, 2024 3:11 PM > On Fri, May 10, 2024 at 10:57:28AM -0600, Alex Williamson wrote: > > On Fri, 10 May 2024 18:31:13 +0800 > > Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote: > > > > > On Thu, May 09, 2024 at 12:10:49PM -0600, Alex Williamson wrote: > > > > On Tue, 7 May 2024 14:21:38 +0800 > > > > Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote: > > > > > @@ -1847,6 +1891,9 @@ static void vfio_test_domain_fgsp(struct > vfio_domain *domain, struct list_head * > > > > > break; > > > > > } > > > > > > > > > > + if (!domain->enforce_cache_coherency) > > > > > + arch_clean_nonsnoop_dma(page_to_phys(pages), > PAGE_SIZE * 2); > > > > > + > > > > > > > > Seems like this use case isn't subject to the unmap aspect since these > > > > are kernel allocated and freed pages rather than userspace pages. > > > > There's not an "ongoing use of the page" concern. > > > > > > > > The window of opportunity for a device to discover and exploit the > > > > mapping side issue appears almost impossibly small. > > > > > > > The concern is for a malicious device attempting DMAs automatically. > > > Do you think this concern is valid? > > > As there're only extra flushes for 4 pages, what about keeping it for safety? > > > > Userspace doesn't know anything about these mappings, so to exploit > > them the device would somehow need to discover and interact with the > > mapping in the split second that the mapping exists, without exposing > > itself with mapping faults at the IOMMU. Userspace could guess the attacking ranges based on code, e.g. currently the code just tries to use the 1st available IOVA region which likely starts at address 0. and mapping faults don't stop the attack. Just some after-the-fact hint revealing the possibility of being attacked. 😊 > > > > I don't mind keeping the flush before map so that infinitesimal gap > > where previous data in physical memory exposed to the device is closed, > > but I have a much harder time seeing that the flush on unmap to > > synchronize physical memory is required. > > > > For example, the potential KSM use case doesn't exist since the pages > > are not owned by the user. Any subsequent use of the pages would be > > subject to the same condition we assumed after allocation, where the > > physical data may be inconsistent with the cached data. It's easy to physical data can be different from the cached one at any time. In normal case the cache line is marked as dirty and the CPU cache protocol guarantees coherency between cache/memory. here we talked about a situation which a malicious user uses non-coherent DMA to bypass CPU and makes memory/cache inconsistent when the CPU still considers the memory copy is up-to-date (e.g. cacheline is in exclusive or shared state). In this case multiple reads from the next-user may get different values from cache or memory depending on when the cacheline is invalidated. So it's really about a bad inconsistency state which can be recovered only by invalidating the cacheline (so memory data is up-to-date) or doing a WB-type store (to mark memory copy out-of-date) before the next-use. > > flush 2 pages, but I think it obscures the function of the flush if we > > can't articulate the value in this case. btw KSM is one example. Jason mentioned in earlier discussion that not all free pages are zero-ed before the next use then it'd always good to conservatively prevent any potential inconsistent state leaked back to the kernel. Though I'm not sure what'd be a real usage in which the next user will directly use then uninitialized content w/o doing any meaningful writes (which once done then will stop the attacking window)... > > > I agree the second flush is not necessary if we are confident that functions in > between the two flushes do not and will not touch the page in CPU side. > However, can we guarantee this? For instance, is it possible for some > IOMMU > driver to read/write the page for some quirks? (Or is it just a totally > paranoid?) > If that's not impossible, then ensuring cache and memory coherency before > page reclaiming is better? > I don't think it's a valid argument.