On Wed, Jan 24, 2018 at 11:26 AM, Russell King - ARM Linux <linux@xxxxxxxxxxxxxxx> wrote: > On Wed, Jan 24, 2018 at 10:45:28AM -0800, Gurchetan Singh wrote: >> On Wed, Jan 24, 2018 at 4:45 AM, Russell King - ARM Linux < >> linux@xxxxxxxxxxxxxxx> wrote: >> > So no, this is not an acceptable approach. >> > >> > Secondly, in light of spectre and meltdown, do we _really_ want to >> > export cache flushing to userspace in any case - these attacks rely >> > on being able to flush specific cache lines from the caches in order >> > to do the timing attacks (while leaving others in place.) >> >> > Currently, 32-bit ARM does not export such flushing capabilities to >> > userspace, which makes it very difficult (I'm not going to say >> > impossible) to get any working proof-of-code program that even >> > illustrates the timing attack. Exposing this functionality changes >> > that game, and means that we're much more open to these exploits. >> > (Some may say that you can flush cache lines by reading a large >> > enough buffer - I'm aware, I've tried that, the results are too >> > unreliable even for a simple attempt which doesn't involve crossing >> > privilege boundaries.) >> > >> >> Will using the DMA API (dma_sync_single_for_device / >> dma_sync_sg_for_device) mitigate your Meltdown / Spectre concerns in any >> way? > > I see no point in answering that question based on what you've written > below (see below for why). > >> > Do you really need cacheable GPU buffers, or will write combining >> > buffers (as we use elsewhere such as etnaviv) suffice? Please provide >> > some _real_ _world_ performance measurements that demonstrate that >> > there is a real need for this functionality. >> >> >> My desire is for the vgem driver to work correctly on ARM, which requires >> cache flushing. The mappings vgem itself creates are write combine. > > If the pages are mapped write-combine, they are by definition *not* > cacheable, so there should be no cache flushing required. > >> The >> issue is the pages retrieved on ARM architecture usually have to be flushed >> before they can be used (see rockchip_gem_get_pages / tegra_bo_get_pages). >> This patch set attempts to do the flushing in an architecture independent >> manner (since vgem is intended to work on ARM / x86). > > I see rockchip_gem_get_pages() using shmem_read_mapping_page() to get > the pages. That's more or less fine, we do that on Etnaviv too. > > (Side note: provided the pages are not coming from lowmem, as mapping > lowmem pages are mapped cacheable, and if you also map them elsewhere > as write-combine, you're stepping into some potential cache attribute > issues.) > > How we deal with this in Etnaviv is to use dma_map_sg() after we get > the pages - see > > etnaviv_gem_get_pages(), which calls the memory specific .get_pages > method, and goes on to call etnaviv_gem_scatter_map(). > > There's no need for the faking up of a SG table that way, just let > dma_map_sg() do whatever it needs to do. This means you're not abusing > the DMA API, and if you have a system IOMMU in the way, as a bonus that > gets setup for you. Okay, so the recommended solution to my problem (vgem doesn't work on ARM) is to: 1) Create vgem_get_pages / vgem_put_pages functions. vgem_get_pages will be called from vgem_gem_fault (since currently vgem does delayed allocation) and vgem_pin_pages. 2) Call dma_map_sg() / dma_unmap_sg in vgem_get_pages / vgem_put_pages if the architecture is ARM. That works for me, if that works for everyone else. >> There is some interest in cache-able DRM buffers (though, again, this >> patchset is not about that). Renderscript accesses are very slow on ARM >> and we keep shadow buffers to improve performance (see >> crrev.com/602736). > > 404. Oops, the correct URL is crrev.com/c/602736. >> Jeffy >> has done some tests with memcpys in our camera stack that shows >> improvements (with caching --> 4 to 7 ms, without caching --> 20 to 90ms). >> However, I do agree Spectre complicates things. > > At the moment, on 32-bit ARM, we have very little mitigation work for > Spectre beyond the generic kernel work that is going on at the moment > for all architectures, so I really do not want to introduce anything > that makes 32-bit ARM more easily vulnerable to attack. Fair enough. >That may change in the future (sorry, I can't say when), but right now I don't > think we have much of an option. > > -- > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ > FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up > According to speedtest.net: 8.21Mbps down 510kbps up _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel