On Tue, 2023-09-19 at 09:47 +0200, Christian König wrote: > !! External Email > > Am 19.09.23 um 08:56 schrieb Thomas Hellström: > > > > On 9/19/23 07:39, Christian König wrote: > > > Am 19.09.23 um 03:26 schrieb Zack Rusin: > > > > On Mon, 2023-09-18 at 16:21 -0400, Alex Deucher wrote: > > > > > !! External Email > > > > > > > > > > On Mon, Sep 18, 2023 at 3:06 PM Thomas Hellström > > > > > <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On 9/18/23 17:52, Zack Rusin wrote: > > > > > > > On Mon, 2023-09-18 at 17:13 +0200, Thomas Hellström wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > On 9/18/23 16:56, Thomas Hellström wrote: > > > > > > > > > Hi Zack, Christian > > > > > > > > > > > > > > > > > > On 9/18/23 13:36, Christian König wrote: > > > > > > > > > > Hi Zack, > > > > > > > > > > > > > > > > > > > > adding Thomas and Daniel. > > > > > > > > > > > > > > > > > > > > I briefly remember that I talked with Thomas and some other > > > > > > > > > > people > > > > > > > > > > about that quite a while ago as well, but I don't fully > > > > > > > > > > remember the > > > > > > > > > > outcome. > > > > > > > > > Found one old thread, but didn't read it: > > > > > > > > > > > > > > > > > > https://lists.freedesktop.org/archives/dri-devel/2019-September/234100.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > /Thomas > > > > > > > > > > > > > > > > > > > > > > > > > > Ugh. Now starting to read that thread I have a vague > > > > > > > > recollection it all > > > > > > > > ended with not supporting mapping any device pages whatsoever > > > > > > > > when SEV > > > > > > > > was enabled, but rather resorting to llvmpipe and VM-local bos. > > > > > > > Hi, Thomas. > > > > > > > > > > > > > > Thanks for finding this! I'd (of course) like to solve it > > > > > > > properly and get > > > > > > > vmwgfx > > > > > > > running with 3d support with SEV-ES active instead of essentially > > > > > > > disabling > > > > > > > the > > > > > > > driver when SEV-ES is active. > > > > > > > > > > > > > > I think there are two separate discussions there, the > > > > > > > non-controversial one > > > > > > > and the > > > > > > > controversial one: > > > > > > > 1) The non-controversial: is there a case where drivers would > > > > > > > want encrypted > > > > > > > memory > > > > > > > for TT pages but not for io mem mappings? Because if not then as > > > > > > > Christian > > > > > > > pointed > > > > > > > out we could just add pgprot_decrypted to ttm_io_prot and be > > > > > > > essentially done. > > > > > > > The > > > > > > > current method of decrypting io mem but leaving sys mem mappings > > > > > > > encrypted is > > > > > > > a bit > > > > > > > weird anyway. > > > > > > > > > > > > > > If the answer to that question is "yes, some driver does want the > > > > > > > TT mappings > > > > > > > to be > > > > > > > encrypted" then your "[PATCH v2 3/4] drm/ttm, drm/vmwgfx: > > > > > > > Correctly support > > > > > > > support > > > > > > > AMD memory encryption" solves that. I think getting one of those > > > > > > > two in makes > > > > > > > sense > > > > > > > regardless of everything else, agreed? > > > > > > Well, there is more to it I think. > > > > > > > > > > > > IIRC, the AMD SME encryption mode has a way for a device to have the > > > > > > memory controller (?) encrypt / decrypt device traffic by using an > > > > > > address range alias, so in theory it supports encrypted TT pages, and > > > > > > the dma-layer may indeed hand encrypted DMA pages to TTM on such > > > > > > systems > > > > > > depending on the device's DMA mask. That's why I think that > > > > > > force_dma_unencrypted() export was needed, and If the amdgpu driver > > > > > > accesses TT memory in SME mode *without* pgprot_decrypted() and it > > > > > > still > > > > > > works, then I think that mode is actually used. How could it > > > > > > otherwise work? > > > > > For SME, as long as the encrypted bit is set in the physical address > > > > > used for DMA, the memory controller will handle the encrypt/decrypt > > > > > for the device. For devices with a limited dma mask, you need to use > > > > > the IOMMU so that the encrypted bit is retained when the address hits > > > > > the memory controller. > > > > How does that work on systems with swiotlb, e.g. swiotlb=force, or > > > > i.e. what would > > > > decrypt the ttm tt mappings when copying between system and vram > > > > when iommu is > > > > disabled/absent? > > > > > > SME makes it mandatory that all devices can handle the physical > > > address used for DMA, either native or with the help of IOMMU. > > > > > > Hacks like SWIOTLB are not directly supported as far as I know. Maybe > > > somehow SWIOTLB manually decrypts the data while copying it or > > > something like this, but I'm not 100% sure if that is actually > > > implemented. > > > > > > Regards, > > > Christian. > > > > A bold guess after looking at various code and patches: > > > > 1) Devices under SME that don't support the encryption bit and SEV: > > a) Coherent memory is unencrypted. > > b) Streaming DMA under IOMMU: The IOMMU sets the encrypted bit. > > c) Streaming DMA with SWIOTLB: The bounce buffer is unencrypted. > > Copying to/from bounce-buffer decrypts/encrypts. > > > > 2) Devices under SME that do support the encryption bit (which I > > believe is most graphics devices in general on SME systems, not just > > amdgpu; it "just works") > > *) Coherent memory is encrypted. The DMA layer sets dma addresses and > > pgprot accordingly. > > *) Streaming DMA is encrypted. > > > > So the bug in TTM would then be it's not handling 1a) and 1b) correctly. > > > > Remedy: > > 1b) Shouldn't be used with encryption. > > 1a) This is what we should try to fix. Exporting > > dma_force_unencrypted() didn't seem to be a way forward. Properly > > fixing this would, I guess, mean implement the missing functionality > > in the dma layer: For vmap / kmap we could simply reuse the virtual > > addresses we get back from dma_alloc_coherent(), but for faulting one > > would want something like dma_coherent_insert_pfn() (if it doesn't > > exist already) after a proper disussion with Christoph Hellwig. > > Christoph once pointed me to dma_mmap_attrs() for this, but I never > found the time to fully look into it. Hmm, yea, that would make sense https://elixir.bootlin.com/linux/latest/source/kernel/dma/direct.c#L564 Replacing the vmap's with dma_mmap_attrs would probably fix this, but it would require a bit of extra setup. So we're saying that yes, we don't want unconditional pgprot_decrypt in ttm_io_prot. We'd like to leave those tt mappings as encrypted when possible and instead maybe add a vaddr to ttm_tt (or extract it from the pages->private via the ttm_pool_dma, but that seems rather ugly), plus add get_vm_area() to be able to use dma_mmap_attrs instead of vmap when use_dma_alloc's is true in ttm_bo_vmap/ttm_bo_kmap? z