Re: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available

On 01.07.24 at 13:52, Jiaxun Yang wrote:
On 1 July 2024 at 12:40 PM, Christian König wrote:
[...]
Ah, wait a second.

Such a thing as non-coherent PCIe implementation doesn't exist. The PCIe
specification makes it mandatory for memory access to be cache coherent.
Some non-PCIe TTM GPUs are hit by this pitfall; we have non-coherent
Vivante GPUs on some devices.

Yeah, but those are perfectly supported.

If you have a non-PCIe device which needs uncached or write-combined system memory allocations, you can just specify this at memory allocation time.
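
As a rough illustration of choosing the caching mode at allocation time, here is a user-space model, not kernel code. The enum mirrors the three values of enum ttm_caching; struct model_device, struct model_tt and model_tt_create() are hypothetical stand-ins for a driver's own tt-creation path, where a real driver would hand its chosen caching mode to ttm_tt_init().

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the values of enum ttm_caching (ttm_caching.h). */
enum model_caching {
	model_uncached,
	model_write_combined,
	model_cached
};

/* Hypothetical device description; not a kernel structure. */
struct model_device {
	bool snoops_cpu_cache;	/* true if device accesses are cache coherent */
};

/* Hypothetical stand-in for the fields a driver fills in for a ttm_tt. */
struct model_tt {
	size_t num_pages;
	enum model_caching caching;
};

/*
 * The driver knows its hardware, so it picks the caching mode when it
 * creates the backing store; the core then needs no fallback logic.
 */
static void model_tt_create(struct model_tt *tt,
			    const struct model_device *dev,
			    size_t num_pages)
{
	tt->num_pages = num_pages;
	tt->caching = dev->snoops_cpu_cache ? model_cached
					    : model_write_combined;
}
```

In this model a non-snooping device simply never asks for cached pages, which is the per-driver approach described above.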

Handling it in the TTM core makes more sense, since it reduces the per-driver
effort of dealing with platform issues.

There are a bunch of non-compliant PCIe implementations which have
broken cache coherency, but those explicitly violate the specification
and because of that are not supported.
I don't really understand; "doesn't exist" and "bunch of" seem to contradict
each other.

A compliant non-coherent PCIe implementation doesn't exist; coherency is made mandatory by the PCIe standard.

What does exist are some non-compliant, non-coherent PCIe implementations, but as far as I know those are not supported at all by Linux.

We already had tons of problems with platform chips which intentionally don't implement the PCIe specification correctly because that is cheaper.

At least for AMD's GPU driver we have resorted to rejecting patches for platform bugs caused by incorrectly wired-up PCIe root complexes.

Regards,
Christian.


Regards,
Christian.

Unfortunately I don't think we can safely downgrade ttm_cached to ttm_write_combined;
we've had enough drama with write-combine behaviour on all the different platforms.

See drm_arch_can_wc_memory in drm_cache.h.

Yes this really sounds like an issue.

Maybe the behavior of ttm_write_combined should additionally be decided
by drm_arch_can_wc_memory() in case of quirks?
IMO for DMA mappings, using dma_pgprot at mapping time makes more sense :-)
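
To make the drm_arch_can_wc_memory() idea concrete, here is a minimal user-space sketch of such a fallback chain. It is only a model of the suggestion, not the patch itself: model_effective_caching() and its parameters are hypothetical, and arch_can_wc merely stands in for whatever drm_arch_can_wc_memory() would return on the running platform.

```c
#include <stdbool.h>

/* Mirrors the values of enum ttm_caching (ttm_caching.h). */
enum model_caching {
	model_uncached,
	model_write_combined,
	model_cached
};

/*
 * Hypothetical helper: pick the caching mode actually used for a mapping.
 * arch_can_wc stands in for the result of drm_arch_can_wc_memory().
 */
static enum model_caching model_effective_caching(enum model_caching requested,
						  bool dev_coherent,
						  bool arch_can_wc)
{
	enum model_caching c = requested;

	/* Cached mappings are only safe when the device snoops CPU caches. */
	if (c == model_cached && !dev_coherent)
		c = model_write_combined;

	/* Fall back to uncached where write combining is known to be broken. */
	if (c == model_write_combined && !arch_can_wc)
		c = model_uncached;

	return c;
}
```

With this ordering, a cached request on a non-coherent device ends up uncached on platforms with a write-combine quirk, rather than silently using a broken write-combined mapping.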

Thanks
- Jiaxun
Thanks

+
   	return ttm_prot_from_caching(caching, tmp);
   }
   EXPORT_SYMBOL(ttm_io_prot);
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 7b00ddf0ce49f..3335df45fba5e 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -152,6 +152,10 @@ static void ttm_tt_init_fields(struct ttm_tt *ttm,
   			       enum ttm_caching caching,
   			       unsigned long extra_pages)
   {
+	/* Downgrade cached mapping for non-snooping devices */
+	if (!bo->bdev->dma_coherent && caching == ttm_cached)
+		caching = ttm_write_combined;
+
   	ttm->num_pages = (PAGE_ALIGN(bo->base.size) >> PAGE_SHIFT) + extra_pages;
   	ttm->page_flags = page_flags;
   	ttm->dma_address = NULL;
diff --git a/include/drm/ttm/ttm_caching.h b/include/drm/ttm/ttm_caching.h
index a18f43e93abab..f92d7911f50e4 100644
--- a/include/drm/ttm/ttm_caching.h
+++ b/include/drm/ttm/ttm_caching.h
@@ -47,7 +47,8 @@ enum ttm_caching {

   	/**
   	 * @ttm_cached: Fully cached like normal system memory, requires that
-	 * devices snoop the CPU cache on accesses.
+	 * devices snoop the CPU cache on accesses. Downgraded to
+	 * ttm_write_combined when the snooping capability is missing.
   	 */
   	ttm_cached
   };
--
2.45.2



