So I spent the day looking at ttm and eviction. The first patch I sent earlier is, I believe, something that should be merged. This patch, however, is more about discussing whether other people are interested in having a similar mechanism shared among drivers through ttm. Otherwise I could just move its logic into the radeon driver.

The idea of this patch is that we don't want to constantly move objects in and out of certain memory pools, mostly VRAM. So it adds a minimum residency time, and no object that has been in a given pool for less than this residency time can be moved out. It mostly solves the regression we are having with radeon since the gallium driver change, and probably improves some other workloads.

Statistics I gathered on xonotic/realquake showed that we can have as much as 1GB moving in each direction (VRAM to system and system to VRAM) over a second. So we are obviously not saturating the PCIE bandwidth. Profiling shows that 80-90% of the cost of this eviction is in memory allocation/deallocation for the system memory (a lot of irqlock, and mostly the kernel spending time allocating pages; think 256,000 or more pages per second to allocate/deallocate).

I used this WIP patch to gather statistics and play with various combinations:
http://people.freedesktop.org/~glisse/0001-TTM-EVICT-WIP.patch

Some numbers with xonotic:

17.369fps  stock 3.7 kernel
27.883fps  3.7 kernel + do not preserve caching patch (~ +60%)
49.292fps  3.7 kernel + WIP with 500ms residency for all pools and no bo wait for eviction
49.258fps  3.7 kernel + WIP with 500ms residency for all pools and bo wait
48.213fps  3.7 kernel always allowing GTT placement (basically reverts the gallium patch effect)

Another design I am thinking of is changing the way radeon handles its memory and to stop trying to revalidate objects to a different memory pool at each cs. Instead, I think we should keep a VRAM lru list, probably per process, and move bo out of VRAM according to this lru and following some heuristic.
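To make the lru idea a bit more concrete, here is a minimal userspace sketch. It assumes the heuristic is plain lru order combined with the minimum residency rule from this patch, which is only one possible choice. All names (sketch_bo, sketch_bo_can_evict, sketch_pick_victim) are hypothetical and are not part of ttm or radeon; time is a plain millisecond counter rather than jiffies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer object; NOT the real struct ttm_buffer_object. */
struct sketch_bo {
	uint64_t placed_ms;         /* time at which the bo entered this pool */
	struct sketch_bo *lru_next; /* next bo in the lru, more recently used */
};

/*
 * A bo may leave the pool only once it has stayed there for at least
 * min_residency_ms. Assumes a monotonic clock; a bo whose placement
 * time is in the future is treated as not evictable.
 */
static bool sketch_bo_can_evict(const struct sketch_bo *bo,
				uint64_t now_ms, uint64_t min_residency_ms)
{
	return now_ms >= bo->placed_ms &&
	       now_ms - bo->placed_ms >= min_residency_ms;
}

/*
 * Walk the lru from the least recently used bo and return the first one
 * that satisfies the residency rule, or NULL if nothing can be evicted
 * yet (the caller would then place the new bo elsewhere, e.g. in GTT).
 */
static struct sketch_bo *sketch_pick_victim(struct sketch_bo *lru_head,
					    uint64_t now_ms,
					    uint64_t min_residency_ms)
{
	for (struct sketch_bo *bo = lru_head; bo; bo = bo->lru_next)
		if (sketch_bo_can_evict(bo, now_ms, min_residency_ms))
			return bo;
	return NULL;
}
```

With a 500ms residency, a bo placed 100ms ago is skipped even if it sits at the head of the lru, so a bo that was just moved in cannot be pushed straight back out.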
So radeon would only move a bo into VRAM when there is room.

Another improvement I am thinking of is to reuse the GTT memory of objects that are moved in for objects that are evicted, as the statistics I gathered showed that the amounts moving in and out are often close. But this would require true DMA, as it would mean scheduling in/out moves at page granularity or in groups of pages (write 4 pages from VRAM to 4 scratch pages in system memory, write 4 pages of the system memory bo to the 4 VRAM pages, write 4 pages of VRAM to the 4 pages of system memory that were just moved, ...).

Cheers,
Jerome