On Thu, Nov 29, 2012 at 10:18 AM, Thomas Hellstrom <thomas@xxxxxxxxxxxx> wrote:
> On 11/28/2012 10:51 PM, Marek Olšák wrote:
>>
>> I think the problem with Radeon/TTM is much deeper. Let me demonstrate
>> it on the following example.
>>
>> Unigine Heaven needs about 385MB of space for static resources, that's
>> only 75% of my 512MB card. Yet, TTM is not capable of getting all of
>> that into VRAM. If I allow GTT placements, I get 20 fps, which is the
>> old Mesa behavior. If I force VRAM placements, I get 3 fps, because we
>> validate buffers 10 times per frame and there's probably a lot of
>> buffer evictions during each validation.
>>
>
> Marek,
> Did you look at the total amount of referenced buffers in the ring including
> vertex buffers?
>
> Depending on how hard you throttle, I guess vertex / index buffer data
> referenced by the ring commands may well exceed the VRAM limitation.

Buffers (not textures) take only 30 MB.

These are stats for 1 frame of Unigine Heaven. Each line is a CS ioctl.

VRAM [used in CS] / [total allocated], GTT [used in CS] / [total allocated]
1. VRAM: 171 / 390 MB, GTT: 1 / 5 MB
2. VRAM: 144 / 390 MB, GTT: 2 / 5 MB
3. VRAM: 184 / 390 MB, GTT: 1 / 5 MB
4. VRAM: 35 / 390 MB, GTT: 2 / 5 MB
5. VRAM: 119 / 390 MB, GTT: 1 / 5 MB
6. VRAM: 207 / 390 MB, GTT: 1 / 5 MB
7. VRAM: 65 / 390 MB, GTT: 2 / 5 MB

If I move all buffers (vertex, index, constant, streamout, queries,
shader code, etc.) to GTT, this is what one frame looks like (not the
same one though, but it's close):

1. VRAM: 144 / 359 MB, GTT: 16 / 35 MB
2. VRAM: 95 / 359 MB, GTT: 12 / 35 MB
3. VRAM: 178 / 359 MB, GTT: 15 / 35 MB
4. VRAM: 55 / 359 MB, GTT: 13 / 35 MB
5. VRAM: 22 / 359 MB, GTT: 16 / 35 MB
6. VRAM: 163 / 359 MB, GTT: 16 / 35 MB
7. VRAM: 133 / 359 MB, GTT: 11 / 35 MB
8. VRAM: 66 / 359 MB, GTT: 4 / 35 MB

The stats are generated in the Mesa driver based on the driver's
expectations of where buffers should be placed.
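As an illustration of how per-CS numbers in the "[used in CS] / [total allocated]" form could be tallied, here is a minimal Python sketch. It is not the Mesa code: the `Buffer` type, the `cs_stats` and `format_line` helpers, and the individual buffer sizes are all invented for the example. The only thing taken from the stats above is the output format, and the sizes were chosen so that the totals match the first line.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class Buffer:
    # Hypothetical record: the domain is where the driver *expects* the
    # buffer to live, not where the kernel actually placed it.
    name: str
    size: int      # bytes
    domain: str    # "VRAM" or "GTT"


def cs_stats(all_buffers, cs_buffers):
    """Return {domain: (used_in_cs_mb, total_allocated_mb)} for one CS."""
    stats = {}
    referenced = set(cs_buffers)  # a buffer referenced twice counts once
    for domain in ("VRAM", "GTT"):
        total = sum(b.size for b in all_buffers if b.domain == domain)
        used = sum(b.size for b in referenced if b.domain == domain)
        stats[domain] = (used // MB, total // MB)
    return stats


def format_line(index, stats):
    vram, gtt = stats["VRAM"], stats["GTT"]
    return (f"{index}. VRAM: {vram[0]} / {vram[1]} MB, "
            f"GTT: {gtt[0]} / {gtt[1]} MB")


# Invented buffers whose sizes add up to 390 MB of VRAM and 5 MB of GTT.
tex_a = Buffer("tex_a", 100 * MB, "VRAM")
tex_b = Buffer("tex_b", 71 * MB, "VRAM")
tex_c = Buffer("tex_c", 219 * MB, "VRAM")
vbo = Buffer("vbo", 1 * MB, "GTT")
ibo = Buffer("ibo", 4 * MB, "GTT")
all_buffers = [tex_a, tex_b, tex_c, vbo, ibo]

# One CS that references tex_a twice, tex_b once and the vertex buffer.
line = format_line(1, cs_stats(all_buffers, [tex_a, tex_b, vbo, tex_a]))
print(line)
```

The point of the sketch is only that "used in CS" counts each referenced buffer once per submission, while "total allocated" counts every live buffer in that domain regardless of whether the CS touches it.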

I can easily see how VRAM is thrashed with the strict LRU approach.

Also, is it possible that one buffer is moved twice for a single CS
ioctl? Imagine there's a buffer at the end of the relocation list,
which is also at the head of the LRU list. Some buffer in the middle
causes eviction of the last buffer. When the last buffer is validated,
it's moved back to VRAM. Can it happen?

Marek
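
The double-move scenario asked about above can be made concrete with a toy model. The following Python sketch is not TTM code and makes no claim about what TTM actually does: `validate_cs`, `run`, the buffer names and the unit sizes are all hypothetical. It assumes a strict LRU pool that validates the relocation list in order and evicts from the LRU head. The `pin_cs_buffers` flag models the alternative assumption that buffers belonging to the current CS are protected from eviction.

```python
from collections import OrderedDict


def validate_cs(capacity, resident, relocs, sizes, pin_cs_buffers=False):
    """Validate `relocs` in order against a toy strict-LRU VRAM pool.

    `resident` is an OrderedDict of buffer names, ordered from least
    recently used (head) to most recently used (tail).
    Returns {buffer name: number of moves}, counting both evictions
    and placements into VRAM.
    """
    moves = {}
    used = sum(sizes[b] for b in resident)
    cs_set = set(relocs)
    for buf in relocs:
        if buf in resident:
            resident.move_to_end(buf)  # already in VRAM: bump to MRU
            continue
        # Evict from the LRU head until the new buffer fits.
        while used + sizes[buf] > capacity:
            victim = next(
                (b for b in resident
                 if not (pin_cs_buffers and b in cs_set)),
                None,
            )
            if victim is None:
                raise MemoryError(f"cannot place {buf}")
            del resident[victim]
            used -= sizes[victim]
            moves[victim] = moves.get(victim, 0) + 1
        resident[buf] = None
        used += sizes[buf]
        moves[buf] = moves.get(buf, 0) + 1
    return moves


def run(pin_cs_buffers):
    sizes = {"a": 1, "b": 1, "mid": 1, "last": 1}
    # "last" sits at the head of the LRU list and is also the final
    # entry of the relocation list, as in the scenario above.
    resident = OrderedDict.fromkeys(["last", "a", "b"])
    relocs = ["a", "mid", "last"]
    return validate_cs(3, resident, relocs, sizes, pin_cs_buffers)


moves_unpinned = run(pin_cs_buffers=False)
moves_pinned = run(pin_cs_buffers=True)
print("unpinned:", moves_unpinned)
print("pinned:  ", moves_pinned)
```

In this model, without pinning, "last" is moved twice within one validation pass: it is evicted to make room for "mid" and then brought back when its own turn comes, which in turn evicts "b". With pinning, only "b" is evicted and "last" is never moved. Which of the two behaviors applies to the real validation path is exactly the open question.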