On 27/06/17 07:29 AM, John Brooks wrote: > On Mon, Jun 26, 2017 at 06:44:30PM +0900, Michel Dänzer wrote: >> On 24/06/17 02:39 AM, John Brooks wrote: >>> The BO move throttling code is designed to allow VRAM to fill quickly if it >>> is relatively empty. However, this does not take into account situations >>> where the visible VRAM is smaller than total VRAM, and total VRAM may not >>> be close to full but the visible VRAM segment is under pressure. In such >>> situations, visible VRAM would experience unrestricted swapping and >>> performance would drop. >>> >>> Add a separate counter specifically for moves involving visible VRAM, and >>> check it before moving BOs there. >>> >>> Fixes: 95844d20ae02 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2)) >>> Signed-off-by: John Brooks <john at fastquake.com> >> >> Something like this patch is definitely needed, good catch. >> >> However, as is one issue is that it incurs CPU overhead even when all of >> VRAM is CPU visible. Can that be avoided somehow? >> > > I guess that depends which part of it you consider to be expensive. > > bytes_moved_vis_threshold isn't used unless visible VRAM is smaller than total > VRAM, so any work/instructions that go into computing it could be skipped in > that case (at the cost of checking that visible_vram_size < real_vram_size) > Would that help? I think so. E.g. in amdgpu_cs_get_threshold_for_moves, it should be possible to handle most of this in a single if (visible_vram_size < real_vram_size) block, which should be mostly free in the visible_vram_size == real_vram_size case thanks to branch prediction. Not sure there's much that can be done elsewhere though. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer