On 24.06.2017 at 20:36, John Brooks wrote:
> On Sat, Jun 24, 2017 at 08:07:15PM +0200, Christian König wrote:
>> On 23.06.2017 at 19:39, John Brooks wrote:
>>> This patch series is intended to improve performance when limited CPU-visible
>>> VRAM is under pressure.
>>>
>>> Moving BOs into visible VRAM is essentially a housekeeping task. It's faster to
>>> access them in VRAM than GTT, but it isn't a hard requirement for them to be in
>>> VRAM. As such, it is unnecessary to spend valuable time blocking on this in the
>>> page fault handler or during command submission. Doing so translates directly
>>> into a longer frame time (ergo stalls and stuttering).
>> Sorry, but that strongly sounds like you are messing with things you don't
>> fully understand.
>>
>> Blocking in the page fault handler is mandatory to handle the page fault
>> correctly. So that is not something we can change easily.
> I do understand that. Indeed, the page fault handler must block until the
> memory is accessible. But the memory doesn't have to be in visible VRAM to be
> accessible; it could also be in GTT. So, what I meant was that it does not have
> to block for the excessively long time it takes to move BOs into visible VRAM
> when it is already full. It could spend less time blocking by just moving it to
> GTT in that situation. I apologize for the poor wording.

Ah! Yeah, that makes more sense. I could have read the source first before
writing the mail.

Sorry, I just got goosebumps from the idea that you tried to avoid the
blocking in the page fault handler :)

Christian.

>
> John
>
>> Regards,
>> Christian.
>>
>>> The problem worsens when attempting to move BOs into visible VRAM when it is
>>> full. This takes much longer than a simple move because other BOs have to be
>>> evicted, which involves finding and then moving potentially hundreds of other
>>> BOs, which is very time-consuming.
>>> In the case of limited visible VRAM, it's important to do this sometime to
>>> keep the contents of visible VRAM fresh, but it does not need to be a
>>> blocking operation. If visible VRAM is full, the BO can be read from GTT in
>>> the meantime and the BO can be moved to VRAM later.
>>>
>>> Thus, I have made it so that neither the command submission code nor page fault
>>> handler spends time evicting BOs from visible VRAM, and instead this is
>>> deferred to a workqueue function that's queued when CS requests BOs flagged
>>> AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED.
>>>
>>> Speaking of CPU_ACCESS_REQUIRED, I've changed the handling of that flag so that
>>> the kernel driver can clear it later even if it was set by userspace. This is
>>> because the userspace graphics library can't know whether the application
>>> really needs it to be CPU_ACCESS_REQUIRED forever. The kernel driver can't know
>>> either, but it does know when page faults occur, and if a BO doesn't appear to
>>> have any page faults when it's moved somewhere inaccessible, the flag can be
>>> removed and it doesn't have to take up space in CPU-visible memory anymore.
>>> This change was based on IRC discussions with Michel.
>>>
>>> Patch 7 fixes a problem with BO moverate throttling that causes visible VRAM
>>> moves to not be throttled if total VRAM isn't full enough.
>>>
>>> I've also added a vis_vramlimit module parameter for debugging purposes. It's
>>> similar to the vramlimit parameter except it limits only visible VRAM.
>>>
>>> I have tested this patch set with the two games I know to be affected by
>>> visible VRAM pressure: DiRT Rally and Dying Light. It practically eliminates
>>> eviction-related stuttering in DiRT Rally as well as very low performance if
>>> visible VRAM is limited to 64MB. It also fixes severely low framerates that
>>> occurred in some areas of Dying Light. All my testing was done with an R9 290
>>> with 4GB of visible VRAM with an Intel i7 4790.
>>>
>>> --
>>> John Brooks (Frogging101)
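
The placement policy discussed in this thread (block in the fault path only
as long as it takes to make the BO accessible, fall back to GTT when visible
VRAM is full, and leave eviction to deferred work) can be sketched as a small
userspace model in C. This is only an illustration under stated assumptions,
not the amdgpu implementation: every name below (toy_bo, toy_vram,
place_on_fault, deferred_move) is hypothetical, sizes are arbitrary units, and
the "worker" is an ordinary function call rather than a kernel workqueue item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. All names are hypothetical, not amdgpu symbols. */
enum toy_domain { TOY_GTT, TOY_VIS_VRAM };

struct toy_bo {
    size_t size;
    enum toy_domain domain;
    bool move_pending;          /* a deferred move to visible VRAM is wanted */
};

struct toy_vram {
    size_t capacity;            /* size of the CPU-visible window */
    size_t used;
};

static size_t toy_free(const struct toy_vram *v)
{
    assert(v->used <= v->capacity);
    return v->capacity - v->used;
}

/*
 * Fault path: never evicts. The BO goes to visible VRAM only if there is
 * already room; otherwise it is placed in GTT (still CPU-accessible, just
 * slower) and flagged so that deferred work can move it later.
 */
static enum toy_domain place_on_fault(struct toy_vram *v, struct toy_bo *bo)
{
    if (toy_free(v) >= bo->size) {
        v->used += bo->size;
        bo->domain = TOY_VIS_VRAM;
        bo->move_pending = false;
    } else {
        bo->domain = TOY_GTT;
        bo->move_pending = true;
    }
    return bo->domain;
}

/*
 * Deferred path: the slow part. Evicts candidate BOs to GTT until the
 * pending BO fits, then moves it in. Returns true if the move happened.
 */
static bool deferred_move(struct toy_vram *v, struct toy_bo *bo,
                          struct toy_bo **victims, size_t nvictims)
{
    if (!bo->move_pending)
        return false;

    for (size_t i = 0; i < nvictims && toy_free(v) < bo->size; i++) {
        if (victims[i]->domain != TOY_VIS_VRAM)
            continue;
        victims[i]->domain = TOY_GTT;
        v->used -= victims[i]->size;
    }

    if (toy_free(v) < bo->size)
        return false;           /* still no room, try again later */

    v->used += bo->size;
    bo->domain = TOY_VIS_VRAM;
    bo->move_pending = false;
    return true;
}
```

The point of the split is that place_on_fault does a bounded amount of work,
while the loop over victims, which is the expensive part described above, only
runs in deferred_move and so stays off the fault and command submission paths.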