> One question: Will it be possible to share these split BOs as dmabufs?

In theory yes, in practice I'm not sure.

DMA-bufs are designed around scatter-gather tables, which fortunately
support buffers split over the whole address space.

The problem is that the importing device needs to be able to handle that
as well.

Regards,
Christian.

On 16.08.2016 at 20:33, Felix Kuehling wrote:
> Very nice. I'm looking forward to this for KFD as well.
>
> One question: Will it be possible to share these split BOs as dmabufs?
>
> Regards,
>   Felix
>
>
> On 16-08-16 11:27 AM, Christian König wrote:
>> Hi Marek,
>>
>> I'm already working on this.
>>
>> My current approach is to use a custom BO manager for VRAM with TTM
>> and so split allocations into chunks of 4MB.
>>
>> Large BOs are still swapped out as one, but it makes it much more
>> likely that you can allocate 1/2 of VRAM as one buffer.
>>
>> Give me till the end of the week to finish this and then we can test
>> if that's sufficient or if we need to do more.
>>
>> Regards,
>> Christian.
>>
>> On 16.08.2016 at 16:33, Marek Olšák wrote:
>>> Hi,
>>>
>>> I'm seeing random temporary freezes (up to 2 seconds) under memory
>>> pressure. Before I describe the exact circumstances, I'd like to say
>>> that this is a serious issue affecting playability of certain AAA
>>> Linux games.
>>>
>>> In order to reproduce this, an application should:
>>> - allocate a few very large buffers (256-512 MB per buffer)
>>> - allocate more memory than there is available VRAM. The issue also
>>> occurs (but at a lower frequency) if the app needs only 80% of VRAM.
>>>
>>> Example: ttm_bo_validate needs to migrate a 512 MB buffer. The total
>>> size of moved memory for that call can be as high as 1.5 GB. This is
>>> always followed by a big temporary drop in VRAM usage.
>>>
>>> The game I'm testing needs 3.4 GB of VRAM.
>>>
>>> Setups:
>>> Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
>>> Fiji - 4 GB: There is one freeze at the beginning (which is annoying
>>> too), after that it's smooth.
>>>
>>> So even 4 GB is not enough.
>>>
>>> Workarounds:
>>> - Split buffers into smaller pieces in the kernel. It's not necessary
>>> to manage memory at page granularity (64KB). Splitting buffers into
>>> 16MB-large pieces might not be optimal but it would be a significant
>>> improvement.
>>> - Or do the same in Mesa. This would prevent inter-process and
>>> inter-API buffer sharing for split buffers (DRI, OpenCL), but we would
>>> at least verify how much the situation improves.
>>>
>>> Other issues sharing the same cause:
>>> - Allocations requesting 1/3 or more VRAM have a high chance of
>>> failing. It's generally not possible to allocate 1/2 or more VRAM as
>>> one buffer.
>>>
>>> Comments welcome,
>>>
>>> Marek
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
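
The idea discussed in this thread (satisfy a large VRAM request from several
free holes in 4MB chunks, then describe the pieces as address/length entries
the way a scatter-gather table would) can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the amdgpu or TTM code:
the function names alloc_contiguous, alloc_chunked and to_sg_table are
hypothetical, the hole layout is made up, and only the 4MB chunk size and the
2 GB VRAM size come from the messages above.

```python
# Toy model of chunked VRAM allocation. All names here are hypothetical
# illustrations; none of this is the kernel's TTM or amdgpu API.

MB = 1 << 20
CHUNK = 4 * MB  # chunk size mentioned in the thread


def alloc_contiguous(holes, size):
    """First-fit from a single hole. Returns [(offset, length)] or None."""
    for i, (start, length) in enumerate(holes):
        if length >= size:
            holes[i] = (start + size, length - size)
            return [(start, size)]
    return None


def alloc_chunked(holes, size, chunk=CHUNK):
    """Take `size` bytes from any holes in pieces of at most `chunk` bytes.

    Returns a list of (offset, length) pieces, or None if the total free
    space is too small. `holes` is only modified on success.
    """
    if sum(length for _, length in holes) < size:
        return None
    pieces = []
    remaining = size
    for i, (start, length) in enumerate(holes):
        while length > 0 and remaining > 0:
            take = min(chunk, length, remaining)
            pieces.append((start, take))
            start += take
            length -= take
            remaining -= take
        holes[i] = (start, length)
        if remaining == 0:
            break
    return pieces


def to_sg_table(pieces):
    """Merge adjacent pieces into (address, length) entries, similar in
    spirit to how a scatter-gather table describes a split buffer."""
    table = []
    for addr, length in sorted(pieces):
        if table and table[-1][0] + table[-1][1] == addr:
            table[-1] = (table[-1][0], table[-1][1] + length)
        else:
            table.append((addr, length))
    return table


if __name__ == "__main__":
    # Assumed fragmented 2 GB VRAM: three free 400 MB holes (made-up layout).
    layout = [(0, 400 * MB), (600 * MB, 400 * MB), (1200 * MB, 400 * MB)]
    request = 1024 * MB  # half of a 2 GB VRAM

    print("contiguous:", alloc_contiguous(list(layout), request))  # None
    pieces = alloc_chunked(list(layout), request)
    print("pieces:", len(pieces))                  # 256
    print("sg entries:", len(to_sg_table(pieces)))  # 3
```

In this made-up layout no single hole can hold half of VRAM, so the
contiguous request fails, while the chunked request succeeds and collapses to
three address/length entries. Whether an importing device can consume such a
multi-entry table is exactly the open question raised in the reply above.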