On 2019-02-18 3:39 p.m., Thomas Hellstrom wrote: > On Mon, 2019-02-18 at 18:07 +0100, Christian König wrote: >> Am 18.02.19 um 10:47 schrieb Thomas Hellstrom: >>> On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote: >>>> Another good question is also why the heck the acc_size counts >>>> towards >>>> the DMA32 zone? >>> DMA32 TTM pages are accounted in the DMA32 zone. Other pages are >>> not. >> Yeah, I'm perfectly aware of this. But this is for the accounting >> size! >> >> We have an accounting for the stuff needed additional to the pages >> backing the BO (e.g. the page and DMA addr array). >> >> And from the bug description it sounds like we use the DMA32 zone >> for >> this accounting which of course is completely nonsense. > It's actually accounted in all available zones, since it would be > pretty hard to determine exactly where that memory should be accounted. > In particular if it's vmalloced. It might be DMA32, it might not. Given > the objective of stopping malicious user-space from exhausting the > DMA32 zone it was, at the time the code was written, a reasonable > approximation. With ever increasing memory sizes, there might be better > solutions? As far as I can see, in TTM, ttm_mem_global_alloc is only used for the acc_size in ttm_bo_init_reserved. Other than that vmwgfx also seems to use it to account for a few things that are allocated with kmalloc. So would a better solution be to change ttm_mem_global_alloc to use only the kernel zone? Regards, Felix > > /Thomas > >> Christian. >> >>> For small persistent allocations using ttm_mem_global_alloc(), they >>> are >>> accounted also in the DMA32 zone, which may cause over-accounting >>> of >>> that zone, but that's pretty unlikely to be a big problem.. >>> >>> /Thomas >>> >>> >>> >>> >>> >>>> In other words why should the internal bookkeeping pages be >>>> allocated >>>> in >>>> the DMA32 zone? >>>> >>>> That doesn't sounds valid to me in any way, >>>> Christian. >>>> >>>> Am 18.02.19 um 09:02 schrieb Thomas Hellstrom: >>>>> Hmm, >>>>> >>>>> This zone was intended to stop TTM page allocations from >>>>> exhausting >>>>> the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by >>>>> default, >>>>> which >>>>> means if we drop this check, other devices may stop functioning >>>>> unexpectedly? >>>>> >>>>> However, in the end I'd expect the kernel page allocation >>>>> system >>>>> to >>>>> make sure there are some pages left in the DMA32 zone, >>>>> otherwise >>>>> random non-IO page allocations would also potentially exhaust >>>>> the >>>>> DMA32 zone without anybody caring, which means removing this >>>>> zone >>>>> wouldn't be any worse than whatever other subsystems may be >>>>> doing >>>>> already... >>>>> >>>>> /Thomas >>>>> >>>>> >>>>> On 2/16/19 12:02 AM, Kuehling, Felix wrote: >>>>>> This is an RFC. I'm not sure this is the right solution, but >>>>>> it >>>>>> highlights the problem I'm trying to solve. >>>>>> >>>>>> The dma32_zone limits the acc_size of all allocated BOs to >>>>>> 2GB. >>>>>> On a >>>>>> 64-bit system with hundreds of GB of system memory and GPU >>>>>> memory, >>>>>> this can become a bottle neck. We're seeing TTM memory >>>>>> allocation >>>>>> failures not because we're truly out of memory, but because >>>>>> we're >>>>>> out of space in the dma32_zone for the acc_size needed for >>>>>> our BO >>>>>> book-keeping. >>>>>> >>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx> >>>>>> CC: thellstrom@xxxxxxxxxx >>>>>> CC: christian.koenig@xxxxxxx >>>>>> --- >>>>>> drivers/gpu/drm/ttm/ttm_memory.c | 4 ++-- >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_memory.c >>>>>> b/drivers/gpu/drm/ttm/ttm_memory.c >>>>>> index f1567c3..bb05365 100644 >>>>>> --- a/drivers/gpu/drm/ttm/ttm_memory.c >>>>>> +++ b/drivers/gpu/drm/ttm/ttm_memory.c >>>>>> @@ -363,7 +363,7 @@ static int >>>>>> ttm_mem_init_highmem_zone(struct >>>>>> ttm_mem_global *glob, >>>>>> glob->zones[glob->num_zones++] = zone; >>>>>> return 0; >>>>>> } >>>>>> -#else >>>>>> +#elifndef CONFIG_64BIT >>>>>> static int ttm_mem_init_dma32_zone(struct ttm_mem_global >>>>>> *glob, >>>>>> const struct sysinfo *si) >>>>>> { >>>>>> @@ -441,7 +441,7 @@ int ttm_mem_global_init(struct >>>>>> ttm_mem_global >>>>>> *glob) >>>>>> ret = ttm_mem_init_highmem_zone(glob, &si); >>>>>> if (unlikely(ret != 0)) >>>>>> goto out_no_zone; >>>>>> -#else >>>>>> +#elifndef CONFIG_64BIT >>>>>> ret = ttm_mem_init_dma32_zone(glob, &si); >>>>>> if (unlikely(ret != 0)) >>>>>> goto out_no_zone; >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cthellstrom%40vmware.com%7C1a84a2a700cd48b3980a08d695c38ade%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636861064581303965&sdata=6FCAP6YbostgKfEEhRKKh9eQSrfXR%2BfCczYB8WwlPVY%3D&reserved=0 _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx