Re: [PATCH 1/1] [RFC] drm/ttm: Don't init dma32_zone on 64-bit systems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/20/19 9:07 AM, Christian König wrote:
Am 20.02.19 um 07:41 schrieb Thomas Hellstrom:
On Tue, 2019-02-19 at 17:06 +0000, Kuehling, Felix wrote:
On 2019-02-18 3:39 p.m., Thomas Hellstrom wrote:
On Mon, 2019-02-18 at 18:07 +0100, Christian König wrote:
Am 18.02.19 um 10:47 schrieb Thomas Hellstrom:
On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote:
Another good question is also why the heck the acc_size
counts
towards
the DMA32 zone?
DMA32 TTM pages are accounted in the DMA32 zone. Other pages
are
not.
Yeah, I'm perfectly aware of this. But this is for the accounting
size!

We have an accounting for the stuff needed additional to the
pages
backing the BO (e.g. the page and DMA addr array).

And from the bug description it sounds like we use the DMA32 zone
for
this accounting which of course is completely nonsense.
It's actually accounted in all available zones, since it would be
pretty hard to determine exactly where that memory should be
accounted.
In particular if it's vmalloced. It might be DMA32, it might not.
Given
the objective of stopping malicious user-space from exhausting the
DMA32 zone it was, at the time the code was written, a reasonable
approximation. With ever increasing memory sizes, there might be
better
solutions?
As far as I can see, in TTM, ttm_mem_global_alloc is only used for
the
acc_size in ttm_bo_init_reserved. Other than that vmwgfx also seems
to
use it to account for a few things that are allocated with kmalloc.

So would a better solution be to change ttm_mem_global_alloc to use
only
the kernel zone?

IMO we need to determine what functionality to keep and then the best
solution. The current code does its job, but is obviously too
restrictive. Both of the solutions you suggest open up for potential
DOS attacks (DMA32 and kernel zones are not mutually exclusive. They
overlap).

Yeah and exactly because of this it doesn't make to much sense to account the housekeeping stuff in both zones.

IMO it makes sense because (given the assumption that kmalloc/vmalloc doesn't care) , accounting only in DMA32 in low memory configurations will not guarantee that kernel is not exhausted. Accounting only in kernel in huge memory configurations will not guarantee that DMA32 is not exhausted.


IIRC the kernel takes perfectly care by itself that kmalloced memory doesn't eat up everything in the DMA32 zone.

If we can somehow verify this, or at least illustrate that it's likely, I'm perfectly fine with either patch from the vmwgfx POW.

Thanks,
Thomas




Christian.



/Thomas




Regards,
    Felix


/Thomas

Christian.

For small persistent allocations using ttm_mem_global_alloc(),
they
are
accounted also in the DMA32 zone, which may cause over-
accounting
of
that zone, but that's pretty unlikely to be a big problem..

/Thomas





In other words why should the internal bookkeeping pages be
allocated
in
the DMA32 zone?

That doesn't sounds valid to me in any way,
Christian.

Am 18.02.19 um 09:02 schrieb Thomas Hellstrom:
Hmm,

This zone was intended to stop TTM page allocations from
exhausting
the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by
default,
which
means if we drop this check, other devices may stop
functioning
unexpectedly?

However, in the end I'd expect the kernel page allocation
system
to
make sure there are some pages left in the DMA32 zone,
otherwise
random non-IO page allocations would also potentially
exhaust
the
DMA32 zone without anybody caring, which means removing
this
zone
wouldn't be any worse than whatever other subsystems may be
doing
already...

/Thomas


On 2/16/19 12:02 AM, Kuehling, Felix wrote:
This is an RFC. I'm not sure this is the right solution,
but
it
highlights the problem I'm trying to solve.

The dma32_zone limits the acc_size of all allocated BOs
to
2GB.
On a
64-bit system with hundreds of GB of system memory and
GPU
memory,
this can become a bottle neck. We're seeing TTM memory
allocation
failures not because we're truly out of memory, but
because
we're
out of space in the dma32_zone for the acc_size needed
for
our BO
book-keeping.

Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
CC: thellstrom@xxxxxxxxxx
CC: christian.koenig@xxxxxxx
---
     drivers/gpu/drm/ttm/ttm_memory.c | 4 ++--
     1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_memory.c
b/drivers/gpu/drm/ttm/ttm_memory.c
index f1567c3..bb05365 100644
--- a/drivers/gpu/drm/ttm/ttm_memory.c
+++ b/drivers/gpu/drm/ttm/ttm_memory.c
@@ -363,7 +363,7 @@ static int
ttm_mem_init_highmem_zone(struct
ttm_mem_global *glob,
         glob->zones[glob->num_zones++] = zone;
         return 0;
     }
-#else
+#elifndef CONFIG_64BIT
     static int ttm_mem_init_dma32_zone(struct
ttm_mem_global
*glob,
                        const struct sysinfo *si)
     {
@@ -441,7 +441,7 @@ int ttm_mem_global_init(struct
ttm_mem_global
*glob)
         ret = ttm_mem_init_highmem_zone(glob, &si);
         if (unlikely(ret != 0))
             goto out_no_zone;
-#else
+#elifndef CONFIG_64BIT
         ret = ttm_mem_init_dma32_zone(glob, &si);
         if (unlikely(ret != 0))
             goto out_no_zone;
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cthellstrom%40vmware.com%7C1357d06244fb499b31dd08d6968ca04b%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636861928079462725&amp;sdata=1Bucho93KMzN0z7QbfiNn%2BNWaZs5yi86Ya6vm9Xhbqo%3D&amp;reserved=0
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux