Re: [PATCH 05/13] drm/ttm: overhaul memory accounting

Thomas Hellstrom <thellstrom@xxxxxxxxxx> · Thu, 10 Nov 2011 21:05:22 +0100

On 11/10/2011 07:05 PM, Jerome Glisse wrote:
On Thu, Nov 10, 2011 at 11:27:33AM +0100, Thomas Hellstrom wrote:

On 11/09/2011 09:22 PM, j.glisse@xxxxxxxxx wrote:

From: Jerome Glisse<jglisse@xxxxxxxxxx>

This is an overhaul of the ttm memory accounting. This tries to keep
the same global behavior while removing the whole zone concept. It
keeps a distrinction for dma32 so that we make sure that ttm don't
starve the dma32 zone.

There is 3 threshold for memory allocation :
- max_mem is the maximum memory the whole ttm infrastructure is
   going to allow allocation for (exception of system process see
   below)
- emer_mem is the maximum memory allowed for system process, this
   limit is>   to max_mem
- swap_limit is the threshold at which point ttm will start to
   try to swap object because ttm is getting close the max_mem
   limit
- swap_dma32_limit is the threshold at which point ttm will start
   swap object to try to reduce the pressure on the dma32 zone. Note
   that we don't specificly target object to swap to it might very
   well free more memory from highmem rather than from dma32

Accounting is done through used_mem&   used_dma32_mem, which sum give
the total amount of memory actually accounted by ttm.

Idea is that allocation will fail if (used_mem + used_dma32_mem)>
max_mem and if swapping fail to make enough room.

The used_dma32_mem can be updated as a later stage, allowing to
perform accounting test before allocating a whole batch of pages.

Jerome, you're removing a fair amount of functionality here, without
justifying
why it could be removed.

All this code was overkill.

[1] I don't agree, and since it's well tested, thought throught and 
working, I see no obvious reason to alter it,
within the context of this patch series unless it's absolutely required 
for the functionality.

Consider a low-end system with 1G of kernel memory and 10G of
highmem. How do we avoid putting stress on the kernel memory? I also
wouldn't be too surprised if DMA32 zones appear in HIGHMEM systems
in the future making the current zone concept good to keep.

Right now kernel memory is accounted against all zone so it decrease
not only the kernel zone but also the dma32&  highmem if present.

Do you mean that the code is incorrect? In that case, did you consider 
the fact
that zones may overlap? (Although I admit the name "highmem" might be 
misleading. Should be "total").

Note also that kernel zone in current code == dma32 zone.

Last time I looked, the highmem split was typically at slightly less 
than 1GB, depending on the hardware and desired setup. I admit that was 
some time ago, but has that really changed? On all archs?
Furthermore, on !Highmem systems, All pages are in the kernel zone.

When it comes to future it looks a lot simpler, it seems everyone
is moving toward more capable and more advanced iommu that can remove
all the restriction on memory from the device pov. I mean even arm
are getting more and more advance iommu. I don't see any architecture
worse supporting not going down that road.

While the proposed change is probably possible, with different low <-> 
high splits depending on whether HIGHMEM is defined or not, I think [1] 
is a good reason for not pushing it through.

Also, in effect you move the DOS from *all* zones into the DMA32
zone and create a race in that multiple simultaneous allocators can
first pre-allocate out of the global zone, and then update the DMA32
zone without synchronization. In this way you might theoretically
end up with more DMA32 pages allocated than present in the zone.

Ok a respin attached with a simple change, things will be
accounted against dma32 zone and only when we get page we will
decrease the dma32 zone usage that way no DOS on dma32.

It also deals with the case where there is still lot of highmem
but no more dma32.

So why not just do a ttm_mem_global_alloc() for the pages you want to 
allocate,
and add a proper adjustment function if memory turns out to be either 
HIGHMEM or !DMA32

With the proposed code there's also a theoretical problem in that a
potentially huge number of pages are unaccounted before they are
actually freed.

What you mean unaccounted ? The way it works is :
r = global_memory_alloc(size)
if (r) fail
alloc pages
update memory accounting according to what page where allocated

So memory is always accounted before even being allocated (exception
are the kernel object for vmwgfx&  ttm_bo but we can move accounting
there too if you want, those are small allocation and i didn't think
it was worse changing that).

No, I mean the sequence

unaccount_page_array()
---Race--
free_page_array()

/Thomas

Cheers,
Jerome

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel