Re: [RFC PATCH] devres: avoid over memory allocation with managed memory allocation

Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> · Sat, 23 Jul 2022 12:10:48 +0200

On Sat, Jul 23, 2022 at 12:04:33PM +0200, Christophe JAILLET wrote:
> On one side, when using devm_kmalloc(), a memory overhead is added in order
> to keep track of the data needed to release the resources automagically.
> 
> On the other side, kmalloc() also rounds-up the required memory size in
> order to ease memory reuse and avoid memory fragmentation.
> 
> Together, these two behaviors can lead to memory over-allocation, which
> can be avoided.
> 
> For example:
>   - if 4096 bytes of managed memory are required
>   - "4096 + sizeof(struct devres_node)" bytes are requested from the
>     memory allocator
>   - 8192 bytes are allocated and nearly half of it is wasted
> 
> In such a case, it would be better to really allocate 4096 bytes of memory
> and record an "action" to perform the kfree() when needed.
> 
> On my 64-bit system:
>    sizeof(struct devres_node) = 40
>    sizeof(struct action_devres) = 16
> 
> So, a devm_add_action() call will allocate 56, rounded up to 64 bytes.
> 
> kmalloc() uses hunks of 8k, 4k, 2k, 1k, 512, 256, 192, 128, 96, 64, 32, 16,
> 8 bytes.
> 
> So in order to save some memory, if the overhead of devm_kmalloc() pushes
> the request across the 256-byte boundary, 2 distinct memory allocations
> make sense.
> 
> Signed-off-by: Christophe JAILLET <christophe.jaillet@xxxxxxxxxx>
> ---
> This patch is only an RFC to get feedback on the proposed approach.
> 
> It is compile tested only.
> I don't have numbers to see how much memory could be saved.
> I don't have numbers on the performance impact.
> 
> Should this make sense to anyone, I would really appreciate getting some
> numbers from others to confirm whether it makes sense or not.
> 
> 
> The idea of this patch came to me because of a discussion initiated by
> Feng Tang <feng.tang@xxxxxxxxx>. He proposes to track wasted memory
> allocation in order to give hints on where optimizations can be done.
> 
> My approach is to avoid part of these allocations when they are due to the
> use of a devm_ function.
> 
> 
> The drawbacks I see are:
>    - code is more complex
>    - this contributes to memory fragmentation because there will be 2
>      memory allocations, instead of just 1
>    - this is slower for every memory allocation because of the while loop
>      and tests
>    - the magic 256 constant may not be relevant on all systems
>    - some places in the kernel already take advantage of this memory
>      over-allocation. So unpredictable impacts can occur somewhere! (see [1],
>      which is part of the [2] thread)
>    - this makes some assumptions in devres.c about how memory allocation
>      works, which is not a great idea :(
> 
> The advantages I see:
>    - in some cases, it saves some memory :)
>    - fragmentation is not necessarily an issue, devm_-allocated memory
>      is rarely freed, right?
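
To make the arithmetic above concrete, here is a minimal userspace sketch,
not kernel code. It assumes the size classes and the two sizeof() values
quoted above (real values depend on the architecture and the kernel
config), ignores any alignment padding, and the helper names
rounded_size(), cost_single() and cost_split() are made up for this
illustration only.

```c
#include <stddef.h>

/* Size classes as quoted in the patch description, smallest first.
 * Assumption: the real kmalloc classes depend on the kernel config. */
static const size_t size_classes[] = {
	8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192,
};

/* Values quoted for a 64-bit system; assumptions, not portable facts. */
enum {
	NODE_OVERHEAD   = 40,	/* sizeof(struct devres_node) */
	ACTION_OVERHEAD = 16,	/* sizeof(struct action_devres) */
};

/* Round a request up to the smallest class that fits.
 * Returns 0 if the request is larger than the biggest class listed. */
size_t rounded_size(size_t size)
{
	size_t n = sizeof(size_classes) / sizeof(size_classes[0]);

	for (size_t i = 0; i < n; i++)
		if (size <= size_classes[i])
			return size_classes[i];
	return 0;
}

/* One allocation: payload and devres bookkeeping in the same chunk. */
size_t cost_single(size_t size)
{
	return rounded_size(size + NODE_OVERHEAD);
}

/* Two allocations: the bare payload, plus a separate action record
 * (node + action, 56 bytes with the values above, rounded up to 64). */
size_t cost_split(size_t size)
{
	return rounded_size(size) +
	       rounded_size(NODE_OVERHEAD + ACTION_OVERHEAD);
}
```

With these assumed numbers, a 4096-byte request costs 8192 bytes as a
single allocation and 4160 bytes when split. A 200-byte request goes the
other way: 256 bytes as a single allocation versus 320 bytes when split,
which is why the split only pays off for some sizes.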

I think devm_ allocations do not happen that much, try it on
your systems and see!

Numbers would be great to have, can you run some benchmarks?  Try it on
a "common" SoC device (raspberry pi?) and a desktop to compare.

thanks,

greg k-h