Re: [PATCH v2] slab, rust: extend kmalloc() alignment guarantees to remove Rust padding

On Wed, Jul 3, 2024 at 9:25 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> Slab allocators have been guaranteeing natural alignment for
> power-of-two sizes since commit 59bb47985c1d ("mm, sl[aou]b: guarantee
> natural alignment for kmalloc(power-of-two)"), while any other sizes are
> guaranteed to be aligned only to ARCH_KMALLOC_MINALIGN bytes (although
> in practice are aligned more than that in non-debug scenarios).
>
> Rust's allocator API specifies size and alignment per allocation, which
> have to satisfy the following rules, per Alice Ryhl [1]:
>
>   1. The alignment is a power of two.
>   2. The size is non-zero.
>   3. When you round up the size to the next multiple of the alignment,
>      then it must not overflow the signed type isize / ssize_t.
>
> In order to map this to kmalloc()'s guarantees, some requested
> allocation sizes have to be padded to the next power-of-two size [2].
> For example, an allocation of size 96 and alignment of 32 will be padded
> to an allocation of size 128, because the existing kmalloc-96 bucket
> doesn't guarantee alignment above ARCH_KMALLOC_MINALIGN. Without slab
> debugging active, the layout of the kmalloc-96 slabs however naturally
> aligns the objects to 32 bytes, so extending the size to 128 bytes is
> wasteful.
>
> To improve the situation we can extend the kmalloc() alignment
> guarantees in a way that
>
> 1) doesn't change the current slab layout (and thus does not increase
>    internal fragmentation) when slab debugging is not active
> 2) reduces waste in the Rust allocator use case
> 3) is a superset of the current guarantee for power-of-two sizes.
>
> The extended guarantee is that alignment is at least the largest
> power-of-two divisor of the requested size. For power-of-two sizes the
> largest divisor is the size itself, but let's keep this case documented
> separately for clarity.
>
> For current kmalloc size buckets, it means kmalloc-96 will guarantee
> alignment of 32 bytes and kmalloc-192 will guarantee 64 bytes.
>
> This covers the rules 1 and 2 above of Rust's API as long as the size is
> a multiple of the alignment. The Rust layer should now only need to
> round up the size to the next multiple if it isn't, while enforcing the
> rule 3.
>
> Implementation-wise, this changes the alignment calculation in
> create_boot_cache(). While at it also do the calculation only for caches
> with the SLAB_KMALLOC flag, because the function is also used to create
> the initial kmem_cache and kmem_cache_node caches, where no alignment
> guarantee is necessary.
>
> In the Rust allocator's krealloc_aligned(), remove the code that padded
> sizes to the next power of two (suggested by Alice Ryhl) as it's no
> longer necessary with the new guarantees.
>
> Reported-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> Reported-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> Link: https://lore.kernel.org/all/CAH5fLggjrbdUuT-H-5vbQfMazjRDpp2%2Bk3%3DYhPyS17ezEqxwcw@xxxxxxxxxxxxxx/ [1]
> Link: https://lore.kernel.org/all/CAH5fLghsZRemYUwVvhk77o6y1foqnCeDzW4WZv6ScEWna2+_jw@xxxxxxxxxxxxxx/ [2]
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Reviewed-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>

Reviewed-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>

> ---
> v2: - add Rust side change as suggested by Alice, also thanks Boqun for fixups
> - clarify that the alignment already existed (unless debugging) but was
>   not guaranteed, so there's no extra fragmentation in slab
> - add r-b, a-b thanks to Boqun and Roman
>
> If it's fine with Rust folks, I can put this in the slab.git tree.
>
>  Documentation/core-api/memory-allocation.rst |  6 ++++--
>  include/linux/slab.h                         |  3 ++-
>  mm/slab_common.c                             |  9 +++++----
>  rust/kernel/alloc/allocator.rs               | 19 ++++++-------------
>  4 files changed, 17 insertions(+), 20 deletions(-)
>
> diff --git a/Documentation/core-api/memory-allocation.rst b/Documentation/core-api/memory-allocation.rst
> index 1c58d883b273..8b84eb4bdae7 100644
> --- a/Documentation/core-api/memory-allocation.rst
> +++ b/Documentation/core-api/memory-allocation.rst
> @@ -144,8 +144,10 @@ configuration, but it is a good practice to use `kmalloc` for objects
>  smaller than page size.
>
>  The address of a chunk allocated with `kmalloc` is aligned to at least
> -ARCH_KMALLOC_MINALIGN bytes.  For sizes which are a power of two, the
> -alignment is also guaranteed to be at least the respective size.
> +ARCH_KMALLOC_MINALIGN bytes. For sizes which are a power of two, the
> +alignment is also guaranteed to be at least the respective size. For other
> +sizes, the alignment is guaranteed to be at least the largest power-of-two
> +divisor of the size.
>
>  Chunks allocated with kmalloc() can be resized with krealloc(). Similarly
>  to kmalloc_array(): a helper for resizing arrays is provided in the form of
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index ed6bee5ec2b6..640cea6e6323 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -604,7 +604,8 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node)
>   *
>   * The allocated object address is aligned to at least ARCH_KMALLOC_MINALIGN
>   * bytes. For @size of power of two bytes, the alignment is also guaranteed
> - * to be at least to the size.
> + * to be at least to the size. For other sizes, the alignment is guaranteed to
> + * be at least the largest power-of-two divisor of @size.
>   *
>   * The @flags argument may be one of the GFP flags defined at
>   * include/linux/gfp_types.h and described at
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 1560a1546bb1..7272ef7bc55f 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -617,11 +617,12 @@ void __init create_boot_cache(struct kmem_cache *s, const char *name,
>         s->size = s->object_size = size;
>
>         /*
> -        * For power of two sizes, guarantee natural alignment for kmalloc
> -        * caches, regardless of SL*B debugging options.
> +        * kmalloc caches guarantee alignment of at least the largest
> +        * power-of-two divisor of the size. For power-of-two sizes,
> +        * it is the size itself.
>          */
> -       if (is_power_of_2(size))
> -               align = max(align, size);
> +       if (flags & SLAB_KMALLOC)
> +               align = max(align, 1U << (ffs(size) - 1));
>         s->align = calculate_alignment(flags, align, size);
>
>  #ifdef CONFIG_HARDENED_USERCOPY
> diff --git a/rust/kernel/alloc/allocator.rs b/rust/kernel/alloc/allocator.rs
> index 229642960cd1..e6ea601f38c6 100644
> --- a/rust/kernel/alloc/allocator.rs
> +++ b/rust/kernel/alloc/allocator.rs
> @@ -18,23 +18,16 @@ pub(crate) unsafe fn krealloc_aligned(ptr: *mut u8, new_layout: Layout, flags: F
>      // Customized layouts from `Layout::from_size_align()` can have size < align, so pad first.
>      let layout = new_layout.pad_to_align();
>
> -    let mut size = layout.size();
> -
> -    if layout.align() > bindings::ARCH_SLAB_MINALIGN {
> -        // The alignment requirement exceeds the slab guarantee, thus try to enlarge the size
> -        // to use the "power-of-two" size/alignment guarantee (see comments in `kmalloc()` for
> -        // more information).
> -        //
> -        // Note that `layout.size()` (after padding) is guaranteed to be a multiple of
> -        // `layout.align()`, so `next_power_of_two` gives enough alignment guarantee.
> -        size = size.next_power_of_two();
> -    }
> +    // Note that `layout.size()` (after padding) is guaranteed to be a multiple of `layout.align()`
> +    // which together with the slab guarantees means the `krealloc` will return a properly aligned
> +    // object (see comments in `kmalloc()` for more information).
> +    let size = layout.size();

The size is a multiple of the alignment due to the `pad_to_align` call
above, which rounds up the size to ensure that this is the case.
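
For illustration only, here is a minimal userspace sketch of that
arithmetic (not the kernel code). It uses `std::alloc::Layout` in place
of the kernel's `core::alloc::Layout`, and the helper names
`largest_pow2_divisor`, `padded_size` and `satisfies_guarantee` are
hypothetical, introduced just for this example. The first helper mirrors
the `1U << (ffs(size) - 1)` expression from the slab_common.c hunk.

```rust
use std::alloc::Layout;

/// Largest power-of-two divisor of a non-zero `size`.
/// Hypothetical helper; same value as `1U << (ffs(size) - 1)` in C.
fn largest_pow2_divisor(size: usize) -> usize {
    assert!(size != 0, "size must be non-zero");
    1usize << size.trailing_zeros()
}

/// Size after `pad_to_align()`, i.e. rounded up to a multiple of `align`.
/// Hypothetical helper; panics if (size, align) is not a valid `Layout`.
fn padded_size(size: usize, align: usize) -> usize {
    Layout::from_size_align(size, align)
        .expect("invalid size/align combination")
        .pad_to_align()
        .size()
}

/// True if the extended guarantee (alignment >= largest power-of-two
/// divisor of the requested size) covers `align` once the size is padded.
fn satisfies_guarantee(size: usize, align: usize) -> bool {
    largest_pow2_divisor(padded_size(size, align)) >= align
}
```

With these helpers, a (size 96, align 32) request stays at 96 bytes
rather than growing to 128, and a (size 8, align 32) request is padded
to 32. Since the padded size is always a multiple of the power-of-two
alignment, its largest power-of-two divisor cannot be smaller than the
alignment.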

Alice




