Re: [PATCH 03/14] io_uring: specify freeptr usage for SLAB_TYPESAFE_BY_RCU io_kiocb cache

On 11/19/24 12:44 PM, Jens Axboe wrote:
On 11/19/24 12:41 PM, Geert Uytterhoeven wrote:
Hi Jens,

On Tue, Nov 19, 2024 at 8:30 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
On 11/19/24 12:25 PM, Geert Uytterhoeven wrote:
On Tue, Nov 19, 2024 at 8:10 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
On 11/19/24 12:02 PM, Geert Uytterhoeven wrote:
On Tue, Nov 19, 2024 at 8:00 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
On 11/19/24 10:49 AM, Geert Uytterhoeven wrote:
On Tue, Nov 19, 2024 at 5:21 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
On 11/19/24 08:02, Jens Axboe wrote:
On 11/19/24 8:36 AM, Guenter Roeck wrote:
On Tue, Oct 29, 2024 at 09:16:32AM -0600, Jens Axboe wrote:
Doesn't matter right now as there's still some bytes left for it, but
let's prepare for the io_kiocb potentially growing and add a specific
freeptr offset for it.

Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

This patch triggers:

Kernel panic - not syncing: __kmem_cache_create_args: Failed to create slab 'io_kiocb'. Error -22
CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted 6.12.0-mac-00971-g158f238aa69d #1
Stack from 00c63e5c:
         00c63e5c 00612c1c 00612c1c 00000300 00000001 005f3ce6 004b9044 00612c1c
         004ae21e 00000310 000000b6 005f3ce6 005f3ce6 ffffffea ffffffea 00797244
         00c63f20 000c6974 005ee588 004c9051 005f3ce6 ffffffea 000000a5 00c614a0
         004a72c2 0002cb62 000c675e 004adb58 0076f28a 005f3ce6 000000b6 00c63ef4
         00000310 00c63ef4 00000000 00000016 0076f23e 00c63f4c 00000010 00000004
         00000038 0000009a 01000000 00000000 00000000 00000000 000020e0 0076f23e
Call Trace: [<004b9044>] dump_stack+0xc/0x10
  [<004ae21e>] panic+0xc4/0x252
  [<000c6974>] __kmem_cache_create_args+0x216/0x26c
  [<004a72c2>] strcpy+0x0/0x1c
  [<0002cb62>] parse_args+0x0/0x1f2
  [<000c675e>] __kmem_cache_create_args+0x0/0x26c
  [<004adb58>] memset+0x0/0x8c
  [<0076f28a>] io_uring_init+0x4c/0xca
  [<0076f23e>] io_uring_init+0x0/0xca
  [<000020e0>] do_one_initcall+0x32/0x192
  [<0076f23e>] io_uring_init+0x0/0xca
  [<0000211c>] do_one_initcall+0x6e/0x192
  [<004a72c2>] strcpy+0x0/0x1c
  [<0002cb62>] parse_args+0x0/0x1f2
  [<000020ae>] do_one_initcall+0x0/0x192
  [<0075c4e2>] kernel_init_freeable+0x1a0/0x1a4
  [<0076f23e>] io_uring_init+0x0/0xca
  [<004b911a>] kernel_init+0x0/0xec
  [<004b912e>] kernel_init+0x14/0xec
  [<004b911a>] kernel_init+0x0/0xec
  [<0000252c>] ret_from_kernel_thread+0xc/0x14

when trying to boot the m68k:q800 machine in qemu.

An added debug message in create_cache() shows the reason:

#### freeptr_offset=154 object_size=182 flags=0x310 aligned=0 sizeof(freeptr_t)=4

freeptr_offset would need to be 4-byte aligned but that is not the
case on m68k.

Why is ->work 2-byte aligned to begin with on m68k?!

My understanding is that m68k does not align pointers.

The minimum alignment for multi-byte integral values on m68k is
2 bytes.
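To see what a 2-byte alignment cap does to member offsets without an
m68k toolchain, here is a minimal userspace sketch. It assumes a
compiler that supports #pragma pack (GCC, Clang and MSVC do) and uses
the pragma only to imitate the m68k rule on a host machine. The struct
and function names are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host ABI layout: the 32-bit member gets the host's natural alignment. */
struct natural_layout {
	uint16_t tag;
	uint32_t word;
};

/*
 * Imitation of the m68k rule (assumption: #pragma pack(2) is a fair
 * stand-in): member alignment is capped at 2 bytes, so no padding is
 * inserted after the 16-bit member.
 */
#pragma pack(push, 2)
struct packed2_layout {
	uint16_t tag;
	uint32_t word;
};
#pragma pack(pop)

/* Offset of the 32-bit member under the host ABI. */
size_t natural_word_offset(void)
{
	return offsetof(struct natural_layout, word);
}

/* Offset of the 32-bit member under the 2-byte cap. */
size_t packed2_word_offset(void)
{
	return offsetof(struct packed2_layout, word);
}
```

Under the 2-byte cap the 32-bit member lands at offset 2, which is not
a multiple of its own size. The same kind of layout is how a
pointer-sized field can end up at an offset like 154.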

See also the comment at
https://elixir.bootlin.com/linux/v6.12/source/include/linux/maple_tree.h#L46

Maybe it's time we put m68k to bed? :-)

We can force ->work to be 4-byte aligned; that won't change anything
on anything remotely current. But it does feel pretty hacky to need to
align based on some ancient thing.

Why does freeptr_offset need to be 4-byte aligned?

Didn't check, but it's slab/slub complaining using a 2-byte aligned
address for the free pointer offset. It's explicitly checking:

        /* If a custom freelist pointer is requested make sure it's sane. */
        err = -EINVAL;
        if (args->use_freeptr_offset &&
            (args->freeptr_offset >= object_size ||
             !(flags & SLAB_TYPESAFE_BY_RCU) ||
             !IS_ALIGNED(args->freeptr_offset, sizeof(freeptr_t))))
                goto out;
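For illustration, the check quoted above can be restated as a small
userspace function and fed the values from the debug message
(freeptr_offset=154, object_size=182, sizeof(freeptr_t)=4). This is a
sketch, not the slab code: the function name and parameters are
hypothetical, and the SLAB_TYPESAFE_BY_RCU flag test is modelled as a
plain bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same idea as the kernel's IS_ALIGNED(); valid only for power-of-two a. */
static bool is_aligned(size_t x, size_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical userspace mirror of the sanity check: returns true when
 * the custom free pointer offset would be accepted.
 */
bool freeptr_offset_ok(size_t freeptr_offset, size_t object_size,
		       size_t freeptr_size, bool typesafe_by_rcu)
{
	if (freeptr_offset >= object_size)
		return false;	/* free pointer must lie inside the object */
	if (!typesafe_by_rcu)
		return false;	/* only allowed for SLAB_TYPESAFE_BY_RCU caches */
	return is_aligned(freeptr_offset, freeptr_size);
}
```

With the m68k values, freeptr_offset_ok(154, 182, 4, true) is false
because 154 is only 2-byte aligned, while an offset of 156 would pass.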

It is not guaranteed that alignof(freeptr_t) >= sizeof(freeptr_t)
(freeptr_t is sort of a long). If freeptr_offset must be a multiple of
4 or 8 bytes, the code that assigns it must make sure that is true.

Right, this is what the email is about...

I guess this is the code in fs/file_table.c:

    .freeptr_offset = offsetof(struct file, f_freeptr),

which references:

    include/linux/fs.h:           freeptr_t               f_freeptr;

I guess the simplest solution is to add an __aligned(sizeof(freeptr_t))
(or __aligned(sizeof(long))) to the definition of freeptr_t:

    include/linux/slab.h:typedef struct { unsigned long v; } freeptr_t;
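A rough userspace sketch of what such an attribute buys. Host compilers
already align unsigned long to its size, so this uses a hypothetical
stand-in type built from two 16-bit halves (size 4, natural alignment
2) to imitate the m68k situation. It assumes GCC or Clang, where the
kernel's __aligned(x) corresponds to __attribute__((aligned(x))). None
of these names exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 4-byte free pointer that is only 2-byte aligned. */
typedef struct { uint16_t half[2]; } loose_ptr_t;

/* Same layout, but with alignment forced up to the type's size. */
typedef struct { uint16_t half[2]; } __attribute__((aligned(4))) strict_ptr_t;

static_assert(sizeof(loose_ptr_t) == 4, "stand-in must be 4 bytes");
static_assert(sizeof(strict_ptr_t) == 4, "stand-in must be 4 bytes");

/* Objects that place the pointer right after a 16-bit member. */
struct loose_obj  { uint16_t tag; loose_ptr_t  p; };
struct strict_obj { uint16_t tag; strict_ptr_t p; };

/* Offset of the pointer member without forced alignment. */
size_t loose_ptr_offset(void)
{
	return offsetof(struct loose_obj, p);
}

/* Offset of the pointer member with forced alignment. */
size_t strict_ptr_offset(void)
{
	return offsetof(struct strict_obj, p);
}
```

The loose variant puts the member at offset 2, the strict one pads it
to offset 4, so any offsetof() of such a member is a multiple of its
size. The cost is possible padding in every structure that embeds the
type.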

It's not, it's struct io_kiocb->work, as per the stack trace in this
email.

Sorry, I was falling out of thin air into this thread...

linux-next/master:io_uring/io_uring.c:          .freeptr_offset = offsetof(struct io_kiocb, work),
linux-next/master:io_uring/io_uring.c:          .use_freeptr_offset = true,

Apparently io_kiocb.work is of type struct io_wq_work, not freeptr_t?
Isn't that a bit error-prone, as the slab core code expects a freeptr_t?

It just needs the space, should not matter otherwise. But may as well
just add the union and align the freeptr so it stops complaining on m68k.

Ala the below, perhaps alignment takes care of itself then?


diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 593c10a02144..a83ec7f7849d 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -674,7 +674,11 @@ struct io_kiocb {
 	struct io_kiocb			*link;
 	/* custom credentials, valid IFF REQ_F_CREDS is set */
 	const struct cred		*creds;
-	struct io_wq_work		work;
+
+	union {
+		struct io_wq_work	work;
+		freeptr_t		freeptr;
+	};
 
 	struct {
 		u64			extra1;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 73af59863300..86ac7df2a601 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3812,7 +3812,7 @@ static int __init io_uring_init(void)
 	struct kmem_cache_args kmem_args = {
 		.useroffset = offsetof(struct io_kiocb, cmd.data),
 		.usersize = sizeof_field(struct io_kiocb, cmd.data),
-		.freeptr_offset = offsetof(struct io_kiocb, work),
+		.freeptr_offset = offsetof(struct io_kiocb, freeptr),
 		.use_freeptr_offset = true,
 	};
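
On the question of whether alignment takes care of itself: a union
takes the strictest alignment of its members, and every member sits at
the union's own offset. The sketch below shows that with hypothetical
stand-in types (not the real io_kiocb or io_wq_work layouts) and
assumes a C11 compiler for anonymous unions and _Alignof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a payload whose members only need 2-byte alignment. */
struct demo_work {
	uint16_t flags;
	uint16_t pad[3];
};

/* Same shape as the typedef quoted from include/linux/slab.h. */
typedef struct { unsigned long v; } demo_freeptr_t;

/* Stand-in request: a small member followed by the overlapping union. */
struct demo_req {
	uint16_t tag;
	union {
		struct demo_work work;
		demo_freeptr_t freeptr;
	};
};

/* Both union members start at the same offset inside the request. */
static_assert(offsetof(struct demo_req, freeptr) ==
	      offsetof(struct demo_req, work),
	      "union members share an offset");

/* Offset that would be handed to the cache as .freeptr_offset. */
size_t demo_freeptr_offset(void)
{
	return offsetof(struct demo_req, freeptr);
}

/* Offset of the payload member sharing the union. */
size_t demo_work_offset(void)
{
	return offsetof(struct demo_req, work);
}
```

The resulting offset is guaranteed to be a multiple of
alignof(demo_freeptr_t) only. Whether it is also a multiple of
sizeof(freeptr_t), which is what the slab check tests, depends on the
target's alignment rules for unsigned long.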
 

-- 
Jens Axboe



