On Tue, May 4, 2021 at 4:39 AM Stefan Metzmacher <metze@xxxxxxxxx> wrote:
>
> I'm currently testing this (moving things to the end and resetting ->ip = 0 too)

This part is not right (or at least very questionable):

> +       if (!ret && unlikely(p->flags & PF_IO_WORKER)) {

That testing "ret" is misleading, in my opinion. If PF_IO_WORKER is set,
there is no way we wouldn't want to do the kthread_frame_init().

Now, ret can never be non-zero, because PF_IO_WORKER will never have
CLONE_SETTLS set, so this is kind of moot, but it does mean that the
test for 'ret' is just pointless, and makes the code look like it
would care.

For similar reasons, we probably don't want to go down to the whole
io_bitmap_share() case - the IO bitmap only makes sense in user space,
so going through all that code is pointless, but also would make
people think it might be relevant (and we _would_ copy the io bitmap
pointer and increment the ref if the real user thread had one, so we'd
do all that pointless stuff that doesn't actually matter).

So don't move that code down. It's best done right after the register
initialization. Moving it down to below the setting of 'gs' for the
32-bit case is ok, though. I think my original patch had it above it,
but technically it makes sense to do it below - that's when all the
register state really is initialized.

As to:

> +               childregs->ip = 0;
> [..]
> which means the output looks like this:
>
> (gdb) info threads
>   Id   Target Id                  Frame
> * 1    LWP 4863 "io_uring-cp-for" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>   2    LWP 4864 "iou-mgr-4863"    0x0000000000000000 in ?? ()
>   3    LWP 4865 "iou-wrk-4863"    0x0000000000000000 in ?? ()
> (gdb) thread 3
> [Switching to thread 3 (LWP 4865)]
> #0  0x0000000000000000 in ?? ()
> (gdb) bt
> #0  0x0000000000000000 in ?? ()
> Backtrace stopped: Cannot access memory at address 0x0

Yeah, that's probably sensible.
I'm not sure it's a bad idea to show the IO thread as being in the
original system call - that makes perfect sense to me too, but I guess
it could go either way.

So I don't think it's wrong to clear the user space ->ip.

> What do you think? Should I post that as v2 if my final testing doesn't find any problem?

Yes, please, with the above "move the IO thread return up a bit"
comment, please do post a tested version with some nice commit log,
and we can close this issue. It even looks like gdb will be cleaned up
too. Yay.

But I think having that separate test for PF_IO_WORKER is a good idea
regardless, since it just makes clear that an IO worker isn't the same
thing as a kernel thread in that code.

             Linus
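
For illustration only, here is a rough userspace model of the ordering
argued for in this mail. It is not the kernel patch: every model_*
name, the MODEL_* flag values and the struct layouts are made up for
the sketch, and the TLS helper is assumed to always succeed. It only
shows the control flow: the IO worker test comes right after the
register setup, does not look at 'ret', and returns before the TLS and
IO bitmap handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define MODEL_PF_KTHREAD    0x1u
#define MODEL_PF_IO_WORKER  0x2u
#define MODEL_CLONE_SETTLS  0x4u

/* Hypothetical, heavily reduced register set. */
struct model_regs {
	unsigned long ip;
	unsigned long ax;
};

/* Hypothetical task state, just enough to observe which path ran. */
struct model_task {
	unsigned int flags;
	struct model_regs regs;
	bool has_io_bitmap;      /* task owns an IO bitmap */
	bool kthread_frame_set;  /* models kthread_frame_init() having run */
	bool io_bitmap_shared;   /* models io_bitmap_share() having run */
};

/* Hypothetical stand-in for the TLS setup; always succeeds in this model. */
static int model_set_new_tls(struct model_task *child)
{
	(void)child;
	return 0;
}

static int model_copy_thread(const struct model_task *parent,
			     struct model_task *child,
			     unsigned int task_flags,
			     unsigned int clone_flags)
{
	int ret = 0;

	assert(parent != NULL && child != NULL);

	child->flags = task_flags;
	child->has_io_bitmap = false;
	child->kthread_frame_set = false;
	child->io_bitmap_shared = false;

	/* Pure kernel thread: no user register state at all. */
	if (task_flags & MODEL_PF_KTHREAD) {
		child->regs = (struct model_regs){ 0 };
		child->kthread_frame_set = true;
		return 0;
	}

	/* User-thread path: inherit the parent's registers, child sees 0. */
	child->regs = parent->regs;
	child->regs.ax = 0;
	/* (The 32-bit 'gs' setup would sit here, before the IO worker test.) */

	/* IO worker: tested on its own, with no dependence on 'ret'. */
	if (task_flags & MODEL_PF_IO_WORKER) {
		child->regs.ip = 0;  /* clear the user space ip */
		child->kthread_frame_set = true;
		return 0;            /* never reaches the IO bitmap code */
	}

	if (clone_flags & MODEL_CLONE_SETTLS)
		ret = model_set_new_tls(child);

	/* Only ordinary user threads share the parent's IO bitmap. */
	if (!ret && parent->has_io_bitmap) {
		child->has_io_bitmap = true;
		child->io_bitmap_shared = true;
	}
	return ret;
}
```

In this model, a child created with MODEL_PF_IO_WORKER ends up with
ip == 0 and a kthread-style frame, and never touches the IO bitmap even
when the parent has one. A plain user thread keeps the parent's ip and
does share the bitmap.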