Re: Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

On Tue, Jun 15, 2021 at 12:32 PM Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:

I had to update ret_from_kernel_thread to pop that state to get Linus's
change to boot.  Apparently kernel_threads exiting needs to be handled.

You are very right.

That, btw, seems to be a horrible design mistake, but I think it's how
"kernel_execve()" works - both for the initial "init", but also for
user-mode helper processes.

Both of those cases do "kernel_thread()" to create a new thread, and
then that new kernel thread does kernel_execve() to create the user
space image for that thread. And that act of "execve()" clears
PF_KTHREAD from the thread, and then the final return from the kernel
thread function returns to that new user space.

Or something like that. It's been ages since I looked at that code,
and your patch initially confused the heck out of me because I went
"that can't _possibly_ be needed".

But yes, I think your patch is right.

And I think our horrible "kernel threads return to user space when
done" is absolutely horrifically nasty. Maybe of the clever sort, but
mostly of the historical horror sort.

Or am I misremembering how this ends up working? Did you look at
exactly what it was that returned from kernel threads?

This might be worth commenting on somewhere. But your patch for alpha
looks correct to me. Did you have some test-case to verify ptrace() on
io worker threads?

At this point I have just booted an alpha image on qemu-system-alpha.

I do have gdb in my kernel image so I can test that.  I haven't yet but
I can and should.

Sleeping on it I came up with a plan to add TF_SWITCH_STACK_SAVED to
indicate if the registers have been saved.  The DO_SWITCH_STACK and
UNDO_SWITCH_STACK helpers (except in alpha_switch_to) can test that.
The ptrace helpers can test that and turn an access of random kernel
stack contents into something well behaved that does WARN_ON_ONCE
because we should not get there.
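
A rough userspace model of that idea, not actual alpha kernel code.
The flag value, the struct layout and the function names below are
all made up for illustration, and here the save/restore helpers only
set and clear the flag. The point is just that a ptrace-style read
checks the flag first and returns a fixed value with a one-time
warning instead of reading whatever happens to be on the stack.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bit; the real bit and where it lives are undecided. */
#define TF_SWITCH_STACK_SAVED 0x1u

/* Toy stand-in for the callee-saved registers kept in switch_stack. */
struct switch_stack_model {
	unsigned long r9;
	unsigned long r10;
};

/* Toy stand-in for the per-thread state. */
struct thread_model {
	unsigned int flags;
	struct switch_stack_model saved;
};

/* Counts warnings so the one-time behavior can be observed. */
static int warn_count;

/* Models DO_SWITCH_STACK: save the registers and record that we did. */
static void do_switch_stack(struct thread_model *t,
			    unsigned long r9, unsigned long r10)
{
	t->saved.r9 = r9;
	t->saved.r10 = r10;
	t->flags |= TF_SWITCH_STACK_SAVED;
}

/* Models UNDO_SWITCH_STACK: the saved copy is no longer valid. */
static void undo_switch_stack(struct thread_model *t)
{
	t->flags &= ~TF_SWITCH_STACK_SAVED;
}

/*
 * Models a ptrace register read. If nothing was saved, warn once
 * (in the spirit of WARN_ON_ONCE) and return 0 rather than stale data.
 */
static unsigned long ptrace_read_r9(struct thread_model *t)
{
	if (!(t->flags & TF_SWITCH_STACK_SAVED)) {
		static bool warned;

		if (!warned) {
			warned = true;
			warn_count++;
			fprintf(stderr, "read of unsaved switch_stack\n");
		}
		return 0;
	}
	return t->saved.r9;
}
```

With this shape a read before any save returns 0 and warns a single
time, a read after do_switch_stack() returns the saved value, and a
read after undo_switch_stack() goes back to returning 0.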

I suspect adding TF_SWITCH_STACK_SAVED should come first so it
is easy to verify the problem behavior, before I fix it.

My real goal is to find a pattern that architectures whose register
saves are structured like alpha's can emulate, to minimize problems in
the future.

Plus I would really like to get the last handful of architectures
updated so we can remove CONFIG_HAVE_ARCH_TRACEHOOK.  I think we can
do that on alpha because we save all of the system call arguments
in pt_regs and that is all the other non-ptrace code paths care about.

AKA I am trying to move the old architectures forward so we can get rid
of unnecessary complications in the core code.

Eric


