Re: [PATCH -next] fs: Fix memory leaks in do_renameat2() error paths

Jens Axboe <axboe@xxxxxxxxx> · Mon, 2 Nov 2020 14:39:32 -0700

On 11/2/20 1:31 PM, Jens Axboe wrote:
> On 11/2/20 1:12 PM, Eric W. Biederman wrote:
>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>
>>> On 11/2/20 12:27 PM, Eric W. Biederman wrote:
>>>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>>>
>>>>> On 10/30/20 4:22 PM, Al Viro wrote:
>>>>>> On Fri, Oct 30, 2020 at 02:33:11PM -0600, Jens Axboe wrote:
>>>>>>> On 10/30/20 12:49 PM, Al Viro wrote:
>>>>>>>> On Fri, Oct 30, 2020 at 12:46:26PM -0600, Jens Axboe wrote:
>>>>>>>>
>>>>>>>>> See other reply, it's being posted soon, just haven't gotten there yet
>>>>>>>>> and it wasn't ready.
>>>>>>>>>
>>>>>>>>> It's a prep patch so we can call do_renameat2 and pass in a filename
>>>>>>>>> instead. The intent is not to have any functional changes in that prep
>>>>>>>>> patch. But once we can pass in filenames instead of user pointers, it's
>>>>>>>>> usable from io_uring.
>>>>>>>>
>>>>>>>> You do realize that pathname resolution is *NOT* offloadable to helper
>>>>>>>> threads, I hope...
>>>>>>>
>>>>>>> How so? If we have all the necessary context assigned, what's preventing
>>>>>>> it from working?
>>>>>>
>>>>>> Semantics of /proc/self/..., for starters (and things like /proc/mounts, etc.
>>>>>> *do* pass through that, /dev/stdin included)
>>>>>
>>>>> Don't we just need ->thread_pid for that to work?
>>>>
>>>> No.  You need ->signal.
>>>>
>>>> You need ->signal->pids[PIDTYPE_TGID].  It is only for /proc/thread-self
>>>> that ->thread_pid is needed.
>>>>
>>>> Even more so than ->thread_pid, it is a kernel invariant that ->signal
>>>> does not change.
>>>
>>> I don't care about the pid itself, my suggestion was to assign ->thread_pid
>>> over the lookup operation to ensure that /proc/self/ worked the way that
>>> you'd expect.
>>
>> I understand that.
>>
>> However /proc/self/ refers to the current process not to the current
>> thread.  So ->thread_pid is not what you need to assign to make that
>> happen.  What the code looks at is: ->signal->pids[PIDTYPE_TGID].
>>
>> It will definitely break invariants to assign to ->signal.
>>
>> Currently only exchange_tids assigns ->thread_pid and it is nasty.  It
>> results in code that potentially results in infinite loops in
>> kernel/signal.c
>>
>> To my knowledge nothing assigns ->signal->pids[PIDTYPE_TGID].  At best
>> it might work but I expect the it would completely confuse something in
>> the pid to task or pid to process mappings.  Which is to say even if it
>> does work it would be an extremely fragile solution.
> 
> Thanks Eric, that's useful. Sounds to me like we're better off, at least
> for now, to just expressly forbid async lookup of /proc/self/. Which
> isn't really the end of the world as far as I'm concerned.

Alternatively, we just teach task_pid_ptr() where to look for an
alternate, if current->flags & PF_IO_WORKER is true. Then we don't have
to assign anything that's visible in task_struct, and in fact the async
worker can retain this stuff on the stack. As all requests are killed
before a task is allowed to exit, that should be safe.

diff --git a/kernel/pid.c b/kernel/pid.c
index 74ddbff1a6ba..5fd421a4864c 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -42,6 +42,7 @@
 #include <linux/sched/signal.h>
 #include <linux/sched/task.h>
 #include <linux/idr.h>
+#include <linux/io_uring.h>
 #include <net/sock.h>
 #include <uapi/linux/pidfd.h>
 
@@ -320,6 +321,12 @@ EXPORT_SYMBOL_GPL(find_vpid);
 
 static struct pid **task_pid_ptr(struct task_struct *task, enum pid_type type)
 {
+	if ((task->flags & PF_IO_WORKER) && task->io_uring) {
+		return (type == PIDTYPE_PID) ?
+			&task->io_uring->thread_pid :
+			&task->io_uring->pids[type];
+	}
+
 	return (type == PIDTYPE_PID) ?
 		&task->thread_pid :
 		&task->signal->pids[type];

-- 
Jens Axboe