Arch maintainers Ahoy! (was Re: x86: faster strncpy_from_user())

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Mon, 21 May 2012 09:50:34 -0700

Ok, several weeks ago I sent out an email about the x86
strncpy_from_user() patch I did, and cc'd linux-arch because the
second stage might well affect other architectures.

Nobody reacted to it, and today that second stage was merged into
mainline, so I thought I'd re-do the notice. The relevant part of the
original explanation is appended, but I'll explain once more..

On x86, we *used* to be very lazy about strncpy_from_user(), and not
actually do the full proper error handling: the code was (for
historical reasons) using inline asms where it was very inconvenient
to actually check the proper end of the address space. And nobody had
really ever bothered to fix it up. As a result, the main user
(getname()) had its own special magic workarounds for this.

Until 3.4. We now do strncpy_from_user() properly, with all the
appropriate error cases of hitting the end of the user address space
etc handled.

And now that the 3.5 merge window is open, some of the first merging I
did was my own branches that had been pending. Including the one that
removes all the hackery from getname(), and expects
strncpy_from_user() to just work right.

If any architecture copied the lazy x86 strncpy_from_user(), they
really do need to fix themselves now. You can actually take the new
all-C x86 implementation, and just strip away the word-at-a-time
optimizations, and it should be pretty generic (if perhaps not
optimal).

The thing to really look out for is when the top of the user address
space abuts the kernel address space. If you *only* rely on page
faults, the user can read kernel addresses by mapping the very last
page and then putting a non-terminated string at the end of that page,
and trying to open a file using that name. So you need to either have
a guard page (guaranteeing a fault), or you need to take TASK_SIZE or
similar into account. Because getname() no longer does the TASK_SIZE
games for you.

                       Linus

On Fri, Apr 6, 2012 at 2:32 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> but what is perhaps more interesting (and the reason linux-arch is
> cc'd) is that not only is it noticeably faster, it should also
> actually get all the error cases right.
>
> What do I mean by that?
>
> I mean that in my tree (but not committed), I have actually removed
> this nasty piece of code from fs/namei.c:
>
>  diff --git a/fs/namei.c b/fs/namei.c
>  index 0062dd17eb55..2b7d691523bb 100644
>  --- a/fs/namei.c
>  +++ b/fs/namei.c
>  @@ -119,14 +119,7 @@
>   static int do_getname(const char __user *filename, char *page)
>   {
>          int retval;
>  -       unsigned long len = PATH_MAX;
>  -
>  -       if (!segment_eq(get_fs(), KERNEL_DS)) {
>  -               if ((unsigned long) filename >= TASK_SIZE)
>  -                       return -EFAULT;
>  -               if (TASK_SIZE - (unsigned long) filename < PATH_MAX)
>  -                       len = TASK_SIZE - (unsigned long) filename;
>  -       }
>  +       const unsigned long len = PATH_MAX;
>
>          retval = strncpy_from_user(page, filename, len);
>          if (retval > 0) {
>
> which worked around the fact that strncpy_from_user() didn't really
> get the end conditions quite right, and it really just did an
> access_ok() on the first byte. The new x86 implementation should get
> all of that right without any need for any ugly hacks.
>
> Now, I haven't committed that change to do_getname(), but it *will*
> get committed at some point.
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html