Re: copying from/to user question

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Mon, 09 Sep 2024 16:31:30 +0200

On Mon, Sep 09 2024 at 12:14, Arnd Bergmann wrote:
> On Mon, Sep 9, 2024, at 09:18, Christian Brauner wrote:
>> On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote:
>>> 
>>> This is another round of Christian's asking sus questions about kernel
>>> apis. I asked them a few people and generally the answers I got was
>>> "Good question, I don't know." or the reasoning varied a lot. So I take
>>> it I'm not the only one with that question.
>>> 
>>> I was looking at a potential epoll() bug and it got me thinking about
>>> dos & don'ts for put_user()/copy_from_user() and related helpers as
>>> epoll does acquire the epoll mutex and then goes on to loop over a list
>>> of ready items and calls __put_user() for each item. Granted, it only
>>> puts a __u64 and an integer but still that seems adventurous to me and I
>>> wondered why.
>>> 
>>> Generally, new vfs apis always try hard to call helpers that copy to or
>>> from userspace without any locks held as my understanding has been that
>>> this is best practice as to avoid risking taking page faults while
>>> holding a mutex or semaphore even though that's supposedly safe.
>>> 
>>> Is this understanding correct? And aside from best practice is it in
>>> principle safe to copy to or from userspace with sleeping locks held?
>
> I would be very suspicious if it's an actual __put_user() rather
> than the normal put_user() since at least on x86 that skips the
> __might_fault() instrumentation.

epoll_put_uevent() uses __put_user(). __put_user() does neither have
might_fault() nor does it check the destination pointer. It's documented
that the caller needs to have validated it via access_ok(), which
happens in do_epoll_wait().

> With the normal put_user() at least I would expect the
> might_lock_read(&current->mm->mmap_lock) instrumentation
> in __might_fault() to cause a lockdep splat if you are holding
> a mutex that is also required during a page fault, which
> in turn would deadlock if your __user pointer is paged out.

Right. But an actual page fault would still trip over that if there is a
lock dependency chain because pagefaults are enabled.

Coming back to your general question.

It is generally safe to fault with a sleeping lock held when there is no
invers lock chain vs. mmap_lock.

Whether it's a good idea is a different question, which depends on the
context of what the mutex is protecting and what consequences result in
holding it for a extended period of time, e.g. due to a swapped out
page.

Thanks,

        tglx