Re: [PATCH 1/3] io_uring: add support for async work inheriting files table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 18, 2019 at 5:55 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> On 10/18/19 9:00 AM, Jens Axboe wrote:
> > On 10/18/19 8:52 AM, Jann Horn wrote:
> >> On Fri, Oct 18, 2019 at 4:43 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>>
> >>> On 10/18/19 8:40 AM, Jann Horn wrote:
> >>>> On Fri, Oct 18, 2019 at 4:37 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On 10/18/19 8:34 AM, Jann Horn wrote:
> >>>>>> On Fri, Oct 18, 2019 at 4:01 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>>>>>> On 10/17/19 8:41 PM, Jann Horn wrote:
> >>>>>>>> On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>>>>>>>> This is in preparation for adding opcodes that need to modify files
> >>>>>>>>> in a process file table, either adding new ones or closing old ones.
> >>>>>> [...]
> >>>>>>> Updated patch1:
> >>>>>>>
> >>>>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=df6caac708dae8ee9a74c9016e479b02ad78d436
> >>>>>>
> >>>>>> I don't understand what you're doing with old_files in there. In the
> >>>>>> "s->files && !old_files" branch, "current->files = s->files" happens
> >>>>>> without holding task_lock(), but current->files and s->files are also
> >>>>>> the same already at that point anyway. And what's the intent behind
> >>>>>> assigning stuff to old_files inside the loop? Isn't that going to
> >>>>>> cause the workqueue to keep a modified current->files beyond the
> >>>>>> runtime of the work?
> >>>>>
> >>>>> I simply forgot to remove the old block, it should only have this one:
> >>>>>
> >>>>> if (s->files && s->files != cur_files) {
> >>>>>            task_lock(current);
> >>>>>            current->files = s->files;
> >>>>>            task_unlock(current);
> >>>>>            if (cur_files)
> >>>>>                    put_files_struct(cur_files);
> >>>>>            cur_files = s->files;
> >>>>> }
> >>>>
> >>>> Don't you still need a put_files_struct() in the case where "s->files
> >>>> == cur_files"?
> >>>
> >>> I want to hold on to the files for as long as I can, to avoid unnecessary
> >>> shuffling of it. But I take it your worry here is that we'll be calling
> >>> something that manipulates ->files? Nothing should do that, unless
> >>> s->files is set. We didn't hide the workqueue ->files[] before this
> >>> change either.
> >>
> >> No, my worry is that the refcount of the files_struct is left too
> >> high. From what I can tell, the "do" loop in io_sq_wq_submit_work()
> >> iterates over multiple instances of struct sqe_submit. If there are
> >> two sqe_submit instances with the same ->files (each holding a
> >> reference from the get_files_struct() in __io_queue_sqe()), then:
> >>
> >> When processing the first sqe_submit instance, current->files and
> >> cur_files are set to $user_files.
> >> When processing the second sqe_submit instance, nothing happens
> >> (s->files == cur_files).
> >> After the loop, at the end of the function, put_files_struct() is
> >> called once on $user_files.
> >>
> >> So get_files_struct() has been called twice, but put_files_struct()
> >> has only been called once. That leaves the refcount too high, and by
> >> repeating this, an attacker can make the refcount wrap around and then
> >> cause a use-after-free.
> >
> > Ah now I see what you are getting at, yes that's clearly a bug! I wonder
> > how we best safely can batch the drops. We can track the number of times
> > we've used the same files, and do atomic_sub_and_test() in a
> > put_files_struct_many() type addition. But that would leave us open to
> > the issue you describe, where someone could maliciously overflow the
> > files ref count.
> >
> > Probably not worth over-optimizing, as long as we can avoid the
> > current->files task lock/unlock and shuffle.
> >
> > I'll update the patch.
>
> Alright, this incremental on top should do it. And full updated patch
> here:
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=40449c5a3d3b16796fa13e9469c69d62986e961c
>
> Let me know what you think.

Ignoring the locking elision, basically the logic is now this:

static void io_sq_wq_submit_work(struct work_struct *work)
{
        struct io_kiocb *req = container_of(work, struct io_kiocb, work);
        struct files_struct *cur_files = NULL, *old_files;
        [...]
        old_files = current->files;
        [...]
        do {
                struct sqe_submit *s = &req->submit;
                [...]
                if (cur_files)
                        /* drop cur_files reference; borrow lifetime must
                         * end before here */
                        put_files_struct(cur_files);
                /* move reference ownership to cur_files */
                cur_files = s->files;
                if (cur_files) {
                        task_lock(current);
                        /* current->files borrows reference from cur_files;
                         * existing borrow from previous loop ends here */
                        current->files = cur_files;
                        task_unlock(current);
                }

                [call __io_submit_sqe()]
                [...]
        } while (req);
        [...]
        /* existing borrow ends here */
        task_lock(current);
        current->files = old_files;
        task_unlock(current);
        if (cur_files)
                /* drop cur_files reference; borrow lifetime must
                 * end before here */
                put_files_struct(cur_files);
}

If you run two iterations of this loop, with a first element that has
a ->files pointer and a second element that doesn't, then in the
second run through the loop, the reference to the files_struct will be
dropped while current->files still points to it; current->files is
only reset after the loop has ended. If someone accesses
current->files through procfs directly after that, AFAICS you'd get a
use-after-free.



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux