On 10/18/19 9:00 AM, Jens Axboe wrote: > On 10/18/19 8:52 AM, Jann Horn wrote: >> On Fri, Oct 18, 2019 at 4:43 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>> >>> On 10/18/19 8:40 AM, Jann Horn wrote: >>>> On Fri, Oct 18, 2019 at 4:37 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>> >>>>> On 10/18/19 8:34 AM, Jann Horn wrote: >>>>>> On Fri, Oct 18, 2019 at 4:01 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>>> On 10/17/19 8:41 PM, Jann Horn wrote: >>>>>>>> On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>>>>> This is in preparation for adding opcodes that need to modify files >>>>>>>>> in a process file table, either adding new ones or closing old ones. >>>>>> [...] >>>>>>> Updated patch1: >>>>>>> >>>>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=df6caac708dae8ee9a74c9016e479b02ad78d436 >>>>>> >>>>>> I don't understand what you're doing with old_files in there. In the >>>>>> "s->files && !old_files" branch, "current->files = s->files" happens >>>>>> without holding task_lock(), but current->files and s->files are also >>>>>> the same already at that point anyway. And what's the intent behind >>>>>> assigning stuff to old_files inside the loop? Isn't that going to >>>>>> cause the workqueue to keep a modified current->files beyond the >>>>>> runtime of the work? >>>>> >>>>> I simply forgot to remove the old block, it should only have this one: >>>>> >>>>> if (s->files && s->files != cur_files) { >>>>> task_lock(current); >>>>> current->files = s->files; >>>>> task_unlock(current); >>>>> if (cur_files) >>>>> put_files_struct(cur_files); >>>>> cur_files = s->files; >>>>> } >>>> >>>> Don't you still need a put_files_struct() in the case where "s->files >>>> == cur_files"? >>> >>> I want to hold on to the files for as long as I can, to avoid unnecessary >>> shuffling of it. But I take it your worry here is that we'll be calling >>> something that manipulates ->files? Nothing should do that, unless >>> s->files is set. We didn't hide the workqueue ->files[] before this >>> change either. >> >> No, my worry is that the refcount of the files_struct is left too >> high. From what I can tell, the "do" loop in io_sq_wq_submit_work() >> iterates over multiple instances of struct sqe_submit. If there are >> two sqe_submit instances with the same ->files (each holding a >> reference from the get_files_struct() in __io_queue_sqe()), then: >> >> When processing the first sqe_submit instance, current->files and >> cur_files are set to $user_files. >> When processing the second sqe_submit instance, nothing happens >> (s->files == cur_files). >> After the loop, at the end of the function, put_files_struct() is >> called once on $user_files. >> >> So get_files_struct() has been called twice, but put_files_struct() >> has only been called once. That leaves the refcount too high, and by >> repeating this, an attacker can make the refcount wrap around and then >> cause a use-after-free. > > Ah now I see what you are getting at, yes that's clearly a bug! I wonder > how we best safely can batch the drops. We can track the number of times > we've used the same files, and do atomic_sub_and_test() in a > put_files_struct_many() type addition. But that would leave us open to > the issue you describe, where someone could maliciously overflow the > files ref count. > > Probably not worth over-optimizing, as long as we can avoid the > current->files task lock/unlock and shuffle. > > I'll update the patch. Alright, this incremental on top should do it. And full updated patch here: http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=40449c5a3d3b16796fa13e9469c69d62986e961c Let me know what you think. diff --git a/fs/io_uring.c b/fs/io_uring.c index 68eda36bcc9c..2fed0badad38 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2289,13 +2289,13 @@ static void io_sq_wq_submit_work(struct work_struct *work) set_fs(USER_DS); } } - if (s->files && s->files != cur_files) { + if (cur_files) + put_files_struct(cur_files); + cur_files = s->files; + if (cur_files && cur_files != current->files) { task_lock(current); - current->files = s->files; + current->files = cur_files; task_unlock(current); - if (cur_files) - put_files_struct(cur_files); - cur_files = s->files; } if (!ret) { @@ -2393,8 +2393,9 @@ static void io_sq_wq_submit_work(struct work_struct *work) task_lock(current); current->files = old_files; task_unlock(current); - put_files_struct(cur_files); } + if (cur_files) + put_files_struct(cur_files); } /* -- Jens Axboe