On Fri, May 03, 2024 at 01:35:09PM -0600, Jens Axboe wrote: > On 5/3/24 1:22 PM, Kees Cook wrote: > > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: > >> On 5/3/24 12:26 PM, Kees Cook wrote: > >>> Thanks for doing this analysis! I suspect at least a start of a fix > >>> would be this: > >>> > >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > >>> index 8fe5aa67b167..15e8f74ee0f2 100644 > >>> --- a/drivers/dma-buf/dma-buf.c > >>> +++ b/drivers/dma-buf/dma-buf.c > >>> @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) > >>> > >>> if (events & EPOLLOUT) { > >>> /* Paired with fput in dma_buf_poll_cb */ > >>> - get_file(dmabuf->file); > >>> - > >>> - if (!dma_buf_poll_add_cb(resv, true, dcb)) > >>> + if (!atomic_long_inc_not_zero(&dmabuf->file) && > >>> + !dma_buf_poll_add_cb(resv, true, dcb)) > >>> /* No callback queued, wake up any other waiters */ > >> > >> Don't think this is sane at all. I'm assuming you meant: > >> > >> atomic_long_inc_not_zero(&dmabuf->file->f_count); > > > > Oops, yes, sorry. I was typed from memory instead of copy/paste. > > Figured :-) > > >> but won't fly as you're not under RCU in the first place. And what > >> protects it from being long gone before you attempt this anyway? This is > >> sane way to attempt to fix it, it's completely opposite of what sane ref > >> handling should look like. > >> > >> Not sure what the best fix is here, seems like dma-buf should hold an > >> actual reference to the file upfront rather than just stash a pointer > >> and then later _hope_ that it can just grab a reference. That seems > >> pretty horrible, and the real source of the issue. > > > > AFAICT, epoll just doesn't hold any references at all. It depends, > > I think, on eventpoll_release() (really eventpoll_release_file()) > > synchronizing with epoll_wait() (but I don't see how this happens, and > > the race seems to be against ep_item_poll() ...?) > > > > I'm really confused about how eventpoll manages the lifetime of polled > > fds. > > epoll doesn't hold any references, and it's got some ugly callback to > deal with that. It's not ideal, nor pretty, but that's how it currently > works. See eventpoll_release() and how it's called. This means that > epoll itself is supposedly safe from the file going away, even though it > doesn't hold a reference to it. Right -- what remains unclear to me is how struct file lifetime is expected to work in the struct file_operations::poll callbacks. Because using get_file() there looks clearly unsafe... > Except that in this case, the file is already gone by the time > eventpoll_release() is called. Which presumably is some interaction with > the somewhat suspicious file reference management that dma-buf is doing. > But I didn't look into that much, outside of noting it looks a bit > suspect. Not yet, though. Here's (one) race state from the analysis. I added lines for the dma_fence_add_callback()/dma_buf_poll_cb() case, since that's the case that would escape any eventpoll_release/epoll_wait synchronization (if it exists?): close(dmabuf->file) __fput_sync (f_count == 1, last ref) f_count-- (f_count == 0 now) __fput epoll_wait vfs_poll(dmabuf->file) get_file(dmabuf->file)(f_count == 1) dma_fence_add_callback() eventpoll_release dmabuf->file deallocation dma_buf_poll_cb() fput(dmabuf->file) (f_count == 1) f_count-- dmabuf->file deallocation Without fences to create a background callback, we just do a double-free: close(dmabuf->file) __fput_sync (f_count == 1, last ref) f_count-- (f_count == 0 now) __fput epoll_wait vfs_poll(dmabuf->file) get_file(dmabuf->file)(f_count == 1) dma_buf_poll_cb() fput(dmabuf->file) (f_count == 1) f_count-- eventpoll_release dmabuf->file deallocation eventpoll_release dmabuf->file deallocation get_file(), via epoll_wait()->vfs_poll()->dma_buf_poll(), has raised f_count again. Then eventpoll_release() is doing things to remove dmabuf->file from the eventpoll lists, but I *think* this is synchronized so that an epoll_wait() will only call .poll handlers with a valid (though possibly f_count==0) file, but I can't figure out where that happens. (If it's not happening, we have a much bigger problem, but I imagine we'd see massive corruption all the time, which we don't.) So, yeah, I can't figure out how eventpoll_release() and epoll_wait() are expected to behave safely for .poll handlers. Regardless, for the simple case: it seems like it's just totally illegal to use get_file() in a poll handler. Is this known/expected? And if so, how can dmabuf possibly deal with that? -- Kees Cook