On 5/3/24 1:22 PM, Kees Cook wrote: > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote: >> On 5/3/24 12:26 PM, Kees Cook wrote: >>> Thanks for doing this analysis! I suspect at least a start of a fix >>> would be this: >>> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>> index 8fe5aa67b167..15e8f74ee0f2 100644 >>> --- a/drivers/dma-buf/dma-buf.c >>> +++ b/drivers/dma-buf/dma-buf.c >>> @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) >>> >>> if (events & EPOLLOUT) { >>> /* Paired with fput in dma_buf_poll_cb */ >>> - get_file(dmabuf->file); >>> - >>> - if (!dma_buf_poll_add_cb(resv, true, dcb)) >>> + if (!atomic_long_inc_not_zero(&dmabuf->file) && >>> + !dma_buf_poll_add_cb(resv, true, dcb)) >>> /* No callback queued, wake up any other waiters */ >> >> Don't think this is sane at all. I'm assuming you meant: >> >> atomic_long_inc_not_zero(&dmabuf->file->f_count); > > Oops, yes, sorry. I was typed from memory instead of copy/paste. Figured :-) >> but won't fly as you're not under RCU in the first place. And what >> protects it from being long gone before you attempt this anyway? This is >> sane way to attempt to fix it, it's completely opposite of what sane ref >> handling should look like. >> >> Not sure what the best fix is here, seems like dma-buf should hold an >> actual reference to the file upfront rather than just stash a pointer >> and then later _hope_ that it can just grab a reference. That seems >> pretty horrible, and the real source of the issue. > > AFAICT, epoll just doesn't hold any references at all. It depends, > I think, on eventpoll_release() (really eventpoll_release_file()) > synchronizing with epoll_wait() (but I don't see how this happens, and > the race seems to be against ep_item_poll() ...?) > > I'm really confused about how eventpoll manages the lifetime of polled > fds. epoll doesn't hold any references, and it's got some ugly callback to deal with that. It's not ideal, nor pretty, but that's how it currently works. See eventpoll_release() and how it's called. This means that epoll itself is supposedly safe from the file going away, even though it doesn't hold a reference to it. Except that in this case, the file is already gone by the time eventpoll_release() is called. Which presumably is some interaction with the somewhat suspicious file reference management that dma-buf is doing. But I didn't look into that much, outside of noting it looks a bit suspect. >>> Due to this issue I've proposed fixing get_file() to detect pathological states: >>> https://lore.kernel.org/lkml/20240502222252.work.690-kees@xxxxxxxxxx/ >> >> I don't think this would catch this case, as the memory could just be >> garbage at this point. > > It catches it just fine! :) I tested it against the published PoC. Sure it _may_ catch the issue, but the memory may also just be garbage at that point. Not saying it's a useless addition, outside of the usual gripes of making the hot path slower, just that it won't catch all cases. Which I guess is fine and expected. > And for cases where further allocations have progressed far enough to > corrupt the freed struct file and render the check pointless, nothing > different has happened than what happens today. At least now we have a > window to catch the situation across the time frame before it is both > reallocated _and_ the contents at the f_count offset gets changed to > non-zero. Right. -- Jens Axboe