Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

On Fri, May 03, 2024 at 02:33:37PM -0700, Linus Torvalds wrote:
> On Fri, 3 May 2024 at 14:24, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Can we get to ep_item_poll(epi, ...) after eventpoll_release_file()
> > got past __ep_remove()?  Because if we can, we have a worse problem -
> > epi freed under us.
> 
> Look at the hack in __ep_remove(): if it is concurrent with
> eventpoll_release_file(), it will hit this code
> 
>         spin_lock(&file->f_lock);
>         if (epi->dying && !force) {
>                 spin_unlock(&file->f_lock);
>                 return false;
>         }
> 
> and not free the epi.
> 
> But as far as I can tell, almost nothing else cares about the f_lock
> and dying logic.
> 
> And in fact, I don't think doing
> 
>         spin_lock(&file->f_lock);
> 
> is even valid in the places that look up file through "epi->ffd.file",
> because the lock itself is inside the thing that you can't trust until
> you've taken the lock...
> 
> So I agree with Kees about the use of "atomic_dec_not_zero()" kind of
> logic - but it also needs to be in an RCU-readlocked region, I think.

Why isn't it enough to just force dma_buf_poll() to use
get_file_active()? Then that whole problem goes away afaict.

So the fix I had yesterday, before I had to step away from the computer,
was literally just that [1]. It can currently end up doing two atomic
increments, but the dma-buf folks can probably fix that by being smarter
about when they actually need to take a file reference.
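
For readers unfamiliar with the pattern, here is a minimal userspace
sketch of the "take a reference only if the count is still non-zero"
idea behind get_file_active(). It is an illustration under stated
assumptions, not the kernel implementation: struct fake_file,
fake_get_active() and fake_put() are hypothetical names, C11 atomics
stand in for the kernel's atomic_long_t helpers, and the RCU protection
Linus mentions above is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the refcount is modelled. */
struct fake_file {
	atomic_long f_count;
};

/*
 * Take a reference only if the object is still alive, i.e. its count
 * has not already dropped to zero. Returns false if it is dying.
 */
static bool fake_get_active(struct fake_file *f)
{
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure the CAS reloads 'old' with the current value. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool fake_put(struct fake_file *f)
{
	long old = atomic_fetch_sub(&f->f_count, 1);

	/* Dropping a reference that was never held would be a bug. */
	assert(old > 0);
	return old == 1;
}
```

A poll handler built on this would bail out (the patch returns EPOLLERR)
when fake_get_active() fails, and call fake_put() on every exit path
once it succeeds.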

> 
> I wish epoll() just took the damn file ref itself. But since it relies
> on the file refcount to release the data structure, that obviously
> can't work.
> 
>                 Linus

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..7149c45976e1 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -244,13 +244,18 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
        if (!dmabuf || !dmabuf->resv)
                return EPOLLERR;

+       if (!get_file_active(&dmabuf->file))
+               return EPOLLERR;
+
        resv = dmabuf->resv;

        poll_wait(file, &dmabuf->poll, poll);

        events = poll_requested_events(poll) & (EPOLLIN | EPOLLOUT);
-       if (!events)
+       if (!events) {
+               fput(file);
                return 0;
+       }

        dma_resv_lock(resv, NULL);

@@ -268,7 +273,6 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
                if (events & EPOLLOUT) {
                        /* Paired with fput in dma_buf_poll_cb */
                        get_file(dmabuf->file);
-
                        if (!dma_buf_poll_add_cb(resv, true, dcb))
                                /* No callback queued, wake up any other waiters */
                                dma_buf_poll_cb(NULL, &dcb->cb);
@@ -301,6 +305,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
        }

        dma_resv_unlock(resv);
+       fput(file);
        return events;
 }



