Re: [PATCH] fuse: clear FR_PENDING without holding fiq lock for uring requests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 4, 2025 at 12:00 PM Bernd Schubert
<bernd.schubert@xxxxxxxxxxx> wrote:
>
>
>
> On 2/4/25 20:26, Joanne Koong wrote:
> > Hi Bernd,
> >
> > On Tue, Feb 4, 2025 at 3:03 AM Bernd Schubert
> > <bernd.schubert@xxxxxxxxxxx> wrote:
> >>
> >> Hi Joanne,
> >>
> >> On 2/3/25 19:50, Joanne Koong wrote:
> >>> req->flags is set/tested/cleared atomically in fuse. When the FR_PENDING
> >>> bit is cleared from the request flags when assigning a request to a
> >>> uring entry, the fiq->lock does not need to be held.
> >>>
> >>> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> >>> Fixes: c090c8abae4b6 ("fuse: Add io-uring sqe commit and fetch support")
> >>> ---
> >>>  fs/fuse/dev_uring.c | 2 --
> >>>  1 file changed, 2 deletions(-)
> >>>
> >>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> >>> index ab8c26042aa8..42389d3e7235 100644
> >>> --- a/fs/fuse/dev_uring.c
> >>> +++ b/fs/fuse/dev_uring.c
> >>> @@ -764,9 +764,7 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent,
> >>>                       ent->state);
> >>>       }
> >>>
> >>> -     spin_lock(&fiq->lock);
> >>>       clear_bit(FR_PENDING, &req->flags);
> >>> -     spin_unlock(&fiq->lock);
> >>>       ent->fuse_req = req;
> >>>       ent->state = FRRS_FUSE_REQ;
> >>>       list_move(&ent->list, &queue->ent_w_req_queue);
> >>
> >> I think that would have an issue in request_wait_answer(). Let's say
> >>
> >>
> >> task-A, request_wait_answer(),
> >>                 spin_lock(&fiq->lock);
> >>                 /* Request is not yet in userspace, bail out */
> >>                 if (test_bit(FR_PENDING, &req->flags)) {  // ========> if passed
> >>                         list_del(&req->list);  // --> removes from the list
> >>
> >> task-B,
> >> fuse_uring_add_req_to_ring_ent()
> >>         clear_bit(FR_PENDING, &req->flags);
> >>         ent->fuse_req = req;
> >>         ent->state = FRRS_FUSE_REQ;
> >>         list_move_tail(&ent->list, &queue->ent_w_req_queue);
> >>         fuse_uring_add_to_pq(ent, req);  // ==> Add to list
> >>
> >>
> >>
> >> What I mean is, task-A passes the if, but is then slower than task-B. I.e.
> >> task-B runs fuse_uring_add_to_pq() before task-B does the list_del.
> >>
> >
> > Is this race condition possible given that fiq->ops->send_req() is
> > called (and completed) before request_wait_answer() is called? The
> > path I see is this:
> >
> > __fuse_simple_request()
> >     __fuse_request_send()
> >         fuse_send_one()
> >             fiq->ops->send_req()
> >                   fuse_uring_queue_fuse_req()
> >                       fuse_uring_add_req_to_ring_ent()
> >                            clear FR_PENDING bit
> >                            fuse_uring_add_to_pq()
> >         request_wait_answer()
> >
> > It doesn't seem like task A can call request_wait_answer() while task
> > B is running fuse_uring_queue_fuse_req() on the same request while the
> > request still has the FR_PENDING bit set.
> >
> > This case of task A running request_wait_answer() while task B is
> > executing fuse_uring_add_req_to_ring_ent() can happen through
> > fuse_uring_commit_fetch() ->  fuse_uring_add_req_to_ring_ent(), but at
> > that point the FR_PENDING flag will have already been cleared on the
> > request, so this would bypass the "if (test_bit(FR_PENDING,...))"
> > check in request_wait_answer().
>
> I mean this case. I don't think FR_PENDING is cleared - why should it?
> And where? The request is pending state, waiting to get into 'FR_SENT'?
>
> >
> > Is there something I'm missing? I think if this race condition is
> > possible, then we also have a bigger problem where the request can be
> > freed out in this request_wait_answer() ->  if (test_bit(FR_PENDING,
> > &req->flags))...  case while fuse_uring_add_req_to_ring_ent() ->
> > fuse_uring_add_to_pq() dereferences it still.
>
> I don't think so, if we take the lock.
>

the path I'm looking at is this:

task A -
__fuse_simple_request()
    fuse_get_req() -> request is allocated (req refcount is 1)
    __fuse_request_send()
        __fuse_get_request() -> req refcount is 2
        fuse_send_one() -> req gets sent to uring
        request_wait_answer()
               ...
               hits the interrupt case, goes into "if
test_bit(FR_PENDING, ...)" case which calls __fuse_put_request(), req
refcount is now 1
    fuse_put_request() -> req refcount is dropped to 0, request is freed

while in task B -
fuse_uring_commit_fetch()
    fuse_uring_next_fuse_req()
        fuse_uring_ent_assign_req()
            gets req off fuse_req_queue
            fuse_uring_add_req_to_ring_ent()
                 clear FR_PENDING
                 fuse_uring_add_to_pq()
                     dereferences req

if task A hits the interrupt case in request_wait_answer() and then
calls fuse_put_request() before task B clears the pending flag (and
after it's gotten the request from the fuse_req_queue in
fuse_uring_ent_assign_req()), then I think we hit this case, no?


Thanks,
Joanne
>
> Thanks,
> Bernd
>





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux