On 2/4/25 20:26, Joanne Koong wrote: > Hi Bernd, > > On Tue, Feb 4, 2025 at 3:03 AM Bernd Schubert > <bernd.schubert@xxxxxxxxxxx> wrote: >> >> Hi Joanne, >> >> On 2/3/25 19:50, Joanne Koong wrote: >>> req->flags is set/tested/cleared atomically in fuse. When the FR_PENDING >>> bit is cleared from the request flags when assigning a request to a >>> uring entry, the fiq->lock does not need to be held. >>> >>> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> >>> Fixes: c090c8abae4b6 ("fuse: Add io-uring sqe commit and fetch support") >>> --- >>> fs/fuse/dev_uring.c | 2 -- >>> 1 file changed, 2 deletions(-) >>> >>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c >>> index ab8c26042aa8..42389d3e7235 100644 >>> --- a/fs/fuse/dev_uring.c >>> +++ b/fs/fuse/dev_uring.c >>> @@ -764,9 +764,7 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent, >>> ent->state); >>> } >>> >>> - spin_lock(&fiq->lock); >>> clear_bit(FR_PENDING, &req->flags); >>> - spin_unlock(&fiq->lock); >>> ent->fuse_req = req; >>> ent->state = FRRS_FUSE_REQ; >>> list_move(&ent->list, &queue->ent_w_req_queue); >> >> I think that would have an issue in request_wait_answer(). Let's say >> >> >> task-A, request_wait_answer(), >> spin_lock(&fiq->lock); >> /* Request is not yet in userspace, bail out */ >> if (test_bit(FR_PENDING, &req->flags)) { // ========> if passed >> list_del(&req->list); // --> removes from the list >> >> task-B, >> fuse_uring_add_req_to_ring_ent() >> clear_bit(FR_PENDING, &req->flags); >> ent->fuse_req = req; >> ent->state = FRRS_FUSE_REQ; >> list_move_tail(&ent->list, &queue->ent_w_req_queue); >> fuse_uring_add_to_pq(ent, req); // ==> Add to list >> >> >> >> What I mean is, task-A passes the if, but is then slower than task-B. I.e. >> task-B runs fuse_uring_add_to_pq() before task-B does the list_del. >> > > Is this race condition possible given that fiq->ops->send_req() is > called (and completed) before request_wait_answer() is called? The > path I see is this: > > __fuse_simple_request() > __fuse_request_send() > fuse_send_one() > fiq->ops->send_req() > fuse_uring_queue_fuse_req() > fuse_uring_add_req_to_ring_ent() > clear FR_PENDING bit > fuse_uring_add_to_pq() > request_wait_answer() > > It doesn't seem like task A can call request_wait_answer() while task > B is running fuse_uring_queue_fuse_req() on the same request while the > request still has the FR_PENDING bit set. > > This case of task A running request_wait_answer() while task B is > executing fuse_uring_add_req_to_ring_ent() can happen through > fuse_uring_commit_fetch() -> fuse_uring_add_req_to_ring_ent(), but at > that point the FR_PENDING flag will have already been cleared on the > request, so this would bypass the "if (test_bit(FR_PENDING,...))" > check in request_wait_answer(). I mean this case. I don't think FR_PENDING is cleared - why should it? And where? The request is pending state, waiting to get into 'FR_SENT'? > > Is there something I'm missing? I think if this race condition is > possible, then we also have a bigger problem where the request can be > freed out in this request_wait_answer() -> if (test_bit(FR_PENDING, > &req->flags))... case while fuse_uring_add_req_to_ring_ent() -> > fuse_uring_add_to_pq() dereferences it still. I don't think so, if we take the lock. Thanks, Bernd