On Mon, 2022-08-01 at 13:25 -0700, dai.ngo@xxxxxxxxxx wrote: > On 8/1/22 12:26 PM, Jeff Layton wrote: > > On Mon, 2022-08-01 at 12:22 -0700, dai.ngo@xxxxxxxxxx wrote: > > > On 8/1/22 12:10 PM, Jeff Layton wrote: > > > > On Mon, 2022-08-01 at 11:52 -0700, dai.ngo@xxxxxxxxxx wrote: > > > > > On 8/1/22 4:47 AM, Jeff Layton wrote: > > > > > > On Sun, 2022-07-31 at 13:19 -0700, Dai Ngo wrote: > > > > > > > Use-after-free occurred when the laundromat tried to free expired > > > > > > > cpntf_state entry on the s2s_cp_stateids list after inter-server > > > > > > > copy completed. The sc_cp_list that the expired copy state was > > > > > > > inserted on was already freed. > > > > > > > > > > > > > > When COPY completes, the Linux client normally sends LOCKU(lock_state x), > > > > > > > FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server. > > > > > > > The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state > > > > > > > from the s2s_cp_stateids list before freeing the lock state's stid. > > > > > > > > > > > > > > However, sometimes the CLOSE was sent before the FREE_STATEID request. > > > > > > > When this happens, the nfsd4_close_open_stateid call from nfsd4_close > > > > > > > frees all lock states on its st_locks list without cleaning up the copy > > > > > > > state on the sc_cp_list list. When the time the FREE_STATEID arrives the > > > > > > > server returns BAD_STATEID since the lock state was freed. This causes > > > > > > > the use-after-free error to occur when the laundromat tries to free > > > > > > > the expired cpntf_state. > > > > > > > > > > > > > > This patch adds a call to nfs4_free_cpntf_statelist in > > > > > > > nfsd4_close_open_stateid to clean up the copy state before calling > > > > > > > free_ol_stateid_reaplist to free the lock state's stid on the reaplist. > > > > > > > > > > > > > > Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx> > > > > > > > --- > > > > > > > fs/nfsd/nfs4state.c | 3 +++ > > > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > > > > > index 9409a0dc1b76..749f51dff5c7 100644 > > > > > > > --- a/fs/nfsd/nfs4state.c > > > > > > > +++ b/fs/nfsd/nfs4state.c > > > > > > > @@ -6608,6 +6608,7 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s) > > > > > > > struct nfs4_client *clp = s->st_stid.sc_client; > > > > > > > bool unhashed; > > > > > > > LIST_HEAD(reaplist); > > > > > > > + struct nfs4_ol_stateid *stp; > > > > > > > > > > > > > > spin_lock(&clp->cl_lock); > > > > > > > unhashed = unhash_open_stateid(s, &reaplist); > > > > > > > @@ -6616,6 +6617,8 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s) > > > > > > > if (unhashed) > > > > > > > put_ol_stateid_locked(s, &reaplist); > > > > > > > spin_unlock(&clp->cl_lock); > > > > > > > + list_for_each_entry(stp, &reaplist, st_locks) > > > > > > > + nfs4_free_cpntf_statelist(clp->net, &stp->st_stid); > > > > > > > free_ol_stateid_reaplist(&reaplist); > > > > > > > } else { > > > > > > > spin_unlock(&clp->cl_lock); > > > > > > Nice catch. > > > > > > > > > > > > There are a number of places that call free_ol_stateid_reaplist. Is it > > > > > > really only in nfsd4_close_open_stateid that we need to do this? I > > > > > > wonder if it would be better to do this inside free_ol_stateid_reaplist > > > > > > instead so that all of those callers clean up the copy states as well? > > > > > Yes, we can do this in free_ol_stateid_reaplist too, I tested it. > > > > > > > > > > The linux client uses either delegation state or lock state to send with > > > > > the COPY_NOTIFY to the source server. If the server grants the delegation > > > > > in the OPEN then the client uses the delegation state, otherwise it sends > > > > > the LOCK to the source and uses the lock state for the COPY_NOTIFY. This > > > > > problem happens only when the lock state is used *and* the client sends > > > > > the CLOSE and FREE_STATEID out of order. > > > > > > > > > > free_ol_stateid_reaplist is called from release_open_stateid, release_openowner, > > > > > nfsd4_close_open_stateid and nfsd4_release_lockowner. Among these functions, > > > > > only nfsd4_close_open_stateid deals with lock state that may have cpntf_state > > > > > associated with it and only for the minorversion > 1 case. > > > > > > > > > > nfsd4_release_lockowner will free the lock states but if the client has > > > > > not send LOCKU yet then put_ol_stateid_locked would fail to add the lock > > > > > state on the reaplist. > > > > > > > > > > I'm ok to move it to free_ol_stateid_reaplist if you still think we should > > > > > and don't mind a little overhead on the unneeded cases. > > > > > > > > > If you think this is the only way this can happen, then the patch you > > > > have is fine. In that case though, it might be good to have something > > > > like this in free_ol_stateid_reaplist(): > > > > > > > > WARN_ON(!list_empty(&stp->sc_cp_list)); > > > > > > > > ...to try and catch cases where these objects slip through the cracks > > > > after future changes. > > > Good suggestion, I'll add it to v2. > > > > > Actually, it might be better to put it in nfs4_free_ol_stateid. If we're > > freeing an open or lock stateid and this list isn't empty, then > > something is wrong. > > The linux client uses only delegation or lock state for COPY_NOTIFY. > However the protocol allows for using open state with COPY_NOTIFY too, > so adding the warning in nfs4_free_ol_stateid would cover behavior of > other clients. I'll add this to v2. > Might also want to check this when freeing a deleg stid too. -- Jeff Layton <jlayton@xxxxxxxxxx>