On Tue, Aug 13, 2019 at 1:57 PM Olga Kornievskaia
<olga.kornievskaia@xxxxxxxxx> wrote:
>
> On Mon, Aug 12, 2019 at 4:00 PM J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> >
> > On Mon, Aug 12, 2019 at 03:16:47PM -0400, Olga Kornievskaia wrote:
> > > On Mon, Aug 12, 2019 at 12:19 PM Olga Kornievskaia
> > > <olga.kornievskaia@xxxxxxxxx> wrote:
> > > > While this passes my testing, in theory this allows for the race that
> > > > we get the copy notify size but then offload_cancel arrives and changes
> > > > the value. Then refcount_sub_and test_check would have an incorrect
> > > > value (can subtract larger than an actual reference count). I have no
> > > > solution for that as there is no refcount_sub_and_lock() that will
> > > > allow decrementing by a multiple under a lock. Thoughts?
> > >
> > > I tried not to use the client's cl_lock but instead use a specific
> > > lock to protect the copy notify stateids on the stateid list. But
> > > since the stateid's reference counter (sc_count) is protected by it, I
> > > think getting rid of the special lock and using cl_lock will solve
> > > the problem of coordinating access between the sc_count and the
> > > copy_notify stateid list. Are there any problems with using such a big
> > > lock?
> >
> > Probably not. But it can be confusing when a single lock is used for
> > several different things. A comment explaining why you need it might
> > help.
>
> While holding the client's cl_lock to manipulate the list of copy
> notify stateids solves the refcount problem, it generates a different
> problem for the laundromat thread. There, the client list is traversed
> already holding the cl_lock, so I can't call routines to free a
> copy_notify stateid because that in turn calls nfs4_put_stid(), which
> wants to take the cl_lock. If I put the copy_notify stateid on the
> reaplist, I then lose the pointer to the client structure that I need
> to take the lock. Then it seems the nfs4_cpntf_state structure would
> need to keep a pointer to the client structure, but then I get the
> problem of making sure the nfs4_client structure isn't going away, and
> that becomes an even bigger mess.
>
> I think I need to remove the code in the laundromat that looks for
> unreferenced copy_notify stateids and just rely on cleaning up on the
> removal of the stateid (basically what I originally had). Or I need to
> rely on the client to always send FREE_STATEID. I don't see other
> options, do you?

Ignore this, Bruce. Trond gave me a good idea that got me unstuck.

>
> >
> > --b.
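
For readers following the locking problem above, here is a minimal
userspace sketch of the reaplist pattern: unlink entries while holding
the lock, then release them only after dropping it, so a release routine
that takes the same lock cannot deadlock. This is not the nfsd code and
not necessarily the idea Trond suggested. The names struct client,
struct cpntf, add_entry(), put_entry() and reap_expired() are all
hypothetical, a pthread mutex stands in for cl_lock, and the sketch
simply assumes the client outlives its entries, which is exactly the
lifetime question raised above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the nfsd structures. */
struct cpntf;

struct client {
    pthread_mutex_t lock;   /* plays the role of cl_lock */
    struct cpntf *active;   /* singly linked list of copy-notify entries */
    int put_calls;          /* counts put_entry() calls, for demonstration */
};

struct cpntf {
    struct cpntf *next;
    struct client *owner;   /* back pointer; assumes the client outlives the entry */
    int expired;
};

/* Allocate an entry and push it on the client's list under the lock. */
static void add_entry(struct client *c, int expired)
{
    struct cpntf *e = malloc(sizeof(*e));

    assert(e != NULL);
    e->owner = c;
    e->expired = expired;
    pthread_mutex_lock(&c->lock);
    e->next = c->active;
    c->active = e;
    pthread_mutex_unlock(&c->lock);
}

/* Release routine that takes the client lock itself, so it must never be
 * called with that lock already held (a default mutex is not recursive). */
static void put_entry(struct cpntf *e)
{
    struct client *c = e->owner;

    pthread_mutex_lock(&c->lock);
    c->put_calls++;
    pthread_mutex_unlock(&c->lock);
    free(e);
}

/* Phase 1: move expired entries to a private reaplist under the lock.
 * Phase 2: release them after the lock has been dropped.
 * Returns the number of entries released. */
static int reap_expired(struct client *c)
{
    struct cpntf *reaplist = NULL, **pp, *e;
    int n = 0;

    pthread_mutex_lock(&c->lock);
    pp = &c->active;
    while ((e = *pp) != NULL) {
        if (e->expired) {
            *pp = e->next;      /* unlink from the active list */
            e->next = reaplist; /* push onto the private reaplist */
            reaplist = e;
        } else {
            pp = &e->next;
        }
    }
    pthread_mutex_unlock(&c->lock);

    while ((e = reaplist) != NULL) {
        reaplist = e->next;
        put_entry(e);           /* safe: the lock is no longer held */
        n++;
    }
    return n;
}
```

The back pointer is what lets the second phase find the lock again. In
the kernel that pointer would also need something to keep the client
alive, which this sketch does not attempt to model.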