On Wed, Aug 07, 2019 at 12:42:08PM -0400, Olga Kornievskaia wrote:
> On Wed, Aug 7, 2019 at 12:09 PM J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >
> > On Wed, Aug 07, 2019 at 12:02:40PM -0400, Olga Kornievskaia wrote:
> > > On Thu, Aug 1, 2019 at 3:36 PM J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> > > > > i was just looking at close_lru and delegation_lru but I guess that's
> > > > > not a list of delegation or open stateids but rather some complex of
> > > > > not deleting the stateid right away but moving it to nfs4_ol_stateid
> > > > > and the list on the nfsd_net. Are you looking for something similar
> > > > > for the copy_notify state or can I just keep a global list of the
> > > > > nfs4_client and add and delete of that (not move to the delete later)?
> > > >
> > > > A global list seems like it should work if the locking's OK.
> > >
> > > I'm having issues taking a reference on a parent stateid and being
> > > able to clean it. Let me try to explain.
> >
> > With other stateid parent relationships I believe what we do is: instead
> > of the child taking a reference on the parent, we ensure that the child
> > is destroyed, and that nobody can be holding a pointer to it, before we
> > destroy the parent.
>
> I don't think we can get away from not taking a reference on the
> parent. When a READ comes with the copy_notify stateid, it's used to
> lookup the parent state because the nfs4_preprocess_stateid_op() that
> checks the validity of the stateid for a given operation needs to
> check validity of that parent stateid). Otherwise, we'd have to
> special case the READ calling nfs4_preprocess_stateid_op() and special
> call that function to when called from READ and finding a copy_notify
> stateid will forego the other checks. Do you want me to that instead
> of what I proposed below?

Um, honestly I'm not sure I understand your code below yet.  I'll take
another look....

> > > Since I take a reference on the stateid, then during what would have
> > > been the last put (due to say a close operation), stateid isn't
> > > released. Now that stateid is sticking around. I personally would have
> > > liked on what would have been a close and release of the stateid to
> > > release the copy notify state(s)

That's OK with me as long as it works.  Did I complain about it?

The only real requirement is that we've got *some* way to assure that we
aren't going to find a copy_notify stateid and try to follow it to its
parent, after the parent's been freed.

--b.

> > > (which was being done before but
> > > having a reference makes it hard? i want to count number of copy
> > > notify states and if then somehow if the num_copies-1 is going to make
> > > it 0, then decrement by num_copies (and the normal -1) but if it's not
> > > the last reference then it shouldn't be decremented.
> > >
> > > Now say no fancy logic happens on close so we have these stateids left
> > > over . What to do on unmount? It will error with err_client_busy since
> > > there are non-zero copy notify states and only after a lease period it
> > > will release the resources (when the close of the file should have
> > > removed any copy notify state)?
> > >
> > > Question: would it be acceptable to do something like this on freeing
> > > of the parent stateid?
> > >
> > > @@ -896,8 +931,12 @@ static void block_delegations(struct knfsd_fh *fh)
> > >         might_lock(&clp->cl_lock);
> > >
> > >         if (!refcount_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
> > > -               wake_up_all(&close_wq);
> > > -               return;
> > > +               if (!refcount_sub_and_test_checked(s->sc_cp_list_size,
> > > +                                       &s->sc_count)) {
> > > +                       refcount_add_checked(s->sc_cp_list_size, &s->sc_count);
> > > +                       wake_up_all(&close_wq);
> > > +                       return;
> > > +               }
> > >         }
> > >         idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
> > >         spin_unlock(&clp->cl_lock);
> > >
> > > then free the copy notify stateids associated with stateid.
> > >
> > > Laundromat would still be checking the copy_notify stateids for
> > > anything that's been not active for a while (but not closed).
> > >
> > > > --b.
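
For illustration only, the requirement stated above (never find a
copy_notify stateid and follow it to a parent that has already been
freed) can be met without the child holding a long-lived reference on the
parent. What follows is a minimal userspace sketch of that idea, not nfsd
code: every name in it (parent_stid, cpntf_state, cpntf_table,
state_lock, and the helper functions) is hypothetical, a pthread mutex
stands in for whatever spinlock the real code would use, and a fixed
array stands in for the global list. The one assumption that matters is
that the lookup and the final put take the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CPNTF 16

struct cpntf_state;

/* Hypothetical stand-in for a parent (open/lock/delegation) stateid. */
struct parent_stid {
	int refcount;                 /* ordinary references only */
	struct cpntf_state *children; /* copy_notify states issued from it */
};

/* Hypothetical stand-in for a copy_notify state. */
struct cpntf_state {
	int id;
	struct parent_stid *parent;   /* plain pointer, holds no reference */
	struct cpntf_state *next;
};

/* One lock covers the global table, the child lists and the refcounts. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cpntf_state *cpntf_table[MAX_CPNTF];

static struct parent_stid *parent_alloc(void)
{
	struct parent_stid *p = calloc(1, sizeof(*p));

	if (p)
		p->refcount = 1;
	return p;
}

/* Hash a new child under the lock; it does not pin the parent. */
static struct cpntf_state *cpntf_add(struct parent_stid *p, int id)
{
	struct cpntf_state *c;

	if (id < 0 || id >= MAX_CPNTF)
		return NULL;
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->id = id;
	c->parent = p;

	pthread_mutex_lock(&state_lock);
	if (cpntf_table[id]) {
		pthread_mutex_unlock(&state_lock);
		free(c);
		return NULL;
	}
	c->next = p->children;
	p->children = c;
	cpntf_table[id] = c;
	pthread_mutex_unlock(&state_lock);
	return c;
}

/*
 * What a READ carrying a copy_notify stateid might do: find the child
 * and take a temporary reference on its parent, all under the lock.
 */
static struct parent_stid *cpntf_find_parent(int id)
{
	struct parent_stid *p = NULL;

	if (id < 0 || id >= MAX_CPNTF)
		return NULL;
	pthread_mutex_lock(&state_lock);
	if (cpntf_table[id]) {
		p = cpntf_table[id]->parent;
		p->refcount++;
	}
	pthread_mutex_unlock(&state_lock);
	return p;
}

/*
 * Drop one reference.  On the last put, unhash and free every child
 * before the parent goes away, so no lookup can reach a freed parent.
 * Returns true if the parent was freed.
 */
static bool parent_put(struct parent_stid *p)
{
	struct cpntf_state *c, *next;

	pthread_mutex_lock(&state_lock);
	if (--p->refcount > 0) {
		pthread_mutex_unlock(&state_lock);
		return false;
	}
	for (c = p->children; c; c = next) {
		next = c->next;
		cpntf_table[c->id] = NULL;
		free(c);
	}
	p->children = NULL;
	pthread_mutex_unlock(&state_lock);
	free(p);
	return true;
}
```

In this sketch a close that drops the last ordinary reference also tears
down the copy_notify states, so nothing lingers until the lease expires,
and a READ only pins the parent for as long as it is using it. Whether
the real locking allows the lookup and the final put to share one lock is
exactly the open question in the thread.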