On Wed, Jun 30, 2021 at 12:13:35PM -0700, dai.ngo@xxxxxxxxxx wrote:
> 
> On 6/30/21 11:55 AM, J. Bruce Fields wrote:
> >On Wed, Jun 30, 2021 at 11:49:18AM -0700, dai.ngo@xxxxxxxxxx wrote:
> >>On 6/30/21 11:05 AM, J. Bruce Fields wrote:
> >>>On Wed, Jun 30, 2021 at 10:51:27AM -0700, dai.ngo@xxxxxxxxxx wrote:
> >>>>>On 6/28/21 1:23 PM, J. Bruce Fields wrote:
> >>>>>>where ->fl_expire_lock is a new lock callback with second
> >>>>>>argument "check" where:
> >>>>>>
> >>>>>>	check = 1 means: just check whether this lock could be freed
> >>>>Why do we need this?  Is there a use case for it?  Can we just
> >>>>always try to expire the lock and return success/fail?
> >>>We can't expire the client while holding the flc_lock.  And once we
> >>>drop that lock we need to restart the loop.  Clearly we can't do that
> >>>every time.
> >>>
> >>>(So, my code was wrong; it should have been:
> >>>
> >>>	if (fl->fl_lops->fl_expire_lock(fl, 1)) {
> >>>		spin_unlock(&ctx->flc_lock);
> >>>		fl->fl_lops->fl_expire_lock(fl, 0);
> >>>		goto retry;
> >>>	}
> >>>
> >>>)
> >>This is what I currently have:
> >>
> >>retry:
> >>	list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
> >>		if (!posix_locks_conflict(request, fl))
> >>			continue;
> >>
> >>		if (fl->fl_lmops && fl->fl_lmops->lm_expire_lock) {
> >>			spin_unlock(&ctx->flc_lock);
> >>			ret = fl->fl_lmops->lm_expire_lock(fl, 0);
> >>			spin_lock(&ctx->flc_lock);
> >>			if (ret)
> >>				goto retry;
> >We have to retry regardless of the return value.  Once we've dropped
> >flc_lock, it's not safe to continue trying to iterate through the
> >list.
> 
> Yes, thanks!
> 
> >
> >>		}
> >>
> >>		if (conflock)
> >>			locks_copy_conflock(conflock, fl);
> >>
> >>>But the 1 and 0 cases are starting to look pretty different; maybe
> >>>they should be two different callbacks.
> >>Why is the case of 1 (test only) needed?  Who would use this call?
> >We need to avoid dropping the spinlock in the case where there are no
> >clients to expire; otherwise we'll make no forward progress.
> 
> I think we can remember the last checked file_lock and skip it:

I doubt that works when there are multiple locks with lm_expire_lock
set.

If you really don't want another callback here, maybe you could set
some kind of flag on the lock.

At the time a client expires, you're going to have to walk all of its
locks to see if anyone's waiting for them.  At the same time maybe you
could set an FL_EXPIRABLE flag on all those locks, and test for that
here.

If the network partition heals and the client comes back, you'd have to
remember to clear that flag again.

--b.

> retry:
> 	list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
> 		if (!posix_locks_conflict(request, fl))
> 			continue;
> 
> 		if (checked_fl != fl && fl->fl_lmops &&
> 				fl->fl_lmops->lm_expire_lock) {
> 			checked_fl = fl;
> 			spin_unlock(&ctx->flc_lock);
> 			fl->fl_lmops->lm_expire_lock(fl);
> 			spin_lock(&ctx->flc_lock);
> 			goto retry;
> 		}
> 
> 		if (conflock)
> 			locks_copy_conflock(conflock, fl);
> 
> -Dai
> 
> >
> >--b.