> On Jul 29, 2024, at 10:26 AM, Youzhong Yang <youzhong@xxxxxxxxx> wrote: > > How is this going? any chance to move forward and deal with the EEXIST > case in a future patch? I see no harm in keeping the EEXIST check. The EEXIST patch has been applied to nfsd-next for a while, along with the other patches in this series. > On Mon, Jul 15, 2024 at 10:06 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: >> >> On Mon, Jul 15, 2024 at 08:25:53AM -0400, Jeff Layton wrote: >>> On Mon, 2024-07-15 at 10:27 +1000, NeilBrown wrote: >>>> On Fri, 12 Jul 2024, Jeff Layton wrote: >>>>> Given that we do the search and insertion while holding the i_lock, I >>>>> don't think it's possible for us to get EEXIST here. Remove this case. >>>> >>>> I was going to comment that as rhltable_insert() cannot return -EEXIST >>>> that is an extra reason to discard the check. But then I looked at the >>>> code an I cannot convince myself that it cannot. >>>> If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it >>>> calls rhashtable_insert_slow(), and that seems to fail if the key >>>> already exists. But it shouldn't for an rhltable, it should just add >>>> the new item to the linked list for that key. >>>> >>>> It looks like this has always been broken: adding to an rhltable during >>>> a resize event can cause EEXIST.... >>>> >>>> Would anyone like to check my work? I'm surprise that hasn't been >>>> noticed if it is really the case. >>>> >>>> >>> >>> I don't know this code well at all, but it looks correct to me: >>> >>> static void *rhashtable_try_insert(struct rhashtable *ht, const void *key, >>> struct rhash_head *obj) >>> { >>> struct bucket_table *new_tbl; >>> struct bucket_table *tbl; >>> struct rhash_lock_head __rcu **bkt; >>> unsigned long flags; >>> unsigned int hash; >>> void *data; >>> >>> new_tbl = rcu_dereference(ht->tbl); >>> >>> do { >>> tbl = new_tbl; >>> hash = rht_head_hashfn(ht, tbl, obj, ht->p); >>> if (rcu_access_pointer(tbl->future_tbl)) >>> /* Failure is OK */ >>> bkt = rht_bucket_var(tbl, hash); >>> else >>> bkt = rht_bucket_insert(ht, tbl, hash); >>> if (bkt == NULL) { >>> new_tbl = rht_dereference_rcu(tbl->future_tbl, ht); >>> data = ERR_PTR(-EAGAIN); >>> } else { >>> flags = rht_lock(tbl, bkt); >>> data = rhashtable_lookup_one(ht, bkt, tbl, >>> hash, key, obj); >>> new_tbl = rhashtable_insert_one(ht, bkt, tbl, >>> hash, obj, data); >>> if (PTR_ERR(new_tbl) != -EEXIST) >>> data = ERR_CAST(new_tbl); >>> >>> rht_unlock(tbl, bkt, flags); >>> } >>> } while (!IS_ERR_OR_NULL(new_tbl)); >>> >>> if (PTR_ERR(data) == -EAGAIN) >>> data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?: >>> -EAGAIN); >>> >>> return data; >>> } >>> >>> I'm assuming the part we need to worry about is where >>> rhashtable_insert_one returns -EEXIST. >>> >>> It holds the rht_lock across the lookup and insert though. So if >>> rhashtable_insert_one returns -EEXIST, then "data" must be something >>> valid. In that case, "data" won't be overwritten and it will fall >>> through and return the pointer to the entry already there. >>> >>> That said, this logic is really convoluted, so I may have missed >>> something too. >> >> This is the issue I was concerned about after my review: it's >> obvious that the rhtable API can return -EEXIST, but it's just >> really hard to tell whether the rh/l/table API will ever return >> -EEXIST. >> >> As Neil says, the rhtable "hash table full" case should not happen >> with rhltable. But can we prove that? >> >> If we are not yet confident, then maybe PATCH 1/3 should replace >> the "if (ret == -EEXIST)" with "WARN_ON(ret == -EEXIST)"...? It's >> also possible to ask the human(s) who constructed the rhltable >> code. :-) >> >> >>>>> Cc: Youzhong Yang <youzhong@xxxxxxxxx> >>>>> Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") >>>>> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> >>>>> --- >>>>> This is replacement for PATCH 1/3 in the series I sent yesterday. I >>>>> think it makes sense to just eliminate this case. >>>>> --- >>>>> fs/nfsd/filecache.c | 2 -- >>>>> 1 file changed, 2 deletions(-) >>>>> >>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>>>> index f84913691b78..b9dc7c22242c 100644 >>>>> --- a/fs/nfsd/filecache.c >>>>> +++ b/fs/nfsd/filecache.c >>>>> @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, >>>>> if (likely(ret == 0)) >>>>> goto open_file; >>>>> >>>>> - if (ret == -EEXIST) >>>>> - goto retry; >>>>> trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); >>>>> status = nfserr_jukebox; >>>>> goto construction_err; >>>>> >>>>> --- >>>>> base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 >>>>> change-id: 20240711-nfsd-next-c9d17f66e2bd >>>>> >>>>> Best regards, >>>>> -- >>>>> Jeff Layton <jlayton@xxxxxxxxxx> >>>>> >>>>> >>>> >>> >>> -- >>> Jeff Layton <jlayton@xxxxxxxxxx> >> >> -- >> Chuck Lever -- Chuck Lever