On 2019/01/11 13:08, Paul E. McKenney wrote: > On Tue, Jan 08, 2019 at 06:59:13PM -0800, Paul E. McKenney wrote: >> On Tue, Jan 08, 2019 at 04:19:59PM -0800, Paul E. McKenney wrote: >>> On Wed, Jan 09, 2019 at 07:16:05AM +0900, Akira Yokosawa wrote: >>>> On 2019/01/08 10:39:31 -0800, Paul E. McKenney wrote: >>>>> On Wed, Jan 09, 2019 at 12:35:37AM +0900, Akira Yokosawa wrote: >>>>>> On 2019/01/09 0:28, Paul E. McKenney wrote: >>>>>>> On Tue, Jan 08, 2019 at 09:56:57AM +0800, Junchang Wang wrote: >>>>>>>> On Tue, Jan 8, 2019 at 7:06 AM Akira Yokosawa <akiyks@xxxxxxxxx> wrote: >>>>>>>>> On 2019/01/08 07:54:16 +0900, Akira Yokosawa wrote: >>> >>> [ . . . ] >>> >>>>>>>> Hi Paul and Akira, >>>>>>>> >>>>>>>> Thanks a lot for the comments, which I need some more time to look >>>>>>>> into. For Paul's patch, I have a few concerns. Please take a look. >>>>>>>> >>>>>>>> My understanding is that with this path, during the time period when >>>>>>>> the resizing thread is running, an updater may insert/delete an item >>>>>>>> into/from the new hash table, while readers are still looking up data >>>>>>>> in the old one, resulting the readers are unaware of >>>>>>>> insertions/deletions happening simultaneously. For example, it seems >>>>>>>> the following sequence could happen. >>>>>>>> >>>>>>>> 1. The resizing thread starts. >>>>>>>> 2. The resizing thread successfully passes bucket *B* of the old hash table. >>>>>>>> 3. An updater wants to insert a new item *I* which should be inserted >>>>>>>> into bucket *B*. >>>>>>>> 4. The updater will select the new hash table and insert the item *I* >>>>>>>> into the new hash table. >>>>>>>> 5. A read request comes in and wants to lookup item *I*. The lookup >>>>>>>> request will check the old hash table and fail. Doesn't it? >>>>>>>> 6. The resizing thread exits. >>>>>>>> 7. Now subsequent read requests can successfully find item *I*. >>>>>>> >>>>>>> Yes, this can happen. >>>>>>> >>>>>>>> Is my understanding correct? Please let me know if I misunderstood >>>>>>>> anything. Give the truth that this patch can accelerate the fast path, >>>>>>>> I think it should be OK because resizing is typically happen rarely. >>>>>>>> Just want to make sure I fully understand the algorithm. >>>>>>> >>>>>>> It is a design choice, and some users would prefer not to fail to see >>>>>>> new items during a resize. One approach would be to revert back to >>>>>>> the old-style checking, and another would be to provide a separate >>>>>>> lookup interface that synchronizes with adds and deletes. >>>>>>> >>>>>>> So, I could add a quick quiz with this information, I could revert the >>>>>>> change, or I could add another lookup function that provided more timely >>>>>>> information. Left to myself, I would provide a quick quiz, but what >>>>>>> do you guys think? >>>>>> >>>>>> Hi, I was composing a message, but now I'm replying to this one. >>>>>> I think adding a quick quiz would be a good idea. >>>>> >>>>> But in the meantime, it occurred to me that I was looking at the >>>>> problem in the wrong way. I believe that the following patch makes >>>>> hashtab_lookup() find elements recently added by hashtab_add(), even >>>>> during a resize, and without the need for memory barriers. >>>>> >>>>> The scenario that convinced me to take this approach is when a thread >>>>> does hashtab_add(), then immediately searches for the newly added element. >>>>> Failing to find it would be quite a surprise to most people. >>>> >>>> When a thread does hashtab_del() and immediately checks the deletion, >>>> it still finds the deleted element while resizing is in progress. >>>> This would also be a surprise. Current version looks less consistent >>>> than the simpler one did. >>> >>> I bet I can fix that... Famous last words! ;-) >>> >>> But please see below and tell me what you think. >> >> Well, that is not quite right, but close. Working on it... > > Seems to be stable. I have not yet updated the text. I am currently > looking into whether I can get rid of ->ht_resize_cur. Without ->ht_resize_cur, it would be hard (or impossible) for hashtab_lock_mod() to see if bucket in the new table needs to be locked, wouldn't it? I have another idea to simplify the code, with the possibility of increasing update-side cost during resizing. By keeping updating ->cur hashtab until the resizing finished, hashtab_lookup() only needs to see ->cur. hashtab_add() and hashtab_del() update both buckets if hashtab_lock_mod() has locked two buckets. I've added the code as the alternative code to avoid messing line lables. The diff is appended below. Thoughts? Note: By adding a "-DFCV_SNIPPET" flag to the compiler, you can compile your version of the code. Thanks, Akira > In theory, this > would make it trivial to make the resizing "pause", releasing the lock > from time to time. > > For whatever it is worth... > > Thanx, Paul > diff --git a/CodeSamples/datastruct/hash/hash_resize.c b/CodeSamples/datastruct/hash/hash_resize.c index 9f9fe8c..35bc847 100644 --- a/CodeSamples/datastruct/hash/hash_resize.c +++ b/CodeSamples/datastruct/hash/hash_resize.c @@ -30,7 +30,11 @@ struct ht_elem { struct rcu_head rh; struct cds_list_head hte_next[2]; //\lnlbl{ht_elem:next} +#ifndef FCV_SNIPPET + unsigned long hte_hash[2]; +#else /* #ifndef FCV_SNIPPET */ unsigned long hte_hash; +#endif /* #ifndef FCV_SNIPPET */ }; /* Hash-table bucket element. */ @@ -230,12 +234,16 @@ hashtab_lookup(struct hashtab *htp_master, void *key) htp = rcu_dereference(htp_master->ht_cur); //\lnlbl{lkp:get_curtbl} htep = ht_search_bucket(htp, key); //\lnlbl{lkp:get_curbkt} +#ifndef FCV_SNIPPET + return htep; +#else /* #ifndef FCV_SNIPPET */ if (htep) //\lnlbl{lkp:entchk} return htep; //\lnlbl{lkp:ret_match} htp = rcu_dereference(htp->ht_new); //\lnlbl{lkp:get_nxttbl} if (!htp) //\lnlbl{lkp:htpchk} return NULL; //\lnlbl{lkp:noresize} return ht_search_bucket(htp, key); //\lnlbl{lkp:ret_nxtbkt} +#endif /* #ifndef FCV_SNIPPET */ } //\lnlbl{lkp:e} /* @@ -248,9 +256,18 @@ void hashtab_add(struct ht_elem *htep, //\lnlbl{add:b} struct ht_bucket *htbp = lsp->hbp[0]; //\lnlbl{add:htbp} int i = lsp->hls_idx[0]; //\lnlbl{add:i} +#ifndef FCV_SNIPPET + htep->hte_hash[0] = lsp->hls_hash[0]; + cds_list_add_rcu(&htep->hte_next[i], &htbp->htb_head); + if ((htbp = lsp->hbp[1])) { + htep->hte_hash[1] = lsp->hls_hash[1]; + cds_list_add_rcu(&htep->hte_next[!i], &htbp->htb_head); + } +#else /* #ifndef FCV_SNIPPET */ htep->hte_hash = lsp->hls_hash[0]; //\lnlbl{add:hash} htep->hte_next[!i].prev = NULL; //\lnlbl{add:initp} cds_list_add_rcu(&htep->hte_next[i], &htbp->htb_head); //\lnlbl{add:add} +#endif /* #ifndef FCV_SNIPPET */ } //\lnlbl{add:e} /* @@ -262,6 +279,11 @@ void hashtab_del(struct ht_elem *htep, //\lnlbl{del:b} { int i = lsp->hls_idx[0]; //\lnlbl{del:i} +#ifndef FCV_SNIPPET + cds_list_del_rcu(&htep->hte_next[i]); + if (lsp->hbp[1]) + cds_list_del_rcu(&htep->hte_next[!i]); +#else /* #ifndef FCV_SNIPPET */ if (htep->hte_next[i].prev) { //\lnlbl{del:if} cds_list_del_rcu(&htep->hte_next[i]); //\lnlbl{del:del} htep->hte_next[i].prev = NULL; //\lnlbl{del:init} @@ -270,6 +292,7 @@ void hashtab_del(struct ht_elem *htep, //\lnlbl{del:b} cds_list_del_rcu(&htep->hte_next[!i]); //\lnlbl{del:delnew} htep->hte_next[!i].prev = NULL; //\lnlbl{del:initnew} } +#endif /* #ifndef FCV_SNIPPET */ } //\lnlbl{del:e} //\end{snippet} @@ -350,5 +373,9 @@ void defer_del_rcu(struct rcu_head *rhp) #define quiescent_state() rcu_quiescent_state() +#ifndef FCV_SNIPPET +#define check_hash() (htep->hte_hash[0] != hash && htep->hte_hash[1] != hash) +#endif /* #ifndef FCV_SNIPPET */ + #include "hashtorture.h" #endif /* #ifdef TEST_HASH */ diff --git a/CodeSamples/datastruct/hash/hashtorture.h b/CodeSamples/datastruct/hash/hashtorture.h index 6f47baa..d6345cc 100644 --- a/CodeSamples/datastruct/hash/hashtorture.h +++ b/CodeSamples/datastruct/hash/hashtorture.h @@ -62,6 +62,10 @@ void (*defer_del_done)(struct ht_elem *htep) = NULL; #define rcu_barrier() do ; while (0) #endif /* #ifndef quiescent_state */ +#ifndef check_hash +#define check_hash() (htep->hte_hash != hash) +#endif /* #ifndef check_hash */ + /* * Test variables. */ @@ -988,7 +992,7 @@ int zoo_lookup(char *key) htep = hashtab_lookup(perftest_htp, hash, key); zhep = container_of(htep, struct zoo_he, zhe_e); BUG_ON(htep && - (htep->hte_hash != hash || + (check_hash() || strncmp(zhep->name, (char *)key, ZOO_NAMELEN) != 0)); hashtab_unlock_lookup(perftest_htp, hash); hashtab_lookup_done(htep);