> On Mar 25, 2020, at 2:37 AM, Yihao Wu <wuyihao@xxxxxxxxxxxxxxxxx> wrote: > > On 2020/3/25 1:46 AM, Chuck Lever wrote: >>>>> --- >>>>> net/sunrpc/cache.c | 3 +++ >>>>> 1 file changed, 3 insertions(+) >>>>> >>>>> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c >>>>> index bd843a81afa0..3e523eefc47f 100644 >>>>> --- a/net/sunrpc/cache.c >>>>> +++ b/net/sunrpc/cache.c >>>>> @@ -524,9 +524,11 @@ void cache_purge(struct cache_detail *detail) >>>>> struct hlist_node *tmp = NULL; >>>>> int i = 0; >>>>> >>>>> + spin_lock(&cache_list_lock); >>>>> spin_lock(&detail->hash_lock); >>>>> if (!detail->entries) { >>>>> spin_unlock(&detail->hash_lock); >>>>> + spin_unlock(&cache_list_lock); >>>>> return; >>>>> } >>>>> >>>>> @@ -541,6 +543,7 @@ void cache_purge(struct cache_detail *detail) >>>>> } >>>>> } >>>>> spin_unlock(&detail->hash_lock); >>>>> + spin_unlock(&cache_list_lock); >>>>> } >>>>> EXPORT_SYMBOL_GPL(cache_purge); >>> >>> Hmm... Shouldn't this patch be dropping cache_list_lock() when we call >>> sunrpc_end_cache_remove_entry()? The latter does call both >>> cache_revisit_request() and cache_put(), and while they do not >>> explicitly call anything that holds cache_list_lock, some of those cd- >>>> cache_put callbacks do look as if there is potential for deadlock. >> I see svc_export_put calling dput, eventually, which might_sleep(). > > Wow that's a little strange. If svc_export_put->dput might_sleep, why can we > spin_lock(&detail->hash_lock); in cache_purge in the first place? > > And I agree with Trond those cd->cache_put callbacks are dangerous. I will look > into them today. > > But if we dropping cache_list_lock when we call sunrpc_end_cache_remove_entry, > cache_put is not protected, and this patch won't work anymore, right? IMHO Neil's proposed solution seems pretty safe, and follows a well-understood pattern. It would be nice (but not 100% necessary) if the race you found was spelled out in the patch description. Thanks! -- Chuck Lever