Muhammad Usama Anjum <musamaanjum@xxxxxxxxx> writes:
> pfid is being set to tcon->crfid.fid and they are copied into each other
> multiple times. Remove the memcpy between the same pointers.
>
> Addresses-Coverity: ("Overlapped copy")
> Fixes: 9e81e8ff74b9 ("cifs: return cached_fid from open_shroot")
> Signed-off-by: Muhammad Usama Anjum <musamaanjum@xxxxxxxxx>
> ---
> I'm not sure why refcount was being incremented here. This file has
> evolved so much. Any ideas?

The fact that pfid is the same as the cache is very weird... Probably due
to a recent change.

This function returns a cached dir entry for the root of the share which
can be accessed/shared by multiple tasks. The basic logic is:

open_cached_dir(result)
{
    if (cache.is_valid) {
        memcpy(result, cache->fid)
        return
    }

    // not cached, send open() to server
    dir tmp;
    smb2_open(&tmp...)

    memcpy(cache->fid, tmp)
    cache.is_valid = true

    memcpy(result, cache->fid)
    return
}

My understanding is that all file/dir entries have a refcount, so to
prevent callers from releasing the cached entry when they put it, we need
to bump it before returning:

open_cached_dir(result)
{
    if (cache.is_valid) {
        kref_get(cache)
        memcpy(result, cache->fid)
        return
    }

    // not cached, send open() to server
    dir tmp;
    smb2_open(&tmp...)

    memcpy(cache->fid, tmp)
    cache.is_valid = true
    kref_init(cache)
    kref_get(cache)

    memcpy(result, cache->fid)
    return
}

Now this function can be called from multiple threads, and there are a
couple of critical sections:

process 1                       process 2
-------------------             -----------------
if (cache.is_valid) => false
continue
smb2_open(...)
                                if (cache.is_valid) => false
                                continue
                                smb2_open(...)
cache.is_valid = true

In that example, we ended up opening twice and overwriting the cache. So
we need to add locks to avoid this race condition:

open_cached_dir(result)
{
    mutex_lock(cache)
    if (cache.is_valid) {
        kref_get(cache)
        memcpy(result, cache->fid)
        mutex_unlock(cache)
        return
    }

    // not cached, send open() to server
    dir tmp;
    smb2_open(&tmp...)

    memcpy(cache->fid, tmp)
    cache.is_valid = true
    kref_init(cache)
    kref_get(cache)
    mutex_unlock(cache)

    memcpy(result, cache->fid)
    return
}

But now we get reports of deadlocks. It turns out smb2_open() in some code
paths (if it ends up triggering a reconnect) will take the lock again.
Since Linux mutexes are not reentrant, this will block forever (deadlock).
So we need to release the lock around the smb2_open() call:

open_cached_dir(result)
{
    mutex_lock(cache)
    if (cache.is_valid) {
        kref_get(cache)
        memcpy(result, cache->fid)
        mutex_unlock(cache)
        return
    }

    // release lock for open
    mutex_unlock(cache)

    // not cached, send open() to server
    dir tmp;
    smb2_open(&tmp...)

    // take it back
    mutex_lock(cache)

    // now we need to check is_valid again since it could have been
    // changed in that small unlocked time frame by a concurrent process
    if (cache.is_valid) {
        // a concurrent call to this func was done already
        // return the existing one to caller
        memcpy(result, cache->fid)
        kref_get(cache)
        mutex_unlock(cache)
        // close the tmp duplicate one we opened
        smb2_close(tmp)
        return
    }

    memcpy(cache->fid, tmp)
    cache.is_valid = true
    kref_init(cache)
    kref_get(cache)
    memcpy(result, cache->fid)
    mutex_unlock(cache)
    return
}

That ^^^ is the pseudo-code of what the function *should* be doing. We
need to go over it and see what it is doing differently now. I think it's
likely that when we made the code usable for caching any dir, something
diverged along the way.
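
To make the final shape concrete, here is a minimal userspace sketch of
that last version. It is *not* the real fs/cifs code: the struct name,
the smb2_open/close stubs, pthread mutexes and a plain integer refcount
are all stand-ins for the kernel's struct mutex and kref. It only shows
the drop-the-lock-around-the-open, re-check, close-the-duplicate pattern:

#include <pthread.h>
#include <stdbool.h>
#include <string.h>

struct fid { int persistent_fid, volatile_fid; };

struct cached_dir {
    pthread_mutex_t lock;
    bool is_valid;
    int refcount;
    struct fid fid;
};

/* Stand-ins for the server round trips. */
static void smb2_open_stub(struct fid *out)  { out->persistent_fid = 1; out->volatile_fid = 2; }
static void smb2_close_stub(struct fid *fid) { (void)fid; }

static void open_cached_dir(struct cached_dir *cache, struct fid *result)
{
    struct fid tmp;

    pthread_mutex_lock(&cache->lock);
    if (cache->is_valid) {
        /* Fast path: hand out the cached handle and take a reference. */
        cache->refcount++;
        memcpy(result, &cache->fid, sizeof(*result));
        pthread_mutex_unlock(&cache->lock);
        return;
    }

    /* Drop the lock around the open: it may re-enter on reconnect. */
    pthread_mutex_unlock(&cache->lock);
    smb2_open_stub(&tmp);
    pthread_mutex_lock(&cache->lock);

    if (cache->is_valid) {
        /* Someone else populated the cache while we were unlocked:
         * use theirs, close our now-duplicate handle. */
        cache->refcount++;
        memcpy(result, &cache->fid, sizeof(*result));
        pthread_mutex_unlock(&cache->lock);
        smb2_close_stub(&tmp);
        return;
    }

    /* We won the race: populate the cache from our open. */
    memcpy(&cache->fid, &tmp, sizeof(cache->fid));
    cache->is_valid = true;
    cache->refcount = 2;    /* one for the cache itself, one for the caller */
    memcpy(result, &cache->fid, sizeof(*result));
    pthread_mutex_unlock(&cache->lock);
}

In the real code the refcount would be the kref from the pseudo-code
above (kref_init()/kref_get() on the way out, kref_put() with a release
callback that sends the close when the last user drops it), but the
locking shape is the same.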

Cheers,
--
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)