On 7/2/20 3:24 PM, Linus Torvalds wrote:
On Thu, Jul 2, 2020 at 2:17 PM Pavel Machek <pavel@xxxxxxx> wrote:
commit 4cd9973f9ff69e37dd0ba2bd6e6423f8179c329a upstream.
Patch series "ocfs2: fix nfsd over ocfs2 issues", v2.
This causes locking imbalance:
This seems to be true upstream too.
When ocfs2_nfs_sync_lock() returns an error, the caller cannot know
whether the lock was taken or not.
Right you are.
And your patch looks sane:
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index c141b06811a6..8149fb6f1f0d 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -2867,9 +2867,15 @@ int ocfs2_nfs_sync_lock(struct ocfs2_super *osb, int ex)
 	status = ocfs2_cluster_lock(osb, lockres, ex ? LKM_EXMODE : LKM_PRMODE,
 				    0, 0);
-	if (status < 0)
+	if (status < 0) {
 		mlog(ML_ERROR, "lock on nfs sync lock failed %d\n", status);
+		if (ex)
+			up_write(&osb->nfs_sync_rwlock);
+		else
+			up_read(&osb->nfs_sync_rwlock);
+	}
+
 	return status;
 }
although the whole thing looks messy.
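
For readers following along, here is a rough userspace sketch of the invariant the patch restores: if the lock function returns an error, nothing is left held, so the caller never has to guess. It assumes the function takes the local rwlock before attempting the cluster lock, as the up_write()/up_read() calls in the patch imply. Everything prefixed demo_ is hypothetical: the rwlock is a toy single-threaded counter, not a kernel rwsem, and the cluster lock result is simply passed in as a parameter.

```c
#include <assert.h>
#include <errno.h>

/* Toy single-threaded model of a reader/writer lock: it only counts holders. */
struct demo_rwlock {
	int readers;
	int writer;
};

/* Hypothetical stand-in for struct ocfs2_super; only the rwlock is modeled. */
struct demo_super {
	struct demo_rwlock nfs_sync_rwlock;
};

static void demo_down_read(struct demo_rwlock *l)  { l->readers++; }
static void demo_down_write(struct demo_rwlock *l) { l->writer = 1; }

static void demo_up_read(struct demo_rwlock *l)
{
	assert(l->readers > 0);	/* catches an unbalanced release */
	l->readers--;
}

static void demo_up_write(struct demo_rwlock *l)
{
	assert(l->writer);	/* catches an unbalanced release */
	l->writer = 0;
}

static int demo_lock_is_free(const struct demo_rwlock *l)
{
	return l->readers == 0 && l->writer == 0;
}

/*
 * cluster_status stands in for the return value of the cluster lock call.
 * On failure the local rwlock is dropped again before returning, so
 * "status < 0" always means "nothing held".
 */
static int demo_nfs_sync_lock(struct demo_super *osb, int ex, int cluster_status)
{
	int status;

	if (ex)
		demo_down_write(&osb->nfs_sync_rwlock);
	else
		demo_down_read(&osb->nfs_sync_rwlock);

	status = cluster_status;
	if (status < 0) {
		if (ex)
			demo_up_write(&osb->nfs_sync_rwlock);
		else
			demo_up_read(&osb->nfs_sync_rwlock);
	}

	return status;
}

/* Only called after a successful demo_nfs_sync_lock() with the same ex. */
static void demo_nfs_sync_unlock(struct demo_super *osb, int ex)
{
	if (ex)
		demo_up_write(&osb->nfs_sync_rwlock);
	else
		demo_up_read(&osb->nfs_sync_rwlock);
}
```

With this shape a caller unlocks only when the lock call returned 0. Without the release on the error path, the same caller would either leak the rwlock or have to unlock on a path where it cannot tell what is held.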
If the issue is a lifetime thing (like that commit says), the proper
model isn't a lock, but a refcount.
Oh well. Junxiao?
There is a block number embedded in the nfs file handle. To verify that
it refers to an inode, we need to acquire the global nfs_sync_lock so
that no inode is removed on the local node or on other nodes in the
cluster before the verification is done. There seems to be no way to
use a refcount for that.
Thanks,
Junxiao.
Linus