[PATCH v6.4-rc2 2/5] fs: dlm: fix cleanup pending ops when interrupted

This patch mainly reverts commit b92a4e3f86b1 ("fs: dlm: change posix
lock sigint handling"), with two exceptions: the check whether op->done
became true under ops_lock after the wait was interrupted is kept, and
the "no op" messages become debug printouts.

There are currently problems with cleaning up pending operations. The
main idea of commit b92a4e3f86b1 ("fs: dlm: change posix lock sigint
handling") was to wait for a reply and, if the wait was interrupted, to
run the cleanup routine, e.g. list_del() and do_unlock_close(), once
that reply arrives.

This requires that an answer comes back in dev_write() for every dlm op
request. However, the cleanup routine do_unlock_close() does not operate
on a per-request basis in the dlm user space software. It cleans up
everything that matches certain plock op fields, which means that we no
longer get a result back for every request. This leaves entries in the
dlm plock recv_list that will never be deleted.

This was confirmed with a new debugfs entry that shows whether some
plock lists still have entries left when there is no more posix lock
activity, as checked by dlm_tool plocks $LS. In the specific testcase,
the following command was executed on a gfs2 mountpoint:

stress-ng --fcntl 32

and the stress-ng program was killed after a certain time.

do_unlock_close() cleans up more than just one specific operation, and
the dlm operation has already been removed by list_del() at that point.
This list_del() can operate on either send_list or recv_list. If it hits
recv_list, answers can still come back for an ongoing operation, because
do_unlock_close() is not synchronized with the list_del(). This ends in
a "no op ..." log_print(). To not confuse users about such cases, which
seem to be there by design, we move this logging to pr_debug(), as those
are expected log messages.
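
As a rough illustration of the ordering described above, the following
user space sketch models an interrupted waiter and a late reply. It is
not the kernel code: the toy_* names, the single pending list and the
pthread mutex standing in for ops_lock are assumptions made for this
example only. The real interrupted path additionally calls
do_unlock_close() and dlm_release_plock_op() after removing the op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical stand-in for struct plock_op, reduced to what is needed. */
struct toy_op {
	struct toy_op *prev, *next;	/* circular doubly linked list */
	int done;			/* set by the reply path */
	unsigned long long number;	/* models info.number */
};

/* Models ops_lock and one pending list (send_list or recv_list). */
static pthread_mutex_t ops_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_op pending = { &pending, &pending, 0, 0 };

/* Caller must hold ops_lock; op must currently be on the list. */
static void toy_del(struct toy_op *op)
{
	assert(op->prev && op->next);
	op->prev->next = op->next;
	op->next->prev = op->prev;
	op->prev = op->next = op;
}

static void toy_add(struct toy_op *op)
{
	pthread_mutex_lock(&ops_lock);
	op->next = pending.next;
	op->prev = &pending;
	pending.next->prev = op;
	pending.next = op;
	pthread_mutex_unlock(&ops_lock);
}

/*
 * Models the reply path in dev_write(): find a matching op, unlink it
 * and mark it done. Returns false for the expected "no op" case, i.e.
 * the waiter already removed its op after being interrupted.
 */
static bool toy_reply(unsigned long long number)
{
	struct toy_op *iter;
	bool found = false;

	pthread_mutex_lock(&ops_lock);
	for (iter = pending.next; iter != &pending; iter = iter->next) {
		if (iter->number == number) {
			toy_del(iter);
			iter->done = 1;
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&ops_lock);
	return found;
}

/*
 * Models the interrupted waiter: under the lock, either the reply has
 * already arrived (return false, caller continues with the result) or
 * the op is removed right away (return true, caller cleans up).
 */
static bool toy_interrupted(struct toy_op *op)
{
	pthread_mutex_lock(&ops_lock);
	if (op->done) {
		pthread_mutex_unlock(&ops_lock);
		return false;
	}
	toy_del(op);
	pthread_mutex_unlock(&ops_lock);
	return true;
}
```

In this model a reply that arrives after toy_interrupted() removed the
op simply finds nothing, which corresponds to the "no op" message that
is now printed at debug level only.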

Cc: stable@xxxxxxxxxxxxxxx
Fixes: b92a4e3f86b1 ("fs: dlm: change posix lock sigint handling")
Signed-off-by: Alexander Aring <aahringo@xxxxxxxxxx>
---
 fs/dlm/plock.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index ff364901f22b..fea2157fac5b 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -30,8 +30,6 @@ struct plock_async_data {
 struct plock_op {
 	struct list_head list;
 	int done;
-	/* if lock op got interrupted while waiting dlm_controld reply */
-	bool sigint;
 	struct dlm_plock_info info;
 	/* if set indicates async handling */
 	struct plock_async_data *data;
@@ -167,12 +165,14 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 			spin_unlock(&ops_lock);
 			goto do_lock_wait;
 		}
-
-		op->sigint = true;
+		list_del(&op->list);
 		spin_unlock(&ops_lock);
+
 		log_debug(ls, "%s: wait interrupted %x %llx pid %d",
 			  __func__, ls->ls_global_id,
 			  (unsigned long long)number, op->info.pid);
+		do_unlock_close(&op->info);
+		dlm_release_plock_op(op);
 		goto out;
 	}
 
@@ -434,19 +434,6 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 		if (iter->info.fsid == info.fsid &&
 		    iter->info.number == info.number &&
 		    iter->info.owner == info.owner) {
-			if (iter->sigint) {
-				list_del(&iter->list);
-				spin_unlock(&ops_lock);
-
-				pr_debug("%s: sigint cleanup %x %llx pid %d",
-					  __func__, iter->info.fsid,
-					  (unsigned long long)iter->info.number,
-					  iter->info.pid);
-				do_unlock_close(&iter->info);
-				memcpy(&iter->info, &info, sizeof(info));
-				dlm_release_plock_op(iter);
-				return count;
-			}
 			list_del_init(&iter->list);
 			memcpy(&iter->info, &info, sizeof(info));
 			if (iter->data)
@@ -465,8 +452,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 		else
 			wake_up(&recv_wq);
 	} else
-		log_print("%s: no op %x %llx", __func__,
-			  info.fsid, (unsigned long long)info.number);
+		pr_debug("%s: no op %x %llx", __func__,
+			 info.fsid, (unsigned long long)info.number);
 	return count;
 }
 
-- 
2.31.1



