+ ocfs2-fix-qs_holds-may-could-not-be-zero.patch added to -mm tree

The patch titled
     Subject: ocfs2: fix qs_holds may could not be zero
has been added to the -mm tree.  Its filename is
     ocfs2-fix-qs_holds-may-could-not-be-zero.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-fix-qs_holds-may-could-not-be-zero.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-fix-qs_holds-may-could-not-be-zero.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zhangyang <zhang.yangB@xxxxxxx>
Subject: ocfs2: fix qs_holds may could not be zero

In our testing, we found that when the network goes down, qs->qs_holds
may never drop to zero, which prevents the node from fencing.

The path is o2net_idle_timer -> o2quo_conn_err -> qs->qs_holds++.  After
O2NET_QUORUM_DELAY_MS, if qs_holds has been decremented to zero, the node
can run o2quo_make_decision.

But when there are many nodes and one node's network goes down, its o2net
connections may not all run o2net_idle_timer at the same time.

So one o2net_node may have run its nn->nn_still_up work while qs_holds is
still nonzero, because the other o2net_nodes have not yet run
nn->nn_still_up.  The first o2net_node then runs o2net_idle_timer again
and qs_holds is incremented again.  Since qs_holds is a global variable,
this forms a loop: qs_holds never reaches zero, so the node can never run
o2quo_make_decision.

Fix this by changing o2quo_conn_err so that the call to o2quo_set_hold is
made under the control of the qs_conn_bm bitmap.
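
The loop and the effect of the change can be illustrated with a small
userspace model.  This is only a sketch, not the kernel code: every
model_* name, the two-node setup, the per-node hold flag and the event
ordering are assumptions made for illustration, and the "fixed" path
assumes the block containing clear_bit() runs only when the node's bit in
qs_conn_bm is set.

```c
#include <stdbool.h>

#define MODEL_NODES 2	/* hypothetical cluster size for the model */

/* Simplified stand-in for the quorum state; not the kernel structure. */
struct model_quorum {
	bool conn[MODEL_NODES];	/* plays the role of qs_conn_bm */
	bool hb[MODEL_NODES];	/* plays the role of qs_hb_bm */
	bool hold[MODEL_NODES];	/* assumed per-node hold flag */
	int holds;		/* plays the role of qs_holds */
};

/* Take a hold for a node once; repeated calls do not stack. */
static void model_set_hold(struct model_quorum *qs, int node)
{
	if (!qs->hold[node]) {
		qs->hold[node] = true;
		qs->holds++;
	}
}

/* Release a node's hold, as the delayed "still up" work would. */
static void model_clear_hold(struct model_quorum *qs, int node)
{
	if (qs->hold[node]) {
		qs->hold[node] = false;
		qs->holds--;
	}
}

/*
 * Model of the connection error path.
 * fixed == false: the hold is taken whenever the heartbeat bit is set.
 * fixed == true:  the hold is taken only if the node was still marked
 *                 connected, i.e. only on the first error for that node.
 */
static void model_conn_err(struct model_quorum *qs, int node, bool fixed)
{
	bool was_connected = qs->conn[node];

	if (was_connected)
		qs->conn[node] = false;

	if ((!fixed || was_connected) && qs->hb[node])
		model_set_hold(qs, node);
}

/*
 * Replay the interleaving from the description: each node's delayed work
 * releases its hold, then that node's idle timer fires again before the
 * other node's delayed work has run.
 * Returns 1 if holds reached zero within 'rounds' rounds, 0 otherwise.
 */
static int model_run(bool fixed, int rounds)
{
	struct model_quorum qs = {
		.conn = { true, true },
		.hb = { true, true },
	};
	int round;

	/* Network goes down: both idle timers fire once. */
	model_conn_err(&qs, 0, fixed);
	model_conn_err(&qs, 1, fixed);

	for (round = 0; round < rounds; round++) {
		int node = round % MODEL_NODES;

		model_clear_hold(&qs, node);
		if (qs.holds == 0)
			return 1;	/* a decision could now be made */

		model_conn_err(&qs, node, fixed);
	}

	return 0;	/* holds never reached zero */
}
```

In this model, model_run(false, 100) never sees holds reach zero, while
model_run(true, 100) does so after the second node releases its hold.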

Link: http://lkml.kernel.org/r/7F50894FD17BEC45AAC26E5BADA6CE330C60F99A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Yang Zhang <zhang.yangB@xxxxxxx>
Cc: Mark Fasheh <mfasheh@xxxxxxxxxxx>
Cc: Joel Becker <jlbec@xxxxxxxxxxxx>
Cc: Junxiao Bi <junxiao.bi@xxxxxxxxxx>
Cc: Joseph Qi <jiangqi903@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/ocfs2/cluster/quorum.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff -puN fs/ocfs2/cluster/quorum.c~ocfs2-fix-qs_holds-may-could-not-be-zero fs/ocfs2/cluster/quorum.c
--- a/fs/ocfs2/cluster/quorum.c~ocfs2-fix-qs_holds-may-could-not-be-zero
+++ a/fs/ocfs2/cluster/quorum.c
@@ -314,13 +314,16 @@ void o2quo_conn_err(u8 node)
 				node, qs->qs_connected);
 
 		clear_bit(node, qs->qs_conn_bm);
+		/*
+		 * Set the hold only inside this check, so that qs_holds
+		 * can always drop back to zero.
+		 */
+		if (test_bit(node, qs->qs_hb_bm))
+			o2quo_set_hold(qs, node);
 	}
 
 	mlog(0, "node %u, %d total\n", node, qs->qs_connected);
 
-	if (test_bit(node, qs->qs_hb_bm))
-		o2quo_set_hold(qs, node);
-
 	spin_unlock(&qs->qs_lock);
 }
 
_

Patches currently in -mm which might be from zhang.yangB@xxxxxxx are

ocfs2-fix-qs_holds-may-could-not-be-zero.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
