Re: gnfs split brain when 1 server in 3x1 down (high load) - help request

On 08/04/20 4:59 am, Erik Jacobson wrote:
Apologies for misinterpreting the backtrace.

#0  afr_read_txn_refresh_done (frame=0x7ffcf4146478, this=0x7fff64013720, err=5)
    at afr-read-txn.c:312
#1  0x00007fff68938d2b in afr_txn_refresh_done (frame=frame@entry=0x7ffcf4146478,
    this=this@entry=0x7fff64013720, err=5, err@entry=0) at afr-common.c:1222
Sorry, I missed this too.
(gdb) print event_generation
$3 = 0

(gdb) print priv->fav_child_policy
$4 = AFR_FAV_CHILD_NONE

I am not sure what this signifies though.  It appears to be a read
transaction with no event generation and no favorite child policy.

Feel free to ask for clarification in case my thought process went awry
somewhere.

Favorite child policy is used only for automatically resolving split-brains, and it stays 0 (AFR_FAV_CHILD_NONE, as your gdb output shows) unless that volume option is set. The problem is indeed that event_generation is zero. Could you apply this logging patch and check whether afr_inode_event_gen_reset() is hit for that gfid, or whether afr_access() sees a zero event_gen to begin with?
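
To tell the two cases apart once the patched build is running, something like the following could tally the two new messages in the log. This is only a rough sketch: the two search strings come from the format strings in the patch, while the helper names (classify_line, count_events) and the idea of feeding the log through a FILE stream are assumptions made for illustration, not anything that exists in the gluster tree.

```c
#include <stdio.h>
#include <string.h>

/* Which of the two messages added by the logging patch a log line carries. */
enum gen_event { GEN_OTHER = 0, GEN_RESET, GEN_ZERO_AT_ACCESS };

/* Hypothetical helper: match on the literal text of the two new messages.
 * Nothing is assumed about the rest of the log line layout. */
static enum gen_event classify_line(const char *line)
{
    if (strstr(line, "Resetting event gen for "))
        return GEN_RESET;
    if (strstr(line, "Event gen is zero for "))
        return GEN_ZERO_AT_ACCESS;
    return GEN_OTHER;
}

/* Hypothetical helper: count both kinds of message in a stream.
 * Lines longer than the buffer are read in pieces, so a message split
 * across two pieces would be missed; acceptable for a quick check. */
static void count_events(FILE *in, unsigned long *resets, unsigned long *zeros)
{
    char line[4096];

    *resets = 0;
    *zeros = 0;
    while (fgets(line, sizeof line, in)) {
        switch (classify_line(line)) {
        case GEN_RESET:
            ++*resets;
            break;
        case GEN_ZERO_AT_ACCESS:
            ++*zeros;
            break;
        default:
            break;
        }
    }
}
```

Calling count_events(stdin, &resets, &zeros) from a small main() and piping the gnfs log into it would give both counts. Resets that show up before the zero-gen messages would point at afr_inode_event_gen_reset(); zero-gen messages with no preceding reset would suggest the generation was never set.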

Thanks,

Ravi





diff --git a/xlators/cluster/afr/src/afr-common.c b/xlators/cluster/afr/src/afr-common.c
index 4bfaef9e8..61f21795e 100644
--- a/xlators/cluster/afr/src/afr-common.c
+++ b/xlators/cluster/afr/src/afr-common.c
@@ -750,6 +750,8 @@ afr_inode_event_gen_reset(inode_t *inode, xlator_t *this)
 
     GF_VALIDATE_OR_GOTO(this->name, inode, out);
 
+    gf_msg_callingfn(this->name, GF_LOG_ERROR, 0, AFR_MSG_SPLIT_BRAIN,
+                     "Resetting event gen for %s", uuid_utoa(inode->gfid));
     LOCK(&inode->lock);
     {
         ret = __afr_inode_event_gen_reset(inode, this);
diff --git a/xlators/cluster/afr/src/afr-inode-read.c b/xlators/cluster/afr/src/afr-inode-read.c
index 9204add5b..5ac83d6c8 100644
--- a/xlators/cluster/afr/src/afr-inode-read.c
+++ b/xlators/cluster/afr/src/afr-inode-read.c
@@ -172,6 +172,12 @@ afr_access(call_frame_t *frame, xlator_t *this, loc_t *loc, int mask,
     if (xdata)
         local->xdata_req = dict_ref(xdata);
 
+    if (local->event_generation == 0)
+        gf_msg(this->name, GF_LOG_ERROR, 0, AFR_MSG_SPLIT_BRAIN,
+               "Event gen is zero for %s(%s)", local->loc.name,
+               local->loc.inode ? uuid_utoa(local->loc.inode->gfid)
+                                : "NULL");
+
     afr_read_txn(frame, this, loc->inode, afr_access_wind,
                  AFR_METADATA_TRANSACTION);
 