On 04/04/20 9:12 pm, Erik Jacobson wrote:
This leaves us with afr_quorum_errno() returning the error. afr_final_errno() iterates through the 'children', looking for valid errors within the replies for the transaction (refresh transaction?). The function returns the highest valued error, which must be EIO (value of 5) in this case. I have not looked into how or what would set the error value in the replies array,
The error numbers that you see in the replies array in afr_final_errno() are set in afr_inode_refresh_subvol_cbk().
During inode refresh (which is essentially a lookup), AFR sends the lookup request to all its connected children, and the reply from each one of them is captured in afr_inode_refresh_subvol_cbk(). So adding a log there can identify whether we got EIO from any of the children. See attached patch for an example.
After we hear from all children, afr_inode_refresh_subvol_cbk() then calls afr_inode_refresh_done()-->afr_txn_refresh_done()-->afr_read_txn_refresh_done(). But you already know this flow now.
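To make the flow above concrete, here is a minimal standalone sketch in C. It is not the AFR code: the struct and both function names (sketch_reply, sketch_record_reply, sketch_final_errno) are hypothetical stand-ins, and the "highest errno value wins" rule simply follows the description in the quoted message. The real afr_final_errno() may rank errors differently, so treat this only as an illustration of where the values come from.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one slot of the replies array. */
struct sketch_reply {
    int valid;    /* non-zero once this child has answered */
    int op_ret;   /* 0 on success, -1 on failure */
    int op_errno; /* errno reported by the child when op_ret == -1 */
};

/*
 * Models what the per-child refresh callback does with a reply:
 * store the child's result in that child's slot.
 */
static void
sketch_record_reply(struct sketch_reply *replies, size_t child, int op_ret,
                    int op_errno)
{
    assert(replies != NULL);
    replies[child].valid = 1;
    replies[child].op_ret = op_ret;
    replies[child].op_errno = op_errno;
}

/*
 * Assumption: pick the numerically highest errno among the valid, failed
 * replies, as described in the quoted message. Returns 0 if no child failed.
 */
static int
sketch_final_errno(const struct sketch_reply *replies, size_t child_count)
{
    int result = 0;

    assert(replies != NULL);
    for (size_t i = 0; i < child_count; i++) {
        if (!replies[i].valid)
            continue; /* this child never answered */
        if (replies[i].op_ret >= 0)
            continue; /* this child succeeded */
        if (replies[i].op_errno > result)
            result = replies[i].op_errno;
    }
    return result;
}
```

With three children where one succeeds, one fails with ENOENT and one fails with EIO, sketch_final_errno() returns EIO under this rule, which is why logging each child's op_errno at the point where the reply is recorded shows which brick the EIO came from.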
diff --git a/xlators/cluster/afr/src/afr-common.c b/xlators/cluster/afr/src/afr-common.c
index 4bfaef9e8..096ce06f0 100644
--- a/xlators/cluster/afr/src/afr-common.c
+++ b/xlators/cluster/afr/src/afr-common.c
@@ -1318,6 +1318,12 @@ afr_inode_refresh_subvol_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
         if (xdata)
             local->replies[call_child].xdata = dict_ref(xdata);
     }
+    if (op_ret == -1)
+        gf_msg_callingfn(
+            this->name, GF_LOG_ERROR, op_errno, AFR_MSG_SPLIT_BRAIN,
+            "Inode refresh on child:%d failed with errno:%d for %s(%s) ",
+            call_child, op_errno, local->loc.name,
+            uuid_utoa(local->loc.inode->gfid));
     if (xdata) {
         ret = dict_get_int8(xdata, "link-count", &need_heal);
         local->replies[call_child].need_heal = need_heal;