gf_defrag_handle_migrate_error() seems broken at the moment.
/* if errno is not ENOSPC or ENOTCONN, we can still continue
with rebalance process */
if ((op_errno != ENOSPC) || (op_errno != ENOTCONN))
return 1;
if (op_errno == ENOTCONN) {
/* Most probably mount point went missing (mostly due
to a brick down), say rebalance failure to user,
let him restart it if everything is fine */
defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
return -1;
}
if (op_errno == ENOSPC) {
/* rebalance process itself failed, may be
remote brick went down, or write failed due to
disk full etc etc.. */
defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
return -1;
}
return 0;
The first if condition should have a logical AND instead of an OR. It
looks like we will still continue (return 1) if op_errno happens to be
anything, even if it is ENOSPC or ENOTCONN. Do we need any special
handling for ENOTCONN & ENOSPC? We anyway abort a rebalance process if
we were to get a CHILD_DOWN and that should handle errors with op_errno
ENOTCONN. If a write fails because of ENOSPC in one brick, shouldn't we
let rebalance continue and migrate data to other non-full bricks? That
seems to be the intent in gf_defrag_migrate_data() where we update the
skipped counter if an ENOSPC is encountered. If we don't need to accord
special behavior to these errnos, this function can be retired.
-Vijay
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel