On 01/03/2015 05:59 PM, Vijay Bellur wrote:
gf_defrag_handle_migrate_error() seems broken at the moment. /* if errno is not ENOSPC or ENOTCONN, we can still continue with rebalance process */ if ((op_errno != ENOSPC) || (op_errno != ENOTCONN)) return 1; if (op_errno == ENOTCONN) { /* Most probably mount point went missing (mostly due to a brick down), say rebalance failure to user, let him restart it if everything is fine */ defrag->defrag_status = GF_DEFRAG_STATUS_FAILED; return -1; } if (op_errno == ENOSPC) { /* rebalance process itself failed, may be remote brick went down, or write failed due to disk full etc etc.. */ defrag->defrag_status = GF_DEFRAG_STATUS_FAILED; return -1; } return 0; The first if condition should have a logical AND instead of an OR. It looks like we will still continue (return 1) if op_errno happens to be anything, even if it is ENOSPC or ENOTCONN. Do we need any special handling for ENOTCONN & ENOSPC? We anyway abort a rebalance process if we were to get a CHILD_DOWN and that should handle errors with op_errno ENOTCONN. If a write fails because of ENOSPC in one brick, shouldn't we let rebalance continue and migrate data to other non-full bricks? That seems to be the intent in gf_defrag_migrate_data() where we update the skipped counter if an ENOSPC is encountered. If we don't need to accord special behavior to these errnos, this function can be retired.
Sent [1] to rfc queue. Please check and let me know your comments. -Vijay [1] http://review.gluster.org/9384 _______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel