Errors in gf_defrag_handle_migrate_error()?

Vijay Bellur <vbellur@xxxxxxxxxx> · Sat, 03 Jan 2015 17:59:16 +0530

gf_defrag_handle_migrate_error() seems broken at the moment.

        /* if errno is not ENOSPC or ENOTCONN, we can still continue
           with rebalance process */
        if ((op_errno != ENOSPC) || (op_errno != ENOTCONN))
                return 1;

        if (op_errno == ENOTCONN) {
                /* Most probably mount point went missing (mostly due
                   to a brick down), say rebalance failure to user,
                   let him restart it if everything is fine */
                defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
                return -1;
        }

        if (op_errno == ENOSPC) {
                /* rebalance process itself failed, may be
                   remote brick went down, or write failed due to
                   disk full etc etc.. */
                defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
                return -1;
        }

        return 0;

The first if condition should use a logical AND instead of an OR. No
value of op_errno can equal both ENOSPC and ENOTCONN, so the OR is
always true: we return 1 (continue) for every op_errno, including
ENOSPC and ENOTCONN, and the two checks below it are never reached.

Do we need any special handling for ENOTCONN and ENOSPC at all? We
already abort a rebalance process when we get a CHILD_DOWN, and that
should cover errors with op_errno ENOTCONN. If a write fails with
ENOSPC on one brick, shouldn't we let rebalance continue and migrate
data to other non-full bricks? That seems to be the intent in
gf_defrag_migrate_data(), where we update the skipped counter when an
ENOSPC is encountered. If these errnos don't need special behavior,
this function can be retired.
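
To make the point about the first condition concrete, here is a minimal
standalone sketch comparing the condition as written with the AND form.
It is not GlusterFS code: the function names are made up for
illustration, and EIO simply stands in for "some other errno".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Condition as currently written. No op_errno can equal both ENOSPC
   and ENOTCONN, so at least one inequality always holds and this
   returns true for every input. (Hypothetical helper name.) */
bool can_continue_as_written(int op_errno)
{
        return (op_errno != ENOSPC) || (op_errno != ENOTCONN);
}

/* Condition the code comment describes: continue only when op_errno
   is neither ENOSPC nor ENOTCONN. (Hypothetical helper name.) */
bool can_continue_with_and(int op_errno)
{
        return (op_errno != ENOSPC) && (op_errno != ENOTCONN);
}

/* Small self-check of both forms (hypothetical, for illustration). */
void check_conditions(void)
{
        /* The OR form lets everything through, even the two errnos
           it is meant to single out. */
        assert(can_continue_as_written(ENOSPC));
        assert(can_continue_as_written(ENOTCONN));
        assert(can_continue_as_written(EIO));

        /* The AND form stops on ENOSPC and ENOTCONN only. */
        assert(!can_continue_with_and(ENOSPC));
        assert(!can_continue_with_and(ENOTCONN));
        assert(can_continue_with_and(EIO));
}
```

With the AND form the two later checks become reachable, which is why
the question of whether we want that behavior at all matters.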

-Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel