Hi,

I'm currently removing a few bricks from a distributed dispersed volume with "gluster volume remove-brick"; I'm running GlusterFS 6.6. This triggered a rebalance that is supposed to migrate the data off the bricks being removed. This morning it had reported ~50,000 failures on each server. I found a whole bunch of log entries like this:

[2020-02-17 10:02:47.971011] I [dht-rebalance.c:1589:dht_migrate_file] 0-OMICS-dht: $FILE: attempting to move from OMICS-disperse-0 to OMICS-disperse-10
[2020-02-17 10:02:47.997915] W [MSGID: 0] [dht-rebalance.c:1026:__dht_check_free_space] 0-OMICS-dht: Write will cross min-free-disk for file - $FILE on subvol - OMICS-disperse-10. Looking for new subvol
[2020-02-17 10:02:47.997970] I [MSGID: 0] [dht-rebalance.c:1082:__dht_check_free_space] 0-OMICS-dht: new target found - OMICS-disperse-1 for file - $FILE
[2020-02-17 10:02:48.192873] I [MSGID: 0] [dht-rebalance.c:1788:dht_migrate_file] 0-OMICS-dht: destination for file - $FILE is changed to - OMICS-disperse-1
[2020-02-17 10:02:48.407606] E [MSGID: 109023] [dht-rebalance.c:2055:dht_migrate_file] 0-OMICS-dht: failed to set xattr on $FILE in OMICS-disperse-10 [Operation not supported]
[2020-02-17 10:02:48.414374] E [MSGID: 109023] [dht-rebalance.c:2874:gf_defrag_migrate_single_file] 0-OMICS-dht: migrate-data failed for $FILE [Operation not supported]

The bricks for subvol disperse-10 did indeed hit 90% usage during the rebalance; subvol disperse-1 is way below that. If I look for $FILE on the bricks, I find copies on both subvol disperse-0 and subvol disperse-1, and the ones on subvol disperse-1 look weird (bricks 0100 and 0101 belong to subvol disperse-0, bricks 0102 and 0103 are part of subvol disperse-1):

# ls -lah $BRICKS/$FILE
-rw-r--r-- 2 $USER $GROUP 3.5K Feb 13 07:47 $BRICK0100/$FILE
-rw-r--r-- 2 $USER $GROUP 3.5K Feb 13 07:47 $BRICK0101/$FILE
-rw-r--r-- 2 $USER $GROUP    0 Feb 17 11:02 $BRICK0102/$FILE
-rw-r--r-- 2 $USER $GROUP    0 Feb 17 11:02 $BRICK0103/$FILE

This doesn't look like a link file. Some of those files are empty on the client side, some aren't. But since they aren't my files, I can't tell for sure whether they are supposed to be empty. The empty ones report a file size of 0 (du -h $FILE) from the client side, but they do have a size (and content) on the server side in their original subvolume, so I'm guessing they shouldn't be empty :(

This looked weird, so I stopped the remove-brick operation. Is this supposed to happen? Or is the rebalance screwing up when it tries to move things to a brick that is already full? I'm removing subvolume disperse-12 -- is it intended that data from subvol disperse-0 is being moved as well? Should I open a bug report?

And, most importantly: are those weird non-link-file-but-empty files going to be a problem, and if yes, how do I get rid of them safely? Can I restore the content of the files that are currently shown as empty?

Thanks in advance and kind regards

Gudrun Amedick
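P.S. In case it helps: a quick way to double-check whether the 0-byte copies are DHT link files would be to dump their xattrs on the brick (a sketch, using the placeholder brick paths from above). A real link file has sticky-bit-only permissions (---------T) and carries a trusted.glusterfs.dht.linkto xattr naming the subvolume that actually holds the data:

# getfattr -d -m . -e hex $BRICK0102/$FILE

The 90% fill level on disperse-10 also matches the default cluster.min-free-disk threshold of 10%, which can be confirmed for the volume with:

# gluster volume get OMICS cluster.min-free-disk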