remove-brick seems to delete file content

Hi,

I'm currently removing a few bricks from a distributed dispersed volume using gluster volume remove-brick. I'm running GlusterFS 6.6. The operation
triggered a rebalance that is supposed to move the data off the bricks being removed. This morning, it showed ~50,000 failures on each server. I found a
whole bunch of log entries like this:

[2020-02-17 10:02:47.971011] I [dht-rebalance.c:1589:dht_migrate_file] 0-OMICS-dht: $FILE: attempting to move from OMICS-disperse-0 to OMICS-disperse-10
[2020-02-17 10:02:47.997915] W [MSGID: 0] [dht-rebalance.c:1026:__dht_check_free_space] 0-OMICS-dht: Write will cross min-free-disk for file - $FILE on subvol - OMICS-disperse-10. Looking for new subvol
[2020-02-17 10:02:47.997970] I [MSGID: 0] [dht-rebalance.c:1082:__dht_check_free_space] 0-OMICS-dht: new target found - OMICS-disperse-1 for file - $FILE
[2020-02-17 10:02:48.192873] I [MSGID: 0] [dht-rebalance.c:1788:dht_migrate_file] 0-OMICS-dht: destination for file - $FILE is changed to - OMICS-disperse-1
[2020-02-17 10:02:48.407606] E [MSGID: 109023] [dht-rebalance.c:2055:dht_migrate_file] 0-OMICS-dht: failed to set xattr on $FILE in OMICS-disperse-10 [Operation not supported]
[2020-02-17 10:02:48.414374] E [MSGID: 109023] [dht-rebalance.c:2874:gf_defrag_migrate_single_file] 0-OMICS-dht: migrate-data failed for $FILE [Operation not supported]
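
To get a feel for how the failures are distributed, entries in this format can be tallied by the function that logged them and by the trailing bracketed reason. The following is only a minimal sketch: the regular expression is derived from the few log lines quoted above, not from any Gluster specification, and the name count_errors is made up for illustration.

```python
import re
from collections import Counter

# Prefix of a log entry as seen in the excerpt above:
#   [timestamp] LEVEL [MSGID: n] [file.c:line:function] ...
# The "[MSGID: n]" part is optional because the first quoted entry lacks it.
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)

# A bracketed reason at the very end of the line, e.g. "[Operation not supported]".
REASON = re.compile(r"\[([^\[\]]+)\]\s*$")


def count_errors(lines):
    """Count error-level ("E") entries per (function, reason) pair."""
    counts = Counter()
    for line in lines:
        m = ENTRY.match(line)
        if m is None or m.group("level") != "E":
            continue  # not a log entry, or not an error
        reason = REASON.search(line)
        counts[(m.group("func"), reason.group(1) if reason else "")] += 1
    return counts
```

Passing an open rebalance log file to count_errors (a file object iterates line by line) gives a Counter whose most_common() shows which failure dominates. Where that log lives depends on the installation, so the path is left to the reader.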

The bricks for subvol disperse-10 have indeed hit 90% during the rebalance. subvol disperse-1 is way lower.

If I look for $FILE on the bricks, I find copies on both subvol disperse-0 and subvol disperse-1, and those on subvol disperse-1 look weird (brick
0100 and 0101 belong to subvol disperse-0, brick 0102 and 0103 are part of subvol disperse-1):

# ls -lah $BRICKS/$FILE
-rw-r--r-- 2 $USER $GROUP 3.5K Feb 13 07:47 $BRICK0100/$FILE
-rw-r--r-- 2 $USER $GROUP 3.5K Feb 13 07:47 $BRICK0101/$FILE
-rw-r--r-- 2 $USER $GROUP    0 Feb 17 11:02 $BRICK0102/$FILE
-rw-r--r-- 2 $USER $GROUP    0 Feb 17 11:02 $BRICK0103/$FILE

This doesn't look like a linkfile.
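
For reference, the check behind that statement can be sketched as follows. It assumes the usual description of a DHT linkfile: a zero-byte regular file on the brick whose mode is only the sticky bit (---------T) and which carries a trusted.glusterfs.dht.linkto xattr. The function names are invented for this sketch and are not part of any Gluster tool; reading trusted.* xattrs needs root on the brick host.

```python
import os
import stat

# xattr that DHT sets on linkfiles to name the subvolume holding the data
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def is_linkfile(st_mode, st_size, has_linkto_xattr):
    """Pure check: regular file, size 0, mode exactly ---------T, linkto xattr set."""
    return (
        stat.S_ISREG(st_mode)
        and st_size == 0
        and stat.S_IMODE(st_mode) == stat.S_ISVTX  # only the sticky bit, no rwx bits
        and bool(has_linkto_xattr)
    )


def check_brick_path(path):
    """Apply is_linkfile to a file on a brick (Linux only, run as root)."""
    st = os.lstat(path)
    try:
        os.getxattr(path, LINKTO_XATTR, follow_symlinks=False)
        has_linkto = True
    except OSError:
        # xattr missing, unreadable, or unsupported on this filesystem
        has_linkto = False
    return is_linkfile(st.st_mode, st.st_size, has_linkto)
```

The zero-byte files listed above have mode -rw-r--r--, so is_linkfile returns False for them on the mode test alone, whether or not the xattr is present.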

Some of those files are empty on the client side, some aren't. But since those aren't my files, I can't tell for sure whether they are supposed to look
empty. The empty ones report a file size of 0 (du -h $FILE) on the client side, but they do have a size (and content) on the server side in their original
subvolume, so I'm guessing they shouldn't be empty :(

I stopped the remove-brick operation because this looked weird. Is this supposed to happen? Or is the rebalance screwing up when trying to move things
to a brick that's already full?
I'm removing the subvolume disperse-12. Is it intended that data from subvol disperse-0 is being moved?
Should I open a bug report? 
And, most importantly, are those weird non-linkfile-but-empty-files going to be a problem and if yes, how do I get rid of them safely? Can I restore
the content of those files that are currently shown as empty?


Thanks in advance and kind regards,

Gudrun Amedick

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
