I'm running 3.3b3 on a 5-brick Ubuntu 10.04.4 system with mixed IPoIB/GbE. It's behaving well apart from the current problem. The gluster filesystem is live and lightly used by our cluster. Note that the gli volume has two bricks on pbs2ib; I'm trying to clear the smaller one in preparation for replacing its disks with larger ones.

=====
root at pbs1:/var/log/glusterfs# gluster volume info

Volume Name: gli
Type: Distribute
Volume ID: 76cc5e88-0ac4-42ac-a4a3-31bf2ba611d4
Status: Started
Number of Bricks: 5
Transport-type: tcp,rdma
Bricks:
Brick1: pbs1ib:/bducgl
Brick2: pbs2ib:/bducgl     <--- to remain
Brick3: pbs2ib:/bducgl1    <--- to be removed
Brick4: pbs3ib:/bducgl
Brick5: pbs4ib:/bducgl
Options Reconfigured:
performance.io-cache: on
performance.quick-read: on
performance.io-thread-count: 64
=====

'df' reports the brick (on /bducgl1) as using 1265060072 KB:

=====
root at pbs2:~# df
Filesystem     1K-blocks        Used   Available  Use%  Mounted on
/dev/sdb1       28959736    15104536    12384124   55%  /
..
/dev/md0      8788707776  1524178616  7264529160   18%  /bducgl
/dev/sda      1952129740  1265060072   687069668   65%  /bducgl1
                          ^^^^^^^^^^
=====

(Incidentally, this 'Used' figure of 1265060072 does not change as files are removed - i.e., files that the log says have been migrated are no longer visible on the brick filesystem, yet df stays the same - is this expected?)

However, the remove-brick operation has been running for about a day and reports having moved 1,369,285,939,442 bytes:

=====
root at pbs1: # gluster volume remove-brick gli pbs2ib:/bducgl1 status
     Node   Rebalanced-files           size     scanned       status
---------   ----------------   ------------   ---------  -----------
localhost                 90         189616       87639  not started
   pbs4ib                  0              0           0  not started
   pbs3ib                  0              0           0  not started
   pbs2ib             861704  1369285939442     2941430  in progress
                              ^^^^^^^^^^^^^
=====

This is more than even 1265060072*1024 = 1.29542151373e+12 bytes, so I'm wondering when/if this process is going to end...?
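As an aside on estimating completion: one rough approach (not gluster tooling, just a sketch) is to take the 'size' column from two status runs, divide the delta by the interval to get a byte rate, and extrapolate over the bytes remaining per df. The figures below are the two samples from this note (the second assumes the run-together status columns parse as 878009 files and 1,379,699,182,236 bytes); notably, with these numbers the reported size already exceeds the brick's df usage, so the extrapolation bottoms out at zero - which is exactly the inconsistency described here.

```python
# Rough ETA for a remove-brick migration, by linear extrapolation
# from two samples of the 'size' column in `gluster volume
# remove-brick ... status` output. Sample values are from this note.

def eta_seconds(size_t0, size_t1, interval_s, total_bytes):
    """Remaining bytes divided by the observed migration rate."""
    rate = (size_t1 - size_t0) / interval_s  # bytes per second
    if rate <= 0:
        raise ValueError("no progress observed between samples")
    remaining = total_bytes - size_t1
    return max(remaining, 0) / rate

# Two status samples roughly 10 minutes (600 s) apart:
t0 = 1369285939442          # bytes moved at first sample
t1 = 1379699182236          # bytes moved ~10 min later (assumed parse)
total = 1265060072 * 1024   # df 'Used' KB for /bducgl1, in bytes

secs = eta_seconds(t0, t1, 600, total)
print(f"rate ~ {(t1 - t0) / 600 / 1e6:.1f} MB/s, ETA ~ {secs / 3600:.1f} h")
```

With these inputs the rate works out to roughly 17 MB/s, which is at least plausible for a ~1 file/sec migration of mixed file sizes; the zero ETA just restates that the counters don't add up.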
If I examine the 'gli-rebalance.log', I am still getting log entries like this (at about 1/sec - I would have expected considerably faster):

[2012-05-21 14:27:31.629995] I [dht-rebalance.c:854:dht_migrate_file] 0-gli-dht: completed migration of /alamng/Research/Scheraga/F8i1m/set15/Fig8_int1_template_Hamil_set15.dat from subvolume gli-client-2 to gli-client-1

so migration does appear to be happening, and the numbers change across repeated 'status' runs. But why is the gluster byte count so different from the 'df' figure? And is there any way to get an idea of when the process will end? The 'scanned' column is also still increasing, so it's evidently not the total number of files to be moved.

After writing most of this note, this is the status about 10 minutes later:

# gluster volume remove-brick gli pbs2ib:/bducgl1 status
     Node   Rebalanced-files           size     scanned       status
---------   ----------------   ------------   ---------  -----------
localhost                 90         189616       87639  not started
   pbs4ib                  0              0           0  not started
   pbs3ib                  0              0           0  not started
   pbs2ib             878009  1379699182236     2994733  in progress

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--