'remove-brick' is removing more bytes than are in the brick(?)

Installed the qa42 version on servers and clients and, under load, it 
worked as advertised (though of course more slowly than I would have 
liked :)) - it removed ~1TB in just under 24 hr on a DDR-IB-connected 
4-node set, ~40MB/s overall, though there were a huge number of tiny files.

The remove-brick cleared the brick (~1TB), though with an initial set of 
120 failures (what do these mean?)
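
For reference, the sequence was roughly the standard 3.3 remove-brick 
workflow (volume 'gli', brick 'pbs2ib:/bducgl1', as shown in the df 
output further down):

$ gluster volume remove-brick gli pbs2ib:/bducgl1 start
$ gluster volume remove-brick gli pbs2ib:/bducgl1 status   # polled periodically; output below
$ gluster volume remove-brick gli pbs2ib:/bducgl1 commit   # only after status shows 'completed'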

  Node  Rebalanced-files          size   scanned  failures       status        timestamp
------  ----------------  ------------  --------  --------  -----------  ---------------
pbs2ib             15676   69728541188    365886       120  in progress  May 22 17:33:14
pbs2ib             24844  134323243354    449667       120  in progress  May 22 18:08:56
pbs2ib             37937  166673066147    714175       120  in progress  May 22 19:08:21
pbs2ib             42014  173145657374    806556       120  in progress  May 22 19:33:21
pbs2ib            418842  222883965887   5729324       120  in progress  May 23 07:15:19
pbs2ib            419148  222907742889   5730903       120  in progress  May 23 07:16:26
pbs2ib            507375  266212060954   6192573       120  in progress  May 23 09:48:05
pbs2ib            540201  312712114570   6325234       120  in progress  May 23 11:15:51
pbs2ib            630332  416533679754   6633562       120  in progress  May 23 14:24:16
pbs2ib            644156  416745820627   6681746       120  in progress  May 23 14:45:44
pbs2ib            732989  432162450646   7024331       120    completed  May 23 17:26:20
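
FWIW, the per-file errors behind that 'failures' column should show up 
in the rebalance log on the node doing the migration - the path below 
is the usual 3.x default, so adjust if your build logs elsewhere:

$ grep -i 'failed' /var/log/glusterfs/gli-rebalance.log | tail -20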


and finally deleted the files:
root@pbs2:~
404 $ df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/md0    8.2T 1010G  7.2T  13% /bducgl   <- retained brick
/dev/sda    1.9T  384M  1.9T   1% /bducgl1  <- removed brick

although it left the directory skeleton behind (is this a bug or a feature?):

root@pbs2:/bducgl1
406 $ ls
aajames   aelsadek  amentes   avuong1   btatevos  chiaoyic  dbecerra
aamelire  aganesan  anasr     awaring   bvillac   clarkap   dbkeator
aaskariz  agkent    anhml     balakire  calvinjs  cmarcum   dcs     
abanaiya  agold     argardne  bgajare   casem     cmarkega  dcuccia 
aboessen  ahnsh     arup      biggsj    cbatmall  courtnem  detwiler
abondar   aihler    asidhwa   binz      cesar     crex      dgorur  
abraatz   aisenber  asuncion  bjanakal  cestark   cschen    dhealion
abriscoe  akatha    atenner   blind     cfalvo    culverr   dkyu    
abusch    alai2     atfrank   blutes    cgalasso  daliz     dmsmith 
acohan    alamng    athina    bmmiller  cgarner   daniel    dmvuong 
acstern   allisons  athsu     bmobashe  chadwicr  dariusa   dphillip
ademirta  almquist  aveidlab  brentm    changd1   dasher    dshanthi
<etc>
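
A quick way to confirm it really is just a skeleton (directories only, 
no regular files):

$ find /bducgl1 -type f | wc -l    # expect 0 regular files
$ find /bducgl1 -type d | wc -l    # the leftover directory count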

And once the removal was finalized with the 'commit' command, 'volume 
info' no longer reports the brick as part of the volume:
$ gluster volume info gli
Volume Name: gli
Type: Distribute
Volume ID: 76cc5e88-0ac4-42ac-a4a3-31bf2ba611d4
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: pbs1ib:/bducgl
Brick2: pbs2ib:/bducgl
                       <-- no more pbs2ib:/bducgl1
Brick3: pbs3ib:/bducgl
Brick4: pbs4ib:/bducgl
Options Reconfigured:
performance.io-cache: on
performance.quick-read: on
performance.io-thread-count: 64

And 'volume status' likewise no longer reports the removed brick as part of the volume:
$ gluster volume status
Status of volume: gli
Gluster process          Port    Online  Pid
-----------------------------------------------
Brick pbs1ib:/bducgl     24016   Y       10770
Brick pbs2ib:/bducgl     24025   Y       1788
Brick pbs3ib:/bducgl     24018   Y       20953
Brick pbs4ib:/bducgl     24009   Y       20948
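
As a final sanity check (assuming the usual glusterfsd behavior of 
carrying the brick path on its command line), nothing should still be 
serving the removed brick:

$ ps ax | grep '[g]lusterfsd' | grep bducgl1   # expect no output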


So this was a big improvement over the previous trial. The only 
glitches were the 120 failures (which mean...?) and the directory 
skeleton left on the removed brick, which may be a feature..?

So the original problem (remove-brick reporting more bytes than the brick held) seems to have been fixed in qa42.

thanks!

hjm




On Tuesday 22 May 2012 00:02:02 Amar Tumballi wrote:
> > pbs2ib 8780091379699182236 2994733 in progress
> 
> Hi Harry,
> 
> Can you please test once again with 'glusterfs-3.3.0qa42' and
> confirm the behavior? This seems like a bug (suspect it to be some
> overflow type of bug, not sure yet). Please help us by opening a
> bug report; meanwhile, we will investigate this issue.
> 
> Regards,
> Amar

-- 
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697  Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)

