As a final(?) follow-up to my problem: after restarting the rebalance with

   gluster volume rebalance [vol-name] fix-layout start

it finished up last night after plowing through the entirety of the filesystem, fixing ~1M files (apparently ~2.2TB), all while the fs remained live (though probably a bit slower than users would have liked). That's a strong '+' in the gluster column for resiliency.

I started the rebalance without waiting for any advice to the contrary. 3.3 is supposed to have a built-in rebalance operator, but I saw no evidence of it, and other info from gluster.org suggested that running it by hand wouldn't do any harm, so I went ahead and started it.

Do the gluster wizards have any final words on this before I write it up in our trouble report? (The bare command sequence I used is appended at the end of this message, for the record.)

best wishes
harry

On Thu, Aug 2, 2012 at 4:37 PM, Harry Mangalam <hjmangalam at gmail.com> wrote:
> Further to what I wrote before:
> gluster server overload; recovers, now "Transport endpoint is not
> connected" for some files
> <http://goo.gl/CN6ud>
>
> I'm getting conflicting info here. On one hand, the peer that had its
> glusterfsd lock up appears to be back in the gluster system, according
> to the frequently referenced 'gluster peer status':
>
> Thu Aug 02 15:48:46 [1.00 0.89 0.92] root@pbs1:~
> 729 $ gluster peer status
> Number of Peers: 3
>
> Hostname: pbs4ib
> Uuid: 2a593581-bf45-446c-8f7c-212c53297803
> State: Peer in Cluster (Connected)
>
> Hostname: pbs2ib
> Uuid: 26de63bd-c5b7-48ba-b81d-5d77a533d077
> State: Peer in Cluster (Connected)
>
> Hostname: pbs3ib
> Uuid: c79c4084-d6b9-4af9-b975-40dd6aa99b42
> State: Peer in Cluster (Connected)
>
> On the other hand, there are the errors I reported yesterday:
> ===================================================
> [2012-08-01 18:07:26.104910] W
> [dht-selfheal.c:875:dht_selfheal_directory] 0-gli-dht: 1 subvolumes
> down -- not fixing
> ===================================================
>
> as well as this information:
> $ gluster volume status all detail
>
> [top 2 brick stanzas trimmed; they're online]
>
> ------------------------------------------------------------------------------
> Brick : Brick pbs3ib:/bducgl
> Port : 24018
> Online : N    <<=====================
> Pid : 20953
> File System : xfs
> Device : /dev/md127
> Mount Options : rw
> Inode Size : 256
> Disk Space Free : 6.1TB
> Total Disk Space : 8.2TB
> Inode Count : 1758158080
> Free Inodes : 1752326373
>
> ------------------------------------------------------------------------------
> Brick : Brick pbs4ib:/bducgl
> Port : 24009
> Online : Y
> Pid : 20948
> File System : xfs
> Device : /dev/sda
> Mount Options : rw
> Inode Size : 256
> Disk Space Free : 4.6TB
> Total Disk Space : 6.4TB
> Inode Count : 1367187392
> Free Inodes : 1361305613
>
> The above implies fairly strongly that the pbs3ib brick never re-established
> its connection to the volume, although the peer status says it rejoined the
> cluster.
>
> Strangely enough, when I RE-restarted glusterd on that node, the brick DID
> come back and rejoin the gluster volume, and now the (restarted) fix-layout
> job is proceeding without those "subvolumes down -- not fixing" errors, just
> a steady stream of 'found anomalies / fixing the layout' messages, though at
> the rate it's going it looks like it will take several days.
>
> Still, better to spend several days fixing the data on-disk with the fs live
> than to have to tell users that their data is gone and then rebuild from
> zero. Luckily, it's officially a /scratch filesystem.
>
> Harry

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
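PS, for the trouble report: the bare command sequence, boiled down from the above. This is a sketch, not a recipe; [vol-name] is a placeholder for your volume name, and the glusterd restart line is just the usual sysvinit form, so adjust it to whatever your distro's init system expects.

   # 1. Check that all peers are connected (in my case they were,
   #    even while the brick was effectively dead)
   gluster peer status

   # 2. Check the bricks themselves; this is what actually showed the
   #    problem (look for "Online : N" in the per-brick stanzas)
   gluster volume status all detail

   # 3. On the server with the offline brick, restart the gluster daemon
   #    (command varies by init system; this is the sysvinit form)
   service glusterd restart

   # 4. Once all bricks show "Online : Y", start the layout fix and
   #    check on its progress
   gluster volume rebalance [vol-name] fix-layout start
   gluster volume rebalance [vol-name] status

While it runs, the rebalance log under /var/log/glusterfs/ (on my systems it's named after the volume, something like [vol-name]-rebalance.log) is where the 'found anomalies / fixing the layout' messages show up.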