2016-08-11 13:08 GMT+02:00 Lindsay Mathieson <lindsay.mathieson@xxxxxxxxx>:
> Also "gluster volume status" lists the pid's of all the bricks processes.

Ok, let's break everything, just to try. This is a working cluster.
I have 3 servers with 1 brick each, in replica 3, so all files are
replicated on all hosts.

# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 2a36dc0f-1d9b-469c-82de-9d8d98321b83
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 1.2.3.112:/export/sdb1/brick
Brick2: 1.2.3.113:/export/sdb1/brick
Brick3: 1.2.3.114:/export/sdb1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.shard: off
features.shard-block-size: 10MB
performance.write-behind-window-size: 1GB
performance.cache-size: 1GB

I did this on a client:

# echo 'hello world' > hello
# md5sum hello
6f5902ac237024bdd0c176cb93063dc4  hello

Obviously, on node 1.2.3.112 I have it:

# cat /export/sdb1/brick/hello
hello world
# md5sum /export/sdb1/brick/hello
6f5902ac237024bdd0c176cb93063dc4  /export/sdb1/brick/hello

Let's break everything, this is fun. I take the brick pid from here:

# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick          49152     0          Y       14315

# kill -9 14315

# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick          N/A       N/A        N       N/A

This should be like a degraded cluster, right?

Now I add a new file from the client:

echo "hello world, i'm degraded" > degraded

Obviously, this file is not replicated on node 1.2.3.112:

# gluster volume heal gv0 info
Brick 1.2.3.112:/export/sdb1/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick 1.2.3.113:/export/sdb1/brick
/degraded
/
Status: Connected
Number of entries: 2

Brick 1.2.3.114:/export/sdb1/brick
/degraded
/
Status: Connected
Number of entries: 2

This means that the "/" dir and the "/degraded" file should be healed from
.113 and .114?

Let's format the disk on .112:

# umount /dev/sdb1
# mkfs.xfs /dev/sdb1 -f
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=122094597 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=488378385, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=238466, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Now I mount it again in the old place:

# mount /dev/sdb1 /export/sdb1

It's empty:

# ls /export/sdb1/ -la
total 4
drwxr-xr-x 2 root root    6 Aug 11 15:37 .
drwxr-xr-x 3 root root 4096 Jul  5 17:03 ..

I create the "brick" directory used by gluster:

# mkdir /export/sdb1/brick

Now I run the volume start force:

# gluster volume start gv0 force
volume start: gv0: success

But the brick process is still down:

# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick          N/A       N/A        N       N/A

And now?

What I really don't like is the use of "force" in "gluster volume start".
Usually (in all software) force is used when "bad things" are needed.
In this case the volume start is mandatory, so why do I have to use force?
If the volume is already started, gluster should be smart enough to start
only the missing processes, without force. Or, better, another command
should be created, something like "gluster bricks start". Using force means
running a dangerous operation, not a common administrative task.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
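
P.S. The "is the brick online?" check I keep repeating with grep can be
scripted. This is only a rough sketch built on the "gluster volume status"
output format shown in this mail (Online column "Y"/"N", PID in the last
column); the script name is made up and it is not part of the gluster CLI:

#!/bin/bash
# check-bricks.sh -- hypothetical helper, prints the bricks that
# "gluster volume status" reports as offline for the given volume.
VOL=${1:-gv0}
# Brick lines look like:
#   Brick 1.2.3.112:/export/sdb1/brick  N/A  N/A  N  N/A
# so the next-to-last field is the Online flag and $2 is the brick path.
gluster volume status "$VOL" | awk '/^Brick/ && $(NF-1) == "N" { print $2 }'

If this prints anything, the only thing the CLI offers today is the same
"gluster volume start gv0 force" shown above, which in my case did not even
bring the reformatted brick back online. That is exactly the behaviour I am
questioning.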