Re: [glusterfs-3.6.0beta3-0.11.gitd01b00a] gluster volume status is running even though the Disk is detached

I set the ZFS pool failmode property to continue, which should fail only writes and not reads, as explained below:

failmode=wait | continue | panic

           Controls the system behavior in the event of catastrophic pool failure. This condition is typically a result of a loss of connectivity to the underlying storage device(s) or a failure of all devices within the pool. The behavior of such an event is determined as follows:

           wait        Blocks all I/O access until the device connectivity is recovered and the errors are cleared. This is the default behavior.

           continue    Returns EIO to any new write I/O requests but allows reads to any of the remaining healthy devices. Any write requests that have yet to be committed to disk would be blocked.

           panic       Prints out a message to the console and generates a system crash dump.

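For reference, this is roughly how the property can be set and checked on the affected pool (zp2 in my case; adjust the pool name as needed):

# zpool set failmode=continue zp2
# zpool get failmode zp2

The second command should report the value "continue" with source "local".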

Now I rebuilt GlusterFS from master and tested whether a failed drive results in a failed brick (and, in turn, the brick process being killed), but the brick is not going offline.

# gluster volume status
Status of volume: repvol
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.1.246:/zp1/brick1                 49152   Y       2400
Brick 192.168.1.246:/zp2/brick2                 49153   Y       2407
NFS Server on localhost                         2049    Y       30488
Self-heal Daemon on localhost                   N/A     Y       30495
 
Task Status of Volume repvol
------------------------------------------------------------------------------
There are no active volume tasks
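
For reference, the health-check interval had been set to 30 seconds earlier with something along these lines:

# gluster volume set repvol storage.health-check-interval 30
# gluster volume info repvol | grep health-check-interval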


The /var/log/gluster/mnt.log output:

[2014-10-31 09:18:15.934700] W [rpc-clnt-ping.c:154:rpc_clnt_ping_cbk] 0-repvol-client-1: socket disconnected
[2014-10-31 09:18:15.934725] I [client.c:2215:client_rpc_notify] 0-repvol-client-1: disconnected from repvol-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2014-10-31 09:18:15.935238] I [rpc-clnt.c:1765:rpc_clnt_reconfig] 0-repvol-client-1: changing port to 49153 (from 0)

Now, if I copy a file to /mnt, it copies without any hang, and the brick still shows as online.
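
To double-check, the health-check timestamp file can also be inspected; assuming it is readable now that the pool no longer suspends on reads, its modification time should advance every storage.health-check-interval seconds while the brick process is healthy:

# cat /zp2/brick2/.glusterfs/health_check
# stat -c %y /zp2/brick2/.glusterfs/health_check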

Thanks,
Kiran.

On Tue, Oct 28, 2014 at 3:44 PM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
On Tue, Oct 28, 2014 at 02:08:32PM +0530, Kiran Patil wrote:
> The content of file zp2-brick2.log is at http://ur1.ca/iku0l (
> http://fpaste.org/145714/44849041/ )
>
> I can't open the file /zp2/brick2/.glusterfs/health_check since it hangs
> due to no disk present.
>
> Let me know the filename pattern, so that I can find it.

Hmm, if there is a hang while reading from the disk, it will not get
detected in the current solution. We implemented failure detection on
top of the detection that is done by the filesystem. Suspending a
filesystem with fsfreeze or similar should probably not be seen as a
failure.

In your case, it seems that the filesystem suspended itself when the disk
went away. I have no idea whether it is possible to configure ZFS to not
suspend, but to return an error to the reading/writing application
instead. Please check whether such an option exists.
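
One way to verify the behaviour directly on the brick filesystem could be something like the following (the file names are just examples, use any path on the affected brick). If the commands fail quickly with an I/O error, the health-check has a chance to catch it; if they only finish because of the timeout, the filesystem is still hanging:

# timeout 10 dd if=/zp2/brick2/somefile of=/dev/null bs=4k count=1
# timeout 10 dd if=/dev/zero of=/zp2/brick2/healthtest bs=4k count=1 conv=fsync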

If you find such an option, please update the wiki page and recommend
enabling it:
- http://gluster.org/community/documentation/index.php/GlusterOnZFS


Thanks,
Niels


>
> On Tue, Oct 28, 2014 at 1:42 PM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
>
> > On Tue, Oct 28, 2014 at 01:10:56PM +0530, Kiran Patil wrote:
> > > I applied the patches, compiled and installed the gluster.
> > >
> > > # glusterfs --version
> > > glusterfs 3.7dev built on Oct 28 2014 12:03:10
> > > Repository revision: git://git.gluster.com/glusterfs.git
> > > Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
> > > GlusterFS comes with ABSOLUTELY NO WARRANTY.
> > > It is licensed to you under your choice of the GNU Lesser
> > > General Public License, version 3 or any later version (LGPLv3
> > > or later), or the GNU General Public License, version 2 (GPLv2),
> > > in all cases as published by the Free Software Foundation.
> > >
> > > # git log
> > > commit 990ce16151c3af17e4cdaa94608b737940b60e4d
> > > Author: Lalatendu Mohanty <lmohanty@xxxxxxxxxx>
> > > Date:   Tue Jul 1 07:52:27 2014 -0400
> > >
> > >     Posix: Brick failure detection fix for ext4 filesystem
> > > ...
> > > ...
> > >
> > > I see below messages
> >
> > Many thanks Kiran!
> >
> > Do you have the messages from the brick that uses the zp2 mountpoint?
> >
> > There also should be a file with a timestamp when the last check was
> > done successfully. If the brick is still running, this timestamp should
> > get updated every storage.health-check-interval seconds:
> >     /zp2/brick2/.glusterfs/health_check
> >
> > Niels
> >
> > >
> > > File /var/log/glusterfs/etc-glusterfs-glusterd.vol.log :
> > >
> > > The message "I [MSGID: 106005]
> > > [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick
> > > 192.168.1.246:/zp2/brick2 has disconnected from glusterd." repeated 39
> > > times between [2014-10-28 05:58:09.209419] and [2014-10-28
> > 06:00:06.226330]
> > > [2014-10-28 06:00:09.226507] W [socket.c:545:__socket_rwv] 0-management:
> > > readv on /var/run/6154ed2845b7f728a3acdce9d69e08ee.socket failed (Invalid
> > > argument)
> > > [2014-10-28 06:00:09.226712] I [MSGID: 106005]
> > > [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick
> > > 192.168.1.246:/zp2/brick2 has disconnected from glusterd.
> > > [2014-10-28 06:00:12.226881] W [socket.c:545:__socket_rwv] 0-management:
> > > readv on /var/run/6154ed2845b7f728a3acdce9d69e08ee.socket failed (Invalid
> > > argument)
> > > [2014-10-28 06:00:15.227249] W [socket.c:545:__socket_rwv] 0-management:
> > > readv on /var/run/6154ed2845b7f728a3acdce9d69e08ee.socket failed (Invalid
> > > argument)
> > > [2014-10-28 06:00:18.227616] W [socket.c:545:__socket_rwv] 0-management:
> > > readv on /var/run/6154ed2845b7f728a3acdce9d69e08ee.socket failed (Invalid
> > > argument)
> > > [2014-10-28 06:00:21.227976] W [socket.c:545:__socket_rwv] 0-management:
> > > readv on
> > >
> > > .....
> > > .....
> > >
> > > [2014-10-28 06:19:15.142867] I
> > > [glusterd-handler.c:1280:__glusterd_handle_cli_get_volume] 0-glusterd:
> > > Received get vol req
> > > The message "I [MSGID: 106005]
> > > [glusterd-handler.c:4142:__glusterd_brick_rpc_notify] 0-management: Brick
> > > 192.168.1.246:/zp2/brick2 has disconnected from glusterd." repeated 12
> > > times between [2014-10-28 06:18:09.368752] and [2014-10-28
> > 06:18:45.373063]
> > > [2014-10-28 06:23:38.207649] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> > > 0-: received signum (15), shutting down
> > >
> > >
> > > dmesg output:
> > >
> > > SPLError: 7869:0:(spl-err.c:67:vcmn_err()) WARNING: Pool 'zp2' has
> > > encountered an uncorrectable I/O failure and has been suspended.
> > >
> > > SPLError: 7868:0:(spl-err.c:67:vcmn_err()) WARNING: Pool 'zp2' has
> > > encountered an uncorrectable I/O failure and has been suspended.
> > >
> > > SPLError: 7869:0:(spl-err.c:67:vcmn_err()) WARNING: Pool 'zp2' has
> > > encountered an uncorrectable I/O failure and has been suspended.
> > >
> > > The brick is still online.
> > >
> > > # gluster volume status
> > > Status of volume: repvol
> > > Gluster process Port Online Pid
> > >
> > ------------------------------------------------------------------------------
> > > Brick 192.168.1.246:/zp1/brick1 49152 Y 4067
> > > Brick 192.168.1.246:/zp2/brick2 49153 Y 4078
> > > NFS Server on localhost 2049 Y 4092
> > > Self-heal Daemon on localhost N/A Y 4097
> > >
> > > Task Status of Volume repvol
> > >
> > ------------------------------------------------------------------------------
> > > There are no active volume tasks
> > >
> > > # gluster volume info
> > >
> > > Volume Name: repvol
> > > Type: Replicate
> > > Volume ID: ba1e7c6d-1e1c-45cd-8132-5f4fa4d2d22b
> > > Status: Started
> > > Number of Bricks: 1 x 2 = 2
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: 192.168.1.246:/zp1/brick1
> > > Brick2: 192.168.1.246:/zp2/brick2
> > > Options Reconfigured:
> > > storage.health-check-interval: 30
> > >
> > > Let me know if you need further information.
> > >
> > > Thanks,
> > > Kiran.
> > >
> > > On Tue, Oct 28, 2014 at 11:44 AM, Kiran Patil <kiran@xxxxxxxxxxxxx>
> > wrote:
> > >
> > > > I changed  git fetch git://review.gluster.org/glusterfs  to git fetch
> > > > http://review.gluster.org/glusterfs  and now it works.
> > > >
> > > > Thanks,
> > > > Kiran.
> > > >
> > > > On Tue, Oct 28, 2014 at 11:13 AM, Kiran Patil <kiran@xxxxxxxxxxxxx>
> > wrote:
> > > >
> > > >> Hi Niels,
> > > >>
> > > >> I am getting "fatal: Couldn't find remote ref refs/changes/13/8213/9"
> > > >> error.
> > > >>
> > > >> Steps to reproduce the issue.
> > > >>
> > > >> 1) # git clone git://review.gluster.org/glusterfs
> > > >> Initialized empty Git repository in /root/gluster-3.6/glusterfs/.git/
> > > >> remote: Counting objects: 84921, done.
> > > >> remote: Compressing objects: 100% (48307/48307), done.
> > > >> remote: Total 84921 (delta 57264), reused 63233 (delta 36254)
> > > >> Receiving objects: 100% (84921/84921), 23.23 MiB | 192 KiB/s, done.
> > > >> Resolving deltas: 100% (57264/57264), done.
> > > >>
> > > >> 2) # cd glusterfs
> > > >>     # git branch
> > > >>     * master
> > > >>
> > > >> 3) # git fetch git://review.gluster.org/glusterfs
> > refs/changes/13/8213/9
> > > >> && git checkout FETCH_HEAD
> > > >> fatal: Couldn't find remote ref refs/changes/13/8213/9
> > > >>
> > > >> Note: I also tried the above steps on git repo
> > > >> https://github.com/gluster/glusterfs and the result is same as above.
> > > >>
> > > >> Please let me know if I miss any steps.
> > > >>
> > > >> Thanks,
> > > >> Kiran.
> > > >>
> > > >> On Mon, Oct 27, 2014 at 5:53 PM, Niels de Vos <ndevos@xxxxxxxxxx>
> > wrote:
> > > >>
> > > >>> On Mon, Oct 27, 2014 at 05:19:13PM +0530, Kiran Patil wrote:
> > > >>> > Hi,
> > > >>> >
> > > >>> > I created replicated vol with two bricks on the same node and
> > copied
> > > >>> some
> > > >>> > data to it.
> > > >>> >
> > > >>> > Now removed the disk which has hosted one of the brick of the
> > volume.
> > > >>> >
> > > >>> > Storage.health-check-interval is set to 30 seconds.
> > > >>> >
> > > >>> > I could see the disk is unavailable using zpool command of zfs on
> > > >>> linux but
> > > >>> > the gluster volume status still displays the brick process running
> > > >>> which
> > > >>> > should have been shutdown by this time.
> > > >>> >
> > > >>> > Is this a bug in 3.6 since it is mentioned as feature "
> > > >>> >
> > > >>>
> > https://github.com/gluster/glusterfs/blob/release-3.6/doc/features/brick-failure-detection.md
> > > >>> "
> > > >>> >  or am I doing any mistakes here?
> > > >>>
> > > >>> The initial detection of brick failures did not work for all
> > > >>> filesystems. It may not work for ZFS too. A fix has been posted, but
> > it
> > > >>> has not been merged into the master branch yet. When the change has
> > been
> > > >>> merged, it can get backported to 3.6 and 3.5.
> > > >>>
> > > >>> You may want to test with the patch applied, and add your "+1
> > Verified"
> > > >>> to the change in case it makes it functional for you:
> > > >>> - http://review.gluster.org/8213
> > > >>>
> > > >>> Cheers,
> > > >>> Niels
> > > >>>
> > > >>> >
> > > >>> > [root@fractal-c92e gluster-3.6]# gluster volume status
> > > >>> > Status of volume: repvol
> > > >>> > Gluster process Port Online Pid
> > > >>> >
> > > >>>
> > ------------------------------------------------------------------------------
> > > >>> > Brick 192.168.1.246:/zp1/brick1 49154 Y 17671
> > > >>> > Brick 192.168.1.246:/zp2/brick2 49155 Y 17682
> > > >>> > NFS Server on localhost 2049 Y 17696
> > > >>> > Self-heal Daemon on localhost N/A Y 17701
> > > >>> >
> > > >>> > Task Status of Volume repvol
> > > >>> >
> > > >>>
> > ------------------------------------------------------------------------------
> > > >>> > There are no active volume tasks
> > > >>> >
> > > >>> >
> > > >>> > [root@fractal-c92e gluster-3.6]# gluster volume info
> > > >>> >
> > > >>> > Volume Name: repvol
> > > >>> > Type: Replicate
> > > >>> > Volume ID: d4f992b1-1393-43b8-9fda-2e2b6e3b5039
> > > >>> > Status: Started
> > > >>> > Number of Bricks: 1 x 2 = 2
> > > >>> > Transport-type: tcp
> > > >>> > Bricks:
> > > >>> > Brick1: 192.168.1.246:/zp1/brick1
> > > >>> > Brick2: 192.168.1.246:/zp2/brick2
> > > >>> > Options Reconfigured:
> > > >>> > storage.health-check-interval: 30
> > > >>> >
> > > >>> > [root@fractal-c92e gluster-3.6]# zpool status zp2
> > > >>> >   pool: zp2
> > > >>> >  state: UNAVAIL
> > > >>> > status: One or more devices are faulted in response to IO failures.
> > > >>> > action: Make sure the affected devices are connected, then run
> > 'zpool
> > > >>> > clear'.
> > > >>> >    see: http://zfsonlinux.org/msg/ZFS-8000-HC
> > > >>> >   scan: none requested
> > > >>> > config:
> > > >>> >
> > > >>> > NAME        STATE     READ WRITE CKSUM
> > > >>> > zp2         UNAVAIL      0     0     0  insufficient replicas
> > > >>> >   sdb       UNAVAIL      0     0     0
> > > >>> >
> > > >>> > errors: 2 data errors, use '-v' for a list
> > > >>> >
> > > >>> >
> > > >>> > Thanks,
> > > >>> > Kiran.
> > > >>>
> > > >>> > _______________________________________________
> > > >>> > Gluster-devel mailing list
> > > >>> > Gluster-devel@xxxxxxxxxxx
> > > >>> > http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> > > >>>
> > > >>>
> > > >>
> > > >
> >

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
