-Atin
Sent from OnePlus One
On Aug 10, 2015 11:58 PM, "Kingsley" <gluster@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> [snip]
>>
>> > stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
>>
>> > brk(0) = 0x8db000
>> > brk(0x8fc000) = 0x8fc000
>> > mkdir("test", 0777
>> Can you also collect the statedump of all the brick processes when the command is hung?
>>
>> + Ravi, could you check this?
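On the statedump request above: the CLI command for this is run further down in the thread. As a side note, a single brick process can usually be made to dump its own state directly as well; a minimal sketch (the SIGUSR1 behaviour is an assumption here, so double-check before relying on it):

    # sketch: ask one brick process for a statedump via SIGUSR1
    # run on the server hosting that brick; PID taken from "gluster volume status callrec" below
    kill -USR1 29041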
>
>
> I ran the command but I could not find where it put the output:
>
>
> [root@gluster1a-1 ~]# gluster volume statedump callrec all
> volume statedump: success
> [root@gluster1a-1 ~]# gluster volume info callrec
>
> Volume Name: callrec
> Type: Replicate
> Volume ID: a39830b7-eddb-4061-b381-39411274131a
> Status: Started
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gluster1a-1:/data/brick/callrec
> Brick2: gluster1b-1:/data/brick/callrec
> Brick3: gluster2a-1:/data/brick/callrec
> Brick4: gluster2b-1:/data/brick/callrec
> Options Reconfigured:
> performance.flush-behind: off
> [root@gluster1a-1 ~]# gluster volume status callrec
> Status of volume: callrec
> Gluster process                           Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster1a-1:/data/brick/callrec     49153   Y       29041
> Brick gluster1b-1:/data/brick/callrec     49153   Y       31260
> Brick gluster2a-1:/data/brick/callrec     49153   Y       31585
> Brick gluster2b-1:/data/brick/callrec     49153   Y       12153
> NFS Server on localhost                   2049    Y       29733
> Self-heal Daemon on localhost             N/A     Y       29741
> NFS Server on gluster1b-1                 2049    Y       31872
> Self-heal Daemon on gluster1b-1           N/A     Y       31882
> NFS Server on gluster2a-1                 2049    Y       32216
> Self-heal Daemon on gluster2a-1           N/A     Y       32226
> NFS Server on gluster2b-1                 2049    Y       12752
> Self-heal Daemon on gluster2b-1           N/A     Y       12762
>
> Task Status of Volume callrec
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> [root@gluster1a-1 ~]# ls -l /tmp
> total 144
> drwx------. 3 root root 16 Aug 8 22:20 systemd-private-Dp10Pz
> -rw-------. 1 root root 5818 Jul 31 06:39 yum_save_tx.2015-07-31.06-39.JCvHd5.yumtx
> -rw-------. 1 root root 5818 Aug 1 06:58 yum_save_tx.2015-08-01.06-58.wBytr2.yumtx
> -rw-------. 1 root root 5818 Aug 2 05:18 yum_save_tx.2015-08-02.05-18.AXIFSe.yumtx
> -rw-------. 1 root root 5818 Aug 3 07:15 yum_save_tx.2015-08-03.07-15.EDd8rg.yumtx
> -rw-------. 1 root root 5818 Aug 4 03:48 yum_save_tx.2015-08-04.03-48.XE513B.yumtx
> -rw-------. 1 root root 5818 Aug 5 09:03 yum_save_tx.2015-08-05.09-03.mX8xXF.yumtx
> -rw-------. 1 root root 28869 Aug 6 06:39 yum_save_tx.2015-08-06.06-39.166wJX.yumtx
> -rw-------. 1 root root 28869 Aug 7 07:20 yum_save_tx.2015-08-07.07-20.rLqJnT.yumtx
> -rw-------. 1 root root 28869 Aug 8 08:29 yum_save_tx.2015-08-08.08-29.KKaite.yumtx
> [root@gluster1a-1 ~]#
>
>
> Where should I find the output of the statedump command?
It should be there in the /var/run/gluster folder.
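Each server writes the dumps for its own processes locally, so check there on each brick server. A quick way to spot them (the exact file names are an assumption, but they normally contain the brick path, the PID and a timestamp):

    # run on each brick server; statedump files land under /var/run/gluster
    ls -lrt /var/run/gluster/*.dump.*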
>
> Cheers,
> Kingsley.
>
>
>> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >> >
>> >> >> > Then ... do I need to run something on one of the bricks while strace is
>> >> >> > running?
>> >> >> >
>> >> >> > Cheers,
>> >> >> > Kingsley.
>> >> >> >
>> >> >> >
>> >> >> > > >
>> >> >> > > > [root@gluster1b-1 ~]# gluster volume heal callrec info
>> >> >> > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
>> >> >> > > > /recordings/834723/14391 - Possibly undergoing heal
>> >> >> > > >
>> >> >> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
>> >> >> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
>> >> >> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
>> >> >> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
>> >> >> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
>> >> >> > > > Number of entries: 7
>> >> >> > > >
>> >> >> > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > Number of entries: 0
>> >> >> > > >
>> >> >> > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > <gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
>> >> >> > > > <gfid:164f888f-2049-49e6-ad26-c758ee091863>
>> >> >> > > > <gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
>> >> >> > > > <gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
>> >> >> > > > /recordings/834723/14391 - Possibly undergoing heal
>> >> >> > > >
>> >> >> > > > <gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
>> >> >> > > > <gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
>> >> >> > > > Number of entries: 7
>> >> >> > > >
>> >> >> > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > Number of entries: 0
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > If I query each brick directly for the number of files/directories
>> >> >> > > > within that, I get 1731 on gluster1a-1 and gluster2a-1, but 1737 on the
>> >> >> > > > other two, using this command:
>> >> >> > > >
>> >> >> > > > # find /data/brick/callrec/recordings/834723/14391 -print | wc -l
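To compare all four bricks in one go, something like the sketch below could be run from any machine that can ssh to the brick servers (host names reused from this thread; passwordless ssh is an assumption):

    # hypothetical sketch: count entries under the same directory on every brick
    for h in gluster1a-1 gluster1b-1 gluster2a-1 gluster2b-1; do
        echo -n "$h: "
        ssh "$h" 'find /data/brick/callrec/recordings/834723/14391 -print | wc -l'
    done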
>> >> >> > > >
>> >> >> > > > Cheers,
>> >> >> > > > Kingsley.
>> >> >> > > >
>> >> >> > > > On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:
>> >> >> > > > > Sorry for the blind panic - restarting the volume seems to have
>> >> >> > > > > fixed it.
>> >> >> > > > >
>> >> >> > > > > But then my next question - why is this necessary? Surely it
>> >> >> > > > > undermines the whole point of a high availability system?
>> >> >> > > > >
>> >> >> > > > > Cheers,
>> >> >> > > > > Kingsley.
>> >> >> > > > >
>> >> >> > > > > On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
>> >> >> > > > > > Hi,
>> >> >> > > > > >
>> >> >> > > > > > We have a 4-way replicated volume using gluster 3.6.3 on CentOS 7.
>> >> >> > > > > >
>> >> >> > > > > > Over the weekend I did a yum update on each of the bricks in turn, but
>> >> >> > > > > > now when clients (using fuse mounts) try to access the volume, it hangs.
>> >> >> > > > > > Gluster itself wasn't updated (we've disabled that repo so that we keep
>> >> >> > > > > > to 3.6.3 for now).
>> >> >> > > > > >
>> >> >> > > > > > This was what I did:
>> >> >> > > > > >
>> >> >> > > > > > * on first brick, "yum update"
>> >> >> > > > > > * reboot brick
>> >> >> > > > > > * watch "gluster volume status" on another brick and wait for it
>> >> >> > > > > >   to say all 4 bricks are online before proceeding to update the
>> >> >> > > > > >   next brick
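That last "wait until all bricks are online" step can be scripted; a minimal sketch, assuming the column layout of "gluster volume status" shown elsewhere in this thread (Online is the second-to-last column):

    # block until all four bricks of the volume report Online = Y
    until [ "$(gluster volume status callrec | awk '/^Brick / {print $(NF-1)}' | grep -c Y)" -eq 4 ]; do
        sleep 10
    done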
>> >> >> > > > > >
>> >> >> > > > > > I was expecting the clients might pause 30 seconds while they notice a
>> >> >> > > > > > brick is offline, but then recover.
>> >> >> > > > > >
>> >> >> > > > > > I've tried re-mounting clients, but that hasn't helped.
>> >> >> > > > > >
>> >> >> > > > > > I can't see much data in any of the log files.
>> >> >> > > > > >
>> >> >> > > > > > I've tried "gluster volume heal callrec" but it doesn't seem to have
>> >> >> > > > > > helped.
>> >> >> > > > > >
>> >> >> > > > > > What shall I do next?
>> >> >> > > > > >
>> >> >> > > > > > I've pasted some stuff below in case any of it helps.
>> >> >> > > > > >
>> >> >> > > > > > Cheers,
>> >> >> > > > > > Kingsley.
>> >> >> > > > > >
>> >> >> > > > > > [root@gluster1b-1 ~]# gluster volume info callrec
>> >> >> > > > > >
>> >> >> > > > > > Volume Name: callrec
>> >> >> > > > > > Type: Replicate
>> >> >> > > > > > Volume ID: a39830b7-eddb-4061-b381-39411274131a
>> >> >> > > > > > Status: Started
>> >> >> > > > > > Number of Bricks: 1 x 4 = 4
>> >> >> > > > > > Transport-type: tcp
>> >> >> > > > > > Bricks:
>> >> >> > > > > > Brick1: gluster1a-1:/data/brick/callrec
>> >> >> > > > > > Brick2: gluster1b-1:/data/brick/callrec
>> >> >> > > > > > Brick3: gluster2a-1:/data/brick/callrec
>> >> >> > > > > > Brick4: gluster2b-1:/data/brick/callrec
>> >> >> > > > > > Options Reconfigured:
>> >> >> > > > > > performance.flush-behind: off
>> >> >> > > > > > [root@gluster1b-1 ~]#
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > > > [root@gluster1b-1 ~]# gluster volume status callrec
>> >> >> > > > > > Status of volume: callrec
>> >> >> > > > > > Gluster process                           Port    Online  Pid
>> >> >> > > > > > ------------------------------------------------------------------------------
>> >> >> > > > > > Brick gluster1a-1:/data/brick/callrec     49153   Y       6803
>> >> >> > > > > > Brick gluster1b-1:/data/brick/callrec     49153   Y       2614
>> >> >> > > > > > Brick gluster2a-1:/data/brick/callrec     49153   Y       2645
>> >> >> > > > > > Brick gluster2b-1:/data/brick/callrec     49153   Y       4325
>> >> >> > > > > > NFS Server on localhost                   2049    Y       2769
>> >> >> > > > > > Self-heal Daemon on localhost             N/A     Y       2789
>> >> >> > > > > > NFS Server on gluster2a-1                 2049    Y       2857
>> >> >> > > > > > Self-heal Daemon on gluster2a-1           N/A     Y       2814
>> >> >> > > > > > NFS Server on 88.151.41.100               2049    Y       6833
>> >> >> > > > > > Self-heal Daemon on 88.151.41.100         N/A     Y       6824
>> >> >> > > > > > NFS Server on gluster2b-1                 2049    Y       4428
>> >> >> > > > > > Self-heal Daemon on gluster2b-1           N/A     Y       4387
>> >> >> > > > > >
>> >> >> > > > > > Task Status of Volume callrec
>> >> >> > > > > > ------------------------------------------------------------------------------
>> >> >> > > > > > There are no active volume tasks
>> >> >> > > > > >
>> >> >> > > > > > [root@gluster1b-1 ~]#
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > > > [root@gluster1b-1 ~]# gluster volume heal callrec info
>> >> >> > > > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > > > /to_process - Possibly undergoing heal
>> >> >> > > > > >
>> >> >> > > > > > Number of entries: 1
>> >> >> > > > > >
>> >> >> > > > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > > > Number of entries: 0
>> >> >> > > > > >
>> >> >> > > > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > > > /to_process - Possibly undergoing heal
>> >> >> > > > > >
>> >> >> > > > > > Number of entries: 1
>> >> >> > > > > >
>> >> >> > > > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
>> >> >> > > > > > Number of entries: 0
>> >> >> > > > > >
>> >> >> > > > > > [root@gluster1b-1 ~]#
>> >> >> > > > > >
>> >> >> > > > > >
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users