Re: [Bugs] Bricks are going offline unable to recover with heal/start force commands

Shaik Salam <shaik.salam@xxxxxxx> · Tue, 22 Jan 2019 12:46:31 +0530

Hi Surya,

This is already a customer setup, so we can't
redeploy it.

I enabled debug logging at the brick level, but
nothing is being written to the log.

Can you tell me whether there are any other ways to
troubleshoot, or other logs to look at?

From: Shaik Salam/HYD/TCS
To: "Amar Tumballi Suryanarayan" <atumball@xxxxxxxxxx>
Cc: "gluster-users@xxxxxxxxxxx List" <gluster-users@xxxxxxxxxxx>
Date: 01/22/2019 12:06 PM
Subject: Re: [Bugs] Bricks are going offline unable to recover with heal/start force commands

Hi Surya,

I have enabled DEBUG mode at the brick
level, but nothing is being written to the brick log.

gluster volume set vol_3442e86b6d994a14de73f1b8c82cf0b8 diagnostics.brick-log-level DEBUG

sh-4.2# pwd
/var/log/glusterfs/bricks
sh-4.2# ls -la |grep brick_e15c12cceae12c8ab7782dd57cf5b6c1
-rw-------. 1 root root 0 Jan 20 02:46 var-lib-heketi-mounts-vg_d5f17487744584e3652d3ca943b0b91b-brick_e15c12cceae12c8ab7782dd57cf5b6c1-brick.log
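
The zero-byte log above can be spotted across all bricks with a short script. This is only a minimal sketch: the function name is hypothetical, and it assumes the brick logs are `*.log` files in /var/log/glusterfs/bricks as in the listing above.

```python
from pathlib import Path


def empty_brick_logs(log_dir="/var/log/glusterfs/bricks"):
    """Return the names of *.log files in log_dir that are zero bytes long.

    The default directory is the one shown in the listing above; the
    function name is illustrative, not part of any gluster tooling.
    """
    return sorted(
        p.name
        for p in Path(log_dir).glob("*.log")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Print one empty log file name per line (nothing if none are empty).
    for name in empty_brick_logs():
        print(name)
```

An empty brick log only says that the brick process has written nothing to that file; it does not by itself explain why.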

BR

Salam

From: "Amar Tumballi Suryanarayan" <atumball@xxxxxxxxxx>
To: "Shaik Salam" <shaik.salam@xxxxxxx>
Cc: "gluster-users@xxxxxxxxxxx List" <gluster-users@xxxxxxxxxxx>
Date: 01/22/2019 11:38 AM
Subject: Re: [Bugs] Bricks are going offline unable to recover with heal/start force commands

Hi Shaik,

Can you check what is in the brick logs? They are located
in /var/log/glusterfs/bricks/*.

It looks like the Samba hook script failed, but that shouldn't
matter in this use case.

Also, I see that you are trying to set up heketi to provision
volumes, which means you may be using gluster in container use cases. If
you are still in the 'PoC' phase, can you give https://github.com/gluster/gcs
a try? That makes the deployment and the stack a little simpler.

-Amar

On Tue, Jan 22, 2019 at 11:29 AM Shaik Salam <shaik.salam@xxxxxxx>
wrote:

Can anyone advise how to recover the bricks, other than with
heal/start force, based on the log events below?

Please let me know if any other logs are required.

Thanks in advance. 

BR 

Salam 

From: Shaik Salam/HYD/TCS
To: bugs@xxxxxxxxxxx, gluster-users@xxxxxxxxxxx
Date: 01/21/2019 10:03 PM
Subject: Bricks are going offline unable to recover with heal/start force commands

Hi, 

Bricks are offline and cannot be recovered with the following commands:

gluster volume heal <vol-name> 

gluster volume start <vol-name> force 

The bricks are still offline after running both.

sh-4.2# gluster volume status vol_3442e86b6d994a14de73f1b8c82cf0b8
Status of volume: vol_3442e86b6d994a14de73f1b8c82cf0b8
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.3.6:/var/lib/heketi/mounts/vg
_ca57f326195c243be2380ce4e42a4191/brick_952
d75fd193c7209c9a81acbc23a3747/brick         49166     0          Y       269
Brick 192.168.3.5:/var/lib/heketi/mounts/vg
_d5f17487744584e3652d3ca943b0b91b/brick_e15
c12cceae12c8ab7782dd57cf5b6c1/brick         N/A       N/A        N       N/A
Brick 192.168.3.15:/var/lib/heketi/mounts/v
g_462ea199185376b03e4b0317363bb88c/brick_17
36459d19e8aaa1dcb5a87f48747d04/brick        49173     0          Y       225
Self-heal Daemon on localhost               N/A       N/A        Y       45826
Self-heal Daemon on 192.168.3.6             N/A       N/A        Y       65196
Self-heal Daemon on 192.168.3.15            N/A       N/A        Y       52915

Task Status of Volume vol_3442e86b6d994a14de73f1b8c82cf0b8
------------------------------------------------------------------------------
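
For volumes with many bricks, the offline entries can be picked out of this status output with a small parser. This is only a sketch under stated assumptions: the function and pattern names are hypothetical, and it assumes the usual terminal layout, where a long brick path wraps over several lines and the last line of each entry carries all four columns (TCP Port, RDMA Port, Online, Pid).

```python
import re

# Trailing columns of a status row: TCP port, RDMA port, Online flag, PID.
ROW_TAIL = re.compile(r"\s+(\S+)\s+(\S+)\s+([YN])\s+(\S+)\s*$")


def offline_processes(status_text):
    """Return the names of processes whose Online column is 'N'.

    Wrapped name fragments are concatenated without spaces, which
    restores a brick path that was split across lines.
    """
    offline = []
    parts = None  # None means "not inside a process entry"
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith(("Brick ", "Self-heal Daemon ")):
            parts = []  # a new entry begins on this line
        if parts is None or not line:
            continue  # headers, rulers and blank lines
        match = ROW_TAIL.search(line)
        if match:
            # Final line of the entry: keep the name, read the Online flag.
            parts.append(line[:match.start()])
            if match.group(3) == "N":
                offline.append("".join(parts))
            parts = None
        else:
            parts.append(line)  # wrapped fragment of the name
    return offline
```

Given the status output above, this should report only the brick on 192.168.3.5.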

We see the following events in the logs when we force-start the volume:

/mgmt/glusterd.so(+0xe2b3a) [0x7fca9e139b3a] -->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2605) [0x7fca9e139605] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fcaa346f0e5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd

[2019-01-21 08:22:34.555068] E [run.c:241:runner_log] (-->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2b3a) [0x7fca9e139b3a] -->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2563) [0x7fca9e139563] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fcaa346f0e5] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd

[2019-01-21 08:22:53.389049] I [MSGID: 106499] [glusterd-handler.c:4314:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vol_3442e86b6d994a14de73f1b8c82cf0b8

[2019-01-21 08:23:25.346839] I [MSGID: 106487] [glusterd-handler.c:1486:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req

We see the following events when we heal the volume:

[2019-01-21 08:20:07.576070] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0

[2019-01-21 08:20:07.580225] I [cli-rpc-ops.c:9182:gf_cli_heal_volume_cbk] 0-cli: Received resp to heal volume

[2019-01-21 08:20:07.580326] I [input.c:31:cli_batch] 0-: Exiting with: -1

[2019-01-21 08:22:30.423311] I [cli.c:768:main] 0-cli: Started running gluster with version 4.1.5

[2019-01-21 08:22:30.463648] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1

[2019-01-21 08:22:30.463718] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

[2019-01-21 08:22:30.463859] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0

[2019-01-21 08:22:33.427710] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

[2019-01-21 08:22:34.581555] I [cli-rpc-ops.c:1472:gf_cli_start_volume_cbk] 0-cli: Received resp to start volume

[2019-01-21 08:22:34.581678] I [input.c:31:cli_batch] 0-: Exiting with: 0

[2019-01-21 08:22:53.345351] I [cli.c:768:main] 0-cli: Started running gluster with version 4.1.5

[2019-01-21 08:22:53.387992] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1

[2019-01-21 08:22:53.388059] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

[2019-01-21 08:22:53.388138] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0

[2019-01-21 08:22:53.394737] I [input.c:31:cli_batch] 0-: Exiting with: 0

[2019-01-21 08:23:25.304688] I [cli.c:768:main] 0-cli: Started running gluster with version 4.1.5

[2019-01-21 08:23:25.346319] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1

[2019-01-21 08:23:25.346389] I [socket.c:2632:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

[2019-01-21 08:23:25.346500] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glusterfs: error returned while attempting to connect to host:(null), port:0
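
Most of the entries above are informational (I); the warning (W) and error (E) entries are easier to review once they are separated out. The following is a rough sketch only: the function names are hypothetical, and it assumes each entry starts with a bracketed timestamp followed by a one-letter level, as in the excerpts above, and that the input contains nothing but log text. Lines without a leading timestamp are treated as wrapped continuations of the previous entry.

```python
import re

# An entry starts with "[YYYY-MM-DD HH:MM:SS.ffffff] L ", where L is the level.
ENTRY_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z]) (.*)$"
)


def log_entries(text):
    """Yield (timestamp, level, message) tuples, rejoining wrapped lines."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            if current:
                yield tuple(current)
            current = [match.group(1), match.group(2), match.group(3)]
        elif current:
            # No timestamp: continuation of the previous entry.
            current[2] += " " + line
    if current:
        yield tuple(current)


def warnings_and_errors(text):
    """Return only the entries logged at level W or E."""
    return [entry for entry in log_entries(text) if entry[1] in ("W", "E")]
```

Because only the level letter is inspected, any text before the first timestamped line (such as a truncated entry) is skipped.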

Please let us know the steps to recover the bricks.

BR 

Salam 

-- 

Amar Tumballi (amarts)

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users