What is your Gluster version? There was a bug in 3.10 where some
bricks may not come online after a node reboot; it was fixed in later
versions.
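To check, something like the following on each node shows the installed
version and which brick process failed to start. The brick log filename
is derived from the brick path, so the name below is only my guess for
your layout:

    # installed GlusterFS version on this node
    gluster --version

    # which bricks and daemons are online (run on any peer)
    gluster volume status shared

    # brick log on gluster13; exact filename is derived from the brick path
    less /var/log/glusterfs/bricks/gluster-bricksdd1_new-shared.log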
On 8/16/18, Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
> Hi there,
>
> Twice I had to replace a brick on two different servers; the replace went
> fine, and the heal took very long but finally finished. From time to time
> you have to reboot the server (kernel upgrades), and I've noticed that the
> replaced brick doesn't come up after the reboot. Status after reboot:
>
> gluster volume status
> Status of volume: shared
> Gluster process                                TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster11:/gluster/bricksda1/shared      49164     0          Y       6425
> Brick gluster12:/gluster/bricksda1/shared      49152     0          Y       2078
> Brick gluster13:/gluster/bricksda1/shared      49152     0          Y       2478
> Brick gluster11:/gluster/bricksdb1/shared      49165     0          Y       6452
> Brick gluster12:/gluster/bricksdb1/shared      49153     0          Y       2084
> Brick gluster13:/gluster/bricksdb1/shared      49153     0          Y       2497
> Brick gluster11:/gluster/bricksdc1/shared      49166     0          Y       6479
> Brick gluster12:/gluster/bricksdc1/shared      49154     0          Y       2090
> Brick gluster13:/gluster/bricksdc1/shared      49154     0          Y       2485
> Brick gluster11:/gluster/bricksdd1/shared      49168     0          Y       7897
> Brick gluster12:/gluster/bricksdd1_new/shared  49157     0          Y       7632
> Brick gluster13:/gluster/bricksdd1_new/shared  N/A       N/A        N       N/A
> Self-heal Daemon on localhost                  N/A       N/A        Y       25483
> Self-heal Daemon on gluster13                  N/A       N/A        Y       2463
> Self-heal Daemon on gluster12                  N/A       N/A        Y       17619
>
> Task Status of Volume shared
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Here gluster13:/gluster/bricksdd1_new/shared is not up. Related log
> messages after the reboot in glusterd.log:
>
> [2018-08-16 05:22:52.986757] W [socket.c:593:__socket_rwv] 0-management:
> readv on /var/run/gluster/02d086b75bfc97f2cce96fe47e26dcf3.socket failed
> (No data available)
> [2018-08-16 05:22:52.987648] I [MSGID: 106005]
> [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management:
> Brick gluster13:/gluster/bricksdd1_new/shared has disconnected from glusterd.
> [2018-08-16 05:22:52.987908] E [rpc-clnt.c:350:saved_frames_unwind]
> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7fdbaa398b8e]
> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fdbaa15f111]
> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fdbaa15f23e]
> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7fdbaa1608d1]
> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7fdbaa1613f8]
> ))))) 0-management: forced unwinding frame type(brick operations) op(--(4))
> called at 2018-08-16 05:22:52.941332 (xid=0x2)
> [2018-08-16 05:22:52.988058] W [dict.c:426:dict_set]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.12/xlator/mgmt/glusterd.so(+0xd1e59) [0x7fdba4f9ce59]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set_int32+0x2b) [0x7fdbaa39122b]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set+0xd3) [0x7fdbaa38fa13]
> ) 0-dict: !this || !value for key=index [Invalid argument]
> [2018-08-16 05:22:52.988092] E [MSGID: 106060]
> [glusterd-syncop.c:1014:gd_syncop_mgmt_brick_op] 0-management: Error
> setting index on brick status rsp dict
>
> This problem could be related to my previous mail. After executing
> "gluster volume start shared force" the brick comes up, which results in
> a heal of the brick (and in high load, too). Is there any way to track
> down why this happens, and to ensure that the brick comes up at boot?
>
> Best regards
> Hubert
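For reference, the "start ... force" workaround described above, as
commands; the volume name "shared" is taken from your status output, and
"heal ... info" is only one way to watch the heal catch up afterwards:

    # bring up any bricks that failed to start
    gluster volume start shared force

    # confirm the brick process is now online
    gluster volume status shared

    # watch the pending heal entries drain
    gluster volume heal shared info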