Re: Slow volume, gluster volume status bug

On Tue, Nov 14, 2017 at 2:47 PM, Emmanuel Dreyfus <manu@xxxxxxxxxx> wrote:
On Tue, Nov 14, 2017 at 12:17:05PM +0530, Atin Mukherjee wrote:
> > gluster volume status also exhibits trouble: each server lists only
> > its own bricks, not the other's. I suspect it could just be a
> > timeout caused by a slow answer from the peer.

> Have you checked the output of gluster peer status? Also, does the glusterd
> log file give any hint of timeouts, RPC failures, disconnections, etc.?

gluster peer status says "State: Sent and Received peer request (Connected)"
on both sides.

So this is why the peers don't recognize that they are connected: the friend handshake got stuck midway and never recovered. Restarting the glusterd services should ideally fix the state. If it does not, you'd have to manually edit the /var/lib/glusterd/peers/UUID files to set state=3 and then restart the glusterd service.
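
For illustration, the manual edit could be scripted roughly as below. This is only a sketch: it assumes each peer file is plain key=value text with exactly one state= line, and the function name set_peer_state is made up for this example. Stop glusterd and back up /var/lib/glusterd before trying anything like it.

```python
import re
from pathlib import Path

# Directory named in the message above; each file in it is named after a peer UUID.
PEERS_DIR = Path("/var/lib/glusterd/peers")


def set_peer_state(text: str, new_state: int = 3) -> str:
    """Return the peer-file text with its single state= line set to new_state.

    Assumes a key=value file containing exactly one state= line.
    """
    updated, count = re.subn(r"(?m)^state=\d+$", f"state={new_state}", text)
    if count != 1:
        raise ValueError(f"expected exactly one state= line, found {count}")
    return updated


def fix_peer_files(peers_dir: Path = PEERS_DIR, new_state: int = 3, dry_run: bool = True) -> list:
    """Rewrite state= in every peer file; return the names of files that would change."""
    changed = []
    for peer_file in sorted(peers_dir.iterdir()):
        if not peer_file.is_file():
            continue
        original = peer_file.read_text()
        updated = set_peer_state(original, new_state)
        if updated != original:
            changed.append(peer_file.name)
            if not dry_run:
                peer_file.write_text(updated)
    return changed
```

With dry_run left at True, fix_peer_files() only reports which files it would touch. Restart glusterd after the real run.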


I have this in glusterd.log:
[2017-11-14 08:49:47.289423] I [MSGID: 106143] [glusterd-pmap.c:279:pmap_registry_bind] 0-pmap: adding brick /export/wd3e on port 49155
[2017-11-14 08:49:52.289926] I [MSGID: 106143] [glusterd-pmap.c:279:pmap_registry_bind] 0-pmap: adding brick /export/wd0e on port 49152
[2017-11-14 08:49:52.295394] I [MSGID: 106143] [glusterd-pmap.c:279:pmap_registry_bind] 0-pmap: adding brick /export/wd1e on port 49153
[2017-11-14 08:49:52.302973] I [MSGID: 106143] [glusterd-pmap.c:279:pmap_registry_bind] 0-pmap: adding brick /export/wd2e on port 49154
[2017-11-14 08:54:31.535066] W [socket.c:593:__socket_rwv] 0-management: readv on 192.0.2.109:24007 failed (Connection reset by peer)
[2017-11-14 08:54:32.567745] I [MSGID: 106004] [glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 0-management: Peer <bidon> (<2d7719d9-0466-434c-a881-4081156fac47>), in state <Probe Sent to Peer>, has disconnected from glusterd.
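
To pick out the interesting entries from a longer glusterd.log, something like the following might help. It is a sketch based only on the line layout visible in the excerpt above (timestamp, one-letter level, optional MSGID, source location, domain, message); the names parse_line and find_problems are invented for this example, and other log lines may not follow this layout.

```python
import re

# Layout inferred from the excerpt above; the MSGID field is optional.
LOG_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<source>[^\]]+)\] (?P<domain>\S+): (?P<message>.*)$"
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def find_problems(lines):
    """Keep warnings, errors, and any entry that mentions a disconnection."""
    problems = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if entry["level"] in ("W", "E") or "disconnected" in entry["message"]:
            problems.append(entry)
    return problems
```

Feeding it the lines above would keep the readv failure and the peer disconnect and drop the four pmap_registry_bind entries.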

An odd thing: the registration messages suggest the local bricks should
show as online in gluster volume status output. They are displayed as
offline until I kill the glusterfsd processes and issue a
 gluster volume start gfs force.

Along with symmetrical entries, the peer has this:
[2017-11-14 08:56:05.799686] E [socket.c:2369:socket_connect_finish] 0-management: connection to 192.0.2.110:24007 failed (Connection timed out); disconnecting socket
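
A "Connection timed out" on port 24007 can be checked independently of glusterd with a plain TCP connect from each node to the other. A minimal sketch follows; the helper name can_connect and the 5 second default timeout are arbitrary choices for this example, and a successful connect only shows the port is reachable, not that the peer handshake works.

```python
import socket

GLUSTERD_PORT = 24007  # management port seen in the log lines above


def can_connect(host: str, port: int = GLUSTERD_PORT, timeout: float = 5.0):
    """Try a TCP connect; return (ok, detail) instead of raising."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Addresses taken from the log excerpts in this thread.
    for peer in ("192.0.2.109", "192.0.2.110"):
        ok, detail = can_connect(peer)
        print(peer, "reachable" if ok else "unreachable", detail)
```

Running it on both nodes shows whether the failure is one-directional, which would point at a firewall or routing issue rather than at glusterd itself.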

In the meantime I tracked the performance problem to extended attribute
system calls. The root of the problem is outside of glusterfs, but fixing
the consequences would be nice.

--
Emmanuel Dreyfus
manu@xxxxxxxxxx

