Re: Glusterd process hangs on reboot

Atin Mukherjee <amukherj@xxxxxxxxxx> · Tue, 22 Aug 2017 15:47:35 +0000

My guess is that the volume list or the peer list is corrupted, which has led glusterd into an infinite loop while traversing that list and is pinning the CPU. Again, this is only a guess; I have not yet had a chance to take a detailed look at the logs and the strace output.
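To illustrate the kind of failure I mean, here is a toy model in Python. It is not glusterd's code: the Node class, make_ring, corrupt_ring, find_by_name and the max_steps guard are all made up for the example. In a circular list the walk stops when the cursor comes back to the head. If a corrupted next pointer closes the ring on some other node, a lookup for a name that is not present never sees the head again and keeps comparing strings forever, which would look a lot like a process stuck at 100% CPU in a string comparison.

```python
class Node:
    """One entry in a circular singly linked list (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.next = self  # an empty ring points back at itself


def make_ring(names):
    """Build head -> n1 -> n2 -> ... -> head, where head is a sentinel."""
    head = Node(None)
    tail = head
    for name in names:
        node = Node(name)
        node.next = head
        tail.next = node
        tail = node
    return head


def corrupt_ring(head):
    """Simulate corruption: the last entry skips the head and points at the first entry."""
    cur = head
    while cur.next is not head:
        cur = cur.next
    cur.next = head.next


def find_by_name(head, wanted, max_steps=None):
    """Walk the ring comparing names; return (node or None, steps taken).

    max_steps exists only so this demo terminates. Without it, a lookup
    that misses on a corrupted ring would never return.
    """
    steps = 0
    cur = head.next
    while cur is not head:
        if cur.name == wanted:
            return cur, steps
        cur = cur.next
        steps += 1
        if max_steps is not None and steps >= max_steps:
            return None, steps
    return None, steps


if __name__ == "__main__":
    ring = make_ring(["vol1", "vol2", "vol3"])
    print(find_by_name(ring, "missing"))           # healthy ring: the walk ends at the head
    corrupt_ring(ring)
    print(find_by_name(ring, "missing", 100000))   # corrupted ring: only the guard stops it
```

Note that lookups for names that do exist still succeed on the corrupted ring, so this kind of damage would only show up on a miss or on a full traversal.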

I believe that if you reboot the node again, the problem will disappear.
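If the hang does show up again, a stack trace of the spinning process would probably say more than strace, since a loop that stays in userspace makes few or no system calls. A minimal sketch, assuming gdb is installed on the node and that you run it with enough privileges to attach to glusterd; backtrace_command and dump_backtrace are hypothetical helper names, while the gdb options used (-p, -batch, -ex) are standard ones:

```python
import shutil
import subprocess


def backtrace_command(pid):
    """Return the argv for a one-shot, non-interactive dump of all thread stacks."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtrace(pid, timeout=60):
    """Attach gdb to pid, print every thread's backtrace, detach, and return the text.

    Assumes gdb is in PATH; raises RuntimeError otherwise.
    """
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb not found in PATH")
    result = subprocess.run(
        backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling dump_backtrace with the glusterd PID (for example the one reported by pidof glusterd) two or three times a few seconds apart, and checking which frames sit above the string comparison each time, should show which list is being walked.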

On Tue, 22 Aug 2017 at 20:07, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
In addition, perf top shows 80% in libc-2.12.so __strcmp_sse42 while glusterd is at 100% CPU usage.

Hope this helps...

On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:

> Hi there,
>
> I have a strange problem.
> The Gluster version is 3.10.5, and I am testing new servers. The Gluster
> configuration is 16+4 EC; I have three volumes, each with 1600 bricks.
> I can successfully create the cluster and volumes without any
> problems. I wrote data to the cluster from 100 clients for 12 hours, again
> with no problem. But when I try to reboot a node, the glusterd process hangs
> at 100% CPU usage and seems to do nothing; no brick processes come
> online. You can find a one-minute strace of the glusterd process here:
>
> https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0
>
> Here are the glusterd logs:
> https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0
>
> By the way, rebooting a server completes without problems if I reboot
> the servers before creating any volumes.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
-- 
- Atin (atinm)