Re: Getting timedout error while rebalancing

Hi Deepu,

I can see multiple errors in the glusterd log.
[2019-02-06 13:22:21.012490] E [glusterd-rpc-ops.c:1429:__glusterd_commit_op_cbk] (-->/lib64/libgfrpc.so.0(+0xec20) [0x7f278d201c20] -->/usr/lib64/glusterfs/4.1.7/xlator/mgmt/glusterd.so(+0x7762a) [0x7f2781f1d62a] -->/usr/lib64/glusterfs/4.1.7/xlator/mgmt/glusterd.so(+0x75213) [0x7f2781f1b213] ) 0-: Assertion failed: rsp.op == txn_op_info.op    ----> this error is repeated multiple times in the log.

[2019-02-06 11:16:32.474268] E [MSGID: 106218] [glusterd-rebalance.c:460:glusterd_rebalance_cmd_validate] 0-glusterd: Volume test-volume is not a distribute type or contains only 1 brick
[2019-02-06 11:16:32.474361] E [MSGID: 106301] [glusterd-op-sm.c:4669:glusterd_op_ac_send_stage_op] 0-management: Staging of operation 'Volume Rebalance' failed on localhost : Volume test-volume is not a distribute volume or contains only 1 brick.
Not performing rebalance

[2019-02-06 13:18:35.253045] I [MSGID: 106482] [glusterd-brick-ops.c:448:__glusterd_handle_add_brick] 0-management: Received add brick req
[2019-02-06 13:18:35.253080] E [MSGID: 106026] [glusterd-brick-ops.c:483:__glusterd_handle_add_brick] 0-management: Volume 192.168.185.xxx:/home/data/repl does not exist [Invalid argument]     ----> Did the add-brick succeed?
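
For anyone who wants to summarise such a log before sharing it, here is a minimal Python sketch that counts error-level entries per reporting function. It assumes the line layout seen in the excerpts above (timestamp, one-letter level, optional MSGID, then file:line:function); the names LOG_LINE and error_summary are made up for this illustration and are not part of gluster.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this thread:
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] domain: message
# The MSGID block is optional (the assertion line above has none).
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]:]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def error_summary(lines):
    """Count error-level ('E') log entries per reporting function."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Continuation lines such as "Not performing rebalance" do not match
        # the pattern and are skipped.
        if match and match.group("level") == "E":
            counts[match.group("func")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[2019-02-06 11:16:32.474268] E [MSGID: 106218] "
        "[glusterd-rebalance.c:460:glusterd_rebalance_cmd_validate] "
        "0-glusterd: Volume test-volume is not a distribute type or contains only 1 brick",
        "[2019-02-06 13:18:35.253045] I [MSGID: 106482] "
        "[glusterd-brick-ops.c:448:__glusterd_handle_add_brick] "
        "0-management: Received add brick req",
    ]
    for func, count in error_summary(sample).most_common():
        print(func, count)
```

A function that shows up many times (as the assertion failure does here) is usually the first thing worth looking at.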

It is difficult to confirm anything from the glusterd log alone. Please share the glusterd, cli and cmd_history logs from all the nodes, and also provide the output of the commands below.
1. gluster --version
2. gluster vol info
3. gluster vol status

Thanks,
Sanju

On Thu, Feb 7, 2019 at 1:26 AM deepu srinivasan <sdeepugd@xxxxxxxxx> wrote:
Please find the glusterd.log file attached.

On Wed, Feb 6, 2019 at 2:01 PM Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:


On Tue, Feb 5, 2019 at 8:43 PM Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:


On Tue, 5 Feb 2019 at 17:26, deepu srinivasan <sdeepugd@xxxxxxxxx> wrote:
Hi Nithya,
We have a test gluster setup and are testing gluster's rebalance option. We started a volume with a 1x3 brick layout (replica 3) and some data on it.
command : gluster volume create test-volume replica 3 192.168.xxx.xx1:/home/data/repl 192.168.xxx.xx2:/home/data/repl 192.168.xxx.xx3:/home/data/repl

Next, we tried to expand the cluster storage by adding three more bricks.
command : gluster volume add-brick test-volume 192.168.xxx.xx4:/home/data/repl 192.168.xxx.xx5:/home/data/repl 192.168.xxx.xx6:/home/data/repl
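
The layout arithmetic behind this step can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the rebalance condition is an assumption inferred from the glusterd message "is not a distribute type or contains only 1 brick" quoted elsewhere in this thread, not a copy of glusterd's actual check.

```python
def distribute_count(brick_count, replica_count):
    """Number of replica sets (distribute subvolumes) in the volume."""
    if replica_count < 1 or brick_count % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return brick_count // replica_count


def layout(brick_count, replica_count):
    """Layout in the 'distribute x replica' notation used in this thread."""
    return f"{distribute_count(brick_count, replica_count)}x{replica_count}"


def can_rebalance(brick_count, replica_count):
    """Assumption: rebalance only makes sense with more than one replica set."""
    return distribute_count(brick_count, replica_count) > 1


if __name__ == "__main__":
    # Before add-brick: 3 bricks, replica 3. After: 6 bricks, replica 3.
    for bricks in (3, 6):
        print(layout(bricks, 3), can_rebalance(bricks, 3))
```

Under this assumption, a rebalance attempted while the volume is still 1x3 would be rejected at the staging step, and it should only be accepted once the add-brick has really turned the volume into 2x3.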

After adding the bricks, we tried to rebalance the layout and the data.
command : gluster volume rebalance test-volume fix-layout start
The command exited with the status "Error : Request timed out".

This sounds like an error in the cli or glusterd. Can you send the glusterd.log from the node on which you ran the command?

It seems to me that glusterd took more than 120 seconds to process the command, and hence the cli timed out. We can confirm this by checking the rebalance status below, which indicates that the rebalance did kick in and eventually completed. We need to understand why it took so long, so please pass on the cli and glusterd logs from all the nodes, as Nithya requested.
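
One rough way to test this theory against the logs is to compare the time at which the command was received with the time at which glusterd finished handling it. The Python sketch below assumes the timestamp format used in the glusterd log excerpts in this thread and takes the 120-second figure from the paragraph above; the helper names and the timestamps in the example are hypothetical.

```python
from datetime import datetime

# Figure quoted in this thread; treat it as an assumption for other versions.
CLI_TIMEOUT_SECONDS = 120


def parse_ts(ts):
    """Parse a timestamp such as '2019-02-06 13:22:21.012490'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")


def elapsed_seconds(start_ts, end_ts):
    """Seconds between two log timestamps."""
    return (parse_ts(end_ts) - parse_ts(start_ts)).total_seconds()


def cli_would_time_out(start_ts, end_ts, timeout=CLI_TIMEOUT_SECONDS):
    """True if handling the request took longer than the cli is assumed to wait."""
    return elapsed_seconds(start_ts, end_ts) > timeout


if __name__ == "__main__":
    # Hypothetical timestamps, not taken from the attached log.
    received = "2019-02-05 17:00:00.000000"
    finished = "2019-02-05 17:02:30.500000"
    print(elapsed_seconds(received, finished))
    print(cli_would_time_out(received, finished))
```

If the gap is above the timeout while the rebalance status still ends up as completed, that matches the behaviour described here: the cli gave up waiting, but glusterd carried on.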


regards,
Nithya 

After the command failed, we checked the rebalance status, which looked like this:

           Node   Rebalanced-files       size   scanned   failures   skipped      status   run time in h:m:s
---------------   ----------------   --------   -------   --------   -------   ---------   -----------------
      localhost                 41     41.0MB      8200          0         0   completed             0:00:09
192.168.xxx.xx4                 79     79.0MB      8231          0         0   completed             0:00:12
192.168.xxx.xx6                 58     58.0MB      8281          0         0   completed             0:00:10
192.168.xxx.xx2                136    136.0MB      8566          0       136   completed             0:00:07
192.168.xxx.xx4                129    129.0MB      8566          0       129   completed             0:00:07
192.168.xxx.xx6                201    201.0MB      8566          0       201   completed             0:00:08
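
A status listing like the one above can also be checked mechanically. The following Python sketch assumes the eight-column layout shown in this thread (node, rebalanced files, size, scanned, failures, skipped, status, run time); parse_rebalance_status and summarize are hypothetical helper names, and other gluster versions may print the table differently.

```python
def parse_rebalance_status(text):
    """Parse rebalance status rows, assuming the column layout shown above."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) < 8 or fields[0] == "Node" or set(fields[0]) == {"-"}:
            continue
        rows.append({
            "node": fields[0],
            "files": int(fields[1]),
            "size": fields[2],
            "scanned": int(fields[3]),
            "failures": int(fields[4]),
            "skipped": int(fields[5]),
            # The status may be more than one word (e.g. "in progress"),
            # so take everything between the counters and the run time.
            "status": " ".join(fields[6:-1]),
            "run_time": fields[-1],
        })
    return rows


def summarize(rows):
    """Totals across nodes, plus whether every node reports 'completed'."""
    return {
        "all_completed": bool(rows) and all(r["status"] == "completed" for r in rows),
        "files": sum(r["files"] for r in rows),
        "failures": sum(r["failures"] for r in rows),
        "skipped": sum(r["skipped"] for r in rows),
    }


if __name__ == "__main__":
    sample = """
    localhost        41   41.0MB  8200  0    0  completed  0:00:09
    192.168.xxx.xx2 136  136.0MB  8566  0  136  completed  0:00:07
    """
    print(summarize(parse_rebalance_status(sample)))
```

Zero failures with every node in the completed state suggests the rebalance itself ran to the end, whatever the cli reported.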


Is the rebalance working correctly? Why did gluster throw the error "Error : Request timed out"?

On Tue, Feb 5, 2019 at 4:23 PM Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
Hi,
Please provide the exact step at which you are seeing the error. It would be ideal if you could copy-paste the command and the error.

Regards,
Nithya



On Tue, 5 Feb 2019 at 15:24, deepu srinivasan <sdeepugd@xxxxxxxxx> wrote:
Hi everyone. I am getting "Error : Request timed out" while doing a rebalance. I have added new bricks to my replicated volume, i.e. it was first a 1x3 volume, and I added three more bricks to make it a distributed-replicated volume (2x3). What should I do about the timeout error?
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users


--
Thanks,
Sanju