Re: Rebalance is not working in single node cluster environment.

On 06/13/2015 04:50 PM, Atin Mukherjee wrote:
>
> Sent from Samsung Galaxy S4
> On 13 Jun 2015 14:42, "Anand Nekkunti" <anekkunt@xxxxxxxxxx> wrote:
>>
>>
>> On 06/13/2015 02:27 PM, Atin Mukherjee wrote:
>>>
>>> Sent from Samsung Galaxy S4
>>> On 13 Jun 2015 13:15, "Raghavendra Talur" <raghavendra.talur@xxxxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>> On Sat, Jun 13, 2015 at 1:00 PM, Atin Mukherjee <atin.mukherjee83@xxxxxxxxx> wrote:
>>>>>
>>>>> Sent from Samsung Galaxy S4
>>>>> On 13 Jun 2015 12:58, "Anand Nekkunti" <anekkunt@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi All,
>>>>>> Rebalance is not working in a single-node cluster environment
>>>>>> (the current test framework). I am getting an error in the test
>>>>>> below; it seems rebalance has not been migrated to the current
>>>>>> cluster test framework.
>>>>> Could you pinpoint which test case fails and what log do you see?
>>>>>>
>>>>>> cleanup;
>>>>>> TEST launch_cluster 2;
>>>>>> TEST $CLI_1 peer probe $H2;
>>>>>>
>>>>>> EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers
>>>>>>
>>>>>> $CLI_1 volume create $V0 $H1:$B1/$V0 $H2:$B2/$V0
>>>>>> EXPECT 'Created' volinfo_field $V0 'Status';
>>>>>>
>>>>>> $CLI_1 volume start $V0
>>>>>> EXPECT 'Started' volinfo_field $V0 'Status';
>>>>>>
>>>>>> #Mount FUSE
>>>>>> TEST glusterfs -s $H1 --volfile-id=$V0 $M0;
>>>>>>
>>>>>> TEST mkdir $M0/dir{1..4};
>>>>>> TEST touch $M0/dir{1..4}/files{1..4};
>>>>>>
>>>>>> TEST $CLI_1 volume add-brick $V0 $H1:$B1/${V0}1 $H2:$B2/${V0}1
>>>>>>
>>>>>> TEST $CLI_1 volume rebalance $V0 start
>>>>>>
>>>>>> EXPECT_WITHIN 60 "completed" CLI_1_rebalance_status_field $V0
>>>>>>
>>>>>> $CLI_2 volume status $V0
>>>>>> EXPECT 'Started' volinfo_field $V0 'Status';
>>>>>>
>>>>>> cleanup;
>>>>>>
>>>>>> Regards Anand.N
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> If it is a crash of glusterd when you do rebalance start, it is
>>>> because of FORTIFY_FAIL in libc. Here is the patch that Susant
>>>> has already sent: http://review.gluster.org/#/c/11090/
>>>>
>>>> You can verify that it is the same crash by checking the core
>>>> in gdb; a SIGABRT would be raised after strncpy.
>>
>>
>> glusterd is not crashing, but the rebalance status comes back as
>> "failed" in my test case. It happens only in the test framework, i.e.
>> in a simulated cluster environment on a single node.
>> RCA:
>> 1. We always pass "localhost" as the volfile server for the rebalance
>> xlator.
>> 2. The rebalance processes overwrite each other's unix socket and log
>> files (all rebalance processes create their socket with the same name).
>>
>> I will send a patch for this.
> I thought we had already agreed on this yesterday. IIRC, the same is
> true for all the other daemons. As of now we don't have any tests which
> invoke daemons using cluster.rc.
>

     Yes, yesterday we found that the volfile server was the problem. I modified the volfile server, but I was still getting a failed rebalance status. Initially I thought there was a problem in the rebalance process itself; later I found that, after completing the rebalance, the rebalance process could not send its response back to glusterd, because the unix socket file gets clobbered and all rebalance daemons write to the same log file.
I think there is no issue with the other daemons, which use the svc framework.
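
To illustrate the collision (this is only a hypothetical sketch, not the exact naming scheme glusterd uses): if every rebalance daemon on the host derives its socket and log paths from the volume name alone, the two simulated nodes of launch_cluster end up on the same files, while folding a per-node component such as each node's working directory into the name keeps them apart:

    # Hypothetical illustration only -- not the actual glusterd naming code.
    VOL=patchy

    # Before: both simulated nodes compute identical paths, so their
    # rebalance daemons clobber each other's socket and log file.
    for NODE in 1 2; do
        echo "/var/run/gluster/rebalance-${VOL}.sock"
        echo "/var/log/glusterfs/${VOL}-rebalance.log"
    done

    # After: mixing in a per-node component (e.g. the per-node workdir
    # that cluster.rc sets up) makes the paths unique on a single host.
    for NODE in 1 2; do
        WORKDIR="/d/backends/${NODE}/glusterd"      # assumed per-node dir
        ID=$(echo -n "${WORKDIR}/${VOL}" | sha1sum | cut -c1-8)
        echo "/var/run/gluster/rebalance-${ID}.sock"
        echo "${WORKDIR}/logs/${VOL}-rebalance.log"
    done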

Patch: http://review.gluster.org/#/c/11210/ - this patch enables writing test cases for rebalance in a cluster environment.
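
Once that patch is in, a cluster-mode rebalance test can use the usual cluster.rc helpers. Below is a rough sketch, essentially the test quoted above cleaned up; the relative include paths and the 60-second timeout are placeholders that depend on where the test file lives:

    #!/bin/bash
    # Sketch of a cluster-mode rebalance test. The include paths are
    # assumptions; adjust them to the test file's location in the tree.
    . $(dirname $0)/../include.rc
    . $(dirname $0)/../cluster.rc

    cleanup;

    TEST launch_cluster 2;
    TEST $CLI_1 peer probe $H2;
    EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers

    TEST $CLI_1 volume create $V0 $H1:$B1/$V0 $H2:$B2/$V0
    EXPECT 'Created' volinfo_field $V0 'Status'
    TEST $CLI_1 volume start $V0
    EXPECT 'Started' volinfo_field $V0 'Status'

    # Mount FUSE from the first node and create some entries to migrate.
    TEST glusterfs -s $H1 --volfile-id=$V0 $M0
    TEST mkdir $M0/dir{1..4}
    TEST touch $M0/dir{1..4}/files{1..4}

    # Expand the volume, rebalance, and check that the status is also
    # visible from the second glusterd of the simulated cluster.
    TEST $CLI_1 volume add-brick $V0 $H1:$B1/${V0}1 $H2:$B2/${V0}1
    TEST $CLI_1 volume rebalance $V0 start
    EXPECT_WITHIN 60 "completed" CLI_1_rebalance_status_field $V0
    TEST $CLI_2 volume status $V0
    EXPECT 'Started' volinfo_field $V0 'Status'

    cleanup;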


>>
>> Regards Anand.N
>>>
>>>>
>>> AFAIR Anand tried it in mainline and that fix was already in
>>> place.  I think this is something different.
>>>> -- Raghavendra Talur
>>>>
>>
>>
>



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
