Re: Release 3.12: Glusto run status

Jonathan Holloway <jholloway@xxxxxxxxxx> · Wed, 30 Aug 2017 15:16:42 -0400 (EDT)

From: "Shwetha Panduranga" <spandura@xxxxxxxxxx>
To: "Nigel Babu" <nigelb@xxxxxxxxxx>
Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Sent: Wednesday, August 30, 2017 6:29:08 AM
Subject: Re: Release 3.12: Glusto run status

I submitted a patch making the changes: https://review.gluster.org/#/c/18152/2
We can easily replace the sleep with a timed wait_for_rebalance_to_start() loop function that watches for "in progress" and logs the status, followed by the existing wait_for_rebalance_to_complete().
It solves the problem and scales (at least up).

We can also refactor into a single wait_for_rebalance_status() method and call it twice with appropriate status(es) as args.
For backwards compatibility, just make wait_for_rebalance_to_complete() a wrapper around the new function and doc/log it as deprecated.
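
A minimal sketch of the timed wait and the refactor described above, not the glusto-tests implementation. It assumes a hypothetical zero-argument get_status callable that returns the current rebalance status string, or None when the status command itself fails; the signatures, timeout, and interval values are illustrative only.

```python
import logging
import time
import warnings

log = logging.getLogger(__name__)


def wait_for_rebalance_status(get_status, expected, timeout=300, interval=10,
                              sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns one of `expected` or time runs out.

    get_status is a hypothetical callable (an assumption for this sketch):
    it returns the status string, or None if the status command failed.
    A failed status command is retried rather than treated as fatal.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        log.info("rebalance status: %s", status)
        if status in expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def wait_for_rebalance_to_complete(get_status, timeout=300):
    """Deprecated wrapper kept for backwards compatibility."""
    warnings.warn("wait_for_rebalance_to_complete() is deprecated; "
                  "use wait_for_rebalance_status()",
                  DeprecationWarning, stacklevel=2)
    return wait_for_rebalance_status(get_status, ("completed",), timeout)


# Simulated status sequence: the first status call fails (None).
statuses = iter([None, "in progress", "in progress", "completed"])


def fake_status():
    return next(statuses)


# First wait for rebalance to start, then wait for it to complete.
started = wait_for_rebalance_status(
    fake_status, ("in progress", "completed"), timeout=60, interval=0)
done = wait_for_rebalance_status(
    fake_status, ("completed",), timeout=300, interval=0)
```

The first call also accepts "completed" so that a rebalance which finishes before the first successful poll is not reported as never having started.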

Cheers,
Jonathan

On Wed, Aug 30, 2017 at 4:45 PM, Shwetha Panduranga <spandura@xxxxxxxxxx> wrote:
We had the first 'rebalance status' for logging purposes. wait_for_rebalance_to_complete will get the XML command output for validation. --xml outputs go to the debug log level.

On Wed, Aug 30, 2017 at 4:35 PM, Nigel Babu <nigelb@xxxxxxxxxx> wrote:
Why are we failing because the first "rebalance status" fails? Isn't it supposed to check in a loop and wait until it succeeds?

Specifically, I think lines 288 and 289 need to be removed: http://git.gluster.org/cgit/glusto-tests.git/tree/glustolibs-gluster/glustolibs/gluster/rebalance_ops.py#n288

Is that a fair assessment?

On Wed, Aug 30, 2017 at 4:28 PM, Shwetha Panduranga <spandura@xxxxxxxxxx> wrote:
Maybe I should change the log message from 'Checking rebalance status' to 'Logging rebalance status', because the first 'rebalance status' command does just that: it executes 'rebalance status'. wait_for_rebalance_to_complete then validates that rebalance is 'completed' within 5 minutes (the default timeout). If that makes sense, I will make those changes as well, along with introducing the delay between 'start' and 'status'.

On Wed, Aug 30, 2017 at 4:26 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:

On Wed, Aug 30, 2017 at 4:23 PM, Shwetha Panduranga <spandura@xxxxxxxxxx> wrote:
This is the first check, where we just execute 'rebalance status'. That is the command that failed, and hence the test case failed. If you look at the test case, the next step is wait_for_rebalance_to_complete (status --xml). This is where we execute rebalance status for up to 5 minutes, waiting for rebalance to complete. The first execution of the status command failed even before we started waiting for rebalance, so the test case failed.

Cool. So there is still a problem in the test case. We can't assume rebalance status will report success immediately after rebalance start; I explained why in the earlier thread. Why do we need an intermediate check of rebalance status before calling wait_for_rebalance_to_complete?

-- 
nigelb

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
