Re: [Autotest] [PATCH] Add a kvm test guest_s4 which supports both Linux and Windows platform

Michael Goldish <mgoldish@xxxxxxxxxx> · Tue, 13 Oct 2009 19:53:31 -0400 (EDT)

I mostly agree with Lucas's comments and would like to add some of my own.

----- "Lucas Meneghel Rodrigues" <lmr@xxxxxxxxxx> wrote:

> Hi Yolkfull and Chen:
> 
> Thanks for your test! I have some comments and doubts to clear, most
> of them are about content of the messages delivered for the user and
> some other details.
> 
> On Sun, Sep 27, 2009 at 6:11 AM, Yolkfull Chow <yzhou@xxxxxxxxxx>
> wrote:
> > For this case, Ken Cao wrote the linux part previously and I did
> extensive
> > modifications on Windows platform support.
> >
> > Signed-off-by: Ken Cao <kcao@xxxxxxxxxx>
> > Signed-off-by: Yolkfull Chow <yzhou@xxxxxxxxxx>
> > ---
> >  client/tests/kvm/kvm_tests.cfg.sample |   14 +++++++
> >  client/tests/kvm/tests/guest_s4.py    |   66
> +++++++++++++++++++++++++++++++++
> >  2 files changed, 80 insertions(+), 0 deletions(-)
> >  create mode 100644 client/tests/kvm/tests/guest_s4.py
> >
> > diff --git a/client/tests/kvm/kvm_tests.cfg.sample
> b/client/tests/kvm/kvm_tests.cfg.sample
> > index 285a38f..f9ecb61 100644
> > --- a/client/tests/kvm/kvm_tests.cfg.sample
> > +++ b/client/tests/kvm/kvm_tests.cfg.sample
> > @@ -94,6 +94,14 @@ variants:
> >     - linux_s3:     install setup
> >         type = linux_s3
> >
> > +    - guest_s4:
> > +        type = guest_s4
> > +        check_s4_support_cmd = grep -q disk /sys/power/state
> > +        test_s4_cmd = "cd /tmp/;nohup tcpdump -q -t ip host
> localhost"
> > +        check_s4_cmd = pgrep tcpdump
> > +        set_s4_cmd = echo disk > /sys/power/state
> > +        kill_test_s4_cmd = pkill tcpdump
> > +
> >     - timedrift:    install setup
> >         type = timedrift
> >         extra_params += " -rtc-td-hack"
> > @@ -382,6 +390,12 @@ variants:
> >             # Alternative host load:
> >             #host_load_command = "dd if=/dev/urandom of=/dev/null"
> >             host_load_instances = 8
> > +        guest_s4:
> > +            check_s4_support_cmd = powercfg /hibernate on
> > +            test_s4_cmd = start /B ping -n 3000 localhost

When running the test in user mode with a windows guest, I found that
the only way to keep ping.exe alive through s4 was
"start ping -t localhost" (or "start ping -n 3000 localhost"),
without /B.  I tried this with windows XP and it should be confirmed
with other guests as well.  Running it any other way redirects
ping.exe's stdout to the shell session, and since the shell session
can't survive s4 in user mode, ping.exe terminates as well.

Did you get it to work in user mode with "start /B" or did you just
test it in TAP mode?

I haven't tried it on linux yet, but I think nohup should do the job.

> > +            check_s4_cmd = tasklist | find /I "ping"

This should work, but I think "ping.exe" would be slightly more
specific than "ping", because who knows what else the guest might be
running.  It still won't save us from processes ending with
"mapping.exe" for example.  Too bad find doesn't support regular
expressions.

> > +            set_s4_cmd = rundll32.exe PowrProf.dll,
> SetSuspendState
> > +            kill_test_s4_cmd = taskkill /IM ping.exe /F
> >
> >         variants:
> >             - Win2000:
> > diff --git a/client/tests/kvm/tests/guest_s4.py
> b/client/tests/kvm/tests/guest_s4.py
> > new file mode 100644
> > index 0000000..5d8fbdf
> > --- /dev/null
> > +++ b/client/tests/kvm/tests/guest_s4.py
> > @@ -0,0 +1,66 @@
> > +import logging, time
> > +from autotest_lib.client.common_lib import error
> > +import kvm_test_utils, kvm_utils
> > +
> > +
> > +def run_guest_s4(test, params, env):
> > +    """
> > +    Suspend guest to disk,supports both Linux & Windows OSes.
> > +
> > +    @param test: kvm test object.
> > +    @param params: Dictionary with test parameters.
> > +    @param env: Dictionary with the test environment.
> > +    """
> > +    vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
> > +    session = kvm_test_utils.wait_for_login(vm)
> > +
> > +    logging.info("Checking whether VM supports S4")
> > +    status =
> session.get_command_status(params.get("check_s4_support_cmd"))
> > +    if status is None:
> > +        logging.error("Failed to check if S4 exists")
> > +    elif status != 0:
> > +        raise error.TestFail("Guest does not support S4")
> > +
> > +    logging.info("Waiting for a while for X to start...")
> 
> Yes, generally X starts a bit later than the SSH service, so I
> understand the time being here, however:
> 
>  * In fact we are waiting for all services of the guest to be up and
> functional, so depending on the level of load, I don't think 10s is
> gonna make it. So I suggest something >= 30s
>  * It's also true that just wait for a given time and hope that it
> will be OK kinda sucks, so ideally we need to write utility functions
> to stablish as well as possible when all services of a host are fully
> booted up. Stated this way, it looks simple, but it's not.
> 
> Autotest experience suggests that there's no real sane way to
> determine when a linux box is booted up, but we can take a
> semi-rational approach and verify if all services for the current run
> level have the status up or a similar approach. For windows, I was
> talking to Yaniv Kaul and it seems that processing the output of the
> 'sc query' command might give what we want. Bottom line, I'd like to
> add a TODO item, and write a function to stablish (fairly
> confidently) that a windows/linux guest is booted up.
> 
> > +    time.sleep(10)

For now 30 sounds reasonable.
If we want to save time on windows we can accept a parameter to control
the sleep duration.  On windows 10 seconds may suffice because rss.exe
is one of the last programs to run (if not the last one).
But then again, it doesn't hurt to give the guest some extra time, so 30
might be good enough.

> > +    # Start up a program(tcpdump for linux OS & ping for M$ OS), as
> a flag.
> > +    # If the program died after suspend, then fails this testcase.
> > +    test_s4_cmd = params.get("test_s4_cmd")
> > +    session.sendline(test_s4_cmd)
> > +
> > +    # Get the second session to start S4
> > +    session2 = kvm_test_utils.wait_for_login(vm)
> 
> > +    check_s4_cmd = params.get("check_s4_cmd")
> > +    if session2.get_command_status(check_s4_cmd):
> > +        raise error.TestError("Failed to launch %s background" %
> test_s4_cmd)
> > +    logging.info("Launched command background in guest: %s" %
> test_s4_cmd)
> > +
> > +    # Implement S4
> > +    logging.info("Start suspend to disk now...")
> > +    session2.sendline(params.get("set_s4_cmd"))
> > +
> > +    if not kvm_utils.wait_for(vm.is_dead, 360, 30, 2):
> > +        raise error.TestFail("VM refuse to go down,suspend
> failed")
> > +    logging.info("VM suspended successfully.")
> > +
> > +    logging.info("VM suspended to disk. sleep 10 seconds to have a
> break...")
> > +    time.sleep(10)
> > +
> > +    # Start vm, and check whether the program is still running
> > +    logging.info("Restart VM now...")
> > +
> > +    if not vm.create():
> > +        raise error.TestError("failed to start the vm again.")
> > +    if not vm.is_alive():
> > +        raise error.TestError("VM seems to be dead; Test requires a
> live VM.")

I don't think we need to check vm.is_alive() because vm.create() already
does that after starting the VM.

> > +    # Check whether test command still alive
> > +    if session2.get_command_status(check_s4_cmd):
> > +        raise error.TestFail("%s died, indicating that S4 failed" %
> test_s4_cmd)

In TAP mode this should work, but in user mode I think session2 won't
survive s4, so you need to start a new one, preferably with
wait_for(vm.remote_login, ...) and with proper logging messages.
kvm_test_utils.wait_for_login() should work, but if it fails it will
raise a very generic "could not log into guest" message, which doesn't
inform the user that this is happening _after_ s4.

For example,

session2 = kvm_utils.wait_for(vm.remote_login, 30, 0 2)
if not session2:
    raise error.TestFail("Could not log into guest after S4")

Also, you should probably use "if session2.get_command_status(...) != 0",
because get_command_status() returns None if it can't get the command
status.

> My doubt here is if by issuing a get_command_status we are actually
> checking if the first command issued is alive. I am under the
> impression that this will only create a new process with the same
> command line, and hence, we are not checking if the background
> command
> is alive... Perhaps check if the command is alive using ps/windows
> task manager or something?

This is exactly what the command does.  test_s4_cmd starts a process
in the background, and we don't care about its exit status (it shouldn't
exit).  check_s4_cmd just checks whether the background process is still
running, and sets the exit status accordingly.
"pgrep tcpdump" exits with 0 only if tcpdump is running.
The same goes for 'tasklist | find /I "ping.exe"'.

> About the wording of some messages, I have here a suggestion (full
> code that I had edited, see below), please take a look at it and let
> me know if you think it's OK.
> 
> {{{
> import logging, time
> from autotest_lib.client.common_lib import error
> import kvm_test_utils, kvm_utils
> 
> 
> def run_guest_s4(test, params, env):
>     """
>     Suspend guest to disk,supports both Linux & Windows OSes.
> 
>     @param test: kvm test object.
>     @param params: Dictionary with test parameters.
>     @param env: Dictionary with the test environment.
>     """
>     vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
>     session = kvm_test_utils.wait_for_login(vm)
> 
>     logging.info("Checking whether VM supports S4")
>     status =
> session.get_command_status(params.get("check_s4_support_cmd"))
>     if status is None:
>         logging.error("Failed to check if host supports S4")
>     elif status != 0:
>         raise error.TestFail("Guest does not support S4")
> 
>     logging.info("Wait until all guest services are fully started")
>     time.sleep(30)
> 
>     # Start up a program (tcpdump for linux & ping for Windows), as a
> flag.
>     # If the program died after suspend, then fails this testcase.
>     test_s4_cmd = params.get("test_s4_cmd")
>     session.sendline(test_s4_cmd)
> 
>     # Get the second session to start S4
>     session2 = kvm_test_utils.wait_for_login(vm)
> 
>     check_s4_cmd = params.get("check_s4_cmd")
>     if session2.get_command_status(check_s4_cmd):
>         raise error.TestError("Failed to launch '%s' as a background
> process" %
>                               test_s4_cmd)
>     logging.info("Launched background command in guest: %s" %
> test_s4_cmd)
> 
>     # Suspend to disk
>     logging.info("Start suspend to disk now...")
>     session2.sendline(params.get("set_s4_cmd"))
> 
>     if not kvm_utils.wait_for(vm.is_dead, 360, 30, 2):
>         raise error.TestFail("VM refuses to go down. Suspend failed")
>     logging.info("VM suspended successfully. Wait before booting it
> again.")
>     time.sleep(10)
> 
>     # Start vm, and check whether the program is still running
>     logging.info("Start suspended VM...")
> 
>     if not vm.create():
>         raise error.TestError("Failed to start the VM after suspend to
> disk")
>     if not vm.is_alive():
>         raise error.TestError("VM seems to be dead after it was
> suspended")
> 
>     # Check whether test command still alive
>     logging.info("Checking if background command returned")
>     if session2.get_command_status(check_s4_cmd):
>         raise error.TestFail("Command %s failed. S4 failed" %
> test_s4_cmd)
> 
>     logging.info("VM resumed after S4")
>     session2.sendline(params.get("kill_test_s4_cmd"))
>     session.close()
>     session2.close()
> 
> }}}
> 
> So I am waiting your comments before I can apply this patch.
> 
> Thanks!
> 
> Lucas
> _______________________________________________
> Autotest mailing list
> Autotest@xxxxxxxxxxxxxxx
> http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html