----- "Yolkfull Chow" <yzhou@xxxxxxxxxx> wrote: > On 06/10/2009 06:03 PM, Michael Goldish wrote: > > ----- "Yolkfull Chow"<yzhou@xxxxxxxxxx> wrote: > > > > > >> On 06/09/2009 05:44 PM, Michael Goldish wrote: > >> > >>> The test looks pretty nicely written. Comments: > >>> > >>> 1. Consider making all the cloned VMs use image snapshots: > >>> > >>> curr_vm = vm1.clone() > >>> curr_vm.get_params()["extra_params"] += " -snapshot" > >>> > >>> I'm not sure it's a good idea to let all VMs use the same disk > >>> > >> image. > >> > >>> Or maybe you shouldn't add -snapshot yourself, but rather do it > in > >>> > >> the config > >> > >>> file for the first VM, and then all cloned VMs will have > -snapshot > >>> > >> as well. > >> > >>> > >>> > >> Yes I use 'image_snapshot = yes' in config file. > >> > >>> 2. Consider changing the message > >>> " Booting the %dth guest" % num > >>> to > >>> "Booting guest #%d" % num > >>> (because there's no such thing as 2th and 3th) > >>> > >>> 3. Consider changing the message > >>> "Cannot boot vm anylonger" > >>> to > >>> "Cannot create VM #%d" % num > >>> > >>> 4. Why not add curr_vm to vms immediately after cloning it? > >>> That way you can kill it in the exception handler later, without > >>> > >> having > >> > >>> to send it a 'quit' if you can't login ('if not > curr_vm_session'). > >>> > >>> > >> Yes, good idea. > >> > >>> 5. " %dth guest boots up successfully" % num --> again, 2th and > 3th > >>> > >> make no sense. > >> > >>> Also, I wonder why you add those spaces before every info > message. > >>> > >>> 6. "%dth guest's session is not responsive" --> same > >>> (maybe use "Guest session #%d is not responsive" % num) > >>> > >>> 7. "Shut down the %dth guest" --> same > >>> (maybe "Shutting down guest #%d"? or destroying/killing?) > >>> > >>> 8. Shouldn't we fail the test when we find an unresponsive > session? > >>> It seems you just display an error message. You can simply > replace > >>> logging.error( with raise error.TestFail(. > >>> > >>> > >> > >>> 9. Consider using a stricter test than just > >>> > >> vm_session.is_responsive(). > >> > >>> vm_session.is_responsive() just sends ENTER to the sessions and > >>> > >> returns > >> > >>> True if it gets anything as a result (usually a prompt, or even > just > >>> > >> a > >> > >>> newline echoed back). If the session passes this test it is > indeed > >>> responsive, so it's a decent test, but maybe you can send some > >>> > >> command > >> > >>> (user configurable?) and test for some output. I'm really not > sure > >>> > >> this > >> > >>> is important, because I can't imagine a session would respond to > a > >>> > >> newline > >> > >>> but not to other commands, but who knows. Maybe you can send the > >>> > >> first VM > >> > >>> a user-specified command when the test begins, remember the > output, > >>> > >> and > >> > >>> then send all other VMs the same command and make sure the output > is > >>> > >> the > >> > >>> same. > >>> > >>> > >> maybe use 'info status' and send command 'help' via session to vms > and > >> compare their output? > >> > > I'm not sure I understand. What does 'info status' do? We're talking > about > > an SSH shell, not the monitor. You can do whatever you like, like > 'uname -a', > > and 'ls /', but you should leave it up to the user to decide, so > he/she > > can specify different commands for different guests. Linux commands > won't > > work under Windows, so Linux and Windows must have different > commands in > > the config file. In the Linux section, under '- @Linux:' you can > add > > something like: > > > > stress_boot: > > stress_boot_test_command = uname -a > > > > and under '- @Windows:': > > > > stress_boot: > > stress_boot_test_command = ver && vol > > > > These commands are just naive suggestions. I'm sure someone can > think of > > much more informative commands. > > > That's really good suggestions. Thanks, Michael. And can I use > 'migration_test_command' instead? Not really. Why would you want to use another test's param? 1. There's no guarantee that 'migration_test_command' is defined for your boot stress test. In fact, it is probably only defined for migration tests, so you probably won't be able to access it. Try params.get('migration_test_command') in your test and you'll probably get None. 2. The user may not want to run migration at all, and then he/she will probably not define 'migration_test_command'. 3. The user might want to use different test commands for migration and for the boot stress test. > >>> 10. I'm not sure you should use the param "kill_vm_gracefully" > >>> > >> because that's > >> > >>> a postprocessor param (probably not your business). You can just > >>> > >> call > >> > >>> destroy() in the exception handler with gracefully=False, because > if > >>> > >> the VMs > >> > >>> are non- responsive, I don't expect them to shutdown nicely with > an > >>> > >> SSH > >> > >>> command (that's what gracefully does). Also, we're using > -snapshot, > >>> > >> so > >> > >>> there's no reason to shut them down nicely. > >>> > >>> > >> Yes, I agree. :) > >> > >>> 11. "Total number booted successfully: %d" % (num - 1) --> why > not > >>> > >> just num? > >> > >>> We really have num VMs including the first one. > >>> Or you can say: "Total number booted successfully in addition to > the > >>> > >> first one" > >> > >>> but that's much longer. > >>> > >>> > >> Since after the first guest booted, I set num = 1 and then 'num += > 1' > >> > >> at first in while loop ( for the purpose of getting a new vm ). > >> So curr_vm is vm2 ( num is 2) now. If the second vm failed to boot > up, > >> the num booted successfully should be (num - 1). > >> I would use enumerate(vms) that Uri suggested to make number easier > to > >> count. > >> > > OK, I didn't notice that. > > > > > >>> 12. Consider adding a 'max_vms' (or 'threshold') user param to > the > >>> > >> test. If > >> > >>> num reaches 'max_vms', we stop adding VMs and pass the test. > >>> > >> Otherwise the > >> > >>> test will always fail (which is depressing). If > >>> > >> params.get("threshold") is > >> > >>> None or "", or in short -- 'if not params.get("threshold")', > disable > >>> > >> this > >> > >>> feature and keep adding VMs forever. The user can enable the > feature > >>> > >> with: > >> > >>> max_vms = 50 > >>> or disable it with: > >>> max_vms = > >>> > >>> > >> This is a good idea for hardware resource limit of host. > >> > >>> 13. Why are you catching OSError? If you get OSError it might be > a > >>> > >> framework bug. > >> > >>> > >>> > >> Since sometimes, vm.create() successfully but failed to ssh-login > >> since > >> the running python cannot allocate physical memory (OSError). > >> Add max_vms could fix this problem I think. > >> > > Do you remember exactly where OSError was thrown? Do you happen to > have > > a backtrace? (I just want to be very it's not a bug.) > > > The OSError was thrown when checking all VMs are responsive and I got > many traceback about "OSError: [Errno 12] Cannot allocate memory". > Maybe since when last VM was created successfully with lucky, whereas > python cannot get physical memory after that when checking all > sessions. > So can we now catch the OSError and tell user the number of max_vms > is too large? Sure. I was just worried it might be a framework bug. If it's a legitimate memory error -- catch it and fail the test. If you happen to catch that OSError again, and get a backtrace, I'd like to see it if that's possible. Thanks, Michael > >>> 14. At the end of the exception handler you should proably > re-raise > >>> > >> the exception > >> > >>> you caught. Otherwise the user won't see the error message. You > can > >>> > >> simply replace > >> > >>> 'break' with 'raise' (no parameters), and it should work, > >>> > >> hopefully. > >> > >>> > >>> > >> Yes I should if add a 'max_vms'. > >> > > I think you should re-raise anyway. Otherwise, what's the point in > writing > > error messages such as "raise error.TestFail("Cannot boot vm > anylonger")"? > > I you don't re-raise, the user won't see the messages. > > > > > >>> I know these are quite a few comments, but they're all rather > minor > >>> > >> and the test > >> > >>> is well written in my opinion. > >>> > >>> > >> Thank you, I will do modification according to your and Uri's > >> comments, > >> and will re-submit it here later. :) > >> > >> Thanks and Best Regards, > >> Yolkfull > >> > >>> Thanks, > >>> Michael > >>> > >>> ----- Original Message ----- > >>> From: "Yolkfull Chow"<yzhou@xxxxxxxxxx> > >>> To:kvm@xxxxxxxxxxxxxxx > >>> Cc: "Uri Lublin"<uril@xxxxxxxxxx> > >>> Sent: Tuesday, June 9, 2009 11:41:54 AM (GMT+0200) Auto-Detected > >>> Subject: [KVM-AUTOTEST PATCH] A test patch - Boot VMs until one > of > >>> > >> them becomes unresponsive > >> > >>> > >>> Hi, > >>> > >>> This test will boot VMs until one of them becomes unresponsive, > and > >>> records the maximum number of VMs successfully started. > >>> > >>> > >>> > >>> > >> -- > >> To unsubscribe from this list: send the line "unsubscribe kvm" in > >> the body of a message to majordomo@xxxxxxxxxxxxxxx > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > >> > > > -- > Yolkfull > Regards, -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html