On 04/17/2010 01:23 AM, Lucas Meneghel Rodrigues wrote:
> In some occasions even though a VM has terminated,
> some remote shell sessions will take a long time
> before giving up on the host. This situation is
> happening frequently on subtests such as autotest:
> The VM shuts down, but the session will be alive
> for a long time after the VM died.

How long? See comments below.

> So let's keep track of all sessions stablished to
> a VM object, and create a remote session cleaning
> thread, very much similar to the screendumps thread
> recently introduced. This thread would go over all
> dead VMs in the environment, and kill any remote
> sessions associated to them that might still be alive.
>
> This way we can save some good time in tests where
> the VM was terminated due to some weird reason.
>
> Signed-off-by: Lucas Meneghel Rodrigues <lmr@xxxxxxxxxx>
> ---
>  client/tests/kvm/kvm_preprocessing.py |   33 ++++++++++++++++++++++++++++++++-
>  client/tests/kvm/kvm_vm.py            |    7 +++++++
>  2 files changed, 39 insertions(+), 1 deletions(-)
>
> diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
> index 50db65c..707565d 100644
> --- a/client/tests/kvm/kvm_preprocessing.py
> +++ b/client/tests/kvm/kvm_preprocessing.py
> @@ -13,7 +13,8 @@ except ImportError:
>
>  _screendump_thread = None
>  _screendump_thread_termination_event = None
> -
> +_session_cleaning_thread = None
> +_session_cleaning_thread_termination_event = None
>
>  def preprocess_image(test, params):
>      """
> @@ -267,6 +268,14 @@ def preprocess(test, params, env):
>                                                args=(test, params, env))
>          _screendump_thread.start()
>
> +    # Start the session cleaning thread
> +    logging.debug("Starting remote session cleaning thread")
> +    global _session_cleaning_thread, _session_cleaning_thread_termination_event
> +    _session_cleaning_thread_termination_event = threading.Event()
> +    _session_cleaning_thread = threading.Thread(target=_clean_remote_sessions,
> +                                                args=(test, params, env))
> +    _session_cleaning_thread.start()
> +
>
>  def postprocess(test, params, env):
>      """
> @@ -442,3 +451,25 @@ def _take_screendumps(test, params, env):
>          if _screendump_thread_termination_event.isSet():
>              break
>          _screendump_thread_termination_event.wait(delay)
> +
> +
> +def _clean_remote_sessions(test, params, env):
> +    """
> +    Some remote shell servers such as SSH can be very slow on giving up of a
> +    connection, which is fair. However, if the VM is known to be dead, we
> +    can speed up this process reaping the remote sessions stablished to it.
> +
> +    @param test: KVM test object.
> +    @param params: Dictionary with test parameters.
> +    @param env: KVM test environment.
> +    """
> +    global _session_cleaning_termination_event
> +    delay = float(params.get("session_cleaning_delay", 30))
> +    while True:
> +        for vm in kvm_utils.env_get_all_vms(env):
> +            if vm.is_dead():
> +                for session in vm.get_remote_session_list():
> +                    session.close()
> +        if _session_cleaning_thread_termination_event.isSet():
> +            break
> +        _session_cleaning_thread_termination_event.wait(delay)
> diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
> index 047505a..82ef3cf 100755
> --- a/client/tests/kvm/kvm_vm.py
> +++ b/client/tests/kvm/kvm_vm.py
> @@ -115,6 +115,7 @@ class VM:
>          self.root_dir = root_dir
>          self.address_cache = address_cache
>          self.pci_assignable = None
> +        self.remote_session_list = []
>
>          # Find available monitor filename
>          while True:
> @@ -749,6 +750,10 @@ class VM:
>          return self.process.get_pid()
>
>
> +    def get_remote_session_list(self):
> +        return self.remote_session_list
> +
> +
>      def get_shared_meminfo(self):
>          """
>          Returns the VM's shared memory information.
> @@ -802,6 +807,8 @@ class VM:
>          if session:
>              session.set_status_test_command(self.params.get("status_test_"
>                                                              "command", ""))
> +            self.remote_session_list.append(session)
> +
>          return session
>

The problem with this approach is that storing the remote sessions in the
VM object can have weird side effects:
- The VM object is pickled at the end of the test, and unpickled at the
  beginning of the next test. The stored sessions will also be unpickled.
  I'm not sure if that's bad but we should check it out.
- Currently sessions are terminated automatically as soon as they're not
  needed. Keeping them in the VM object will delay their termination
  (though I don't think they'll make it to the next test alive).

I'm still not sure these are actual problems -- I'd like to test it to be
sure.

Minor issues:
- You need to set() the termination event and join() the thread at
  postprocessing, otherwise it's pointless to call isSet() on the
  termination event.
- Dead sessions should probably be removed from remote_session_list.

Alternative solution I:
We can try ssh's ServerAliveInterval option. The ssh man page says:
"If, for example, ServerAliveInterval is set to 15, and
ServerAliveCountMax is left at the default, if the server becomes
unresponsive ssh will disconnect after approximately 45 seconds."
For nc and rss.exe, if the problem exists there, we can use nc's -w
option, though it will also kill sessions that are merely idle (not dead).

Alternative solution II:
Modify kvm_subprocess: either make it kill idle processes (optionally,
because we don't want qemu to be killed), or make it exit when a given
PID ceases to exist, like tail's --pid option (the PID will be qemu's).
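
To illustrate the two minor issues (set()/join() at postprocessing, and
dropping closed sessions from the list), here is a rough standalone
sketch. It is not the patch itself: FakeSession, FakeVM, start_cleaner
and stop_cleaner are made-up names for illustration, the loop takes a
plain list of VMs instead of env, and it uses the Python 3 spelling
is_set() rather than isSet().

```python
import threading


class FakeSession:
    """Hypothetical stand-in for a remote shell session."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeVM:
    """Hypothetical stand-in for kvm_vm.VM, reduced to what the loop needs."""

    def __init__(self, dead, sessions):
        self.dead = dead
        self.remote_session_list = list(sessions)

    def is_dead(self):
        return self.dead


def clean_remote_sessions(vms, stop_event, delay):
    """Close and forget the sessions of dead VMs until stop_event is set."""
    while True:
        for vm in vms:
            if vm.is_dead():
                for session in vm.remote_session_list:
                    session.close()
                # Forget the closed sessions so the list does not keep growing.
                vm.remote_session_list = []
        if stop_event.is_set():
            break
        # Sleeps for up to `delay` seconds, but returns early once set() is called.
        stop_event.wait(delay)


def start_cleaner(vms, delay=30.0):
    """Preprocessing side: start the thread, return it with its event."""
    stop_event = threading.Event()
    thread = threading.Thread(target=clean_remote_sessions,
                              args=(vms, stop_event, delay))
    thread.start()
    return thread, stop_event


def stop_cleaner(thread, stop_event):
    """Postprocessing side: wake the thread up, then wait for it to exit."""
    stop_event.set()
    thread.join()
```

Calling stop_cleaner() from postprocess() is what makes the is_set()
check reachable; without it the loop never ends. Note that the sketch
resets the list from the cleaning thread without any locking, so if
sessions can be appended from the main thread at the same time, a lock
around both accesses would probably be needed.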