On Mon, Aug 06, 2018 at 07:20:10AM +0200, Christian Ehrhardt wrote:
> In that case I wonder what the libvirt community thinks of the proposed
> general "Pid is gone means we can assume it is dead" approach?

The key thing with the shutdown process is that we use the disappearance
of the PID as the flag to indicate that it is safe to release any
resources that the PID was using, e.g. the hostdevs are now available for
another guest to use.

I'd be concerned that if we look for /proc/$PID going away as the flag,
then we would be releasing the hostdevs for reuse before the kernel has
cleaned them up. In the best case this would result in a 2nd guest
failing to start because the device was still in use; in the worst case
we could crash the entire host (though I'd be hopeful vfio prevents that).

> An alternative would be to understand on the Kernel side why the PID is
> gone "too early" and fix that so it stays until fully cleaned up.
> But even then on the Libvirt side we would need the extended timeout values.

Yeah, looks like extended timeouts are unavoidable.

The only real optimization would be to pass an explicit timeout to the
kill method, increasing it by 2 seconds for each hostdev that is assigned.
That way we'll scale the timeout up as we need, so we don't have to
predict the worst case number of assigned devices.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
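
To make the timeout scaling concrete, here is a rough sketch in Python rather than libvirt's C. It is not libvirt code: the function names, the 15 second base value and the polling interval are all assumptions made up for illustration. Only the "2 seconds per assigned hostdev" increment comes from the proposal above.

```python
import os
import signal
import time

# Assumption: placeholder base timeout, NOT the value libvirt actually uses.
BASE_TIMEOUT_SECONDS = 15.0
# From the proposal above: add 2 seconds for each assigned hostdev.
PER_HOSTDEV_EXTRA_SECONDS = 2.0


def kill_timeout(num_hostdevs):
    """Return how long to wait for the PID to go away, in seconds."""
    if num_hostdevs < 0:
        raise ValueError("num_hostdevs must be >= 0")
    return BASE_TIMEOUT_SECONDS + PER_HOSTDEV_EXTRA_SECONDS * num_hostdevs


def pid_exists(pid):
    """Probe for the PID with signal 0 (POSIX only), without disturbing it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def kill_and_wait(pid, num_hostdevs, poll_interval=0.2):
    """Hypothetical helper: SIGKILL the PID, then poll until it is gone.

    Returns True if the PID disappeared before the scaled timeout expired.
    Only on True would the caller go on to release the hostdevs.
    """
    deadline = time.monotonic() + kill_timeout(num_hostdevs)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return True  # already gone
    while time.monotonic() < deadline:
        if not pid_exists(pid):
            return True
        time.sleep(poll_interval)
    return False
```

With these placeholder values a guest with no hostdevs waits up to 15 seconds and a guest with 3 hostdevs waits up to 21 seconds, so nobody has to guess the worst case number of devices up front. Note that if the PID is a direct child of the caller it stays visible as a zombie until it is reaped, so a real implementation would have to account for that.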