On Tue, 2010-11-09 at 17:42 +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 09, 2010 at 08:34:54AM -0700, Alex Williamson wrote:
> > On Tue, 2010-11-09 at 17:07 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Nov 09, 2010 at 07:58:23AM -0700, Alex Williamson wrote:
> > > > On Tue, 2010-11-09 at 14:00 +0200, Michael S. Tsirkin wrote:
> > > > > On Mon, Nov 08, 2010 at 02:23:37PM -0700, Alex Williamson wrote:
> > > > > > On Mon, 2010-11-08 at 22:59 +0200, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Nov 08, 2010 at 10:20:46AM -0700, Alex Williamson wrote:
> > > > > > > > On Mon, 2010-11-08 at 18:54 +0200, Michael S. Tsirkin wrote:
> > > > > > > > > On Mon, Nov 08, 2010 at 07:59:57AM -0700, Alex Williamson wrote:
> > > > > > > > > > On Mon, 2010-11-08 at 13:40 +0200, Michael S. Tsirkin wrote:
> > > > > > > > > > > On Wed, Oct 06, 2010 at 02:58:57PM -0600, Alex Williamson wrote:
> > > > > > > > > > > > Our code paths for saving or migrating a VM are full of functions that
> > > > > > > > > > > > return void, leaving no opportunity for a device to cancel a migration,
> > > > > > > > > > > > either from error or incompatibility. The ivshmem driver attempted to
> > > > > > > > > > > > solve this with a no_migrate flag on the save state entry. I think the
> > > > > > > > > > > > more generic and flexible way to solve this is to allow driver save
> > > > > > > > > > > > functions to fail. This series implements that and converts ivshmem
> > > > > > > > > > > > to use a set_params function to NAK migration much earlier in the
> > > > > > > > > > > > process. This touches a lot of files, but the bulk of those changes are
> > > > > > > > > > > > simply s/void/int/ and tacking a "return 0" onto the end of functions.
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Alex
> > > > > > > > > > >
> > > > > > > > > > > Well, error handling is always tricky: it seems easier to
> > > > > > > > > > > require save handlers to never fail.
> > > > > > > > > >
> > > > > > > > > > Sure it's easier, but does that make it robust?
> > > > > > > > >
> > > > > > > > > More robust in the face of what kind of failure?
> > > > > > > >
> > > > > > > > I really don't understand why we're having a discussion about whether
> > > > > > > > providing a means to return an error is a good thing or not. These
> > > > > > > > patches touch a lot of files, but the change is dead simple.
> > > > > > >
> > > > > > > I just don't see the motivation. Presumably your patches are
> > > > > > > there to achieve some kind of goal, right? I am trying to
> > > > > > > figure out what that goal is.
> > > > > >
> > > > > > My goal is that I want to be able to NAK a migration when devices are
> > > > > > assigned, and I think we can do it more generically than the no_migrate
> > > > > > flag so that it supports this application and any other reason that
> > > > > > saves might fail in the future.
> > > > >
> > > > > More generically but harder to understand and debug, IMO.
> > > >
> > > > How is returning an error condition hard to understand? Debugging seems
> > > > easier to me, especially if drivers follow the precedent set in the last
> > > > patch and fprintf the reason for the failure. Ideally this would be
> > > > some kind of push out to qmp, but it still seems easier than figuring
> > > > out which driver called register_device_unmigratable().
> > > >
> > > > > > > Currently savevm callbacks never fail. So they
> > > > > > > return void. Why is returning 0 and adding a bunch of code to test the
> > > > > > > condition that never happens a good idea? It just seems to create more
> > > > > > > ways for devices to shoot themselves in the foot.
> > > > > >
> > > > > > And more ways to indicate something bad happened and keep running. We
> > > > > > already have far too many abort() calls in the code.
> > > > >
> > > > > If you can keep running why can't you migrate?
> > > >
> > > > Well, as you know device assignment is tied to the hardware, so can't
> > > > migrate, but can always keep running. The ivshmem driver has a peer
> > > > role, where it's tied to the host memory, so can't migrate, but can keep
> > > > running.
> > >
> > > Right. All these are covered with no_migrate flag well enough.
> > > Their inability to migrate does not change at runtime.
> >
> > But it could. What if ivshmem is acting in a peer role, but has no
> > clients, could it migrate? What if ivshmem is migratable when the
> > migration begins, but while the migration continues, a connection is
> > set up and it becomes unmigratable?
>
> Sounds like something we should work to prevent, not support :)

s/:)/:(/ why?

> > Using this series, ivshmem would
> > have multiple options how to support this. It could a) NAK the
> > migration, b) drop connections and prevent new connections until the
> > migration finishes, c) detect that new connections have happened since
> > the migration started and cancel. And probably more. no_migrate can
> > only do a). And in fact, we can only test no_migrate after the VM is
> > stopped (after all memory is migrated) because otherwise it could race
> > with devices setting no_migrate during migration.
>
> We really want no_migrate to be static. Changing it is abusing
> the infrastructure.

You call it abusing, I call it making use of the infrastructure. Why
unnecessarily restrict ourselves? Is return 0/-1 really that scary,
unmaintainable, undebuggable? I don't understand the resistance.

Alex
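
For illustration, the pattern argued for in this thread (save handlers that return int, with the caller cancelling on a nonzero result) might look roughly like the sketch below. It is standalone C, not QEMU code: the Device struct, the active_peers field, and the functions save_plain, save_peer and save_all are all hypothetical names invented for this example. It models only option a) from the list above, a NAK decided at save time from the device's current state.

```c
#include <stdio.h>

/* Hypothetical device model for this sketch only; not a QEMU structure. */
typedef struct Device {
    const char *name;
    int active_peers;                  /* assumption: >0 means state is tied to this host */
    int (*save)(struct Device *dev);   /* returns 0 on success, -1 to NAK the migration */
} Device;

/* A device with nothing host-specific: its save always succeeds. */
static int save_plain(Device *dev)
{
    (void)dev;
    return 0;
}

/* A peer-style device: refuses to save while peers are connected,
 * and prints the reason so the failure is easy to trace. */
static int save_peer(Device *dev)
{
    if (dev->active_peers > 0) {
        fprintf(stderr, "%s: cannot migrate with %d active peer(s)\n",
                dev->name, dev->active_peers);
        return -1;
    }
    return 0;
}

/* Run every save handler in order; stop at the first failure and
 * report it to the caller, which can cancel and keep the VM running. */
static int save_all(Device *devs, int n)
{
    for (int i = 0; i < n; i++) {
        if (devs[i].save(&devs[i]) < 0) {
            return -1;
        }
    }
    return 0;
}
```

Because the check runs when save is called rather than when the device is registered, the same device can be migratable at one moment and not at another, which is the case a static flag cannot express.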