On Fri, Jun 23, 2023 at 08:17:46AM +0100, Daniel P. Berrangé wrote:
> On Thu, Jun 22, 2023 at 03:20:01PM -0400, Peter Xu wrote:
> > On Thu, Jun 22, 2023 at 05:33:29PM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Jun 22, 2023 at 11:54:43AM -0400, Peter Xu wrote:
> > > > I can try to move the todo even higher.  Trying to list the initial goals
> > > > here:
> > > >
> > > > - One extra phase of handshake between src/dst (maybe the time to boost
> > > >   QEMU_VM_FILE_VERSION) before anything else happens.
> > > >
> > > > - Dest shouldn't need to apply any cap/param, it should get all from src.
> > > >   Dest still needs to be set up with a URI and that should be all it needs.
> > > >
> > > > - Src shouldn't need to worry about the binary version of dst anymore as
> > > >   long as dest qemu supports handshake, because src can fetch it from dest.
> > >
> > > I'm not sure that works in general. Even if we have a handshake and
> > > bi-directional comms for live migration, we still have the save/restore
> > > to file codepath to deal with. The dst QEMU doesn't exist at the time
> > > the save process is done, so we can't add logic to VMState handling that
> > > assumes knowledge of the dst version at time of serialization.
> >
> > My current thought was still based on a new cap or anything the user would
> > need to specify first on both sides (but hopefully the last cap to set on
> > dest).
> >
> > E.g. with a new handshake cap, we shouldn't set it on an exec: or file:
> > protocol migration, and qmp_migrate() should just fail, reporting that
> > the URI is not supported, if the cap is set.  Return path is definitely
> > required here.
>
> exec can support bi-directional migration - we have both stdin + stdout
> for the command. For exec it is mostly a documentation problem - you
> can't merely use 'cat' for example, but if you used 'socat' that could
> be made to work bi-directionally.

Okay.
That was just an example showing that the handshake cannot work in every
case, and that it should always be able to fail.  So when exec doesn't
properly provide a return path, I think with handshake=on we should time out
waiting for the response and then fail the migration.

There're a bunch of implications and details that need to be investigated
around such a handshake if it'll be proposed for real, so I'm not yet sure
whether there's something that may be surprising.  For channeltypes it all
seems fine for now.  Hopefully nothing obvious is overlooked.

> > > I don't think it's possible for QEMU to validate that it has a fully
> > > bi-directional channel, without adding timeouts to its detection which I
> > > think we should strive to avoid.
> > >
> > > I don't think we actually need self-bootstrapping anyway.
> > >
> > > I think the mgmt app can just indicate the new v2 bi-directional
> > > protocol when issuing the 'migrate' and 'migrate-incoming'
> > > commands.  This becomes trivial when Het's refactoring of the
> > > migrate address QAPI is accepted:
> > >
> > >   https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04851.html
> > >
> > > eg:
> > >
> > >     { "execute": "migrate",
> > >       "arguments": {
> > >           "channels": [ { "channeltype": "main",
> > >                           "addr": { "transport": "socket", "type": "inet",
> > >                                     "host": "10.12.34.9",
> > >                                     "port": "1050" } } ] } }
> > >
> > > note the 'channeltype' parameter here. If we declare that 'main'
> > > refers to the existing migration protocol, then we merely need
> > > to define a new 'channeltype' to use as an indicator for the
> > > v2 migration handshake protocol.
> >
> > Using a new channeltype would also work at least on src qemu, but I'm not
> > sure on how dest qemu would know that it needs a handshake in that case,
> > because it knows nothing until the connection is established.
>
> In Het's series the 'migrate_incoming' command similarly has a channeltype
> parameter.

Oh, yeah then that'll just work.
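
For illustration only, the fail-on-timeout behaviour discussed above for a
channel without a working return path could look roughly like the sketch
below. This is not QEMU code: the message strings, the timeout value and all
function names are hypothetical, and a local socket pair stands in for the
migration channel. The peer either answers (like a 'socat' pipeline would
allow) or silently consumes the data (like 'cat').

```python
import socket
import threading

HANDSHAKE_TIMEOUT_S = 0.5  # hypothetical value; a real timeout would need tuning


def try_handshake(sock, timeout=HANDSHAKE_TIMEOUT_S):
    """Send a hello and wait for a reply; True only if one arrives in time."""
    sock.settimeout(timeout)
    try:
        sock.sendall(b"HELLO\n")   # hypothetical handshake message
        reply = sock.recv(64)      # returns on data or EOF, raises on timeout
    except OSError:                # socket.timeout is a subclass of OSError
        return False               # no usable return path: fail the migration
    return reply.startswith(b"HELLO-ACK")


def peer(sock, bidirectional):
    """Stand-in for the destination end of the channel."""
    sock.recv(64)                  # consume the hello
    if bidirectional:              # 'socat'-like: the reply travels back
        sock.sendall(b"HELLO-ACK\n")
    # 'cat'-like: data is consumed and nothing is ever sent back


def run(bidirectional):
    """Run one handshake attempt over a local socket pair."""
    src, dst = socket.socketpair()
    worker = threading.Thread(target=peer, args=(dst, bidirectional))
    worker.start()
    ok = try_handshake(src)
    worker.join()
    src.close()
    dst.close()
    return ok
```

Calling run(True) models a channel with a return path, and run(False) models
one where the source can only give up once the timeout expires.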

Thanks.

--
Peter Xu