The patches for secure migration raise an interesting question wrt the handling of data streams and their effects on the internal driver API and the public API. Although the migration helper APIs are not technically public, they do map onto the remote wire protocol and as such we have the same long term compatability issues to worry about. The way the migration APIs fit together are obscuring the picture a little, so for the sake of clarity, the remainder of this mail is going to talk about a ficticious public API 'virDomainRestoreStream' which allows a guest domain to be restored from a generic data stream, rather than a named file. If we can solve this API problem, then the design will trivially apply to secure migration. I'll now outline some possible approaches at public API level: 1. Pass a file handle to the public API Application usage: fd = open(filename); ret = virDomainRestoreStream(dom, fd); Driver internal usage: int virDomainRestoreStreamImpl(virDomainPtr dom, int fd) char buf[4096]; int ret; int qemuFD; qemuFD = runQEMUWithIncomingFD(dom); do { ret = read(fd, buf, sizeof buf); if (ret > 0) write(qemuFD, buf, ret); } while (ret > 0); } Good: Restore functionality all in one driver method Good: Public API is very simple Good: Internal driver can poll() on the FD to avoid blocking Bad: Application API is blocked Bad: Data read from FD might need transformation eg, uncompress, or decrypt TLS/SASL. 2. Provide public APIs for starting restore, feeding data, and completing. This matches proposal for secure migration patchset. Application usage: fd = open(filename); ret = virDomainRestorePrepare(dom); do { char buf[4096]; ret = read(fd, buf, sizeof buf); virDomainRestoreData(dom, buf, ret); } while (ret > 0); virDomainRestoreFinish(dom, ret == 0 ? 0 : 1); Driver internal usage: int virDomainRestorePrepareImpl(virDomainPtr dom) { qemuFD = runQEMUWithIncomingFD(dom); } int virDomainRestoreDataImpl(virDomainPtr dom, const char *buf, int buflen) { qemudFD = ...find previously opened qemuFD ... write(qemuFD, buf, buflen) } int virDomainRestoreFinishImpl(virDomainPtr dom, int error) { if (error) ...kill QEMU ... else qemuFD = ...find previously opened qemuFD ... close(qemuFD); } Good: Application can easily decrypt input data Good: Application can use an event loop to feed in data as it becomes available. eg poll in socket() Good: Application API never blocks execution for long time Bad: Driver has to maintain state across calls indefinitely Bad: Cannot guarentee that same client calls prepare/data. ie Different clients can get mixed up feeding data. Bad: Public API is fairly complex Bad: Lots of public API entry points for each method needing streams. 3. Provide a stream object for feeding data to driver from application. Similar to option 2, but provides easier state mgmt for driver. The driver will set callbacks on the data stream to receive data from client. Application usage: virDataStream stream = virDataStreamNew(); ret = virDomainRestore(dom, stream); do { char buf[4096]; ret = read(fd, buf, sizeof buf); virDataStreamWrite(stream, buf, ret); } while (ret > 0); virDataStreamFinish(dom, ret == 0 ? 0 : 1); Driver internal usage: int virDomainRestoreStreamImpl(virDomainPtr dom, virDataStream stream) { qemuFD = runQEMUWithIncomingFD(dom); virDataStreamSetCallbacks(stream, virDomainRestoreDataImpl, virDomainRestoreFinishImpl, (void *)qemudFD); } int virDomainRestoreDataImpl(virDomainPtr dom, const char *buf, int buflen, void *opaque) { int qemudFD = (int)opaque; write(qemuFD, buf, buflen) } int virDomainRestoreFinishImpl(virDomainPtr dom, int error, void *opaque) { if (error) ...kill QEMU ... else int qemudFD = (int)opaque; close(qemuFD); } Good: Application can easily decrypt input data Good: Application can use an event loop to feed in data as it becomes available. eg poll in socket() Good: Application API never blocks execution for long time Good: New APIs reusable for any public API with data stream Bad: Public API is fairly complex Bad: Driver has to maintain state across calls indefinitely 4. Provide a callback for driver to fetch data from the client app. Similar to option 1, but avoids need to expose concept of a 'fd' in public API directly. Application usage: int appreader(virDomainPtr dom, char *buf, int buflen, void *opaque) { int fd = (int)opaque; return read(fd, buf, buflen) } fd = open(filename); ret = virDomainRestoreStream(dom, appreader, (void *)fd); Driver internal usage int virDomainRestoreStreamImpl(virDomainPtr dom, int (*reader)(virDomainPtr, char *, int, voi d*), void *opaque) char buf[4096]; int ret; int qemuFD; qemuFD = runQEMUWithIncomingFD(dom); do { ret = (*reader)(dom, buf, sizeof buf, opaque); if (ret > 0) write(qemuFD, buf, ret); } while (ret > 0); } Good: Restore functionality all in one driver method Good: Public API is very simple Good: Client app callback can decrypt data Bad: Application API is blocked Bad: Internal driver code will block on executing callback if not data is available. Can't integrate with event loop. 5. Provide a generic public stream API to fetch data from the client app. Similar to option 4, but adding an stream object to manage the callback - allows more functionality to be added to public API later without changing restore API contract. Application usage: int appreader(virDomainPtr dom, char *buf, int buflen, void *opaque) { int fd = (int)opaque; return read(fd, buf, buflen) } fd = open(filename); stream = virDataStreamNewReader(appreader, (void*)fd); ret = virDomainRestoreStream(dom, stream); Driver internal usage int virDomainRestoreStreamImpl(virDomainPtr dom, virDataStream stream) { char buf[4096]; int ret; int qemuFD; qemuFD = runQEMUWithIncomingFD(dom); do { ret = virDataStreamRead(dom, buf, sizeof buf); if (ret > 0) write(qemuFD, buf, ret); } while (ret > 0); } Good: Restore functionality all in one driver method Good: Public API is fairly simple Good: Client app callback can decrypt data Bad: Application API is blocked Bad: Internal driver code will block on executing callback if not data is available. Can't integrate with event loop. All the APIs have good and bad points to them, in particular there is a difficult tradeoff between simplicity of the public API application code, vs the internal API implementation code. Some important goals though: - There must be a way to invoke the public API without blocking the application code - The driver must be able to receive data from an encrypted channel, because in libvirtd the FD might be the SASL/TLS socket - Internal driver API should not block on callbacks to app code, since it might need to be polling on another FD concurrently with reading data. - The number of new APIs to support streaming should not increase for each new method needing stream support. Ultimately I think options 3 or 5 are the most promising, because the addition of a generic 'virDataStream' public object makes it easier to manage the processing of the data stream without adding huge numbers of new APIs. Option 3 is a little more cumbersome to use from application code, but it avoids blocking either the client app, or the internal driver. The downside is the driver impl code is split across several methods. With option 5 it is harder to avoid blocking the client and internal driver, since it'd require the driver to integrate with an event loop, but there is no direct FD for the driver to poll() on. The choices made also have possible implications on the design of the remote wire protocol to support these methods. Ignoring the design of the public API, there are a handful of ways to stream data between client and server 1. Invoke primary method eg "restore domain", then feed the data in a sequence of following RPC calls. C --------------> S Restore domain call C <-------------- S Restore domain reply C --------------> S Restore data call 1 C <-------------- S Restore data reply 1 C --------------> S Restore data call 2 C <-------------- S Restore data reply 2 ............... C --------------> S Restore data call n C <-------------- S Restore data reply n C --------------> S Restore data complete C <-------------- S Restore data reply If server wants to abort a restore operation, it'll send an error on one of the replies. 2. Invoke primary method eg "restore domain", then feed the data in a sequence of following async messages. C --------------> S Restore domain call C <-------------- S Restore domain reply C --------------> S Restore data msg 1 C --------------> S Restore data msg 2 ............... C --------------> S Restore data msg n C --------------> S Restore data complete C <-------------- S Restore data reply If server wants to stop the client without closing the socket, it needs an async 'stop' message from server to client. This is prety much same as option 2, but is killing off the explicit replies for each data packet. 3. Invoke primary method eg "restore domain", but require the data to be stream to server, before the reply is sent back C --------------> S Restore domain call C --------------> S Stream data msg 1 C --------------> S Stream data msg 2 ............... C --------------> S Stream data msg n C --------------> S Stream finish msg C <-------------- S Restore domain reply If server wants to stop the client without closing the socket, it simply sends back the 'restore domain reply' message as an error before client finishes sending data, and ignore any further data messages. Options 2 or 3 have potential benefits on links with noticable latency, since they're not blocking the client on synchronous replies from the server. That said, the remote protocol does allow for interleaving of calls & replies, so with Option 1 the client could send multiple data packets without waiting for their replies, and deal with possible delayed error replies. If doing that though, the benefit of having a 1-to-1 call-to-reply ratio is minimal, miht as well go for a n-to-1, call-to-reply approach. It is hard to match public API option 3, with wire protocol option 3, because of the delayed 'restore domain reply' message. The options I'm really thinking are most viable are - Public API 3 + RPC 2 - Public API 5 + RPC 3 Both of these are a little more complex to implement in the libvirtd daemon that Chris' current secure migration patches, but then then also have functional & design benefits Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| -- Libvir-list mailing list Libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list