On Sat, Oct 17, 2020 at 8:09 PM Alexander Graf <graf@xxxxxxxxx> wrote:
> There are applications way beyond that though. What do you do with
> applications that already consumed randomness? For example a cached pool
> of SSL keys. Or a higher level language primitive that consumes
> randomness and caches its seed somewhere in an internal data structure.

For deterministic protection, those would also have to poll some
memory location that tells them whether the VmGenID changed:

1. between reading entropy from their RNG pool and using it
2. between collecting data from external sources (user input, clock,
   ...) and encrypting it

and synchronously shoot down the connection if a change happened.

If e.g. an application inside the VM has an AES-GCM-encrypted TLS
connection and, directly after the VM is restored, triggers an
application-level timeout that sends some fixed message across the
connection, then the TLS library must guarantee that either the VM was
already committed to sending exactly that message before the VM was
forked or the message will be blocked. If we don't do that, an
attacker who captures both a single packet from the forked VM and
traffic from the old VM can decrypt the next message from the old VM
after the fork (because AES-GCM is like AES-CTR plus an authenticator,
and CTR means that when keystream reuse occurs and one of the
plaintexts is known, the attacker can simply recover the other
plaintext using XOR).

(Or maybe, in disaster failover environments, TLS 1.3 servers could
get away with rekeying the connection instead of shooting it down? Ask
your resident friendly cryptographer whether that would be secure, I
am not one.)

I don't think a mechanism based around asynchronously telling the
application and waiting for it to confirm the rotation at a later
point is going to cut it; we should have some hard semantics on when
an application needs to poll this value.

> Or even worse: your system's host ssh key.

Mmmh...
I think I normally would not want a VM to reset its host ssh key
after merely restoring a snapshot though? And more importantly,
Microsoft's docs say that they also change the VmGenID on disaster
failover. I think you very much wouldn't want your server to lose its
host key every time disaster failover happens. On the other hand,
after importing a public VM image, it might be a good idea.

I guess you could push that responsibility on the user, by adding an
option to the sshd_config that tells OpenSSH whether the host key
should be rotated on an ID change or not... but that still would not
be particularly pretty. Ideally we would have the host tell us what
type of events happened to the VM, or something like that... or maybe
even get the host VM management software to ask the user whether
they're importing a public image...

I really feel like with Microsoft's current protocol, we don't get
enough information to figure out what we should do about private
long-term authentication keys.
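
To make the "hard semantics" for polling from earlier in this mail a
bit more concrete, here is a rough sketch in Python. Everything in it
is hypothetical: read_gen_id stands in for however the generation ID
ends up being exposed to userspace, and encrypt/send stand in for the
TLS library's record layer. The idea is that the ID is re-checked after
the ciphertext has been fully computed but before it is sent, so that a
record either was committed to before any fork or is blocked.

```python
class GenerationChanged(Exception):
    """The VM generation ID changed; the connection must be shot down."""


class GuardedConnection:
    def __init__(self, read_gen_id, encrypt, send):
        # read_gen_id: hypothetical callable returning the current VM
        # generation ID (e.g. backed by a mapped memory location).
        self._read_gen_id = read_gen_id
        self._encrypt = encrypt
        self._send = send
        # The generation that the key/nonce state of this connection
        # belongs to.
        self._gen_id = read_gen_id()
        self.closed = False

    def send_message(self, plaintext):
        if self.closed:
            raise ConnectionError("connection was shot down")
        # Encrypting consumes nonce/keystream state.
        ciphertext = self._encrypt(plaintext)
        # Poll after the ciphertext is fixed, before it leaves the VM.
        # If the ID is unchanged here, the record was fully determined
        # before any fork, so clones can only send identical bytes.
        if self._read_gen_id() != self._gen_id:
            self.closed = True
            raise GenerationChanged("VmGenID changed, dropping connection")
        self._send(ciphertext)
```

The check is synchronous and sits on the send path, as opposed to a
notification that the application acknowledges at some later point.
Whether a plain memory read is sufficient in practice depends on how
the host updates the ID, which this sketch does not address.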
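
For anyone who wants to see the keystream-reuse problem from the
AES-GCM paragraph in action, here is a small Python illustration. It
does not use real AES: the keystream function below is a toy stand-in
built from SHA-256 in a counter construction, since only the "XOR with
a keystream derived from (key, nonce)" structure matters for the
argument. The key, nonce and messages are made up.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy CTR-style keystream (NOT AES): hash key, nonce and a block
    # counter, and concatenate the digests.
    out = b""
    counter = 0
    while len(out) < length:
        block = key + nonce + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"k" * 16
nonce = b"n" * 12  # both VMs resume with the same nonce state

known = b"TIMEOUT: closing"   # fixed message sent by the forked VM
secret = b"password=hunter2"  # next message sent by the old VM

c_fork = encrypt(key, nonce, known)
c_old = encrypt(key, nonce, secret)

# The attacker never sees the key: XORing both ciphertexts cancels the
# keystream, and XORing in the known plaintext yields the other one.
recovered = xor(xor(c_fork, c_old), known)
assert recovered == secret[:len(known)]
```

The GCM authentication tag does not help here, because the attacker is
only reading traffic, not forging it.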