On Wed, Nov 03, 2021 at 12:04:11PM -0600, Alex Williamson wrote:

> We agreed that it's easier to add a feature than a restriction in a
> uAPI, so how do we resolve that some future device may require a new
> state in order to apply the SET_IRQS configuration?

I would say don't support those devices. If there is even a hint that
they could maybe exist then we should fix it now. Once the uapi is set
and documented we should expect device makers to consider it when
building their devices.

As for SET_IRQs, I have been looking at making documentation and I
don't like the way the documentation has to be written because of
this. What I see as understandable, clear documentation is:

 - SAVING set - no device touches allowed beyond migration operations
   and reset via XX
   Must be set with !RUNNING
 - RESUMING set - same as SAVING
 - RUNNING cleared - limited device touches in this list: SET_IRQs, XX
   config, XX.
   Device may assume no touches outside the above. (ie no MMIO)
   Implies NDMA
 - NDMA set - full device touches
   Device may not issue DMA or interrupts (??)
   Device may not dirty pages
 - RUNNING set - full functionality
 * In no state may a device generate an error TLP, device
   hang/integrity failure or kernel integrity failure, no matter
   what userspace does.
   The device is permitted to corrupt the migration/VM or SEGV
   userspace if userspace doesn't follow the rules.

(we are trying to figure out what the XX's are right now, would
appreciate any help)

This is something I think we could expect a HW engineering team to
follow and implement in devices. It doesn't complicate things.

Overall, at this moment, I would prioritize documentation clarity over
strict compatibility with qemu, because people have to follow this
documentation and make their devices long into the future. If the
documentation is convoluted for compatibility reasons HW people are
more likely to get it wrong. When HW people get it wrong they are more
likely to ask for "quirks" in the uAPI to fix their mistakes.
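
As a rough illustration only, the state list above could be encoded
along these lines. This is a sketch under assumptions, not the uAPI:
the MIG_* bit names and values, the touch_level enum and the three
helpers are all hypothetical, NDMA is only a proposed bit, and the
open "XX" items are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bits for illustration; not the values in <linux/vfio.h>. */
#define MIG_RUNNING  (1u << 0)
#define MIG_SAVING   (1u << 1)
#define MIG_RESUMING (1u << 2)
#define MIG_NDMA     (1u << 3) /* proposed bit, assumed here */

/* What userspace may touch on the device, per the list above. */
enum touch_level {
	TOUCH_MIGRATION_ONLY, /* SAVING/RESUMING: migration ops (and reset) */
	TOUCH_LIMITED,        /* !RUNNING: SET_IRQs etc, no MMIO */
	TOUCH_FULL,           /* RUNNING, with or without NDMA */
};

/* SAVING and RESUMING must be set with !RUNNING. */
static bool mig_state_valid(uint32_t s)
{
	if ((s & (MIG_SAVING | MIG_RESUMING)) && (s & MIG_RUNNING))
		return false;
	return true;
}

/* RUNNING cleared implies NDMA, so DMA needs RUNNING and !NDMA. */
static bool mig_may_dma(uint32_t s)
{
	return (s & MIG_RUNNING) && !(s & MIG_NDMA);
}

/* Most restrictive rule wins: SAVING/RESUMING, then !RUNNING. */
static enum touch_level mig_touch_level(uint32_t s)
{
	assert(mig_state_valid(s));

	if (s & (MIG_SAVING | MIG_RESUMING))
		return TOUCH_MIGRATION_ONLY;
	if (!(s & MIG_RUNNING))
		return TOUCH_LIMITED;
	return TOUCH_FULL;
}
```

Each rule here is a plain test on control bits, which is the kind of
description a HW team can check a device against.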

The pending_bytes P2P idea is also quite complicated to document as
now we have to describe a HW state not in terms of an NDMA control
bit, but in terms of a bunch of implicit operations in a protocol. Not
so nice.

So, here is what I propose. Let us work on some documentation and come
up with the sort of HW centric docs like above and we can then decide
if we want to make the qemu changes it will imply, or not. We'll
include the P2P stuff, as we see it, so it shows a whole picture.

I think that will help everyone participate fully in the discussion.

> If we're going to move forward with the existing uAPI, then we're going
> to need to start factoring compatibility into our discussions of
> missing states and protocols. For example, requiring that the device
> is "quiesced" when the _RUNNING bit is cleared and "frozen" when
> pending_bytes is read has certain compatibility advantages versus
> defining a new state bit.

Not entirely. To support P2P, going from RESUMING directly to RUNNING
is not possible. There must be an in between state that all devices
reach before they go to RUNNING. It seems P2P cannot be bolted into
the existing qemu flow with a kernel only change?

> clarifications were trying for within the existing uAPI rather than
> toss out new device states and protocols at every turn for the sake of
> API purity. The rate at which we're proposing new states and required
> transitions without a plan for the uAPI is not where I want to be for
> adding the driver that could lock us in to a supported uAPI. Thanks,

Well, to be fair, the other cases where I suggested new states were
when you asked about features we don't have at all today (like
post-copy). I think adding new states is a very reasonable way to
approach adding new features. As long as new features can be supported
with new states we have a forward compatibility story.

Thanks,
Jason