On Mon, 2024-10-07 at 09:47 +0100, David Woodhouse wrote:
> On Mon, 2024-10-07 at 08:39 +0000, Gowans, James wrote:
> >
> > I think we have two other possible approaches here:
> >
> > 1. What this RFC is sketching out, serialising fields from the structs
> > and setting those fields again on deserialise. As you point out this
> > will be complicated.
> >
> > 2. Get userspace to do the work: userspace needs to re-do the ioctls
> > after kexec to reconstruct the objects. My main issue with this approach
> > is that the kernel needs to do some sort of trust but verify approach to
> > ensure that userspace constructs everything the same way after kexec as
> > it was before kexec. We don't want to end up in a state where the
> > iommufd objects don't match the persisted page tables.
>
> To what extent does the kernel really need to trust or verify? At LPC
> we seemed to speak of a model where userspace builds a "new" address
> space for each device and then atomically switches to the new page
> tables instead of the original ones inherited from the previous kernel.
>
> That does involve having space for another set of page tables, of
> course, but that's not impossible.

The idea of constructing fresh page tables and then swapping over to
them is indeed appealing, but I don't know if that's always possible.
With the ARM SMMUv3 for example I think there is a break-before-make
requirement, so is it possible to do an atomic switch of the SMMUv3 page
table PGD in a hitless way? Everything here must be hitless - serialise
and deserialise must not cause any DMA faults.

If it's not possible to do a hitless atomic switch (I am unsure about
this, need to RTFM) then we're compelled to re-use the existing page
tables, and if that's the case I think the kernel MUST ensure that the
iommufd IOAS objects exactly match the ones before kexec. I can imagine
all sorts of mess if those go out of sync!
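
To make the "must exactly match" requirement a bit more concrete, here is
a rough standalone sketch of the kind of check I have in mind. None of
this is existing iommufd code: struct ioas_mapping and
ioas_mappings_match() are hypothetical names, and I'm assuming both the
persisted and the rebuilt mappings can be presented as arrays sorted by
IOVA.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of one IOAS mapping; not an iommufd type. */
struct ioas_mapping {
	uint64_t iova;   /* start of the IO virtual address range */
	uint64_t length; /* size of the range in bytes */
	uint64_t paddr;  /* physical address backing the range */
	uint32_t prot;   /* access permissions, encoding is an assumption */
};

/*
 * Return true only if userspace rebuilt exactly the mappings that were
 * persisted across kexec. Both arrays are assumed to be sorted by iova.
 */
static bool ioas_mappings_match(const struct ioas_mapping *persisted,
				size_t nr_persisted,
				const struct ioas_mapping *rebuilt,
				size_t nr_rebuilt)
{
	size_t i;

	/* A missing or extra mapping means the objects are out of sync. */
	if (nr_persisted != nr_rebuilt)
		return false;

	for (i = 0; i < nr_persisted; i++) {
		if (persisted[i].iova != rebuilt[i].iova ||
		    persisted[i].length != rebuilt[i].length ||
		    persisted[i].paddr != rebuilt[i].paddr ||
		    persisted[i].prot != rebuilt[i].prot)
			return false;
	}

	return true;
}
```

A real check would compare against whatever the persisted page tables
actually encode rather than a flat array, but the point is the same: any
mismatch has to be rejected before the restored objects are trusted.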