Hi Jacob, apologies for the late reply.

On Mon, 2024-10-28 at 09:03 -0700, Jacob Pan wrote:
> Hi James,
>
> Just a gentle reminder. Let me also explain the problem we are trying
> to solve for the live update of OpenHCL paravisor[1]. OpenHCL has user
> space drivers based on VFIO noiommu mode, we are in the process of
> converting to iommufd cdev.
>
> Similarly, running DMA continuously across updates is required, but
> unlike your case, OpenHCL updates do not involve preserving the IO page
> tables in that it is managed by the hypervisor which is not part of the
> update.
>
> It seems reasonable to share the device persistence code path
> with the plan laid out in your cover letter. IOAS code path will be
> different since noiommu option does not have IOAS.
>
> If we were to revive noiommu support for iommufd cdev[2], can we use
> the persistent iommufd context to allow device persistence? Perhaps
> through IOMMUFD_OBJ_DEVICE and IOMMUFD_OBJ_ACCESS(used in [2])?
>
> @David, @Jason, @Alex, @Yi, any comments or suggestions?

IIRC we did discuss this device persistence use case with some of your
colleagues at Linux Plumbers. Adding Jinank to this thread as he was
part of the conversation too.

Yes, I think the guidance was to bind a device to iommufd in noiommu
mode. It does seem a bit weird to use iommufd with noiommu, but we
agreed it's the best/simplest way to get the functionality. Then as you
suggest below the IOMMUFD_OBJ_DEVICE would be serialised too in some
way, probably by iommufd telling the PCI layer that this device must be
persistent and hence not to re-probe it on kexec. I think this would get
you what you want? Specifically you want to make sure that the device is
not reset during kexec so it can keep running? And some handle for
userspace to pick it up again and continue interacting with it after
kexec.

It's all a bit hand wavy at the moment, but something along those lines
probably makes sense.

I need to work on rev2 of this RFC as per Jason's feedback in the other
thread. Rev2 will make the restore path more userspace driven, with
fresh iommufd and pgtables objects being created and then atomically
swapped over too. I'll also get the PCI layer involved with rev2. Once
that's out (it'll be a few weeks as I'm on leave) then let's take a look
at how the noiommu device persistence case would fit in.

JG

>
>
> Thanks,
>
> Jacob
>
> 1. (openvmm/Guide/src/reference/architecture/openhcl.md at main ·
> microsoft/openvmm.
> 2. [PATCH v11 00/23] Add vfio_device cdev for
> iommufd support - Yi Liu
>
> On Wed, 16 Oct 2024 15:20:47 -0700 Jacob Pan
> <jacob.pan@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > Hi James,
> >
> > On Mon, 16 Sep 2024 13:30:54 +0200
> > James Gowans <jgowans@xxxxxxxxxx> wrote:
> >
> > > +static int serialise_iommufd(void *fdt, struct iommufd_ctx *ictx)
> > > +{
> > > +	int err = 0;
> > > +	char name[24];
> > > +	struct iommufd_object *obj;
> > > +	unsigned long obj_idx;
> > > +
> > > +	snprintf(name, sizeof(name), "%lu", ictx->persistent_id);
> > > +	err |= fdt_begin_node(fdt, name);
> > > +	err |= fdt_begin_node(fdt, "ioases");
> > > +	xa_for_each(&ictx->objects, obj_idx, obj) {
> > > +		struct iommufd_ioas *ioas;
> > > +		struct iopt_area *area;
> > > +		int area_idx = 0;
> > > +
> > > +		if (obj->type != IOMMUFD_OBJ_IOAS)
> > > +			continue;
> > I was wondering how device state persistency is managed here. Is it
> > correct to assume that all devices bound to an iommufd context should
> > be persistent? If so, should we be serializing IOMMUFD_OBJ_DEVICE as
> > well?
> >
> > I'm considering this from the perspective of user mode drivers,
> > including those that use noiommu mode (need to be added to iommufd
> > cdev). In this scenario, we only need to maintain the device states
> > persistently without IOAS.
>