On Thu, Apr 14, 2022 at 03:47:07AM -0700, Yi Liu wrote:

> +static int vfio_get_devicefd(const char *sysfs_path, Error **errp)
> +{
> +    long int vfio_id = -1, ret = -ENOTTY;
> +    char *path, *tmp = NULL;
> +    DIR *dir;
> +    struct dirent *dent;
> +    struct stat st;
> +    gchar *contents;
> +    gsize length;
> +    int major, minor;
> +    dev_t vfio_devt;
> +
> +    path = g_strdup_printf("%s/vfio-device", sysfs_path);
> +    if (stat(path, &st) < 0) {
> +        error_setg_errno(errp, errno, "no such host device");
> +        goto out;
> +    }
> +
> +    dir = opendir(path);
> +    if (!dir) {
> +        error_setg_errno(errp, errno, "couldn't open dirrectory %s", path);
> +        goto out;
> +    }
> +
> +    while ((dent = readdir(dir))) {
> +        const char *end_name;
> +
> +        if (!strncmp(dent->d_name, "vfio", 4)) {
> +            ret = qemu_strtol(dent->d_name + 4, &end_name, 10, &vfio_id);
> +            if (ret) {
> +                error_setg(errp, "suspicious vfio* file in %s", path);
> +                goto out;
> +            }

Userspace shouldn't explode if there are different files here down the
road. Just search for the first match of vfio\d+ and there is no need
to parse out the vfio_id from the string. Only fail if no match is
found.

> +    tmp = g_strdup_printf("/dev/vfio/devices/vfio%ld", vfio_id);
> +    if (stat(tmp, &st) < 0) {
> +        error_setg_errno(errp, errno, "no such vfio device");
> +        goto out;
> +    }

And simply pass the string directly here, no need to parse out
vfio_id.

I also suggest falling back to using "/dev/char/%u:%u" if the above
does not exist which prevents "vfio/devices/vfio" from turning into
ABI.

It would be a good idea to make a general open_cdev function that does
all this work once the sysfs is found and cdev read out of it, all the
other vfio places can use it too.
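
To make the suggestion above concrete, here is a rough sketch in plain
C, not QEMU code. It assumes the sysfs directory for each vfioN entry
carries a "dev" attribute in "major:minor" form. The helper names
is_vfio_name() and find_vfio_name() and the open_cdev() signature are
hypothetical, and a real version would use QEMU's Error and glib
helpers instead of bare errno returns.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return 1 if name matches vfio\d+, 0 otherwise. */
static int is_vfio_name(const char *name)
{
    const char *p;

    if (strncmp(name, "vfio", 4) != 0) {
        return 0;
    }
    p = name + 4;
    if (*p == '\0') {
        return 0;               /* "vfio" alone has no digits */
    }
    for (; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) {
            return 0;
        }
    }
    return 1;
}

/*
 * Copy the first vfio\d+ entry of dir_path into out. Unrelated entries
 * are skipped rather than treated as errors. Returns 0 on success or a
 * negative errno (-ENOENT when nothing matches).
 */
static int find_vfio_name(const char *dir_path, char *out, size_t outlen)
{
    struct dirent *dent;
    DIR *dir;
    int ret = -ENOENT;

    assert(out != NULL && outlen > 0);
    dir = opendir(dir_path);
    if (!dir) {
        return -errno;
    }
    while ((dent = readdir(dir)) != NULL) {
        if (is_vfio_name(dent->d_name) && strlen(dent->d_name) < outlen) {
            strcpy(out, dent->d_name);
            ret = 0;
            break;
        }
    }
    closedir(dir);
    return ret;
}

/*
 * Hypothetical open_cdev(): try the named node first, then fall back to
 * /dev/char/MAJOR:MINOR, reading the numbers from sysfs_dir/name/dev
 * (assumed layout). Returns an open fd or a negative errno.
 */
static int open_cdev(const char *sysfs_dir, const char *name)
{
    char path[4096];
    unsigned int maj, mnr;
    FILE *f;
    int fd, n;

    snprintf(path, sizeof(path), "/dev/vfio/devices/%s", name);
    fd = open(path, O_RDWR);
    if (fd >= 0 || errno != ENOENT) {
        return fd >= 0 ? fd : -errno;
    }

    snprintf(path, sizeof(path), "%s/%s/dev", sysfs_dir, name);
    f = fopen(path, "r");
    if (!f) {
        return -errno;
    }
    n = fscanf(f, "%u:%u", &maj, &mnr);
    fclose(f);
    if (n != 2) {
        return -EINVAL;
    }

    snprintf(path, sizeof(path), "/dev/char/%u:%u", maj, mnr);
    fd = open(path, O_RDWR);
    return fd >= 0 ? fd : -errno;
}
```

A caller would run find_vfio_name() on "<sysfs_path>/vfio-device" and
hand the resulting name straight to open_cdev(), so the numeric id is
never parsed and a stray file in that directory is simply ignored.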

> +static int iommufd_attach_device(VFIODevice *vbasedev, AddressSpace *as,
> +                                 Error **errp)
> +{
> +    VFIOContainer *bcontainer;
> +    VFIOIOMMUFDContainer *container;
> +    VFIOAddressSpace *space;
> +    struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
> +    int ret, devfd, iommufd;
> +    uint32_t ioas_id;
> +    Error *err = NULL;
> +
> +    devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp);
> +    if (devfd < 0) {
> +        return devfd;
> +    }
> +    vbasedev->fd = devfd;
> +
> +    space = vfio_get_address_space(as);
> +
> +    /* try to attach to an existing container in this space */
> +    QLIST_FOREACH(bcontainer, &space->containers, next) {
> +        if (!object_dynamic_cast(OBJECT(bcontainer),
> +                                 TYPE_VFIO_IOMMUFD_CONTAINER)) {
> +            continue;
> +        }
> +        container = container_of(bcontainer, VFIOIOMMUFDContainer, obj);
> +        if (vfio_device_attach_container(vbasedev, container, &err)) {
> +            const char *msg = error_get_pretty(err);
> +
> +            trace_vfio_iommufd_fail_attach_existing_container(msg);
> +            error_free(err);
> +            err = NULL;
> +        } else {
> +            ret = vfio_ram_block_discard_disable(true);
> +            if (ret) {
> +                vfio_device_detach_container(vbasedev, container, &err);
> +                error_propagate(errp, err);
> +                vfio_put_address_space(space);
> +                close(vbasedev->fd);
> +                error_prepend(errp,
> +                              "Cannot set discarding of RAM broken (%d)", ret);
> +                return ret;
> +            }
> +            goto out;
> +        }
> +    }

?? this logic shouldn't be necessary, a single ioas always supports
all devices, userspace should never need to juggle multiple ioas's
unless it wants to have different address maps.

Something I would like to see confirmed here in qemu is that qemu can
track the hw pagetable id for each device it binds because we will
need that later to do dirty tracking and other things.

> +    /*
> +     * TODO: for now iommufd BE is on par with vfio iommu type1, so it's
> +     * fine to add the whole range as window. For SPAPR, below code
> +     * should be updated.
> +     */
> +    vfio_host_win_add(bcontainer, 0, (hwaddr)-1, 4096);

?

Not sure what this is, but I don't expect any changes for SPAPR;
someday IOMMU_IOAS_IOVA_RANGES should be able to accurately report its
configuration.

I don't see IOMMU_IOAS_IOVA_RANGES called at all, that seems like a
problem. (And note that IOVA_RANGES changes with every device attached
to the IOAS.)

Jason