On Fri, Apr 12, 2024 at 01:15:06AM -0700, Yi Liu wrote: > - if (device == last_gdev) > + /* > + * Rollback the devices/pasid that have attached to the new > + * domain. And it is a driver bug to fail attaching with a > + * previously good domain. > + */ > + if (device == last_gdev) { > + WARN_ON(old->ops->set_dev_pasid(old, device->dev, > + pasid, NULL)); > break; > - ops->remove_dev_pasid(device->dev, pasid, domain); Suggest writing this as if (WARN_ON(old->ops->set_dev_pasid(old, device->dev, pasid, curr))) ops->remove_dev_pasid(device->dev, pasid, domain); As we may as well try to bring the system back to some kind of safe state before we continue on. Also NULL doesn't seem right, if we here then the new domain was attached successfully and we are put it back to old. > + mutex_lock(&group->mutex); > + curr = xa_store(&group->pasid_array, pasid, domain, GFP_KERNEL); > + if (!curr) { > + xa_erase(&group->pasid_array, pasid); A comment here about order is likely a good idea.. At this point the pasid_array and the translation are not matched. If we get a PRI at this instant it will deliver to the new domain until the translation is updated. There is nothing to do about this race, but lets note it and say the concurrent PRI path will eventually become consistent and there is no harm in directing PRI to the wrong domain. Let's also check that receiving a PRI on a domain that is not PRI capable doesn't explode in case someone uses replace to change from a PRI to non PRI domain. Jason