This is a note to let you know that I've just added the patch titled iommu: Attach device group to old domain in error path to the 6.1-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: iommu-attach-device-group-to-old-domain-in-error-pat.patch and it can be found in the queue-6.1 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit c2914e324f4a06a5b27074485d0b92df9601a9ea Author: Vasant Hegde <vasant.hegde@xxxxxxx> Date: Wed Feb 15 05:26:40 2023 +0000 iommu: Attach device group to old domain in error path [ Upstream commit 2cc73c5712f97de98c38c2fafc1f288354a9f3c3 ] iommu_attach_group() attaches all devices in a group to domain and then sets group domain (group->domain). Current code (__iommu_attach_group()) does not handle error path. This creates problem as devices to domain attachment is in inconsistent state. Flow: - During boot iommu attach devices to default domain - Later some device driver (like amd/iommu_v2 or vfio) tries to attach device to new domain. - In iommu_attach_group() path we detach device from current domain. Then it tries to attach devices to new domain. - If it fails to attach device to new domain then device to domain link is broken. - iommu_attach_group() returns error. - At this stage iommu_attach_group() caller thinks, attaching device to new domain failed and devices are still attached to old domain. - But in reality device to old domain link is broken. It will result in all sort of failures (like IO page fault) later. To recover from this situation, we need to attach all devices back to the old domain. Also log warning if it fails attach device back to old domain. Suggested-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> Reported-by: Matt Fagnani <matt.fagnani@xxxxxxxx> Signed-off-by: Vasant Hegde <vasant.hegde@xxxxxxx> Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx> Tested-by: Matt Fagnani <matt.fagnani@xxxxxxxx> Link: https://lore.kernel.org/r/20230215052642.6016-1-vasant.hegde@xxxxxxx Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865 Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@xxxxxxxxxxxxx/ Signed-off-by: Joerg Roedel <jroedel@xxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index fd8c8aeb3c504..bfb2f163c6914 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2089,8 +2089,22 @@ static int __iommu_attach_group(struct iommu_domain *domain, ret = __iommu_group_for_each_dev(group, domain, iommu_group_do_attach_device); - if (ret == 0) + if (ret == 0) { group->domain = domain; + } else { + /* + * To recover from the case when certain device within the + * group fails to attach to the new domain, we need force + * attaching all devices back to the old domain. The old + * domain is compatible for all devices in the group, + * hence the iommu driver should always return success. + */ + struct iommu_domain *old_domain = group->domain; + + group->domain = NULL; + WARN(__iommu_group_set_domain(group, old_domain), + "iommu driver failed to attach a compatible domain"); + } return ret; }