Re: [PATCH] drm/amdkfd: Cleanup IO links during KFD device removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 2022-04-11 um 21:14 schrieb Joshi, Mukul:
[AMD Official Use Only]



-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Monday, April 11, 2022 8:16 PM
To: Joshi, Mukul <Mukul.Joshi@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Shuotao Xu <shuotaoxu@xxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdkfd: Cleanup IO links during KFD device
removal

Am 2022-04-07 um 12:15 schrieb Mukul Joshi:
Currently, the IO-links to the device being removed from topology, are
not cleared. As a result, there would be dangling links left in the
KFD topology. This patch aims to fix the following:
1. Cleanup all IO links to the device being removed.
2. Ensure that node numbering in sysfs and nodes proximity domain
     values are consistent after the device is removed:
     a. Adding a device and removing a GPU device are made mutually
        exclusive.
     b. The global proximity domain counter is no longer required to be
        an atomic counter. A normal 32-bit counter can be used instead.
3. Update generation_count to let user-mode know that topology has
     changed due to device removal.

CC: Shuotao Xu <shuotaoxu@xxxxxxxxxxxxx>
Signed-off-by: Mukul Joshi <mukul.joshi@xxxxxxx>
Looks good to me. I have two nit-picks inline.


---
   drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  4 +-
   drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  2 +
   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 79
++++++++++++++++++++---
   3 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 1eaabd2cb41b..afc8a7fcdad8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1056,7 +1056,7 @@ static int kfd_parse_subtype_iolink(struct
crat_subtype_iolink *iolink,
   	 * table, add corresponded reversed direction link now.
   	 */
   	if (props && (iolink->flags & CRAT_IOLINK_FLAGS_BI_DIRECTIONAL))
{
-		to_dev =
kfd_topology_device_by_proximity_domain(id_to);
+		to_dev =
kfd_topology_device_by_proximity_domain_no_lock(id_to);
   		if (!to_dev)
   			return -ENODEV;
   		/* same everything but the other direction */ @@ -2225,7
+2225,7
@@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
   	 */
   	if (kdev->hive_id) {
   		for (nid = 0; nid < proximity_domain; ++nid) {
-			peer_dev =
kfd_topology_device_by_proximity_domain(nid);
+			peer_dev =
kfd_topology_device_by_proximity_domain_no_lock(nid);
   			if (!peer_dev->gpu)
   				continue;
   			if (peer_dev->gpu->hive_id != kdev->hive_id) diff --
git
a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index e1b7e6afa920..8a43def1f638 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1016,6 +1016,8 @@ int kfd_topology_add_device(struct kfd_dev
*gpu);
   int kfd_topology_remove_device(struct kfd_dev *gpu);
   struct kfd_topology_device
*kfd_topology_device_by_proximity_domain(
   						uint32_t proximity_domain);
+struct kfd_topology_device
*kfd_topology_device_by_proximity_domain_no_lock(
+						uint32_t proximity_domain);
   struct kfd_topology_device *kfd_topology_device_by_id(uint32_t
gpu_id);
   struct kfd_dev *kfd_device_by_id(uint32_t gpu_id);
   struct kfd_dev *kfd_device_by_pci_dev(const struct pci_dev *pdev);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 3bdcae239bc0..874a273b81f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -46,27 +46,38 @@ static struct list_head topology_device_list;
   static struct kfd_system_properties sys_props;

   static DECLARE_RWSEM(topology_lock); -static atomic_t
topology_crat_proximity_domain;
+static uint32_t topology_crat_proximity_domain;

-struct kfd_topology_device
*kfd_topology_device_by_proximity_domain(
+struct kfd_topology_device
+*kfd_topology_device_by_proximity_domain_no_lock(
   						uint32_t proximity_domain)
I remember we discussed this and I suggested splitting a no_lock version out
of this function. But now I don't see it being used anywhere. Was that lost
somewhere in refactoring or porting to the upstream branch?
Maybe the no_lock version isn't needed any more.

Its used in the changes in kfd_crat.c (in kfd_create_vcrat_image_gpu() and
kfd_parse_subtype_iolink ()) and  below in kfd_topology_device_by_proximity_domain().

You're right, I missed the changes in kfd_crat.c. And they are needed because the whole CRAT table parsing is now under the topology lock. Thanks for the reminder.

Regards,
  Felix



   {
   	struct kfd_topology_device *top_dev;
   	struct kfd_topology_device *device = NULL;

-	down_read(&topology_lock);
-
   	list_for_each_entry(top_dev, &topology_device_list, list)
   		if (top_dev->proximity_domain == proximity_domain) {
   			device = top_dev;
   			break;
   		}

+	return device;
+}
+
+struct kfd_topology_device
*kfd_topology_device_by_proximity_domain(
+						uint32_t proximity_domain)
+{
+	struct kfd_topology_device *device = NULL;
+
+	down_read(&topology_lock);
+
+	device = kfd_topology_device_by_proximity_domain_no_lock(
+							proximity_domain);
   	up_read(&topology_lock);

   	return device;
   }

+
   struct kfd_topology_device *kfd_topology_device_by_id(uint32_t
gpu_id)
   {
   	struct kfd_topology_device *top_dev = NULL; @@ -1060,7 +1071,7
@@
int kfd_topology_init(void)
   	down_write(&topology_lock);
   	kfd_topology_update_device_list(&temp_topology_device_list,
   					&topology_device_list);
-	atomic_set(&topology_crat_proximity_domain,
sys_props.num_devices-1);
+	topology_crat_proximity_domain = sys_props.num_devices-1;
   	ret = kfd_topology_update_sysfs();
   	up_write(&topology_lock);

@@ -1295,8 +1306,6 @@ int kfd_topology_add_device(struct kfd_dev
*gpu)
   	pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);

-	proximity_domain =
atomic_inc_return(&topology_crat_proximity_domain);
-
   	/* Include the CPU in xGMI hive if xGMI connected by assigning it the
hive ID. */
   	if (gpu->hive_id && gpu->adev->gmc.xgmi.connected_to_cpu) {
   		struct kfd_topology_device *top_dev; @@ -1321,12
+1330,16 @@ int
kfd_topology_add_device(struct kfd_dev *gpu)
   	 */
   	dev = kfd_assign_gpu(gpu);
   	if (!dev) {
+		down_write(&topology_lock);
+		proximity_domain = ++topology_crat_proximity_domain;
+
   		res = kfd_create_crat_image_virtual(&crat_image,
&image_size,
   						    COMPUTE_UNIT_GPU,
gpu,
   						    proximity_domain);
   		if (res) {
   			pr_err("Error creating VCRAT for GPU (ID: 0x%x)\n",
   			       gpu_id);
+			topology_crat_proximity_domain--;
   			return res;
   		}
   		res = kfd_parse_crat_table(crat_image, @@ -1335,10
+1348,10 @@ int
kfd_topology_add_device(struct kfd_dev *gpu)
   		if (res) {
   			pr_err("Error parsing VCRAT for GPU (ID: 0x%x)\n",
   			       gpu_id);
+			topology_crat_proximity_domain--;
   			goto err;
   		}

-		down_write(&topology_lock);

	kfd_topology_update_device_list(&temp_topology_device_list,
   			&topology_device_list);

@@ -1485,25 +1498,73 @@ int kfd_topology_add_device(struct kfd_dev
*gpu)
   	return res;
   }

+static void kfd_topology_update_io_links(int proximity_domain) {
+	struct kfd_topology_device *dev;
+	struct kfd_iolink_properties *iolink, *p2plink, *tmp;
+	/*
+	 * The topology list currently is arranged in increasing
+	 * order of proximity domain.
+	 *
+	 * Two things need to be done when a device is removed:
+	 * 1. All the IO links to this device need to be
+	 *    removed.
+	 * 2. All nodes after the current device node need to move
+	 *    up once this device node is removed from the topology
+	 *    list. As a result, the proximity domain values for
+	 *    all nodes after the node being deleted reduce by 1.
+	 *    This would also cause the proximity domain values for
+	 *    io links to be updated based on new proximity
+	 *    domain values.
+	 */
I'd put this comment in front of the function, not in the middle of it.
You can make it a proper kernel-doc comment, especially since the function
name is a bit generic (and I can't think of a better one that isn't excessively
long).

Sure will do that and send a v2.

Regards,
Mukul

Regards,
    Felix


+	list_for_each_entry(dev, &topology_device_list, list) {
+		if (dev->proximity_domain > proximity_domain)
+			dev->proximity_domain--;
+
+		list_for_each_entry_safe(iolink, tmp, &dev->io_link_props,
list) {
+			/*
+			 * If there is an io link to the dev being deleted
+			 * then remove that IO link also.
+			 */
+			if (iolink->node_to == proximity_domain) {
+				list_del(&iolink->list);
+				dev->io_link_count--;
+				dev->node_props.io_links_count--;
+			} else if (iolink->node_from > proximity_domain) {
+				iolink->node_from--;
+			} else if (iolink->node_to > proximity_domain) {
+				iolink->node_to--;
+			}
+		}
+
+	}
+}
+
   int kfd_topology_remove_device(struct kfd_dev *gpu)
   {
   	struct kfd_topology_device *dev, *tmp;
   	uint32_t gpu_id;
   	int res = -ENODEV;
+	int i = 0;

   	down_write(&topology_lock);

-	list_for_each_entry_safe(dev, tmp, &topology_device_list, list)
+	list_for_each_entry_safe(dev, tmp, &topology_device_list, list) {
   		if (dev->gpu == gpu) {
   			gpu_id = dev->gpu_id;
   			kfd_remove_sysfs_node_entry(dev);
   			kfd_release_topology_device(dev);
   			sys_props.num_devices--;
+			kfd_topology_update_io_links(i);
+			topology_crat_proximity_domain =
sys_props.num_devices-1;
+			sys_props.generation_count++;
   			res = 0;
   			if (kfd_topology_update_sysfs() < 0)
   				kfd_topology_release_sysfs();
   			break;
   		}
+		i++;
+	}

   	up_write(&topology_lock);




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux