Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm

On 11/25/2022 8:05 AM, Wang, Yang(Kevin) wrote:

-----Original Message-----
From: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>
Sent: Thursday, November 24, 2022 10:57 PM
To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm



On 11/24/2022 1:17 PM, Christian König wrote:
On 24.11.22 08:45, Wang, Yang(Kevin) wrote:

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@xxxxxxx>
Sent: Thursday, November 24, 2022 3:25 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>;
amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Paneer Selvam, Arunpravin
<Arunpravin.PaneerSelvam@xxxxxxx>
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information
into amdgpu_vram_mm

On 24.11.22 06:49, Yang Wang wrote:
add vram usage information to the dri debugfs amdgpu_vram_mm node.

Background:
since the amdgpu driver introduced the drm buddy allocator, it is
difficult for the kernel driver (and developers) to track vram usage.

Field:
0xaaaaaaaa-0xbbbbbbbb: used vram range.
type: kernel, device, sg
usage: normal, vm, user
domain: C-CPU, G-GTT, V-VRAM, P-PRIV
@xxxxx: the address of the "amdgpu_bo" object in kernel space.
4096: size of the vram range in bytes.

Example:
0x00000003fea68000-0x00000003fea68fff: (type:kernel usage:vm
domain:--V- --V-) @000000001d33dfee 4096 bytes
0x00000003fea69000-0x00000003fea69fff: (type:kernel usage:vm
domain:--V- --V-) @00000000a79155b5 4096 bytes
0x00000003fea6b000-0x00000003fea6bfff: (type:kernel usage:vm
domain:--V- --V-) @0000000038ad633b 4096 bytes
0x00000003fea6c000-0x00000003fea6cfff: (type:device usage:user
domain:--V- --V-) @00000000e302f90b 4096 bytes
0x00000003fea6d000-0x00000003fea6dfff: (type:device usage:user
domain:--V- --V-) @00000000e664c172 4096 bytes
0x00000003fea6e000-0x00000003fea6efff: (type:kernel usage:vm
domain:--V- --V-) @000000004528cb2f 4096 bytes
0x00000003fea6f000-0x00000003fea6ffff: (type:kernel usage:vm
domain:--V- --V-) @00000000a446bdbf 4096 bytes
0x00000003fea70000-0x00000003fea7ffff: (type:device usage:user
domain:--V- --V-) @0000000078fae42f 65536 bytes
0x00000003fead8000-0x00000003feadbfff: (type:kernel usage:normal
domain:--V- --V-) @000000001327b7ff 16384 bytes
0x00000003feadc000-0x00000003feadcfff: (type:kernel usage:normal
domain:--V- --V-) @000000001327b7ff 4096 bytes
0x00000003feadd000-0x00000003feaddfff: (type:kernel usage:normal
domain:--V- --V-) @00000000b9706fc1 4096 bytes
0x00000003feade000-0x00000003feadefff: (type:kernel usage:vm
domain:--V- --V-) @0000000071a25571 4096 bytes

Note:
although some vram ranges could be merged in the example above, keeping
them separate reflects the actual distribution of the drm buddy
allocator.
Well, completely NAK. This is way too much complexity for simple
debugging.

The question is: what are your requirements here? E.g. what information
do you want, and why doesn't the buddy allocator already expose this?

Regards,
Christian.

[Kevin]:

This is for KMD debugging.
The drm buddy interface doesn't provide a way to query which ranges of
vram (resources) are used, so it is not easy to debug on the KMD side
when the driver fails to create a BO at a specific location.
Also, from the KMD's point of view, the vram at some locations has
special purposes. With this patch, we can know which ranges of vram are
actually used.
Well, that's not a good reason to add this complexity. Debugging
doesn't justify it.

Please work with Arun to add the necessary information to the buddy
allocator interface.

Regards,
Christian.

Hi Kevin,

I will check and list some of the necessary information that we can add to the buddy allocator interface.

Regards,
Arun.

[kevin]:

Thanks, but some of this information is AMDGPU specific, so I hope we
can add some flexible interfaces to accommodate different drivers'
extensions.
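
Purely as an illustration of the kind of hook this could be (a
hypothetical sketch; drm_buddy_debug_ops and show_block are made-up
names, not an existing drm_buddy interface):

    /*
     * Hypothetical extension point -- not an existing drm_buddy API.
     * While dumping allocator state, the buddy code would invoke a
     * driver-supplied show_block() for each allocated block, so a
     * driver such as amdgpu could append its own per-block data
     * (e.g. BO type, usage and domain).
     */
    struct drm_buddy_debug_ops {
            void (*show_block)(struct drm_buddy *mm,
                               struct drm_buddy_block *block,
                               struct drm_printer *p,
                               void *driver_priv);
    };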
We have a debug interface in the drm buddy module: the drm_buddy_print()
function prints the available memory in each order at any instance, and
the drm_buddy_block_print() function prints the address range and size of
each block. Please check whether these solve the purpose.
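
A minimal sketch of chaining those two helpers from a driver's debug
path (assuming the amdgpu_vram_mgr fields used elsewhere in this patch:
mgr->mm, mgr->lock and mgr->reserved_pages):

    /*
     * Dump the free memory per order, then the start address and size
     * of each reserved block, using the existing drm_buddy helpers.
     */
    static void vram_mgr_dump(struct amdgpu_vram_mgr *mgr,
                              struct drm_printer *p)
    {
            struct drm_buddy *mm = &mgr->mm;
            struct drm_buddy_block *block;

            mutex_lock(&mgr->lock);
            drm_buddy_print(mm, p);
            list_for_each_entry(block, &mgr->reserved_pages, link)
                    drm_buddy_block_print(mm, block, p);
            mutex_unlock(&mgr->lock);
    }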

Regards,
Arun

Best Regards,
Kevin

Signed-off-by: Yang Wang <KevinYang.Wang@xxxxxxx>
---
    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |   6 +-
    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h   |   3 +
    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 ++++++++++++++++++-
    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h |   1 +
    4 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 90eb07106609..117c754409b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -53,7 +53,7 @@
     *
     */

-static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
+void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
    {
        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);

@@ -66,7 +66,7 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
        kvfree(bo);
    }

-static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
+void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
    {
        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
        struct amdgpu_bo_user *ubo;
@@ -76,7 +76,7 @@ static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
        amdgpu_bo_destroy(tbo);
    }

-static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo)
+void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo)
    {
        struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 147b79c10cbb..3f6a687309a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -332,6 +332,9 @@ int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
                             struct dma_fence **fence);
    uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
                                            uint32_t domain);
+void amdgpu_bo_destroy(struct ttm_buffer_object *tbo);
+void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo);
+void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo);

    /*
     * sub allocation
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 27159f1d112e..165f4f1a8141 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -23,9 +23,11 @@
     */

    #include <linux/dma-mapping.h>
+#include <linux/interval_tree_generic.h>
    #include <drm/ttm/ttm_range_manager.h>

    #include "amdgpu.h"
+#include "amdgpu_object.h"
    #include "amdgpu_vm.h"
    #include "amdgpu_res_cursor.h"
    #include "amdgpu_atomfirmware.h"
@@ -38,6 +40,122 @@ struct amdgpu_vram_reservation {
        struct list_head blocks;
    };

+struct amdgpu_vram_node {
+     struct rb_node node;
+     u64 start;
+     u64 last;
+     u64 __subtree_last;
+     struct ttm_buffer_object *tbo;
+};
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct amdgpu_vram_node, node, u64, __subtree_last,
+                     START, LAST, static, amdgpu_vram_it)
+
+#undef START
+#undef LAST
+
+#define for_each_vram_mm_node(node, mgr) \
+     for (node = amdgpu_vram_it_iter_first(&(mgr)->root, 0, U64_MAX); node; \
+          node = amdgpu_vram_it_iter_next(node, 0, U64_MAX))
+
+static void amdgpu_vram_mm_add_block(struct drm_buddy_block *block,
+                                     struct amdgpu_vram_mgr *mgr,
+                                     struct ttm_buffer_object *tbo)
+{
+     struct amdgpu_vram_node *node;
+
+     node = kvzalloc(sizeof(*node), GFP_KERNEL);
+     if (!node)
+             return;
+
+     node->start = amdgpu_vram_mgr_block_start(block);
+     node->last = node->start + amdgpu_vram_mgr_block_size(block) - 1;
+     node->tbo = tbo;
+
+     amdgpu_vram_it_insert(node, &mgr->root);
+}
+
+static void amdgpu_vram_mm_remove_block(struct drm_buddy_block *block,
+                                        struct amdgpu_vram_mgr *mgr)
+{
+     struct amdgpu_vram_node *node;
+     u64 start, last;
+
+     start = amdgpu_vram_mgr_block_start(block);
+     last = start + amdgpu_vram_mgr_block_size(block) - 1;
+
+     node = amdgpu_vram_it_iter_first(&mgr->root, start, last);
+     if (node) {
+             amdgpu_vram_it_remove(node, &mgr->root);
+             kvfree(node);
+     }
+}
+
+static inline const char *ttm_bo_type2str(enum ttm_bo_type type)
+{
+     switch (type) {
+     case ttm_bo_type_kernel:
+             return "kernel";
+     case ttm_bo_type_device:
+             return "device";
+     case ttm_bo_type_sg:
+             return "sg";
+     default:
+             return "unknow";
+     }
+}
+
+static inline const char *amdgpu_vram_domain_str(u32 domain, char *tmp)
+{
+     int index = 0;
+
+     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_CPU ? 'C' : '-';
+     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_GTT ? 'G' : '-';
+     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_VRAM ? 'V' : '-';
+     tmp[index++] = domain & (AMDGPU_GEM_DOMAIN_GDS |
AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA) ?
+             'P' : '-';
+     tmp[index++] = '\0';
+
+     return tmp;
+}
+
+static inline const char *amdgpu_vram_bo_usage(struct ttm_buffer_object *tbo)
+{
+     if (tbo->destroy == &amdgpu_bo_destroy)
+             return "normal";
+     else if (tbo->destroy == &amdgpu_bo_user_destroy)
+             return "user";
+     else if (tbo->destroy == &amdgpu_bo_vm_destroy)
+             return "vm";
+     else
+             return "unknow";
+}
+
+static void amdgpu_vram_mm_debug(struct amdgpu_vram_mgr *mgr,
+                                 struct drm_printer *p)
+{
+     struct amdgpu_vram_node *node;
+     struct ttm_buffer_object *tbo;
+     struct amdgpu_bo *abo;
+     char tmp[2][5];
+
+     for_each_vram_mm_node(node, mgr) {
+             tbo = node->tbo;
+             abo = ttm_to_amdgpu_bo(tbo);
+             drm_printf(p, "%#018llx-%#018llx:", node->start,
node->last);
+             if (abo)
+                     drm_printf(p, " (type:%-5s usage:%-8s
domain:%s %s) @%p",
+ ttm_bo_type2str(tbo->type),
+                                amdgpu_vram_bo_usage(tbo),
+amdgpu_vram_domain_str(abo->preferred_domains, tmp),
+amdgpu_vram_domain_str(abo->allowed_domains, tmp),
+                                abo);
+             else
+                     drm_printf(p, " (reserved)");
+             drm_printf(p, " %llu bytes\n",
+                        node->last - node->start + 1);
+     }
+}
+
    static inline struct amdgpu_vram_mgr *
    to_vram_mgr(struct ttm_resource_manager *man)
    {
@@ -288,6 +406,7 @@ static void amdgpu_vram_mgr_do_reserve(struct ttm_resource_manager *man)
                dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
                        rsv->start, rsv->size);

+             amdgpu_vram_mm_add_block(block, mgr, NULL);
                vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
                atomic64_add(vis_usage, &mgr->vis_usage);
                spin_lock(&man->bdev->lru_lock); @@ -540,6 +659,8 @@
static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
                vres->base.start = max(vres->base.start, start);

                vis_usage += amdgpu_vram_mgr_vis_size(adev, block);
+
+             amdgpu_vram_mm_add_block(block, mgr, tbo);
        }

        if (amdgpu_is_vram_mgr_blocks_contiguous(&vres->blocks))
@@ -583,8 +704,10 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man,
        uint64_t vis_usage = 0;

        mutex_lock(&mgr->lock);
-     list_for_each_entry(block, &vres->blocks, link)
+     list_for_each_entry(block, &vres->blocks, link) {
                vis_usage += amdgpu_vram_mgr_vis_size(adev, block);
+             amdgpu_vram_mm_remove_block(block, mgr);
+     }

        amdgpu_vram_mgr_do_reserve(man);

@@ -747,6 +870,9 @@ static void amdgpu_vram_mgr_debug(struct ttm_resource_manager *man,
        drm_printf(printer, "reserved:\n");
        list_for_each_entry(block, &mgr->reserved_pages, link)
                drm_buddy_block_print(mm, block, printer);
+     drm_printf(printer, "vram usage:\n");
+     amdgpu_vram_mm_debug(mgr, printer);
+
        mutex_unlock(&mgr->lock);
    }

@@ -769,6 +895,8 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
        struct ttm_resource_manager *man = &mgr->manager;
        int err;

+     mgr->root = RB_ROOT_CACHED;
+
        ttm_resource_manager_init(man, &adev->mman.bdev,
                                  adev->gmc.real_vram_size);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
index 0e04e42cf809..a14c56e1e407 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -28,6 +28,7 @@

    struct amdgpu_vram_mgr {
        struct ttm_resource_manager manager;
+     struct rb_root_cached root;
        struct drm_buddy mm;
        /* protects access to buffer objects */
        struct mutex lock;



