On 2020-10-28 15:09, Sierra Guiza, Alejandro (Alex) wrote:
> [AMD Public Use]
>
> Please ignore this patch, it should be in a different branch. As PCIe p2p is not supported in upstream.

No problem, but if you do add it elsewhere, please use something more specific,
like amdgpu_xgmi_p2p as the (positive-controlled) flag, since more generic flags
could be added later, to control a more encompassing behaviour.

Regards,
Luben

>
> Regards,
> Alex Sierra
>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
>> Sierra Guiza, Alejandro (Alex)
>> Sent: Wednesday, October 28, 2020 1:09 PM
>> To: Koenig, Christian <Christian.Koenig@xxxxxxx>;
>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>> Subject: Re: [PATCH] drm/amdgpu: Add kernel parameter to force no xgmi
>>
>>
>> On 10/28/2020 9:58 AM, Christian König wrote:
>>> On 28.10.20 at 15:55, Alex Sierra wrote:
>>>> By enabling this parameter, the system will be forced to use pcie
>>>> interface only for p2p transactions.
>>>
>>> Better name that amdgpu_xgmi with a default value of enabled.
>>>
>>> Or maybe add another bit value for amdgpu_vm_debug instead.
>>
>> Ack
>>
>> Regards,
>> Alex Sierra
>>
>>>
>>>
>>>>
>>>> Signed-off-by: Alex Sierra <alex.sierra@xxxxxxx>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 1 +
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 9 +++++++++
>>>>  3 files changed, 11 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index ba65d4f2ab67..3645f00e9f61 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -188,6 +188,7 @@ extern int amdgpu_discovery;
>>>>  extern int amdgpu_mes;
>>>>  extern int amdgpu_noretry;
>>>>  extern int amdgpu_force_asic_type;
>>>> +extern int amdgpu_force_no_xgmi;
>>>>  #ifdef CONFIG_HSA_AMD
>>>>  extern int sched_policy;
>>>>  extern bool debug_evictions;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 1fe850e0a94d..0a5d97a84017 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -2257,7 +2257,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>>>>      if (r)
>>>>          goto init_failed;
>>>> -    if (adev->gmc.xgmi.num_physical_nodes > 1)
>>>> +    if (!amdgpu_force_no_xgmi && adev->gmc.xgmi.num_physical_nodes > 1)
>>>
>>> Mhm, this will most likely cause problems. You still need to add the
>>> device to the hive because otherwise GPU won't work.
>>
>> What kind of problems? So far, I have validated this using a system with
>> multiple devices and running ./rocm_bandwidth_test -t. With and without
>> the parameter set.
>>
>> Regards,
>> Alex Sierra
>>
>>>
>>> Apart from that sounds like a good idea in general.
>>>
>>> Christian.
>>>
>>>>          amdgpu_xgmi_add_device(adev);
>>>>      amdgpu_amdkfd_device_init(adev);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index 4b78ecfd35f7..22485067cf31 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -160,6 +160,7 @@ int amdgpu_force_asic_type = -1;
>>>>  int amdgpu_tmz = 0;
>>>>  int amdgpu_reset_method = -1; /* auto */
>>>>  int amdgpu_num_kcq = -1;
>>>> +int amdgpu_force_no_xgmi = 0;
>>>>  struct amdgpu_mgpu_info mgpu_info = {
>>>>      .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
>>>> @@ -522,6 +523,14 @@ module_param_named(ras_enable, amdgpu_ras_enable, int, 0444);
>>>>  MODULE_PARM_DESC(ras_mask, "Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1");
>>>>  module_param_named(ras_mask, amdgpu_ras_mask, uint, 0444);
>>>> +/**
>>>> + * DOC: force_no_xgmi (uint)
>>>> + * Forces not to use xgmi interface (0 = disable, 1 = enable).
>>>> + * Default is 0 (disabled).
>>>> + */
>>>> +MODULE_PARM_DESC(force_no_xgmi, "Force not to use xgmi interface");
>>>> +module_param_named(force_no_xgmi, amdgpu_force_no_xgmi, int, 0600);
>>>> +
>>>>  /**
>>>>   * DOC: si_support (int)
>>>>   * Set SI support driver. This parameter works after set config CONFIG_DRM_AMDGPU_SI. For SI asic, when radeon driver is enabled,
>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx