Re: [PATCH v3 2/2] drm/amd/display: move remaining FPU code to dml folder

Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx> · Thu, 27 Oct 2022 11:38:48 -0400

Hi Ao,

Could you share a link that describe your workstation?

Thanks

On 10/26/22 17:17, Ao Zhong wrote:
Hi Rodrigo,

Thanks for your review! This is my first time submitting a patch to the kernel.

I'm not very good at using these tools yet. 😂

Recently I got a Huawei Qingyun W510 (擎云 W510) ARM workstation

from the second-hand market in China. It's SBSA and has a Kunpeng 920 (3211k) SoC

with 24 Huawei-customized TSV110 cores. Since it's SFF form factor, and my machine

supports PCIe 4.0 (looks like some W510 have it disabled), I installed an RX 6400 on it

as my daily drive machine. It has decent performance. I uploaded a benchmark result on Geekbench.

Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbrowser.geekbench.com%2Fv5%2Fcpu%2F18237269&amp;data=05%7C01%7CRodrigo.Siqueira%40amd.com%7Cdaa18df14f004d2d621d08dab7977866%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638024158436988558%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=Iyq4tWJL%2FfXuKB9xAUaVTQQmJQ0GRZ2rH%2F%2BXPTT%2F2tc%3D&amp;reserved=0

Ao

Am 26.10.22 um 18:12 schrieb Rodrigo Siqueira:


On 10/26/22 07:13, Ao Zhong wrote:
pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
these two operations in dcn32/dcn32_resource.c still need to use FPU,
This will cause compilation to fail on ARM64 platforms because
-mgeneral-regs-only is enabled by default to disable the hardware FPU.
Therefore, imitate the dcn31_zero_pipe_dcc_fraction function in
dml/dcn31/dcn31_fpu.c, declare the dcn32_zero_pipe_dcc_fraction function
in dcn32_fpu.c, and move above two operations into this function.

Acked-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Ao Zhong <hacc1225@xxxxxxxxx>
---
   drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c | 5 +++--
   drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 8 ++++++++
   drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  | 3 +++
   3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index a88dd7b3d1c1..287b7fa9bf41 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -1918,8 +1918,9 @@ int dcn32_populate_dml_pipes_from_context(
           timing = &pipe->stream->timing;
             pipes[pipe_cnt].pipe.src.gpuvm = true;
-        pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
-        pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
+        DC_FP_START();
+        dcn32_zero_pipe_dcc_fraction(pipes, pipe_cnt);
+        DC_FP_END();
           pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
           pipes[pipe_cnt].pipe.src.gpuvm_min_page_size_kbytes = 256; // according to spreadsheet
           pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 819de0f11012..58772fce6437 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -2521,3 +2521,11 @@ void dcn32_update_bw_bounding_box_fpu(struct dc *dc, struct clk_bw_params *bw_pa
       }
   }
   +void dcn32_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+                  int pipe_cnt)
+{
+    dc_assert_fp_enabled();
+
+    pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
+    pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
index 3a3dc2ce4c73..ab010e7e840b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
@@ -73,4 +73,7 @@ int dcn32_find_dummy_latency_index_for_fw_based_mclk_switch(struct dc *dc,
     void dcn32_patch_dpm_table(struct clk_bw_params *bw_params);
   +void dcn32_zero_pipe_dcc_fraction(display_e2e_pipe_params_st *pipes,
+                  int pipe_cnt);
+
   #endif

Hi Ao,

First of all, thanks a lot for your patchset.

For both patches:

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx>

And I also applied them to amd-staging-drm-next.

Btw, if you are using git-send-email for sending patches, I recommend the following options:

git send-email --annotate --cover-letter --thread --no-chain-reply-to --to="EMAILS" --cc="mailing@xxxxxxxx" <SHA>

Always add a cover letter, it makes it easier to follow the patchset, and you can also describe each change in the cover letter.

When you send that other patch enabling ARM64, please add as many details as possible in the cover letter. Keep in mind that we have been working for isolating those FPU codes in a way that we do not regress any of our ASICs, which means that every change was well-tested on multiple devices. Anyway, maybe you can refer to this cover letter to write down the commit message:

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fseries%2F93042%2F&amp;data=05%7C01%7CRodrigo.Siqueira%40amd.com%7Cdaa18df14f004d2d621d08dab7977866%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638024158436988558%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=0GMN1Uj9iuQv2ZjipDHnl29V0UvWk6IL4XwlehdPNLA%3D&amp;reserved=0

Finally, do you have a use case for this change? I mean, ARM64 + AMD dGPU.

Thanks again!
Siqueira