On 03/30, VURDIGERENATARAJ, CHANDAN wrote: > Hi Paul, > > >Am 29.03.22 um 10:29 schrieb CHANDAN VURDIGERE NATARAJ: > > > >Is it common to spell your name all uppercase? If not, please use Chandan nVurdigere Nataraj. > > > >> [WHY] > > > >The [] already emphasize the word, so Why could be used. > > > >> Below general protection fault observed when WebGL Aquarium is run for > >> longer duration. If drm debug logs are enabled and set to 0x1f then > >> the > > > >In what browser and what version? > The issue was observed on ChromiumOS and Chromium Browser version 100.0.4877.0 > > > >> issue is observed within 10 minutes of run. > > > >Where you able to reproduce it without drm debug logs? > Yes. It took 34 hours to reproduce without drm debug logs. Using drm debug logs was a faster way to reproduce the issue. > > > >> [ 100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI > >> [ 100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G W 5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b > >> [ 100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f > >> [ 100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 > >> f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 > >> 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff > >> 48 8b [ 100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246 [ > >> 100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: > >> 0000000000000000 [ 100.819656] RDX: 0000000000000001 RSI: > >> ffff99deb712c49c RDI: 0000000000000000 [ 100.826781] RBP: > >> ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c [ > >> 100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: > >> ffff99deb712ae44 [ 100.841033] R13: 0000000000000001 R14: > >> 0000000000000000 R15: ffffa9230079f0a0 [ 100.848159] FS: 00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000 [ 100.856240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 100.861980] CR2: 0000209000fe1000 CR3: >000000011b18c000 CR4: 0000000000350ee0 [ 100.869106] Call Trace: > >> [ 100.871555] <TASK> > >> [ 100.873655] ? asm_sysvec_reschedule_ipi+0x12/0x20 > >> [ 100.878449] CalculateSwathAndDETConfiguration+0x1a3/0x6dd > >> [ 100.883937] > >> dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da > >> [ 100.890467] ? kallsyms_lookup_buildid+0xc8/0x163 > >> [ 100.895173] ? kallsyms_lookup_buildid+0xc8/0x163 > >> [ 100.899874] ? __sprint_symbol+0x80/0x135 [ 100.903883] ? > >> dm_update_plane_state+0x3f9/0x4d2 [ 100.908500] ? > >> symbol_string+0xb7/0xde [ 100.912250] ? number+0x145/0x29b [ > >> 100.915566] ? vsnprintf+0x341/0x5ff [ 100.919141] ? > >> desc_read_finalized_seq+0x39/0x87 [ 100.923755] ? > >> update_load_avg+0x1b9/0x607 [ 100.927849] ? > >> compute_mst_dsc_configs_for_state+0x7d/0xd5b > >> [ 100.933416] ? fetch_pipe_params+0xa4d/0xd0c [ 100.937686] ? > >> dc_fpu_end+0x3d/0xa8 [ 100.941175] dml_get_voltage_level+0x16b/0x180 > >> [ 100.945619] dcn30_internal_validate_bw+0x10e/0x89b > >> [ 100.950495] ? dcn31_validate_bandwidth+0x68/0x1fc > >> [ 100.955285] ? resource_build_scaling_params+0x98b/0xb8c > >> [ 100.960595] ? dcn31_validate_bandwidth+0x68/0x1fc > >> [ 100.965384] dcn31_validate_bandwidth+0x9a/0x1fc > >> [ 100.970001] dc_validate_global_state+0x238/0x295 > >> [ 100.974703] amdgpu_dm_atomic_check+0x9c1/0xbce > >> [ 100.979235] ? _printk+0x59/0x73 > >> [ 100.982467] drm_atomic_check_only+0x403/0x78b [ 100.986912] > >> drm_mode_atomic_ioctl+0x49b/0x546 [ 100.991358] ? > >> drm_ioctl+0x1c1/0x3b3 [ 100.994936] ? > >> drm_atomic_set_property+0x92a/0x92a > >> [ 100.999725] drm_ioctl_kernel+0xdc/0x149 [ 101.003648] > >> drm_ioctl+0x27f/0x3b3 [ 101.007051] ? > >> drm_atomic_set_property+0x92a/0x92a > >> [ 101.011842] amdgpu_drm_ioctl+0x49/0x7d [ 101.015679] > >> __se_sys_ioctl+0x7c/0xb8 [ 101.015685] do_syscall_64+0x5f/0xb8 [ > >> 101.015690] ? __irq_exit_rcu+0x34/0x96 > >> > >> [HOW] > >> It calles populate_dml_pipes which uses doubles to initialize. > > > >calls > > > >Excuse my ignorance. So using doubles causes a context switch? > If we don’t add FPU protection then context switch can happen. DC_FP_START would in-turn call preempt_disable. > > >> Adding FPU protection avoids context switch and probable loss of vba > >> context as there is potential contention while drm debug logs are enabled. > >> > >> Signed-off-by: CHANDAN VURDIGERE NATARAJ > >> <chandan.vurdigerenataraj@xxxxxxx> > >> > >> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c > >> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c > >> index 826970f2bd0a..f27262417abe 100644 > >> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c > >> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c > >> @@ -1750,7 +1750,9 @@ bool dcn31_validate_bandwidth(struct dc *dc, > >> > >> BW_VAL_TRACE_COUNT(); > >> > >> + DC_FP_START(); > >> out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, > >> &vlevel, fast_validate); > >> + DC_FP_END(); > >> > >> // Disable fast_validate to set min dcfclk in alculate_wm_and_dlg > >> if (pipe_cnt == 0) Acked-by: Melissa Wen <mwen@xxxxxxxxxx> In fact, I revisited the code and realized the FPU protection is missing for two other dcn30 functions called by dcn31: - dcn30_populate_dml_writeback_from_context() - dcn30_set_mcif_arb_params() I'll send a patch addressing this shortly. CC'ing: Siqueira Best regards, Melissa > > > > > >Kind regards, > > > >Paul
Attachment:
signature.asc
Description: PGP signature