[AMD Official Use Only]

Hi Jani:

> -----Original Message-----
> From: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> Sent: Wednesday, November 3, 2021 7:31 PM
> To: Yuan, Perry <Perry.Yuan@xxxxxxx>; Maarten Lankhorst
> <maarten.lankhorst@xxxxxxxxxxxxxxx>; Maxime Ripard
> <mripard@xxxxxxxxxx>; Thomas Zimmermann <tzimmermann@xxxxxxx>;
> David Airlie <airlied@xxxxxxxx>; Daniel Vetter <daniel@xxxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Huang,
> Shimmer <Xinmei.Huang@xxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>
> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer dereference
> on drm_dp_dpcd_access
>
> [CAUTION: External Email]
>
> On Wed, 03 Nov 2021, "Yuan, Perry" <Perry.Yuan@xxxxxxx> wrote:
> > [AMD Official Use Only]
> >
> > Hi Jani:
> >
> >> -----Original Message-----
> >> From: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> >> Sent: Tuesday, November 2, 2021 4:40 PM
> >> To: Yuan, Perry <Perry.Yuan@xxxxxxx>; Maarten Lankhorst
> >> <maarten.lankhorst@xxxxxxxxxxxxxxx>; Maxime Ripard
> >> <mripard@xxxxxxxxxx>; Thomas Zimmermann <tzimmermann@xxxxxxx>;
> >> David Airlie <airlied@xxxxxxxx>; Daniel Vetter <daniel@xxxxxxxx>
> >> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> Huang, Shimmer <Xinmei.Huang@xxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>
> >> Subject: RE: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> dereference on drm_dp_dpcd_access
> >>
> >> [CAUTION: External Email]
> >>
> >> On Tue, 02 Nov 2021, "Yuan, Perry" <Perry.Yuan@xxxxxxx> wrote:
> >> > [AMD Official Use Only]
> >> >
> >> > Hi Jani:
> >> > Thanks for your comments.
> >> >
> >> >> -----Original Message-----
> >> >> From: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> >> >> Sent: Monday, November 1, 2021 9:07 PM
> >> >> To: Yuan, Perry <Perry.Yuan@xxxxxxx>; Maarten Lankhorst
> >> >> <maarten.lankhorst@xxxxxxxxxxxxxxx>; Maxime Ripard
> >> >> <mripard@xxxxxxxxxx>; Thomas Zimmermann <tzimmermann@xxxxxxx>;
> >> >> David Airlie <airlied@xxxxxxxx>; Daniel Vetter <daniel@xxxxxxxx>
> >> >> Cc: Yuan, Perry <Perry.Yuan@xxxxxxx>;
> >> >> dri-devel@xxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> >> Huang, Shimmer <Xinmei.Huang@xxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>
> >> >> Subject: Re: [PATCH v2] drm/dp: Fix aux->transfer NULL pointer
> >> >> dereference on drm_dp_dpcd_access
> >> >>
> >> >> [CAUTION: External Email]
> >> >>
> >> >> On Mon, 01 Nov 2021, Perry Yuan <Perry.Yuan@xxxxxxx> wrote:
> >> >> > Fix below crash by adding a check in the drm_dp_dpcd_access
> >> >> > which ensures that aux->transfer was actually initialized earlier.
> >> >>
> >> >> Gut feeling says this is papering over a real usage issue
> >> >> somewhere else. Why is the aux being used for transfers before
> >> >> ->transfer has been set? Why should the dp helper be defensive
> >> >> against all kinds of misprogramming?
> >> >>
> >> >>
> >> >> BR,
> >> >> Jani.
> >> >>
> >> >
> >> > The issue was found by Intel IGT test suite, graphic by pass test case.
> >> > https://gitlab.freedesktop.org/drm/igt-gpu-tools
> >> > normally use case will not see the issue.
> >> > To avoid this issue happy again when we run the test case , it will
> >> > be nice to add a check before the transfer is called.
> >> > And we can see that it really needs to have a check here to make
> >> > ITG &kernel happy.
> >>
> >> You're missing my point. What is the root cause? Why do you have the
> >> aux device or connector registered before ->transfer function is
> >> initialized. I don't think you should do that.
> >>
> >> BR,
> >> Jani.
> >>
> >
> > One potential IGT fix patch to resolve the test case failure is:
> >
> > tests/amdgpu/amd_bypass.c
> > data->pipe_crc = igt_pipe_crc_new(data->drm_fd, data->pipe_id,
> > - AMDGPU_PIPE_CRC_SOURCE_DPRX);
> > + INTEL_PIPE_CRC_SOURCE_AUTO);
> > The kernel panic error gone after change "dprx" to "auto" in the IGT test.
> >
> > In my view ,the IGT amdgpu bypass test will do some common setup work
> > including crc piple, source.
> > When the IGT sets up a new CRC pipe capture source for amdgpu bypass
> > test, the SOURCE was set as "dprx" instead of "auto"
> > It makes "amdgpu_dm_crtc_set_crc_source()" failed to set correct AUX
> > and it's transfer function invalid .
> > The system I tested use HDMI port connected to monitor .
> >
> > amdgpu_dm_crtc_set_crc_source-> (aux = (aconn->port) ? &aconn->port->aux : &aconn->dm_dp_aux.aux;)
> > drm_dp_start_crc ->
> > drm_dp_dpcd_readb-> aux->transfer is NULL, issue here.
> > The fix will use the "auto" keyword, which will let the driver select a
> > default source of frame CRCs for this CRTC.
> >
> > Correct me if have some wrong points.
>
> Apparently I'm completely failing to communicate my POV to you.
>
> If you have a kernel oops, the fix needs to be in the kernel, not IGT.
>
> The question is, why is it possible for IGT (or any userspace) to trigger AUX
> communication when the ->transfer function is not set? In my opinion, the
> kernel driver should not have exposed the interface at all if the ->transfer
> function is not set.
> The interface is useless without the ->transfer function.
> IMO, that's the bug.
>

Yes, you are correct: the transfer function should not be called before it is ready.
Let me explain in more detail how I see it.

I do not think the root cause in this case is that aux->transfer is called before it is registered.
I suspect the issue is triggered by a wrong CRC pipe source.

aux->transfer is actually registered when the amdgpu DM is registered at kernel boot.
The IGT test was run later, after the system had logged in to the GNOME desktop.

amdgpu_dm_initialize_dp_connector->
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;

The test case fails when IGT sets a "DPRX" CRC pipe source while only HDMI is connected to the monitor.
At that point aux->transfer is NULL, and the dp helper does not check whether the transfer pointer is NULL.
It calls transfer to do the DPCD read, and then you see the kernel panic log.

amdgpu_dm_crtc_funcs->
amdgpu_dm_crtc_set_crc_source->
drm_dp_start_crc

* If only the DP cable is connected, the issue does not happen and the test passes.
* If I change the CRC source to "auto", the kernel does not panic either.

Maybe the failing test case needs to be run on DP instead of HDMI; I am not sure at the moment.

I hope this info helps.

Perry.

>
> BR,
> Jani.
> >
> > Thank you!
> > Perry.
> >
> >>
> >> >
> >> > Perry.
> >> >
> >> >>
> >> >> >
> >> >> > BUG: kernel NULL pointer dereference, address: 0000000000000000
> >> >> > PGD 0 P4D 0
> >> >> > Oops: 0010 [#1] SMP NOPTI
> >> >> > RIP: 0010:0x0
> >> >> > Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> >> >> > RSP: 0018:ffffa8d64225bab8 EFLAGS: 00010246
> >> >> > RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffffa8d64225bb5e
> >> >> > RDX: ffff93151d921880 RSI: ffffa8d64225bac8 RDI: ffff931511a1a9d8
> >> >> > RBP: ffffa8d64225bb10 R08: 0000000000000001 R09: ffffa8d64225ba60
> >> >> > R10: 0000000000000002 R11: 000000000000000d R12: 0000000000000001
> >> >> > R13: 0000000000000000 R14: ffffa8d64225bb5e R15: ffff931511a1a9d8
> >> >> > FS: 00007ff8ea7fa9c0(0000) GS:ffff9317fe6c0000(0000) knlGS:0000000000000000
> >> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> >> > CR2: ffffffffffffffd6 CR3: 000000010d5a4000 CR4: 0000000000750ee0
> >> >> > PKRU: 55555554
> >> >> > Call Trace:
> >> >> > drm_dp_dpcd_access+0x72/0x110 [drm_kms_helper]
> >> >> > drm_dp_dpcd_read+0xb7/0xf0 [drm_kms_helper]
> >> >> > drm_dp_start_crc+0x38/0xb0 [drm_kms_helper]
> >> >> > amdgpu_dm_crtc_set_crc_source+0x1ae/0x3e0 [amdgpu]
> >> >> > crtc_crc_open+0x174/0x220 [drm]
> >> >> > full_proxy_open+0x168/0x1f0
> >> >> > ? open_proxy_open+0x100/0x100
> >> >> > do_dentry_open+0x156/0x370
> >> >> > vfs_open+0x2d/0x30
> >> >> >
> >> >> > v2: fix some typo
> >> >> >
> >> >> > Signed-off-by: Perry Yuan <Perry.Yuan@xxxxxxx>
> >> >> > ---
> >> >> > drivers/gpu/drm/drm_dp_helper.c | 4 ++++
> >> >> > 1 file changed, 4 insertions(+)
> >> >> >
> >> >> > diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> >> >> > index 6d0f2c447f3b..76b28396001a 100644
> >> >> > --- a/drivers/gpu/drm/drm_dp_helper.c
> >> >> > +++ b/drivers/gpu/drm/drm_dp_helper.c
> >> >> > @@ -260,6 +260,10 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
> >> >> > 	msg.buffer = buffer;
> >> >> > 	msg.size = size;
> >> >> >
> >> >> > +	/* No transfer function is set, so not an available DP connector */
> >> >> > +	if (!aux->transfer)
> >> >> > +		return -EINVAL;
> >> >> > +
> >> >> > 	mutex_lock(&aux->hw_mutex);
> >> >> >
> >> >> > 	/*
> >> >>
> >> >> --
> >> >> Jani Nikula, Intel Open Source Graphics Center
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
>
> --
> Jani Nikula, Intel Open Source Graphics Center
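
For illustration only, the two positions in this thread can be modelled in plain user-space C. This is a rough sketch, not kernel or amdgpu code: every name below (model_aux, model_connector, model_dpcd_read, model_set_crc_source, model_fake_transfer) is hypothetical and only mirrors the shape of the discussion. Option A is the NULL check the posted patch adds to the helper; Option B assumes the driver could reject a "dprx" CRC source on a connector with no AUX transfer hook before any DPCD access is attempted.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct drm_dp_aux; only the callback matters. */
struct model_aux {
	/* Returns bytes transferred or a negative errno. */
	int (*transfer)(struct model_aux *aux, uint8_t *buf, size_t size);
};

/* Hypothetical connector: HDMI-only connectors never get a transfer hook. */
struct model_connector {
	int is_dp;
	struct model_aux aux;
};

enum model_crc_source { MODEL_SRC_AUTO, MODEL_SRC_DPRX };

/* Option A (shape of the posted patch): the helper refuses a NULL hook. */
static int model_dpcd_read(struct model_aux *aux, uint8_t *buf, size_t size)
{
	if (!aux->transfer)
		return -EINVAL;
	return aux->transfer(aux, buf, size);
}

/*
 * Option B (assumed driver-side check): a "dprx" source is rejected up
 * front when the connector cannot do AUX transfers, so user space can
 * never reach the helper with an unusable aux.
 */
static int model_set_crc_source(struct model_connector *conn,
				enum model_crc_source src)
{
	uint8_t byte;
	int ret;

	if (src != MODEL_SRC_DPRX)
		return 0; /* "auto" needs no AUX access in this model */

	if (!conn->is_dp || !conn->aux.transfer)
		return -EINVAL;

	ret = model_dpcd_read(&conn->aux, &byte, 1);
	return ret < 0 ? ret : 0;
}

/* Fake transfer hook that pretends the sink answered with zeros. */
static int model_fake_transfer(struct model_aux *aux, uint8_t *buf, size_t size)
{
	(void)aux;
	memset(buf, 0, size);
	return (int)size;
}
```

In this model an HDMI-only connector with a "dprx" source gets -EINVAL from either check instead of jumping through a NULL pointer, while "auto" on HDMI and "dprx" on a DP connector with a transfer hook both succeed. Whether the real fix belongs in the helper, in the driver, or in both is exactly the open question above.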