Re: imxdrm issue on SABRE Lite

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Feb 12, 2017 at 12:15:46AM +0000, Russell King - ARM Linux wrote:
> On Sat, Feb 11, 2017 at 09:09:34PM +0000, Dan MacDonald wrote:
> > [   43.997066] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:24:crtc-0] flip_done timed out
> > [   55.517063] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:24:crtc-0] flip_done timed out
> 
> This seems to lay the foundation for the kernel to Oops itself later.
> The problem seems to be this:
> 
> drm_atomic_helper_commit(state->dev, state, false)
> - drm_atomic_helper_setup_commit(state, false)
>   - foreach crtc in state
>     - commit->event = kzalloc()
>     - crtc_state->event = commit->event
>     - crtc_state->event->base.completion = &commit->flip_done
> ...
> - commit_tail(state)
>   - funcs->atomic_commit_tail(state)
> ...
>     - drm_atomic_helper_commit_planes(dev, state,
>                                         DRM_PLANE_COMMIT_ACTIVE_ONLY |
>                                         DRM_PLANE_COMMIT_NO_DISABLE_AFTER_MODES$      - foreach active crtc in state
>         - funcs->atomic_begin(crtc, old_crtc_state)
>           - ipu_crtc_atomic_begin()
>             - drm_crtc_vblank_on()
>             - if crtc->state->event
>               - drm_crtc_arm_vblank_event(crtc, crtc->state->event)
>               - crtc->state->event = NULL
> 
> At this point, the "commit->flip_done" completion is queued with the
> event onto the vblank list.
> ...
>   - drm_atomic_helper_commit_cleanup_done(state)
>     - foreach crtc in state
>       - try_wait_for_completion(&commit->hw_done)
>       - wait_for_completion_timeout(&commit->flip_done, 10sec)
> 
> This is where we get the timeout message.
> 
>   - drm_atomic_state_free(state)
> 
> This "clears" the commit state (calling drm_crtc_commit_put() on it)
> which has the effect of kfree()'ing the structure containing the
> flip_done, but which is still on the vblank list.
> 
> The next time we try to set a mode, the result is that a call to
> drm_crtc_vblank_off() causes all queued events to be sent, including
> the now kfree()'d flip_done completion, resulting in the reported
> kernel oops.
> 
> It seems others are also suffering similar issues when the flip_done
> completion times out with other drivers:
> 
> https://lkml.org/lkml/2016/12/1/171
> https://bugs.freedesktop.org/show_bug.cgi?id=96781
> https://lists.opensuse.org/opensuse-bugs/2016-10/msg03011.html
> https://patchwork.kernel.org/patch/9280223/ (which is me...)
> 
> This is likely the same, although the timeout line was not captured:
> https://bugzilla.redhat.com/show_bug.cgi?id=1415180
> https://bodhi.fedoraproject.org/updates/kernel-4.8.7-200.fc24
> 
> So, can we please avoid killing the kernel when the hardware doesn't
> quite behave as we want it to?
> 
> Right now, when we oops the kernel, we're leaking all the memory
> associated with the atomic modeset, so if we stop oopsing the kernel
> but still leak the memory, surely that would be an improvement?
> Maybe something like the untested patch at the bottom of this mail?
> 
> It would give the opportunity to poke about on a failed system to
> work out what happened and maybe why the hardware misbehaved.
> 
> The real answer is for the hardware to behave, but we can't always
> have our cake.
> 
> Note - I can't reproduce Dan's problem here on 4.10-rc7 as I suspect
> it needs multiple CRTCs/outputs running in the IPU to trigger it,
> which Sabre lite has.  I'll try enabling the (disconnected) LVDS
> output tomorrow (I have Fabio's LVDS patch knocking about), but I
> suspect those with a deeper knowledge of the IPU need to investigate
> what's going on.
> 
> > [  214.765689] imx-ipuv3 2400000.ipu: DC stop timeout after 50 ms
> > [  214.825688] Unable to handle kernel NULL pointer dereference at
> > virtual address 00000000
> > [  214.833783] pgd = ed1b8000
> > [  214.836491] [00000000] *pgd=4c974831
> > [  214.840084] Internal error: Oops: 17 [#1] SMP ARM
> > [  214.844789] Modules linked in: mousedev snd_soc_sgtl5000
> > snd_soc_fsl_ssi snd_soc_imx_sgtl5000 imx_pcm_fiq imx_pcm_dma
> > snd_soc_fsl_asrc snd_soc_fsl_asoc_card snd_soc_core dw_hdmi_ahb_audio
> > snd_pcm_dmaengine caam_jr imx_ipuv3_crtc snd_ac97_codec coda
> > v4l2_mem2mem videobuf2_dma_contig ac97_bus imx_ipu_v3
> > snd_soc_imx_audmux snd_pcm videobuf2_vmalloc videobuf2_memops
> > videobuf2_v4l2 videobuf2_core dw_hdmi_imx caam imx2_wdt ofpart spi_imx
> > evdev dw_hdmi etnaviv imx_ldb pwm_imx snd_timer parallel_display
> > uio_pdrv_genirq uio imxdrm sch_fq_codel ip_tables x_tables
> > [  214.894338] CPU: 2 PID: 316 Comm: Xorg Tainted: G        W
> > 4.9.8-1-ARCH #1
> > [  214.901735] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > [  214.908264] task: ed2c4d00 task.stack: ed2a6000
> > [  214.912803] PC is at __wake_up_common+0x1c/0x80
> > [  214.917337] LR is at __wake_up_locked+0x14/0x1c
> > [  214.921871] pc : [<c0186568>]    lr : [<c018662c>]    psr: a0070093
> > [  214.921871] sp : ed2a7c68  ip : 00000000  fp : c0fa2a70
> > [  214.933348] r10: c0f37384  r9 : 00000003  r8 : 00000000
> > [  214.938574] r7 : 00000000  r6 : edbf3410  r5 : edbf3408  r4 : edbf340c
> > [  214.945101] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : edbf340c
> > [  214.951630] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
> > Segment none
> > [  214.958854] Control: 10c5387d  Table: 3d1b804a  DAC: 00000051
> > [  214.964600] Process Xorg (pid: 316, stack limit = 0xed2a6220)
> > [  214.970347] Stack: (0xed2a7c68 to 0xed2a8000)
> > [  214.974708] 7c60:                   00000000 edbf340c edbf3408
> > a0070093 edb46114 edb46000
> > [  214.982889] 7c80: 0000039f c0f37384 c0fa2a70 c018662c 00000000
> > c0fa1144 bf21c4fc c018704c
> > [  214.991069] 7ca0: edbf3380 edbf3380 00000000 c06ccac0 edfa1400
> > edbf3380 00000000 c06d023c
> > [  214.999250] 7cc0: 0000039f 00000000 edab2a00 80070013 edb4611c
> > c12040cc 00000000 00000000
> > [  215.007430] 7ce0: 00000000 00040908 edb3f410 ed412c40 00000000
> > 00000000 ed26a800 edb47c18
> > [  215.015610] 7d00: c0f37384 c0fa1144 bf21c4fc c06c3e3c edc67780
> > c06c1c24 edb46000 edb46000
> > [  215.023791] 7d20: bf01862c ed412c40 edb46000 00000000 edb46000
> > ed410100 ed412240 ed412240
> > [  215.031971] 7d40: c0c0c0c0 bf0182bc ed412c40 bf018ca4 00000000
> > c06c40f4 ed412c40 00000000
> > [  215.040152] 7d60: 00000000 c06c41ac ed412c40 00000000 ed2a7dd0
> > edb47c18 ed410100 c06c46dc
> > [  215.048332] 7d80: eda36180 edb47c18 00000001 c12040cc ed410100
> > c06d5f8c ed2a7e4c edb46000
> > [  215.056513] 7da0: 00000001 c12040cc ed410100 c06d74c4 edaa5980
> > edb46248 edb47c18 eda3618c
> > [  215.064694] 7dc0: eda36180 ed412240 ed410100 c12040cc eda36180
> > edb47c18 ed410100 00000000
> > [  215.072874] 7de0: 00000000 ed412240 00000001 00040908 ed2a7e70
> > 00000051 00000068 c12040cc
> > [  215.081055] 7e00: c0cbf0a0 00000068 edf66800 ed2a7e4c c06864a2
> > c06ce704 0000e201 00000001
> > [  215.089236] 7e20: c0fa22f0 00040908 f50417fa ed2a7e4c be9d1910
> > 000000a2 edb46000 00000068
> > [  215.097415] 7e40: 00000062 c06d7020 c1208504 008a1210 00000000
> > 00000001 00000018 00000047
> > [  215.105596] 7e60: 00000000 00000000 00000000 00000001 00007530
> > 03480320 03a00378 01e00000
> > [  215.113775] 7e80: 01f001ed 0000020d 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [  215.121955] 7ea0: 00000000 00000000 00000000 00000000 00000000
> > c0b0a268 00000000 00040908
> > [  215.130137] 7ec0: ed2a7edc ed2a7fb0 00000008 00040908 00000000
> > c12040cc be9d1910 edd8cbc8
> > [  215.138317] 7ee0: edcc9b40 be9d1910 0000000b 00000000 00000000
> > c0288638 004f10a0 c0101308
> > [  215.146497] 7f00: 00000000 eff6fe48 eff6fe24 eff6fe00 000112da
> > c02bb9bc 00000008 c01b4a08
> > [  215.154677] 7f20: 00040908 c12040cc 00000000 ed29b1b8 00000003
> > be9d1910 be9d1808 ffffe000
> > [  215.162857] 7f40: 00000051 ed2a6100 00000100 00000000 ed2a6000
> > c0102fa4 be9d1678 00040908
> > [  215.171038] 7f60: c12040cc 00000000 edcc9b41 edcc9b40 c06864a2
> > be9d1910 0000000b 00000000
> > [  215.179218] 7f80: 00000000 c0288f78 b6fc3c90 be9d1910 c06864a2
> > 00000036 c0107e24 ed2a6000
> > [  215.187398] 7fa0: 00000000 c0107c60 b6fc3c90 be9d1910 0000000b
> > c06864a2 be9d1910 00000001
> > [  215.195579] 7fc0: b6fc3c90 be9d1910 c06864a2 00000036 008a1210
> > 00000047 00000018 00000000
> > [  215.203759] 7fe0: b6ddf088 be9d18f4 b6dc7ad4 b6b11adc 40070010
> > 0000000b 3fffd861 3fffdc61
> > [  215.211946] [<c0186568>] (__wake_up_common) from [<c018662c>]
> > (__wake_up_locked+0x14/0x1c)
> > [  215.220216] [<c018662c>] (__wake_up_locked) from [<c018704c>]
> > (complete_all+0x34/0x44)
> > [  215.228141] [<c018704c>] (complete_all) from [<c06ccac0>]
> > (drm_send_event_locked+0x28/0x11c)
> > [  215.236588] [<c06ccac0>] (drm_send_event_locked) from [<c06d023c>]
> > (drm_vblank_off+0x1b0/0x21c)
> > [  215.245296] [<c06d023c>] (drm_vblank_off) from [<c06c3e3c>]
> > (drm_atomic_helper_commit_modeset_disables+0x1dc/0x3fc)
> > [  215.255750] [<c06c3e3c>]
> > (drm_atomic_helper_commit_modeset_disables) from [<bf0182bc>]
> > (imx_drm_atomic_commit_tail+0x18/0x58 [imxdrm])
> > [  215.267843] [<bf0182bc>] (imx_drm_atomic_commit_tail [imxdrm]) from
> > [<c06c40f4>] (commit_tail+0x40/0x5c)
> > [  215.277327] [<c06c40f4>] (commit_tail) from [<c06c41ac>]
> > (drm_atomic_helper_commit+0x94/0xd8)
> > [  215.285858] [<c06c41ac>] (drm_atomic_helper_commit) from
> > [<c06c46dc>] (drm_atomic_helper_set_config+0x78/0x9c)
> > [  215.295865] [<c06c46dc>] (drm_atomic_helper_set_config) from
> > [<c06d5f8c>] (drm_mode_set_config_internal+0x58/0xdc)
> > [  215.306219] [<c06d5f8c>] (drm_mode_set_config_internal) from
> > [<c06d74c4>] (drm_mode_setcrtc+0x4a4/0x550)
> > [  215.315703] [<c06d74c4>] (drm_mode_setcrtc) from [<c06ce704>]
> > (drm_ioctl+0x214/0x44c)
> > [  215.323540] [<c06ce704>] (drm_ioctl) from [<c0288638>]
> > (do_vfs_ioctl+0xac/0x980)
> > [  215.330940] [<c0288638>] (do_vfs_ioctl) from [<c0288f78>]
> > (SyS_ioctl+0x6c/0x7c)
> > [  215.338258] [<c0288f78>] (SyS_ioctl) from [<c0107c60>]
> > (ret_fast_syscall+0x0/0x3c)
> > [  215.345832] Code: e5b61004 e1a08003 e59d7028 e1560001 (e5913000)
> > [  215.351929] ---[ end trace e8a77aa320be7e56 ]---
> 
>  drivers/gpu/drm/drm_atomic_helper.c | 15 +++++++++------
>  include/drm/drm_atomic_helper.h     |  2 +-
>  2 files changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 21f992605541..46668d071d6a 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -1201,9 +1201,8 @@ static void commit_tail(struct drm_atomic_state *state)
>  	else
>  		drm_atomic_helper_commit_tail(state);
>  
> -	drm_atomic_helper_commit_cleanup_done(state);
> -
> -	drm_atomic_state_free(state);
> +	if (drm_atomic_helper_commit_cleanup_done(state) == 0)
> +		drm_atomic_state_free(state);

Chris (Cc'ed) added reference counting to atomic state for v4.10, maybe
that already fixes the issue?

Thierry

>  }
>  
>  static void commit_work(struct work_struct *work)
> @@ -1591,12 +1590,12 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_hw_done);
>   * This is part of the atomic helper support for nonblocking commits, see
>   * drm_atomic_helper_setup_commit() for an overview.
>   */
> -void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
> +int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
>  {
>  	struct drm_crtc *crtc;
>  	struct drm_crtc_state *crtc_state;
>  	struct drm_crtc_commit *commit;
> -	int i;
> +	int i, failed = 0;
>  	long ret;
>  
>  	for_each_crtc_in_state(state, crtc, crtc_state, i) {
> @@ -1621,15 +1620,19 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
>  		 * not hold a reference of its own. */
>  		ret = wait_for_completion_timeout(&commit->flip_done,
>  						  10*HZ);
> -		if (ret == 0)
> -			DRM_ERROR("[CRTC:%d:%s] flip_done timed out\n",
> +		if (ret == 0) {
> +			DRM_ERROR("[CRTC:%d:%s] flip_done timed out, memory leaked\n",
>  				  crtc->base.id, crtc->name);
> +			failed = -ETIMEDOUT;
> +		}
>  
>  		spin_lock(&crtc->commit_lock);
>  del_commit:
>  		list_del(&commit->commit_entry);
>  		spin_unlock(&crtc->commit_lock);
>  	}
> +
> +	return failed;
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_commit_cleanup_done);
>  
> diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
> index 7ff92b09fd9c..ee3d642c1feb 100644
> --- a/include/drm/drm_atomic_helper.h
> +++ b/include/drm/drm_atomic_helper.h
> @@ -88,7 +88,7 @@ int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
>  				   bool nonblock);
>  void drm_atomic_helper_wait_for_dependencies(struct drm_atomic_state *state);
>  void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *state);
> -void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
> +int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
>  
>  /* implementations for legacy interfaces */
>  int drm_atomic_helper_update_plane(struct drm_plane *plane,
> 
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Attachment: signature.asc
Description: PGP signature

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux