On Tue, 2018-12-04 at 21:43 +0200, Ville Syrjälä wrote:
> On Tue, Dec 04, 2018 at 11:46:39AM +0200, Mika Kahola wrote:
> > Occasionally, we get the following error in our CI runs
> > 
> > [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> > [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> > [853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 8b
> > [853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
> > [853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
> > [853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
> > [853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
> > [853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > [853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
> > [853.132897] FS:  0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
> > [853.132904] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
> > [853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [853.132927] PKRU: 55555554
> > [853.132932] Call Trace:
> > [853.132949]  ? wait_woken+0xa0/0xa0
> > [853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
> > [853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
> > [853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
> > [853.133324]  process_one_work+0x262/0x630
> > [853.133349]  worker_thread+0x37/0x380
> > [853.133365]  ? process_one_work+0x630/0x630
> > [853.133373]  kthread+0x119/0x130
> > [853.133383]  ? kthread_park+0x80/0x80
> > [853.133400]  ret_from_fork+0x3a/0x50
> > [853.133433] irq event stamp: 1426928
> > 
> > I suspect that this is caused by a race condition when retraining the
> > DisplayPort link. My proposal is to wait for one additional vblank
> > event before we send out a hotplug event to userspace for reprobing.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108835
> 
> The first problem in the log is
> 
> <3> [853.020316] [drm:intel_ddi_prepare_link_retrain [i915]] *ERROR* Timeout waiting for DDI BUF A idle bit
> 
> That's where one should start.
> 
> Some suspects:
> - icl_enable/disable_phy_clock_gating()
> - intel_ddi_enable/disable_pipe_clock()

Thanks! I will have a look at those too.

On the other hand, this test failure ending up as INCOMPLETE in CI might have been caused by a Jenkins issue:

[21/79] ( 873s left) kms_flip (blocking-absolute-wf_vblank-interruptible)
FATAL: command execution failed
java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
	at hudson.remoting.Command.readFrom(Command.java:140)
	at hudson.remoting.Command.readFrom(Command.java:126)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: java.io.IOException: Backing channel 'shard-iclb7' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
	at com.sun.proxy.$Proxy64.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1144)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1136)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
	at hudson.model.Run.execute(Run.java:1810)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)
FATAL: Unable to delete script file /tmp/jenkins9130735500847889838.sh
java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
	at hudson.remoting.Command.readFrom(Command.java:140)
	at hudson.remoting.Command.readFrom(Command.java:126)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException:
Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on shard-iclb7 failed. The channel is closing down or has closed down
	at hudson.remoting.Channel.call(Channel.java:948)
	at hudson.FilePath.act(FilePath.java:1070)
	at hudson.FilePath.act(FilePath.java:1059)
	at hudson.FilePath.delete(FilePath.java:1563)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
	at hudson.model.Run.execute(Run.java:1810)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)

> > 
> > Cc: Manasi Navare <manasi.d.navare@xxxxxxxxx>
> > Signed-off-by: Mika Kahola <mika.kahola@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index a6907a1761ab..6ce7d54e49af 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -6746,6 +6746,10 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  {
> >  	struct intel_connector *intel_connector;
> >  	struct drm_connector *connector;
> > +	struct drm_connector_state *conn_state;
> > +	struct drm_i915_private *dev_priv;
> > +	struct intel_crtc *crtc;
> > +	struct intel_crtc_state *crtc_state;
> > 
> >  	intel_connector = container_of(work, typeof(*intel_connector),
> >  				       modeset_retry_work);
> > @@ -6753,6 +6757,14 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
> >  		      connector->name);
> > 
> > +	dev_priv = to_i915(connector->dev);
> > +	conn_state = intel_connector->base.state;
> > +
> > +	crtc = to_intel_crtc(conn_state->crtc);
> > +	crtc_state = to_intel_crtc_state(crtc->base.state);
> > +
> > +	WARN_ON(!intel_crtc_has_dp_encoder(crtc_state));
> > +
> >  	/* Grab the locks before changing connector property*/
> >  	mutex_lock(&connector->dev->mode_config.mutex);
> >  	/* Set connector link status to BAD and send a Uevent to notify
> > @@ -6761,6 +6773,21 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  	drm_connector_set_link_status_property(connector,
> >  					       DRM_MODE_LINK_STATUS_BAD);
> >  	mutex_unlock(&connector->dev->mode_config.mutex);
> > +
> > +	/* Suppress underruns caused by re-training */
> > +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, false);
> > +	if (crtc_state->has_pch_encoder)
> > +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> > +						      intel_crtc_pch_transcoder(crtc), false);
> > +
> > +	/* Keep underrun reporting disabled until things are stable */
> > +	intel_wait_for_vblank(dev_priv, crtc->pipe);
> > +
> > +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, true);
> > +	if (crtc_state->has_pch_encoder)
> > +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> > +						      intel_crtc_pch_transcoder(crtc), true);
> > +
> >  	/* Send Hotplug uevent so userspace can reprobe */
> >  	drm_kms_helper_hotplug_event(connector->dev);
> >  }
> > -- 
> > 2.17.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx