[Bug 111747] [CI][DRMTIP] igt@ - incomplete - Jenkins gives up

changed bug 111747
What           Removed                           Added
Component      IGT                               DRM/Intel
Assignee       dri-devel@lists.freedesktop.org   intel-gfx-bugs@lists.freedesktop.org
Priority       not set                           medium
Severity       not set                           normal
QA Contact                                       intel-gfx-bugs@lists.freedesktop.org
i915 features  GEM/Other                         CI Infra

Comment # 15 on bug 111747 from
Happens on TGL in 5 of 16 runs (31.2%); last seen in the previous build.

(I mention TGL because this bug seems to track the TGL occurrences, but it can
happen on any machine.)

User impact for this issue in particular is N/A since it's a CI issue. However,
incompletes reduce coverage for every test that doesn't get run because of
them, so the impact is potentially severe. It doesn't happen in every run,
though, and it hits arbitrary tests, so the actual coverage loss stays below
that worst case.

What happens here is:

1) Jenkins connects to DUT through ssh and launches tests
2) Jenkins loses ssh connection
3) The Jenkins job for executing the test finishes, because the ssh command
completed
4) At the end of finishing a test, a reboot-and-collect job is executed
5) The reboot-and-collect job connects through ssh and reboots the machine
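
Step 3 is the weak point: the ssh command returning is treated as the test
round being done. As a rough sketch of how a caller could tell the two cases
apart, ssh(1) documents that it exits with the remote command's status, or
with 255 if ssh itself hit an error. The function and outcome names below are
hypothetical, and nothing here is taken from the actual Jenkins job.

```python
# Exit status ssh(1) documents for "an error occurred" in ssh itself.
SSH_ERROR_STATUS = 255


def classify_ssh_exit(returncode: int) -> str:
    """Map the exit status of `ssh host command` to a coarse outcome.

    The outcome strings are made up for this sketch.
    """
    if returncode == 0:
        return "remote-command-succeeded"
    if returncode == SSH_ERROR_STATUS:
        # Ambiguous: the remote command may itself have exited with 255,
        # but a dropped connection also ends up here.
        return "ssh-error-or-connection-lost"
    return "remote-command-failed"
```

With a check like this, a status of 255 would be a reason to go and look at
the DUT before scheduling the reboot-and-collect job, rather than proof that
the test round ended.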

A logging step was added to the remote reboot job: tests that die because the
reboot command was invoked prematurely now get a log entry in dmesg stating
that power.sh is taking this machine down. From that we can determine that the
network didn't die completely, just the ssh connection.
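
A minimal sketch of how a collected dmesg could be checked for that entry when
triaging an incomplete. The exact wording of the log line is not given here,
so the pattern only looks for the string "power.sh"; the function name and the
sample lines in the usage note are made up for illustration.

```python
import re

# Assumption: the entry mentions "power.sh" somewhere on the line. The real
# message text is not quoted in this comment, so match only on that name.
REBOOT_MARKER = re.compile(r"power\.sh")


def killed_by_reboot_job(dmesg_text: str, marker: re.Pattern = REBOOT_MARKER) -> bool:
    """Return True if any dmesg line carries the reboot job's marker."""
    return any(marker.search(line) for line in dmesg_text.splitlines())
```

For example, with invented sample lines,
killed_by_reboot_job("[ 10.0] test start\n[ 42.0] power.sh: rebooting") is
True, while a log without the marker gives False.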

There is a plan to solve this. igt_runner will be changed to expose an AF_LOCAL
socket for outside control, so the Jenkins job for executing tests will no
longer need to keep an ssh connection active for the duration of the whole
test round. Instead, tests will be launched in the background (with screen or
tmux or just nohup). When/if the ssh connection fails, the Jenkins job will
reconnect and check through igt_runner's control channel whether a test is
still running.
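
To illustrate the idea only, here is a small sketch of a status query over an
AF_LOCAL socket (AF_UNIX in Python; the two names refer to the same address
family). The "STATUS" request, the one-line reply, the socket path and all
function names are invented for this example. igt_runner's real control
interface is not specified in this comment and may look nothing like this.

```python
import os
import socket
import tempfile
import threading


def serve_status(sock_path, get_status, ready, stop):
    """Stand-in for the runner side: answer 'STATUS' with one line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(sock_path)
        srv.listen()
        srv.settimeout(0.1)  # wake up regularly to check the stop flag
        ready.set()
        while not stop.is_set():
            try:
                conn, _ = srv.accept()
            except socket.timeout:
                continue
            with conn:
                if conn.recv(64).strip() == b"STATUS":
                    conn.sendall(get_status().encode() + b"\n")


def query_status(sock_path, timeout=2.0):
    """Stand-in for the Jenkins side after it has reconnected over ssh."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.settimeout(timeout)
        cli.connect(sock_path)
        cli.sendall(b"STATUS\n")
        return cli.recv(64).decode().strip()


def demo():
    """Start the stand-in server in a thread and query it once."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "runner.sock")  # hypothetical socket path
        ready, stop = threading.Event(), threading.Event()
        server = threading.Thread(
            target=serve_status, args=(path, lambda: "RUNNING", ready, stop)
        )
        server.start()
        ready.wait(timeout=5)
        try:
            return query_status(path)
        finally:
            stop.set()
            server.join()
```

The point of the design is that the query is a short, repeatable request: if
the ssh connection drops, nothing is lost, and the job can simply reconnect
and ask again instead of assuming the test round has finished.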

Moving this bug to CI infra.


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
