Re: Regression: drm: Lobotomize set_busid nonsense for !pci drivers (a325725633c2)

Hi,

On 30-09-16 05:09, Laszlo Ersek wrote:
Hello Daniel,

On 06/21/16 14:08, daniel.vetter at ffwll.ch (Daniel Vetter) wrote:
We already have a fallback in place to fill out the unique from
dev->unique, which is set to something reasonable in drm_dev_alloc.

Which means we only need to have a special set_busid for pci devices,
to be able to carry the backwards compat code for drm 1.1 around, which
libdrm still needs.

While developing and testing this patch things blew up in really
interesting ways, and the code is rather confusing in naming things
between the kernel code, ioctl #defines and libdrm. For the next brave
dragon slayer, document all this madness properly in the userspace
interface section of gpu.tmpl.

v2: Make drm_dev_set_unique static and update kerneldoc.

v3: Entire rewrite, plus document what's going on for posterity in the
gpu docbook uapi section.

v4: Drop accidental amdgpu hunk (Emil).

v5: Drop accidental omapdrm vblank counter change (Emil).

Cc: Gustavo Padovan <gustavo.padovan at collabora.co.uk>
Cc: Emil Velikov <emil.l.velikov at gmail.com>
Tested-by: Gustavo Padovan <gustavo.padovan at collabora.co.uk> (virt_gpu)
Reviewed-by: Emil Velikov <emil.l.velikov at gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
---
 Documentation/DocBook/gpu.tmpl                  |  4 ++
 drivers/gpu/drm/armada/armada_drv.c             |  1 -
 drivers/gpu/drm/drm_ioctl.c                     | 58 +++++++++++++++++++++++++
 drivers/gpu/drm/drm_platform.c                  | 18 --------
 drivers/gpu/drm/etnaviv/etnaviv_drv.c           |  1 -
 drivers/gpu/drm/exynos/exynos_drm_drv.c         |  1 -
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  1 -
 drivers/gpu/drm/imx/imx-drm-core.c              |  1 -
 drivers/gpu/drm/msm/msm_drv.c                   |  1 -
 drivers/gpu/drm/nouveau/nouveau_drm.c           |  1 -
 drivers/gpu/drm/omapdrm/omap_drv.c              |  1 -
 drivers/gpu/drm/shmobile/shmob_drm_drv.c        |  1 -
 drivers/gpu/drm/tilcdc/tilcdc_drv.c             |  1 -
 drivers/gpu/drm/virtio/virtgpu_drm_bus.c        | 10 -----
 drivers/gpu/drm/virtio/virtgpu_drv.c            |  1 -
 drivers/gpu/drm/virtio/virtgpu_drv.h            |  1 -
 include/drm/drmP.h                              |  1 -
 17 files changed, 62 insertions(+), 41 deletions(-)

This patch (commit a325725633c2) regresses X.org on QEMU's virtio-vga
device. Please see

  https://bugzilla.redhat.com/show_bug.cgi?id=1366842

complete with a bisection log under

  drivers/gpu/drm/virtio/

(comment 20).

Copying Thorsten so he can include this report in his next v4.8-rc8
regression report, if he chooses so. (Commit a325725633c2 is part of
v4.8-rc1, but we only managed to identify it now.) The last such report
I know of is archived e.g. at
<http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1239220.html>.

Reported-by: Joachim Frieben <jfrieben@xxxxxxxxxxx>

First of all, Joachim, thanks for bisecting this. I was thinking about this
bug while doing my laps in the swimming pool.

I wanted to add a comment to the bug to tell you that this is likely
an Xorg xserver issue and not a kernel issue, and that there is no need
to bisect, but it is too late for that now.

When running without an xorg.conf, Xorg searches for what it considers
the "primary" gpu / video-card; basically it attempts to bring up the
right card in setups where there are multiple cards. If it does not
find one, it exits with an error.

The xserver has a 2 step process for finding the primary card:

1) It searches for a card which has a vga-bios mapped. As we've already
determined in the mentioned Red Hat bug, that works for the classic qemu
emulated video-cards, but not for qemu's virtio-vga.

2) If that does not work, Xorg will fall back to any video class device
on pci-bus 1.
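
To make the two steps concrete, here is a rough sketch in Python (not the
xserver's actual C code). The VideoDevice record, its field names and
find_primary() are made up for illustration; only the order of the two
checks is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PCI_BASE_CLASS_DISPLAY = 0x03  # PCI base class code for display controllers


@dataclass(frozen=True)
class VideoDevice:
    """Hypothetical device record; this is not an xserver structure."""
    name: str
    bus: int               # PCI bus number
    base_class: int        # PCI base class code
    vga_bios_mapped: bool  # stand-in for "has a vga-bios mapped"


def find_primary(devices: Sequence[VideoDevice]) -> Optional[VideoDevice]:
    """Return the primary card, or None if neither step finds one."""
    # Step 1: prefer a card which has a vga-bios mapped.
    for dev in devices:
        if dev.vga_bios_mapped:
            return dev
    # Step 2: fall back to any video class device on pci-bus 1.
    for dev in devices:
        if dev.base_class == PCI_BASE_CLASS_DISPLAY and dev.bus == 1:
            return dev
    # Neither step matched: the case in which Xorg exits with an error.
    return None
```

A card that has no vga-bios mapped and does not sit on bus 1 falls through
both loops, so find_primary() returns None for it.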

This fallback has actually been broken in the Xorg xserver for quite a
while now, and only 2 days ago a patch from Laszlo was merged to fix this.

Only for things to break again due to this kernel patch.

Since the whole step 2) thingie is very much tied to x86 machines
where pci-bus 0 used to be the main bus and pci-bus 1 the agp,
which is sorta an obsolete assumption nowadays, and since relying
on bus numbers / enumeration order is a bad idea in general, I'm not
entirely sure if this counts as a regression.

I've discussed the problem of the xserver exiting with an error when
no primary device can be found with some people (ajax) at XDC last week
since there are other use-cases where the pci-bus 1 fallback does not
work.

As such I've been working on an xserver patch-set to make the xserver
try harder (pick the first available device) when both steps described
above fail to find one, which should make things work even with the
newest (broken / regressed) kernels.
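
The "try harder" idea boils down to something like the sketch below. It is
in Python and only illustrates the intended behaviour, it is not the
patch-set itself; pick_device() and its arguments are hypothetical names.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def pick_device(primary: Optional[T], available: Sequence[T]) -> Optional[T]:
    """Choose a device even when primary detection came up empty.

    `primary` is whatever the existing two-step detection returned
    (None if it failed); `available` is every device that was found,
    in enumeration order.
    """
    if primary is not None:
        return primary
    # Last resort: pick the first available device instead of giving up.
    if available:
        return available[0]
    # No devices at all: there is still nothing to bring up.
    return None
```

With this, a failed primary detection only leads to an error when there is
no device at all.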

Given this mail thread, I guess I'm working after all today (I had
planned a day off) and I'll try to wrap up this patch-set and reply
to this mail with the server patches attached for Joachim and/or
Laszlo to test.

Regards,

Hans



p.s.

It would be interesting to do a lspci on both a working and a
non-working kernel to see what exactly is going on here.
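
If it helps, the two lspci captures could be compared with a small script
along these lines. This is a sketch, assuming the default one-line-per-device
lspci output ("bus:slot.func description", optionally with a leading domain
as printed by "lspci -D"); parse_lspci() and diff_lspci() are made-up helper
names.

```python
import re
from typing import Dict, Optional, Tuple

# Matches "00:02.0 VGA compatible controller: ..." and the "lspci -D"
# form with a leading domain, "0000:00:02.0 ...".
_LINE = re.compile(
    r"^(?:[0-9a-f]{4}:)?"
    r"(?P<bus>[0-9a-f]{2}):(?P<slot>[0-9a-f]{2})\.(?P<func>[0-7])"
    r"\s+(?P<desc>.*)$"
)


def parse_lspci(text: str) -> Dict[str, str]:
    """Map "bus:slot.func" to the description on each lspci line."""
    devices: Dict[str, str] = {}
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            addr = f"{m['bus']}:{m['slot']}.{m['func']}"
            devices[addr] = m["desc"]
    return devices


def diff_lspci(
    working: str, broken: str
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the addresses whose entry differs between the two captures.

    Each value is (description on working kernel, description on
    non-working kernel); None means the address is absent on that side.
    """
    a, b = parse_lspci(working), parse_lspci(broken)
    return {
        addr: (a.get(addr), b.get(addr))
        for addr in sorted(set(a) | set(b))
        if a.get(addr) != b.get(addr)
    }
```

Feed it the saved output of lspci from each kernel; an empty result means
both kernels enumerate the same devices at the same addresses.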