Re: [PATCH] drm/mgag200: Increase bandwidth for G200se A rev1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi

Am 01.08.23 um 16:24 schrieb Jocelyn Falempe:
On 01/08/2023 12:25, Thomas Zimmermann wrote:
Hi

Am 01.08.23 um 12:11 schrieb Jocelyn Falempe:
On 28/07/2023 14:12, Roger Sewell wrote:

Thomas, Jocelyn,

JF> I think the culprit is probably this patch:
JF> https://patchwork.freedesktop.org/patch/486242/
JF>
JF> before this patch,
JF> mgag200_simple_display_pipe_mode_valid() always return MODE_OK
JF>
JF> after this patch, it checks the bandwidth limit too.

It turns out to be more complicated than that - and I still think it is
something to do with how the two functions
mgag200_simple_display_pipe_mode_valid and
mgag200_mode_config_mode_valid are made known to the rest of the drm
system, i.e. which slots in the various function structures they are put
in.

I attach a contiguous excerpt from /var/log/messages, recording what
happened when I did the following.

1. I instrumented the old mgag200 module with printk statements in
    mgag200_simple_display_pipe_mode_valid and mga_vga_mode_valid and
    mga_vga_calculate_mode_bandwidth, and also put in a call to the
    latter in mgag200_simple_display_pipe_mode_valid so that I could see     what parameters it had been called with. I then rebooted the system,
    getting the messages starting at Jul 28 10:42:45 . As you will see,
    almost every time mgag200_simple_display_pipe_mode_valid is called it     is immediately following a return of MODE_OK from mga_vga_mode_valid
    with the same display parameters - the two exceptions are:

    a) at line 1156 is when it is called after "fbcon: mgag200drmfb (fb0)
       is primary device", and

    b) with the mode actually being set (1440x900) at line 2681 when it
       of course returns MODE_OK (as that is what it always returns, as
       you say).

2. I then switched to the new mgag200 module similarly instrumented, but     with the unique_rev_id increased by 1 to get sufficient bandwidth to
    make 1440x900 usable. I then rebooted the system, getting the
    messages starting at Jul 28 11:46:53 . Again, almost every time
    mgag200_simple_display_pipe_mode_valid is called it is immediately
    after a return of MODE_OK from mgag200_mode_config_mode_valid, and we
    still have exception type (a) at line 5672. However, the exception
    type (b) is no longer present, as at line 6590, on setting the
    1440x900 mode, there is now a call of mgag200_mode_config_mode_valid
    preceding the call of mgag200_simple_display_pipe_mode_valid.

3. I then modified that mgag200 module by forcing a return of MODE_OK
    from mgag200_simple_display_pipe_mode_valid and removing the
    increment to unique_rev_id, so that 1440x900 is no longer "valid"
    according to the criteria being used. I rebooted, getting the
    messages starting at Jul 28 12:12:34 . Now at line 11004 we have a
    failure to set mode immediately followed by a return of MODE_BAD, not
    from mgag200_simple_display_pipe_mode_valid but from
    mgag200_mode_config_mode_valid.

Thus between the old mgag200 module and the new one, there is a change
in when the mode_config_mode_valid function gets called - there being
one extra call. So even if one were to revert to
mgag200_simple_display_pipe_mode_valid just always returning MODE_OK it
wouldn't fix the problem.

At present I don't see how the change of behaviour can be anything other
than to do with the way these function names are passed to the rest of
the drm system (though of course even if that were reverted, the fact
that mgag200_simple_display_pipe_mode_valid now tests bandwidth would
still cause what I want to do to fail).

Sadly I don't understand how the drm system works, so I'm not sure that
I can shed any more light - but if there are any more experiments that
would help, please do let me know.

I think the issue is that in v5.18, the memory check was done on the connector mode_valid() callback, and in v6.0, it's done in the mode_config mode_valid() callback.

Can you please try the patch attached, I moved the bandwidth check back to the connector callback.

It should not make difference. I'd be surprised if it does. And in any case, the bandwidth check belongs in to the mode_config test, as it is a device-wide limit.

It does, and it goes back to the v5.18 behavior, where the "out of spec" resolutions are not proposed, but you can still force them from userspace. Also this claims to be a "bandwidth" limit, but all mgag200 are using the PCI bus for the memory clock, so have the same bandwidth. The limit is on the pixel clock, which is used only for the VGA DAC, so it's probably fine to attach this limit to the VGA connector. Of course mgag200 only have VGA connector, so it doesn't matter in practice.


FYI I intend to close this bug report as INVALID. The hardware and driver work according to the known specs, so there's no bug here.

I still think it's a kernel regression, because a userspace configuration that used to work is now broken.

No it's not. The kernel didn't validate the given state correctly. Hence, the user's hardware accidentally got overclocked, which the kernel driver is really expected to prevent. That was the kernel bug that has been fixed a while ago.

If anything, the kernel driver could support several modes only for 16-bit colors (iff userspace can handle that).

Best regards
Thomas


Best regards,


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux