Hi Am 01.08.23 um 16:24 schrieb Jocelyn Falempe:
On 01/08/2023 12:25, Thomas Zimmermann wrote:It does, and it goes back to the v5.18 behavior, where the "out of spec" resolutions are not proposed, but you can still force them from userspace. Also this claims to be a "bandwidth" limit, but all mgag200 are using the PCI bus for the memory clock, so have the same bandwidth. The limit is on the pixel clock, which is used only for the VGA DAC, so it's probably fine to attach this limit to the VGA connector. Of course mgag200 only have VGA connector, so it doesn't matter in practice.Hi Am 01.08.23 um 12:11 schrieb Jocelyn Falempe:On 28/07/2023 14:12, Roger Sewell wrote:Thomas, Jocelyn, JF> I think the culprit is probably this patch: JF> https://patchwork.freedesktop.org/patch/486242/ JF> JF> before this patch, JF> mgag200_simple_display_pipe_mode_valid() always return MODE_OK JF> JF> after this patch, it checks the bandwidth limit too. It turns out to be more complicated than that - and I still think it is something to do with how the two functions mgag200_simple_display_pipe_mode_valid and mgag200_mode_config_mode_valid are made known to the rest of the drmsystem, i.e. which slots in the various function structures they are putin. I attach a contiguous excerpt from /var/log/messages, recording what happened when I did the following. 1. I instrumented the old mgag200 module with printk statements in mgag200_simple_display_pipe_mode_valid and mga_vga_mode_valid and mga_vga_calculate_mode_bandwidth, and also put in a call to thelatter in mgag200_simple_display_pipe_mode_valid so that I could see what parameters it had been called with. I then rebooted the system,getting the messages starting at Jul 28 10:42:45 . As you will see,almost every time mgag200_simple_display_pipe_mode_valid is called it is immediately following a return of MODE_OK from mga_vga_mode_validwith the same display parameters - the two exceptions are:a) at line 1156 is when it is called after "fbcon: mgag200drmfb (fb0)is primary device", and b) with the mode actually being set (1440x900) at line 2681 when it of course returns MODE_OK (as that is what it always returns, as you say).2. I then switched to the new mgag200 module similarly instrumented, but with the unique_rev_id increased by 1 to get sufficient bandwidth tomake 1440x900 usable. I then rebooted the system, getting the messages starting at Jul 28 11:46:53 . Again, almost every time mgag200_simple_display_pipe_mode_valid is called it is immediatelyafter a return of MODE_OK from mgag200_mode_config_mode_valid, and westill have exception type (a) at line 5672. However, the exception type (b) is no longer present, as at line 6590, on setting the1440x900 mode, there is now a call of mgag200_mode_config_mode_validpreceding the call of mgag200_simple_display_pipe_mode_valid. 3. I then modified that mgag200 module by forcing a return of MODE_OK from mgag200_simple_display_pipe_mode_valid and removing the increment to unique_rev_id, so that 1440x900 is no longer "valid" according to the criteria being used. I rebooted, getting the messages starting at Jul 28 12:12:34 . Now at line 11004 we have afailure to set mode immediately followed by a return of MODE_BAD, notfrom mgag200_simple_display_pipe_mode_valid but from mgag200_mode_config_mode_valid. Thus between the old mgag200 module and the new one, there is a change in when the mode_config_mode_valid function gets called - there being one extra call. So even if one were to revert to mgag200_simple_display_pipe_mode_valid just always returning MODE_OK it wouldn't fix the problem.At present I don't see how the change of behaviour can be anything otherthan to do with the way these function names are passed to the rest of the drm system (though of course even if that were reverted, the fact that mgag200_simple_display_pipe_mode_valid now tests bandwidth would still cause what I want to do to fail). Sadly I don't understand how the drm system works, so I'm not sure that I can shed any more light - but if there are any more experiments that would help, please do let me know.I think the issue is that in v5.18, the memory check was done on the connector mode_valid() callback, and in v6.0, it's done in the mode_config mode_valid() callback.Can you please try the patch attached, I moved the bandwidth check back to the connector callback.It should not make difference. I'd be surprised if it does. And in any case, the bandwidth check belongs in to the mode_config test, as it is a device-wide limit.FYI I intend to close this bug report as INVALID. The hardware and driver work according to the known specs, so there's no bug here.I still think it's a kernel regression, because a userspace configuration that used to work is now broken.
No it's not. The kernel didn't validate the given state correctly. Hence, the user's hardware accidentally got overclocked, which the kernel driver is really expected to prevent. That was the kernel bug that has been fixed a while ago.
If anything, the kernel driver could support several modes only for 16-bit colors (iff userspace can handle that).
Best regards Thomas
Best regards,
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Attachment:
OpenPGP_signature
Description: OpenPGP digital signature