Hi

On 26.06.22 at 20:54, Linus Torvalds wrote:
> So this has been going on for a while, and it's quite annoying.
>
> At bootup, my main desktop (Threadripper 3970X with radeon graphics)
> now complains about
>
>   resource sanity check: requesting [mem 0xd0000000-0xdfffffff], which
>   spans more than BOOTFB [mem 0xd0000000-0xd02fffff]
>
> and then gives me a nasty callchain that is basically the amdgpu probe
> sequence ending in amdgpu_bo_init() doing the
> arch_io_reserve_memtype_wc() which is then what complains.
>
> That "BOOTFB" resource is from sysfb_simplefb.c, and I think what
> started triggering this is commit c96898342c38 ("drivers/firmware:
> Don't mark as busy the simple-framebuffer IO resource").
>
> Because it turns out that that removed the IORESOURCE_BUSY, which in
> turn is what makes the resource conflict code complain about it now,
> because
>
>         /*
>          * if a resource is "BUSY", it's not a hardware resource
>          * but a driver mapping of such a resource; we don't want
>          * to warn for those; some drivers legitimately map only
>          * partial hardware resources. (example: vesafb)
>          */
>
> so the issue is that now the resource code - correctly - says "hey,
> you have *two* conflicting driver mappings".
>
> And that commit claims it did it because "which can lead to drivers
> requesting the same memory resource to fail", but - once again - the
> link in the commit that might actually tell more is just one of those
> useless patch submission links again.
>
> So who knows why that commit was actually done, but it's causing
> annoyance.
>
> If simplefb is actually still using that frame buffer, it's a problem.
> If it isn't, then maybe that resource should have been released?
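For readers following along, the check that produces the quoted warning can be modelled in userspace. The following is a minimal sketch and not the kernel's code: struct fake_resource, FAKE_BUSY and sanity_check() are hypothetical stand-ins, and it compares byte addresses where the kernel, as far as I recall, compares page frames.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's IORESOURCE_BUSY flag. */
#define FAKE_BUSY 0x1u

/* Hypothetical stand-in for the kernel's struct resource. */
struct fake_resource {
	const char *name;
	uint64_t start;		/* first byte, inclusive */
	uint64_t end;		/* last byte, inclusive */
	unsigned int flags;
};

/*
 * Return the first resource that the request [addr, addr + size - 1]
 * would trigger a warning for, or NULL if the request looks sane.
 */
static const struct fake_resource *
sanity_check(const struct fake_resource *res, size_t n,
	     uint64_t addr, uint64_t size)
{
	assert(size > 0);
	uint64_t last = addr + size - 1;

	for (size_t i = 0; i < n; i++) {
		const struct fake_resource *p = &res[i];

		if (p->start > last || p->end < addr)
			continue;	/* no overlap at all */
		if (p->start <= addr && p->end >= last)
			continue;	/* request fits entirely inside p */
		if (p->flags & FAKE_BUSY)
			continue;	/* driver mapping, partial maps tolerated */
		return p;		/* request spans more than a non-busy resource */
	}
	return NULL;
}
```

With BOOTFB registered as [0xd0000000-0xd02fffff] and no busy flag, a request for [0xd0000000-0xdfffffff] returns the BOOTFB entry. Setting the busy flag on the same entry makes the check pass silently, which matches the behaviour before the commit in question.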
As Javier said, that resource is the framebuffer that's set up by the firmware. It should be gone after the call to drm_aperture_remove_conflicting_pci_framebuffers(). [1] The call to amdgpu_bo_init() runs afterwards, so that removal apparently failed.
Is the BOOTFB entry still listed in /proc/iomem after the system finished booting?
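If it is easier than eyeballing the file, the lookup can be scripted. This is only a sketch: iomem_has_entry() is a hypothetical helper name, and it does nothing more than a substring match on each line of an already opened stream.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if any line read from 'f' contains 'name', else 0.
 * Lines longer than the buffer are read in pieces, which is good
 * enough for the short lines found in /proc/iomem.
 */
static int iomem_has_entry(FILE *f, const char *name)
{
	char line[256];

	assert(f != NULL && name != NULL);

	while (fgets(line, sizeof(line), f) != NULL) {
		if (strstr(line, name) != NULL)
			return 1;
	}
	return 0;
}
```

To use it, open /proc/iomem with fopen("/proc/iomem", "r"), pass the stream together with "BOOTFB", and close the stream afterwards.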
Attached is a (totally untested) patch to manually point amdgpu to the right location. Does it fix the problem?
Best regards
Thomas

[1] https://elixir.bootlin.com/linux/v5.18.7/source/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L2077
> I really think that commit c96898342c38 is buggy. It talks about "let
> drivers to request it as busy instead", but then it registers a
> resource that isn't actually a proper real resource. It's just a
> random incomplete chunk of the actual real thing, so it will still
> interfere with resource allocation, and in fact now interferes even
> with that "set memtype" because of this valid warning.
>
>                Linus
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Ivo Totev
From c37f0fa8e763c471ddaccc08da9ec9bb1a715451 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann@xxxxxxx>
Date: Mon, 27 Jun 2022 10:51:44 +0200
Subject: [PATCH] drm/amdgpu: Remove firmware framebuffer without PCI helper

The DRM aperture helper for PCI devices fails to remove the firmware
framebuffer's device. Manually tell it where to look.

Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 46ef57b07c15..e00318ff66ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2073,7 +2073,8 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 	is_fw_fb = amdgpu_is_fw_framebuffer(base, size);
 
 	/* Get rid of things like offb */
-	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver);
+	ret = drm_aperture_remove_conflicting_framebuffers(base, size, is_fw_fb,
+							   &amdgpu_kms_driver);
 	if (ret)
 		return ret;
 
--
2.36.1
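
The patch hands the aperture range to the helper as an explicit base and size. As I understand it, the helper then has to decide which firmware framebuffer devices fall into that range. A minimal userspace sketch of such a range test follows; ranges_overlap() is a hypothetical name, not the helper's internal function, and the values in the usage note are simply the two ranges from the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [base1, base1 + size1) and [base2, base2 + size2)
 * share at least one byte. Assumes neither range wraps around.
 */
static bool ranges_overlap(uint64_t base1, uint64_t size1,
			   uint64_t base2, uint64_t size2)
{
	assert(size1 > 0 && size2 > 0);
	return base1 < base2 + size2 && base2 < base1 + size1;
}
```

For example, the requested range (base 0xd0000000, size 0x10000000) overlaps BOOTFB (base 0xd0000000, size 0x300000), so a firmware device registered for BOOTFB would be a candidate for removal under this test.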