On Thu, 16 May 2024 at 06:29, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 15 May 2024 at 13:24, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > I have to revert both
> >
> >   a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
> >   e362b7c8f8c7 ("drm/amdgpu: Modify the contiguous flags behaviour")
> >
> > to make things build cleanly. Next step: see if it boots and fixes the
> > problem for me.
>
> Well, perhaps not surprisingly, the WARN_ON() no longer triggers with
> this, and everything looks fine.
>
> Let's see if the machine ends up being stable now. It took several
> hours for the "scary messages" state to turn into the "hung machine"
> state, so they *could* have been independent issues, but it seems a
> bit unlikely.

I think that should be fine to do for now.

I think it is also fine to do like I've attached, but I'm not sure if
I'd take that chance.

Two questions for Arunpravin (and Alex):

Is this fix correct, and can we get a good explanation of it?

Where did this error sneak in? Is the problem in the amdgpu tree, or
was it a drm-next only problem? If so perhaps we need to discuss moving
amdgpu more into drm-tip to catch this sort of problem.

Dave.

From 085b89278f296c40e86f5d1e1bcc1017c39f4002 Mon Sep 17 00:00:00 2001
From: Dave Airlie <airlied@xxxxxxxxxx>
Date: Thu, 16 May 2024 09:46:37 +1000
Subject: [PATCH] drm/buddy: convert WARN_ON to an if + continue

This WARN_ON triggers a lot, but I don't think the __force_merge
path always has to succeed, so just return a failure here instead
of warn on to let other paths handle the allocation.

(Not 100% sure on this patch - airlied).
---
 drivers/gpu/drm/drm_buddy.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 284ebae71cc4..6b90ec6eefa8 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -195,8 +195,9 @@ static int __force_merge(struct drm_buddy *mm,
 			if (!drm_buddy_block_is_free(buddy))
 				continue;
 
-			WARN_ON(drm_buddy_block_is_clear(block) ==
-				drm_buddy_block_is_clear(buddy));
+			if (drm_buddy_block_is_clear(block) !=
+			    drm_buddy_block_is_clear(buddy))
+				continue;
 
 			/*
 			 * If the prev block is same as buddy, don't access the
-- 
2.44.0
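
For readers following the control flow, here is a minimal userspace sketch of what the added lines do: a candidate pair is skipped, rather than warned about, when its clear state does not match. The names toy_block, toy_pair and toy_count_mergeable are hypothetical and exist only for this illustration; this is not the kernel's struct drm_buddy_block or the real __force_merge(), and it models only the added check, not the WARN_ON it replaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buddy block; not the kernel structure. */
struct toy_block {
	bool free;	/* block is currently free */
	bool clear;	/* block contents are known to be cleared */
};

/* Hypothetical (block, buddy) candidate pair considered for merging. */
struct toy_pair {
	struct toy_block block;
	struct toy_block buddy;
};

/*
 * Walk the candidate pairs with the same two early exits as the patched
 * hunk: skip a pair whose buddy is not free, skip a pair whose clear
 * state differs. Returns how many pairs survive both checks.
 */
static size_t toy_count_mergeable(const struct toy_pair *pairs, size_t n)
{
	size_t merged = 0;

	for (size_t i = 0; i < n; i++) {
		const struct toy_pair *p = &pairs[i];

		if (!p->buddy.free)
			continue;

		/* Same shape as the added check: a mismatch moves on quietly. */
		if (p->block.clear != p->buddy.clear)
			continue;

		merged++;
	}

	return merged;
}
```

In this toy model a mismatch only reduces the number of merges; whether the caller can then still satisfy the allocation by another path is exactly the open question in the commit message.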