[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Thomas Zimmermann <tzimmermann@xxxxxxx>
> Sent: Wednesday, June 12, 2024 9:26 AM
> To: Linux regressions mailing list <regressions@xxxxxxxxxxxxxxx>
> Cc: Petkov, Borislav <Borislav.Petkov@xxxxxxx>;
> zack.rusin@xxxxxxxxxxxx; dmitry.osipenko@xxxxxxxxxxxxx; Kaplan, David
> <David.Kaplan@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>;
> Dave Airlie <airlied@xxxxxxxxxx>; Maarten Lankhorst
> <maarten.lankhorst@xxxxxxxxxxxxxxx>; Maxime Ripard
> <mripard@xxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>; ML dri-devel
> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; spice-devel@xxxxxxxxxxxxxxxxxxxxx;
> virtualization@xxxxxxxxxxxxxxx
> Subject: Re: [REGRESSION] QXL display malfunction
>
> Hi
>
> On 12.06.24 at 14:41, Linux regression tracking (Thorsten Leemhuis) wrote:
> > [CCing a few more people and lists that get_maintainers pointed out
> > for qxl]
> >
> > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> > for once, to make this easily accessible to everyone.
> >
> > Thomas, from here it looks like this report that apparently is caused
> > by a change of yours that went into 6.10-rc1 (b33651a5c98dbd
> > ("drm/qxl: Do not pin buffer objects for vmap")) fell through the
> > cracks. Or was progress made to resolve this and I just missed this?
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker'
> > hat)
> > --
> > Everything you wanna know about Linux kernel regression tracking:
> > https://linux-regtracking.leemhuis.info/about/#tldr
> > If I did something stupid, please tell me, as explained on that page.
> >
> > #regzbot poke
> >
> >
> > On 03.06.24 04:29, Kaplan, David wrote:
> >>> -----Original Message-----
> >>> From: Kaplan, David
> >>> Sent: Sunday, June 2, 2024 9:25 PM
> >>> To: tzimmermann@xxxxxxx; dmitry.osipenko@xxxxxxxxxxxxx; Koenig,
> >>> Christian <Christian.Koenig@xxxxxxx>; zach.rusin@xxxxxxxxxxxx
> >>> Cc: Petkov, Borislav <Borislav.Petkov@xxxxxxx>;
> >>> regressions@xxxxxxxxxxxxxx
> >>> Subject: [REGRESSION] QXL display malfunction
> >>>
> >>> Hi,
> >>>
> >>> I am running an Ubuntu 19.10 VM with a tip kernel using QXL video
> >>> and I've observed the VM graphics often malfunction after boot,
> >>> sometimes failing to load the Ubuntu desktop or even immediately
> >>> shutting the guest down.
> >>> When it does load, the guest dmesg log often contains errors like
> >>>
> >>> [ 4.303586] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
> >>> [ 4.586883] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
> >>> [ 4.904036] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65335296x16777216+0+0
>
> I don't see how these messages are related. Did they already appear before
> the broken commit was there?

No, I did not observe them prior to the broken commit.

>
> >>> [ 5.374347] [drm:qxl_release_from_id_locked] *ERROR* failed to find id in release_idr
>
> Is there only one such message in the log? Or multiple/frequent ones.

I would usually only see one.

>
> Could you provide a stack trace of what happens before?

Here's the top of a backtrace when the error occurs:

#0  qxl_release_from_id_locked (qdev=qdev@entry=0xffff88810126e000, id=id@entry=262151) at drivers/gpu/drm/qxl/qxl_release.c:373
#1  0xffffffff819f5b6a in qxl_garbage_collect (qdev=0xffff88810126e000) at drivers/gpu/drm/qxl/qxl_cmd.c:222
#2  0xffffffff810e3aa8 in process_one_work (worker=worker@entry=0xffff888101680300, work=0xffff88810126f340) at kernel/workqueue.c:3231
#3  0xffffffff810e6281 in process_scheduled_works (worker=<optimized out>) at kernel/workqueue.c:3312
#4  worker_thread (__worker=0xffff888101680300) at kernel/workqueue.c:3393

>
> We sometimes draw into the buffer object from the CPU. For accessing the
> buffer object's pages from the CPU, only a vmap operation should be
> necessary. It appears as if qxl also requires a pin. My guess is that the pin
> inserts the buffer-object's host-side pages and the code around
> qxl_release_from_id_locked() appears to be garbage-collecting them.
> Hence without the pin, the GC complains about inconsistent state.
>
> >>>
> >>> I bisected the issue down to "drm/qxl: Do not pin buffer objects for vmap"
> >>> (b33651a5c98dbd5a919219d8c129d0674ef74299).
>
> Thanks for bisecting. Does it work if you revert that commit?

Yes

Thanks
--David Kaplan