On 08.09.2018 at 12:40, Tom St Denis wrote:
>
>
> On 09/08/2018 05:23 AM, Huang Rui wrote:
>> On Fri, Sep 07, 2018 at 04:59:11PM +0800, Christian König wrote:
>>> Hi Ray,
>>>
>>> in the meantime can we disable the feature once more in the kernel
>>> until we have hammered out all possible corner cases?
>>
>> That's fine. For now, we have to disable it again. I will do more
>> testing and reproduce Tom's issue first.
>>
>>>
>>> As Tom figured out, commenting out setting "bulk_moveable" to true
>>> should be enough.
>>
>> I saw you already removed the "bulk_moveable = true" in
>> amdgpu_vm_init(). Do you mean we should also comment out the one in
>> amdgpu_vm_move_to_lru_tail() to disable bulk_move totally for the
>> moment?
>
> Hi Ray,
>
> I just commented out the assignment of true.

Yeah, I think if we haven't figured out what is going wrong here by
Monday we need to do this to prevent further bug reports.

Christian.

>
> Tom
>
>
>>
>> Thanks,
>> Ray
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>> On 07.09.2018 at 08:51, Huang, Ray wrote:
>>>> Hi Tom,
>>>>
>>>> Thanks for tracing this issue. I am trying to reproduce it on
>>>> amd-staging-drm-next with piglit.
>>>> May I know the steps/configurations to repro it?
>>>>
>>>> Thanks,
>>>> Ray
>>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>>> Tom St Denis
>>>> Sent: Wednesday, September 5, 2018 9:27 PM
>>>> To: Koenig, Christian <Christian.Koenig at amd.com>; Daenzer, Michel
>>>> <Michel.Daenzer at amd.com>; amd-gfx at lists.freedesktop.org; Deucher,
>>>> Alexander <Alexander.Deucher at amd.com>
>>>> Subject: Re: two KASANs in TTM logic
>>>>
>>>> Logs attached.
>>>>
>>>> Tom
>>>>
>>>>
>>>>
>>>> On 09/05/2018 08:02 AM, Christian König wrote:
>>>>> Still not the slightest idea what is causing this and the patch
>>>>> definitely fixes things a lot.
>>>>>
>>>>> Can you try to enable list debugging in your kernel?
>>>>>
>>>>> Thanks,
>>>>> Christian.
>>>>>
>>>>> On 04.09.2018 at 19:18, Tom St Denis wrote:
>>>>>> Sure:
>>>>>>
>>>>>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>>>>>> commit d2917f399e0b250f47d07da551a335843a24f835
>>>>>> Author: Christian König <christian.koenig at amd.com>
>>>>>> Date:   Thu Aug 30 10:04:53 2018 +0200
>>>>>>
>>>>>>     drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>>>>>
>>>>>>     First step to fix the LRU corruption, we accidentially tried to
>>>>>>     move things on the LRU after dropping the lock.
>>>>>>
>>>>>>     Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>     Tested-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>>>
>>>>>> :040000 040000 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M drivers
>>>>>>
>>>>>>
>>>>>> The commit before that I can run xonotic-glx and piglit on my
>>>>>> Carrizo without a KASAN.
>>>>>>
>>>>>> Tom
>>>>>>
>>>>>> On 09/04/2018 10:05 AM, Christian König wrote:
>>>>>>> The first one should already be fixed.
>>>>>>>
>>>>>>> Not sure where the second comes from. Can you narrow that down
>>>>>>> further?
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>> On 04.09.2018 at 15:46, Tom St Denis wrote:
>>>>>>>> First is caused by this commit while running a GL heavy
>>>>>>>> application.
>>>>>>>>
>>>>>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>>>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>>>>>> Author: Michel Dänzer <michel.daenzer at amd.com>
>>>>>>>> Date:   Wed Aug 29 11:59:38 2018 +0200
>>>>>>>>
>>>>>>>>     Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>>>>>
>>>>>>>>     This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>>>>>
>>>>>>>>     It triggered various badness on my development machine when
>>>>>>>>     running the piglit gpu profile with radeonsi on Bonaire, looks
>>>>>>>>     like memory corruption due to insufficiently protected list
>>>>>>>>     manipulations.
>>>>>>>>
>>>>>>>>     Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>>>>>     Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>>>>
>>>>>>>> :040000 040000 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M drivers
>>>>>>>>
>>>>>>>> The second is caused by something between that and the tip of the
>>>>>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>>>>>> loading GNOME.
>>>>>>>>
>>>>>>>> Tom
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> amd-gfx mailing list
>>>>>>>> amd-gfx at lists.freedesktop.org
>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx