RE: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

[Public]

Hi,

After spending quite some time investigating this issue, I found that the freeze is not caused by the amdgpu part of the buddy allocator patch: that patch doesn’t throw any issues when applied on its own on top of the stable drm-next base. After digging further, the patch below seems to be the cause of the problem:

drm/ttm: rework bulk move handling v5
https://cgit.freedesktop.org/drm/drm/commit/?id=fee2ede155423b0f7a559050a39750b98fe9db69

When this patch is applied on top of the stable (working) version of drm-next, without the buddy allocator patch, we see the issues listed below, each appearing at random across GravityMark runs:

1. General protection fault in ttm_lru_bulk_move_tail()
2. NULL pointer dereference in ttm_lru_bulk_move_tail()
3. NULL pointer dereference in ttm_resource_init()

Regards,
Arun.
-----Original Message-----
From: Alex Deucher <alexdeucher@xxxxxxxxx> 
Sent: Monday, May 16, 2022 8:36 PM
To: Mike Lothian <mike@xxxxxxxxxxxxxx>
Cc: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>; Intel Graphics Development <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>; amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Maling list - DRI developers <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Matthew Auld <matthew.auld@xxxxxxxxx>
Subject: Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

On Mon, May 16, 2022 at 8:40 AM Mike Lothian <mike@xxxxxxxxxxxxxx> wrote:
>
> Hi
>
> The merge window for 5.19 will probably be opening next week, has 
> there been any progress with this bug?

It took a while to find a combination of GPUs that would repro the issue, but now that we can, it is still being investigated.

Alex

>
> Thanks
>
> Mike
>
> On Mon, 2 May 2022 at 17:31, Mike Lothian <mike@xxxxxxxxxxxxxx> wrote:
> >
> > On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam 
> > <arunpravin.paneerselvam@xxxxxxx> wrote:
> > >
> > >
> > >
> > > On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > > > On Wed, 27 Apr 2022 at 12:55, Mike Lothian <mike@xxxxxxxxxxxxxx> wrote:
> > > >> On Tue, 26 Apr 2022 at 17:36, Christian König <christian.koenig@xxxxxxx> wrote:
> > > >>> Hi Mike,
> > > >>>
> > > >>> sounds like somehow stitching together the SG table for PRIME 
> > > >>> doesn't work any more with this patch.
> > > >>>
> > > >>> Can you try with P2P DMA disabled?
> > > >> -CONFIG_PCI_P2PDMA=y
> > > >> +# CONFIG_PCI_P2PDMA is not set
> > > >>
> > > >> If that's what you're meaning, then there's no difference, I'll 
> > > >> upload my dmesg to the gitlab issue
> > > >>
> > > >>> Apart from that can you take a look Arun?
> > > >>>
> > > >>> Thanks,
> > > >>> Christian.
> > > > Hi
> > > >
> > > > Have you had any success in replicating this?
> > > Hi Mike,
> > > I couldn't replicate on my Raven APU machine. I see you have 2 
> > > cards initialized, one is Renoir and the other is Navy Flounder. 
> > > Could you give some more details, are you running Gravity Mark on 
> > > Renoir and what is your system RAM configuration?
> > > >
> > > > Cheers
> > > >
> > > > Mike
> > >
> > Hi
> >
> > It's a PRIME laptop, it failed on the RENOIR too, it caused a 
> > lockup, but systemd managed to capture it, I'll attach it to the 
> > issue
> >
> > I've got 64GB RAM, the 6800M has 12GB VRAM
> >
> > Cheers
> >
> > Mike



