Re: [PATCH] drm/amdgpu: Register bad page handler for Aldebaran

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 13, 2021 at 10:57 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Thu, May 13, 2021 at 10:32:45AM -0400, Alex Deucher wrote:
> > Right.  The sys admin can query the bad page count and decide when to
> > retire the card.
>
> Yap, although the driver should actively "tell" the sysadmin when some
> critical counts of retired VRAM pages are reached because I doubt all
> admins would go look at those counts on their own.

I think we print something in the log as well when we hit the
threshold.  I need to double check the code.

>
> Btw, you say "admin" - am I to understand that those are some high end
> GPU cards with ECC memory? If consumer grade stuff has this too, then
> the driver should very much warn on such levels on its own because
> normal users won't know what and where to look.
>

Currently it's only available on workstation and datacenter boards.

> Other than that, the big picture sounds good to me.

Thanks!

Alex

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux