Re: infinite loop in crash due to double-NMI on x86_64 system

----- "Lucas Silacci" <Lucas.Silacci@xxxxxxxxxxxx> wrote:

> Hi,
>  
> I've run into an issue where crash will enter an infinite loop while
> decoding exception stacks if those stacks get corrupted.
>  
> We've seen this on four different systems where the hardware generated
> multiple NMIs and the second and subsequent NMIs caused the NMI
> exception stack to be overwritten. When this condition is hit, the
> bottom rsp on the NMI exception stack (which would normally point you
> back to the kernel thread stack or possibly a different exception stack)
> points you back into the middle of the same NMI exception stack. This
> causes crash to infinitely loop when it tries to decode that exception
> stack.
>  
> Clearly, the root cause of the issue is faulty hardware that
> generated multiple NMIs. However, a very small change in crash can
> detect this condition and stop the infinite loop, which lets you get
> to a point in crash where you can actually tell that it was an NMI
> that caused the system to dump.
>  
> The patch is attached to this email. On x86_64, it detects any
> exception stack whose bottom rsp points back into that same stack.
>  
> Please feel free to ask me any questions on this.
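
For readers following the thread without the attachment, the check described above might look roughly like the sketch below. This is not the attached patch and does not use crash's own data structures: the `estack_range` type and the `rsp_points_into_same_stack()` helper are hypothetical names made up for illustration, and the only assumption is that the exception stack's base address and size are known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one exception stack; not a crash type. */
struct estack_range {
    uint64_t base;  /* lowest address of the exception stack */
    uint64_t size;  /* size of the exception stack in bytes */
};

/*
 * Return true when the rsp saved at the bottom of an exception stack
 * lies inside that same stack, i.e. in [base, base + size).
 * Following such an rsp would decode the same stack again, so a
 * backtrace walker could report corruption and stop instead.
 */
static bool rsp_points_into_same_stack(const struct estack_range *es,
                                       uint64_t saved_rsp)
{
    assert(es != NULL);
    /* Subtract only after the >= check so the unsigned math cannot wrap. */
    return saved_rsp >= es->base && saved_rsp - es->base < es->size;
}
```

A walker would call this once per exception stack, just before switching to the stack that the saved rsp names, and break out of its loop when it returns true.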

Wow, that's pretty interesting -- I've certainly never seen that before.
Can you show me what the backtrace looks like with your patch applied?

Thanks,
  Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility

