Re: Crash and automatical reboot when using the NVIDIA card

Panruo Wu wrote:
> David McGiven <davidmcgivenn@...> writes:
>>
>> I'm running a Supermicro server with the latest CentOS 6.4 versions
>> (kernel 2.6.32-358.23.2.el6.x86_64) and the latest nvidia driver (331.20).
>>
>> A few minutes after I start using the GPU for HPC calculations, the
>> server crashes and reboots itself. This happens every time. I know
>> it will reboot, but I don't know when. Sometimes it's 20 minutes after
>> I start using it. Sometimes it's 2 hours.
<snip>
> I have the same problem on all four of my Supermicro machines. I don't
> know why it happens, but the nvidia driver seems to be to blame.
> I'm using CentOS 6.3 and nVidia driver version 304.54 or 319.37.

On our Dell R720s, I'm using the kmod-nvidia package from elrepo. They
don't crash, even when they're running week-long jobs.

       mark

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos



