Comment # 24
on bug 111763
from wychuchol
(In reply to wychuchol from comment #23)
> (In reply to wychuchol from comment #19)
> > After some time in Witcher 3 GOTY run with Lutris PC restarts on it's own. I
> > thought something is overheating (I've noticed graphic card memory in
> > PSensor sometimes reaching 90 so I thought maybe that's what's happening)
> > but I investigated kern.log and this always happened before that autonomous
> > reset:
> >
> > Nov 2 22:01:53 pop-os kernel: [ 979.244964] pcieport 0000:00:01.1: AER:
> > Corrected error received: 0000:01:00.0
> > Nov 2 22:01:53 pop-os kernel: [ 979.244967] nvme 0000:01:00.0: AER: PCIe
> > Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> > Nov 2 22:01:53 pop-os kernel: [ 979.244968] nvme 0000:01:00.0: AER:
> > device [1987:5012] error status/mask=00001000/00006000
> > Nov 2 22:01:53 pop-os kernel: [ 979.244968] nvme 0000:01:00.0: AER:
> > [12] Timeout
> > Nov 2 22:01:53 pop-os kernel: [ 979.262629] Emergency Sync complete
>
> Thing with those AER errors is that they can go on and on and reset happens
> few minutes after the last logged error.
> This might be overheating, I managed to find how to output sensors readings
> into txt log and found that memory went up to 96 C (or rather it stayed
> there for about 1m 10s)
> Last reading before reset:
> amdgpu-pci-2800
> Adapter: PCI adapter
> vddgfx: +1.16 V
> fan1: 1551 RPM (min = 0 RPM, max = 3200 RPM)
> edge: +74.0°C (crit = +118.0°C, hyst = -273.1°C)
> (emerg = +99.0°C)
> junction: +88.0°C (crit = +99.0°C, hyst = -273.1°C)
> (emerg = +99.0°C)
> mem: +96.0°C (crit = +99.0°C, hyst = -273.1°C)
> (emerg = +99.0°C)
> power1: 162.00 W (cap = 195.00 W)
>
> k10temp-pci-00c3
> Adapter: PCI adapter
> Tdie: +70.5°C (high = +70.0°C)
> Tctl: +70.5°C
>
> Now the weird thing is - if this is in fact overheating why fan didn't go
> beyond 1600 rpm even once.... Highest was like 1581 rpm and I don't have
> silent bios switched on (sapphire pulse rx 5700 xt, lever facing away from
> video ports).

Okay, I don't think it's overheating anymore. I found a moment in Anomaly 1.5.0 that I can't get past without the system resetting, just before a psi storm in Army Warehouses (I can provide a savefile).

Last sensors reading before the crash (readings logged in 5-second increments):

amdgpu-pci-2800
Adapter: PCI adapter
vddgfx: +1.01 V
fan1: 1560 RPM (min = 0 RPM, max = 3200 RPM)
edge: +69.0°C (crit = +118.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
junction: +84.0°C (crit = +99.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
mem: +80.0°C (crit = +99.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
power1: 227.00 W (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +71.8°C (high = +70.0°C)
Tctl: +71.8°C
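
For anyone comparing logged readings like the one above, here is a minimal Python sketch (not part of the original report) that scans sensors-style text and lists every reading that is above one of its own stated limits. The function name `over_limit`, the set of limit names it checks (`cap`, `crit`, `emerg`, `high`, `max`), and the assumption that the log has the same line layout as the paste above are all illustrative choices, not anything defined by lm-sensors or amdgpu.

```python
import re

# A reading line, e.g. "power1: 227.00 W (cap = 195.00 W)".
READING = re.compile(r"^\s*(?P<label>[\w-]+):\s*(?P<value>[+-]?\d+(?:\.\d+)?)")
# One "name = number" pair inside the parentheses (upper limits only;
# this list of names is an assumption based on the paste above).
LIMIT = re.compile(
    r"\b(?P<name>cap|crit|emerg|high|max)\s*=\s*(?P<value>[+-]?\d+(?:\.\d+)?)"
)


def over_limit(sensors_text):
    """Return (label, value, limit_name, limit) for readings above a limit."""
    hits = []
    label = value = None
    for line in sensors_text.splitlines():
        match = READING.match(line)
        if match:
            label = match.group("label")
            value = float(match.group("value"))
            rest = line[match.end():]
        elif line.strip().startswith("(") and label is not None:
            # Continuation line such as "(emerg = +99.0°C)".
            rest = line
        else:
            # Chip name, "Adapter:" line or blank line: no current reading.
            label = value = None
            continue
        for lim in LIMIT.finditer(rest):
            limit = float(lim.group("value"))
            if value > limit:
                hits.append((label, value, lim.group("name"), limit))
    return hits


# Sample taken from the last reading before the crash quoted above.
SAMPLE = """\
amdgpu-pci-2800
Adapter: PCI adapter
vddgfx: +1.01 V
fan1: 1560 RPM (min = 0 RPM, max = 3200 RPM)
edge: +69.0°C (crit = +118.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
junction: +84.0°C (crit = +99.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
mem: +80.0°C (crit = +99.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
power1: 227.00 W (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +71.8°C (high = +70.0°C)
Tctl: +71.8°C
"""

if __name__ == "__main__":
    for label, value, name, limit in over_limit(SAMPLE):
        print(f"{label}: {value} exceeds {name} = {limit}")
```

On the sample above this flags two readings: `power1` at 227.0 against `cap` = 195.0, and `Tdie` at 71.8 against `high` = 70.0. None of the GPU temperatures is above its limit in that reading.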