Re: DMAR regression in 2.6.31 leads to ext4 corruption?

On Wed, Oct 14, 2009 at 01:09:26PM +0100, David Woodhouse wrote:
> On Fri, 2009-10-09 at 18:47 -0700, Andy Isaacson wrote:
> > Well, we don't know for sure what happened on the previous boot where
> > the filesystem corruption occurred.  I'm imagining a nightmare scenario
> > where GPU erroneous writes cause DMAR faults and handling them somehow
> > causes AHCI DMA requests to get lost.
> 
> Seems unlikely. The GPU faults happen whenever the GATT changes, because
> it translates _every_ address in the GATT through the IOMMU right there
> and then -- so if parts of the table are uninitialised, they'll cause
> stray write faults. But no writes are actually _happening_.
> 
> > I'm going to go ahead on the theory that the BIOS needs an update.
> 
> I can't really imagine how that would help; how the BIOS would be
> responsible for this. I'm more inclined to blame the drive. It's not an
> SSD, is it?

It's a Fujitsu (now serviced by Toshiba?) MHZ2160BH.  smartctl says:

Device Model:     FUJITSU MHZ2160BH G1
Serial Number:    K60WT8C2HHRS
Firmware Version: 0084000A
User Capacity:    160,041,885,696 bytes
...
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       219593
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       27721728
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       406
  5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000
  7 Seek_Error_Rate         0x000f   100   100   047    Pre-fail  Always       -       112
  8 Seek_Time_Performance   0x0005   100   100   019    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1598
 10 Spin_Retry_Count        0x0013   100   100   020    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       284
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1216
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       38 (Lifetime Min/Max 21/46)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       247
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       457965568
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000f   100   100   060    Pre-fail  Always       -       10448
203 Run_Out_Cancel          0x0002   100   100   000    Old_age   Always       -       1529011503750
240 Head_Flying_Hours       0x003e   200   200   000    Old_age   Always       -       0
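
The very large raw values for attributes 5, 196 and 203 look alarming as plain
decimals. A SMART raw value is a 48-bit field, and such numbers are easier to
read when split into three 16-bit words. The sketch below does only that
arithmetic. The helper name split_raw48 is made up for illustration, and what
each word means on this drive is vendor-specific and not established by the
output above.

```python
def split_raw48(raw: int) -> tuple[int, int, int]:
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    if not 0 <= raw < (1 << 48):
        raise ValueError("SMART raw values are 48 bits wide")
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


# Raw values copied from the smartctl table above.
suspicious = {
    "Reallocated_Sector_Ct": 8589934592000,
    "Reallocated_Event_Count": 457965568,
    "Run_Out_Cancel": 1529011503750,
}

for name, raw in suspicious.items():
    high, mid, low = split_raw48(raw)
    print(f"{name:<24} {raw:>15}  ->  {high} {mid} {low}")
```

Seen this way, the Reallocated_Sector_Ct raw value is exactly 2000 * 2^32 with
the low 32 bits zero, so it is not necessarily a count of 8.5 trillion
reallocated sectors.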

-andy
