Re: sata_sil data corruption, possible workarounds

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday 17 December 2012 06:44, Robert Hancock wrote:

> Hmm, looks like I was looking at the wrong register. The CLS itself is
> described by what I posted, so changing that does affect things (i.e.
> the threshold for Memory Read Multiple). The other value being written
> into fifo_cfg is the FIFO Write Request Control and FIFO Read Request
> Control field (that's why it's written to bits 0-2 and 8-10).
> 
> "The FIFO Write Request Control and FIFO Read Request Control fields in
> these registers provide threshold settings for establishing when PCI
> requests are made to the Arbiter. The Arbiter arbitrates among the four
> requests using fixed priority with masking. The fixed priority is, from
> highest to lowest: channel 0; channel 1; channel 2; and channel 3. If
> multiple requests are present, the arbiter grants PCI bus access to the
> highest priority channel that is not masked. That channel’s request is
> then masked as long as any unmasked requests are present.
> 
> ..
> 
> FIFO Read Request Control.  This bit field defines the FIFO threshold to
> assign priority when requesting a PCI bus read operation. A value of 00H
> indicates that read request priority is set whenever the FIFO has
> greater than 32 bytes available space, while a value of 07H indicates
> that read request priority is set whenever the FIFO has greater than
> 7x32 bytes (=224 bytes) available space. 

A fencepost error? They probably mean 8x32 bytes...

> This bit field is useful when 
> multiple DMA channels are competing for accessing the PCI bus.
> 
> 
> FIFO Write Request Control. This bit field defines the FIFO threshold to
> assign priority when requesting a PCI bus write operation. A value of
> 00H indicates that write request priority is set whenever the FIFO
> contains greater than 32 bytes, while a value of 07H indicates that
> write request priority is set whenever the FIFO contains greater than
> 7x32 bytes (=224 bytes). This bit field is useful when multiple DMA
> channels are competing for the PCI bus."
> 
> The value apparently being written to the register according to the code
> (and given that the value in the CLS register is in units of 32-bit
> words) is (cache line size >> 3) + 1.
> 
>  From looking at the history of this code (which dates from the pre-git
> days in 2005) it comes from:
> 
>
https://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=fceff08ed7660f9bbe96ee659acb02841a3f1f39
> 
> which refers to an issue with DMA FIFO thresholds which could cause data
> corruption. The description is pretty much hand-waving and doesn't 
> really describe what is going on. 

>From the description: "The patch is to setup the DMA fifo threshold so that
there is no chance for the DMA engine to change protocol." Is there a way to
check (or add debugging messages) if/when this change of protocol is
happening?

> But it seems quite likely that 
> whatever magic numbers this code is picking don't work on your system
> for some reason. It appears the root cause is likely a bug in the SiI
> chip. There shouldn't be any region why messing around with these values
> should cause data corruption other than that.

Do you think something should be done about it in the linux sata_sil driver?
For a lack of a better solution, here is my suggestion. There is already
one option 'slow_down' for problematic disks. Another option, for
example 'cache_line_workaround', could be added for problematic
motherboards. If enabled, the most straightforward way is to set cache line
size to 0 and not worry about the fifo_cfg register. If someone else
confirms that it solves the problem for them, this option could be enabled
automatically if certain motherboard chipset is detected.

A comment in the source of another Silicon Image driver, siimage.c: "If you
have strange problems with nVidia chipset systems please see the SI support
documentation and update your system BIOS if necessary". Do you (or anyone
else) know what support documentation it's refering to?


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux