On 2012-04-29, at 1:39 PM, Ted Ts'o wrote:

> On Sat, Apr 28, 2012 at 04:58:12PM -0600, Andreas Dilger wrote:
>>
>> I thought we originally discussed using the high 16 bits of the
>> t_flags field to store the checksum?  This would avoid the need to
>> change the disk format.
>
> I don't recall that suggestion, but I like it.  One thing that will
> get subtle about this is that t_flags is stored big-endian (jbd/jbd2
> data structures are stored be, but the data structures in ext4 proper
> are stored le; sigh).  So we'd have to do something like this:
>
> typedef struct journal_block_tag_s
> {
> 	__u32	t_blocknr;	/* The on-disk block number */
> 	__u16	t_checksum;	/* 16-bit checksum */
> 	__u16	t_flags;	/* See below */
> 	__u32	t_blocknr_high;	/* most-significant high 32bits. */
> } journal_block_tag_t;
>
> ... and then make sure all of the places that access t_flags using
> cpu_to_be32() and be32_to_cpu() get changed to the 16-bit variants.
> Sigh...
>
>> Since there is still a whole-transaction checksum, it isn't so
>> critical that the per-block checksum be strong.
>>
>> One idea is to do the crc32c for each block, then store the high 16
>> bits into t_flags, and checksum the full 32-bit per-block checksums
>> to make the commit block checksum, to avoid having to do the block
>> checksums twice.
>
> It's not critical because the hard drive is doing its own ECC.  So I'm
> not that worried about detecting a large burst of bit errors, which is
> the main advantage of using a larger CRC.  I'm more worried about a
> disk block getting written to the wrong place, or not getting written
> at all.  So whether the chance of detecting a wrong block is 99.9985%
> (with a 16-bit checksum) or 99.9999% (with a 32-bit checksum), either
> is fine.
>
> I'm not even sure I would worry about combining the per-block
> checksums into the commit block checksum.
> In the rare case where there is an error not detected by the 16-bit
> checksum which is detected in the commit checksum, what are we
> supposed to do?  Throw away the entire commit again?  Just testing
> what we actually do in this rare case is going to be
> interesting/painful.

The main reason to create the commit block checksum out of the
per-block checksums is to avoid having to checksum the data blocks
themselves twice.  It would be a significant overhead to compute the
checksum of, say, 4096 * 4kB = 16MB of block data a second time, while
computing the commit block checksum over 4096 * 2 bytes = 8kB of the
16-bit checksums is relatively minor.

Cheers, Andreas