>>> What checksumming is done for the actual data? I know that
>>> storage devices often do their own checksumming too, but how
>>> can I be sure my data is integrity checked every time I read
>>> it?

These things ("storage devices often do their own checksumming"
and "my data is integrity checked") are rather unrelated.
Various parts of storage subsystems do things like checksumming
not to protect your data, but to detect potential faults. That
is mainly a diagnostic, not an integrity guarantee. Part of the
reason is that it is very difficult and needlessly expensive to
do comprehensive integrity checking within the storage
subsystem, automagically.

>> If you use disks that support the Data Integrity Field (DIF)
>> extension, Linux will use it to provide end-to-end data
>> checksum support. Otherwise, there are checksums on the disk
>> and between disk controller and the CPU, but those are
>> obviously not end-to-end checksums.

Yes. But I'll add that the only way to ensure that "data is
integrity checked" is to do it truly end-to-end, with data and
application specific checks. For example, as a weak but useful
measure I 'zip' or 'gzip' (sometimes with zero compression if
already compressed) data that I want to be able to move around
across years and many storage devices.

Consider for example bugs in the IO subsystem itself, where the
wrong data ends up being written and checksummed, and then gets
validated every time even though it is not the right data.

> Just to be clear, even with a storage path that supports
> DIF/DIX, we don't currently do anything for applications on
> top of file systems. The primary application to target storage
> path is covered mainly for raw devices.

Which makes it not that generally useful. In effect DIF is a hw
accelerator of a weak form of per-block checksumming. I think
that most current CPUs are fast enough to do it in software
without it being that noticeable.
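
As a rough sketch of the "gzip with zero compression" measure
mentioned above, assuming Python and its standard gzip module
(the function names wrap and verify are made up here for
illustration): the gzip trailer carries a CRC-32 and the length
of the original data, and both are checked when the stream is
read back in full.

```python
import gzip
import shutil
import zlib


def wrap(src_path, dst_path, level=0):
    """Copy src into a gzip container (level 0 = stored, no compression)."""
    with open(src_path, "rb") as src, \
            gzip.open(dst_path, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)


def verify(gz_path, chunk=1 << 20):
    """Read the whole stream; return False if gzip reports any damage."""
    try:
        with gzip.open(gz_path, "rb") as f:
            # The CRC-32 and length in the trailer are only checked
            # once the end of the stream has been reached.
            while f.read(chunk):
                pass
    except (OSError, EOFError, zlib.error):
        # Bad CRC or header, truncated file, or corrupt deflate data.
        return False
    return True
```

This only covers what happens to the data after it has been
wrapped. If the wrong bytes were handed to wrap() in the first
place, they will verify fine forever, which is why it is a weak
measure and not a substitute for application specific checks.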

>> Adding data-level checksums is not something that we are
>> planning on adding to the ext2/3/4 file systems. BTRFS is
>> the only file system that has data-level checksums, but it's
>> not yet production ready.

But again that's not end-to-end. It is just as far as the
current storage system goes, and the biggest value, like for
ZFS, is to detect issues with the storage system itself (e.g.
bugs as well as hw issues).

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users