On 26/05/2020 13:54, David Sterba wrote:
> On Tue, May 26, 2020 at 07:50:53AM +0000, Johannes Thumshirn wrote:
>> On 25/05/2020 15:11, David Sterba wrote:
>>> On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
>>> As mentioned in the discussion under LWN article, https://lwn.net/Articles/818842/
>>> ZFS implements split hash where one half is (partial) authenticated hash
>>> and the other half is a checksum. This allows to have at least some sort
>>> of verification when the auth key is not available. This applies to the
>>> fixed size checksum area of metadata blocks, for data we can afford to
>>> store both hashes in full.
>>>
>>> I like this idea, however it brings interesting design decisions, "what
>>> if" and corner cases:
>>>
>>> - what hashes to use for the plain checksum, and thus what's the split
>>> - what if one hash matches and the other not
>>> - increased checksum calculation time due to doubled block read
>>> - whether to store the same partial hash+checksum for data too
>>>
>>> As the authenticated hash is the main usecase, I'd reserve most of the
>>> 32 byte buffer to it and use a weak hash for checksum: 24 bytes for HMAC
>>> and 8 bytes for checksum. As an example: sha256+xxhash or
>>> blake2b+xxhash.
>>>
>>> I'd outright skip crc32c for the checksum so we have only small number
>>> of authenticated checksums and avoid too many options, eg.
>>> hmac-sha256-crc32c etc. The result will be still 2 authenticated hashes
>>> with the added checksum hardcoded to xxhash.
>>
>> Hmm I'm really not a fan of this. We would have to use something like
>> sha2-224 to get the room for the 2nd checksum. So we're using a weaker
>> hash just so we can add a second checksum.
>
> The idea is to calculate full hash (32 bytes) and store only the part
> (24 bytes). Yes this means there's some information loss and weakening,
> but enables a usecase.

I'm not enough of a security expert to be able to judge this. Eric, can I
hear your opinion on this?

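
For reference, here is a minimal sketch of the split layout as I understand
it from the description above (Python, illustration only, not btrfs code).
The names split_csum() and verify() are hypothetical. Keeping the *first* 24
bytes of the HMAC is my assumption, and since xxhash is not in the Python
standard library, an unkeyed 8-byte BLAKE2b digest stands in for it:

```python
import hashlib
import hmac
from typing import Optional, Tuple

CSUM_SIZE = 32  # fixed checksum area of a metadata block (from the thread)
AUTH_LEN = 24   # bytes kept from the full HMAC
PLAIN_LEN = 8   # bytes used by the plain checksum


def plain_checksum(block: bytes) -> bytes:
    # ASSUMPTION: stand-in for xxhash64, which is not in the stdlib.
    # An unkeyed 8-byte BLAKE2b digest is used purely for illustration.
    return hashlib.blake2b(block, digest_size=PLAIN_LEN).digest()


def split_csum(key: bytes, block: bytes) -> bytes:
    # Compute the full 32-byte HMAC-SHA256, then store only part of it.
    # ASSUMPTION: the leading 24 bytes are the part that is kept.
    full = hmac.new(key, block, hashlib.sha256).digest()
    return full[:AUTH_LEN] + plain_checksum(block)


def verify(block: bytes, stored: bytes,
           key: Optional[bytes] = None) -> Tuple[bool, Optional[bool]]:
    # Returns (plain_ok, auth_ok); auth_ok is None when no key is available.
    plain_ok = hmac.compare_digest(stored[AUTH_LEN:], plain_checksum(block))
    if key is None:
        return plain_ok, None
    full = hmac.new(key, block, hashlib.sha256).digest()
    auth_ok = hmac.compare_digest(stored[:AUTH_LEN], full[:AUTH_LEN])
    return plain_ok, auth_ok
```

Without the key, verify() can still check the 8-byte half. The case where one
half matches and the other does not shows up as a mixed result such as
(True, False), which is exactly the corner case that needs a policy decision.
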
Thanks,
Johannes