Re: [PATCH v3 0/3] Add file-system authentication to BTRFS

Johannes Thumshirn <Johannes.Thumshirn@xxxxxxx> · Tue, 26 May 2020 07:50:53 +0000

On 25/05/2020 15:11, David Sterba wrote:
> On Thu, May 14, 2020 at 11:24:12AM +0200, Johannes Thumshirn wrote:
>> From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
>>
>> This series adds file-system authentication to BTRFS. 
>>
>> Unlike other verified file-system techniques like fs-verity, the
>> authenticated version of BTRFS does not need extra meta-data on disk.
>>
>> This works because in BTRFS every on-disk block has a checksum. For meta-data
>> the checksum is in the header of each meta-data item. For data blocks, a
>> separate checksum tree exists, which holds the checksums for each block.
>>
>> Currently BTRFS supports CRC32C, XXHASH64, SHA256 and Blake2b for checksumming
>> these blocks. This series adds a new checksum algorithm, HMAC(SHA-256), which
>> does need an authentication key. When no, or an incorrect, authentication key
>> is supplied, no valid checksum can be generated and a read, fsck or scrub
>> operation would detect invalid or tampered blocks once the file-system is
>> mounted again with the correct key.
> 
> As mentioned in the discussion under the LWN article, https://lwn.net/Articles/818842/,
> ZFS implements a split hash where one half is a (partial) authenticated hash
> and the other half is a checksum. This allows at least some sort
> of verification when the auth key is not available. This applies to the
> fixed size checksum area of metadata blocks; for data we can afford to
> store both hashes in full.
> 
> I like this idea, however it brings interesting design decisions, "what
> if" and corner cases:
> 
> - what hashes to use for the plain checksum, and thus what's the split
> - what if one hash matches and the other not
> - increased checksum calculation time due to doubled block read
> - whether to store the same partial hash+checksum for data too
> 
> As the authenticated hash is the main use case, I'd reserve most of the
> 32 byte buffer to it and use a weak hash for checksum: 24 bytes for HMAC
> and 8 bytes for checksum. As an example: sha256+xxhash or
> blake2b+xxhash.
> 
> I'd outright skip crc32c for the checksum so we have only a small number
> of authenticated checksums and avoid too many options, e.g.
> hmac-sha256-crc32c etc. The result will still be 2 authenticated hashes
> with the added checksum hardcoded to xxhash.
> 

Hmm, I'm really not a fan of this. We would have to use something like
sha2-224 to get the room for the 2nd checksum. So we're using a weaker
hash just so we can add a second checksum. On the other hand you've asked 
me to add the known pieces of information into the hashes as a salt to
"make attacks harder at a small cost".