Re: Transactional updates for LUKS2 metadata?

"Schneider, Robert" <robert.schneider03@xxxxxxx> · Sun, 11 Apr 2021 12:09:09 +0000

Thank you very much for your response!

I am implementing a form of key rotation for a "backup" key (a secondary key). The backup key has associated data which I wanted to store in a token. This associated data is required to get the passphrase. To rotate this backup key, I have to adjust the keyslot json object, the token, and the keyslot binary data.

Currently, I am using two keyslots to implement a safe update. I'm relying on the atomicity of API calls such as crypt_keyslot_add_by_* to a certain degree. To maintain the pair of keyslots, I write a sequence number into the token object. The key is rotated as follows:
1. crypt_keyslot_add_* - add new keyslot
2. crypt_token_set - add metadata for new keyslot, with incremented sequence number
3. crypt_keyslot_wipe - remove old keyslot and tokens
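
A minimal sketch of the recovery side of this scheme, as a toy model in Python rather than real libcryptsetup calls. The `Token` record and the `classify` helper are hypothetical names used only for illustration; the sketch assumes each token stores the keyslot it refers to plus the sequence number:

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Token:
    token_id: int
    keyslot: int  # keyslot number this token refers to
    seq: int      # sequence number written into the token


def classify(active_keyslots: Set[int], tokens: List[Token]
             ) -> Tuple[Optional[Token], List[Token], List[Token]]:
    """Split tokens into (current, stale, orphaned).

    current:  token with the highest sequence number whose keyslot exists
    stale:    older tokens whose keyslot still exists (step 3 did not finish)
    orphaned: tokens whose keyslot is gone
    """
    valid = [t for t in tokens if t.keyslot in active_keyslots]
    orphaned = [t for t in tokens if t.keyslot not in active_keyslots]
    current = max(valid, key=lambda t: t.seq, default=None)
    stale = [t for t in valid if t is not current]
    return current, stale, orphaned
```

A keyslot that was added in step 1 but never received a token in step 2 does not show up in any of the three groups, so this model cannot tell it apart from an ordinary keyslot.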

While this is not too difficult, I'll also have to deal with the existence of two keyslot+token pairs (old, new). Plus, it's possible to leak the keyslot if a reboot occurs between steps 1 and 2. It would be an ordinary keyslot without associated metadata, hence I cannot be sure that it's the backup key - the keyslot number is not fixed. At some point I realized that it might be possible to leverage cryptsetup to get rid of this complexity in my tool, since libcryptsetup can provide slightly better atomicity guarantees than currently exposed in the API. For example, if I could add the token first, then the keyslot, the aforementioned leak would be gone. A leaked token is easy to detect, since a token is invalid if the associated keyslot is invalid.

I've been reading through the spec and the implementation again, and it seems to me that a header restore cannot be atomic because of the binary keyslot area. Since there are two copies of the header, each with a sequence number and checksum, we can restore one header first, sync, and then restore the second header. Should an error occur at any point, one of the header replicas should still be valid. However, if an error occurs during the restore of the binary keyslot area, we can neither recover from nor detect the error, since there's no checksum for this area. (I wonder why the binary keyslot area is unprotected...)
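
To illustrate the replica argument, here is a simplified model in Python. It is not the real LUKS2 on-disk format: the replica layout, `make_replica`, `is_valid` and `pick_header` are all assumptions made for this sketch. It only shows how a checksum plus a sequence number lets a reader survive a torn write of one replica:

```python
import hashlib
import json
from typing import Optional


def make_replica(seqid: int, metadata: dict) -> dict:
    """Serialize a header replica and attach a checksum over its bytes."""
    body = json.dumps({"seqid": seqid, "metadata": metadata},
                      sort_keys=True).encode()
    return {"body": body, "checksum": hashlib.sha256(body).hexdigest()}


def is_valid(replica: dict) -> bool:
    """A replica is valid if its stored checksum matches its contents."""
    return hashlib.sha256(replica["body"]).hexdigest() == replica["checksum"]


def pick_header(primary: dict, secondary: dict) -> Optional[dict]:
    """Return the valid replica with the highest sequence number, if any."""
    candidates = [json.loads(r["body"])
                  for r in (primary, secondary) if is_valid(r)]
    return max(candidates, key=lambda h: h["seqid"], default=None)
```

The binary keyslot area has no counterpart to `is_valid` in this model, which is exactly the gap described above.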

It looks to me as if a safe update of a keyslot would therefore have to be done along these lines:
1. Load header into memory.
2. Allocate new keyslot area by using in-memory data of the header. Includes update of keyslot json object in memory to refer to new binary keyslot.
3. Lock the device (locked for exclusive writes) and ensure sequence number is still the same.
4. Write new binary keyslot data to disk. This does not overwrite the binary data of the old keyslot.
5. Sync.
6. Write first header to disk.
7. Sync.
8. Write second header to disk.
9. Sync.
10. Wipe old keyslot binary data.
11. Sync.
12. Unlock device.
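
The ordering above can be checked with a small crash simulation. This is a toy model in Python, not cryptsetup code: `rotate`, `recover` and the dictionary-based "disk" are hypothetical, each write step is assumed to be fully durable before the next one starts (the sync steps), locking is left out, and `b""` stands for a zeroed keyslot area:

```python
import copy
from typing import List, Optional, Tuple


def rotate(disk: dict, new_area: str, new_data: bytes,
           crash_after: Optional[int] = None) -> dict:
    """Apply the write steps in order; stop after step `crash_after`.

    Step numbers match the list above (4, 6, 8 and 10 modify the disk).
    """
    disk = copy.deepcopy(disk)
    old = max(disk["headers"], key=lambda h: h["seqid"])
    new_header = {"seqid": old["seqid"] + 1, "area": new_area}
    steps = [
        (4, "areas", new_area, new_data),        # write new binary keyslot
        (6, "headers", 0, dict(new_header)),     # write first header
        (8, "headers", 1, dict(new_header)),     # write second header
        (10, "areas", old["area"], b""),         # wipe old binary keyslot
    ]
    for number, section, key, value in steps:
        if crash_after is not None and number > crash_after:
            break  # simulated crash: later steps never reach the disk
        disk[section][key] = value
    return disk


def recover(disk: dict) -> Tuple[dict, List[str]]:
    """Pick the newest header and list non-empty areas it does not reference."""
    header = max(disk["headers"], key=lambda h: h["seqid"])
    leaked = sorted(name for name, data in disk["areas"].items()
                    if data and name != header["area"])
    return header, leaked
```

In this model, the header chosen by `recover` always points at an intact keyslot area, whichever step the crash follows, and any leftover area shows up as non-empty and unreferenced.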

In this case, we could leak the old keyslot between step 9 and step 11, but at least we could detect it as a non-zero, unreferenced keyslot. For key rotation, I believe it is important to not leak the old keyslot binary data, since we're trying to make unlocking with the old passphrase impossible (since the salt of the kdf is removed, this might not be an issue).
The LUKS2_keyslot_store function together with the luks2_keyslot keyslot_handler seems to follow this roughly, except that locking is different and it only stores a new keyslot but doesn't remove any.

In this algorithm, we can also persist any modifications of the binary header and the json metadata in steps 6/8 safely due to the checksum and sequence number. Therefore, it should be possible to atomically change both a keyslot and the associated token.

Generic transactions with multiple modifications of the keyslot area now appear to me to be rather complicated to implement. However, multiple modifications of the json metadata and a single, final keyslot_add or keyslot_change should be relatively easy to implement. I understand that you want to keep complexity low, but would you be willing to accept a contribution in this direction? Or do you maybe have a better idea how I could solve my issue?

Thanks again,
Robert

-----Original Message-----
From: Milan Broz <gmazyland@xxxxxxxxx> 
Sent: Saturday, 10 April 2021 21:27
To: Schneider, Robert <robert.schneider03@xxxxxxx>; dm-crypt@xxxxxxxx
Subject: Re: Transactional updates for LUKS2 metadata?

On 09/04/2021 20:46, Schneider, Robert wrote:
> Hi,
> 
> Is there a way to get transactions over multiple metadata operations when using libcryptsetup?
> 
> Imagine I have some mechanism for unlocking which requires information from a token associated to a keyslot. Now I'd like to update that information in the token together with the keyslot.
> But if the machine reboots in between the API calls, I believe my unlock mechanism would be broken - for example, when I've updated the keyslot but still have the old token.
> 
> I could not find an operation to update a token atomically, nor any transaction operations (like open transaction, commit) in the API. I've had a quick glance at the source code and it looks to me like the header is updated in memory and finally written to disk with replica, using a sequence number. This suggests to me that transactions should be relatively easy to implement. However I don't see the full picture of course, so I'd like to know your opinion.
> 
> As an alternative to transactions within the libcryptsetup API, it looks like it's possible to perform a header backup, then manipulate the detached (backup) header, and finally restore the header - as long as the volume key is not changed. Do you think that's a reasonable alternative, or are there potential pitfalls here?

The LUKS2 header is not a database and never will be. Implementing transactions smells like overengineering here.

For "complex" operations you can recover after a failure by removing the token metadata and recreating it; it is expected that you still
have a backup keyslot or volume key (or header backup).

And yes, you can modify a backup header externally and then update it. But then the recovery can fail with a partial write or a media failure,
so you will end up in the same situation you tried to avoid.

Milan