On Feb 9, 2021, at 4:22 PM, Theodore Ts'o <tytso@xxxxxxx> wrote: > > On Wed, Feb 03, 2021 at 11:31:28AM -0500, Theodore Ts'o wrote: >> On Wed, Feb 03, 2021 at 03:55:06AM -0700, Andreas Dilger wrote: >>> >>> It looks like this change will break the dirdata feature, which is similarly >>> storing a data field beyond the end of the dirent. However, that feature also >>> provides for flags stored in the high bits of the type field to indicate >>> which of the fields are in use there. >>> The first byte of each field stores >>> the length, so it can be skipped even if the content is not understood. >> >> Daniel, for context, the dirdata field is an out-of-tree feature which >> is used by Lustre, and so has fairly large deployed base. So if there >> is a way that we can accomodate not breaking dirdata, that would be >> good. >> >> Did the ext4 casefold+encryption implementation escape out to any >> Android handsets? > > So from an OOB chat with Daniel, it appears that the ext4 > casefold+encryption implementation did in fact escape out to Android > handsets. So I think what we will need to do, ultiumately, is support > one way of supporting the casefold IV in the case where "encryption && > casefold", and another way when "encryption && casefold && dirdata". > > That's going to be a bit sucky, but I don't think it should be that > complex. Daniel, Andreas, does that make sense to you? I was just going to ping you about this, whether it made sense to remove this feature addition from the "maint" branch (i.e. make a 1.45.8 without it), and keep it only in 1.46 or "next" to reduce its spread? Depending on the size of the "escape", it probably makes sense to move toward having e2fsck migrate from the current mechanism to using dirdata for all deployments. In the current implementation, tools don't really know for sure if there is data beyond the filename in the dirent or not. I guess it is implicit with the casefold+encryption case for dirents in directories that have the encryption flag set in a filesystem that also has casefold enabled, but it's definitely not friendly to these features being enabled on an existing filesystem. For example, what if casefold is enabled on an existing filesystem that already has an encrypted directory? Does the code _assume_ that there is a hash beyond the name if the rec_len is long enough for this? There will definitely be some pre-existing dirents that will have a large rec_len (e.g. those at the end of the block, or with deleted entries immediately following), that do *not* have the proper hash stored in them. There may be random garbage at the end of the dirent, and since every value in the hash is valid, there is no way to know whether it is good or bad. With the dirdata mechanism, there would be a bit set in the "file_type" field that will indicate if the hash was present, as well as a length field (0x08) that is a second confirmation that this field is valid. Cheers, Andreas
Attachment:
signature.asc
Description: Message signed with OpenPGP