On Mon, Apr 19, 2021 at 09:59:06PM -0400, Theodore Ts'o wrote: > On Mon, Apr 19, 2021 at 03:53:52PM -0700, Eric Biggers wrote: > > On Mon, Apr 19, 2021 at 04:21:00PM +0000, Leah Rumancik wrote: > > > Upon file deletion, zero out all fields in ext4_dir_entry2 besides inode > > > and rec_len. In case sensitive data is stored in filenames, this ensures > > > no potentially sensitive data is left in the directory entry upon deletion. > > > Also, wipe these fields upon moving a directory entry during the conversion > > > to an htree and when splitting htree nodes. > > > > This should include more explanation about why this is useful, and what its > > limitations are (e.g. how do the properties of the storage device affect whether > > the filename is *really* deleted)... > > Well, it might be useful to talk about how this is not a complete > solution on its own (acknowledge that more changes to make sure > filenames aren't leaked in the journal will be forthcoming). > > However, there is a limit to how much we can put in a commit > description, and I'd argue that the people for whom caveats about > flash devices having old copies of directory blocks which could be > extracted by a nation-state intelligence angency, etc., are not likely > going to be the people reading the git commit description. :-) That's > the sort of thing that is best placed in a presentation given at a > conference, or in a white paper, or in LWN article. > > Commit descriptions are targetted at developers, so a note that "more > commits to follow" would be appropriate. I'll add a bit more to the description to help clarify some more. > > > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > > > index 883e2a7cd4ab..df7809a4821f 100644 > > > --- a/fs/ext4/namei.c > > > +++ b/fs/ext4/namei.c > > > @@ -1778,6 +1778,11 @@ dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count, > > > ((struct ext4_dir_entry_2 *) to)->rec_len = > > > ext4_rec_len_to_disk(rec_len, blocksize); > > > de->inode = 0; > > > + > > > + /* wipe name_len through and name field */ > > > + memset(&de->name_len, 0, ext4_rec_len_from_disk(de->rec_len, > > > + blocksize) - 6); > > > + > > This change in dx_move_dirents() does work, but I wonder if it would > have been better / more efficient to simply zero out the last > directory entry in dx_pack_dirents() after it is done packing the > directory entries in the original directory block? > > Hmm, I was basing it on the fact that it felt more logical to put it with the code that actually moves the entry in case move_dirents() is ever called without a packing after. But I see that move_dirents() and pack_dirents() are each only called once in do_split() so perhaps moving it to pack_dirents() would be best. > > The comment is confusing. IMO it would make more sense to mention what is *not* > > being zeroed: > > > > /* wipe the dir_entry excluding the rec_len field */ > > Or maybe, "wipe everything in the directory entry after the rec_len > field". Thanks for the suggestion, I was having a hard time coming up with something understandable. > > > > @@ -2492,6 +2498,11 @@ int ext4_generic_delete_entry(struct inode *dir, > > > else > > > de->inode = 0; > > > inode_inc_iversion(dir); > > > + > > > + /* wipe name_len through name field */ > > > + memset(&de->name_len, 0, > > > + ext4_rec_len_from_disk(de->rec_len, blocksize) - 6); > > > + > > > return 0; > > > > And maybe here too, although here why is the condition for setting the inode to > > 0 not the same as the condition for zeroing the other fields? > > I'd actually suggest wiping the directory entry *before* the "if > (pde)" statement, and yeah, it's probably best to zap the de->inode > unconditionally. > > What is going on is if there is a previoud directory entry ("if (pde) > ...) the the original code wasn't changing the directory entry at all, > including zero'ing the inode field, but instead simply expanding the > previous directory entry's rec_len to include the directory entry > being deleted. So in the original code, where the goal is to make > life as easy as possible for undelete programs, skipping "de->inode = > 0" when it was unnecessary was a good thing. > > But given that the new design goal of the code is, "to heck with > undelete programs, we want to shred anything that's no longer needed", > clearing the inode number is fine. > > In fact, what we could actually do is in the if (pde) case, we can zap > the entire directory entry, include de->rec_len. The advantage of > doing that is it becomes a lot easier to verify that the wiping code > is working correctly. We can simply check to make sure everything in > every directory entry after the end of the name (e.g., everything between > &de->name[de->name_pen) and ((char *) de) + de->rec_len) MUST be zero. > Yes this all makes sense, I'll update it. I'll also add a test similar to what you described in the fstest I'm working on. > > Also, maybe use offsetof(struct ext4_dir_entry_2, name_len) instead of '6'... > > Sure. Someone will still need to look at the definition of struct > ext4_dir_entry_2 to understand the structure layout, but offsetof(..) > is going to be a bit more understandable than a magic constant of '6'. > Sounds good to me. > - Ted > Thanks for the comments, Leah