On Thu, Jul 7, 2011 at 12:05 AM, Allison Henderson <achender@xxxxxxxxxxxxxxxxxx> wrote: > On 07/02/2011 02:33 AM, Amir Goldstein wrote: >> >> On Fri, Jul 1, 2011 at 12:22 AM, Allison Henderson >> <achender@xxxxxxxxxxxxxxxxxx> wrote: >>> >>> The first patch adds a new flag, EXT4_FREE_BLOCKS_ZERO, >>> to ext4_free_blocks. This flag causes causes blocks to be >>> zerod before they are freed. Files that have the EXT4_SECRM_FL >>> attribute flag on will have their blocks zerod when >>> they are released. The EXT4_SECRM_FL attribute flag can >>> be enabled useing chattr +s >>> >>> Signed-off-by: Allison Henderson<achender@xxxxxxxxxxxxxxxxxx> >>> --- >>> v1->v2 >>> Removed check for discard mount option and replaced with >>> check for secure discard and discard_zeroes_data >>> >>> Added BLKDEV_DISCARD_SECURE to the sb_issue_discard call >>> >>> v2->v3 >>> Removed code for discard. A seperate patch will >>> be done to add that code in the block layer >>> >>> :100644 100644 1921392... 38a4d75... M fs/ext4/ext4.h >>> :100644 100644 f815cc8... cf178f3... M fs/ext4/extents.c >>> :100644 100644 62f86e7... 8fdae7d... M fs/ext4/inode.c >>> :100644 100644 6ed859d... 94872b2... M fs/ext4/mballoc.c >>> :100644 100644 c757adc... 1ff7532... M fs/ext4/xattr.c >>> fs/ext4/ext4.h | 1 + >>> fs/ext4/extents.c | 3 +++ >>> fs/ext4/inode.c | 3 +++ >>> fs/ext4/mballoc.c | 8 ++++++++ >>> fs/ext4/xattr.c | 12 ++++++++---- >>> 5 files changed, 23 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h >>> index 1921392..38a4d75 100644 >>> --- a/fs/ext4/ext4.h >>> +++ b/fs/ext4/ext4.h >>> @@ -526,6 +526,7 @@ struct ext4_new_group_data { >>> #define EXT4_FREE_BLOCKS_METADATA 0x0001 >>> #define EXT4_FREE_BLOCKS_FORGET 0x0002 >>> #define EXT4_FREE_BLOCKS_VALIDATED 0x0004 >>> +#define EXT4_FREE_BLOCKS_ZERO 0x0008 >>> >>> /* >>> * ioctl commands >>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>> index f815cc8..cf178f3 100644 >>> --- a/fs/ext4/extents.c >>> +++ b/fs/ext4/extents.c >>> @@ -2231,6 +2231,9 @@ static int ext4_remove_blocks(handle_t *handle, >>> struct inode *inode, >>> >>> if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) >>> flags |= EXT4_FREE_BLOCKS_METADATA; >>> + >>> + if (EXT4_I(inode)->i_flags& EXT4_SECRM_FL) >>> + flags |= EXT4_FREE_BLOCKS_ZERO; >>> #ifdef EXTENTS_STATS >>> { >>> struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); >>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c >>> index 62f86e7..8fdae7d 100644 >>> --- a/fs/ext4/inode.c >>> +++ b/fs/ext4/inode.c >>> @@ -4182,6 +4182,9 @@ static int ext4_clear_blocks(handle_t *handle, >>> struct inode *inode, >>> for (p = first; p< last; p++) >>> *p = 0; >>> >>> + if (EXT4_I(inode)->i_flags& EXT4_SECRM_FL) >>> + flags |= EXT4_FREE_BLOCKS_ZERO; >>> + >>> ext4_free_blocks(handle, inode, NULL, block_to_free, count, >>> flags); >>> return 0; >>> out_err: >>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c >>> index 6ed859d..94872b2 100644 >>> --- a/fs/ext4/mballoc.c >>> +++ b/fs/ext4/mballoc.c >>> @@ -4485,6 +4485,14 @@ void ext4_free_blocks(handle_t *handle, struct >>> inode *inode, >>> ext4_debug("freeing block %llu\n", block); >>> trace_ext4_free_blocks(inode, block, count, flags); >>> >>> + if (flags& EXT4_FREE_BLOCKS_ZERO) { >>> + err = sb_issue_zeroout(inode->i_sb, block, count, >>> GFP_NOFS); >> >> But the delete of these blocks in not yet committed, >> so after reboot, you can end up with a non-deleted but zeroed file data. >> Is that acceptable? I should think not. >> >> One way around this is a 2-phase unlink/truncate. >> Phase 1: add to orphan list and register a callback on commit >> Phase 2: issue zeroout and free the blocks >> >> This won't work for punch hole, but then again, for punch hole >> it's probably OK to end up with zeroed data, but non-deleted blocks. >> Right? >> >> Amir. > > Hi, I had a quick question about the orphan list. I notice that > ext4_ext_truncate and also ext4_ext_punch_hole already have a call to > ext4_orphan_add that happens really early before any calls to free blocks. > Does this address your earlier concerns, or is there another reason I > missed? Thx! > It doesn't address the concerns of getting a non-deleted file with zeroed data after crash, because the existence of the inode on the orphan list after crash depends on the transaction that added it to the list being committed. And your patch zeroes the blocks before that transaction is committed. However, the orphan list gives you a very good framework to implement deferred delete (by a kernel thread) as Andreas suggested. Unlink should be simple, because freeing unlinked inode blocks it is anyway deferred till the inode refcount drops to zero. Truncate is more tricky, because of the truncate shrink/extend requirement (that all data is zeroes after extending the inode's size via truncate system call), so a shrinking-deferred truncate would have to mark all the to-be-deleted extents uninitialized. Amir. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html