On Fri, 2011-07-08 at 03:09 +0300, Amir Goldstein wrote:
> On Thu, Jul 7, 2011 at 11:19 PM, Allison Henderson
> <achender@xxxxxxxxxxxxxxxxxx> wrote:
> > On 07/07/2011 12:52 PM, Andreas Dilger wrote:
> >>
> >> On 2011-07-07, at 1:05 AM, Amir Goldstein wrote:
> >>>
> >>> On Thu, Jul 7, 2011 at 12:05 AM, Allison Henderson
> >>> <achender@xxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 07/02/2011 02:33 AM, Amir Goldstein wrote:
> >>>>>
> >>>>> On Fri, Jul 1, 2011 at 12:22 AM, Allison Henderson
> >>>>> <achender@xxxxxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> @@ -4485,6 +4485,14 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
> >>>>>>         ext4_debug("freeing block %llu\n", block);
> >>>>>>         trace_ext4_free_blocks(inode, block, count, flags);
> >>>>>>
> >>>>>> +       if (flags & EXT4_FREE_BLOCKS_ZERO) {
> >>>>>> +               err = sb_issue_zeroout(inode->i_sb, block, count, GFP_NOFS);
> >>>>>
> >>>>> But the delete of these blocks is not yet committed,
> >>>>> so after reboot, you can end up with a non-deleted but zeroed file
> >>>>> data.
> >>>>> Is that acceptable? I should think not.
> >>>>>
> >>>>> One way around this is a 2-phase unlink/truncate.
> >>>>> Phase 1: add to orphan list and register a callback on commit
> >>>>> Phase 2: issue zeroout and free the blocks
> >>>>>
> >>>>> This won't work for punch hole, but then again, for punch hole
> >>>>> it's probably OK to end up with zeroed data, but non-deleted blocks.
> >>>>> Right?
> >>>>
> >>>> Hi, I had a quick question about the orphan list. I notice that
> >>>> ext4_ext_truncate and also ext4_ext_punch_hole already have a call to
> >>>> ext4_orphan_add that happens really early before any calls to free
> >>>> blocks.
> >>>> Does this address your earlier concerns, or is there another reason I
> >>>> missed? Thx!
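
To make the ordering in the 2-phase proposal quoted above concrete, here is a
toy userspace model of it. This is only a sketch under loud assumptions: every
name in it (toy_fs, toy_unlink_begin, toy_zero_and_free, toy_commit) is
hypothetical, nothing here is ext4 or jbd2 code, and the "commit" is just a
function call standing in for a journal commit. The only point it shows is
that phase 1 touches no data blocks, and that zeroing and freeing run from the
callback that fires once the orphan entry is durable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NBLOCKS 8
#define TOY_BLKSZ   16

/* Toy "filesystem". Hypothetical structure, not an ext4 data structure. */
struct toy_fs {
    char   data[TOY_NBLOCKS][TOY_BLKSZ];  /* block contents */
    bool   in_use[TOY_NBLOCKS];           /* allocation bitmap */
    bool   orphan_pending;                /* orphan entry in running transaction */
    bool   orphan_committed;              /* orphan entry is durable */
    size_t first, count;                  /* block range queued for phase 2 */
    void (*commit_cb)(struct toy_fs *);   /* callback registered in phase 1 */
};

/* Phase 2: zero the queued blocks first, then mark them free. */
static void toy_zero_and_free(struct toy_fs *fs)
{
    for (size_t b = fs->first; b < fs->first + fs->count; b++) {
        memset(fs->data[b], 0, TOY_BLKSZ);
        fs->in_use[b] = false;
    }
    fs->orphan_pending = false;
    fs->commit_cb = NULL;
}

/* Phase 1: record the orphan entry and register the commit callback.
 * No data block is modified here. Returns 0, or -1 on a bad range. */
static int toy_unlink_begin(struct toy_fs *fs, size_t first, size_t count)
{
    if (first > TOY_NBLOCKS || count > TOY_NBLOCKS - first)
        return -1;
    fs->orphan_pending = true;
    fs->first = first;
    fs->count = count;
    fs->commit_cb = toy_zero_and_free;
    return 0;
}

/* "Commit": make the orphan entry durable, then fire the callback. */
static void toy_commit(struct toy_fs *fs)
{
    if (!fs->orphan_pending)
        return;
    fs->orphan_committed = true;
    if (fs->commit_cb)
        fs->commit_cb(fs);
}
```

In this model, a "crash" between toy_unlink_begin() and toy_commit() leaves
both the file and its data intact, which is the property the zeroout in the
patch hunk above does not have.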
> >>>
> >>> It doesn't address the concerns of getting a non-deleted file with
> >>> zeroed data after crash, because the existence of the inode on the
> >>> orphan list after crash depends on the transaction that added it to
> >>> the list being committed.
> >>> And your patch zeroes the blocks before that transaction is committed.
> >>>
> >>> However, the orphan list gives you a very good framework to implement
> >>> deferred delete (by a kernel thread) as Andreas suggested.
> >>> Unlink should be simple, because freeing an unlinked inode's blocks is
> >>> deferred anyway until the inode refcount drops to zero.
> >>
> >> Right. The patch that I referenced moved all of the blocks from unlink
> >> and truncate-to-zero from the current inode to a new temporary inode on
> >> the orphan list (simply copying the i_blocks field + i_block and i_size,
> >> IIRC, and zeroing them on the original inode).
> >>
> >>> Truncate is more tricky, because of the truncate shrink/extend
> >>> requirement (that all data is zeroes after extending the inode's size
> >>> via truncate system call), so a shrinking-deferred truncate would have
> >>> to mark all the to-be-deleted extents uninitialized.
> >>
> >> It would be possible to do this for partial truncate/punch as well, to
> >> move whole blocks over to a new inode on the orphan list and zero only
> >> the 1 or 2 partial blocks inline.
> >>
> >> It should even be possible to leverage the "block migrate" facility used
> >> by defrag, so that we don't duplicate this code. That would mean just
> >> allocating a temp "unlink" inode in the kernel and putting it on the
> >> orphan list (like an open-unlinked file), migrating the selected range
> >> of blocks, and then zeroing the blocks in the background before
> >> unlinking the inode.
> >>
> >> I don't think that just marking the deleted extents as uninitialized is
> >> enough, since it would still leave "private" data on disk that could be
> >> read afterward.
> >> This would also only work for extent-mapped filesystems.
> >>
> >> There may need to be some work to enable the migrate code on block-mapped
> >> files, if you want to allow secure-delete on those files, but that is good
> >> IMHO since it also means that we could defrag block-mapped files.
> >>
> >> Cheers, Andreas
> >>
> >
> > Ah, ok then. Yes, part of the requirements was to make secure delete work
> > for partial truncates, punch hole, and also indexed files. So that will
> > save me some time if I can get the migrate routines to work. Thx for the
> > pointers all!
> >
>
> I realized that there is a basic flaw in the concept of deferred-secure-delete.
> From a security point of view, after a crash during a secure-delete,
> if the file is not there, all its data should have been wiped.
> Orphan cleanup on the next mount may be done on a system that
> doesn't respect secure delete.
> So for real security, the unlink/truncate command cannot return before
> all data is wiped.

I agree. I think users who expect secure delete will expect the data to
be completely wiped from disk, rather than wondering when the OS/fs will
actually get rid of the data on the hidden inode via a background thread.
Secure delete should be synchronous.

> The unlink/truncate metadata changes must not even be committed
> before all data is wiped (or at least part of the data with partial truncate).
>
> Amir.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
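
For illustration only, the "synchronous" ordering argued for in this message
(wipe, make the wipe durable, and only then remove the name) can be sketched
in userspace C. This is an analogy under stated assumptions, not the ext4
patch and not a real secure-delete tool: wipe_fd and wipe_then_unlink are
hypothetical helper names, and overwriting a file in place does not by itself
guarantee that the old blocks are gone on copy-on-write or flash-backed
storage. It uses only POSIX calls (fstat, pwrite, fsync, unlink).

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite every byte of an open file with zeroes
 * and flush the result to stable storage. Returns 0 on success, -1 on error. */
static int wipe_fd(int fd)
{
    struct stat st;
    char zeros[4096];
    off_t done = 0;

    if (fstat(fd, &st) != 0)
        return -1;
    memset(zeros, 0, sizeof(zeros));

    while (done < st.st_size) {
        size_t chunk = sizeof(zeros);
        if ((off_t)chunk > st.st_size - done)
            chunk = (size_t)(st.st_size - done);
        ssize_t n = pwrite(fd, zeros, chunk, done);
        if (n <= 0)
            return -1;
        done += n;
    }
    /* The wipe must be durable before the caller removes the name. */
    return fsync(fd);
}

/* Hypothetical helper: the name is removed only after the wipe has
 * succeeded, so the call cannot return success with data left unwiped. */
static int wipe_then_unlink(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    if (wipe_fd(fd) != 0) {
        close(fd);
        return -1;  /* file stays in place; caller can retry */
    }
    if (close(fd) != 0)
        return -1;
    return unlink(path);
}
```

If the wipe fails or the machine crashes partway through, the worst case in
this sketch is a file that still exists with some zeroed data, never a
missing file whose old contents are still on disk.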