Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only)

Alex Elsayed <eternaleye@xxxxxxxxx> · Tue, 03 Feb 2015 08:44:03 -0800

Alexander Holler wrote:

> On 03.02.2015 at 09:51, Alexander Holler wrote:
>> On 03.02.2015 at 08:56, Al Viro wrote:
>>> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:
>>>
>>>>> Charming.  Now, what exactly happens if two such syscalls overlap in
>>>>> time?
>>>>
>>>> What do you think will happen? I assume you haven't looked at how I've
>>>> implemented set_secure_delete(). Charming.
>>>
>>> AFAICS, you get random unlink() happening at the same time hit by that
>>> mess, whether they'd asked for it or not.  What's more, this counter
>>> of yours is *not* guaranteed to be elevated during the final iput() of
>>> the inode you wanted to get - again, ls -lR racing with that syscall can
>>> elevate the refcount of dentry, making d_delete() in vfs_unlink() just
>>> remove that dentry from hash, while keeping it positive.  If dentry
>>> reference grabbed by stat(2) is released after both dput() and iput() in
>>> do_unlinkat(), the final iput() will be done when stat(2) drops its
>>> reference to dentry, triggering immediate dentry_kill() (since dentry
>>> has already been unhashed) and dentry_iput() from it.
>>
>> Thanks for the short explanation. I will see if I can make sense out of
>> it for me to get an idea how to solve that.
>>
>>>
>>> IOW, this counter is both too crude (it's fs-wide, for crying out loud)
>>> *and* not guaranteed to cover enough.  _IF_ you want that behaviour at
>>
>> Sure it is crude.
>>
>> But it keeps the patches simple. As I've written, unlinkat_s() isn't
>> meant for everyday usage, just for the rare case when one really wants
>> to get rid of some contents. Therefore execution speed or an i/o slowdown
>> while the "secure deletion" is in progress is totally ignored.
>>
>> And that "rare case" doesn't include military security levels, it's just
>> meant for ordinary people who want to make it much, much harder for other
>> ordinary people (or geeks or kernel maintainers) to read the deleted
>> content ever again. It's far too easy to use grep or something similar
>> to find seemingly deleted stuff at device level again (after it was
>> deleted by what filesystems are offering nowadays). Especially if one
>> thinks of stuff like certificates and similar which can be identified
>> by common patterns (bit sequences) they use.
> 
> Or to give another more common example: If you delete your contact list,
> I could likely find it again by just searching for 0x6f726956 at the
> device level (assuming you've stored a contact in that list with the same
> surname as yours).
> 
> And, because I've only mentioned it in a different thread, now think about
> the problem that nowadays storage is often fixed (soldered) to devices which
> don't offer a way to delete the whole storage. You might have luck if
> the contact list in question was stored in some encrypted part, but that
> presumes that the key for that encrypted part isn't somehow stored on
> the same device too. Which unfortunately isn't always the case (maybe
> because of usability). And ...
> 
> That's why I think filesystems should offer a way to really delete
> files. Most people would be happy, even if filesystems won't delete
> stuff at military security levels and would disregard all the cases when
> they couldn't make sure that stuff is really deleted.
> 
> To conclude, most people would already be happy if the most trivial case
> would be handled right and not just by marking files as deleted but
> leaving the contents intact.

Well, one other issue is that this only ensures that the extents referenced 
at the time of explicit deletion are wiped.

On COW filesystems this is most obviously insufficient (all the old
unreferenced-after-COW copies will have been left in place rather than
erased in this manner). However, it can still happen on other filesystems
too.

So even just architecturally, this operates at the wrong point in time to
make the guarantees it's claiming. Secure deletion behavior affects the
entire filesystem, because the filesystem may very well make internal
copies, not just explicit user-driven ones.

Because of that, it'd pretty much have to be a mount flag of some variety, 
as far as I understand. When enabled, any time an extent is freed, it must 
be wiped. And even then, a single mount with that option disabled would 
destroy any such guarantees.
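
To make the argument concrete, here is a rough toy model in Python (not
kernel code). Everything in it is an assumption made up for illustration:
ToyDisk, ToyCowFs, the wipe_on_free flag and run_case() are hypothetical
names, and one "block" stands in for one extent. The search pattern is the
0x6f726956 from the quoted mail, read as four little-endian bytes.

```python
# Toy model only: ToyDisk, ToyCowFs and wipe_on_free are hypothetical names,
# not kernel interfaces. One block stands in for one extent.

PATTERN = (0x6f726956).to_bytes(4, "little")  # pattern from the quoted mail
ZERO = b"\0" * 4


class ToyDisk:
    """Persistent state that survives remounts."""

    def __init__(self, nblocks):
        self.blocks = [ZERO] * nblocks
        self.free = list(range(nblocks))
        self.files = {}  # file name -> index of the block currently referenced


class ToyCowFs:
    """One mount of a copy-on-write filesystem on a ToyDisk."""

    def __init__(self, disk, wipe_on_free):
        self.disk = disk
        self.wipe_on_free = wipe_on_free  # the assumed mount flag

    def _release(self, block):
        # Freeing a block only wipes it if this mount has the flag set.
        if self.wipe_on_free:
            self.disk.blocks[block] = ZERO
        self.disk.free.append(block)

    def write(self, name, data):
        # COW: new data always goes to a fresh block, then the old one is freed.
        new = self.disk.free.pop(0)
        self.disk.blocks[new] = data
        old = self.disk.files.get(name)
        self.disk.files[name] = new
        if old is not None:
            self._release(old)

    def unlink_secure(self, name):
        # Models an unlinkat_s()-style call: it can only wipe the block
        # that the file references at the time of deletion.
        block = self.disk.files.pop(name)
        self.disk.blocks[block] = ZERO
        self.disk.free.append(block)


def scan(disk, pattern):
    """Device-level search, like grepping the raw block device."""
    return [i for i, blk in enumerate(disk.blocks) if pattern in blk]


def run_case(flags):
    """Write, overwrite and securely unlink a file across three mounts.

    flags gives wipe_on_free for each of the three mounts in turn.
    Returns the blocks where the original contents can still be found.
    """
    disk = ToyDisk(8)
    ToyCowFs(disk, flags[0]).write("contacts", PATTERN)
    ToyCowFs(disk, flags[1]).write("contacts", b"Holl")
    ToyCowFs(disk, flags[2]).unlink_secure("contacts")
    return scan(disk, PATTERN)


if __name__ == "__main__":
    print(run_case((False, False, False)))  # wipe at explicit deletion only
    print(run_case((True, True, True)))     # wipe on every free
    print(run_case((True, False, True)))    # one mount without the flag
```

In this model the stale copy survives whenever the overwrite happens on a
mount without the flag, regardless of what the final deletion does. A real
filesystem has far more ways to make internal copies than this sketch
covers.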
