Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs

On Tue, Jan 16, 2024 at 11:50:32AM +0100, Christian Brauner wrote:
> Hey,
> 
> I'm not sure this even needs a full LSFMM discussion but since I
> currently don't have time to work on the patch I may as well submit it.
> 
> Gnome recently got awarded 1M Euro by the Sovereign Tech Fund (STF). The
> STF was created by the German government to fund public infrastructure:
> 
> "The Sovereign Tech Fund supports the development, improvement and
>  maintenance of open digital infrastructure. Our goal is to sustainably
>  strengthen the open source ecosystem. We focus on security, resilience,
>  technological diversity, and the people behind the code." (cf. [1])
> 
> Gnome has proposed various specific projects including integrating
> systemd-homed with Gnome. Systemd-homed provides various features and if
> you're interested in details then you might find it useful to read [2].
> It makes use of various new VFS and fs specific developments over the
> last years.
> 
> One feature is encrypting the home directory via LUKS. An appropriate
> image or device must contain a GPT partition table. Currently there's
> only one partition which is a LUKS2 volume. Inside that LUKS2 volume is
> a Linux filesystem. Currently supported are btrfs (see [4] though),
> ext4, and xfs.
> 
> The following issue isn't specific to systemd-homed. Gnome wants to be
> able to support locking encrypted home directories. For example, when
> the laptop is suspended. To do this the luksSuspend command can be used.
> 
> The luksSuspend call is nothing else than a device mapper ioctl to
> suspend the block device and its owning superblock/filesystem. Which in
> turn is nothing but a freeze initiated from the block layer:
> 
> dm_suspend()
> -> __dm_suspend()
>    -> lock_fs()
>       -> bdev_freeze()
> 
> So when we say luksSuspend we really mean block layer initiated freeze.
> The overall goal or expectation of userspace is that after a luksSuspend
> call all sensitive material has been evicted from relevant caches to
> harden against various attacks. And luksSuspend does wipe the encryption
> key and suspend the block device. However, the encryption key can still
> be available clear-text in the page cache.

The wiping of secrets is completely orthogonal to the freezing of
the device and filesystem - the freeze does not need to occur to
allow the encryption keys and decrypted data to be purged. They
should not be conflated; purging needs to be a completely separate
operation that can be run regardless of device/fs freeze status.

FWIW, focussing on purging the page cache omits the fact that
having access to the directory structure is a problem - one can
still retrieve other user information that is stored in metadata
(e.g. xattrs) that isn't part of the page cache. Even the directory
structure that is cached in dentries could reveal secrets someone
wants to keep hidden (e.g. code names for operations/products).

So if we want luksSuspend to actually protect user information when
it runs, then it effectively needs to bring the filesystem right
back to its "just mounted" state where the only thing in memory is
the root directory dentry and inode and nothing else.

And, of course, this is largely impossible to do because anything
with an open file on the filesystem will prevent this robust cache
purge from occurring....

Which brings us back to "best effort" only, and at this point we
already have drop-caches....

Mind you, I do wonder if drop caches is fast enough for this sort of
use case. It is single threaded, and if the filesystem/system has
millions of cached inodes it can take minutes to run. Unmount has
the same problem - purging large dentry/inode caches takes a *lot*
of CPU time and these operations are single threaded.
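For anyone who wants to measure this on their own machine, here is a
minimal sketch of timing the existing system-wide knob,
/proc/sys/vm/drop_caches. Writing 1 drops the page cache, 2 drops
reclaimable slab objects such as dentries and inodes, and 3 drops both.
The helper name timed_drop_caches and its path parameter are made up
for illustration; the real file needs root, it acts on every
filesystem rather than one, and it only drops clean objects, hence the
sync first.

```python
import os
import time

# The real knob; writing to it requires root.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def timed_drop_caches(mode=3, path=DROP_CACHES):
    """Write `mode` to the drop_caches file and return elapsed seconds.

    `path` is a parameter only so the helper can be exercised against
    an ordinary file; it is a hypothetical convenience, not a kernel
    interface.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    # Dirty objects are not freeable, so flush them to disk first.
    os.sync()
    start = time.monotonic()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"drop_caches took {timed_drop_caches(3):.2f}s")
```

The write does not return until the kernel has finished walking the
caches, so the elapsed time is a rough measure of the single-threaded
purge cost discussed above.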

So it may not be practical to purge caches in the LUKS context:
suspending a laptop shouldn't take minutes. Laptops are getting to
the hundreds of GB of RAM these days, so they can cache millions of
inodes, and cache purge runtime is definitely a consideration here.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



