Re: [PATCH 1/3] ima: keep the integrity state of open files up to date

Janne Karhunen <janne.karhunen@xxxxxxxxx> · Tue, 17 Sep 2019 10:24:45 +0300

On Tue, Sep 17, 2019 at 7:23 AM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:

> > > Who will use it if it isn't 100% safe?
> >
> > I suppose anyone using mutable data with IMA appraise should, unless
> > they have a redundant power supply and a kernel that never crashes. In
> > a way this is like asking if the ima-appraise should be there for
> > mutable data at all. All this is doing is that it improves the crash
> > recovery reliability without taking anything away.
>
> Okay, so why would anyone use mutable data with IMA appraise if it corrupts your
> files by design, both with and without this patchset?

Now you are exaggerating heavily: it does not corrupt your files by
design. A crash in any security related system is supposed to be
pretty rare occurrence.

> > Anyway, I think I'm getting along with my understanding of the page
> > writeback slowly and the journal support will eventually be there at
> > least as an add-on patch for those that want to use it and really need
> > the last 0.n% reliability. Note that even without that patch you can
> > build ima-appraise based systems that are 99.999% reliable just by
>
> On what storage devices, workloads, and filesystems is this number for?

I reached 99.2% recovery rate with the AOSP without touching the
android on top by crashing the kernel with a test case while the
device was in use. 80% if I crash it while the device is in the
busiest write cycle (the first boot, I guess we would suck quite
royally if we never made past this point without dying).

99.95+% of course requires a high-availability system that probably
crashes once per year at best and recovers in seconds. In that case
this will recover it with pretty high odds, so reliability is not all
that much reduced from it's normal reliability statistics. So, the
ima-appraise for the mutable data could be in use even in a
high-availability system. 99% recovery probability for the crash that
occurs once per year would be OK; 0% would not be. I suppose it all
depends on your requirements.

> > having the patch we're discussing here. Without it you would be orders
> > of magnitude worse off. All we are doing is that we give it a fairly
> > good chance to recover instead of giving up without even trying.
> >
> > That said, I'm not sure the 100% crash recovery is ever guaranteed in
> > any Linux system. We just have to do what we can, no?
>
> Filesystems implement consistency mechanisms, e.g. journalling or copy-on-write,
> to recover from crashes by design.  This patchset doesn't implement or use any
> such mechanism, so it's not crash-safe.  It's not clear that it's even a step in
> the right direction, as no patches have been proposed for a correct solution so
> we can see what it actually involves.

Great, what would be the better alternative? I guess the suggestion
cannot be that 'don't use it' since the code is there?

As for the 'step to the right direction': before we could talk about
any of this journaling stuff we had to make sure that we have the
plumbing where the measurements are accurate. These patches do that
and the journaling is the next step. All the journaling add-on does
now is that it binds the page write and the xattr update into one
transaction, so both of those run as sub-transactions of one master.
Now, only when the master ends the data is moved out of the journal in
one bundle. All this is so ridiculously simple I doubt my own eyes,
but it seems to work fine apart from some slowdown on shutdown when
processes call sync() like there is no tomorrow. Nevertheless,
understanding the related code (the page writeback and the ext4) is
pretty nasty and there are lots of things I need to understand about
that still. The thing I'm currently trying to get my head around is
that whether or not it is possible that we have a measurement over a
page that was not eligible for the writeback. I'm also no ext4 expert
so all help in that regard is highly appreciated if this type of thing
is interesting to others.

Anyway, all this is good info. If this code is not needed upstream,
I'm happy to stop working with it and will maintain this for my use
only. Let me know,

--
Janne