Re: [PATCH 10/17] prmem: documentation

Nadav Amit <nadav.amit@xxxxxxxxx> · Tue, 13 Nov 2018 10:06:59 -0800

From: Andy Lutomirski
Sent: November 13, 2018 at 5:47:16 PM GMT
> To: Nadav Amit <nadav.amit@xxxxxxxxx>
> Cc: Igor Stoppa <igor.stoppa@xxxxxxxxx>, Kees Cook <keescook@xxxxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Mimi Zohar <zohar@xxxxxxxxxxxxxxxxxx>, Matthew Wilcox <willy@xxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, James Morris <jmorris@xxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxxxx>, Kernel Hardening <kernel-hardening@xxxxxxxxxxxxxxxxxx>, linux-integrity <linux-integrity@xxxxxxxxxxxxxxx>, LSM List <linux-security-module@xxxxxxxxxxxxxxx>, Igor Stoppa <igor.stoppa@xxxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, Jonathan Corbet <corbet@xxxxxxx>, Laura Abbott <labbott@xxxxxxxxxx>, Randy Dunlap <rdunlap@xxxxxxxxxxxxx>, Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>, open list:DOCUMENTATION <linux-doc@xxxxxxxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Subject: Re: [PATCH 10/17] prmem: documentation
> 
> 
> On Tue, Nov 13, 2018 at 9:43 AM Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>> From: Andy Lutomirski
>> Sent: November 13, 2018 at 5:16:09 PM GMT
>>> To: Igor Stoppa <igor.stoppa@xxxxxxxxx>
>>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Nadav Amit <nadav.amit@xxxxxxxxx>, Mimi Zohar <zohar@xxxxxxxxxxxxxxxxxx>, Matthew Wilcox <willy@xxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, James Morris <jmorris@xxxxxxxxx>, Michal Hocko <mhocko@xxxxxxxxxx>, Kernel Hardening <kernel-hardening@xxxxxxxxxxxxxxxxxx>, linux-integrity <linux-integrity@xxxxxxxxxxxxxxx>, LSM List <linux-security-module@xxxxxxxxxxxxxxx>, Igor Stoppa <igor.stoppa@xxxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, Jonathan Corbet <corbet@xxxxxxx>, Laura Abbott <labbott@xxxxxxxxxx>, Randy Dunlap <rdunlap@xxxxxxxxxxxxx>, Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>, open list:DOCUMENTATION <linux-doc@xxxxxxxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>>> Subject: Re: [PATCH 10/17] prmem: documentation
>>> 
>>> 
>>> On Tue, Nov 13, 2018 at 6:25 AM Igor Stoppa <igor.stoppa@xxxxxxxxx> wrote:
>>>> Hello,
>>>> I've been studying v4 of the patch-set [1] that Nadav has been working on.
>>>> Incidentally, I think it would be useful to cc also the
>>>> security/hardening ml.
>>>> The patch-set seems to be close to final, so I am resuming this discussion.
>>>> 
>>>> On 30/10/2018 19:06, Andy Lutomirski wrote:
>>>> 
>>>>> I support the addition of a rare-write mechanism to the upstream kernel.  And I think that there is only one sane way to implement it: using an mm_struct. That mm_struct, just like any sane mm_struct, should only differ from init_mm in that it has extra mappings in the *user* region.
>>>> 
>>>> After reading the code, I see what you meant.
>>>> I think I can work with it.
>>>> 
>>>> But I have a couple of questions wrt the use of this mechanism, in the
>>>> context of write rare.
>>>> 
>>>> 
>>>> 1) mm_struct.
>>>> 
>>>> Iiuc, the purpose of the patchset is mostly (only?) to patch kernel code
>>>> (live patch?), which seems to happen sequentially and in a relatively
>>>> standardized way, like replacing the NOPs specifically placed in the
>>>> functions that need patching.
>>>> 
>>>> This is a bit different from the more generic write-rare case, applied
>>>> to data.
>>>> 
>>>> As example, I have in mind a system where both IMA and SELinux are in use.
>>>> 
>>>> In this system, a file is accessed for the first time.
>>>> 
>>>> That would trigger 2 things:
>>>> - evaluation of the SELinux rules and probably update of the AVC cache
>>>> - IMA measurement and update of the measurements
>>>> 
>>>> Both of them could be write protected, meaning that they would both have
>>>> to be modified through the write rare mechanism.
>>>> 
>>>> While the events, for 1 specific file, would be sequential, it's not
>>>> difficult to imagine that multiple files could be accessed at the same time.
>>>> 
>>>> If the update of the data structures in both IMA and SELinux must use
>>>> the same mm_struct, that would have to be somehow regulated and it would
>>>> introduce an unnecessary (imho) dependency.
>>>> 
>>>> How about having one mm_struct for each writer (core or thread)?
>>> 
>>> I don't think that helps anything.  I think the mm_struct used for
>>> prmem (or rare_write or whatever you want to call it) should be
>>> entirely abstracted away by an appropriate API, so neither SELinux nor
>>> IMA need to be aware that there's an mm_struct involved.  It's also
>>> entirely possible that some architectures won't even use an mm_struct
>>> behind the scenes -- x86, for example, could have avoided it if there
>>> were a kernel equivalent of PKRU.  Sadly, there isn't.
>>> 
>>>> 2) Iiuc, the purpose of the 2 pages being remapped is that the target of
>>>> the patch might spill across the page boundary, however if I deal with
>>>> the modification of generic data, I shouldn't (shouldn't I?) assume that
>>>> the data will not span across multiple pages.
>>> 
>>> The reason for the particular architecture of text_poke() is to avoid
>>> memory allocation to get it working.  i think that prmem/rare_write
>>> should have each rare-writable kernel address map to a unique user
>>> address, possibly just by offsetting everything by a constant.  For
>>> rare_write, you don't actually need it to work as such until fairly
>>> late in boot, since the rare_writable data will just be writable early
>>> on.
>>> 
>>>> If the data spans across multiple pages, in unknown amount, I suppose
>>>> that I should not keep interrupts disabled for an unknown time, as it
>>>> would hurt preemption.
>>>> 
>>>> What I thought, in my initial patch-set, was to iterate over each page
>>>> that must be written to, in a loop, re-enabling interrupts in-between
>>>> iterations, to give pending interrupts a chance to be served.
>>>> 
>>>> This would mean that the data being written to would not be consistent,
>>>> but it's a problem that would have to be addressed anyways, since it can
>>>> be still read by other cores, while the write is ongoing.
>>> 
>>> This probably makes sense, except that enabling and disabling
>>> interrupts means you also need to restore the original mm_struct (most
>>> likely), which is slow.  I don't think there's a generic way to check
>>> whether in interrupt is pending without turning interrupts on.
>> 
>> I guess that enabling IRQs might break some hidden assumptions in the code,
>> but is there a fundamental reason that IRQs need to be disabled? use_mm()
>> got them enabled, although it is only suitable for kernel threads.
> 
> For general rare-writish stuff, I don't think we want IRQs running
> with them mapped anywhere for write.  For AVC and IMA, I'm less sure.

Oh.. Of course. It is sort of a measure to prevent a single malicious/faulty
write from corrupting the sensitive memory. Doing so limits the code that
can compromise the security by a write to the protected data-structures
(rephrasing for myself).

I think I should add it as a comment in your patch.