Re: [PATCH v6 1/4] IMA: Add func to measure LSM state and policy

John Johansen <john.johansen@xxxxxxxxxxxxx> · Wed, 5 Aug 2020 09:45:33 -0700

On 8/5/20 8:43 AM, Stephen Smalley wrote:
> On 8/5/20 11:07 AM, Tyler Hicks wrote:
> 
>> On 2020-08-05 10:27:43, Stephen Smalley wrote:
>>> On Wed, Aug 5, 2020 at 9:20 AM Mimi Zohar <zohar@xxxxxxxxxxxxx> wrote:
>>>> On Wed, 2020-08-05 at 09:03 -0400, Stephen Smalley wrote:
>>>>> On Wed, Aug 5, 2020 at 8:57 AM Mimi Zohar <zohar@xxxxxxxxxxxxx> wrote:
>>>>>> On Wed, 2020-08-05 at 08:46 -0400, Stephen Smalley wrote:
>>>>>>> On 8/4/20 11:25 PM, Mimi Zohar wrote:
>>>>>>>
>>>>>>>> Hi Lakshmi,
>>>>>>>>
>>>>>>>> There's still  a number of other patch sets needing to be reviewed
>>>>>>>> before my getting to this one.  The comment below is from a high level.
>>>>>>>>
>>>>>>>> On Tue, 2020-08-04 at 17:43 -0700, Lakshmi Ramasubramanian wrote:
>>>>>>>>> Critical data structures of security modules need to be measured to
>>>>>>>>> enable an attestation service to verify if the configuration and
>>>>>>>>> policies for the security modules have been setup correctly and
>>>>>>>>> that they haven't been tampered with at runtime. A new IMA policy is
>>>>>>>>> required for handling this measurement.
>>>>>>>>>
>>>>>>>>> Define two new IMA policy func namely LSM_STATE and LSM_POLICY to
>>>>>>>>> measure the state and the policy provided by the security modules.
>>>>>>>>> Update ima_match_rules() and ima_validate_rule() to check for
>>>>>>>>> the new func and ima_parse_rule() to handle the new func.
>>>>>>>> I can understand wanting to measure the in kernel LSM memory state to
>>>>>>>> make sure it hasn't changed, but policies are stored as files.  Buffer
>>>>>>>> measurements should be limited  to those things that are not files.
>>>>>>>>
>>>>>>>> Changing how data is passed to the kernel has been happening for a
>>>>>>>> while.  For example, instead of passing the kernel module or kernel
>>>>>>>> image in a buffer, the new syscalls - finit_module, kexec_file_load -
>>>>>>>> pass an open file descriptor.  Similarly, instead of loading the IMA
>>>>>>>> policy data, a pathname may be provided.
>>>>>>>>
>>>>>>>> Pre and post security hooks already exist for reading files.   Instead
>>>>>>>> of adding IMA support for measuring the policy file data, update the
>>>>>>>> mechanism for loading the LSM policy.  Then not only will you be able
>>>>>>>> to measure the policy, you'll also be able to require the policy be
>>>>>>>> signed.
>>>>>>> To clarify, the policy being measured by this patch series is a
>>>>>>> serialized representation of the in-memory policy data structures being
>>>>>>> enforced by SELinux.  Not the file that was loaded.  Hence, this
>>>>>>> measurement would detect tampering with the in-memory policy data
>>>>>>> structures after the policy has been loaded.  In the case of SELinux,
>>>>>>> one can read this serialized representation via /sys/fs/selinux/policy.
>>>>>>> The result is not byte-for-byte identical to the policy file that was
>>>>>>> loaded but can be semantically compared via sediff and other tools to
>>>>>>> determine whether it is equivalent.
>>>>>> Thank you for the clarification.   Could the policy hash be included
>>>>>> with the other critical data?  Does it really need to be measured
>>>>>> independently?
>>>>> They were split into two separate functions because we wanted to be
>>>>> able to support using different templates for them (ima-buf for the
>>>>> state variables so that the measurement includes the original buffer,
>>>>> which is small and relatively fixed-size, and ima-ng for the policy
>>>>> because it is large and we just want to capture the hash for later
>>>>> comparison against known-good).  Also, the state variables are
>>>>> available for measurement always from early initialization, whereas
>>>>> the policy is only available for measurement once we have loaded an
>>>>> initial policy.
>>>> Ok, measuring the policy separately from other critical data makes
>>>> sense.  Instead of measuring the policy, which is large, measure the
>>>> policy hash.
>>> I think that was the original approach.  However, I had concerns with
>>> adding code to SELinux to compute a hash over the policy versus
>>> leaving that to IMA's existing policy and mechanism.  If that's
>>> preferred I guess we can do it that way but seems less flexible and
>>> duplicative.
>> In AppArmor, we store the sha1 of the raw policy as the policy is
>> loaded. The hash is exposed to userspace in apparmorfs. See commit
>> 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre
>> internal transform").
>>
>> It has proved useful as a mechanism for debugging as sometimes the
>> on-disk policy doesn't match the loaded policy and this can be a good
>> way to check that while providing support to users. John also mentions
>> checkpoint/restore in the commit message and I could certainly see how
>> the policy hashes would be useful in that scenario.
>>
>> When thinking through how Lakshmi's series could be extended for
>> AppArmor support, I was thinking that the AppArmor policy measurement
>> would be a measurement of these hashes that we already have in place.
>>
>> Perhaps there's some general usefulness in storing/exposing an SELinux
>> policy hash rather than only seeing it as duplicative property required
>> this measurement series?
> 
> That would be a hash of the policy file that was last loaded via the selinuxfs interface for loading policy, not a hash of the in-memory policy data structures at the time of measurement (which is what this patch series is implementing).  The duplicative part is with respect to selecting a hash algorithm and hashing the in-memory policy as part of the SELinux code rather than just passing the policy buffer to IMA for measurement like any other buffer.  Userspace can already hash the in-memory policy data itself by running sha256sum or whatever on /sys/fs/selinux/policy, so we don't need to save or expose that separately.
> 
> 

Yeah, AppArmor exposes the full loaded policy data, so userspace could hash it independently too. The hashing done by the kernel just reduces the amount of data that userspace has to pull down, provided it trusts the kernel to do the hash. Those hashes are also used internally by AppArmor for the first part of a dedup check, so exposing them costs very little.
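
For illustration only, the userspace side of "hash the exposed policy yourself" might look like the sketch below. The helper name hash_file is made up for this example, and the path in the trailing comment is simply the selinuxfs node Stephen mentioned; any policy file exported by an LSM would be handled the same way.

```python
import hashlib


def hash_file(path, algo="sha256", chunk_size=65536):
    """Stream a file through hashlib and return the hex digest.

    hash_file is a hypothetical helper, not part of any existing tool.
    Reading in chunks avoids holding a large policy blob in memory.
    """
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


# Example use on an SELinux system (typically needs privilege):
#   hash_file("/sys/fs/selinux/policy")
```

The trade-off is the one described above: userspace has to read the whole policy to get this digest, whereas a kernel-provided hash is a few bytes.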

The hashing of the in-memory data structures and variables is something we are not currently doing. If we were to do it, hashing the in-memory AppArmor policy would be quite involved, and I would rather have LSMs export an interface for it than have IMA poke directly at the data structures (i.e. AppArmor-specific code stays in AppArmor).

As for computing a measurement based on the hash instead of the in-memory policy: while quicker, that would not detect memory corruption or attacks that manage to modify policy by writing kernel memory. Whether that type of measurement is sufficient depends on what you are trying to achieve.
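
A toy userspace model of that difference, purely as a sketch: ToyLSM, serialize and the JSON "policy" are all invented for illustration and bear no relation to how AppArmor or SELinux actually store policy. It only shows that a hash recorded at load time stays the same when the in-memory structures are modified afterwards, while a hash over a fresh serialization of those structures changes.

```python
import hashlib
import json


def serialize(policy):
    # Stand-in for serializing in-memory policy structures (hypothetical).
    return json.dumps(policy, sort_keys=True).encode()


class ToyLSM:
    """Invented model: keeps a load-time hash next to the live policy."""

    def __init__(self):
        self.policy = None
        self.load_hash = None

    def load(self, raw):
        # Hash of the raw blob, computed once when the policy is loaded.
        self.load_hash = hashlib.sha256(raw).hexdigest()
        self.policy = json.loads(raw)

    def measure_load_hash(self):
        # Cheap: just reports the stored hash.
        return self.load_hash

    def measure_in_memory(self):
        # More work: re-serialize the live structures and hash the result.
        # This need not equal the load-time hash even when untampered,
        # since the serialization may differ from the loaded blob.
        return hashlib.sha256(serialize(self.policy)).hexdigest()


lsm = ToyLSM()
lsm.load(b'{"allow": ["a_t b_t:file read"]}')
before = lsm.measure_in_memory()

# Simulate an attacker writing to the in-memory policy after load.
lsm.policy["allow"].append("evil_t b_t:file write")

stale = lsm.measure_load_hash()   # unchanged, tampering goes unnoticed
after = lsm.measure_in_memory()   # differs from `before`
```

In the model, comparing `after` against `before` reveals the change, while `stale` still matches whatever was recorded at load time.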