Hello Mimi,
Thanks for your feedback on this.
On 8/11/2023 6:14 AM, Mimi Zohar wrote:
Hi Sush, Tushar,
On Tue, 2023-08-01 at 12:12 -0700, Sush Shringarputale wrote:
================================================
| A. Problem Statement |
================================================
Depending on the IMA policy, the IMA log can consume a lot of Kernel memory on
the device. For instance, the events for the following IMA policy entries may
need to be measured in certain scenarios, but they can also lead to a verbose
IMA log when the device is running for a long period of time.
┌───────────────────────────────────────┐
│# PROC_SUPER_MAGIC │
│measure fsmagic=0x9fa0 │
│# SYSFS_MAGIC │
│measure fsmagic=0x62656572 │
│# DEBUGFS_MAGIC │
│measure fsmagic=0x64626720 │
│# TMPFS_MAGIC │
│measure fsmagic=0x01021994 │
│# RAMFS_MAGIC │
│measure fsmagic=0x858458f6 │
│# SECURITYFS_MAGIC │
│measure fsmagic=0x73636673 │
│# OVERLAYFS_MAGIC │
│measure fsmagic=0x794c7630 │
│# log, audit or tmp files │
│measure obj_type=var_log_t │
│measure obj_type=auditd_log_t │
│measure obj_type=tmp_t │
└───────────────────────────────────────┘
Secondly, certain devices are configured to take Kernel updates using Kexec
soft-boot. The IMA log from the previous Kernel gets carried over and the
Kernel memory consumption problem worsens when such devices undergo multiple
Kexec soft-boots over a long period of time.
The above two scenarios can cause IMA log to grow and consume Kernel memory.
In addition, a large IMA log can add pressure on the network bandwidth when
the attestation client sends it to remote-attestation-service.
Truncating the IMA log to reclaim memory is not feasible, since it makes the
log go out of sync with the TPM PCR quote, making remote attestation fail.
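To make the out-of-sync problem concrete, here is a minimal sketch of how a verifier replays a measurement log against a PCR. It assumes a single SHA-256 PCR bank that starts at all zeroes and treats each log entry as an opaque template digest. The function names and the four-event log are illustrative only and are not part of IMA or any TPM library.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for the SHA-256 bank


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM-style extend: new PCR = H(old PCR || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay(digests, start=bytes(PCR_SIZE)):
    """Replay a list of template digests, as a verifier would."""
    pcr = start
    for d in digests:
        pcr = extend(pcr, d)
    return pcr


# Hypothetical log: four events, each represented by its template digest.
log = [hashlib.sha256(f"event-{i}".encode()).digest() for i in range(4)]

quoted_pcr = replay(log)      # value the TPM holds after all four events
truncated = replay(log[2:])   # verifier only sees the last two entries
# Replay resumed from the PCR value as it stood at the cut point.
resumed = replay(log[2:], start=replay(log[:2]))

print(truncated == quoted_pcr)  # False
print(resumed == quoted_pcr)    # True
```

Replaying only the surviving entries from a zeroed PCR cannot reproduce the quoted value. Resuming from the PCR value at the cut point does, which is why any truncation scheme has to preserve that state somewhere.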
A sophisticated solution is required which will help relieve the memory
pressure on the device and continue supporting remote attestation without
disruptions.
If the problem is kernel memory, then using a single tmpfs file has
already been proposed [1]. As entries are added to the measurement
list, they are copied to the tmpfs file and removed from kernel memory.
Userspace would still access the measurement list via the existing
securityfs file.
The IMA measurement list is a sequential file, allowing it to be read
from an offset. How much or how little of the measurement list is read
by the attestation client and sent to the attestation server is up to
the attestation client/server.
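As a rough illustration of reading from an offset, an attestation client could remember how far it has read and fetch only what was appended since. The sketch below is an assumption about how such a client might look, not existing client code. It is demonstrated on an ordinary temporary file standing in for the securityfs measurement list (for example /sys/kernel/security/ima/binary_runtime_measurements), and it assumes byte-offset seeks on the securityfs file behave as they do on a regular file.

```python
import os
import tempfile
from typing import Tuple


def read_new_entries(path: str, offset: int) -> Tuple[bytes, int]:
    """Return the bytes appended since `offset` and the updated offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)


# Stand-in for the measurement list; the real file needs root to read.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"entry-1\nentry-2\n")
    demo_path = tmp.name

first, offset = read_new_entries(demo_path, 0)

# Simulate the kernel appending one more measurement.
with open(demo_path, "ab") as f:
    f.write(b"entry-3\n")

second, offset = read_new_entries(demo_path, offset)
os.unlink(demo_path)

print(second)  # b'entry-3\n'
```

The client only has to persist the offset between attestation requests. How much of the list it forwards to the server remains a client/server decision.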
If the problem is not kernel memory, but memory pressure in general,
then instead of a tmpfs file, the measurement list could similarly be
copied to a single persistent file [1].
The suggested approach in this RFC discussion using a vfs_tmpfile was
only discussed but no prototype was created back then. We are
discussing the approach internally now and will respond with more
details about it.
-------------------------------------------------------------------------------
================================================
| B. Proposed Solution |
================================================
In this document, we propose an enhancement to the IMA subsystem to improve
the long-running performance by snapshotting the IMA log, while still
providing mechanisms to verify its integrity using the PCR quotes.
The remainder of the document describes details of the proposed solution in
the following sub-sections.
- High-level Work-flow
- Snapshot Triggering Mechanism
- Design Choices for Storing Snapshots
- Attestation-Client and Remote-Attestation-Service Side Changes
- Example Walk-through
- Open Questions
-------------------------------------------------------------------------------
================================================
| B.1 High-level Work-flow |
================================================
Pre-requisites:
- IMA Integrity guarantees are maintained.
The proposed high level work-flow of IMA log snapshotting is as follows:
- A user-mode process will trigger the snapshot by opening a file in SysFS
say /sys/kernel/security/ima/snapshot (referred to as sysk_ima_snapshot_file
here onwards).
Please fix the mailer so that it doesn't wrap sentences. Adding blank
lines between bullets would improve readability.
Noted, will do.
- The Kernel will get the current TPM PCR values and PCR update counter [2]
and store them as template data in a new IMA event "snapshot_aggregate".
This event will be measured by IMA using critical data measurement
functionality [1]. Recording regular IMA events will be paused while
"snapshot_aggregate" is being computed using the existing IMA mutex lock.
- Once the "snapshot_aggregate" is computed and measured in IMA log, the prior
IMA events will be made available in the sysk_ima_snapshot_file.
- The UM process will copy those IMA events from sysk_ima_snapshot_file to a
snapshot file on disk chosen by UM (referred to as UM_snapshot_file here
onwards). The location, file-system type, access permissions etc. of the
UM_snapshot_file would be controlled by UM process itself.
- Once UM is done copying the IMA events from sysk_ima_snapshot_file to
UM_snapshot_file, it will indicate to the Kernel that the snapshot can be
finalized by triggering a write with any data to the sysk_ima_snapshot_file.
UM process cannot prevent the IMA log purge operation after this point.
- The Kernel will truncate the current IMA log and clear the HTable up to the
"snapshot_aggregate" marker.
- The Kernel will measure the PCR update counter as part of measuring
snapshot_aggregate, so that it can be used by the remote attestation service
for detecting missing events.
- UM can prevent the IMA log purge by closing the sysk_ima_snapshot_file
without performing a write operation on it. In this case, while the
"snapshot_aggregate" marker may still be in the log, the event can be ignored
since the previous entries in the IMA log will not be purged.
Note:
- This work-flow should work when interleaved with Kexec 'load' and 'execute'
events and should not cause IMA log + snapshot to go out of sync with PCR
quotes. The implementation details are omitted from this document for
brevity.
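The user-mode side of the work-flow above could look roughly like the sketch below. This is only an illustration of the proposed open / copy / write-to-finalize sequence. The snapshot file does not exist in any kernel today, and take_snapshot with its parameters is a hypothetical helper, not part of any existing tool.

```python
import os
import shutil


def take_snapshot(kernel_path: str, um_snapshot_path: str,
                  finalize: bool = True) -> int:
    """Copy pending IMA events to disk, then optionally request the purge.

    kernel_path stands for sysk_ima_snapshot_file and um_snapshot_path
    for UM_snapshot_file. Returns the number of bytes copied.
    """
    # In the proposal, opening the file is what triggers the snapshot.
    with open(kernel_path, "r+b", buffering=0) as ksnap:
        with open(um_snapshot_path, "wb") as out:
            shutil.copyfileobj(ksnap, out)
            out.flush()
            os.fsync(out.fileno())  # make the copy durable before purging
            copied = out.tell()
        if finalize:
            # A write with any data tells the Kernel it may finalize.
            ksnap.write(b"1")
        # Closing without a write leaves the in-kernel log untouched.
    return copied
```

A caller would pass the proposed /sys/kernel/security/ima/snapshot path and a destination of its own choosing. Syncing the copy before the write matters because, per the work-flow, UM cannot stop the purge once the write has been issued.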
This design seems overly complex and requires synchronization between
the "snapshot" record and exporting the records from the measurement
list. None of this would be necessary if the measurements were copied
from kernel memory to a backing file (e.g. tmpfs), as described in [1].
What is the real problem - kernel memory pressure, memory pressure in
general, or disk space? Is the intention to remove or offload the
exported measurements?
The main concern is the memory pressure on both the kernel and the attestation
client when it sends the request. The concern you bring up is valid and we are
working on creating a prototype. There is no intention to remove the exported
measurements.
- Sush
Concerns:
- Pausing extending the measurement list.
[1]
https://lore.kernel.org/linux-integrity/CAOQ4uxj4Pv2Wr1wgvBCDR-tnA5dsZT3rvdDzKgAH1aEV_-r9Qg@xxxxxxxxxxxxxx/#t