[RFC] IMA Log Snapshotting Design Proposal

Mimi Zohar zohar at linux.ibm.com
Fri Aug 11 06:14:16 PDT 2023


Hi Sush, Tushar,

On Tue, 2023-08-01 at 12:12 -0700, Sush Shringarputale wrote:
> ================================================
> | A. Problem Statement                         |
> ================================================
> Depending on the IMA policy, the IMA log can consume a lot of Kernel memory on
> the device.  For instance, the events for the following IMA policy entries may
> need to be measured in certain scenarios, but they can also lead to a verbose
> IMA log when the device is running for a long period of time.
> ┌───────────────────────────────────────┐
> │# PROC_SUPER_MAGIC                     │
> │measure fsmagic=0x9fa0                 │
> │# SYSFS_MAGIC                          │
> │measure fsmagic=0x62656572             │
> │# DEBUGFS_MAGIC                        │
> │measure fsmagic=0x64626720             │
> │# TMPFS_MAGIC                          │
> │measure fsmagic=0x01021994             │
> │# RAMFS_MAGIC                          │
> │measure fsmagic=0x858458f6             │
> │# SECURITYFS_MAGIC                     │
> │measure fsmagic=0x73636673             │
> │# OVERLAYFS_MAGIC                      │
> │measure fsmagic=0x794c7630             │
> │# log, audit or tmp files              │
> │measure obj_type=var_log_t             │
> │measure obj_type=auditd_log_t          │
> │measure obj_type=tmp_t                 │
> └───────────────────────────────────────┘
> 
> Secondly, certain devices are configured to take Kernel updates using Kexec
> soft-boot.  The IMA log from the previous Kernel gets carried over and the
> Kernel memory consumption problem worsens when such devices undergo multiple
> Kexec soft-boots over a long period of time.
> 
> The above two scenarios can cause IMA log to grow and consume Kernel memory.
> 
> In addition, a large IMA log can add pressure on the network bandwidth when
> the attestation client sends it to remote-attestation-service.
> 
> Truncating IMA log to reclaim memory is not feasible, since it makes the log go
> out of sync with the TPM PCR quote making remote attestation fail.
> 
> A sophisticated solution is required which will help relieve the memory
> pressure on the device and continue supporting remote attestation without
> disruptions.
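
The out-of-sync problem described in the quoted text can be illustrated
with a simplified model of PCR extension, PCR_new = H(PCR_old || digest).
This is only a sketch: it assumes a single SHA-256 bank starting from all
zeroes and uses made-up event digests, whereas real IMA entries carry
template digests per bank.  The last assertion, which resumes the replay
from a recorded intermediate PCR value, is an assumption about how a
verifier might use such a value, not something specified in the proposal.

```python
import hashlib

def extend(pcr, digest):
    """Model of a TPM extend: new PCR = SHA-256(old PCR || digest)."""
    return hashlib.sha256(pcr + digest).digest()

def replay(digests, start=bytes(32)):
    """Replay a list of event digests on top of a starting PCR value."""
    pcr = start
    for d in digests:
        pcr = extend(pcr, d)
    return pcr

# Hypothetical event digests standing in for measurement list entries.
events = [hashlib.sha256(name).digest() for name in (b"a", b"b", b"c", b"d")]
quoted_pcr = replay(events)

# Dropping the first two entries breaks the replay against the quote ...
truncated = events[2:]
assert replay(truncated) != quoted_pcr

# ... unless the PCR value at the truncation point is known to the verifier.
pcr_at_cut = replay(events[:2])
assert replay(truncated, start=pcr_at_cut) == quoted_pcr
```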

If the problem is kernel memory, then using a single tmpfs file has
already been proposed [1].  As entries are added to the measurement
list, they are copied to the tmpfs file and removed from kernel memory.
Userspace would still access the measurement list via the existing
securityfs file.

The IMA measurement list is a sequential file, allowing it to be read
from an offset.  How much or how little of the measurement list is read
by the attestation client and sent to the attestation server is up to
the attestation client/server.
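
Reading from an offset could look roughly like the sketch below.  The
helper name read_new_entries is hypothetical, and the sketch assumes the
client remembers the offset returned by the previous call and that the
file it reads only ever grows.

```python
def read_new_entries(path, offset):
    """Return (data, new_offset) for everything past `offset` in `path`."""
    with open(path, "rb") as f:
        f.seek(offset)      # skip the part that was already sent
        data = f.read()     # read only the entries added since then
    return data, offset + len(data)
```

An attestation client would typically call this on the securityfs
measurement list, e.g.
/sys/kernel/security/ima/ascii_runtime_measurements, and send only the
returned data to the server.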

If the problem is not kernel memory, but memory pressure in general,
then instead of a tmpfs file, the measurement list could similarly be
copied to a single persistent file [1].

> 
> -------------------------------------------------------------------------------
> ================================================
> | B. Proposed Solution                         |
> ================================================
> In this document, we propose an enhancement to the IMA subsystem to improve
> the long-running performance by snapshotting the IMA log, while still
> providing mechanisms to verify its integrity using the PCR quotes.
> 
> The remainder of the document describes details of the proposed solution in the
> following sub-sections.
>   - High-level Work-flow
>   - Snapshot Triggering Mechanism
>   - Design Choices for Storing Snapshots
>   - Attestation-Client and Remote-Attestation-Service Side Changes
>   - Example Walk-through
>   - Open Questions
> -------------------------------------------------------------------------------
> ================================================
> | B.1 High-level Work-flow                     |
> ================================================
> Pre-requisites:
> - IMA Integrity guarantees are maintained.
> 
> The proposed high level work-flow of IMA log snapshotting is as follows:
> - A user-mode process will trigger the snapshot by opening a file in SysFS
>    say /sys/kernel/security/ima/snapshot (referred to as sysk_ima_snapshot_file
>    here onwards).

Please fix the mailer so that it doesn't wrap sentences.   Adding blank
lines between bullets would improve readability.

> - The Kernel will get the current TPM PCR values and PCR update counter [2]
>    and store them as template data in a new IMA event "snapshot_aggregate".
>    This event will be measured by IMA using critical data measurement
>    functionality [1].  Recording regular IMA events will be paused while
>    "snapshot_aggregate" is being computed using the existing IMA mutex lock.

> - Once the "snapshot_aggregate" is computed and measured in IMA log, the prior
>    IMA events will be made available in the sysk_ima_snapshot_file.

> - The UM process will copy those IMA events from sysk_ima_snapshot_file to a
>    snapshot file on disk chosen by UM (referred to as UM_snapshot_file here
>    onwards).  The location, file-system type, access permissions etc. of the
>    UM_snapshot_file would be controlled by UM process itself.

> - Once UM is done copying the IMA events from sysk_ima_snapshot_file to
>    UM_snapshot_file, it will indicate to the Kernel that the snapshot can be
>    finalized by triggering a write with any data to the sysk_ima_snapshot_file.
>    UM process cannot prevent the IMA log purge operation after this point.

> - The Kernel will truncate the current IMA log and clear HTable up to the
>    "snapshot_aggregate" marker.

> - The Kernel will measure the PCR update counter as part of measuring
>    snapshot_aggregate, so that it can be used by the remote attestation service
>    for detecting missing events.

> - UM can prevent the IMA log purge by closing the sysk_ima_snapshot_file
>    without performing a write operation on it.  In this case, while the
>    "snapshot_aggregate" marker may still be in the log, the event can be ignored
>    since the previous entries in the IMA log will not be purged.
> 
> Note:
> - This work-flow should work when interleaved with Kexec 'load' and 'execute'
>    events and should not cause IMA log + snapshot to go out of sync with PCR
>    quotes. The implementation details are omitted from this document for
>    brevity.
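
For reference, the user-mode side of the quoted work-flow (open, copy,
then write to finalize or close to abort) might be sketched as follows.
The snapshot file is only proposed and does not exist in current
kernels; the function name, the chunk size and the b"1" payload are
assumptions made for illustration.

```python
import os

def take_snapshot(kernel_path, um_snapshot_path, finalize=True):
    """Copy events from the (proposed) snapshot file to a UM-chosen file.

    Returns the number of bytes copied.  With finalize=False the file is
    closed without a write, which per the proposal prevents the purge.
    """
    copied = 0
    kfd = os.open(kernel_path, os.O_RDWR)  # opening triggers the snapshot
    try:
        with open(um_snapshot_path, "ab") as out:
            while True:
                chunk = os.read(kfd, 65536)
                if not chunk:
                    break
                out.write(chunk)
                copied += len(chunk)
            out.flush()
            os.fsync(out.fileno())         # make sure the copy is on disk
        if finalize:
            os.write(kfd, b"1")            # any data signals "finalize"
    finally:
        os.close(kfd)
    return copied
```

The copy is flushed to disk before the finalizing write because, per the
quoted text, the purge can no longer be prevented after that write.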

This design seems overly complex and requires synchronization between
the "snapshot" record and exporting the records from the measurement
list.  None of this would be necessary if the measurements were copied
from kernel memory to a backing file (e.g. tmpfs), as described in [1].

What is the real problem - kernel memory pressure, memory pressure in
general, or disk space?  Is the intention to remove or offload the
exported measurements?

Concerns:
- Pausing the extension of the measurement list.

[1] https://lore.kernel.org/linux-integrity/CAOQ4uxj4Pv2Wr1wgvBCDR-tnA5dsZT3rvdDzKgAH1aEV_-r9Qg@mail.gmail.com/#t

-- 
thanks,

Mimi



