[RFC] IMA Log Snapshotting Design Proposal

Sush Shringarputale sushring at linux.microsoft.com
Mon Aug 14 14:42:46 PDT 2023


Hello Mimi,

Thanks for your feedback on this.


On 8/11/2023 6:14 AM, Mimi Zohar wrote:
> Hi Sush, Tushar,
>
> On Tue, 2023-08-01 at 12:12 -0700, Sush Shringarputale wrote:
>> ================================================
>> | A. Problem Statement                         |
>> ================================================
>> Depending on the IMA policy, the IMA log can consume a lot of Kernel memory
>> on the device.  For instance, the events for the following IMA policy
>> entries may need to be measured in certain scenarios, but they can also
>> lead to a verbose IMA log when the device is running for a long period of
>> time.
>> ┌───────────────────────────────────────┐
>> │# PROC_SUPER_MAGIC                     │
>> │measure fsmagic=0x9fa0                 │
>> │# SYSFS_MAGIC                          │
>> │measure fsmagic=0x62656572             │
>> │# DEBUGFS_MAGIC                        │
>> │measure fsmagic=0x64626720             │
>> │# TMPFS_MAGIC                          │
>> │measure fsmagic=0x01021994             │
>> │# RAMFS_MAGIC                          │
>> │measure fsmagic=0x858458f6             │
>> │# SECURITYFS_MAGIC                     │
>> │measure fsmagic=0x73636673             │
>> │# OVERLAYFS_MAGIC                      │
>> │measure fsmagic=0x794c7630             │
>> │# log, audit or tmp files              │
>> │measure obj_type=var_log_t             │
>> │measure obj_type=auditd_log_t          │
>> │measure obj_type=tmp_t                 │
>> └───────────────────────────────────────┘
>>
>> Secondly, certain devices are configured to take Kernel updates using Kexec
>> soft-boot.  The IMA log from the previous Kernel gets carried over and the
>> Kernel memory consumption problem worsens when such devices undergo multiple
>> Kexec soft-boots over a long period of time.
>>
>> The above two scenarios can cause IMA log to grow and consume Kernel memory.
>>
>> In addition, a large IMA log can add pressure on the network bandwidth when
>> the attestation client sends it to remote-attestation-service.
>>
>> Truncating IMA log to reclaim memory is not feasible, since it makes the
>> log go out of sync with the TPM PCR quote making remote attestation fail.
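To illustrate why truncation breaks verification, here is a minimal sketch of
the PCR extend replay that a verifier performs over the log.  It assumes a
SHA-1 bank with a zeroed initial PCR value.  The event digests are made-up
placeholders rather than real IMA template hashes, and replay_pcr is a
hypothetical helper, not an existing API.

```python
import hashlib


def replay_pcr(template_digests, start=bytes(20)):
    """Fold SHA-1 digests into a simulated PCR: new = SHA1(old || digest)."""
    pcr = start
    for digest in template_digests:
        pcr = hashlib.sha1(pcr + digest).digest()
    return pcr


# Placeholder digests standing in for IMA log entries (assumption).
events = [hashlib.sha1(f"event-{i}".encode()).digest() for i in range(4)]

full = replay_pcr(events)            # what the TPM PCR would hold
truncated = replay_pcr(events[2:])   # log with the first two entries dropped

# Resuming from a recorded intermediate value reproduces the full result.
checkpoint = replay_pcr(events[:2])
resumed = replay_pcr(events[2:], start=checkpoint)

print("truncated log matches PCR:", truncated == full)
print("resumed from checkpoint matches PCR:", resumed == full)
```

The truncated replay no longer matches the value the TPM would quote, while a
replay that starts from a recorded intermediate value does.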
>>
>> A sophisticated solution is required which will help relieve the memory
>> pressure on the device and continue supporting remote attestation without
>> disruptions.
> If the problem is kernel memory, then using a single tmpfs file has
> already been proposed [1].  As entries are added to the measurement
> list, they are copied to the tmpfs file and removed from kernel memory.
> Userspace would still access the measurement list via the existing
> securityfs file.
>
> The IMA measurement list is a sequential file, allowing it to be read
> from an offset.  How much or how little of the measurement list is read
> by the attestation client and sent to the attestation server is up to
> the attestation client/server.
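A minimal sketch of such an incremental read on the client side, assuming the
securityfs file honours seeks as described above and that the saved offset
falls on a record boundary.  read_new_entries and the offset bookkeeping are
hypothetical, not part of any existing attestation client.

```python
def read_new_entries(path, offset):
    """Return (data, new_offset) for the bytes found in `path` past `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)       # skip what was already sent to the server
        data = f.read()      # everything appended since the last read
    return data, offset + len(data)


# Illustrative use (needs privileges to read the file, so not run here):
#   path = "/sys/kernel/security/ima/binary_runtime_measurements"
#   data, offset = read_new_entries(path, 0)        # first attestation
#   data, offset = read_new_entries(path, offset)   # later: only new entries
```

The client would persist the returned offset between attestation requests so
that each request carries only the entries added since the previous one.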
>
> If the problem is not kernel memory, but memory pressure in general,
> then instead of a tmpfs file, the measurement list could similarly be
> copied to a single persistent file [1].
The vfs_tmpfile approach suggested in that RFC discussion [1] was only
discussed at the time; no prototype was created.  We are discussing the
approach internally now and will respond with more details about it.
>> -------------------------------------------------------------------------------
>> ================================================
>> | B. Proposed Solution                         |
>> ================================================
>> In this document, we propose an enhancement to the IMA subsystem to improve
>> the long-running performance by snapshotting the IMA log, while still
>> providing mechanisms to verify its integrity using the PCR quotes.
>>
>> The remainder of the document describes details of the proposed solution
>> in the following sub-sections.
>>    - High-level Work-flow
>>    - Snapshot Triggering Mechanism
>>    - Design Choices for Storing Snapshots
>>    - Attestation-Client and Remote-Attestation-Service Side Changes
>>    - Example Walk-through
>>    - Open Questions
>> -------------------------------------------------------------------------------
>> ================================================
>> | B.1 High-level Work-flow                     |
>> ================================================
>> Pre-requisites:
>> - IMA Integrity guarantees are maintained.
>>
>> The proposed high level work-flow of IMA log snapshotting is as follows:
>> - A user-mode process will trigger the snapshot by opening a file in
>>     securityfs, say /sys/kernel/security/ima/snapshot (referred to as
>>     sysk_ima_snapshot_file here onwards).
> Please fix the mailer so that it doesn't wrap sentences.   Adding blank
> lines between bullets would improve readability.
Noted, will do.
>> - The Kernel will get the current TPM PCR values and PCR update counter [2]
>>     and store them as template data in a new IMA event "snapshot_aggregate".
>>     This event will be measured by IMA using critical data measurement
>>     functionality [1].  Recording regular IMA events will be paused while
>>     "snapshot_aggregate" is being computed using the existing IMA mutex lock.
>> - Once the "snapshot_aggregate" is computed and measured in IMA log, the
>>     prior IMA events will be made available in the sysk_ima_snapshot_file.
>> - The UM process will copy those IMA events from sysk_ima_snapshot_file to a
>>     snapshot file on disk chosen by UM (referred to as UM_snapshot_file here
>>     onwards).  The location, file-system type, access permissions etc. of the
>>     UM_snapshot_file would be controlled by UM process itself.
>> - Once UM is done copying the IMA events from sysk_ima_snapshot_file to
>>     UM_snapshot_file, it will indicate to the Kernel that the snapshot can be
>>     finalized by triggering a write with any data to the
>>     sysk_ima_snapshot_file.  UM process cannot prevent the IMA log purge
>>     operation after this point.
>> - The Kernel will truncate the current IMA log and clear HTable up to the
>>     "snapshot_aggregate" marker.
>> - The Kernel will measure the PCR update counter as part of measuring
>>     snapshot_aggregate, so that it can be used by the remote attestation
>>     service for detecting missing events.
>> - UM can prevent the IMA log purge by closing the sysk_ima_snapshot_file
>>     without performing a write operation on it.  In this case, while the
>>     "snapshot_aggregate" marker may still be in the log, the event can be
>>     ignored since the previous entries in the IMA log will not be purged.
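For reference, the user-mode side of the steps above could look roughly like
the sketch below.  The snapshot file is only proposed in this RFC and does not
exist in any kernel today, so the open/read/write/close semantics here are
assumptions taken from the bullets, and take_snapshot is a hypothetical helper.

```python
import os


def take_snapshot(sysk_ima_snapshot_file, um_snapshot_file, finalize=True):
    """Copy snapshot events to a UM-chosen file; return the bytes copied."""
    # Assumption: opening the file is what triggers the snapshot.
    fd = os.open(sysk_ima_snapshot_file, os.O_RDWR)
    copied = 0
    try:
        with open(um_snapshot_file, "wb") as out:
            while True:
                chunk = os.read(fd, 65536)
                if not chunk:
                    break
                out.write(chunk)
                copied += len(chunk)
            out.flush()
            os.fsync(out.fileno())   # persist the copy before any purge
        if finalize:
            # Assumption: a write with any data tells the Kernel to purge.
            os.write(fd, b"1")
    finally:
        # Closing without a prior write leaves the IMA log untouched.
        os.close(fd)
    return copied
```

Calling take_snapshot(..., finalize=False) corresponds to the case in the last
bullet where UM closes the file without writing and the purge is skipped.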
>>
>> Note:
>> - This work-flow should work when interleaved with Kexec 'load' and
>>     'execute' events and should not cause IMA log + snapshot to go out of
>>     sync with PCR quotes. The implementation details are omitted from this
>>     document for brevity.
> This design seems overly complex and requires synchronization between
> the "snapshot" record and exporting the records from the measurement
> list.  None of this would be necessary if the measurements were copied
> from kernel memory to a backing file (e.g. tmpfs), as described in [1].
>
> What is the real problem - kernel memory pressure, memory pressure in
> general, or disk space?  Is the intention to remove or offload the
> exported measurements?
The main concern is the memory pressure on both the kernel and the
attestation client when it sends the request.  The concern you bring up is
valid and we are working on creating a prototype.  There is no intention to
remove the exported measurements.
- Sush
> Concerns:
> - Pausing extending the measurement list.
>
> [1]
> https://lore.kernel.org/linux-integrity/CAOQ4uxj4Pv2Wr1wgvBCDR-tnA5dsZT3rvdDzKgAH1aEV_-r9Qg@mail.gmail.com/#t
>


