[RFC] IMA Log Snapshotting Design Proposal - aggregate

Wed Sep 6 13:49:16 PDT 2023

On 9/1/2023 6:06 PM, Tushar Sugandhi wrote:
> 
> 
> On 8/30/23 11:12, Ken Goldman wrote:
>> On 8/1/2023 3:12 PM, Sush Shringarputale wrote:
>>> - A user-mode process will trigger the snapshot by opening a file in
>>>    SysFS, say /sys/kernel/security/ima/snapshot (referred to as
>>>    sysk_ima_snapshot_file here onwards).
>>> - The Kernel will get the current TPM PCR values and PCR update
>>>    counter [2] and store them as template data in a new IMA event
>>>    "snapshot_aggregate".
>>
>> If this is relying on a user-mode process, is there a concern that the 
>> process doesn't run?  Might it be safer to have the kernel trigger the
>> snapshot?
>>
> The UM process here would be typically an attestation client
> which passes on the IMA log to the remote service for attestation.
> If the process doesn't run, the client will operate the same way as it
> does currently.

I see.

1. Ensure that the attestation client stores the snapshot in a 
well-known and widely readable location.  There can be more than one 
attestation client, and all need access to the snapshot.

There is a privacy concern around making the snapshot world-readable.

2. Is there a concern that, if the client doesn't run, the kernel memory 
issue remains unsolved?  Is this relying on a UM process to solve a 
kernel issue?
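
For reference, here is a minimal sketch of the user-mode trigger as the
proposal describes it, together with the fallback Tushar mentions.  The
path is the one suggested in the proposal ("say
/sys/kernel/security/ima/snapshot"); the function name and the
assumption that a plain read-only open is enough to trigger the
snapshot are mine, not part of any existing kernel interface.

```python
import os

# Path suggested in the proposal; not an existing kernel interface.
SNAPSHOT_FILE = "/sys/kernel/security/ima/snapshot"


def trigger_snapshot(path=SNAPSHOT_FILE):
    """Open the trigger file; return True if the open succeeded.

    Assumption: opening the file is what asks the kernel to snapshot.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # File missing or not accessible: carry on exactly as an
        # attestation client does today, with no snapshot taken.
        return False
    os.close(fd)
    return True
```

If the call returns False, nothing is freed in the kernel, which is the
case question 2 above is asking about.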
> 
>> PCR reads are not atomic with respect to each other or to event log 
>> appends.  Is this an issue?
>>
> In this design, reading the PCR plus adding the snapshot_aggregate
> has to be an atomic operation.  Other IMA events shouldn't interfere
> with this operation. Just like IMA ensures adding an entry to the log
> plus PCR extension happens in an atomic way by holding the
> ima_extend_list_mutex [2], we intend to use a similar mechanism to
> ensure reading the PCR plus adding the snapshot_aggregate remains an
> atomic operation.  And since taking a snapshot would be a rare event
> compared to adding a generic event to IMA log - overall we expect a low
> overhead in case of snapshotting.

How would that work?  The PCR read is UM, but IMA events are kernel. The 
UM operation cannot block the kernel or there can be a deadlock, right?

(UM) PCR reads can take multiple TPM commands, and they should not block 
a (kernel) extend.
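
To make the question concrete, here is a toy model of the locking scheme
in the quoted text, assuming the PCR read is done on the kernel side
under the same lock as the extend.  The class, its method names, the
single SHA-256 PCR, and the choice to extend the PCR with the
snapshot_aggregate event itself are all illustrative assumptions, not
IMA code.

```python
import hashlib
import threading


class MeasurementLog:
    """Toy model of an IMA-style log backed by one SHA-256 PCR."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for ima_extend_list_mutex
        self.pcr = bytes(32)           # PCR starts as all zeroes
        self.events = []               # list of (name, template_data)

    def _append_locked(self, name, data):
        # Caller must hold self._lock: log append and PCR extend stay paired.
        digest = hashlib.sha256(data).digest()
        self.events.append((name, data))
        self.pcr = hashlib.sha256(self.pcr + digest).digest()

    def add_event(self, name, data):
        """Append an ordinary event and extend the PCR atomically."""
        with self._lock:
            self._append_locked(name, data)

    def snapshot(self):
        """Read the PCR and log snapshot_aggregate under the same lock."""
        with self._lock:
            pcr_now = self.pcr
            # Assumption: the PCR value is the event's template data.
            self._append_locked("snapshot_aggregate", pcr_now)
            return pcr_now
```

In this model no add_event() can land between the PCR read and the
snapshot_aggregate append.  It says nothing about a PCR read issued from
user space, which is the case I am asking about.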

>> What is the purpose of the snapshot aggregate?  Since the entire event 
>> log has to be retained and sent to the verifier, is the aggregate 
>> redundant?
> 
> The goals of snapshot_aggregate marker are:
>      1. To allow the IMA log to be divided into multiple chunks and
>         provide attestation service the ability to verify and use the
>         latest chunk (i.e. snapshot) for attestation.

I believe that the verifier needs the entire log the first time, whether 
there is a snapshot or not.  Shouldn't the snapshot process be opaque to 
the verifier?

> 
>      2. To indicate to the attestation service that the client device has
>         the IMA log snapshotting feature enabled, and that at least one
>         snapshot has been taken, so that the service can ask for previous
>         snapshots as needed.

Why does the verifier need this?  The first time, it asks for events 
starting at #0.  Next time, it asks for what's new.  It's independent of 
__where__ the log comes from.
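
A small sketch of what I mean, under the assumption that the attestor
can address events by index.  The class and method names are
hypothetical, and the snapshot_aggregate marker is left out to keep the
example short.

```python
class Attestor:
    """Serves events by index, hiding whether they were snapshotted."""

    def __init__(self):
        self.archived = []   # events already moved to snapshot storage
        self.in_memory = []  # events still in the (modelled) kernel log

    def measure(self, event):
        """Record a new measurement in the in-memory log."""
        self.in_memory.append(event)

    def take_snapshot(self):
        """Move the in-memory events to the archive, freeing memory."""
        self.archived.extend(self.in_memory)
        self.in_memory = []

    def events_from(self, start):
        """Return every event with index >= start, wherever it lives."""
        return (self.archived + self.in_memory)[start:]
```

A verifier that has already processed two events calls events_from(2)
and gets only the new ones; a new verifier calls events_from(0) and gets
the whole log.  Neither has to know that a snapshot happened.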

> 
>      3. In the event of multiple snapshots, the snapshot_aggregate
>         marker has sufficient information to verify the integrity
>         of latest subset of isolated snapshots (with the help of PCR
>         quote of course)

A new verifier needs the entire log, no matter how many snapshots have 
been taken.
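
For comparison, this is roughly how I read the chunk check in item 3: a
sketch that assumes a single SHA-256 PCR, assumes the snapshot_aggregate
template data is the PCR value at snapshot time, and assumes the marker
event is itself extended into the PCR.  The function names are
hypothetical.

```python
import hashlib


def extend(pcr, data):
    """Model a PCR extend: new = H(old || H(data))."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


def replay(events, start_pcr=bytes(32)):
    """Replay (name, data) events on top of a starting PCR value."""
    pcr = start_pcr
    for _name, data in events:
        pcr = extend(pcr, data)
    return pcr


def verify_latest_chunk(chunk, quoted_pcr):
    """Check one chunk that begins with snapshot_aggregate.

    The starting PCR value is taken from the marker's template data,
    then the marker and all later events are replayed on top of it.
    """
    name, data = chunk[0]
    if name != "snapshot_aggregate":
        raise ValueError("chunk does not start with snapshot_aggregate")
    return replay(chunk, start_pcr=data) == quoted_pcr
```

The check ties the latest chunk to the quote, but it tells the verifier
nothing about the events measured before the snapshot, which is why a
new verifier still has to call replay() over the whole log from zero.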

> 
>      4. snapshot_aggregate helps both kernel and UM define clear
>         boundaries between multiple snapshots.
>         (each new snapshot starts with either the first boot_aggregate
>          or a snapshot_aggregate event)
> 
> The overall goals of IMA log snapshotting feature are:
>      a. to relieve memory pressure on the client device.
> 
>      b. to make attestation service side processing more efficient
>         They don't have to deal with the entire log since boot,
>         as you mentioned on

I don't think snapshotting affects the verifier at all. The attestor is 
a bit more complicated, but not significantly.