[PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

Mon Feb 22 05:50:02 EST 2021

On 22.02.21 10:38, David Hildenbrand wrote:
> On 17.02.21 17:19, James Bottomley wrote:
>> On Tue, 2021-02-16 at 18:16 +0100, David Hildenbrand wrote:
>> [...]
>>>>>     The discussion regarding migratability only really popped up
>>>>> because this is a user-visible thing and not being able to
>>>>> migrate can be a real problem (fragmentation, ZONE_MOVABLE, ...).
>>>>
>>>> I think the biggest use will potentially come from hardware
>>>> acceleration.  If it becomes simple to add say encryption to a
>>>> secret page with no cost, then no flag needed.  However, if we only
>>>> have a limited number of keys so once we run out no more encrypted
>>>> memory then it becomes a costly resource and users might want a
>>>> choice of being backed by encryption or not.
>>>
>>> Right. But wouldn't HW support with configurable keys etc. need more
>>> syscall parameters (meaning, even memfd_secret() as it is would not
>>> be sufficient?). I suspect the simplistic flag approach might not
>>> be sufficient. I might be wrong because I have no clue about MKTME
>>> and friends.
>>
>> The theory I was operating under is key management is automatic and
>> hidden, but key scarcity can't be, so if you flag requesting hardware
>> backing then you either get success (the kernel found a key) or failure
>> (the kernel is out of keys).  If we actually want to specify the key
>> then we need an extra argument and we *must* have a new system call.
>>
>>> Anyhow, I still think extending memfd_create() might just be good
>>> enough - at least for now.
>>
>> I really think this is the wrong approach for a user space ABI.  If we
>> think we'll ever need to move to a separate syscall, we should begin
>> with one.  The pain of trying to shift userspace from memfd_create to a
>> new syscall would be enormous.  It's not impossible (see clone3) but
>> it's a pain we should avoid if we know it's coming.
> 
> Sorry for the late reply, there is just too much going on :)
> 
> *If* we ever realize we need to pass more parameters, we can easily add
> a new syscall for that purpose. *Then*, we will know what that syscall
> should look like. Right now, it's just pure speculation.
> 
> Until then, going with memfd_create() works just fine IMHO.
> 
> The worst thing that could happen is that we might not be able to create
> all fancy secretmem flavors in the future via memfd_create() but only
> via a different, highly specialized syscall. I don't see a real problem
> with that.
> 

Adding to that, I'll give up arguing now as I have more important things
to do. Various people have questioned why we need a dedicated syscall,
and at least for me there has been no satisfying answer.

The worst case is that we end up with a syscall that could have been
avoided, for example, because
1. We add existing/future memfd_create() flags to memfd_secret() as well
when we need them (sealing, hugetlb, ...).
2. We decide in the future to still add MFD_SECRET support to
memfd_create().
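
To make the two ABI options in the list above concrete, here is a rough
user-space sketch, not a definitive implementation. Assumptions: the
syscall lands as memfd_secret(unsigned int flags) and the installed
headers define SYS_memfd_secret (otherwise the helper reports ENOSYS).
MFD_SECRET has only been proposed in this thread, so the name
MFD_SECRET_HYPOTHETICAL and its value are made up, and a kernel that
does not know the bit should reject it with EINVAL. Both helper names
are hypothetical as well.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/memfd.h>   /* MFD_CLOEXEC */
#include <sys/syscall.h>   /* SYS_memfd_create, maybe SYS_memfd_secret */
#include <unistd.h>        /* syscall(), close() */

/* ASSUMPTION: not a real kernel flag; the value is invented for illustration. */
#define MFD_SECRET_HYPOTHETICAL 0x8000U

/* Option A: dedicated syscall. Returns an fd, or -1 with errno set. */
static int secret_fd_dedicated(unsigned int flags)
{
#ifdef SYS_memfd_secret
	return (int)syscall(SYS_memfd_secret, flags);
#else
	/* Headers predate the syscall: behave like a kernel without it. */
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Option B: a flag on memfd_create(). Existing flags (sealing,
 * hugetlb, ...) could be OR-ed in through extra_flags without any
 * new plumbing. Returns an fd, or -1 with errno set.
 */
static int secret_fd_via_memfd_create(const char *name, unsigned int extra_flags)
{
	assert(name != NULL);
	return (int)syscall(SYS_memfd_create, name,
			    MFD_CLOEXEC | MFD_SECRET_HYPOTHETICAL | extra_flags);
}
```

A caller would check the result of either helper the same way: a
non-negative fd on success, -1 with errno on failure. That would also
cover the "out of keys" case discussed earlier in the thread if it were
reported as an error.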

So be it.

-- 
Thanks,

David / dhildenb