[PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

David Hildenbrand david at redhat.com
Fri Feb 12 04:18:19 EST 2021


On 12.02.21 00:09, Mike Rapoport wrote:
> On Thu, Feb 11, 2021 at 01:07:10PM +0100, David Hildenbrand wrote:
>> On 11.02.21 12:27, Mike Rapoport wrote:
>>> On Thu, Feb 11, 2021 at 10:01:32AM +0100, David Hildenbrand wrote:
>>
>> So let's talk about the main user-visible differences to other memfd files
>> (especially, other purely virtual files like hugetlbfs). With secretmem:
>>
>> - File content can only be read/written via memory mappings.
>> - File content cannot be swapped out.
>>
>> I think there are still valid ways to modify file content using syscalls:
>> e.g., fallocate(PUNCH_HOLE). Things like truncate also seems to work just
>> fine.
>   
> These work perfectly with any file, so maybe we should have added
> memfd_create as a flag to open(2) back then and now the secretmem file
> descriptors?

I think open() vs memfd_create() makes sense: for open, the path 
specifies main properties (tmpfs, hugetlbfs, filesystem). On memfd, 
there is no such path and the "type" has to be specified differently.

Also, open() might open existing files - memfd always creates new files.

>   
>>>> AFAIKS, we would need MFD_SECRET and disallow
>>>> MFD_ALLOW_SEALING and MFD_HUGETLB.
>>>
>>> So here we start to multiplex.
>>
>> Yes. And as Michal said, maybe we can support combinations in the future.
> 
> Isn't there a general agreement that syscall multiplexing is not a good
> thing?

Looking at mmap(), madvise(), fallocate(), I think multiplexing is just 
fine and flags can be mutually exclusive - as long as we're not 
squashing completely unrelated things into a single system call.

As one example: we don't have mmap_private() vs. mmap_shared() vs. 
mmap_shared_validate(). E.g., MAP_SYNC is only available for 
MAP_SHARED_VALIDATE.


> memfd_create already has flags validation that does not look very nice.

I assume you're talking about the hugetlb size specifications, right? 
It's not nice but fairly compact.

> Adding there only MFD_SECRET will make it a bit less nice, but when we'll
> grow new functionality into secretmem that will become horrible.

What do you have in mind? A couple of MFD_SECRET_* flags that only work 
with MFD_SECRET won't hurt IMHO. Just like we allow MFD_HUGE_* only with 
MFD_HUGETLB.

Thanks,

David / dhildenb




More information about the linux-riscv mailing list