[RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR

Andy Lutomirski luto at amacapital.net
Tue Jan 3 10:29:33 PST 2017


On Tue, Jan 3, 2017 at 5:18 AM, Arnd Bergmann <arnd at arndb.de> wrote:
> On Monday, January 2, 2017 10:08:28 PM CET Andy Lutomirski wrote:
>>
>> > This seems to nicely address the same problem on arm64, which has
>> > run into the same issue due to the various page table formats
>> > that can currently be chosen at compile time.
>>
>> On further reflection, I think this has very little to do with paging
>> formats except insofar as paging formats make us notice the problem.
>> The issue is that user code wants to be able to assume an upper limit
>> on an address, and it gets an upper limit right now that depends on
>> architecture due to paging formats.  But someone really might want to
>> write a *portable* 64-bit program that allocates memory with the high
>> 16 bits clear.  So let's add such a mechanism directly.
>>
>> As a thought experiment, what if x86_64 simply never allocated "high"
>> (above 2^47-1) addresses unless a new mmap-with-explicit-limit syscall
>> were used?  Old glibc would continue working.  Old VMs would work.
>> New programs that want to use ginormous mappings would have to use the
>> new syscall.  This would be totally stateless and would have no issues
>> with CRIU.
>
> I can see this working well for the 47-bit addressing default, but
> what about applications that actually rely on 39-bit addressing
> (I'd have to double-check, but I think this was the limit that
> people were most interested in for arm64)?
>
> 39 bits seems a little small to make that the default for everyone
> who doesn't pass the extra flag. Having to pass another flag to
> limit the addresses introduces other problems (e.g. mmap from
> library call that doesn't pass that flag).

That's a fair point.  Maybe my straw man isn't so good.

>
>> If necessary, we could also have a prctl that changes a
>> "personality-like" limit that is in effect when the old mmap was used.
>> I say "personality-like" because it would reset under exactly the same
>> conditions that personality resets itself.
>
> For "personality-like", it would still have to interact
> with the existing PER_LINUX32 and PER_LINUX32_3GB flags that
> do the exact same thing, so actually using personality might
> be better.
>
> We still have a few bits in the personality arguments, and
> we could combine them with the existing ADDR_LIMIT_3GB
> and ADDR_LIMIT_32BIT flags that are mutually exclusive by
> definition, such as
>
>         ADDR_LIMIT_32BIT =      0x0800000, /* existing */
>         ADDR_LIMIT_3GB   =      0x8000000, /* existing */
>         ADDR_LIMIT_39BIT =      0x0010000, /* next free bit */
>         ADDR_LIMIT_42BIT =      0x8010000,
>         ADDR_LIMIT_47BIT =      0x0810000,
>         ADDR_LIMIT_48BIT =      0x8810000,
>
> This would probably take only one or two personality bits for the
> limits that are interesting in practice.

Hmm.  What if we approached this a bit differently?  We could add a
single new personality bit ADDR_LIMIT_EXPLICIT.  Setting this bit
would cause PER_LINUX32_3GB etc. to be automatically cleared.  When
ADDR_LIMIT_EXPLICIT is in effect, prctl can set a 64-bit numeric
limit.  If ADDR_LIMIT_EXPLICIT is cleared, the prctl value stops being
settable and reading it via prctl returns whatever is implied by the
other personality bits.
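
To make the intended semantics concrete, here is a rough userspace model
of that state machine.  This is only a sketch, not kernel code:
ADDR_LIMIT_EXPLICIT does not exist, its value is borrowed from the "next
free bit" in the quoted list above, and struct task_model, the three
helper functions, the numeric limits attached to the legacy flags, and
the 2^47 default are all illustrative assumptions.  Seeding the explicit
limit from the previously implied one is likewise a guess at sensible
behavior, not part of the proposal.

```c
#include <stdint.h>

#define ADDR_LIMIT_32BIT    0x0800000u  /* existing flag */
#define ADDR_LIMIT_3GB      0x8000000u  /* existing flag */
#define ADDR_LIMIT_EXPLICIT 0x0010000u  /* hypothetical new flag */

/* Assumed default when no limit flag is set (x86_64, 47-bit). */
#define DEFAULT_LIMIT (1ULL << 47)

/* Hypothetical stand-in for the per-task state involved. */
struct task_model {
    unsigned int personality;
    uint64_t explicit_limit;  /* only meaningful with ADDR_LIMIT_EXPLICIT */
};

/* What reading the limit via the (hypothetical) prctl would return. */
uint64_t get_addr_limit(const struct task_model *t)
{
    if (t->personality & ADDR_LIMIT_EXPLICIT)
        return t->explicit_limit;
    /* Otherwise report what the other personality bits imply.
     * The two values below are illustrative, not exact TASK_SIZEs. */
    if (t->personality & ADDR_LIMIT_3GB)
        return 0xC0000000ULL;
    if (t->personality & ADDR_LIMIT_32BIT)
        return 1ULL << 32;
    return DEFAULT_LIMIT;
}

/* Model of personality(2): setting the explicit bit clears the
 * legacy limit bits. */
void set_personality(struct task_model *t, unsigned int pers)
{
    if (pers & ADDR_LIMIT_EXPLICIT) {
        /* Assumption: start from whatever limit was in effect. */
        if (!(t->personality & ADDR_LIMIT_EXPLICIT))
            t->explicit_limit = get_addr_limit(t);
        pers &= ~(ADDR_LIMIT_32BIT | ADDR_LIMIT_3GB);
    }
    t->personality = pers;
}

/* Model of the prctl setter: refused unless the explicit bit is set.
 * Returns 0 on success, -1 on failure. */
int set_addr_limit(struct task_model *t, uint64_t limit)
{
    if (!(t->personality & ADDR_LIMIT_EXPLICIT))
        return -1;
    t->explicit_limit = limit;
    return 0;
}
```

In this model a task that wants 39-bit addresses would set
ADDR_LIMIT_EXPLICIT and then call set_addr_limit(&t, 1ULL << 39).
Clearing the bit again makes get_addr_limit() fall back to whatever the
remaining personality bits imply.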

--Andy



More information about the linux-arm-kernel mailing list