[PATCH v10 0/4] RISC-V: mm: Make SV48 the default address space
Yangyu Chen
cyy at cyyself.name
Fri Jan 19 22:13:14 PST 2024
Thanks for your reply.
On 1/20/24 09:34, Charlie Jenkins wrote:
> On Sun, Jan 14, 2024 at 01:26:57AM +0800, Yangyu Chen wrote:
>> Hi, Charlie
>>
>> Although this patchset has been merged I still have some questions about
>> this patchset. Because it breaks regular mmap if address >= 38 bits on
>> sv48 / sv57 capable systems like qemu. For example, If a userspace program
>> wants to mmap an anonymous page to addr=(1<<45) on an sv48 capable system,
>> it will fail and kernel will mmaped to another sv39 address since it does
>
> Thank you for raising this concern. To make sure I am understanding
> correctly, you are passing a hint address of (1<<45) and expecting mmap
> to return 1<<45 and if it returns a different address you are describing
> mmap as failing? If you want an address that is in the sv48 space you
> can pass in an address that is greater than 1<<47.
>
>> not meet the requirement to use sv48 as you wrote:
>>
>>> else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
>>> mmap_end = VA_USER_SV48; \
>>> else \
>>> mmap_end = VA_USER_SV39; \
>>
>> Then, How can a userspace program create a mmap with a hint if the address
>>> = (1<<38) after your patch without MAP_FIXED? The only way to do this is
>> to pass a hint >= (1<<47) on mmap syscall then kernel will return a random
>> address in sv48 address space but the hint address gets lost. I think this
>
> In order to force mmap to return the address provided you must use
> MAP_FIXED. Otherwise, the address is a "hint" and has no guarantees. The
> hint address on riscv is used to mean "don't give me an address that
> uses more bits than this". This behavior is not unique to riscv, arm64
> and powerpc use a similar scheme. In arch/arm64/include/asm/processor.h
> there is the following code:
>
> #define arch_get_mmap_base(addr, base) ((addr > DEFAULT_MAP_WINDOW) ? \
> base + TASK_SIZE - DEFAULT_MAP_WINDOW :\
> base)
>
> arm64/powerpc are only concerned with a single boundary so the code is simpler.
>
As you say, this code in arm64/powerpc will not meet the issue I
address. For example, If the addr here is (1<<50) on arm64, the
arch_get_mmap_base will return base+TASK_SIZE-DEFAULT_MAP_WINDOW which
is (1<<vabits_actual). And this behavior on arm64/powerpc/x86 does not
break anything since we will use a larger address space if the hint
address is specified on the address > DEFAULT_MAP_WINDOW. The
corresponding behavior on RISC-V should be if the hint address > BIT(47)
then use Sv57 address space and use Sv48 when the hint address > BIT(38)
if we want Sv39 by default.
However, your patch needs the address >= BIT(47) rather than BIT(38) to
use Sv48 and address >= BIT(56) to use Sv57, thus breaking existing
userspace software to create mapping on the hint address without
MAP_FIXED set.
>> violate the principle of mmap syscall as kernel should take the hint and
>> attempt to create the mapping there.
>
> Although the man page for mmap does say "on Linux, the kernel will pick
> a nearby page boundary" it is still a hint address so there is no strict
> requirement (and the precedent has already been set by arm64/powerpc).
>
Yeah. There is no strict requirement. But currently x86/arm64/powerpc
works in this situation well. The hint address on these ISAs is not used
as the upper bound to allocating the address. However, on RISC-V, you
treat this as the upper bound.
>>
>> I don't think patching in this way is right. However, if we only revert
>> this patch, some programs relying on mmap to return address with effective
>> bits <= 48 will still be an issue and it might expand to other ISAs if
>> they implement larger virtual address space like RISC-V sv57. A better way
>> to solve this might be adding a MAP_48BIT flag to mmap like MAP_32BIT has
>> been introduced for decades.
>>
>> Thanks,
>> Yangyu Chen
>>
>
> - Charlie
>
More information about the linux-riscv
mailing list