[PATCH] riscv: Define TASK_SIZE_MAX for __access_ok()

Alexandre Ghiti alex at ghiti.fr
Mon Mar 25 00:25:28 PDT 2024


Hi Arnd,

On 24/03/2024 23:05, Arnd Bergmann wrote:
> On Tue, Mar 19, 2024, at 17:51, Alexandre Ghiti wrote:
>> On 18/03/2024 22:29, Samuel Holland wrote:
>>> On 2024-03-18 3:50 PM, Alexandre Ghiti wrote:
>>>> On Wed, Mar 13, 2024 at 7:00 PM Samuel Holland
>>> It looks like the call to fixup_exception() [added
>>> in 416721ff05fd ("riscv, mm: Perform BPF exhandler fixup on page fault")] is
>>> only intended to catch null pointer dereferences. So making the change wouldn't
>>> have any functional impact, but it would still be a valid optimization.
>>>
>>>> Or I was wondering if it would not be better to do like x86 and use an
>>>> alternative, it would be more correct (even though I believe your
>>>> solution works)
>>>> https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/page_64.h#L82.
>>> What would be the benefit of using an alternative? Any access to an address
>>> between TASK_SIZE and TASK_SIZE_MAX is guaranteed to generate a page fault, so
>>> the only benefit I see is returning -EFAULT slightly faster at the cost of
>>> applying a few hundred alternatives at boot. But it's possible I'm missing
>>> something.
>>
>> The use of alternatives allows to return right away if the buffer is
>> beyond the usable user address space, and it's not just "slightly
>> faster" for some cases (a very large buffer with only a few bytes being
>> beyond the limit or someone could fault-in all the user pages and fail
>> very late...etc). access_ok() is here to guarantee that such situations
>> don't happen, so actually it makes more sense to use an alternative to
>> avoid that.
> The access_ok() function really wants a compile-time constant
> value for TASK_SIZE_MAX so it can do constant folding for
> repeated calls inside of one function, so for configurations
> with a boot-time selected TASK_SIZE_64 it's already not ideal,
> with or without alternatives.
>
> If I read the current code correctly, riscv doesn't even
> have a way to build with a compile-time selected
> VA_BITS/PGDIR_SIZE, which is probably a better place to
> start optimizing, since this rarely needs to be selected
> dynamically.


Indeed, we do not support compile-time fixed VA_BITS! We could, but that 
would only be used for custom kernels. I don't think distro kernels will 
ever (?) propose 3 different kernels for sv39, sv48 and sv57 because the 
cost of dynamically choosing the address space width is not big enough 
to me (and the burden of maintaining 3 different kernels is).

Let me know if I'm wrong, I'd be happy to work on that.

Thanks,

Alex


>
>        Arnd



More information about the linux-riscv mailing list