[PATCH 0/4] riscv: uaccess: optimizations
Linus Torvalds
torvalds at linux-foundation.org
Sun Jun 30 09:59:36 PDT 2024
On Tue, 25 Jun 2024 at 11:12, Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> But yes, it would be lovely if we did things as "implement the
> low-level accessor functions and we'll wrap them for the generic case"
> rather than have every architecture have to do the wrapping..
Btw, to do that _well_, we need to expand on the user access functions
a bit more.
In particular, we can already implement "get_user()" on top of
user_access_begin() and friends, something like this:
	#define get_user(x,ptr) ({					\
		__label__ Efault_read;					\
		long __err;						\
		__typeof__(ptr) __ptr = (ptr);				\
		if (likely(user_access_begin(__ptr, sizeof(x)))) {	\
			unsafe_get_user(x, __ptr, Efault_read);		\
			user_access_end();				\
			__err = 0;					\
		} else {						\
			if (0) {					\
	Efault_read:	user_access_end();				\
			}						\
			x = (__typeof__(x))0;				\
			__err = -EFAULT;				\
		}							\
		__err; })
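
For reference, the macro evaluates to zero on success and to -EFAULT
on failure, so a minimal (hypothetical) caller would look like this:

	/* Hypothetical caller, purely to illustrate the convention. */
	static int read_user_flag(const int __user *uptr)
	{
		int val;

		if (get_user(val, uptr))
			return -EFAULT;	/* faulted or bad address */
		return val != 0;
	}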
and it doesn't generate _horrible_ code. The macro source looks pretty
bad, but all the error handling goes out-of-line, so on x86-64 (without
debug options) it generates code something like this:
	test   %rdi,%rdi
	js     <cap_validate_magic+0x98>
	stac
	lfence
	mov    (%rdi),%ecx
	clac
which is certainly not horrid. But it's not great, because that lfence
ends up really screwing up the pipeline.
The manually coded out-of-line code generates this instead:
	mov    %rax,%rdx
	sar    $0x3f,%rdx
	or     %rdx,%rax
	stac
	movzbl (%rax),%edx
	xor    %eax,%eax
	clac
	ret
because it doesn't do a conditional branch (with the required Spectre
mitigation), but instead turns the address check into a data
dependency: if the address was negative, the mask sets all bits, and
an all-bits-set address is guaranteed to cause a page fault.
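
In plain C the masking trick is roughly this (a sketch with a made-up
helper name; the real code does it in inline asm):

	/*
	 * Sketch of the address-masking trick.  On x86-64 a kernel
	 * address has the top bit set, so the arithmetic shift yields
	 * all-ones and the OR turns the pointer into ~0, which is
	 * guaranteed to fault when dereferenced.
	 */
	static inline const void __user *mask_user_ptr(const void __user *ptr)
	{
		unsigned long addr = (unsigned long)ptr;

		addr |= (unsigned long)((long)addr >> 63);
		return (const void __user *)addr;
	}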
But we *can* get the user accesses to do the same with a slight
expansion of user_access_begin():
	stac
	mov    %rdi,%rax
	sar    $0x3f,%rax
	or     %rdi,%rax
	mov    (%rax),%eax
	clac
by just introducing a notion of "masked_user_access". The get_user()
macro part would look like this:
	__typeof__(ptr) __ptr;					\
	__ptr = masked_user_access_begin(ptr);			\
	if (1) {						\
		unsafe_get_user(x, __ptr, Efault_read);		\
		user_access_end();				\
and the patch to implement this is attached. I've had it around for a
while, but I don't know how many architectures can do this.
Note this part of the commit message:
    This model only works for dense accesses that start at 'src' and on
    architectures that have a guard region that is guaranteed to fault in
    between the user space and the kernel space area.
which is true on x86-64, but that "guard region" thing might not be
true everywhere.
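
So the generic side presumably wants an opt-in with a fallback,
something like this sketch (names illustrative, not necessarily what
the attached patch uses):

	/*
	 * Architectures with a suitable guard region define
	 * masked_user_access_begin(); everybody else keeps the
	 * conditional access_ok() path.
	 */
	#ifdef masked_user_access_begin
	 #define can_do_masked_user_access() 1
	#else
	 #define can_do_masked_user_access() 0
	 #define masked_user_access_begin(src) NULL
	#endif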
Will/Catalin - would that
	src = masked_user_access_begin(src);
work on arm64? The arm64 code already does something similar with
__uaccess_mask_ptr(), but at least currently it doesn't avoid the
conditional entirely: the masking is done in _addition_ to the
access_ok() check, not instead of it.
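
IOW, the difference is roughly this (sketched):

	/* arm64 today (sketch): masking on top of the conditional check */
	if (!access_ok(ptr, size))
		return -EFAULT;
	ptr = __uaccess_mask_ptr(ptr);

	/* proposed: the masking replaces the conditional entirely */
	ptr = masked_user_access_begin(ptr);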
Linus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.diff
Type: text/x-patch
Size: 5321 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20240630/c876d3d5/attachment.bin>