[RFC PATCH 00/29] arm64: Scalable Vector Extension core support
Florian Weimer
fweimer at redhat.com
Fri Dec 2 08:34:33 PST 2016
On 12/02/2016 12:48 PM, Dave Martin wrote:
> On Wed, Nov 30, 2016 at 01:38:28PM +0100, Florian Weimer wrote:
>
> [...]
>
>> We could add a system call to get the right stack size. But as it depends
>> on VL, I'm not sure what it looks like. Particularly if you need to determine
>> the stack size before creating a thread that uses a specific VL setting.
>
> I missed this point previously -- apologies for that.
>
> What would you think of:
>
> set_vl(vl_for_new_thread);
> minsigstksz = get_minsigstksz();
> set_vl(my_vl);
>
> This avoids get_minsigstksz() requiring parameters -- which is mainly a
> concern because the parameters tomorrow might be different from the
> parameters today.
>
> If it is possible to create the new thread without any SVE-dependent code,
> then we could
>
> set_vl(vl_for_new_thread);
> new_thread_stack = malloc(get_minsigstksz());
> new_thread = create_thread(..., new_thread_stack);
> set_vl(my_vl);
>
> which has the nice property that the new thread directly inherits the
> configuration that was used for get_minsigstksz().

Because all SVE registers are caller-saved, it's acceptable to
temporarily reduce the VL value, I think. So this should work.

One complication is that both the kernel and the libc need to reserve
stack space, so the kernel-returned value and the one which has to be
used in reality will be different.

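A rough sketch of how the two reservations might combine, under stated assumptions: set_vl(), get_vl() and get_minsigstksz() are stubbed here as hypothetical stand-ins for the interfaces discussed in this thread, and the frame-size formula and LIBC_STACK_RESERVE are made-up numbers, not real kernel or libc values.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the interfaces discussed in this thread.
 * None of them exist under these names; the sizes are invented. */
static unsigned int current_vl = 16;    /* vector length in bytes */

static void set_vl(unsigned int vl) { current_vl = vl; }
static unsigned int get_vl(void) { return current_vl; }

/* Pretend the signal frame is a fixed part plus a VL-dependent part. */
static size_t get_minsigstksz(void)
{
    return 4096 + 34 * (size_t)current_vl;
}

/* Assumed libc-side reservation (thread descriptor, TLS, and so on). */
#define LIBC_STACK_RESERVE ((size_t)2048)

/* Stack size to allocate for a thread that will run with new_vl. */
size_t thread_stack_size(unsigned int new_vl, size_t user_size)
{
    unsigned int saved_vl = get_vl();
    size_t kernel_min;

    set_vl(new_vl);                  /* query as the new thread would see it */
    kernel_min = get_minsigstksz();
    set_vl(saved_vl);                /* restore the caller's VL */

    /* The kernel-reported minimum is only one term of the real size. */
    return kernel_min + LIBC_STACK_RESERVE + user_size;
}
```

The point of the sketch is only that the caller cannot use the kernel-returned value directly: the libc has to add its own share on top.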
> However, it would be necessary to prevent GCC from moving any code
> across these statements -- in particular, SVE code that accesses VL-
> dependent data spilled on the stack is liable to go wrong if reordered
> with the above. So the sequence would need to go in an external
> function (or a single asm...)

I would talk to GCC folks; we have similar issues with changing the FPU
rounding mode, I assume.

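For illustration only, one way the "external function" idea could look with GCC extensions that exist today: a noinline function with compiler barriers around the VL changes. set_vl() and get_minsigstksz() are again hypothetical stubs with invented sizes. Whether a "memory" clobber is actually enough to keep SVE spill code from being moved across the VL change is an assumption here, and is exactly the question for the GCC folks.

```c
#include <stddef.h>

/* Hypothetical stubs; not real kernel or libc interfaces. */
static unsigned int current_vl = 16;    /* vector length in bytes */

static void set_vl(unsigned int vl) { current_vl = vl; }

static size_t get_minsigstksz(void)
{
    return 4096 + 34 * (size_t)current_vl;   /* invented formula */
}

/* GCC/Clang compiler barrier: emits no instructions, but the compiler
 * must not move memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* noinline keeps the sequence out of the caller's body, so the caller's
 * own code is not interleaved with the VL changes. */
__attribute__((noinline))
size_t minsigstksz_for_vl(unsigned int new_vl, unsigned int my_vl)
{
    size_t sz;

    compiler_barrier();
    set_vl(new_vl);
    sz = get_minsigstksz();
    set_vl(my_vl);                  /* back to the caller's VL */
    compiler_barrier();

    return sz;
}
```

Placing the function in a separate translation unit would be a stronger guarantee than noinline alone, at least without link-time optimization.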
Thanks,
Florian