Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)

Wed Jan 11 04:13:27 PST 2023

On Wed, Jan 11, 2023 at 5:28 PM Andy Chiu <andy.chiu at sifive.com> wrote:
>
> On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw at ventanamicro.com> wrote:
> > Fault on first use is well understood and has been implemented on many
> > architectures through the decades, even with its warts.
>
> Unfortunately, we don't have a direct way of acknowledging if an
> illegal instruction is caused by illegitimate use of V instructions.
> Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
> fault, we may have to perform a sw decode on the faulting instruction.
> Then see if it is the first-use fault, or a more general illegal
> instruction fault.
On further consideration, I think this is a minor issue. The first V
instruction in a valid program that uses Vector is limited to
vset{i}vl{i}, vl<nf>r, or vs<nf>r, and perhaps a read or write of a
vector-specific CSR. Decoding these instructions should be relatively
constrained and easy, and we only need to decode once per process,
since we don't have to do lazy save/restore.
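
To give a rough idea of how small that decode could be, here is a
minimal sketch. It is not kernel code: the function and macro names are
made up for illustration, and the opcode, width, and CSR numbers are my
reading of the V and privileged specs, so please double-check them. It
is also deliberately broader than the list above, since it accepts any
OP-V instruction and any vector-width load/store rather than only
vset{i}vl{i} and the whole-register forms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Major opcodes (insn[6:0]); values assumed from the RISC-V specs. */
#define OPC_LOAD_FP  0x07u
#define OPC_STORE_FP 0x27u
#define OPC_OP_V     0x57u
#define OPC_SYSTEM   0x73u

/* Hypothetical helper: true for the CSRs defined by the V extension. */
static bool is_vector_csr(uint32_t csr)
{
    switch (csr) {
    case 0x008: /* vstart */
    case 0x009: /* vxsat  */
    case 0x00A: /* vxrm   */
    case 0x00F: /* vcsr   */
    case 0xC20: /* vl     */
    case 0xC21: /* vtype  */
    case 0xC22: /* vlenb  */
        return true;
    default:
        return false;
    }
}

/*
 * Hypothetical helper: decide whether a faulting instruction word looks
 * like a V-extension instruction, so an illegal-instruction trap could
 * be treated as a first-use fault.
 */
static bool insn_looks_like_vector(uint32_t insn)
{
    uint32_t opcode = insn & 0x7fu;
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t csr    = insn >> 20;

    /* 16-bit (compressed) encodings never carry V instructions. */
    if ((insn & 0x3u) != 0x3u)
        return false;

    switch (opcode) {
    case OPC_OP_V:
        /* Vector arithmetic and vset{i}vl{i} share this opcode. */
        return true;
    case OPC_LOAD_FP:
    case OPC_STORE_FP:
        /* Widths 0, 5, 6, 7 are vector; 1 to 4 are scalar FP. */
        return funct3 == 0 || funct3 >= 5;
    case OPC_SYSTEM:
        /* funct3 0 and 4 are not CSR instructions. */
        if (funct3 == 0 || funct3 == 4)
            return false;
        return is_vector_csr(csr);
    default:
        return false;
    }
}
```

A trap handler would call something like this only when VS is Off for
the current process, and fall through to the normal illegal-instruction
path (SIGILL) when it returns false.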
>
> Yes, we may just enable V for a process whenever we find an OP-V major
> opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
> instruction. But it could be kind of messy, IF, later extensions would
> also like to be enabled at first-use-fault. (e.g. ARM has SME followed
> by SVE). And implementing this decoding logic in sw just seems
> redundant to me because hw has already done that for us.
Let's limit our discussion to the scope of VS enablement for now.
>
> Besides, ARM64 has individual mappings of traps for the use of
> FP-related units in EL1 and EL0. So SIMD running in kernel mode would
> not take additional instruction to enable the unit. I assume these
> kinds of CSR-controlling instructions would have to flush hw internal
> buffers to some extent. And doing these takes additional latencies.
We already adjust some VS/FS settings on entry to kernel code, so this
should be minor as well.

Anyway, I agree that faulting on first use is a better way to make
per-process control of VS feasible. Sorry for disturbing the list.

Thanks,
Andy