[GIT PULL] ARM: kernel mode NEON support

Ard Biesheuvel ard.biesheuvel at linaro.org
Mon Jul 22 12:45:31 EDT 2013


On 22 July 2013 18:31, Russell King - ARM Linux <linux at arm.linux.org.uk> wrote:
> On Mon, Jul 08, 2013 at 11:23:11PM +0100, Ard Biesheuvel wrote:
>> The following changes since commit 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376:
>>
>>   Linux 3.10 (2013-06-30 15:13:29 -0700)
>>
>> are available in the git repository at:
>>
>>   git://git.linaro.org/people/ardbiesheuvel/linux-arm.git for-rmk
>>
>> for you to fetch changes up to 7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b:
>>
>>   lib/raid6: add ARM-NEON accelerated syndrome calculation (2013-07-08 22:09:18 +0100)
>
> I'm assuming that the comments in your previous postings are valid as I've
> included those in the merge commit:
>

I think they're close enough. I did remove the BUG() call in the
kernel mode FP exception handler, as just returning from that function
will cause an oops to be triggered anyway.

Cheers,
-- 
Ard.

>     I have included two use cases that I have been using, XOR and RAID-6
>     checksumming. The former gets a 60% performance boost on the NEON, the
>     latter over 400%.
>
>     ARM: add support for kernel mode NEON
>
>     Adds kernel_neon_begin/end (renamed from kernel_vfp_begin/end in the
>     previous version to de-emphasize the VFP part, as VFP code that needs
>     software assistance is currently not supported).
>
>     Introduces <asm/neon.h> and the Kconfig symbol KERNEL_MODE_NEON. This
>     has been aligned with Catalin for arm64, so any NEON code that does
>     not use assembly but intrinsics or the GCC vectorizer (such as my
>     examples) can potentially be shared between arm and arm64 archs.
>
>     ARM: move VFP init to an earlier boot stage
>
>     This is needed so the NEON is enabled when the XOR and RAID-6 algo
>     boot time benchmarks are run.
>
>     ARM: be strict about FP exceptions in kernel mode
>
>     This adds a check to vfp_support_entry() to flag unsupported uses of
>     the NEON/VFP in kernel mode. FP exceptions (bounces) are flagged as
>     a BUG() because of their potentially intermittent nature.
>     Exceptions caused by the fact that kernel_neon_begin has not been
>     called are just routed through the undef handler.
>
>     ARM: crypto: add NEON accelerated XOR implementation
>
>     This is the xor_blocks() implementation built with -ftree-vectorize,
>     60% faster than optimized ARM code. It calls in_interrupt() to check
>     whether the NEON flavor can be used: this should really not be
>     necessary, but due to xor_blocks's quite generic nature, there is no
>     telling how exactly people may be using it in the real world.
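
The dispatch described above (use the NEON flavor only when not in interrupt context, otherwise fall back to the scalar routine) can be sketched roughly as follows. This is a user-space model, not the patch itself: in_interrupt_stub(), simd_begin_stub() and simd_end_stub() are hypothetical stand-ins for the kernel's in_interrupt(), kernel_neon_begin() and kernel_neon_end(), and xor_2(), xor_2_scalar() and xor_2_vector() are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for in_interrupt(), kernel_neon_begin() and
 * kernel_neon_end(); they only record state so the flow can be followed. */
static bool fake_in_interrupt = false;
static int simd_depth = 0;
static bool used_vector_path = false;

static bool in_interrupt_stub(void) { return fake_in_interrupt; }
static void simd_begin_stub(void)   { simd_depth++; }
static void simd_end_stub(void)     { simd_depth--; }

/* Scalar fallback: safe in any context. */
static void xor_2_scalar(size_t bytes, unsigned long *p1,
                         const unsigned long *p2)
{
    for (size_t i = 0; i < bytes / sizeof(unsigned long); i++)
        p1[i] ^= p2[i];
}

/* Same loop; in the real patch this flavor would be the one built with
 * -ftree-vectorize so the compiler can emit NEON instructions. */
static void xor_2_vector(size_t bytes, unsigned long *p1,
                         const unsigned long *p2)
{
    for (size_t i = 0; i < bytes / sizeof(unsigned long); i++)
        p1[i] ^= p2[i];
}

/* Pick the flavor based on the calling context. */
void xor_2(size_t bytes, unsigned long *p1, const unsigned long *p2)
{
    assert(bytes % sizeof(unsigned long) == 0);

    if (in_interrupt_stub()) {
        used_vector_path = false;
        xor_2_scalar(bytes, p1, p2);
    } else {
        used_vector_path = true;
        simd_begin_stub();              /* claim the SIMD unit */
        xor_2_vector(bytes, p1, p2);
        simd_end_stub();                /* release it again */
    }
}
```

The point of the shape is that the begin/end pair brackets only the vector routine, and the scalar path never touches the SIMD unit at all.
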
>
>     lib/raid6: add ARM-NEON accelerated syndrome calculation
>
>     This is a port of the RAID-6 checksumming code in altivec.uc to
>     NEON intrinsics. It is about 4x faster than the sequential
>     code.
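
For reference, the sequential syndrome calculation that such a port vectorizes can be sketched in plain C as below. This is a byte-at-a-time illustration under the usual RAID-6 definitions (P is the XOR of the data blocks, Q is evaluated by Horner's rule with multiplication by 2 in GF(2^8) modulo the polynomial 0x11d). The names gf_mul2() and raid6_gen_syndrome_scalar() are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Compute the P and Q syndromes over ndata data blocks of `bytes` bytes.
 * data[z] is data block z; p and q receive the two syndromes. */
void raid6_gen_syndrome_scalar(int ndata, size_t bytes,
                               const uint8_t *const *data,
                               uint8_t *p, uint8_t *q)
{
    assert(ndata > 0);

    for (size_t i = 0; i < bytes; i++) {
        /* Start from the highest-numbered data block. */
        uint8_t wp = data[ndata - 1][i];
        uint8_t wq = wp;

        /* Horner's rule: Q = ((D[n-1]*2 ^ D[n-2])*2 ^ ...) ^ D[0]. */
        for (int z = ndata - 2; z >= 0; z--) {
            uint8_t wd = data[z][i];

            wp ^= wd;
            wq = (uint8_t)(gf_mul2(wq) ^ wd);
        }
        p[i] = wp;
        q[i] = wq;
    }
}
```

A vector version performs the same two steps per data block on many bytes at once, which is where the speedup over the sequential loop comes from.
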
>


