[PATCH v6 5/6] crypto: arm64/aes-ccm - reduce NEON begin/end calls for common case

Ard Biesheuvel ardb at kernel.org
Wed May 26 11:08:05 PDT 2021


On Wed, 26 May 2021 at 19:14, Eric Biggers <ebiggers at kernel.org> wrote:
>
> On Wed, May 26, 2021 at 12:07:28PM +0200, Ard Biesheuvel wrote:
> > AES-CCM (as used in WPA2 CCMP, for instance) typically involves
> > authenticate-only data, and operates on a single network packet, and so
> > the common case is for the authenticate, en/decrypt and finalize SIMD
> > helpers to all be called exactly once in sequence. Since
> > kernel_neon_end() now involves manipulation of the preemption state as
> > well as the softirq mask state, let's reduce the number of times we are
> > forced to call it to only once if we are handling this common case.
> >
> > Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> > ---
> >  arch/arm64/crypto/aes-ce-ccm-core.S |  1 +
> >  arch/arm64/crypto/aes-ce-ccm-glue.c | 74 +++++++++++---------
> >  2 files changed, 43 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
> > index 99a028e298ed..8adff299fcd3 100644
> > --- a/arch/arm64/crypto/aes-ce-ccm-core.S
> > +++ b/arch/arm64/crypto/aes-ce-ccm-core.S
> > @@ -124,6 +124,7 @@ SYM_FUNC_START(ce_aes_ccm_final)
> >  SYM_FUNC_END(ce_aes_ccm_final)
> >
> >       .macro  aes_ccm_do_crypt,enc
> > +     cbz     x2, 5f
> >       ldr     x8, [x6, #8]                    /* load lower ctr */
> >       ld1     {v0.16b}, [x5]                  /* load mac */
> >  CPU_LE(      rev     x8, x8                  )       /* keep swabbed ctr in reg */
> > diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
> > index 54bd2494a000..98159f2c49ae 100644
> > --- a/arch/arm64/crypto/aes-ce-ccm-glue.c
> > +++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
> > @@ -97,10 +97,8 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
> >  static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
> >                          u32 abytes, u32 *macp)
> >  {
> > -     kernel_neon_begin();
> >       ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
> >                            num_rounds(key));
> > -     kernel_neon_end();
> >  }
> [...]
> > +     if (req->assoclen)
> > +             ccm_calculate_auth_mac(req, mac);
> > +
>
> This still makes all the associated data be processed under a single
> kernel_neon_begin() / kernel_neon_end() pair, even if there is a large amount of
> it.  Shouldn't it be limited to a reasonable amount at a time, like 4K?
> This sort of thing has been considered a bug before, e.g. see
> commit 706024a52c6 ("crypto: arch/lib - limit simd usage to 4k chunks").
>
> You could do the entire CCM operation under a single pair as long as there isn't
> more than 4K of associated data.
>

Good point. I'll add a separate patch for that.
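
For illustration only, the 4K limit suggested above could be sketched
roughly as follows. This is not the follow-up patch (which is not part of
this message); it is a userspace mock-up under stated assumptions.
kernel_neon_begin() and kernel_neon_end() are replaced by hypothetical
stubs that only track state, mac_update_stub() is a hypothetical
placeholder for the real SIMD helper, and SIMD_CHUNK_MAX is an assumed
name for the 4K bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed name for the per-section bound; 4K as suggested in the review. */
#define SIMD_CHUNK_MAX 4096u

/* Hypothetical userspace stand-ins for the kernel primitives. */
static int neon_active;             /* nonzero while inside a begin/end pair */
static unsigned int neon_sections;  /* number of completed begin/end pairs */

static void kernel_neon_begin(void)
{
	assert(!neon_active);
	neon_active = 1;
}

static void kernel_neon_end(void)
{
	assert(neon_active);
	neon_active = 0;
	neon_sections++;
}

/* Hypothetical placeholder for the SIMD MAC helper: a plain byte sum. */
static void mac_update_stub(uint32_t *mac, const uint8_t *in, size_t len)
{
	assert(neon_active);	/* must only run inside a NEON section */
	for (size_t i = 0; i < len; i++)
		*mac += in[i];
}

/*
 * Process the input in pieces of at most SIMD_CHUNK_MAX bytes, leaving
 * the NEON section between pieces so that preemption and softirqs are
 * not held off for an unbounded amount of data.
 */
static void mac_update_chunked(uint32_t *mac, const uint8_t *in, size_t len)
{
	while (len) {
		size_t n = len < SIMD_CHUNK_MAX ? len : SIMD_CHUNK_MAX;

		kernel_neon_begin();
		mac_update_stub(mac, in, n);
		kernel_neon_end();

		in += n;
		len -= n;
	}
}
```

With input of at most 4K, the loop runs once, so the common single-packet
case still pays for only one begin/end pair; larger inputs simply take one
pair per 4K piece.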


