[RFC PATCH v2 1/7] crypto: xctr - Add XCTR support

Eric Biggers ebiggers at kernel.org
Wed Feb 16 15:00:57 PST 2022


On Thu, Feb 10, 2022 at 11:28:06PM +0000, Nathan Huckleberry wrote:
> Add a generic implementation of XCTR mode as a template.  XCTR is a
> blockcipher mode similar to CTR mode.  XCTR uses XORs and little-endian
> addition rather than big-endian arithmetic which has two advantages:  It
> is slightly faster on little-endian CPUs and it is less likely to be
> implemented incorrect since integer overflows are not possible on
> practical input sizes.  XCTR is used as a component to implement HCTR2.
> 
> More information on XCTR mode can be found in the HCTR2 paper:
> https://eprint.iacr.org/2021/1441.pdf
> 
> Signed-off-by: Nathan Huckleberry <nhuck at google.com>
> 
> Changes since v1:
>  * Restricted blocksize to 16-bytes
>  * Removed xctr.h and u32_to_le_block
>  * Use single crypto_template instead of array
> ---

Changelog text conventionally goes in the cover letter, not in the individual
patches.  Having the changelog be above the scissors line ("---") is especially
problematic, as it will show up in the git commit message.

> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index fa1741bb568f..8543f34fa200 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -452,6 +452,15 @@ config CRYPTO_PCBC
>  	  PCBC: Propagating Cipher Block Chaining mode
>  	  This block cipher algorithm is required for RxRPC.
>  
> +config CRYPTO_XCTR
> +	tristate
> +	select CRYPTO_SKCIPHER
> +	select CRYPTO_MANAGER
> +	help
> +	  XCTR: XOR Counter mode. This blockcipher mode is a variant of CTR mode
> +	  using XORs and little-endian addition rather than big-endian arithmetic.
> +	  XCTR mode is used to implement HCTR2.

Now that this option isn't user-selectable, no one will see this help text.
I think it would be best to remove it, and make sure that the comment in
crypto/xctr.c fully explains what XCTR is (currently it's a bit inadequate).

> +/*
> + * Test vectors generated using https://github.com/google/hctr2
> + */
> +static const struct cipher_testvec aes_xctr_tv_template[] = {
[...]
> +		.klen	= 16,
> +		.len	= 255,
[...]
> +		.klen	= 16,
> +		.len	= 255,

I commented on the test vectors before in the context of the HCTR2 ones, but the
same comments apply here: the actual test coverage for the number of test
vectors included is not great, due to lengths being repeated.  It would be
better to vary the lengths a bit more, especially the message lengths.  What you
have here isn't bad, but I think there's some room for improvement.

> +/*
> + * XCTR mode is a blockcipher mode of operation used to implement HCTR2. XCTR is
> + * closely related to the CTR mode of operation; the main difference is that CTR
> + * generates the keystream using E(CTR + IV) whereas XCTR generates the
> + * keystream using E(CTR ^ IV).
> + *
> + * See the HCTR2 paper for more details:
> + *	Length-preserving encryption with HCTR2
> + *      (https://eprint.iacr.org/2021/1441.pdf)
> + */

The above comment could use a bit more detail, e.g. mentioning endianness as
well as the fact that XCTR avoids having to deal with multi-limb integers.

> +static void crypto_xctr_crypt_final(struct skcipher_walk *walk,
> +				   struct crypto_cipher *tfm, u32 byte_ctr)
> +{
> +	unsigned long alignmask = crypto_cipher_alignmask(tfm);
> +	u8 tmp[XCTR_BLOCKSIZE + MAX_CIPHER_ALIGNMASK];
> +	u8 *keystream = PTR_ALIGN(tmp + 0, alignmask + 1);
> +	u8 *src = walk->src.virt.addr;
> +	u8 *dst = walk->dst.virt.addr;
> +	unsigned int nbytes = walk->nbytes;
> +	__le32 ctr32 = cpu_to_le32(byte_ctr / XCTR_BLOCKSIZE + 1);
> +
> +	crypto_xor(walk->iv, (u8 *)&ctr32, sizeof(ctr32));
> +	crypto_cipher_encrypt_one(tfm, keystream, walk->iv);
> +	crypto_xor_cpy(dst, keystream, src, nbytes);
> +	crypto_xor(walk->iv, (u8 *)&ctr32, sizeof(ctr32));
> +}

When crypto_cipher_encrypt_one() is used instead of ->cia_encrypt directly, the
caller doesn't need to align the buffers, sincec crypto_cipher_encrypt_one()
handles that.

> +static int crypto_xctr_crypt_segment(struct skcipher_walk *walk,
> +				    struct crypto_cipher *tfm, u32 byte_ctr)
> +{
> +	void (*fn)(struct crypto_tfm *, u8 *, const u8 *) =
> +		   crypto_cipher_alg(tfm)->cia_encrypt;
> +	u8 *src = walk->src.virt.addr;
> +	u8 *dst = walk->dst.virt.addr;
> +	unsigned int nbytes = walk->nbytes;
> +	__le32 ctr32 = cpu_to_le32(byte_ctr / XCTR_BLOCKSIZE + 1);
> +
> +	do {
> +		/* create keystream */
> +		crypto_xor(walk->iv, (u8 *)&ctr32, sizeof(ctr32));
> +		fn(crypto_cipher_tfm(tfm), dst, walk->iv);
> +		crypto_xor(dst, src, XCTR_BLOCKSIZE);
> +		crypto_xor(walk->iv, (u8 *)&ctr32, sizeof(ctr32));
> +
> +		ctr32++;
> +
> +		src += XCTR_BLOCKSIZE;
> +		dst += XCTR_BLOCKSIZE;
> +	} while ((nbytes -= XCTR_BLOCKSIZE) >= XCTR_BLOCKSIZE);
> +
> +	return nbytes;
> +}

This won't work on big endian systems due to the 'ctr32++' on a __le32 variable.
I recommend installing 'sparse' and passing C=2 to make, as it will warn about
endianness bugs like this.

Either endianness needs to be converted for the increment, or it needs to be
converted when doing the XOR.  (The latter would work if the XOR is done
manually instead of with crypto_xor, which could be a nice optimization
separately from fixing the endianness bug.)

> +	/* Block size must be >= 4 bytes. */
> +	err = -EINVAL;
> +	if (alg->cra_blocksize != XCTR_BLOCKSIZE)
> +		goto out_free_inst;

The comment above is outdated.

- Eric



More information about the linux-arm-kernel mailing list