[PATCH v4 6/8] fsverity: improve performance by using multibuffer hashing
Eric Biggers
ebiggers at kernel.org
Tue Jun 4 11:42:20 PDT 2024
On Tue, Jun 04, 2024 at 05:37:36PM +0800, Herbert Xu wrote:
> On Mon, Jun 03, 2024 at 11:37:29AM -0700, Eric Biggers wrote:
> >
> > +	for (i = 0; i < ctx->num_pending; i++) {
> > +		data[i] = ctx->pending_blocks[i].data;
> > +		outs[i] = ctx->pending_blocks[i].hash;
> > +	}
> > +
> > +	desc->tfm = params->hash_alg->tfm;
> > +	if (params->hashstate)
> > +		err = crypto_shash_import(desc, params->hashstate);
> > +	else
> > +		err = crypto_shash_init(desc);
> > +	if (err) {
> > +		fsverity_err(inode, "Error %d importing hash state", err);
> > +		return false;
> > +	}
> > +	err = crypto_shash_finup_mb(desc, data, params->block_size, outs,
> > +				    ctx->num_pending);
> > +	if (err) {
> > +		fsverity_err(inode, "Error %d computing block hashes", err);
> > +		return false;
> > +	}
>
> So with ahash operating in synchronous mode (callback == NULL), this
> would look like:
>
> 	struct ahash_request *reqs[FS_VERITY_MAX_PENDING_DATA_BLOCKS];
>
> 	for (i = 0; i < ctx->num_pending; i++) {
> 		reqs[i] = fsverity_alloc_hash_request();
> 		if (!req) {
> 			free all reqs;
> 			return false;
> 		}
>
> 		if (params->hashstate)
> 			err = crypto_ahash_import(&reqs[i], params->hashstate);
> 		else
> 			err = crypto_ahash_init(&reqs[i]);
>
> 		if (err) {
> 			fsverity_err(inode, "Error %d importing hash state", err);
> 			free all reqs;
> 			return false;
> 		}
> 	}
>
> 	for (i = 0; i < ctx->num_pending; i++) {
> 		unsigned more;
>
> 		if (params->hashstate)
> 			err = crypto_ahash_import(req, params->hashstate);
> 		else
> 			err = crypto_ahash_init(req);
>
> 		if (err) {
> 			fsverity_err(inode, "Error %d importing hash state", err);
> 			free all requests;
> 			return false;
> 		}
>
> 		more = 0;
> 		if (i + 1 < ctx->num_pending)
> 			more = CRYPTO_TFM_REQ_MORE;
> 		ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP | more,
> 					   NULL, NULL);
> 		ahash_request_set_crypt(req, ctx->pending_blocks[i].sg,
> 					ctx->pending_blocks[i].hash,
> 					params->block_size);
>
> 		err = crypto_ahash_finup(req);
> 		if (err) {
> 			fsverity_err(inode, "Error %d computing block hashes", err);
> 			free all requests;
> 			return false;
> 		}
> 	}
>
> You're hiding some of the complexity by not allocating memory
> explicitly for each hash state. This might fit on the stack
> for two requests, but eventually you will have to allocate memory.
>
> With the ahash API, the allocation is explicit.
>

This doesn't make any sense, though. First, the requests need to be enqueued
for the task, but crypto_ahash_finup() would only be able to enqueue them in a
queue associated with the tfm, which is shared by many tasks. So it can't
actually work unless the tfm maintained a separate queue for each task, which
would be really complex. Second, it adds a memory allocation per block which is
very undesirable. You claim that it's needed anyway, but actually it's not;
with my API there is only one initial hash state regardless of how high the
interleaving factor is. In fact, if multiple initial states were allowed,
multibuffer hashing would become much more complex because the underlying
algorithm would need to validate that these different states are synced up. My
proposal is much simpler and avoids all this unnecessary overhead.
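
To make the single-initial-state point concrete, here is a toy userspace
sketch. It is not the kernel implementation: toy_state, toy_finup_mb() and
TOY_MAX_LANES are made-up names, and FNV-1a stands in for the real hash purely
to keep the example short. What it shows is only the shape of the interface:
one caller-owned initial state, N data buffers of equal length, N outputs, and
no per-block allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LANES 4			/* hypothetical interleaving limit */
#define TOY_FNV_PRIME 0x100000001b3ULL	/* FNV-1a 64-bit prime */

/* Toy stand-in for a hash state (for example a salted initial state). */
struct toy_state {
	uint64_t h;
};

static void toy_init(struct toy_state *st)
{
	st->h = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */
}

static void toy_update(struct toy_state *st, const uint8_t *p, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		st->h ^= p[i];
		st->h *= TOY_FNV_PRIME;
	}
}

/*
 * Hypothetical multibuffer finup.  Every lane is seeded from the same *st,
 * so there are no separate states to keep in sync, and the per-lane working
 * values live on the stack.
 */
static int toy_finup_mb(const struct toy_state *st,
			const uint8_t *const data[], size_t len,
			uint64_t outs[], size_t num_msgs)
{
	uint64_t lane[TOY_MAX_LANES];

	assert(st != NULL && data != NULL && outs != NULL);
	if (num_msgs > TOY_MAX_LANES)
		return -1;

	for (size_t j = 0; j < num_msgs; j++)
		lane[j] = st->h;	/* one shared initial state */

	/* Interleave: advance all lanes by one byte per outer iteration. */
	for (size_t i = 0; i < len; i++) {
		for (size_t j = 0; j < num_msgs; j++) {
			lane[j] ^= data[j][i];
			lane[j] *= TOY_FNV_PRIME;
		}
	}

	for (size_t j = 0; j < num_msgs; j++)
		outs[j] = lane[j];
	return 0;
}
```

Each output equals what hashing that buffer alone from the same initial state
would give, which is the property a caller like fsverity relies on. With
per-request states as in the ahash variant, the implementation would instead
have to accept N independently prepared states.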

Really the only reason to even consider ahash at all would be to try to support
software hashing and off-CPU hardware accelerators using the "same" code.
However, your proposal would not achieve that either, as it would not use the
async callback. Note, as far as I know no one actually cares about off-CPU
hardware accelerator support in fsverity anyway...

- Eric