[PATCH v5 2/5] lib/test_bitmap: add tests for bitmap_{read,write}()

Alexander Potapenko glider at google.com
Fri Sep 29 01:54:59 PDT 2023


On Thu, Sep 28, 2023 at 10:02 PM Yury Norov <yury.norov at gmail.com> wrote:
>
> On Thu, Sep 28, 2023 at 05:14:55PM +0200, Alexander Potapenko wrote:
> >
> > So e.g. for compressing something into a 16-byte buffer using bitmaps
> > I'd need to:
> >
> > 1) Allocate the buffer: buf = kmem_cache_alloc(...)
> > 2) Allocate the bitmap: bitmap = bitmap_alloc(16*8, ...)
> > 3) Fill the bitmap: mte_compress_to_buf(..., bitmap, 16)
> > 4) Copy the bitmap contents to the buffer: bitmap_to_arr64(buf, bitmap, 16*8)
> > 5) Deallocate the bitmap: bitmap_free(bitmap)
> >
> > instead of:
> >
> > buf = kmem_cache_alloc(...)
> > mte_compress_to_buf(..., (unsigned long *)buf, 16)
> >
> > , correct?
> >
> > Given that the buffer contents are opaque and its size is aligned on 8
> > bytes, could it be possible to somehow adopt the `buf` pointer
> > instead?
>
> I didn't find an explicit typecasting where you're using
> mte_compress_to_buf(), but now after hard 2nd look I see...
>
> Firstly, now that in the documentation you are explicitly describing the
> return value of mte_compress() as 64-bit frame, the right way to go would
> be declaring the function as: u64 mte_compress(u8 *tags).

Ack.

> And the general pattern should be like this:
>
>   unsigned long mte_compress(u8 *tags)
>   {
>           DECLARE_BITMAP(tmp, MTECOMP_CACHES_MAXBITS);
>           void *storage;
>           ...
>           if (alloc_size < MTE_PAGE_TAG_STORAGE) {
>                   storage = kmem_cache_alloc(cache, GFP_KERNEL);
>                   mte_compress_to_buf(r_len, r_tags, r_sizes, tmp, alloc_size);
>
>                   switch (alloc_size) {
>                   case 16:
>                           bitmap_to_arr16(storage, tmp, 16);

I might be missing something, but why do we need the switch at all?
The buffers we are allocating always contain a whole number of u64's -
cannot we just always call bitmap_to_arr64()?

Note that for cases where alloc_size is > 8 we never make any
assumptions about the contents of @storage, and don't care much about
the byte order as long as swap decompression is done with the same
endianness (which is always the case).
(The case where alloc_size==8 is somewhat special, and needs more
accurate handling, because we do make assumptions about the bit layout
there).

>                           break;
>                   case 32:
>                           bitmap_to_arr32(storage, tmp, 32);
>                           break;
>                   case 64:
>                           bitmap_to_arr64(storage, tmp, 64);
>                           break;
>                   default:
>                           pr_err("error\n");
>                   }
>                   result = ((u64)storage | cache_id) & MTE_HANDLE_MASK;
>                   goto ret;
>           }
>           ...
>   }
>
> Yeah, it looks cumbersome, but this is the right way to go if you need a
> reliable BE-compatible driver.

What is the BE compatibility problem that you are anticipating here?

The issue that came up during testing was that on systems with
different endianness we cannot treat unaligned buffers as bitmaps,
because bitmap_write() might be touching memory past the allocation
size.
I agree therefore with the general approach of encapsulating all
bitmap operations in bitmap.h and hiding the implementation details
behind the API.
But in this particular case it seems to complicate the otherwise
trivial application of bitmap_read()/bitmap_write() to an external
buffer with guaranteed 64-bit alignment: we'll end up allocating this
temporary bitmap and doing a hard-to-optimize memcpy() between it and
the final storage.

A straightforward solution for this would be to fork u64* versions of
bitmap_read()/bitmap_write() implementations in mtecomp.c and apply
them to u64* buffers allocated in that file.
This would remove the need for the temporary bitmap (because there's
no encapsulation that mandates using bitmap.h now) and the extra
memcpy().
But I don't like this either, because forking code that exists in
headers is just wrong.

> I think it will be less scary if you wrap
> the switch with a helper, and/or move it inside mte_compress_to_buf(),
> so that the mte_compress will stay unchanged.
>
> Anyways, hope the above helped.
>
> Thanks,
> Yury


--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg



More information about the linux-arm-kernel mailing list