[PATCH v3 3/5] arm64: mte: implement CONFIG_ARM64_MTE_COMP

Alexander Potapenko glider at google.com
Thu Jul 20 05:00:42 PDT 2023


On Wed, Jul 19, 2023 at 11:06 PM Yury Norov <yury.norov at gmail.com> wrote:
>
> > Would it help if I add a comment explaining that iN is an N-bit
> > integer? Or do you prefer something like
> >
> >   r_sizes[0..4] : 5 x 7 bits
> >
> > ?
>
> Yes, that would help.
Ok.


> > >                      7 : Invalid handler
> >
> > No, 7 means "treat this handle as an out-of-line one". It is still
> > valid, but instead of tag data it contains a pointer.
>
> So, it's invalid combination for _inline_ handler, right?
Yes, that's right.

> Anyways, I'm
> waiting for an updated docs, so it will hopefully bring some light.
It's on the way.

> > > > +
> > > > +/* Transform tag ranges back into tags. */
> > > > +void ea0_ranges_to_tags(u8 *r_tags, short *r_sizes, int r_len, u8 *tags)
> > > > +{
> > > > +     int i, j, pos = 0;
> > > > +     u8 prev;
> > > > +
> > > > +     for (i = 0; i < r_len; i++) {
> > > > +             for (j = 0; j < r_sizes[i]; j++) {
> > > > +                     if (pos % 2)
> > > > +                             tags[pos / 2] = (prev << 4) | r_tags[i];
> > > > +                     else
> > > > +                             prev = r_tags[i];
> > > > +                     pos++;
>
> This code flushes tags at every 2nd iteration. Is that true that
> there's always an even number of iterations, i.e. rsizes is always
> even, assuming r_len can be 1?
>
> If not, it's possible to loose a tag, consider rlen == 1 and
> rsizes[0] == 1.

By design, ea0_ranges_to_tags() only accepts ranges that sum up to 256
(as produced by ea0_tags_to_ranges())
We could add a check for that, but given that it is an internal
helper, a comment should suffice.

But r_sizes[i] does not have to be even (otherwise we could've used
this fact for a more efficient compression).
For example, if the array of tags is {0xab, 0x0, 0x0, 0x0...}, then
r_len = 3, r_sizes[] = {1, 1, 254}, r_tags = {0x0a, 0x0b, 0x0}.

>
> If yes, you can simplify:
>
>      for (i = 0; i < r_len; i++)
>              for (j = 0; j < r_sizes[i]; j++)
>                      tags[pos++] = (r_tags[i] << 4) | r_tags[i];
>
> Anyways, in the test can you run all possible combinations?

I will extract code from compress_range_helper() to generate different
numbers of ranges and test against those.


> > > > +/* Translate allocation size into maximum number of ranges that it can hold. */
> > > > +static int ea0_size_to_ranges(int size)
> > > > +{
> > > > +     switch (size) {
> > > > +     case 8:
> > > > +             return 6;
> > > > +     case 16:
> > > > +             return 11;
> > > > +     case 32:
> > > > +             return 23;
> > > > +     case 64:
> > > > +             return 46;
> > > > +     default:
> > > > +             return 0;
> > > > +     }
> > > > +}
> > >
> > > I wonder if there's a math formula here? Can you explain where from
> > > those numbers come?
> >
> > I'll add a comment there.
> > Basically, for the inline case it is the biggest N such as 4 + 4*N +
> > 7*(N-1) <= 63
> >
> > (not 64, because Xarray steals one bit from us)
> >
> > For the out-of-line case it is the biggest N such as 6+4*N + 7*(N-1)
> > <= array size in bits (128, 256, or 512).
>
> So it's: N <= (BIT_CAPACITY + NUM4 - NUM7) / (TAGSZ + SZ)
>
> Doesn't look like a rocket science...

Note that the size of largest_idx is picked based on the largest N for
64-byte buffers.
And the size of r_sizes[] depends on the number of tags that fit into
128 bytes: if we throw away the largest range, each of the remaining
ones will fit into 7 bits.
But this will not be the case for e.g. 3-bit tags (we'll need 8-bit sizes then).

Changing mte_size_to_ranges() to "(size * 8 + MTE_BITS_PER_SIZE -
largest_bits) / (MTE_BITS_PER_TAG + MTE_BITS_PER_SIZE)" indeed looks
better than hardcoded numbers, but we should keep in mind that the
parameters depend on each other and cannot be easily changed.

>
> Why don't you code the formula instead of results? Are there any
> difficulties? Things may change. For example, in next spec they
> may bring 3- or 5-bit tags, and your compressor will crack loudly
> with hardcoded numbers.

The main reason for the compressor to crack loudly would be the
hardcoded assumption of the tags consisting of 4 bits.
I don't think it is practical to account on that number to change (and
e.g. use bitmap_read/bitmap_write to convert tags into ranges) - we'd
rather have a static assert for that.


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg



More information about the linux-arm-kernel mailing list