[PATCH v3 3/5] arm64: mte: implement CONFIG_ARM64_MTE_COMP
Yury Norov
yury.norov at gmail.com
Wed Jul 19 14:06:24 PDT 2023
On Wed, Jul 19, 2023 at 04:00:14PM +0200, Alexander Potapenko wrote:
> > > + * 4. For the inline case, the following values are stored in the 8-byte handle:
> > > + * largest_idx : i4
> > > + * r_tags[0..5] : i4 x 6
> > > + * r_sizes[0..4] : i7 x 5
> > > + * (if N is less than 6, @r_tags and @r_sizes are padded up with zero values)
> > > + *
> > > + * Because @largest_idx is <= 5, bit 63 of the handle is always 0 (so it can
> > > + * be stored in the Xarray), and bits 62..60 cannot all be 1, so it can be
> > > + * distinguished from a kernel pointer.
> >
> > I honestly tried to understand... For example, what the
> > 'r_sizes[0..4] : i7 x 5'
> > means? An array of 5 elements, 17 bits each? But it's alone greater
> > than size of pointer... Oh gosh...
>
> iN (note that it is a small i, not a 1) is LLVM notation for an N-bit
> integer type.
> There's no such thing in C syntax, and describing the data layout
> using bitfields would be painful.
> Would it help if I add a comment explaining that iN is an N-bit
> integer? Or do you prefer something like
>
> r_sizes[0..4] : 5 x 7 bits
>
> ?
Yes, that would help.
> >
> > Moreover, MTE tags are all 4-bits.
> >
> > It seems like text without pictures is above my mental abilities. Can you
> > please illustrate it? For example, from the #4, I (hopefully correctly)
> > realized that:
> >
> > Inline frame format:
> > 0 60 62 63
> > +---------------------------------------------------------------------+
> I think it's more natural to number bits from 63 to 0
>
> > | No idea what happens here | Lidx | X |
> > +---------------------------------------------------------------------+
> > 63 : X : RAZ : Reserved for Xarray.
> > 60-62 : Lidx : 0..5 : Largest element index.
>
> There's some mismatch now between this picture, where Lidx is i3, and
> the implementation that treats it as an i4, merging bit 63 into it.
> Maybe we can avoid this by not splitting off bit 63?
>
> > 6 : Reserved
> > 7 : Invalid handler
>
> No, 7 means "treat this handle as an out-of-line one". It is still
> valid, but instead of tag data it contains a pointer.
So, it's invalid combination for _inline_ handler, right? Anyways, I'm
waiting for an updated docs, so it will hopefully bring some light.
> > > +
> > > +/* Transform tag ranges back into tags. */
> > > +void ea0_ranges_to_tags(u8 *r_tags, short *r_sizes, int r_len, u8 *tags)
> > > +{
> > > + int i, j, pos = 0;
> > > + u8 prev;
> > > +
> > > + for (i = 0; i < r_len; i++) {
> > > + for (j = 0; j < r_sizes[i]; j++) {
> > > + if (pos % 2)
> > > + tags[pos / 2] = (prev << 4) | r_tags[i];
> > > + else
> > > + prev = r_tags[i];
> > > + pos++;
This code flushes tags at every 2nd iteration. Is that true that
there's always an even number of iterations, i.e. rsizes is always
even, assuming r_len can be 1?
If not, it's possible to loose a tag, consider rlen == 1 and
rsizes[0] == 1.
If yes, you can simplify:
for (i = 0; i < r_len; i++)
for (j = 0; j < r_sizes[i]; j++)
tags[pos++] = (r_tags[i] << 4) | r_tags[i];
Anyways, in the test can you run all possible combinations?
> > > + }
> > > + }
> > > +}
> > > +EXPORT_SYMBOL_NS(ea0_ranges_to_tags, MTECOMP);
> >
> > Because I didn't understand the compressed frame format, not sure I
> > can understand this logic...
> Hope the above comments will help, if not - please let me know.
>
> > > +
> > > +/* Translate @num_ranges into the allocation size needed to hold them. */
> > > +static int ea0_alloc_size(int num_ranges)
> > > +{
> > > + if (num_ranges <= 6)
> > > + return 8;
> > > + if (num_ranges <= 11)
> > > + return 16;
> > > + if (num_ranges <= 23)
> > > + return 32;
> > > + if (num_ranges <= 46)
> > > + return 64;
> > > + return 128;
> > > +}
> > > +
> > > +/* Translate allocation size into maximum number of ranges that it can hold. */
> > > +static int ea0_size_to_ranges(int size)
> > > +{
> > > + switch (size) {
> > > + case 8:
> > > + return 6;
> > > + case 16:
> > > + return 11;
> > > + case 32:
> > > + return 23;
> > > + case 64:
> > > + return 46;
> > > + default:
> > > + return 0;
> > > + }
> > > +}
> >
> > I wonder if there's a math formula here? Can you explain where from
> > those numbers come?
>
> I'll add a comment there.
> Basically, for the inline case it is the biggest N such as 4 + 4*N +
> 7*(N-1) <= 63
>
> (not 64, because Xarray steals one bit from us)
>
> For the out-of-line case it is the biggest N such as 6+4*N + 7*(N-1)
> <= array size in bits (128, 256, or 512).
So it's: N <= (BIT_CAPACITY + NUM4 - NUM7) / (TAGSZ + SZ)
Doesn't look like a rocket science...
Why don't you code the formula instead of results? Are there any
difficulties? Things may change. For example, in next spec they
may bring 3- or 5-bit tags, and your compressor will crack loudly
with hardcoded numbers.
More information about the linux-arm-kernel
mailing list