[PATCH 1/2] RISC-V: Probe for unaligned access speed

David Laight David.Laight at ACULAB.COM
Fri Jun 30 01:29:42 PDT 2023


...
> Yeah, one thing I could do is disable interrupts, measure the cycle
> count of doing an individual iteration, do this N times, and take the
> minimum value as the time to compare. In the end I'll then have two
> numbers to compare, like I do in this patch. In theory the variance on
> that should be really tight. N will have to depend on the overall
> amount of time I'm taking so as not to shut interrupts off for very
> long. Let me experiment with this and see how the results look.
> -Evan

I doubt you'll need many iterations or a long test.

You can do tests in userspace without disabling pre-emption
or interrupts - the large/silly values they generate are
easily ignored.

I suspect you'll get enough info from something like:
	unsigned long x[2];
	volatile unsigned long *p = (void *)((unsigned char *)x + 1);
	full_cpu_barrier();
	start = rdtsc();
	full_cpu_barrier();
	*p; *p; *p; *p; *p; *p; *p; *p;
	*p; *p; *p; *p; *p; *p; *p; *p;
	full_cpu_barrier();
	elapsed = rdtsc() - start;
Once the i-cache is loaded it should be pretty constant.
For aligned addresses I'd expect each extra '*p' to cost
one more clock.
With hardware support for misaligned transfers I'd expect at most
2 clocks each (test on x86 and it will be 1 clock).
The emulated version will take 100s or 1000s of clocks.
	
I'm not sure how much of a cpu barrier you need.
It definitely needs to wait for all the memory accesses
and for the rdtsc() to complete.
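
A userspace sketch of the measurement above, taking the minimum over
several runs so that the large values caused by interrupts or pre-emption
are simply ignored. It is only an illustration under assumptions: now_ns(),
full_barrier() and min_load_ns() are hypothetical names, clock_gettime()
stands in for rdtsc(), and the GCC/Clang builtin __atomic_thread_fence()
stands in for full_cpu_barrier(), which may well be weaker than what a real
cycle-counter test needs. The misaligned dereference is deliberate and is
technically undefined behaviour in ISO C; it relies on the compiler emitting
a single load.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Assumption: monotonic nanoseconds as a portable stand-in for rdtsc(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Assumption: a full fence builtin as a stand-in for full_cpu_barrier(). */
static inline void full_barrier(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* Time 16 back-to-back volatile loads through p, once. */
static uint64_t time_loads_once(volatile unsigned long *p)
{
	uint64_t start;

	full_barrier();
	start = now_ns();
	full_barrier();
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	(void)*p; (void)*p; (void)*p; (void)*p;
	full_barrier();
	return now_ns() - start;
}

/*
 * Run the timed loads 'runs' times at byte offset 'offset' into an aligned
 * buffer and return the smallest elapsed time seen.
 * Returns UINT64_MAX if runs is 0 or the offset would read past the buffer.
 */
uint64_t min_load_ns(unsigned int offset, unsigned int runs)
{
	static unsigned long x[2];
	volatile unsigned long *p;
	uint64_t best = UINT64_MAX;
	unsigned int i;

	if (offset > sizeof(unsigned long))
		return UINT64_MAX;

	/* Deliberately misaligned when offset is not a multiple of the word size. */
	p = (volatile unsigned long *)(void *)((unsigned char *)x + offset);

	for (i = 0; i < runs; i++) {
		uint64_t t = time_loads_once(p);

		if (t < best)
			best = t;
	}
	return best;
}
```

Comparing min_load_ns(0, N) with min_load_ns(1, N) then gives the two
numbers to compare. The first run mostly pays for loading the i-cache, which
is another reason to keep the minimum rather than the mean.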

	David


