[PATCH v2] riscv: Add support for BATCHED_UNMAP_TLB_FLUSH
Nam Cao
namcao at linutronix.de
Mon Jan 8 12:43:46 PST 2024
On Mon, 8 Jan 2024 20:36:40 +0100 Alexandre Ghiti <alexghiti at rivosinc.com> wrote:
> Allow deferring TLB flushes when unmapping pages, which reduces the
> number of IPIs and the number of sfence.vma instructions.
>
> The microbenchmark used in commit 43b3dfdd0455 ("arm64: support
> batched/deferred tlb shootdown during page reclamation/migration"),
> made multithreaded to force the use of IPIs, shows a good performance
> improvement on all platforms:
>
> * Unmatched: ~34%
> * TH1520 : ~78%
> * Qemu : ~81%
>
> In addition, perf on qemu reports a significant decrease in the time
> spent dealing with IPIs:
>
> Before: 68.17% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
> After : 8.64% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
>
> * Benchmark:
>
> #define _GNU_SOURCE	/* for CPU_SET() and pthread_setaffinity_np() */
> #include <errno.h>
> #include <pthread.h>
> #include <sched.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> /* SIZE, the mapping size in bytes, is not defined in this excerpt. */
>
> /* Pin the calling thread to the given core. */
> int stick_this_thread_to_core(int core_id) {
> 	int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
> 	if (core_id < 0 || core_id >= num_cores)
> 		return EINVAL;
>
> 	cpu_set_t cpuset;
> 	CPU_ZERO(&cpuset);
> 	CPU_SET(core_id, &cpuset);
>
> 	pthread_t current_thread = pthread_self();
> 	return pthread_setaffinity_np(current_thread,
> 				      sizeof(cpu_set_t), &cpuset);
> }
>
> /* Idle thread: it only exists so that remote harts must be flushed. */
> static void *fn_thread (void *p_data)
> {
> 	stick_this_thread_to_core((int)(intptr_t)p_data);
>
> 	while (1) {
> 		sleep(1);
> 	}
>
> 	return NULL;
> }
>
> int main(void)
> {
> 	volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> 					 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> 	pthread_t threads[4];
> 	int ret;
>
> 	if (p == MAP_FAILED) {
> 		perror("mmap");
> 		return 1;
> 	}
>
> 	for (int i = 0; i < 4; ++i) {
> 		ret = pthread_create(&threads[i], NULL, fn_thread,
> 				     (void *)(intptr_t)i);
> 		if (ret)
> 		{
> 			printf("%s\n", strerror (ret));
> 		}
> 	}
>
> 	memset((void *)p, 0x88, SIZE);
>
> 	for (int k = 0; k < 10000; k++) {
> 		/* swap in */
> 		for (int i = 0; i < SIZE; i += 4096) {
> 			(void)p[i];
> 		}
>
> 		/* swap out */
> 		madvise((void *)p, SIZE, MADV_PAGEOUT);
> 	}
>
> 	for (int i = 0; i < 4; i++)
> 	{
> 		pthread_cancel(threads[i]);
> 	}
>
> 	for (int i = 0; i < 4; i++)
> 	{
> 		pthread_join(threads[i], NULL);
> 	}
>
> 	return 0;
> }
>
> Signed-off-by: Alexandre Ghiti <alexghiti at rivosinc.com>
> Reviewed-by: Jisheng Zhang <jszhang at kernel.org>
> Tested-by: Jisheng Zhang <jszhang at kernel.org> # Tested on TH1520
Before:
real 0m36.674s
user 0m0.173s
sys 0m36.493s
After:
real 0m18.016s
user 0m0.125s
sys 0m17.885s
Tested-by: Nam Cao <namcao at linutronix.de>
Best regards,
Nam