[PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred()
Andrea Parri
parri.andrea at gmail.com
Wed Aug 14 17:34:19 PDT 2024
> <---- Thread 1 starts running here on CPU1
>
> <---- Thread 2 starts running here with same mm
>
> T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
> T2: <-- kernel sets current->mm->context.force_icache_flush to true
Mmh, TBH, I'm not sure how this patch is supposed to fix the race in
question:
For one, AFAIU, the operation
T2: cpumask_setall(&current->mm->context.icache_stale_mask)
(on CPU2?) you've added with this patch...
> T2: <modification of instructions>
> T2: fence.i
>
> T1: fence.i (to synchronize with other thread, has some logic to
> determine when to do this)
> T1: <-- thread 1 is preempted
> T1: <-- thread 1 is placed onto CPU3 and starts context switch sequence
> T1 (kernel): flush_icache_deferred() -> skips flush because switch_to_should_flush_icache() returns true
... does _not_ ensure that T1: flush_icache_deferred() on CPU3 will
observe/read from that operation: IOW,
T1: cpumask_test_and_clear_cpu(cpu, &mm->context.icache_stale_mask)
may still evaluate to FALSE, thus preventing the FENCE.I execution.
Moreover, AFAIU, ...
> -> thread has migrated and task->mm->context.force_icache_flush is true
>
> T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);
... moving the operation(s)
T2: set_icache_stale_mask()
T2: cpumask_setall(&current->mm->context.icache_stale_mask)
before the following operation... (per patch #1)
> T2 (kernel): kernel sets current->mm->context.force_icache_flush = false
>
> T1 (kernel): switch_to() calls switch_to_should_flush_icache() which now
> returns false because task->mm->context.force_icache_flush
> is false due to the other thread emitting
> PR_RISCV_CTX_SW_FENCEI_OFF.
... does not ensure that T1: switch_to_should_flush_icache() on CPU3
will observe
T1: cpumask_test_cpu(<cpu3>, &task->mm->context.icache_stale_mask) == true
In fact, T1 may evaluate the latter expression to FALSE while still
being able to observe the "later" operation, i.e.
T1: task->mm->context.force_icache_flush == false
Perhaps a simplified but useful way to look at such scenarios is as
follows:
- CPUs are just like nodes of a distributed system, and
- stores are like messages to be exchanged (by the memory subsystem)
between CPUs: without some (explicit) synchronization/constraints,
messages originating from a given CPU can propagate/be visible to
other CPUs at any time and in any order.
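To make this concrete, here is a litmus-test sketch in the
tools/memory-model format (the variable names are made up and only
mimic the stale-mask and force_icache_flush updates from the scenario
above):

  C mp-icache-sketch

  {}

  P0(int *stale, int *fencei_off)
  {
          /* prctl(FENCEI_ON): cpumask_setall(&mm->context.icache_stale_mask) */
          WRITE_ONCE(*stale, 1);
          /* prctl(FENCEI_OFF): mm->context.force_icache_flush = false */
          WRITE_ONCE(*fencei_off, 1);
  }

  P1(int *stale, int *fencei_off)
  {
          int r0;
          int r1;

          /* switch_to_should_flush_icache(): FENCEI_OFF already visible... */
          r0 = READ_ONCE(*fencei_off);
          /* ... while the stale bit is not */
          r1 = READ_ONCE(*stale);
  }

  exists (1:r0=1 /\ 1:r1=0)

The LKMM allows the "exists" clause above (P1 can observe the second
store and still miss the first); forbidding it would require ordering
on both sides, e.g. an smp_wmb()/smp_rmb() or an
smp_store_release()/smp_load_acquire() pairing.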
IAC, can you elaborate on the solution proposed here (maybe by adding
some inline comments), keeping the above considerations in mind? What
am I missing?
> T1 (back in userspace): Instruction cache was never flushed on context
> switch to CPU3, and thus may execute incorrect
> instructions.
Mmh, flushing the I$ (or, as meant here, executing a FENCE.I) seems
to be only half of the solution: IIUC, we'd like to ensure that the
store operation
T2: <modification of instructions>
originating from CPU2 is _visible_ to CPU3 by the time that FENCE.I
instruction is executed, cf.
[from Zifencei - emphasis mine]
A FENCE.I instruction ensures that a subsequent instruction fetch on
a RISC-V hart will see any previous data stores _already visible_ to
the same RISC-V hart.
IOW (but assuming code is in coherent main memory), imagine that the
(putative) FENCE.I on CPU3 is replaced by some
T1: LOAD reg,0(insts_addr)
The question is: would such a load be guaranteed to observe the store
T2: <modification of instructions> # STORE new_insts,0(insts_addr)
originating from CPU2? Can you elaborate?
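FWIW, the kind of pattern I'd expect to be needed is sketched below
(made-up flag name "patched", kernel-style primitives, and still
assuming the code lives in coherent main memory; this is only meant to
illustrate the ordering, not to suggest actual kernel code):

  /* CPU2 (T2): modify the instructions, then publish the change */
  WRITE_ONCE(*insts_addr, new_insts);
  smp_store_release(&patched, 1);   /* make the store observable */

  /* CPU3 (T1): wait until the store is visible, _then_ FENCE.I */
  while (!smp_load_acquire(&patched))
          cpu_relax();
  local_flush_icache_all();         /* executes FENCE.I */
  /* subsequent fetches from insts_addr now see new_insts */

Without something playing the role of that "publish"/"observe" step
(a release/acquire pair, an IPI, a lock, ...), the FENCE.I alone does
not guarantee that CPU3's instruction fetches observe the new
instructions.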
Andrea