[PATCH] Add a counter and debug printf to change the latency inside tlb_sync
Jessica Clarke
jrtc27 at jrtc27.com
Fri Jan 10 11:12:03 PST 2025
On 10 Jan 2025, at 19:08, Jessica Clarke <jrtc27 at jrtc27.com> wrote:
>
> On 10 Jan 2025, at 02:16, Chao-ying Fu <icebergfu at gmail.com> wrote:
>>
>> On Thu, Jan 9, 2025 at 5:18 PM Bo Gan <ganboing at gmail.com> wrote:
>>>
>>> Hi Chaoying,
>>>
>>> I don't think this patch does anything at all. Using printf to delay is naive.
>>> Can you give concrete example of how the livelock happened and when it happens
>>> some precise number on the performance impact?
>>
>> Yes, the patch is naive, and we can treat it as the debug patch.
>> However, this patch does help us to avoid livelock.
>> From our 24-hart configuration, one hart cannot update the variable
>> tlb_sync, because another hart always wins to read the value. Then,
>> the tlb_sync loop never exits.
>
> That sounds like your core violates the forward progress guarantee
> required by the RISC-V spec.
Except atomic_add_return uses amoadd.d.aqrl, which doesn’t even need
such a guarantee, because it’s a single instruction, so that’s saying
your hart can wedge on AMOs. Unless you’ve also patched OpenSBI to use
LR/SC there, in which case we’re back to the forward progress guarantee
being violated.
I have serious concerns about this implementation.
Jess
More information about the opensbi
mailing list