[PATCH RFC v2 riscv/for-next 0/5] Enable ftrace with kernel preemption for RISC-V

Andy Chiu andy.chiu at sifive.com
Wed Mar 20 09:37:00 PDT 2024


On Tue, Mar 19, 2024 at 10:50 PM Alexandre Ghiti <alex at ghiti.fr> wrote:
>
> On 11/03/2024 15:24, Andy Chiu wrote:
> > On Thu, Mar 7, 2024 at 11:57 PM Samuel Holland
> > <samuel.holland at sifive.com> wrote:
> >> Hi Alex,
> >>
> >> On 2024-03-07 7:21 AM, Alexandre Ghiti wrote:
> >>> But TBH, I have started thinking about the issue your patch is trying to deal
> >>> with. IIUC you're trying to avoid traps (or silent errors) that could happen
> > >>> because of concurrent accesses when patching is happening on an auipc/jalr pair.
> >>>
> >>> I'm wondering if instead, we could not actually handle the potential traps:
> >>> before storing the auipc + jalr pair, we could use a well-identified trapping
> >>> instruction that could be recognized in the trap handler as a legitimate trap.
> >>> For example:
> >>>
> >>>
> >>> auipc  -->  auipc  -->  XXXX  -->  XXXX  -->  auipc
> >>> jalr        XXXX        XXXX       jalr       jalr
> >>>
> >>>
> >>> If a core traps on a XXXX instruction, we know this address is being patched, so
> >>> we can return and probably the patching will be over. We could also identify
> >>> half patched word instruction (I mean with only XX).
> >> Unfortunately this does not work without some fence.i in the middle. The
> >> processor is free to fetch any instruction that has been written to a location
> >> since the last fence.i instruction. So it would be perfectly valid to fetch the
> > >> old auipc and new jalr or vice versa and not trap. This would happen if, for
> >> example, the two instructions were in different cache lines, and only one of the
> >> cache lines got evicted and refilled.
> >>
> >> But sending an IPI to run the fence.i probably negates the performance benefit.
> > Maybe something like x86, we can hook ftrace_replace_code() out and
> > batch send IPIs to prevent storms of remote fences. The solution Alex
> > proposed can save the code size for function entries. But we have to
> > send out remote fences at each "-->" transition, which is 4 sets of
> > remote IPIs. On the other hand, this series increases the per-function
> > patch size to 24 bytes. However, it decreases the number of remote
> > fences to 1 set.
> >
> > The performance hit could be observable for the auipc + jalr case,
> > because all remote cores will be executing on XXXX instructions and
> > take a trap at each function entry during code patching.
> >
> > Besides, this series would give us a chance not to send any remote
> > fences if we were to change only the destination of ftrace (e.g. to a
> > custom ftrace trampoline). As it would be a regular store for the
> > writer and regular load for readers, only fence w,w is needed.
> > However, I am not sure how often this particular use case comes up.
> > I'd need some time to investigate it.
> >
> >> Maybe there is some creative way to overcome this.
> >>
> >>> But please let me know if that's completely stupid and I did not understand the
> >>> problem, since my patchset to support svvptc, I am wondering if it is not more
> >>> performant to actually take very unlikely traps instead of trying to avoid them.
> >> I agree in general it is a good idea to optimize the hot path like this.
> >>
> >> Regards,
> >> Samuel
> >>
> > Regards,
> > Andy
> >
>
>
> So indeed my solution was way too naive and we've been discussing that
> with Björn lately. He worked a lot on that and came up with the solution
> he proposed here
> https://lore.kernel.org/linux-riscv/87zfv0onre.fsf@all.your.base.are.belong.to.us/
>
> The thing is ftrace seems to be quite broken as the ftrace kselftests
> raise a lot of issues which I have started to debug but are not that
> easy, so we are wondering if *someone* should not work on Björn's
> solution (or another, open to discussions) for 6.10. @Andy WDYT? Do you
> have free cycles? Björn could work on that too (and I'll help if needed).

Do you mean the FTRACE_STARTUP_TEST, or something else? I am also
happy to help with text patching issues. It would be great if we could
define the remaining work and split it up. Currently I am focusing on
getting dynamic ftrace to work with preemption and on getting rid of
stop_machine() while patching code. I am going to spin a new revision of
this patch series in a few days if possible. There are quite a few
things that need to be discussed and I'd like to join any conversation!
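
To make the destination-only update quoted above more concrete (a regular
store for the writer, a regular load for readers), here is a minimal
user-space sketch in C11. It is only an analogy under stated assumptions,
not the code from this series: ftrace_dest, set_ftrace_dest(),
traced_function_entry(), default_tracer() and custom_trampoline() are all
hypothetical names, and release/acquire ordering is used as a stand-in
because C11 has no exact equivalent of "fence w,w".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*tracer_fn)(int *hits);

/* Two stand-in tracer destinations (hypothetical). */
static void default_tracer(int *hits)    { *hits += 1; }
static void custom_trampoline(int *hits) { *hits += 100; }

/*
 * Hypothetical per-function slot that holds the tracer destination as
 * data. Changing it rewrites no instruction bytes, so in this model
 * there is nothing that would need a fence.i on remote harts.
 */
static _Atomic(tracer_fn) ftrace_dest = default_tracer;

/* Writer: a plain data store, ordered after any earlier stores. */
static void set_ftrace_dest(tracer_fn fn)
{
    assert(fn != NULL);
    atomic_store_explicit(&ftrace_dest, fn, memory_order_release);
}

/* Reader: the function entry loads the destination and calls through it. */
static void traced_function_entry(int *hits)
{
    tracer_fn fn = atomic_load_explicit(&ftrace_dest, memory_order_acquire);
    fn(hits);
}
```

Calling traced_function_entry() before and after
set_ftrace_dest(custom_trampoline) dispatches first to default_tracer() and
then to custom_trampoline(); a reader sees either the old or the new
destination, never a mix of the two.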

>
> Let me know what you think!
>
> Alex
>
>

Cheers,
Andy


