[PATCH v8 01/14] riscv: prevent null-pointer dereference with sbi_remote_fence_i
Heiko Stübner
heiko at sntech.de
Thu Mar 31 05:28:06 PDT 2022
Hi,
On Thursday, 31 March 2022, 11:51:55 CEST, Christoph Hellwig wrote:
> On Thu, Mar 24, 2022 at 01:06:57AM +0100, Heiko Stuebner wrote:
> > The callback used inside sbi_remote_fence_i is set at sbi probe time
> > to the needed variant. Before that it is a NULL pointer.
> >
> > Some users like the flush_icache_*() functions suggest a generic
> > functionality, that doesn't depend on a specific boot-stage but
> > uses sbi_remote_fence_i as one option to flush other cpu cores.
> >
> > So they definitely shouldn't run into null-pointer dereference
> > issues when called "too early" during boot.
> >
> > So introduce an empty function to be the standard for the __sbi_rfence
> > function pointer until sbi_init has run.
> >
> > Users of sbi_remote_fence_i will have separate code for the local
> > cpu and sbi_init() is called before other cpus are brought up.
> > So there are no other cpus present at the time when the issue
> > might happen.
>
> I don't really understand this changelog. If flush_icache_* or
> other routines using SBI calls are called too early they won't
> do what they are asked to do, which implies a bug in the code.
>
> So crashing absolutely is the right thing to do here as we don't
> really have any other error reporting method available.
>
> So unless I'm totally misunderstanding what you are saying here:
>
> Nacked-by: Christoph Hellwig <hch at lst.de>
The function is defined as

	void flush_icache_all(void)
	{
		local_flush_icache_all();

		if (IS_ENABLED(CONFIG_RISCV_SBI))
			sbi_remote_fence_i(NULL);
		else
			on_each_cpu(ipi_remote_fence_i, NULL, 1);
	}

so it essentially flushes the _local_ icache first and then tries to flush
the caches on the other cores, either via an IPI or via SBI.

The remote-fence callback is set correctly during sbi_init(), and the
other cores are only brought up after sbi_init() is done.

So it's not really about error reporting, but about making sure that
flush_icache_all() does something sane even when still running on only
the first core, as I assume the "all" means all available cores (which
at that point would just be the core the system booted on).
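
To illustrate the idea, here is a simplified userspace sketch. It is not
the actual patch: every name ending in _stub, the rfence_* helpers and the
call counter are made up for illustration, and the real __sbi_rfence
callback takes more arguments than shown here. The only point is that the
function pointer starts out at a no-op instead of NULL and is switched to
the real variant at init time.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpumask. */
struct cpumask_stub {
	unsigned long bits;
};

/* Counts "real" remote fences; exists only to make the sketch observable. */
static int remote_fence_calls;

/*
 * No-op default. Before init only the boot CPU is running, so there is
 * nothing remote to fence and returning success is harmless.
 */
static int rfence_none(const struct cpumask_stub *mask)
{
	(void)mask;
	return 0;
}

/* Stand-in for the variant that would be selected at probe time. */
static int rfence_real(const struct cpumask_stub *mask)
{
	(void)mask;
	remote_fence_calls++;
	return 0;
}

/* The pointer starts at the no-op instead of NULL. */
static int (*rfence_cb)(const struct cpumask_stub *mask) = rfence_none;

/* Models the init step installing the real callback. */
static void sbi_init_stub(void)
{
	rfence_cb = rfence_real;
}

/* Models the remote-fence entry point: always safe to call. */
static int remote_fence_i_stub(const struct cpumask_stub *mask)
{
	return rfence_cb(mask);
}
```

Calling remote_fence_i_stub(NULL) before sbi_init_stub() simply returns 0
without touching anything, while the same call afterwards goes through
rfence_real(). That mirrors the early-boot case, where the local flush has
already happened and no other core exists yet.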
Does this make it clearer what this tries to solve?
Heiko