[RFC PATCH 3/5] RISC-V: hwprobe: Introduce which-cpus flag

Andrew Jones ajones at ventanamicro.com
Mon Oct 9 08:39:44 PDT 2023


On Thu, Oct 05, 2023 at 08:11:25PM +0200, Andrew Jones wrote:
> On Thu, Oct 05, 2023 at 10:12:14AM -0700, Evan Green wrote:
> > On Thu, Oct 5, 2023 at 6:23 AM Andrew Jones <ajones at ventanamicro.com> wrote:
> > >
> > > On Mon, Sep 25, 2023 at 09:16:01AM -0700, Evan Green wrote:
> ...
> > > > So if I write out this algorithm, I get something like:
> > > >  * Create an array of every possible key, and dedupe the caller's list
> > > > of pairs into this array.
> > > >  * For each remaining cpu, go through this array and either confirm
> > > > the big array's element matches this cpu's value, or clear the cpu
> > > > from the result set.
> > > >
> > > > But why do we go to all the effort of de-duping the caller's array of
> > > > pairs? Can't they do that themselves (or pay a small performance
> > > > penalty for "yes" results)? Instead, couldn't it be something like:
> > > > For each pair in the user's set, for each remaining cpu in the set,
> > > > compare the values, or clear the cpu in the remaining set.
> > > >
> > > > Doing that would also take the runtime from O(keyspace * ncpus) to
> > > > O(query_lengh * ncpus).
> > >
> > > I want to de-dupe for two reasons:
> > >  * query_length is unbounded, but keyspace is bounded (and is currently
> > >    small)
> > 
> > Ok, but remember that if we ship this behavior today, we're committed
> > to it forever. The keyspace is likely to grow, it would be unfortunate
> > if this step started to cause a noticeable performance delay.
> 
> Maybe it's not too late to put a bound on pairs, i.e. pair_count greater
> than some number should return E2BIG.
>

Scratch this idea. I went looking for precedent for limiting the length of
input arrays to syscalls, but couldn't find any. To the contrary, I found
that move_pages() used to return E2BIG when there were "too many pages to
move", but it hasn't done so since 2.6.29 and, even then, it appears the
concern was multiplication overflow, not having "too much" work. So, we
should leave the number of pairs unbounded, but, too me, that means we
should de-dupe, since we can do the copy_from_user() once for each at that
time, and any user which decides to provide each bit separately for each
bitmask key type should get tiny speedup.

Thanks,
drew



More information about the linux-riscv mailing list