[RFC PATCH] arm64: neon: Remove support for nested or hardirq kernel-mode NEON

Mark Rutland mark.rutland at arm.com
Fri May 19 06:34:25 PDT 2017


On Fri, May 19, 2017 at 02:13:21PM +0100, Dave Martin wrote:
> On Fri, May 19, 2017 at 01:49:53PM +0100, Mark Rutland wrote:
> > Hi,
> > 
> > On Fri, May 19, 2017 at 12:26:39PM +0100, Dave Martin wrote:
> > > +static bool __maybe_unused kernel_neon_allowed(void)
> > > +{
> > > +	/*
> > > +	 * The per_cpu_ptr() is racy if called with preemption enabled.
> > > +	 * This is not a bug:  per_cpu(kernel_neon_busy) is only set
> > > +	 * when preemption is disabled, so we cannot migrate to another
> > > +	 * CPU while it is set, nor can we migrate to a CPU where it is set.
> > > +	 * So, if we find it clear on some CPU then we're guaranteed to find it
> > > +	 * clear on any CPU we could migrate to.
> > > +	 *
> > > +	 * If we are in between kernel_neon_begin()...kernel_neon_end(), the
> > > +	 * flag will be set, but preemption is also disabled, so we can't
> > > +	 * migrate to another CPU and spuriously see it become false.
> > > +	 */
> > > +	return !(in_irq() || in_nmi()) &&
> > > +		!local_read(this_cpu_ptr(&kernel_neon_busy));
> > > +}
> > 
> > I think it would be better to use the this_cpu ops for this, rather than
> > manually messing with the pointer.
> 
> I had thought that the properties of local_t were important here, but
> this_cpu ops seem equally appropriate.
> 
> > Here, we can use raw_cpu_read(kernel_neon_busy), given the comment
> > above.
> 
> What's the difference?  The raw_ variants aren't documented.  Do these
> not bother about atomicity between the address calculation and the read?

Yup.

Comparing raw_cpu_{x}, __this_cpu_{x}, and this_cpu_{x}:

* raw_cpu_{x} don't have any preemption checks, and don't disable
  preemption internally. Due to this, the address gen and operation can
  occur on different CPUs.

* __this_cpu_{x} have lockdep annotations to check that preemption is
  disabled, but otherwise behave the same as raw_cpu_{x}. i.e. they
  don't try to disable preemption internally.

* this_cpu_{x} ensure that preemption is disabled around the address gen
  and operation.

Since preemption may be possible, but we believe this is fine, we have
to use the raw form. We'll want to keep the commment explaining why.

Thanks,
Mark.



More information about the linux-arm-kernel mailing list