[PATCH v8 01/11] ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
Brian Norris
computersforpeace at gmail.com
Wed Aug 13 16:47:06 PDT 2014
Hi Russell,
Picking up this thread again, as things are now set for dropping this
patch and resubmitting SMP support for 3.18.
On Sat, Aug 02, 2014 at 10:27:56AM +0100, Russell King wrote:
> On Thu, Jul 31, 2014 at 03:06:42PM -0700, Brian Norris wrote:
> > Yes, I noticed this. What I meant is that smp_ops.cpu_die() and
> > smp_ops.cpu_kill() are not synchronized.
> ...
> > We're not relying on the L1 cache, though. Don't sync_cache_{r,w}()
> > ensure all reads/writes reach at least the L2?
>
> What exactly are you trying to achieve here?
Synchronization between v7_exit_coherency_flush() (on the dying CPU) and
yanking the power (brcmstb_cpu_kill(), on the murderous CPU). The core
completion-based synchronization is not sufficient, since it allows
brcmstb_smp_ops.smp_kill and brcmstb_smp_ops.smp_die to race.
Am I somehow not achieving what I intend here?
> > How does that ensure that the CPU is down by the time the work is
> > scheduled? It seems like this would just defer the work long enough that
> > it *most likely* has quiesced, but I don't see how this gives us a
> > better guarantee. Or maybe I'm missing something. (If so, please do
> > enlighten!)
>
> Note that I said a delayed work queue. The dying CPU runs a predictable
> sequence once cpu_die() has been entered - interrupts at the GIC have
> been programmed to be routed to other CPUs, interrupts at the CPU are
> masked, so the CPU isn't going to be doing anything else except executing
> that code path. So, it's going to be a predictable number of CPU cycles.
>
> That allows you to arrange a delayed workqueue (or a timer) to fire
> after a period of time that you can guarantee that the dying CPU has
> reached that wfi().
OK, that sounds workable for the active hotplug case.
But what about for the suspend case? CPUs are hot-unplugged during
disable_nonboot_cpus(), and I don't see that this would guarantee the
workqueue will complete before we enter suspend.
> Another point which raises itself in your patch is this:
>
> + /* Settle-time from Broadcom-internal DVT reference code */
> + udelay(7);
>
> 7us looks very precise, but udelay() may not be that precise. What is
> the actual specification? Does it say "you must wait at least 7us"?
>
> udelay() _may_ return early, especially if it is using the CPU delay
> loop to perform the delay - I've explained why this happens previously,
> and why it isn't a bug.
>
> If you're using a timer-based delay for udelay() (which you should be
> using if you support cpufreq) then the delay should be more accurate,
> but it's still good practise to give a little leeway on the figure.
I'm looking into this specific delay. I'd bet it's just "wait at least
7us." I could probably factor in some leeway to be safe.
Thanks,
Brian
More information about the linux-arm-kernel
mailing list