[PATCH v3 08/10] ARM: mxs: add ocotp read function

Thu Jan 6 04:13:01 EST 2011

On Thu, Jan 06, 2011 at 12:50:52AM +0000, Jamie Lokier wrote:
> Russell King - ARM Linux wrote:
> > On Wed, Jan 05, 2011 at 07:44:18PM +0000, Jamie Lokier wrote:
> > > 'git show 534be1d5' explains how it works: cpu_relax() flushes buffered
> > > writes from _this_ CPU, so that other CPUs which are polling can make
> > > progress, which avoids this CPU getting stuck if there is an indirect
> > > dependency (no matter how convoluted) between what it's polling and which
> > > it wrote just before.
> > > 
> > > So cpu_relax() is *essential* in some polling loops, not a hint.
> > > 
> > > In principle that could happen for I/O polling, if (a) buffered memory
> > > writes are delayed by I/O read transactions, and (b) the device state we're
> > > waiting on depends on I/O yet to be done on another CPU, which could be
> > > polling memory first (e.g. a spinlock).
> > > 
> > > I doubt (a) in practice - but what about buses that block during I/O read?
> > > (I have a chip like that here, but it's ARMv4T.)
> > 
> > Let's be clear - ARMv5 and below generally are well ordered architectures
> > within the limits of caching.  There are cases where the write buffer
> > allows two writes to pass each other.  However, for IO we generally map
> > these - especially for ARMv4 and below - as 'uncacheable unbufferable'.
> > So on these, if the program says "read this location" the pipeline will
> > stall until the read has been issued - and if you use the result in the
> > next instruction, it will stall until the data is available.  So really,
> > it's not a problem here.
> > 
> > ARMv6 and above have a weakly ordered memory model with speculative
> > prefetching, so memory reads/writes can be completely unordered.  Device
> > accesses can pass memory accesses, but device accesses are always visible
> > in program order with respect to each other.
> > 
> > So, if you're spinning in a loop reading an IO device, all previous IO
> > accesses will be completed (in all ARM architectures) before the result
> > of your read is evaluated.
> 
> No, that wasn't the scenario - it was:
> 
> You're spinning reading an IO device, whose state depends indirectly
> on a *CPU memory* write that is forever buffered.
> 
> (Go and re-read 'git show 534be1d5' if you haven't already.)

I know what that's about, and it's about memory based accesses _only_.

What you're talking about is a programming error.  Such errors cause
data corruption if you're talking about DMA stuff.

At the moment, the solution to that is to put whatever's necessary into
readl/writel to ensure that they behave as ordered operations with
respect to everything else.  You'll find that on ARM, writel has a
barrier before it to ensure memory writes are visible before the device
write, and on readl there's a barrier to ensure that no memory read can
happen before the IO device read.

cpu_relax() has nothing to do with ensuring ordering with devices.

With relaxed IO operations, the responsibility for ensuring proper ordering
between memory and IO falls to the programmer.