Data Synchronization Barrier (DSB)
Mason
mpeg.blue at free.fr
Wed Dec 3 08:20:26 PST 2014
Hello everyone,
I have several naive questions about memory barriers.
I've skimmed memory-barriers.txt:
https://www.kernel.org/doc/Documentation/memory-barriers.txt
QUESTION 1
Are memory ordering issues and caching orthogonal?
In other words, are memory barriers needed even when accessing
non-cached memory (actual memory or device registers)?
QUESTION 1.1
These days, CPUs feature multiple cores, but the cores often share
the last level of cache. (Implied assumption: the cores are more
tightly coupled than in yesterday's SMP systems.) Do multi-core
systems change the need for memory barriers, compared to a
single-core system?
QUESTION 2
On my platform, the MMIO primitives are aliased to __raw_readl
and __raw_writel.
#define IO_ADDRESS(x) (0xf0000000 +(x))
#define gbus_read_reg32(r) __raw_readl((volatile void __iomem *)IO_ADDRESS(r))
#define gbus_write_reg32(r, v) __raw_writel(v, (volatile void __iomem *)IO_ADDRESS(r))
Arnd Bergmann has already pointed out:
"don't use __raw_readl in driver code, use readl or readl_relaxed"
In fact, on ARM platforms, __raw_readl does not insert any memory
barrier (or any compiler barrier, for that matter; the only constraints
are those imposed by the "volatile" keyword):
static inline u32 __raw_readl(const volatile void __iomem *addr)
{
	u32 val;
	asm volatile("ldr %1, %0"
		     : "+Qo" (*(volatile u32 __force *)addr),
		       "=r" (val));
	return val;
}
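To make the "volatile only" point concrete, here is a minimal userspace
sketch (not kernel code). The names fake_desc, fake_doorbell,
compiler_barrier and post_descriptor are hypothetical, and the doorbell
is an ordinary variable standing in for a device register. It assumes
GCC or Clang inline-asm syntax.

```c
#include <stdint.h>

/* Hypothetical stand-ins: a descriptor in normal memory and a "register". */
struct fake_desc {
	uint32_t addr;
	uint32_t len;
};

struct fake_desc desc;
volatile uint32_t fake_doorbell;

/* Compiler-only barrier (GCC/Clang syntax): emits no CPU instruction. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

void post_descriptor(uint32_t addr, uint32_t len)
{
	/* Plain stores: "volatile" on the doorbell does not order these. */
	desc.addr = addr;
	desc.len = len;

	/* Stops the compiler from moving the stores past this point. */
	compiler_barrier();

	/* Volatile store, standing in for an MMIO write. */
	fake_doorbell = 1;
}
```

Even with the compiler barrier, the CPU itself may still reorder the
stores; that is where a hardware barrier such as wmb() would come in.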
If I understand correctly, accessing memory-mapped registers without
memory barriers can lead to subtle bugs caused by memory reordering?
(This part is really unclear to me.)
Should I alias my primitives to ioread32 and iowrite32?
NOTE: iowrite32 calls outer_sync(), which seems to have a rather high
overhead. If I'm writing to 4 consecutive memory-mapped registers, do I
need to sync after each write?
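To illustrate what I mean, here is a rough userspace mock (not kernel
code) of the two variants. All names (fake_regs, mock_wmb,
mock_write_relaxed, mock_write, program_regs_strict,
program_regs_batched) are hypothetical. The mock only counts how many
barriers each variant issues; whether the batched variant is sufficient
on real hardware is exactly what I am asking.

```c
#include <stdint.h>

#define NUM_REGS 4

uint32_t fake_regs[NUM_REGS];	/* hypothetical register file */
unsigned int barrier_count;	/* number of mock barriers issued */

/* Mock of an expensive write barrier; it only counts invocations. */
void mock_wmb(void)
{
	__asm__ __volatile__("" : : : "memory");	/* compiler barrier only */
	barrier_count++;
}

/* Mock of a relaxed register write: volatile store, no barrier. */
void mock_write_relaxed(unsigned int idx, uint32_t val)
{
	*(volatile uint32_t *)&fake_regs[idx] = val;
}

/* Mock of a barriered write: barrier first, then the store. */
void mock_write(unsigned int idx, uint32_t val)
{
	mock_wmb();
	mock_write_relaxed(idx, val);
}

/* Variant 1: one barrier per register write. */
void program_regs_strict(const uint32_t *vals)
{
	for (unsigned int i = 0; i < NUM_REGS; i++)
		mock_write(i, vals[i]);
}

/* Variant 2: a single barrier up front, then relaxed writes. */
void program_regs_batched(const uint32_t *vals)
{
	mock_wmb();
	for (unsigned int i = 0; i < NUM_REGS; i++)
		mock_write_relaxed(i, vals[i]);
}
```

With 4 registers, variant 1 issues 4 barriers and variant 2 issues 1.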
Regards.
Notes for my own reference...
Comment from barrier.h
/*
 * Force strict CPU ordering. And yes, this is required on UP too when we're
 * talking to devices.
 *
 * Fall back to compiler barriers if nothing better is provided.
 */
/* IO barriers */
#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE /* y */
#include <asm/barrier.h>
#define __iormb() rmb()
#define __iowmb() wmb()
#elif defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
#define mb() do { dsb(); outer_sync(); } while (0)
#define rmb() dsb()
#define wmb() do { dsb(st); outer_sync(); } while (0)
#if __LINUX_ARM_ARCH__ >= 7
#define dsb(option) __asm__ __volatile__ ("dsb " #option : : : "memory")
#ifdef CONFIG_OUTER_CACHE_SYNC
static inline void outer_sync(void)
{
	if (outer_cache.sync)
		outer_cache.sync();
}

static void l2x0_cache_sync(void)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&l2x0_lock, flags);
	cache_sync();
	raw_spin_unlock_irqrestore(&l2x0_lock, flags);
}

static inline void cache_sync(void)
{
	void __iomem *base = l2x0_base;

	writel_relaxed(0, base + sync_reg_offset);
	cache_wait(base + L2X0_CACHE_SYNC, 1);
}