[PATCH 0/2] ioremap_wc on arm64

Jayachandran C jnair at caviumnetworks.com
Tue May 23 03:09:03 PDT 2017


On Mon, May 22, 2017 at 04:49:47PM +0100, Catalin Marinas wrote:
> On Mon, May 22, 2017 at 02:21:00PM +0000, Jayachandran C wrote:
> > On Mon, May 22, 2017 at 01:22:37PM +0100, Will Deacon wrote:
> > > On Mon, May 22, 2017 at 11:53:50AM +0000, Jayachandran C wrote:
> > > > On Mon, May 22, 2017 at 09:56:16AM +0100, Catalin Marinas wrote:
> > > > > On Mon, May 22, 2017 at 07:01:45AM +0000, Jayachandran C wrote:
> > > > > > From its definition, the device "gather" attribute seems to be a better
> > > > > > fit for implementing write combining mapping in ioremap_wc().  And on
> > > > > > ThunderX2, Device GRE mapping has optimizations that makes it much faster
> > > > > > than normal uncached mapping.
> > > > > > 
> > > > > > I am not sure of the reasoning behind the original decision to make
> > > > > > ioremap_wc use "Normal Non-Cached" attribute, since all the other variants
> > > > > > of ioremap use device attributes, and ioremap_wc looks like an exception.
> > > > > 
> > > > > The reason we kept it as Normal NC is that Device_GRE does not allow
> > > > > unaligned accesses.
> > > > 
> > > > There does not to be an expectation to have unaligned access on __iomem
> > > > pointers. I also see that memremap can call ioremap_wc (which is a Normal
> > > > mapping) or ioremap_wt(which is a Device mapping), so that is inconsistent
> > > > as well.
> > > > 
> > > > Was this added for a specific use case? Also, do you think this patchset
> > > > is acceptable?
> > > 
> > > I think it was used for framebuffers. Note that using normal-NC also aligns
> > > with arch/arm/ and, since normal-NC is strictly a more relaxed memory type
> > > than Device-GRE, I'm really not keen making on this change.
> > 
> > If there is a hardware requirement that Normal-NC has to be implemented
> > as more relaxed attribute than Device GRE, I can go back to the hardware
> > team on this. I did not see any text in the ARM ARM requiring this.
> 
> It depends what you mean by "relaxed". Normal NC allows unaligned
> accesses and also allows speculative loads (you don't get any ordering
> guarantees of the Device memory). If you mean skipping some transparent
> cache snooping, it may break other use-cases.

Ok. I was using Will's terminology here. This is not the main issue,
but worth understanding. If there is a requirement in ARM spec that
"Normal NC" must be "strictly more releaxed" than "Device GRE", I will
have to point that out to our hardware team.

> > Even if that is the case, having Normal attribute for ioremap just for
> > ioremap_wc is inconsistent. The Device Gathering attribute by its definition
> > is exactly what write combining requires. The memremap() API looks like a
> > better way to expose Normal-NC mapping (with additional features like
> > speculation etc.) if implemented correctly.
> 
> I agree, memremap() is a better way but this was merged in 4.3 while
> ioremap_wc() had been around for a long time. If you want to change its
> semantics, you'd need to go through all its uses in the kernel and make
> sure they *only* use I/O accessors with such memory. At a quick grep,
> there are several places where the __iomem pointer attribute is dropped
> shortly after ioremap() and the pointer is accessed directly. That's
> where things will break with Device GRE memory since the compiler
> doesn't guarantee aligned accesses.

There is a difference in behavior with regard to unaligned access between
ioremap_wc and ioremap already. I don't follow why unaligned access has to
supported for ioremap_wc but not for ioremap - it would be much better to
be consistent here.

> > Also, patch 1/2 should be useful anyway - can that be picked up?
> 
> The patch is harmless but are we going to have a user of Device_GRE?

This patch would make the attribute in MAIR and the index MT_DEVICE_GRE
useful for __ioremap. Otherwise I don't see the use in having the
attribute at  all. I can at least give this to people who want Device GRE
mapping, which is the fastest way to access device memory on ThunderX2.

JC.



More information about the linux-arm-kernel mailing list