[PATCH v5 2/6] arch: unify ioremap prototypes and macro aliases

Tue Jul 7 03:13:30 PDT 2015

On Tue, Jul 07, 2015 at 11:50:12AM +0200, Luis R. Rodriguez wrote:
> mcgrof at ergon ~/linux-next (git::kill-mtrr)$ git grep ioremap_nocache drivers/| wc -l
> 359

Yes, it's because we have:
(a) LDD telling people they should be using ioremap_nocache() for mapping
    devices.
(b) We have documentation in the Documentation/ subdirectory telling people
    to use ioremap_nocache() for the same.

> This is part of the work I've been doing lately. The
> eventual goal once we have the write-combing areas properly split with
> ioremap_wc() and using the new proper preferred architecture agnostic modifier
> (arch_phys_wc_add()) is to change the default ioremap behaviour on x86 to use
> strong UC for PAT enabled systems for *both* ioremap() and ioremap_nocache().

Please note that on ARM, ioremap_wc() gives what's termed in ARM ARM
speak "normal memory, non-cacheable" - which can be subject to speculation,
write combining, multiple accesses, etc.  The important point is that
such mapping is not suitable for device registers, but is suitable for
device regions that have "memory like" properties (iow, a chunk of RAM,
like video drivers.)  It does support unaligned accesses.

> Because of these grammatical issues and the issues with
> unaligned access with ARM I think its important we put some effort
> to care a bit more about defining clear semantics through grammar
> for new APIs or as we rewrite APIs. We have tools to do this these
> days, best make use of them.

I'm in support of anything which more clearly specifies the requirements
for these APIs.

> While we're at it and reconsidering all this, a few items I wish for
> us to address as well then, most of them related to grammar, some
> procedural clarification:
> 
>   * Document it as not supported to have overlapping ioremap() calls.
>     No one seems to have a clue if this should work, but clearly this
>     is just a bad idea. I don't see why we should support the complexity
>     of having this. It seems we can write grammar rules to prevent this.

On ARM, we (probably) have a lot of cases where ioremap() is used multiple
times for the same physical address space, so we shouldn't rule out having
multiple mappings of the same type.  However, differing types would be a
problem on ARM.

>   * We seem to care about device drivers / kernel code doing unaligned
>     accesses with certain ioremap() variants. At least for ARM you should
>     not do unaligned accesses on ioremap_nocache() areas.

... and ioremap() areas.

If we can stop the "abuse" of ioremap_nocache() to map device registers,
then we could potentially switch ioremap_nocache() to be a normal-memory
like mapping, which would allow it to support unaligned accesses.

>     I am not sure
>     if we can come up with grammar to vet for / warn for unaligned access
>     type of code in driver code on some memory area when some ioremap()
>     variant is used, but this could be looked into. I believe we may
>     want rules for unaligned access maybe in general, and not attached
>     to certain calls due to performance considerations, so this work
>     may be welcomed regardless (refer to
>     Documentation/unaligned-memory-access.txt)
>     
>   * We seem to want to be pedantic about adding new ioremap() variants, the
>     unaligned issue on ARM is one reason, do we ideally then want *all*
>     architecture maintainers to provide an Acked-by for any new ioremap
>     variants ?

/If/ we get the current mess sorted out so that we have a safe fallback,
and we have understanding of the different architecture variants (iow,
documented what the safe fallback is) I don't see any reason why we'd
need acks from arch maintainers.  Unfortunately, we're not in that
situation today, because of the poorly documented mess that ioremap*()
currently is (and yes, I'm partly to blame for that too by not documenting
ARMs behaviour here.)

I have some patches (prepared last week, I was going to push them out
towards the end of the merge window) which address that, but unfortunately
the ARM autobuilders have been giving a number of seemingly random boot
failures, and I'm not yet sure what's going on... so I'm holding that
back until stuff has settled down.

Another issue is... the use of memcpy()/memset() directly on memory
returned from ioremap*().  The pmem driver does this.  This fails sparse
checks.  However, years ago, x86 invented the memcpy_fromio()/memcpy_toio()
memset_io() functions, which took a __iomem pointer (which /presumably/
means they're supposed to operate on the memory associated with an
ioremap'd region.)

Should these functions always be used for mappings via ioremap*(), and
the standard memcpy()/memset() be avoided?  To me, that sounds like a
very good thing, because that gives us more control over the
implementation of the functions used to access ioremap'd regions,
and the arch can decide to prevent GCC inlining its own memset() or
memcpy() code if desired.

Note that on x86, these three functions are merely wrappers around
standard memcpy()/memset(), so there should be no reason why pmem.c
couldn't be updated to use these accessors instead.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.