[PATCH] ARM: Add SWP/SWPB emulation for ARMv7 processors (v3)
Jamie Lokier
jamie at shareable.org
Wed Jan 6 16:53:36 EST 2010
Leif Lindholm wrote:
> > From: Jamie Lokier [mailto:jamie at shareable.org]
> > Then calling it like this:
> >
> > __user_swp_asm(data, address, res, "");
> > __user_swp_asm(data, address, res, "b");
>
> Neat.
> But how about, for clarity, keeping the calling syntax in the calling
> functions and add macros for the variants?:
>
> #define __user_swp_asm_generic(data, addr, res, B) \
> ...
> #define __user_swp_asm(data, addr, res) \
> __user_swp_asm_generic(data, addr, res, "")
> #define __user_swpb_asm(data, addr, res) \
> __user_swp_asm_generic(data, addr, res, "b")
I'd just call the generic one __user_swpX_data and call it directly. I
don't think the additional tiny macros add any clarity, particularly
as they are used in only one place, just a few lines down. But it's
totally subjective and up to you.
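
For anyone following along, the suffix trick relies on the compiler
concatenating adjacent string literals. A minimal sketch, not taken from
the patch: SWP_MNEMONIC and the two helper functions are hypothetical
names, and the template string only stands in for the real inline-asm
body, which is elided above.

```c
#include <string.h>

/*
 * Hypothetical illustration only: SWP_MNEMONIC() is not from the patch.
 * Adjacent string literals are concatenated at compile time, so the
 * suffix argument ("" or "b") selects the instruction variant. An
 * inline-asm template can use the same trick.
 */
#define SWP_MNEMONIC(B) "swp" B " %0, %1, [%2]"

/* Word variant: empty suffix. */
static const char *swp_template(void)
{
	return SWP_MNEMONIC("");
}

/* Byte variant: "b" suffix. */
static const char *swpb_template(void)
{
	return SWP_MNEMONIC("b");
}
```

Either calling style (passing the suffix at the call site, or hiding it
behind two wrapper macros) expands to the same string.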
> > > +static int emulate_swp(struct pt_regs *regs, unsigned int address,
> > > + unsigned int destreg, unsigned int data)
> >
> > > +static int emulate_swpb(struct pt_regs *regs, unsigned int address,
> > > + unsigned int destreg, unsigned int data)
> >
> > Two almost identical functions. I wonder if it would be better to
> > merge them and take a flag. It would also reduce the compiled code
> > size.
>
> I'm hesitant to add more than 4 arguments (adds stack overhead).
> Also, at least cs2009q3 gcc (4.3.3) seems to inline both of these, so
> not sure a codesize improvement would occur in practise.
If they are inlined, there is no stack overhead. Mainly I thought
it would make the source smaller/tidier :-)
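
To show what I mean by merging, here is a rough user-space sketch, not
the patch's code. emulate_swap and its is_byte flag are hypothetical, and
the GCC/Clang __atomic_exchange_n builtin stands in for whatever
LDREX/STREX sequence the kernel emulation would really use.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: one routine handles both SWP (word) and SWPB
 * (byte), selected by is_byte. Returns the old value at addr, which
 * the emulation would then write to the destination register.
 */
static uint32_t emulate_swap(void *addr, uint32_t data, int is_byte)
{
	if (is_byte)
		/* SWPB: only the low byte of data is stored. */
		return __atomic_exchange_n((uint8_t *)addr, (uint8_t)data,
					   __ATOMIC_RELAXED);

	/* SWP: full 32-bit exchange. */
	return __atomic_exchange_n((uint32_t *)addr, data, __ATOMIC_RELAXED);
}
```

With three arguments plus a flag this still fits in four registers, so
the argument-count worry would not apply to a shape like this one.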
> > Why is the smp_mb() needed? I don't doubt there's a reason, but I
> > don't see what it is.
>
> A DMB is required between acquiring a lock and accessing the protected
> resource, as well as between modifying a protected resource and
> releasing its lock. Because there is no way to tell whether the SWP
> performed a lock or unlock operation, inserting the barriers on
> either side seemed the safest way to ensure that code written for ARMv5
> or earlier would work as expected.
>
> I guess a case could be made that this is an application problem and
> should be resolved at that end.
That's a really good reason, and thanks for thinking of it: To make
ARMv5 application code that doesn't know about DMB work properly. Any
code which is DMB-aware is likely to use LDREX/STREX itself, so this
is a great match. :-)
However, please follow this advice from Documentation/SubmitChecklist:
>> 24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a
>> comment in the source code that explains the logic of what
>> they are doing and why.
I think a comment is even more important in this case, because the
barriers are an ABI design decision you've made; nobody could deduce
why they are there from SMP correctness alone.
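
Something along these lines is the kind of comment I have in mind. This
is only a sketch under assumptions: swap_with_barriers is a hypothetical
name, and C11 fences stand in for smp_mb() so that it builds outside the
kernel.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sketch: C11 fences used in place of smp_mb()/DMB. */
static uint32_t swap_with_barriers(_Atomic uint32_t *addr, uint32_t data)
{
	uint32_t old;

	/*
	 * Barrier before the swap: if this SWP is an unlock, all earlier
	 * accesses to the protected data must be visible before the lock
	 * word changes. Pre-ARMv6 code cannot issue the DMB itself.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	old = atomic_exchange_explicit(addr, data, memory_order_relaxed);

	/*
	 * Barrier after the swap: if this SWP is a lock, later accesses
	 * to the protected data must not be reordered before it. We
	 * cannot tell lock from unlock, so both barriers are needed.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	return old;
}
```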
Unfortunately there are other places in threaded code that need DMB
(for example any good implementation of pthread_once), so ARMv5
threaded code can still fail in subtle, unpredictable ways :-( Is it
possible to turn off weak memory ordering once a SWP instruction is
detected? Though even that would not ensure correctness of a
pthread_once equivalent if it runs before any mutex locks in a program :(
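
To illustrate why a pthread_once-style fast path needs ordering of its
own, here is a minimal sketch using C11 atomics. It is not from any
particular libc: once_done, once_lock, once_value, init_calls and
get_once_value are all hypothetical names.

```c
#include <stdatomic.h>

static atomic_flag once_lock = ATOMIC_FLAG_INIT; /* slow-path spinlock */
static _Atomic int once_done;                    /* set after init */
static int once_value;                           /* data being published */
static int init_calls;                           /* counts initialisations */

static int get_once_value(void)
{
	/*
	 * Fast path: no lock is taken, so no SWP is executed here. The
	 * acquire load pairs with the release store below; without that
	 * ordering a reader could see once_done set but once_value stale.
	 */
	if (!atomic_load_explicit(&once_done, memory_order_acquire)) {
		while (atomic_flag_test_and_set_explicit(&once_lock,
							 memory_order_acquire))
			; /* spin until the lock is free */

		/* Re-check under the lock so only one caller initialises. */
		if (!atomic_load_explicit(&once_done, memory_order_relaxed)) {
			once_value = 42;
			init_calls++;
			atomic_store_explicit(&once_done, 1,
					      memory_order_release);
		}
		atomic_flag_clear_explicit(&once_lock, memory_order_release);
	}
	return once_value;
}
```

ARMv5-era code would have written the fast path as a plain load, which
is exactly the case that barriers around SWP cannot repair.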
I wonder, too, about what the ARM ARM says about LDREX/STREX only
working on memory with the "Shared TLB attribute". Will single-threaded
code using SWP on mapped shared memory get its expected atomic
behaviour at all with your emulation?
> Good point, will do that in the next version.
>
> > > +#ifndef CONFIG_ALIGNMENT_TRAP
> > > + res = proc_mkdir("cpu", NULL);
>
> > ? Is that to work with different kernel versions?
>
> It's to ensure it would work (without console warnings) even if someone
> decides to disable ALIGNMENT_TRAP. An alternative would be to strip the
> creation of /proc/cpu out from mm/alignment.c and put it somewhere else
> (or move the stats file somewhere else - but it seemed logical to group
> with /proc/alignment).
Seems to me both should be sysfs CPU attributes anyway, but I don't
know much about that so don't take my word for it. The ifdef is kind
of ugly but maybe unavoidable.
-- Jamie