[PATCH v3 5/6] ARM: atomics: prefetch the destination word for write prior to strex

Will Deacon will.deacon at arm.com
Wed Sep 18 05:54:46 EDT 2013


On Tue, Sep 17, 2013 at 07:09:35PM +0100, Nicolas Pitre wrote:
> On Tue, 17 Sep 2013, Will Deacon wrote:
> 
> > The cost of changing a cacheline from shared to exclusive state can be
> > significant, especially when this is triggered by an exclusive store,
> > since it may result in having to retry the transaction.
> > 
> > This patch prefixes our atomic access implementations with pldw
> > instructions (on CPUs which support them) to try and grab the line in
> > exclusive state from the start. Only the barrier-less functions are
> > updated, since memory barriers can limit the usefulness of prefetching
> > data.
> > 
> > Signed-off-by: Will Deacon <will.deacon at arm.com>
> 
> Acked-by: Nicolas Pitre <nico at linaro.org>

Thanks, Nicolas.

> By the way, did you measure significant performance improvements with 
> those patches?

Yep. Latest version shows around a 3% hackbench boost with -rc1 on my TC2.

Will
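
For readers following the thread, the idea in the quoted patch description
(hint the destination word for write, then run the barrier-less exclusive
load/store loop) can be sketched in portable C. This is only an illustration
under stated assumptions, not the code from the patch: sketch_atomic_add() is
a hypothetical name, the GCC/Clang builtin __builtin_prefetch() with its
second argument set to 1 stands in for the pldw instruction (a compiler may
lower it to pldw on CPUs which support it), and a relaxed C11
compare-exchange loop stands in for the ldrex/strex sequence.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper, not the kernel's atomic_add(): prefetch the
 * destination word for write, then perform a barrier-less atomic add.
 */
static inline void sketch_atomic_add(int i, atomic_int *v)
{
	int old;

	/* rw = 1 asks for a write prefetch, so the line can be
	 * requested in exclusive state before the update starts. */
	__builtin_prefetch((const void *)v, 1);

	/* Relaxed ordering mirrors the barrier-less variants. */
	old = atomic_load_explicit(v, memory_order_relaxed);
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;	/* 'old' was refreshed by the failed attempt; retry */
}
```

The prefetch is purely a hint: the result of the add is the same with or
without it, and any benefit depends on the CPU and on contention for the
cacheline.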


