[PATCH V2 0/3] PTE fixes for arm and arm64

Russell King - ARM Linux linux at arm.linux.org.uk
Wed Mar 26 09:48:03 EDT 2014


On Wed, Mar 26, 2014 at 01:23:19PM +0000, Steve Capper wrote:
> On Wed, Mar 26, 2014 at 11:01:41AM +0000, Russell King - ARM Linux wrote:
> > On Wed, Mar 26, 2014 at 10:23:19AM +0000, Steve Capper wrote:
> > > If there are no objections, I was going to put the following into
> > > Russell's patch system:
> > > 	arm: mm: Double logical invert for pte accessors
> > > 	arm: mm: Switch back to L_PTE_WRITE
> > 
> > I'm not all that happy with double inversions - I think they just serve
> > to cause confusion (and it was confusing, which is why I removed it.)
> > I'll only take them if you have a really good reason why you want to
> > bring it back.
> 
> Hi Russell,
> The problem I'm trying to solve is for LPAE, where we have flags in the
> upper 32 bits of a page table entry that are tested for with a bitwise
> and, then subsequently downcast by a store to 32-bit integer:
> 
> 	gather_stats(page, md, pte_dirty(*pte), 1);
> and,
> 
> 	static inline unsigned long huge_pte_write(pte_t pte)
> 	{
> 		return pte_write(pte);
> 	}
> 
> (and other cases that may arise in future).

I think I have already said that these cases should be dealt with by
ensuring that they return sensible values in such cases.  The official
return type for pte_write() and pte_dirty() if they aren't a macro is
"int", and that makes a 64-bit AND operation returning a bit set in
the high 32-bits incorrect behaviour.

So, the return value from all these functions must fit within "int" and
be of the appropriate true/false indication according to C rules depending
on the test.

While we can use the shortcut of doing a 32-bit AND to test a bit in the
32-bit case, we can't use this with LPAE nor 64-bit PTEs where "int" is
not 64-bit - in that case, these functions must adjust the value
appropriately.

> I had tried to create a helper macro, pte_isset, but this didn't attract
> any positive comments:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/235380.html

It imposes overhead for _everyone_ whether they need that overhead or not.

The problem with !!value is that the compiler has to generate more code
to convert "value" into a one-or-zero in /every/ case, because by doing
that, you've told the compiler not "I want a true/false" value but "I
want a one or zero value".  So, what you end up with in the 32-bit case
is:

	load pte
	test pte bit
	set another register to 0 if test was zero
	set register to 1 if test was non-zero
	test register for zero or non-zero
	... do something ...

which is rather inefficient when you're doing that lots of times.  As I
say, we only need this if the bit being tested is not representable
within 32-bit.

So, rather than:

+#define pte_isset(pte, val)    (!!(pte_val(pte) & (val)))

maybe:

#define pte_isset(pte, val) ((u32)(val) == (val) ? pte_val(pte) & (val) : !!(pte_val(pte) & (val)))

What this says is, if the bit fits within 32-bit, use our existing logic,
otherwise use the new logic.  Since "val" will always be a constant, the
compiler should be able to optimise this, completely eliminating one or
other branches.  It would be worth checking the assembly from the above
on 32-bit LPAE, because the compiler will probably do a 64-bit test even
for values which fit in 32-bit - this may create even better code:

#define pte_isset(pte, val) ((u32)(val) == (val) ? (u32)pte_val(pte) & (u32)(val) : !!(pte_val(pte) & (val)))

but again, it needs the assembly read to work out how it behaves.

Also, it may be worth considering a pte_isclear() macro, since we don't
need the logic in that case - it can just be a plain and simple:

#define pte_isclear(pte, val) (!(pte_val(pte) & (val)))

since we always need the negation.  Again, as per the above, it may be
better on 32-bit LPAE whether a similar trick here would be worth it -
there's no point testing both halves of a 64-bit register pair when you
know that one half is always zero.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.



More information about the linux-arm-kernel mailing list