[PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
Sricharan R
r.sricharan at ti.com
Wed Jul 24 23:49:37 EDT 2013
Hi Nicolas,
On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:
> On Wed, 24 Jul 2013, Sricharan R wrote:
>
>> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
>>> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>>>> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>>>>
>>>>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>>>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>>>>
>>>>>>> From: Sricharan R <r.sricharan at ti.com>
>>>>>>>
>>>>>>> The current phys_to_virt patching mechanism does not work
>>>>>>> for 64 bit physical addressesp. Note that constant used in add/sub
>>>>>>> instructions is encoded in to the last 8 bits of the opcode. So shift
>>>>>>> the _pv_offset constant by 24 to get it in to the correct place.
>>>>>>>
>>>>>>> The v2p patching mechanism patches the higher 32bits of physical
>>>>>>> address with a constant. While this is correct, in those platforms
>>>>>>> where the lowmem addressable physical memory spawns across 4GB boundary,
>>>>>>> a carry bit can be produced as a result of addition of lower 32bits.
>>>>>>> This has to be taken in to account and added in to the upper. The patched
>>>>>>> __pv_offset and va are added in lower 32bits, where __pv_offset can be
>>>>>>> in two's complement form when PA_START < VA_START and that can result
>>>>>>> in a false carry bit.
>>>>>>>
>>>>>>> e.g PA = 0x80000000 VA = 0xC0000000
>>>>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>>>>
>>>>>>> So adding __pv_offset + VA should never result in a true overflow. So in
>>>>>>> order to differentiate between a true carry, a extra flag __pv_sign_flag
>>>>>>> is introduced.
>>>>> First of all thanks for the review.
>>>>>
>>>>>> I'm still wondering if this is worth bothering about.
>>>>>>
>>>>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry
>>>>>> to propagate to the high word of the physical address as the VA space
>>>>>> cannot be larger than 0x40000000.
>>>>>>
>>>>> Agreed.
>>>>>
>>>>>> So is there really a case where:
>>>>>>
>>>>>> 1) physical memory is crossing the 4GB mark, and ...
>>>>>>
>>>>>> 2) physical memory start address is higher than virtual memory start
>>>>>> address needing a carry due to the 32-bit add overflow?
>>>>>>
>>>>> Consider below two cases of memory layout apart from one mentioned
>>>>> above where the carry is bit irrelevant as you rightly said.
>>>>>
>>>>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>>>> This can be patched as:
>>>>
>>>> mov phys_hi, #0x8
>>>> add phys_lo, virt, #0x40000000 @ carry ignored
>>>>
>>>>> 2) PA = 0x2_8000_0000, VA= 0xC000_000, absolute pv_offset = 0x1_C000_0000
>>>> mov phys_hi, #0x2
>>>> add phys_lo, virt, #0xc0000000 @ carry ignored
>>>>
>>>>> In both of these cases there a true carry which needs to be
>>>>> considered.
>>>> Well, not really. However, if you have:
>>>>
>>>> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
>>>>
>>>> ... then you need:
>>>>
>>>> mov phys_hi, #0x2
>>>> adds phys_lo, virt, #0x40000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> My question is: how likely is this?
>>>>
>>>> What is your actual physical memory start address?
>>> Agreed. In our case we do not have the Physical address crossing across
>>> 4GB. So ignoring the carry would have be been OK. But we are
>>> also addressing the other case where it would really crossover.
>>>
>>>> If we really need to cope with the carry, then the __pv_sign_flag should
>>>> instead be represented in pv_offset directly:
>>>>
>>>> Taking example #2 above, that would be:
>>>>
>>>> mov phys_hi, #0x1
>>>> adds phys_lo, virt, #0xc0000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is
>>>> 0xffff-ffff-c000-0000, meaning:
>>>>
>>>> mvn phys_hi, #0
>>>> add phys_lo, virt, #0xc0000000
>>>> adc phys_hi, phys_hi, #0
>>>>
>>>> So that would require a special case in the patching code where a mvn
>>>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>>>
>>>>
>>>> Nicolas
>>> Extending pv_offset to 64bit is really neat way. When PA > VA, then pv_offset
>>> is going to be actual value and not 2's complement. Fine here.
>>> When running from higher physical address space, we will always fall here.
>>>
>>> So for the second case where pv_offset is 0xffffffff .., (PA < VA)
>>> is a problem only when we run from lower physical address. So we can safely
>>> assume that the higher 32bits of PA are '0' and stub it initially. In this way we
>>> can avoid the special case.
>> Sorry, I missed one more point here. In the second case,we should patch it with
>> 0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
> I don't think I follow you here.
>
> Let's assume:
>
> phys_addr_t __pv_offset = PHYS_START - VIRT_START;
>
> If PA = 0x0-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0xffff-ffff-c000-0000.
>
> If PA = 0x2-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0x1-c000-0000.
>
> So the __virt_to_phys() assembly stub could look like:
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> phys_addr_t t;
>
> if if (sizeof(phys_addr_t) == 4) {
> __pv_stub(x, t, "add", __PV_BITS_31_24);
> } else {
> __pv_movhi_stub(t);
> __pv_add_carry_stub(x, t);
> }
>
> return t;
> }
>
> And...
>
> #define __pv_movhi_stub(y) \
> __asm__("@ __pv_movhi_stub\n" \
> "1: mov %R0, %1\n" \
> " .pushsection .pv_table,\"a\"\n" \
> " .long 1b\n" \
> " .popsection\n" \
> : "=r" (y) \
> : "I" (__PV_BITS_8_0))
>
> #define __pv_add_carry_stub(x, y) \
> __asm__("@ __pv_add_carry_stub\n" \
> "1: adds %Q0, %1, %2\n" \
> " adc %R0, %R0, #0\n" \
> " .pushsection .pv_table,\"a\"\n" \
> " .long 1b\n" \
> " .popsection\n" \
> : "+r" (y) \
> : "r" (x), "I" (__PV_BITS_31_24) \
> : "cc")
>
> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in
> the middle to determine the type of fixup needed. The fixup code would
> determine the shift needed on the value, and whether or not the low or
> high word of __pv_offset should be used according to those bits.
>
> Then, in the case where a mov is patched, you need to check if the high
> word of __pv_offset is 0xffffffff and if so the mov should be turned
> into a "mvn rn, #0".
>
> And there you are with all possible cases handled.
>
>
> Nicolas
Thanks and you have given the full details here.
Sorry if i was not clear on my previous response.
1) When i said special case can be avoided, i meant that
we need not differentiate the 0xfffffff case inside the
__virt_to_phy macro, but can handle it at the time of patching.
Your above code makes that clear.
2) I would have ended creating separate tables for 'mov' and 'add'
case. But again thanks to your above idea of augumenting the
__PV_BITS, with which we can find out run time. And 'mvn' would
be needed for moving '0xffffffff' . Now I can get rid
of the separate section that i created for 'mov' in my previous
version.
I will make the above suggestions and come back.
Regards,
Sricharan
More information about the linux-arm-kernel
mailing list