[PATCH] arm64/mm: Validate hotplug range before creating linear mapping

Ard Biesheuvel ardb at kernel.org
Wed Sep 30 07:01:56 EDT 2020


On Wed, 30 Sep 2020 at 10:03, Anshuman Khandual
<anshuman.khandual at arm.com> wrote:
>
>
> On 09/29/2020 08:52 PM, Will Deacon wrote:
> > On Tue, Sep 29, 2020 at 01:34:24PM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 09/29/2020 02:05 AM, Will Deacon wrote:
> >>> On Thu, Sep 17, 2020 at 02:16:42PM +0530, Anshuman Khandual wrote:
> >>>> During memory hotplug process, the linear mapping should not be created for
> >>>> a given memory range if that would fall outside the maximum allowed linear
> >>>> range. Else it might cause memory corruption in the kernel virtual space.
> >>>>
> >>>> Maximum linear mapping region is [PAGE_OFFSET..(PAGE_END -1)] accommodating
> >>>> both its ends but excluding PAGE_END. Max physical range that can be mapped
> >>>> inside this linear mapping range, must also be derived from its end points.
> >>>>
> >>>> When CONFIG_ARM64_VA_BITS_52 is enabled, PAGE_OFFSET is computed with the
> >>>> assumption of 52 bits virtual address space. However, if the CPU does not
> >>>> support 52 bits, then it falls back using 48 bits instead and the PAGE_END
> >>>> is updated to reflect this using the vabits_actual. As for PAGE_OFFSET,
> >>>> bits [51..48] are ignored by the MMU and remain unchanged, even though the
> >>>> effective start address of linear map is now slightly different. Hence, to
> >>>> reliably check the physical address range mapped by the linear map, the
> >>>> start address should be calculated using vabits_actual. This ensures that
> >>>> arch_add_memory() validates memory hot add range for its potential linear
> >>>> mapping requirement, before creating it with __create_pgd_mapping().
> >>>>
> >>>> Cc: Catalin Marinas <catalin.marinas at arm.com>
> >>>> Cc: Will Deacon <will at kernel.org>
> >>>> Cc: Mark Rutland <mark.rutland at arm.com>
> >>>> Cc: Ard Biesheuvel <ardb at kernel.org>
> >>>> Cc: Steven Price <steven.price at arm.com>
> >>>> Cc: Robin Murphy <robin.murphy at arm.com>
> >>>> Cc: David Hildenbrand <david at redhat.com>
> >>>> Cc: Andrew Morton <akpm at linux-foundation.org>
> >>>> Cc: linux-arm-kernel at lists.infradead.org
> >>>> Cc: linux-kernel at vger.kernel.org
> >>>> Fixes: 4ab215061554 ("arm64: Add memory hotplug support")
> >>>> Signed-off-by: Anshuman Khandual <anshuman.khandual at arm.com>
> >>>> ---
> >>>>  arch/arm64/mm/mmu.c | 27 +++++++++++++++++++++++++++
> >>>>  1 file changed, 27 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >>>> index 75df62fea1b6..d59ffabb9c84 100644
> >>>> --- a/arch/arm64/mm/mmu.c
> >>>> +++ b/arch/arm64/mm/mmu.c
> >>>> @@ -1433,11 +1433,38 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size)
> >>>>    free_empty_tables(start, end, PAGE_OFFSET, PAGE_END);
> >>>>  }
> >>>>
> >>>> +static bool inside_linear_region(u64 start, u64 size)
> >>>> +{
> >>>> +  /*
> >>>> +   * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)]
> >>>> +   * accommodating both its ends but excluding PAGE_END. Max physical
> >>>> +   * range which can be mapped inside this linear mapping range, must
> >>>> +   * also be derived from its end points.
> >>>> +   *
> >>>> +   * With CONFIG_ARM64_VA_BITS_52 enabled, PAGE_OFFSET is defined with
> >>>> +   * the assumption of 52 bits virtual address space. However, if the
> >>>> +   * CPU does not support 52 bits, it falls back using 48 bits and the
> >>>> +   * PAGE_END is updated to reflect this using the vabits_actual. As
> >>>> +   * for PAGE_OFFSET, bits [51..48] are ignored by the MMU and remain
> >>>> +   * unchanged, even though the effective start address of linear map
> >>>> +   * is now slightly different. Hence, to reliably check the physical
> >>>> +   * address range mapped by the linear map, the start address should
> >>>> +   * be calculated using vabits_actual.
> >>>> +   */
> >>>> +  return ((start >= __pa(_PAGE_OFFSET(vabits_actual)))
> >>>> +                  && ((start + size) <= __pa(PAGE_END - 1)));
> >>>> +}
> >>>
> >>> Why isn't this implemented using the existing __is_lm_address()?
> >>
> >> Not sure, if I understood your suggestion here. The physical address range
> >> [start..start + size] needs to be checked against maximum physical range
> >> that can be represented inside effective boundaries for the linear mapping
> >> i.e [__pa(_PAGE_OFFSET(vabits_actual)..__pa(PAGE_END - 1)].
> >>
> >> Are you suggesting [start..start + size] should first be converted into
> >> a virtual address range and then checked against __is_lm_address() ? But
> >> is not deriving the physical range from the known limits of the linear
> >> mapping much cleaner ?
> >
> > I just think having a function called "inside_linear_region()" as well as a
> > macro called "__is_lm_address()" is weird when they have completely separate
> > implementations. They are obviously trying to do the same thing; the first
> > one is just given physical addresses as parameters.
> >
> > Implementing one in terms of the other is much better for maintenance.
>
> /*
>  * The linear kernel range starts at the bottom of the virtual address
>  * space. Testing the top bit for the start of the region is a
>  * sufficient check and avoids having to worry about the tag.
>  */
> #define __is_lm_address(addr)   (!(((u64)addr) & BIT(vabits_actual - 1)))
>
> __is_lm_address() currently just checks the highest bit in a virtual address,
> i.e. where the linear mapping ends (the lower half) and all other kernel
> mappings start (the upper half). But I believe it misses the blind range
> [_PAGE_OFFSET(VA_BITS).._PAGE_OFFSET(vabits_actual)] in some configurations,
> even though that does not really affect anything because it gets ignored by the
> MMU. Hence in its current form __is_lm_address() cannot be used to derive the
> maximum linear range and its corresponding physical range for the hotplug
> range check.
>

This is actually something that I brought up when the 52-bit VA
changes were under review: currently, we split the VA space in half,
and use the lower half for the linear range, and the upper half for
vmalloc, kernel, modules, vmemmap, etc.

When we run a 52-bit kernel on 52-bit capable hardware, we mostly
stick with the same arrangement: 2^51 bytes of linear range, but only
2^47 bytes of vmalloc range (as its size is fixed), and so we are
wasting 15/16 of the vmalloc region (2^51 - 2^47 bytes) by not using
it to map anything.

So it would be better to get away from this notion that the VA space
is divided evenly between the linear and vmalloc regions, and use the
entire range between ~0 << 52 and ~0 << 47 for the linear region. (Note
that the KASAN region definition will overlap the linear region in
this case, but we should be able to sort that out at runtime.)

-- 
Ard.
