[RFC PATCH 06/20] arm64: mm: place empty_zero_page in bss

Marc Zyngier marc.zyngier at arm.com
Thu Dec 10 07:40:08 PST 2015


On 10/12/15 15:29, Mark Rutland wrote:
> On Thu, Dec 10, 2015 at 02:11:08PM +0000, Will Deacon wrote:
>> On Wed, Dec 09, 2015 at 12:44:41PM +0000, Mark Rutland wrote:
>>> Currently the zero page is set up in paging_init, and thus we cannot use
>>> the zero page earlier. We use the zero page as a reserved TTBR value
>>> from which no TLB entries may be allocated (e.g. when uninstalling the
>>> idmap). To enable such usage earlier (as may be required for invasive
>>> changes to the kernel page tables), and to minimise the time that the
>>> idmap is active, we need to be able to use the zero page before
>>> paging_init.
>>>
>>> This patch follows the example set by x86, by allocating the zero page
>>> at compile time, in .bss. This means that the zero page itself is
>>> available immediately upon entry to start_kernel (as we zero .bss before
>>> this), and also means that the zero page takes up no space in the raw
>>> Image binary. The associated struct page is allocated in bootmem_init,
>>> and remains unavailable until this time.
>>>
>>> Outside of arch code, the only users of empty_zero_page assume that the
>>> empty_zero_page symbol refers to the zeroed memory itself, and that
>>> ZERO_PAGE(x) must be used to acquire the associated struct page,
>>> following the example of x86. This patch also brings arm64 in line with
>>> these assumptions.
>>>
>>> Signed-off-by: Mark Rutland <mark.rutland at arm.com>
>>> Cc: Ard Biesheuvel <ard.biesheuvel at linaro.org>
>>> Cc: Catalin Marinas <catalin.marinas at arm.com>
>>> Cc: Jeremy Linton <jeremy.linton at arm.com>
>>> Cc: Laura Abbott <labbott at fedoraproject.org>
>>> Cc: Will Deacon <will.deacon at arm.com>
>>> ---
>>>  arch/arm64/include/asm/mmu_context.h | 2 +-
>>>  arch/arm64/include/asm/pgtable.h     | 4 ++--
>>>  arch/arm64/mm/mmu.c                  | 9 +--------
>>>  3 files changed, 4 insertions(+), 11 deletions(-)
>>
>> [...]
>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 304ff23..7559c22 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -48,7 +48,7 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
>>>   * Empty_zero_page is a special page that is used for zero-initialized data
>>>   * and COW.
>>>   */
>>> -struct page *empty_zero_page;
>>> +unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
>>>  EXPORT_SYMBOL(empty_zero_page);
>>
>> I've been looking at this, and it was making me feel uneasy because it's
>> full of junk before the bss is zeroed. Working that through, it's no
>> worse than what we currently have but I then realised that (a) we don't
>> have a dsb after zeroing the zero page (which we need to make sure the
>> zeroes are visible to the page table walker) and (b) the zero page is
>> never explicitly cleaned to the PoC.
> 
> Ouch; that's scary.
> 
>> There may be cases where the zero-page is used to back read-only,
>> non-cacheable mappings (something to do with KVM?), so I'd sleep better
>> if we made sure that it was clean.
> 
> From a grep around for uses of ZERO_PAGE, in most places the zero page
> is simply used as an empty buffer for I/O. In these cases it's either
> accessed coherently or goes via the usual machinery for non-coherent
> DMA.
> 
> I don't believe that we usually give userspace the ability to create
> non-cacheable mappings, and I couldn't spot any paths where it could do so via
> some driver-specific IOCTL applied to the zero page.
> 
> Looking around, kvm_clear_guest_page seemed problematic, but isn't used
> on arm64. I can imagine the zero page being mapped into guests in other
> situations when mirroring the userspace mapping. 
> 
> Marc, Christoffer, I thought we cleaned pages to the PoC before mapping
> them into a guest? Is that right? Or do we have potential issues there?

I think we're OK. Looking at __coherent_cache_guest_page (which is
called when transitioning from an invalid to valid mapping), we do flush
things to PoC if the vcpu has its cache disabled (or if we know that the
IPA shouldn't be cached - the whole NOR flash emulation horror story).
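
To make the condition concrete, here is a rough user-space model of the
decision described above. It is only a sketch, not the kernel code: the
names vcpu_model, model_flush_to_poc and model_coherent_guest_page are
made up for illustration, and the "flush" merely bumps a counter instead
of doing real cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vcpu state; not a kernel structure. */
struct vcpu_model {
	bool cache_enabled;	/* guest currently runs with its cache on */
};

/* Counts how many times the model "flushed" a page to the PoC. */
static size_t flush_count;

/* Placeholder for clean/invalidate to the PoC; only records the call. */
static void model_flush_to_poc(const void *addr, size_t size)
{
	(void)addr;
	(void)size;
	flush_count++;
}

/*
 * Model of the check made when a mapping goes from invalid to valid:
 * flush to the PoC if the guest has its cache disabled, or if the IPA
 * is known to be uncached.
 */
static void model_coherent_guest_page(const struct vcpu_model *vcpu,
				      const void *page, size_t size,
				      bool ipa_uncached)
{
	assert(vcpu != NULL);

	if (!vcpu->cache_enabled || ipa_uncached)
		model_flush_to_poc(page, size);
}
```

In this model, a vcpu with cache_enabled == false always triggers the
flush, and a vcpu with its cache enabled only triggers it when
ipa_uncached is set, so a zero page mapped at that point would be
cleaned in both of the cases of concern.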

Does that answer your question?

	M.
-- 
Jazz is not dead. It just smells funny...


