[PATCH v6] RISC-V: enable XIP

Alex Ghiti alex at ghiti.fr
Thu Apr 1 12:10:45 BST 2021


On 4/1/21 at 4:52 AM, Vitaly Wool wrote:
> Hi Alex,
> 
> On Thu, Apr 1, 2021 at 10:11 AM Alex Ghiti <alex at ghiti.fr> wrote:
>>
>> Hi,
>>
>> On 3/30/21 at 4:04 PM, Alex Ghiti wrote:
>>> On 3/30/21 at 3:33 PM, Palmer Dabbelt wrote:
>>>> On Tue, 30 Mar 2021 11:39:10 PDT (-0700), alex at ghiti.fr wrote:
>>>>>
>>>>>
>>>>> On 3/30/21 at 2:26 AM, Vitaly Wool wrote:
>>>>>> On Tue, Mar 30, 2021 at 8:23 AM Palmer Dabbelt
>>>>>> <palmerdabbelt at google.com> wrote:
>>>>>>>
>>>>>>> On Sun, 21 Mar 2021 17:12:15 PDT (-0700), vitaly.wool at konsulko.com
>>>>>>> wrote:
>>>>>>>> Introduce XIP (eXecute In Place) support for RISC-V platforms.
>>>>>>>> It allows code to be executed in place from non-volatile storage
>>>>>>>> directly addressable by the CPU, such as QSPI NOR flash which can
>>>>>>>> be found on many RISC-V platforms. This makes way for significant
>>>>>>>> optimization of RAM footprint. The XIP kernel is not compressed
>>>>>>>> since it has to run directly from flash, so it will occupy more
>>>>>>>> space on the non-volatile storage. The physical flash address used
>>>>>>>> to link the kernel object files and for storing it has to be known
>>>>>>>> at compile time and is represented by a Kconfig option.
>>>>>>>>
>>>>>>>> XIP on RISC-V will for the time being only work on MMU-enabled
>>>>>>>> kernels.
>>>>>>>>
>>>>>>>> Signed-off-by: Vitaly Wool <vitaly.wool at konsulko.com>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Changes in v2:
>>>>>>>> - dedicated macro for XIP address fixup when MMU is not enabled yet
>>>>>>>>     o both for 32-bit and 64-bit RISC-V
>>>>>>>> - SP is explicitly set to a safe place in RAM before __copy_data call
>>>>>>>> - removed redundant alignment requirements in vmlinux-xip.lds.S
>>>>>>>> - changed long -> uintptr_t typecast in __XIP_FIXUP macro.
>>>>>>>> Changes in v3:
>>>>>>>> - rebased against latest for-next
>>>>>>>> - XIP address fixup macro now takes an argument
>>>>>>>> - SMP related fixes
>>>>>>>> Changes in v4:
>>>>>>>> - rebased against the current for-next
>>>>>>>> - less #ifdef's in C/ASM code
>>>>>>>> - dedicated XIP_FIXUP_OFFSET assembler macro in head.S
>>>>>>>> - C-specific definitions moved into #ifndef __ASSEMBLY__
>>>>>>>> - Fixed multi-core boot
>>>>>>>> Changes in v5:
>>>>>>>> - fixed build error for non-XIP kernels
>>>>>>>> Changes in v6:
>>>>>>>> - XIP_PHYS_RAM_BASE config option renamed to PHYS_RAM_BASE
>>>>>>>> - added PHYS_RAM_BASE_FIXED config flag to allow usage of
>>>>>>>>     PHYS_RAM_BASE in non-XIP configurations if needed
>>>>>>>> - XIP_FIXUP macro rewritten with a temporary variable to avoid side
>>>>>>>>     effects
>>>>>>>> - fixed crash for non-XIP kernels that don't use built-in DTB
>>>>>>>
>>>>>>> So v5 landed on for-next, which generally means it's best to avoid
>>>>>>> re-spinning the patch and instead send along fixups.  That said, the
>>>>>>> v5 is causing some testing failures for me.
>>>>>>>
>>>>>>> I'm going to drop the v5 for now as I don't have time to test this
>>>>>>> tonight.  I'll try and take a look soon, as it will conflict with
>>>>>>> Alex's patches.
>>>>>>
>>>>>> I can come up with the incremental patch instead pretty much straight
>>>>>> away if that works better.
>>>>>>
>>>>>> ~Vitaly
>>>>>>
>>>>>>>>    arch/riscv/Kconfig                  |  49 ++++++++++-
>>>>>>>>    arch/riscv/Makefile                 |   8 +-
>>>>>>>>    arch/riscv/boot/Makefile            |  13 +++
>>>>>>>>    arch/riscv/include/asm/pgtable.h    |  65 ++++++++++++--
>>>>>>>>    arch/riscv/kernel/cpu_ops_sbi.c     |  11 ++-
>>>>>>>>    arch/riscv/kernel/head.S            |  49 ++++++++++-
>>>>>>>>    arch/riscv/kernel/head.h            |   3 +
>>>>>>>>    arch/riscv/kernel/setup.c           |   8 +-
>>>>>>>>    arch/riscv/kernel/vmlinux-xip.lds.S | 132
>>>>>>>> ++++++++++++++++++++++++++++
>>>>>>>>    arch/riscv/kernel/vmlinux.lds.S     |   6 ++
>>>>>>>>    arch/riscv/mm/init.c                | 100 +++++++++++++++++++--
>>>>>>>>    11 files changed, 426 insertions(+), 18 deletions(-)
>>>>>>>>    create mode 100644 arch/riscv/kernel/vmlinux-xip.lds.S
>>>>>>>>
>>>>>>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>>>>>>> index 8ea60a0a19ae..bd6f82240c34 100644
>>>>>>>> --- a/arch/riscv/Kconfig
>>>>>>>> +++ b/arch/riscv/Kconfig
>>>>>>>> @@ -441,7 +441,7 @@ config EFI_STUB
>>>>>>>>
>>>>>>>>    config EFI
>>>>>>>>         bool "UEFI runtime support"
>>>>>>>> -     depends on OF
>>>>>>>> +     depends on OF && !XIP_KERNEL
>>>>>>>>         select LIBFDT
>>>>>>>>         select UCS2_STRING
>>>>>>>>         select EFI_PARAMS_FROM_FDT
>>>>>>>> @@ -465,11 +465,56 @@ config STACKPROTECTOR_PER_TASK
>>>>>>>>         def_bool y
>>>>>>>>         depends on STACKPROTECTOR && CC_HAVE_STACKPROTECTOR_TLS
>>>>>>>>
>>>>>>>> +config PHYS_RAM_BASE_FIXED
>>>>>>>> +     bool "Explicitly specified physical RAM address"
>>>>>>>> +     default n
>>>>>>>> +
>>>>>>>> +config PHYS_RAM_BASE
>>>>>>>> +     hex "Platform Physical RAM address"
>>>>>>>> +     depends on PHYS_RAM_BASE_FIXED
>>>>>>>> +     default "0x80000000"
>>>>>>>> +     help
>>>>>>>> +       This is the physical address of RAM in the system. It has
>>>>>>>> to be
>>>>>>>> +       explicitly specified to run early relocations of
>>>>>>>> read-write data
>>>>>>>> +       from flash to RAM.
>>>>>>>> +
>>>>>>>> +config XIP_KERNEL
>>>>>>>> +     bool "Kernel Execute-In-Place from ROM"
>>>>>>>> +     depends on MMU
>>>>>>>> +     select PHYS_RAM_BASE_FIXED
>>>>>>>> +     help
>>>>>>>> +       Execute-In-Place allows the kernel to run from
>>>>>>>> non-volatile storage
>>>>>>>> +       directly addressable by the CPU, such as NOR flash. This
>>>>>>>> saves RAM
>>>>>>>> +       space since the text section of the kernel is not loaded
>>>>>>>> from flash
>>>>>>>> +       to RAM.  Read-write sections, such as the data section and
>>>>>>>> stack,
>>>>>>>> +       are still copied to RAM.  The XIP kernel is not compressed
>>>>>>>> since
>>>>>>>> +       it has to run directly from flash, so it will take more
>>>>>>>> space to
>>>>>>>> +       store it.  The flash address used to link the kernel
>>>>>>>> object files,
>>>>>>>> +       and for storing it, is configuration dependent. Therefore,
>>>>>>>> if you
>>>>>>>> +       say Y here, you must know the proper physical address
>>>>>>>> where to
>>>>>>>> +       store the kernel image depending on your own flash memory
>>>>>>>> usage.
>>>>>>>> +
>>>>>>>> +       Also note that the make target becomes "make xipImage"
>>>>>>>> rather than
>>>>>>>> +       "make zImage" or "make Image".  The final kernel binary to
>>>>>>>> put in
>>>>>>>> +       ROM memory will be arch/riscv/boot/xipImage.
>>>>>>>> +
>>>>>>>> +       If unsure, say N.
>>>>>>>> +
>>>>>>>> +config XIP_PHYS_ADDR
>>>>>>>> +     hex "XIP Kernel Physical Location"
>>>>>>>> +     depends on XIP_KERNEL
>>>>>>>> +     default "0x21000000"
>>>>>>>> +     help
>>>>>>>> +       This is the physical address in your flash memory the
>>>>>>>> kernel will
>>>>>>>> +       be linked for and stored to.  This address is dependent on
>>>>>>>> your
>>>>>>>> +       own flash usage.
>>>>>>>> +
>>>>>>>>    endmenu
>>>>>>>>
>>>>>>>>    config BUILTIN_DTB
>>>>>>>> -     def_bool n
>>>>>>>> +     bool
>>>>>>>>         depends on OF
>>>>>>>> +     default y if XIP_KERNEL
>>>>>>>>
>>>>>>>>    menu "Power management options"
>>>>>>>>
>>>>>>>> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
>>>>>>>> index 1368d943f1f3..8fcbec03974d 100644
>>>>>>>> --- a/arch/riscv/Makefile
>>>>>>>> +++ b/arch/riscv/Makefile
>>>>>>>> @@ -82,7 +82,11 @@ CHECKFLAGS += -D__riscv -D__riscv_xlen=$(BITS)
>>>>>>>>
>>>>>>>>    # Default target when executing plain make
>>>>>>>>    boot         := arch/riscv/boot
>>>>>>>> +ifeq ($(CONFIG_XIP_KERNEL),y)
>>>>>>>> +KBUILD_IMAGE := $(boot)/xipImage
>>>>>>>> +else
>>>>>>>>    KBUILD_IMAGE := $(boot)/Image.gz
>>>>>>>> +endif
>>>>>>>>
>>>>>>>>    head-y := arch/riscv/kernel/head.o
>>>>>>>>
>>>>>>>> @@ -95,12 +99,14 @@ PHONY += vdso_install
>>>>>>>>    vdso_install:
>>>>>>>>         $(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso $@
>>>>>>>>
>>>>>>>> +ifneq ($(CONFIG_XIP_KERNEL),y)
>>>>>>>>    ifeq ($(CONFIG_RISCV_M_MODE)$(CONFIG_SOC_CANAAN),yy)
>>>>>>>>    KBUILD_IMAGE := $(boot)/loader.bin
>>>>>>>>    else
>>>>>>>>    KBUILD_IMAGE := $(boot)/Image.gz
>>>>>>>>    endif
>>>>>>>> -BOOT_TARGETS := Image Image.gz loader loader.bin
>>>>>>>> +endif
>>>>>>>> +BOOT_TARGETS := Image Image.gz loader loader.bin xipImage
>>>>>>>>
>>>>>>>>    all: $(notdir $(KBUILD_IMAGE))
>>>>>>>>
>>>>>>>> diff --git a/arch/riscv/boot/Makefile b/arch/riscv/boot/Makefile
>>>>>>>> index 03404c84f971..6bf299f70c27 100644
>>>>>>>> --- a/arch/riscv/boot/Makefile
>>>>>>>> +++ b/arch/riscv/boot/Makefile
>>>>>>>> @@ -17,8 +17,21 @@
>>>>>>>>    KCOV_INSTRUMENT := n
>>>>>>>>
>>>>>>>>    OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R
>>>>>>>> .comment -S
>>>>>>>> +OBJCOPYFLAGS_xipImage :=-O binary -R .note -R .note.gnu.build-id
>>>>>>>> -R .comment -S
>>>>>>>>
>>>>>>>>    targets := Image Image.* loader loader.o loader.lds loader.bin
>>>>>>>> +targets := Image Image.* loader loader.o loader.lds loader.bin
>>>>>>>> xipImage
>>>>>>>> +
>>>>>>>> +ifeq ($(CONFIG_XIP_KERNEL),y)
>>>>>>>> +
>>>>>>>> +quiet_cmd_mkxip = $(quiet_cmd_objcopy)
>>>>>>>> +cmd_mkxip = $(cmd_objcopy)
>>>>>>>> +
>>>>>>>> +$(obj)/xipImage: vmlinux FORCE
>>>>>>>> +     $(call if_changed,mkxip)
>>>>>>>> +     @$(kecho) '  Physical Address of xipImage:
>>>>>>>> $(CONFIG_XIP_PHYS_ADDR)'
>>>>>>>> +
>>>>>>>> +endif
>>>>>>>>
>>>>>>>>    $(obj)/Image: vmlinux FORCE
>>>>>>>>         $(call if_changed,objcopy)
>>>>>>>> diff --git a/arch/riscv/include/asm/pgtable.h
>>>>>>>> b/arch/riscv/include/asm/pgtable.h
>>>>>>>> index ebf817c1bdf4..21a9b2f8d1c7 100644
>>>>>>>> --- a/arch/riscv/include/asm/pgtable.h
>>>>>>>> +++ b/arch/riscv/include/asm/pgtable.h
>>>>>>>> @@ -11,6 +11,33 @@
>>>>>>>>
>>>>>>>>    #include <asm/pgtable-bits.h>
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_MMU
>>>>>>>> +
>>>>>>>> +#define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>>>>>>> +
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#define VMALLOC_SIZE     ((KERN_VIRT_SIZE >> 1) - SZ_16M)
>>>>>>>> +#define VMALLOC_END      (PAGE_OFFSET - SZ_16M - 1)
>>>>>>>> +
>>>>>>>> +#define XIP_OFFSET           SZ_8M
>>>>>>>> +#define XIP_MASK             (SZ_8M - 1)
>>>>>>>> +#define XIP_VIRT_ADDR(physaddr)      \
>>>>>>>> +     (PAGE_OFFSET - XIP_OFFSET + ((physaddr) & XIP_MASK))
>>>>>>>> +
>>>>>>>> +#else
>>>>>>>> +
>>>>>>>> +#define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>>>>>>> +#define VMALLOC_END      (PAGE_OFFSET - 1)
>>>>>>>> +
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>> +#else
>>>>>>>> +
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#define XIP_VIRT_ADDR(physaddr) (physaddr)
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +#endif /* CONFIG_MMU */
>>>>>>>> +
>>>>>>>>    #ifndef __ASSEMBLY__
>>>>>>>>
>>>>>>>>    /* Page Upper Directory not used in RISC-V */
>>>>>>>> @@ -21,9 +48,25 @@
>>>>>>>>
>>>>>>>>    #ifdef CONFIG_MMU
>>>>>>>>
>>>>>>>> -#define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>>>>>>> -#define VMALLOC_END      (PAGE_OFFSET - 1)
>>>>>>>> -#define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +/*
>>>>>>>> + * Since we use sections to map it, this macro replaces the
>>>>>>>> physical address
>>>>>>>> + * with its virtual address while keeping offset from the base
>>>>>>>> section.
>>>>>>>> + */
>>>>>>>> +#define XIP_PHYS_ADDR(va)     \
>>>>>>>> +     ((uintptr_t)(va) - PAGE_OFFSET + XIP_OFFSET +
>>>>>>>> CONFIG_XIP_PHYS_ADDR)
>>>>>>>> +
>>>>>>>> +#define XIP_VIRT_ADDR_START  XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR)
>>>>>>>> +
>>>>>>>> +#define XIP_FIXUP(addr)              ({ \
>>>>>>>> +     uintptr_t __a = (uintptr_t)(addr); \
>>>>>>>> +     (__a >= CONFIG_XIP_PHYS_ADDR && \
>>>>>>>> +      __a < CONFIG_XIP_PHYS_ADDR + SZ_16M) ? \
>>>>>>>> +     __a - CONFIG_XIP_PHYS_ADDR + CONFIG_PHYS_RAM_BASE -
>>>>>>>> XIP_OFFSET : __a; \
>>>>>>>> +})
>>>>>>>> +#else
>>>>>>>> +#define XIP_FIXUP(addr)              (addr)
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>>
>>>>>>>>    #define BPF_JIT_REGION_SIZE  (SZ_128M)
>>>>>>>>    #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>>>>>>>> @@ -484,8 +527,20 @@ static inline int
>>>>>>>> ptep_clear_flush_young(struct vm_area_struct *vma,
>>>>>>>>
>>>>>>>>    #define kern_addr_valid(addr)   (1) /* FIXME */
>>>>>>>>
>>>>>>>> -extern void *dtb_early_va;
>>>>>>>> -extern uintptr_t dtb_early_pa;
>>>>>>>> +extern void *_dtb_early_va;
>>>>>>>> +extern uintptr_t _dtb_early_pa;
>>>>>>>> +#if defined(CONFIG_XIP_KERNEL) && defined(CONFIG_MMU)
>>>>>>>> +
>>>>>>>> +#define dtb_early_va (*(void **)XIP_FIXUP(&_dtb_early_va))
>>>>>>>> +#define dtb_early_pa (*(uintptr_t *)XIP_FIXUP(&_dtb_early_pa))
>>>>>>>> +
>>>>>>>> +#else
>>>>>>>> +
>>>>>>>> +#define dtb_early_va _dtb_early_va
>>>>>>>> +#define dtb_early_pa _dtb_early_pa
>>>>>>>> +
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>>    void setup_bootmem(void);
>>>>>>>>    void paging_init(void);
>>>>>>>>    void misc_mem_init(void);
>>>>>>>> diff --git a/arch/riscv/kernel/cpu_ops_sbi.c
>>>>>>>> b/arch/riscv/kernel/cpu_ops_sbi.c
>>>>>>>> index 685fae72b7f5..2413c2997350 100644
>>>>>>>> --- a/arch/riscv/kernel/cpu_ops_sbi.c
>>>>>>>> +++ b/arch/riscv/kernel/cpu_ops_sbi.c
>>>>>>>> @@ -53,10 +53,19 @@ static int sbi_hsm_hart_get_status(unsigned
>>>>>>>> long hartid)
>>>>>>>>    }
>>>>>>>>    #endif
>>>>>>>>
>>>>>>>> +static inline unsigned long get_secondary_start_phys(void)
>>>>>>>> +{
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     return XIP_PHYS_ADDR(secondary_start_sbi);
>>>>>>>> +#else
>>>>>>>> +     return __pa_symbol(secondary_start_sbi);
>>>>>>>> +#endif
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>    static int sbi_cpu_start(unsigned int cpuid, struct task_struct
>>>>>>>> *tidle)
>>>>>>>>    {
>>>>>>>>         int rc;
>>>>>>>> -     unsigned long boot_addr = __pa_symbol(secondary_start_sbi);
>>>>>>>> +     unsigned long boot_addr = get_secondary_start_phys();
>>>>>>>>         int hartid = cpuid_to_hartid_map(cpuid);
>>>>>>>>
>>>>>>>>         cpu_update_secondary_bootdata(cpuid, tidle);
>>>>>>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>>>>>>>> index f5a9bad86e58..bbe74e37914f 100644
>>>>>>>> --- a/arch/riscv/kernel/head.S
>>>>>>>> +++ b/arch/riscv/kernel/head.S
>>>>>>>> @@ -9,11 +9,23 @@
>>>>>>>>    #include <linux/linkage.h>
>>>>>>>>    #include <asm/thread_info.h>
>>>>>>>>    #include <asm/page.h>
>>>>>>>> +#include <asm/pgtable.h>
>>>>>>>>    #include <asm/csr.h>
>>>>>>>>    #include <asm/hwcap.h>
>>>>>>>>    #include <asm/image.h>
>>>>>>>>    #include "efi-header.S"
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +.macro XIP_FIXUP_OFFSET reg
>>>>>>>> +     REG_L t0, _xip_fixup
>>>>>>>> +     add \reg, \reg, t0
>>>>>>>> +.endm
>>>>>>>> +_xip_fixup: .dword CONFIG_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR -
>>>>>>>> XIP_OFFSET
>>>>>>>> +#else
>>>>>>>> +.macro XIP_FIXUP_OFFSET reg
>>>>>>>> +.endm
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>>    __HEAD
>>>>>>>>    ENTRY(_start)
>>>>>>>>         /*
>>>>>>>> @@ -69,7 +81,11 @@ pe_head_start:
>>>>>>>>    #ifdef CONFIG_MMU
>>>>>>>>    relocate:
>>>>>>>>         /* Relocate return address */
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     li a1, XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR)
>>>>>>>> +#else
>>>>>>>>         li a1, PAGE_OFFSET
>>>>>>>> +#endif
>>>>>>>>         la a2, _start
>>>>>>>>         sub a1, a1, a2
>>>>>>>>         add ra, ra, a1
>>>>>>>> @@ -91,6 +107,7 @@ relocate:
>>>>>>>>          * to ensure the new translations are in use.
>>>>>>>>          */
>>>>>>>>         la a0, trampoline_pg_dir
>>>>>>>> +     XIP_FIXUP_OFFSET a0
>>>>>>>>         srl a0, a0, PAGE_SHIFT
>>>>>>>>         or a0, a0, a1
>>>>>>>>         sfence.vma
>>>>>>>> @@ -144,7 +161,9 @@ secondary_start_sbi:
>>>>>>>>
>>>>>>>>         slli a3, a0, LGREG
>>>>>>>>         la a4, __cpu_up_stack_pointer
>>>>>>>> +     XIP_FIXUP_OFFSET a4
>>>>>>>>         la a5, __cpu_up_task_pointer
>>>>>>>> +     XIP_FIXUP_OFFSET a5
>>>>>>>>         add a4, a3, a4
>>>>>>>>         add a5, a3, a5
>>>>>>>>         REG_L sp, (a4)
>>>>>>>> @@ -156,6 +175,7 @@ secondary_start_common:
>>>>>>>>    #ifdef CONFIG_MMU
>>>>>>>>         /* Enable virtual memory and relocate to virtual address */
>>>>>>>>         la a0, swapper_pg_dir
>>>>>>>> +     XIP_FIXUP_OFFSET a0
>>>>>>>>         call relocate
>>>>>>>>    #endif
>>>>>>>>         call setup_trap_vector
>>>>>>>> @@ -236,12 +256,33 @@ pmp_done:
>>>>>>>>    .Lgood_cores:
>>>>>>>>    #endif
>>>>>>>>
>>>>>>>> +#ifndef CONFIG_XIP_KERNEL
>>>>>>>>         /* Pick one hart to run the main boot sequence */
>>>>>>>>         la a3, hart_lottery
>>>>>>>>         li a2, 1
>>>>>>>>         amoadd.w a3, a2, (a3)
>>>>>>>>         bnez a3, .Lsecondary_start
>>>>>>>>
>>>>>>>> +#else
>>>>>>>> +     /* hart_lottery in flash contains a magic number */
>>>>>>>> +     la a3, hart_lottery
>>>>>>>> +     mv a2, a3
>>>>>>>> +     XIP_FIXUP_OFFSET a2
>>>>>>>> +     lw t1, (a3)
>>>>>>>> +     amoswap.w t0, t1, (a2)
>>>>>>>> +     /* first time here if hart_lottery in RAM is not set */
>>>>>>>> +     beq t0, t1, .Lsecondary_start
>>>>>>>> +
>>>>>>>> +     la sp, _end + THREAD_SIZE
>>>>>>>> +     XIP_FIXUP_OFFSET sp
>>>>>>>> +     mv s0, a0
>>>>>>>> +     call __copy_data
>>>>>>>> +
>>>>>>>> +     /* Restore a0 copy */
>>>>>>>> +     mv a0, s0
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> +#ifndef CONFIG_XIP_KERNEL
>>>>>>>>         /* Clear BSS for flat non-ELF images */
>>>>>>>>         la a3, __bss_start
>>>>>>>>         la a4, __bss_stop
>>>>>>>> @@ -251,15 +292,18 @@ clear_bss:
>>>>>>>>         add a3, a3, RISCV_SZPTR
>>>>>>>>         blt a3, a4, clear_bss
>>>>>>>>    clear_bss_done:
>>>>>>>> -
>>>>>>>> +#endif
>>>>>>>>         /* Save hart ID and DTB physical address */
>>>>>>>>         mv s0, a0
>>>>>>>>         mv s1, a1
>>>>>>>> +
>>>>>>>>         la a2, boot_cpu_hartid
>>>>>>>> +     XIP_FIXUP_OFFSET a2
>>>>>>>>         REG_S a0, (a2)
>>>>>>>>
>>>>>>>>         /* Initialize page tables and relocate to virtual addresses */
>>>>>>>>         la sp, init_thread_union + THREAD_SIZE
>>>>>>>> +     XIP_FIXUP_OFFSET sp
>>>>>>>>    #ifdef CONFIG_BUILTIN_DTB
>>>>>>>>         la a0, __dtb_start
>>>>>>>>    #else
>>>>>>>> @@ -268,6 +312,7 @@ clear_bss_done:
>>>>>>>>         call setup_vm
>>>>>>>>    #ifdef CONFIG_MMU
>>>>>>>>         la a0, early_pg_dir
>>>>>>>> +     XIP_FIXUP_OFFSET a0
>>>>>>>>         call relocate
>>>>>>>>    #endif /* CONFIG_MMU */
>>>>>>>>
>>>>>>>> @@ -292,7 +337,9 @@ clear_bss_done:
>>>>>>>>
>>>>>>>>         slli a3, a0, LGREG
>>>>>>>>         la a1, __cpu_up_stack_pointer
>>>>>>>> +     XIP_FIXUP_OFFSET a1
>>>>>>>>         la a2, __cpu_up_task_pointer
>>>>>>>> +     XIP_FIXUP_OFFSET a2
>>>>>>>>         add a1, a3, a1
>>>>>>>>         add a2, a3, a2
>>>>>>>>
>>>>>>>> diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
>>>>>>>> index b48dda3d04f6..aabbc3ac3e48 100644
>>>>>>>> --- a/arch/riscv/kernel/head.h
>>>>>>>> +++ b/arch/riscv/kernel/head.h
>>>>>>>> @@ -12,6 +12,9 @@ extern atomic_t hart_lottery;
>>>>>>>>
>>>>>>>>    asmlinkage void do_page_fault(struct pt_regs *regs);
>>>>>>>>    asmlinkage void __init setup_vm(uintptr_t dtb_pa);
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +asmlinkage void __init __copy_data(void);
>>>>>>>> +#endif
>>>>>>>>
>>>>>>>>    extern void *__cpu_up_stack_pointer[];
>>>>>>>>    extern void *__cpu_up_task_pointer[];
>>>>>>>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>>>>>>>> index e85bacff1b50..a0384c72c272 100644
>>>>>>>> --- a/arch/riscv/kernel/setup.c
>>>>>>>> +++ b/arch/riscv/kernel/setup.c
>>>>>>>> @@ -50,7 +50,11 @@ struct screen_info screen_info
>>>>>>>> __section(".data") = {
>>>>>>>>     * This is used before the kernel initializes the BSS so it
>>>>>>>> can't be in the
>>>>>>>>     * BSS.
>>>>>>>>     */
>>>>>>>> -atomic_t hart_lottery __section(".sdata");
>>>>>>>> +atomic_t hart_lottery __section(".sdata")
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> += ATOMIC_INIT(0xC001BEEF)
>>>>>>>> +#endif
>>>>>>>> +;
>>>>>>>>    unsigned long boot_cpu_hartid;
>>>>>>>>    static DEFINE_PER_CPU(struct cpu, cpu_devices);
>>>>>>>>
>>>>>>>> @@ -254,7 +258,7 @@ void __init setup_arch(char **cmdline_p)
>>>>>>>>    #if IS_ENABLED(CONFIG_BUILTIN_DTB)
>>>>>>>>         unflatten_and_copy_device_tree();
>>>>>>>>    #else
>>>>>>>> -     if (early_init_dt_verify(__va(dtb_early_pa)))
>>>>>>>> +     if (early_init_dt_verify(__va(XIP_FIXUP(dtb_early_pa))))
>>>>>>>>                 unflatten_device_tree();
>>>>>>>>         else
>>>>>>>>                 pr_err("No DTB found in kernel mappings\n");
>>>>>>>> diff --git a/arch/riscv/kernel/vmlinux-xip.lds.S
>>>>>>>> b/arch/riscv/kernel/vmlinux-xip.lds.S
>>>>>>>> new file mode 100644
>>>>>>>> index 000000000000..9f0f08c34cd3
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/arch/riscv/kernel/vmlinux-xip.lds.S
>>>>>>>> @@ -0,0 +1,132 @@
>>>>>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>>>>>> +/*
>>>>>>>> + * Copyright (C) 2012 Regents of the University of California
>>>>>>>> + * Copyright (C) 2017 SiFive
>>>>>>>> + * Copyright (C) 2020 Vitaly Wool, Konsulko AB
>>>>>>>> + */
>>>>>>>> +
>>>>>>>> +#define LOAD_OFFSET XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR)
>>>>>>>> +/* No __ro_after_init data in the .rodata section - which will
>>>>>>>> always be ro */
>>>>>>>> +#define RO_AFTER_INIT_DATA
>>>>>>>> +
>>>>>>>> +#include <asm/vmlinux.lds.h>
>>>>>>>> +#include <asm/page.h>
>>>>>>>> +#include <asm/pgtable.h>
>>>>>>>> +#include <asm/cache.h>
>>>>>>>> +#include <asm/thread_info.h>
>>>>>>>> +
>>>>>>>> +OUTPUT_ARCH(riscv)
>>>>>>>> +ENTRY(_start)
>>>>>>>> +
>>>>>>>> +jiffies = jiffies_64;
>>>>>>>> +
>>>>>>>> +SECTIONS
>>>>>>>> +{
>>>>>>>> +     /* Beginning of code and text segment */
>>>>>>>> +     . = XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR);
>>>>>>>> +     _xiprom = .;
>>>>>>>> +     _start = .;
>>>>>>>> +     HEAD_TEXT_SECTION
>>>>>>>> +     INIT_TEXT_SECTION(PAGE_SIZE)
>>>>>>>> +     /* we have to discard exit text and such at runtime, not
>>>>>>>> link time */
>>>>>>>> +     .exit.text :
>>>>>>>> +     {
>>>>>>>> +             EXIT_TEXT
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>> +     .text : {
>>>>>>>> +             _text = .;
>>>>>>>> +             _stext = .;
>>>>>>>> +             TEXT_TEXT
>>>>>>>> +             SCHED_TEXT
>>>>>>>> +             CPUIDLE_TEXT
>>>>>>>> +             LOCK_TEXT
>>>>>>>> +             KPROBES_TEXT
>>>>>>>> +             ENTRY_TEXT
>>>>>>>> +             IRQENTRY_TEXT
>>>>>>>> +             SOFTIRQENTRY_TEXT
>>>>>>>> +             *(.fixup)
>>>>>>>> +             _etext = .;
>>>>>>>> +     }
>>>>>>>> +     RO_DATA(L1_CACHE_BYTES)
>>>>>>>> +     .srodata : {
>>>>>>>> +             *(.srodata*)
>>>>>>>> +     }
>>>>>>>> +     .init.rodata : {
>>>>>>>> +             INIT_SETUP(16)
>>>>>>>> +             INIT_CALLS
>>>>>>>> +             CON_INITCALL
>>>>>>>> +             INIT_RAM_FS
>>>>>>>> +     }
>>>>>>>> +     _exiprom = .;                   /* End of XIP ROM area */
>>>>>>>> +
>>>>>>>> +
>>>>>>>> +/*
>>>>>>>> + * From this point, stuff is considered writable and will be
>>>>>>>> copied to RAM
>>>>>>>> + */
>>>>>>>> +     __data_loc = ALIGN(16);         /* location in file */
>>>>>>>> +     . = PAGE_OFFSET;                /* location in memory */
>>>>>>>> +
>>>>>>>> +     _sdata = .;                     /* Start of data section */
>>>>>>>> +     _data = .;
>>>>>>>> +     RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
>>>>>>>> +     _edata = .;
>>>>>>>> +     __start_ro_after_init = .;
>>>>>>>> +     .data.ro_after_init : AT(ADDR(.data.ro_after_init) -
>>>>>>>> LOAD_OFFSET) {
>>>>>>>> +             *(.data..ro_after_init)
>>>>>>>> +     }
>>>>>>>> +     __end_ro_after_init = .;
>>>>>>>> +
>>>>>>>> +     . = ALIGN(PAGE_SIZE);
>>>>>>>> +     __init_begin = .;
>>>>>>>> +     .init.data : {
>>>>>>>> +             INIT_DATA
>>>>>>>> +     }
>>>>>>>> +     .exit.data : {
>>>>>>>> +             EXIT_DATA
>>>>>>>> +     }
>>>>>>>> +     . = ALIGN(8);
>>>>>>>> +     __soc_early_init_table : {
>>>>>>>> +             __soc_early_init_table_start = .;
>>>>>>>> +             KEEP(*(__soc_early_init_table))
>>>>>>>> +             __soc_early_init_table_end = .;
>>>>>>>> +     }
>>>>>>>> +     __soc_builtin_dtb_table : {
>>>>>>>> +             __soc_builtin_dtb_table_start = .;
>>>>>>>> +             KEEP(*(__soc_builtin_dtb_table))
>>>>>>>> +             __soc_builtin_dtb_table_end = .;
>>>>>>>> +     }
>>>>>>>> +     PERCPU_SECTION(L1_CACHE_BYTES)
>>>>>>>> +
>>>>>>>> +     . = ALIGN(PAGE_SIZE);
>>>>>>>> +     __init_end = .;
>>>>>>>> +
>>>>>>>> +     .sdata : {
>>>>>>>> +             __global_pointer$ = . + 0x800;
>>>>>>>> +             *(.sdata*)
>>>>>>>> +             *(.sbss*)
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>> +     BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
>>>>>>>> +     EXCEPTION_TABLE(0x10)
>>>>>>>> +
>>>>>>>> +     .rel.dyn : AT(ADDR(.rel.dyn) - LOAD_OFFSET) {
>>>>>>>> +             *(.rel.dyn*)
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>> +     /*
>>>>>>>> +      * End of copied data. We need a dummy section to get its LMA.
>>>>>>>> +      * Also located before final ALIGN() as trailing padding is
>>>>>>>> not stored
>>>>>>>> +      * in the resulting binary file and useless to copy.
>>>>>>>> +      */
>>>>>>>> +     .data.endmark : AT(ADDR(.data.endmark) - LOAD_OFFSET) { }
>>>>>>>> +     _edata_loc = LOADADDR(.data.endmark);
>>>>>>>> +
>>>>>>>> +     . = ALIGN(PAGE_SIZE);
>>>>>>>> +     _end = .;
>>>>>>>> +
>>>>>>>> +     STABS_DEBUG
>>>>>>>> +     DWARF_DEBUG
>>>>>>>> +
>>>>>>>> +     DISCARDS
>>>>>>>> +}
>>>>>>>> diff --git a/arch/riscv/kernel/vmlinux.lds.S
>>>>>>>> b/arch/riscv/kernel/vmlinux.lds.S
>>>>>>>> index de03cb22d0e9..6745ec325930 100644
>>>>>>>> --- a/arch/riscv/kernel/vmlinux.lds.S
>>>>>>>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>>>>>>>> @@ -4,7 +4,12 @@
>>>>>>>>     * Copyright (C) 2017 SiFive
>>>>>>>>     */
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#include "vmlinux-xip.lds.S"
>>>>>>>> +#else
>>>>>>>> +
>>>>>>>>    #define LOAD_OFFSET PAGE_OFFSET
>>>>>>>> +
>>>>>>>>    #include <asm/vmlinux.lds.h>
>>>>>>>>    #include <asm/page.h>
>>>>>>>>    #include <asm/cache.h>
>>>>>>>> @@ -132,3 +137,4 @@ SECTIONS
>>>>>>>>
>>>>>>>>         DISCARDS
>>>>>>>>    }
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>>>>>>>> index 7f5036fbee8c..efe649d41f95 100644
>>>>>>>> --- a/arch/riscv/mm/init.c
>>>>>>>> +++ b/arch/riscv/mm/init.c
>>>>>>>> @@ -31,8 +31,8 @@ EXPORT_SYMBOL(empty_zero_page);
>>>>>>>>
>>>>>>>>    extern char _start[];
>>>>>>>>    #define DTB_EARLY_BASE_VA      PGDIR_SIZE
>>>>>>>> -void *dtb_early_va __initdata;
>>>>>>>> -uintptr_t dtb_early_pa __initdata;
>>>>>>>> +void *_dtb_early_va __initdata;
>>>>>>>> +uintptr_t _dtb_early_pa __initdata;
>>>>>>>>
>>>>>>>>    struct pt_alloc_ops {
>>>>>>>>         pte_t *(*get_pte_virt)(phys_addr_t pa);
>>>>>>>> @@ -88,6 +88,10 @@ static void print_vm_layout(void)
>>>>>>>>                   (unsigned long)VMALLOC_END);
>>>>>>>>         print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
>>>>>>>>                   (unsigned long)high_memory);
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     print_mlm("xip", (unsigned long)XIP_VIRT_ADDR_START,
>>>>>>>> +               (unsigned long)XIP_VIRT_ADDR_START + SZ_16M);
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>>    }
>>>>>>>>    #else
>>>>>>>>    static void print_vm_layout(void) { }
>>>>>>>> @@ -113,6 +117,10 @@ void __init setup_bootmem(void)
>>>>>>>>         phys_addr_t dram_end = memblock_end_of_DRAM();
>>>>>>>>         phys_addr_t max_mapped_addr = __pa(~(ulong)0);
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     vmlinux_start = __pa_symbol(&_sdata);
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>>         /* The maximal physical memory size is -PAGE_OFFSET. */
>>>>>>>>         memblock_enforce_memory_limit(-PAGE_OFFSET);
>>>>>>>>
>>>>>>>> @@ -149,11 +157,27 @@ void __init setup_bootmem(void)
>>>>>>>>         memblock_allow_resize();
>>>>>>>>    }
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +
>>>>>>>> +extern char _xiprom[], _exiprom[];
>>>>>>>> +extern char _sdata[], _edata[];
>>>>>>>> +
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>>    #ifdef CONFIG_MMU
>>>>>>>> -static struct pt_alloc_ops pt_ops;
>>>>>>>> +static struct pt_alloc_ops _pt_ops;
>>>>>>>> +
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#define pt_ops (*(struct pt_alloc_ops *)XIP_FIXUP(&_pt_ops))
>>>>>>>> +#else
>>>>>>>> +#define pt_ops       _pt_ops
>>>>>>>> +#endif
>>>>>>>>
>>>>>>>>    unsigned long va_pa_offset;
>>>>>>>>    EXPORT_SYMBOL(va_pa_offset);
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#define va_pa_offset (*((unsigned long *)XIP_FIXUP(&va_pa_offset)))
>>>>>>>> +#endif
>>>>>>>>    unsigned long pfn_base;
>>>>>>>>    EXPORT_SYMBOL(pfn_base);
>>>>>>>>
>>>>>>>> @@ -163,6 +187,12 @@ pte_t fixmap_pte[PTRS_PER_PTE]
>>>>>>>> __page_aligned_bss;
>>>>>>>>
>>>>>>>>    pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +#define trampoline_pg_dir    ((pgd_t *)XIP_FIXUP(trampoline_pg_dir))
>>>>>>>> +#define fixmap_pte           ((pte_t *)XIP_FIXUP(fixmap_pte))
>>>>>>>> +#define early_pg_dir         ((pgd_t *)XIP_FIXUP(early_pg_dir))
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>>    void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
>>>>>>>> pgprot_t prot)
>>>>>>>>    {
>>>>>>>>         unsigned long addr = __fix_to_virt(idx);
>>>>>>>> @@ -238,6 +268,15 @@ pmd_t fixmap_pmd[PTRS_PER_PMD]
>>>>>>>> __page_aligned_bss;
>>>>>>>>    pmd_t early_pmd[PTRS_PER_PMD] __initdata __aligned(PAGE_SIZE);
>>>>>>>>    pmd_t early_dtb_pmd[PTRS_PER_PMD] __initdata __aligned(PAGE_SIZE);
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +pmd_t xip_pmd[PTRS_PER_PMD] __page_aligned_bss;
>>>>>>>> +
>>>>>>>> +#define trampoline_pmd       ((pmd_t *)XIP_FIXUP(trampoline_pmd))
>>>>>>>> +#define fixmap_pmd   ((pmd_t *)XIP_FIXUP(fixmap_pmd))
>>>>>>>> +#define xip_pmd              ((pmd_t *)XIP_FIXUP(xip_pmd))
>>>>>>>> +#define early_pmd    ((pmd_t *)XIP_FIXUP(early_pmd))
>>>>>>>> +#endif /* CONFIG_XIP_KERNEL */
>>>>>>>> +
>>>>>>>>    static pmd_t *__init get_pmd_virt_early(phys_addr_t pa)
>>>>>>>>    {
>>>>>>>>         /* Before MMU is enabled */
>>>>>>>> @@ -354,6 +393,19 @@ static uintptr_t __init
>>>>>>>> best_map_size(phys_addr_t base, phys_addr_t size)
>>>>>>>>         return PMD_SIZE;
>>>>>>>>    }
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +/* called from head.S with MMU off */
>>>>>>>> +asmlinkage void __init __copy_data(void)
>>>>>>>> +{
>>>>>>>> +     void *from = (void *)(&_sdata);
>>>>>>>> +     void *end = (void *)(&_end);
>>>>>>>> +     void *to = (void *)CONFIG_PHYS_RAM_BASE;
>>>>>>>> +     size_t sz = (size_t)(end - from);
>>>>>>>> +
>>>>>>>> +     memcpy(to, from, sz);
>>>>>>>> +}
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>>    /*
>>>>>>>>     * setup_vm() is called from head.S with MMU-off.
>>>>>>>>     *
>>>>>>>> @@ -374,7 +426,8 @@ static uintptr_t __init
>>>>>>>> best_map_size(phys_addr_t base, phys_addr_t size)
>>>>>>>>
>>>>>>>>    asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>>>>>    {
>>>>>>>> -     uintptr_t va, pa, end_va;
>>>>>>>> +     uintptr_t va, end_va;
>>>>>>>> +     uintptr_t __maybe_unused pa;
>>>>>>>>         uintptr_t load_pa = (uintptr_t)(&_start);
>>>>>>>>         uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>>>>>>>>         uintptr_t map_size;
>>>>>>>> @@ -382,6 +435,13 @@ asmlinkage void __init setup_vm(uintptr_t
>>>>>>>> dtb_pa)
>>>>>>>>         pmd_t fix_bmap_spmd, fix_bmap_epmd;
>>>>>>>>    #endif
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     uintptr_t xiprom = (uintptr_t)CONFIG_XIP_PHYS_ADDR;
>>>>>>>> +     uintptr_t xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom);
>>>>>>>> +
>>>>>>>> +     load_pa = (uintptr_t)CONFIG_PHYS_RAM_BASE;
>>>>>>>> +     load_sz = (uintptr_t)(&_end) - (uintptr_t)(&_sdata);
>>>>>>>> +#endif
>>>>>>>>         va_pa_offset = PAGE_OFFSET - load_pa;
>>>>>
>>>>> I should have seen this before, but I was too focused on getting an XIP
>>>>> kernel to boot. I already moved the kernel mapping into the vmalloc zone:
>>>>> virtual-to-physical translations need to be handled differently now that
>>>>> the kernel mapping no longer lies in the linear mapping, so we can't
>>>>> use the va_pa_offset defined above for both mappings.
>>>>>
>>>>> I was rebasing my patchset on the XIP patch, but I believe that doing it
>>>>> the other way around would greatly simplify the XIP patch: the kernel
>>>>> mapping would already be moved outside the linear mapping, so there would
>>>>> be no need to reserve a zone in vmalloc anymore (that simplifies
>>>>> pgtable.h quite a lot). And the XIP kernel mapping could be implemented
>>>>> in a new create_kernel_page_table (that would also simplify mm/init.c).
>>>>>
>>>>> I can help with that, but I don't think we should merge this patch as it
>>>>> is now.
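
To illustrate the point above about va_pa_offset, here is a minimal
user-space sketch of why a single offset cannot serve two mappings. It is
not code from either patchset: the base addresses and the names
linear_offset, kernel_offset and sketch_va_to_pa are hypothetical and
chosen only for the example. The idea is that each mapping carries its own
virtual-to-physical offset, and the translation picks one based on the
range the virtual address falls in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout constants, for illustration only. */
#define LINEAR_VA_BASE  0xffffffe000000000ULL  /* start of the linear mapping */
#define KERNEL_VA_BASE  0xffffffff80000000ULL  /* start of the kernel mapping */
#define RAM_PA_BASE     0x0000000080000000ULL  /* physical base of RAM */
#define KERNEL_PA_BASE  0x0000000080200000ULL  /* physical load address */

/* One offset per mapping, instead of a single va_pa_offset. */
static const uint64_t linear_offset = LINEAR_VA_BASE - RAM_PA_BASE;
static const uint64_t kernel_offset = KERNEL_VA_BASE - KERNEL_PA_BASE;

/* In this sketch the kernel mapping sits above the linear mapping. */
static int is_kernel_va(uint64_t va)
{
	return va >= KERNEL_VA_BASE;
}

/* Translate using the offset of the mapping that contains va. */
static uint64_t sketch_va_to_pa(uint64_t va)
{
	return is_kernel_va(va) ? va - kernel_offset : va - linear_offset;
}

/* Sanity checks showing that the two offsets are not interchangeable. */
static void sketch_selfcheck(void)
{
	assert(sketch_va_to_pa(KERNEL_VA_BASE) == KERNEL_PA_BASE);
	assert(sketch_va_to_pa(LINEAR_VA_BASE) == RAM_PA_BASE);
	/* Using the linear offset on a kernel address gives a wrong result. */
	assert(KERNEL_VA_BASE - linear_offset != KERNEL_PA_BASE);
}
```

With these assumed constants, applying the linear offset to a kernel
virtual address does not yield the kernel's physical load address, which is
the situation described above once the kernel mapping leaves the linear
mapping.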
>>>>
>>>> I think that's the right way to go for now: it's a lot harder to test
>>>> the XIP stuff, as it requires a bunch of harness changes.  So let's
>>>> take the page table refactoring in now and rebase this stuff on top of
>>>> it. There's really no way to do both without making more work for
>>>> someone, it's just a headache on timing.
>>>>
>>>> Alex: if you want to sign up to spend some time there that'd be great,
>>>> otherwise I will.
>>>
>>> I can take care of that, no problem.
>>
>> Vitaly, can you try the branch int/alex/riscv_xip_kernel_rebase_v1 at
>> https://github.com/AlexGhiti/riscv-linux ? This boots fine using my setup.
>>
>> I removed most of the pgtable.h stuff, the XIP kernel is now mapped like
>> any other kernel at the end of the address space, not in the vmalloc
>> zone as you proposed. And I fixed a few things along the way (pfn_base,
>> pfn_valid, copy_data and other stuff).
>>
>> Any comments are welcome!
> 
> Thanks for your efforts!
> 
> I've just built your version. The build went well, but the image
> doesn't seem to boot for me.

Too bad. Can you send me your config, please?

Thanks,

Alex

> I'll take a look at the code later today.
> 
> Best regards,
>     Vitaly
> 
>> Thanks,
>>
>> Alex
>>
>>>
>>>> Either way, can you point me (either just indicate the old version is
>>>> OK or send a new one) to what you want me to look at WRT the page
>>>> table code?  IIRC it looked pretty much fine, but I'll take another
>>>> look ASAP so we can avoid serializing everything.
>>>
>>> The v3 is fine for me:
>>> https://patchwork.kernel.org/project/linux-riscv/list/?series=447699
>>>
>>>>
>>>>>
>>>>> Alex
>>>>>
>>>>>>>>         pfn_base = PFN_DOWN(load_pa);
>>>>>>>>
>>>>>>>> @@ -420,6 +480,21 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>>>>>                            load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>>>>>>>    #endif
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     create_pgd_mapping(trampoline_pg_dir, XIP_VIRT_ADDR_START,
>>>>>>>> +                        (uintptr_t)xip_pmd, PGDIR_SIZE, PAGE_TABLE);
>>>>>>>> +     for (va = XIP_VIRT_ADDR_START;
>>>>>>>> +          va < XIP_VIRT_ADDR_START + xiprom_sz;
>>>>>>>> +          va += PMD_SIZE) {
>>>>>>>> +             create_pmd_mapping(xip_pmd, va,
>>>>>>>> +                                xiprom + (va - XIP_VIRT_ADDR_START),
>>>>>>>> +                                PMD_SIZE, PAGE_KERNEL_EXEC);
>>>>>>>> +     }
>>>>>>>> +
>>>>>>>> +     create_pgd_mapping(early_pg_dir, XIP_VIRT_ADDR_START,
>>>>>>>> +                        (uintptr_t)xip_pmd, PGDIR_SIZE, PAGE_TABLE);
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>>         /*
>>>>>>>>          * Setup early PGD covering entire kernel which will allows
>>>>>>>>          * us to reach paging_init(). We map all memory banks later
>>>>>>>> @@ -444,7 +519,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>>>>>                            pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
>>>>>>>>         dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1));
>>>>>>>>    #else /* CONFIG_BUILTIN_DTB */
>>>>>>>> -     dtb_early_va = __va(dtb_pa);
>>>>>>>> +     dtb_early_va = __va(XIP_FIXUP(dtb_pa));
>>>>>>>>    #endif /* CONFIG_BUILTIN_DTB */
>>>>>>>>    #else
>>>>>>>>    #ifndef CONFIG_BUILTIN_DTB
>>>>>>>> @@ -456,7 +531,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>>>>>                            pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
>>>>>>>>         dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
>>>>>>>>    #else /* CONFIG_BUILTIN_DTB */
>>>>>>>> -     dtb_early_va = __va(dtb_pa);
>>>>>>>> +     dtb_early_va = __va(XIP_FIXUP(dtb_pa));
>>>>>>>>    #endif /* CONFIG_BUILTIN_DTB */
>>>>>>>>    #endif
>>>>>>>>         dtb_early_pa = dtb_pa;
>>>>>>>> @@ -497,6 +572,9 @@ static void __init setup_vm_final(void)
>>>>>>>>         uintptr_t va, map_size;
>>>>>>>>         phys_addr_t pa, start, end;
>>>>>>>>         u64 i;
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     uintptr_t xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom);
>>>>>>>> +#endif
>>>>>>>>
>>>>>>>>         /**
>>>>>>>>          * MMU is enabled at this point. But page table setup is not complete yet.
>>>>>>>> @@ -528,6 +606,16 @@ static void __init setup_vm_final(void)
>>>>>>>>                                            map_size, PAGE_KERNEL_EXEC);
>>>>>>>>                 }
>>>>>>>>         }
>>>>>>>> +#ifdef CONFIG_XIP_KERNEL
>>>>>>>> +     map_size = best_map_size(CONFIG_XIP_PHYS_ADDR, xiprom_sz);
>>>>>>>> +     for (va = XIP_VIRT_ADDR_START;
>>>>>>>> +          va < XIP_VIRT_ADDR_START + xiprom_sz;
>>>>>>>> +          va += map_size)
>>>>>>>> +             create_pgd_mapping(swapper_pg_dir, va,
>>>>>>>> +                                CONFIG_XIP_PHYS_ADDR + (va - XIP_VIRT_ADDR_START),
>>>>>>>> +                                map_size, PAGE_KERNEL_EXEC);
>>>>>>>> +
>>>>>>>> +#endif
>>>>>>>>
>>>>>>>>         /* Clear fixmap PTE and PMD mappings */
>>>>>>>>         clear_fixmap(FIX_PTE);
>>>>>>
>>>>>> _______________________________________________
>>>>>> linux-riscv mailing list
>>>>>> linux-riscv at lists.infradead.org
>>>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>>>>>


