From dayss1224 at gmail.com Mon Sep 1 00:35:48 2025
From: dayss1224 at gmail.com (dayss1224 at gmail.com)
Date: Mon, 1 Sep 2025 15:35:48 +0800
Subject: [PATCH v3 0/3] KVM: riscv: selftests: Enable supported test cases
Message-ID:

From: Dong Yang

Add supported KVM test cases and fix the compilation dependencies.
---
Changes in v3:
- Reorder patches to fix build dependencies
- Sort common supported test cases alphabetically
- Move ucall_common.h include from common header to specific source files

Changes in v2:
- Delete some repeat KVM test cases on riscv
- Add missing headers to fix the build for new RISC-V KVM selftests

Dong Yang (1):
  KVM: riscv: selftests: Add missing headers for new testcases

Quan Zhou (2):
  KVM: riscv: selftests: Use the existing RISCV_FENCE macro in `rseq-riscv.h`
  KVM: riscv: selftests: Add common supported test cases

 tools/testing/selftests/kvm/Makefile.kvm                | 6 ++++++
 tools/testing/selftests/kvm/access_tracking_perf_test.c | 1 +
 tools/testing/selftests/kvm/include/riscv/processor.h   | 1 +
 .../selftests/kvm/memslot_modification_stress_test.c    | 1 +
 tools/testing/selftests/kvm/memslot_perf_test.c         | 1 +
 tools/testing/selftests/rseq/rseq-riscv.h               | 3 +--
 6 files changed, 11 insertions(+), 2 deletions(-)

--
2.34.1

From dayss1224 at gmail.com Mon Sep 1 00:35:49 2025
From: dayss1224 at gmail.com (dayss1224 at gmail.com)
Date: Mon, 1 Sep 2025 15:35:49 +0800
Subject: [PATCH v3 1/3] KVM: riscv: selftests: Use the existing RISCV_FENCE macro in `rseq-riscv.h`
In-Reply-To:
References:
Message-ID: <85e5e51757c9289ca463fbc4ba6d22f9c9db791b.1756710918.git.dayss1224@gmail.com>

From: Quan Zhou

To avoid redefinition issues with RISCV_FENCE, directly reference the
existing macro in `rseq-riscv.h`.
Signed-off-by: Quan Zhou
Signed-off-by: Dong Yang
Reviewed-by: Andrew Jones
---
 tools/testing/selftests/rseq/rseq-riscv.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rseq/rseq-riscv.h b/tools/testing/selftests/rseq/rseq-riscv.h
index 67d544aaa9a3..06c840e81c8b 100644
--- a/tools/testing/selftests/rseq/rseq-riscv.h
+++ b/tools/testing/selftests/rseq/rseq-riscv.h
@@ -8,6 +8,7 @@
  * exception when executed in all modes.
  */
 #include
+#include
 
 #if defined(__BYTE_ORDER) ? (__BYTE_ORDER == __LITTLE_ENDIAN) : defined(__LITTLE_ENDIAN)
 #define RSEQ_SIG 0xf1401073 /* csrr mhartid, x0 */
@@ -24,8 +25,6 @@
 #define REG_L __REG_SEL("ld ", "lw ")
 #define REG_S __REG_SEL("sd ", "sw ")
 
-#define RISCV_FENCE(p, s) \
-	__asm__ __volatile__ ("fence " #p "," #s : : : "memory")
 #define rseq_smp_mb() RISCV_FENCE(rw, rw)
 #define rseq_smp_rmb() RISCV_FENCE(r, r)
 #define rseq_smp_wmb() RISCV_FENCE(w, w)
--
2.34.1

From dayss1224 at gmail.com Mon Sep 1 00:35:50 2025
From: dayss1224 at gmail.com (dayss1224 at gmail.com)
Date: Mon, 1 Sep 2025 15:35:50 +0800
Subject: [PATCH v3 2/3] KVM: riscv: selftests: Add missing headers for new testcases
In-Reply-To:
References:
Message-ID:

From: Dong Yang

Add missing headers to fix the build for new RISC-V KVM selftests.
Signed-off-by: Quan Zhou
Signed-off-by: Dong Yang
---
 tools/testing/selftests/kvm/access_tracking_perf_test.c        | 1 +
 tools/testing/selftests/kvm/include/riscv/processor.h          | 1 +
 tools/testing/selftests/kvm/memslot_modification_stress_test.c | 1 +
 tools/testing/selftests/kvm/memslot_perf_test.c                | 1 +
 4 files changed, 4 insertions(+)

diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c
index c9de66537ec3..b058f27b2141 100644
--- a/tools/testing/selftests/kvm/access_tracking_perf_test.c
+++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c
@@ -50,6 +50,7 @@
 #include "memstress.h"
 #include "guest_modes.h"
 #include "processor.h"
+#include "ucall_common.h"
 
 #include "cgroup_util.h"
 #include "lru_gen_util.h"
diff --git a/tools/testing/selftests/kvm/include/riscv/processor.h b/tools/testing/selftests/kvm/include/riscv/processor.h
index 162f303d9daa..e58282488beb 100644
--- a/tools/testing/selftests/kvm/include/riscv/processor.h
+++ b/tools/testing/selftests/kvm/include/riscv/processor.h
@@ -9,6 +9,7 @@
 #include
 #include
+#include
 #include "kvm_util.h"
 
 #define INSN_OPCODE_MASK 0x007c
diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
index c81a84990eab..3cdfa3b19b85 100644
--- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -22,6 +22,7 @@
 #include "processor.h"
 #include "test_util.h"
 #include "guest_modes.h"
+#include "ucall_common.h"
 
 #define DUMMY_MEMSLOT_INDEX 7
 
diff --git a/tools/testing/selftests/kvm/memslot_perf_test.c b/tools/testing/selftests/kvm/memslot_perf_test.c
index e3711beff7f3..5087d082c4b0 100644
--- a/tools/testing/selftests/kvm/memslot_perf_test.c
+++ b/tools/testing/selftests/kvm/memslot_perf_test.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 
 #define MEM_EXTRA_SIZE SZ_64K
 
--
2.34.1

From
dayss1224 at gmail.com Mon Sep 1 00:35:51 2025
From: dayss1224 at gmail.com (dayss1224 at gmail.com)
Date: Mon, 1 Sep 2025 15:35:51 +0800
Subject: [PATCH v3 3/3] KVM: riscv: selftests: Add common supported test cases
In-Reply-To:
References:
Message-ID:

From: Quan Zhou

Some common KVM test cases are now supported on riscv, as follows:

    access_tracking_perf_test
    dirty_log_perf_test
    memslot_modification_stress_test
    memslot_perf_test
    mmu_stress_test
    rseq_test

Signed-off-by: Quan Zhou
Signed-off-by: Dong Yang
---
 tools/testing/selftests/kvm/Makefile.kvm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index f6fe7a07a0a2..88613a851cc1 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -195,9 +195,15 @@ TEST_GEN_PROGS_s390 += rseq_test
 TEST_GEN_PROGS_riscv = $(TEST_GEN_PROGS_COMMON)
 TEST_GEN_PROGS_riscv += riscv/sbi_pmu_test
 TEST_GEN_PROGS_riscv += riscv/ebreak_test
+TEST_GEN_PROGS_riscv += access_tracking_perf_test
 TEST_GEN_PROGS_riscv += arch_timer
 TEST_GEN_PROGS_riscv += coalesced_io_test
+TEST_GEN_PROGS_riscv += dirty_log_perf_test
 TEST_GEN_PROGS_riscv += get-reg-list
+TEST_GEN_PROGS_riscv += memslot_modification_stress_test
+TEST_GEN_PROGS_riscv += memslot_perf_test
+TEST_GEN_PROGS_riscv += mmu_stress_test
+TEST_GEN_PROGS_riscv += rseq_test
 TEST_GEN_PROGS_riscv += steal_time
 
 TEST_GEN_PROGS_loongarch += coalesced_io_test
--
2.34.1

From wojciech.mroczka at biznator.pl Mon Sep 1 00:55:57 2025
From: wojciech.mroczka at biznator.pl (Wojciech Mroczka)
Date: Mon, 1 Sep 2025 07:55:57 GMT
Subject: Photovoltaics for companies
Message-ID: <20250901084501-0.1.je.1pe2t.0.f8lfhb69mk@biznator.pl>

Good morning,

we help micro, small and medium-sized enterprises reduce their electricity
costs by installing photovoltaic systems.

We currently offer unbeatable terms of cooperation, from planning through
implementation to servicing.

Please let us know whether we may present a proposal for your company.

Regards,
Wojciech Mroczka

From pulehui at huawei.com Mon Sep 1 01:06:39 2025
From: pulehui at huawei.com (Pu Lehui)
Date: Mon, 1 Sep 2025 16:06:39 +0800
Subject: [PATCH] riscv, bpf: Sign extend struct ops return values properly
In-Reply-To:
References: <20250827120344.6796-1-hengqi.chen@gmail.com>
Message-ID: <1be38ff5-ea37-4d5d-9f33-16799d2fe2c5@huawei.com>

On 2025/8/28 9:53, Pu Lehui wrote:
>
> On 2025/8/27 20:03, Hengqi Chen wrote:
>> The ns_bpf_qdisc selftest triggers a kernel panic:
>>
>>      Unable to handle kernel paging request at virtual address
>> ffffffffa38dbf58
>>      Current test_progs pgtable: 4K pagesize, 57-bit VAs,
>> pgdp=0x00000001109cc000
>>      [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401,
>> pud=000000011fffd001, pmd=0000000000000000
>>      Oops [#1]
>>      Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1
>> dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs
>> blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last
>> unloaded: bpf_testmod(OE)]
>>      CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G        W
>> OE       6.17.0-rc1-g2465bb83e0b4 #1 NONE
>>      Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>>      Hardware name: Unknown Unknown Product/Unknown Product, BIOS
>> 2024.01+dfsg-1ubuntu5.1 01/01/2024
>>      epc : __qdisc_run+0x82/0x6f0
>>       ra : __qdisc_run+0x6e/0x6f0
>>      epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550
>>       gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180
>>       t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0
>>       s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001
>>       a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000
>>       a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049
>>       s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000
>>       s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0
>>       s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000
>>       s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000
>>       t5 : 0000000000000000 t6 : ff60000093a6a8b6
>>      status: 0000000200000120 badaddr: ffffffffa38dbf58 cause:
>> 000000000000000d
>>      [] __qdisc_run+0x82/0x6f0
>>      [] __dev_queue_xmit+0x4c0/0x1128
>>      [] neigh_resolve_output+0xd0/0x170
>>      [] ip6_finish_output2+0x226/0x6c8
>>      [] ip6_finish_output+0x10c/0x2a0
>>      [] ip6_output+0x5e/0x178
>>      [] ip6_xmit+0x29a/0x608
>>      [] inet6_csk_xmit+0xe6/0x140
>>      [] __tcp_transmit_skb+0x45c/0xaa8
>>      [] tcp_connect+0x9ce/0xd10
>>      [] tcp_v6_connect+0x4ac/0x5e8
>>      [] __inet_stream_connect+0xd8/0x318
>>      [] inet_stream_connect+0x3e/0x68
>>      [] __sys_connect_file+0x50/0x88
>>      [] __sys_connect+0x96/0xc8
>>      [] __riscv_sys_connect+0x20/0x30
>>      [] do_trap_ecall_u+0x256/0x378
>>      [] handle_exception+0x14a/0x156
>>      Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709
>>      ---[ end trace 0000000000000000 ]---
>>
>> The bpf_fifo_dequeue prog returns a skb which is a pointer.
>> The pointer is treated as a 32bit value and sign extend to
>> 64bit in epilogue. This behavior is right for most bpf prog
>> types but wrong for struct ops which requires RISC-V ABI.
>
> Hi Hengqi,
>
> Nice catch!
>
> Actually, I think commit 7112cd26e606c7ba51f9cc5c1905f06039f6f379 looks
> a little bit wired and related to this issue. I guess I need some time
> to recall this commit.

Hi Hengqi,

Sorry for the late reply; I have been busy with work. After some
backtracking, I dismissed my doubts about commit 7112cd26e606.

>
> Thanks.
>
>>
>> So let's sign extend struct ops return values according to
>> the return value spec in function model.
>>
>> Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized
>> riscv ftrace framework")
>> Signed-off-by: Hengqi Chen
>> ---
>>   arch/riscv/net/bpf_jit_comp64.c | 33 +++++++++++++++++++++++++++++++++
>>   1 file changed, 33 insertions(+)
>>
>> diff --git a/arch/riscv/net/bpf_jit_comp64.c
>> b/arch/riscv/net/bpf_jit_comp64.c
>> index 549c3063c7f1..11ca56320a3f 100644
>> --- a/arch/riscv/net/bpf_jit_comp64.c
>> +++ b/arch/riscv/net/bpf_jit_comp64.c
>> @@ -954,6 +954,33 @@ static int invoke_bpf_prog(struct bpf_tramp_link
>> *l, int args_off, int retval_of
>>       return ret;
>>   }
>> +/*
>> + * Sign-extend the register if necessary
>> + */
>> +static int sign_extend(struct rv_jit_context *ctx, int r, u8 size)
>> +{
>> +    switch (size) {
>> +    case 1:
>> +        emit_slli(r, r, 56, ctx);
>> +        emit_srai(r, r, 56, ctx);
>> +        break;
>> +    case 2:
>> +        emit_slli(r, r, 48, ctx);
>> +        emit_srai(r, r, 48, ctx);
>> +        break;
>> +    case 4:
>> +        emit_addiw(r, r, 0, ctx);
>> +        break;
>> +    case 8:
>> +        break;
>> +    default:
>> +        pr_err("bpf-jit: invalid size %d for sign_extend\n", size);
>> +        return -EINVAL;
>> +    }
>> +
>> +    return 0;
>> +}

We don't need to sign-extend when the return value is 1 or 2 bytes. As for
4 bytes, we already do that in __build_epilogue. So we only need to
take care of 8-byte return values.
And the real fix would be:

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 2f7188e0340a..08cc641f8b7c 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -1177,6 +1177,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
         if (save_ret) {
                 emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
                 emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx);
+                /* Do not truncate return value when it's 8 bytes */
+                if (is_struct_ops && m->ret_size == 8)
+                        emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx);
         }

         emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);

>> +
>>   static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
>>                        const struct btf_func_model *m,
>>                        struct bpf_tramp_links *tlinks,
>> @@ -1177,6 +1204,12 @@ static int __arch_prepare_bpf_trampoline(struct
>> bpf_tramp_image *im,
>>       if (save_ret) {
>>           emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
>>           emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx);
>> +        if (is_struct_ops) {
>> +            emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx);
>> +            ret = sign_extend(ctx, RV_REG_A0, m->ret_size);
>> +            if (ret)
>> +                goto out;
>> +        }
>>       }
>>       emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);

From nutty.liu at hotmail.com Mon Sep 1 01:37:14 2025
From: nutty.liu at hotmail.com (Nutty.Liu)
Date: Mon, 1 Sep 2025 16:37:14 +0800
Subject: [PATCH v2 4/5] riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM
In-Reply-To: <20250826162939.1494021-5-pincheng.plct@isrc.iscas.ac.cn>
References: <20250826162939.1494021-1-pincheng.plct@isrc.iscas.ac.cn> <20250826162939.1494021-5-pincheng.plct@isrc.iscas.ac.cn>
Message-ID:

On 8/27/2025 12:29 AM, Pincheng Wang wrote:
> Extend the KVM ISA extension ONE_REG interface to allow KVM user space
> to detect and enable Zilsd and Zclsd extensions for Guest/VM.
>
> Signed-off-by: Pincheng Wang
> ---
>  arch/riscv/include/uapi/asm/kvm.h | 2 ++
>  arch/riscv/kvm/vcpu_onereg.c      | 2 ++
>  2 files changed, 4 insertions(+)

Reviewed-by: Nutty Liu

Thanks.

> diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
> index 5f59fd226cc5..beb7ce06dce8 100644
> --- a/arch/riscv/include/uapi/asm/kvm.h
> +++ b/arch/riscv/include/uapi/asm/kvm.h
> @@ -174,6 +174,8 @@ enum KVM_RISCV_ISA_EXT_ID {
>  	KVM_RISCV_ISA_EXT_ZCD,
>  	KVM_RISCV_ISA_EXT_ZCF,
>  	KVM_RISCV_ISA_EXT_ZCMOP,
> +	KVM_RISCV_ISA_EXT_ZCLSD,
> +	KVM_RISCV_ISA_EXT_ZILSD,
>  	KVM_RISCV_ISA_EXT_ZAWRS,
>  	KVM_RISCV_ISA_EXT_SMNPM,
>  	KVM_RISCV_ISA_EXT_SSNPM,
> diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
> index 2e1b646f0d61..8219769fc4a1 100644
> --- a/arch/riscv/kvm/vcpu_onereg.c
> +++ b/arch/riscv/kvm/vcpu_onereg.c
> @@ -64,6 +64,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
>  	KVM_ISA_EXT_ARR(ZCD),
>  	KVM_ISA_EXT_ARR(ZCF),
>  	KVM_ISA_EXT_ARR(ZCMOP),
> +	KVM_ISA_EXT_ARR(ZCLSD),
>  	KVM_ISA_EXT_ARR(ZFA),
>  	KVM_ISA_EXT_ARR(ZFH),
>  	KVM_ISA_EXT_ARR(ZFHMIN),
> @@ -78,6 +79,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
>  	KVM_ISA_EXT_ARR(ZIHINTPAUSE),
>  	KVM_ISA_EXT_ARR(ZIHPM),
>  	KVM_ISA_EXT_ARR(ZIMOP),
> +	KVM_ISA_EXT_ARR(ZILSD),
>  	KVM_ISA_EXT_ARR(ZKND),
>  	KVM_ISA_EXT_ARR(ZKNE),
>  	KVM_ISA_EXT_ARR(ZKNH),

From nutty.liu at hotmail.com Mon Sep 1 01:37:45 2025
From: nutty.liu at hotmail.com (Nutty.Liu)
Date: Mon, 1 Sep 2025 16:37:45 +0800
Subject: [PATCH v2 5/5] KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test
In-Reply-To: <20250826162939.1494021-6-pincheng.plct@isrc.iscas.ac.cn>
References: <20250826162939.1494021-1-pincheng.plct@isrc.iscas.ac.cn> <20250826162939.1494021-6-pincheng.plct@isrc.iscas.ac.cn>
Message-ID:

On 8/27/2025 12:29 AM, Pincheng Wang wrote:
> The KVM RISC-V allows Zilsd and Zclsd extensions for Guest/VM so add
> this extension to get-reg-list test.
>
> Signed-off-by: Pincheng Wang
> ---
>  tools/testing/selftests/kvm/riscv/get-reg-list.c | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Nutty Liu

Thanks.

> diff --git a/tools/testing/selftests/kvm/riscv/get-reg-list.c b/tools/testing/selftests/kvm/riscv/get-reg-list.c
> index a0b7dabb5040..477bd386265f 100644
> --- a/tools/testing/selftests/kvm/riscv/get-reg-list.c
> +++ b/tools/testing/selftests/kvm/riscv/get-reg-list.c
> @@ -78,7 +78,9 @@ bool filter_reg(__u64 reg)
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZCB:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZCD:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZCF:
> +	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZCLSD:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZCMOP:
> +	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZILSD:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZFA:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZFH:
>  	case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZFHMIN:
> @@ -530,7 +532,9 @@ static const char *isa_ext_single_id_to_str(__u64 reg_off)
>  		KVM_ISA_EXT_ARR(ZCB),
>  		KVM_ISA_EXT_ARR(ZCD),
>  		KVM_ISA_EXT_ARR(ZCF),
> +		KVM_ISA_EXT_ARR(ZCLSD),
>  		KVM_ISA_EXT_ARR(ZCMOP),
> +		KVM_ISA_EXT_ARR(ZILSD),
>  		KVM_ISA_EXT_ARR(ZFA),
>  		KVM_ISA_EXT_ARR(ZFH),
>  		KVM_ISA_EXT_ARR(ZFHMIN),
> @@ -1199,7 +1203,9 @@ struct vcpu_reg_list *vcpu_configs[] = {
>  	&config_zcb,
>  	&config_zcd,
>  	&config_zcf,
> +	&config_zclsd,
>  	&config_zcmop,
> +	&config_zilsd,
>  	&config_zfa,
>  	&config_zfh,
>  	&config_zfhmin,

From nutty.liu at hotmail.com Mon Sep 1 01:39:00 2025
From: nutty.liu at hotmail.com (Nutty.Liu)
Date: Mon, 1 Sep 2025 16:39:00 +0800
Subject: [PATCH v2 3/5] riscv: hwprobe: export Zilsd and Zclsd ISA extensions
In-Reply-To: <20250826162939.1494021-4-pincheng.plct@isrc.iscas.ac.cn>
References: <20250826162939.1494021-1-pincheng.plct@isrc.iscas.ac.cn> <20250826162939.1494021-4-pincheng.plct@isrc.iscas.ac.cn>
Message-ID:

On 8/27/2025 12:29 AM, Pincheng Wang wrote:
> Export Zilsd and Zclsd ISA extensions through hwprobe.
>
> Signed-off-by: Pincheng Wang
> ---
>  Documentation/arch/riscv/hwprobe.rst  | 8 ++++++++
>  arch/riscv/include/uapi/asm/hwprobe.h | 2 ++
>  arch/riscv/kernel/sys_hwprobe.c       | 2 ++
>  3 files changed, 12 insertions(+)
>

Reviewed-by: Nutty Liu

Thanks.

From nutty.liu at hotmail.com Mon Sep 1 01:41:16 2025
From: nutty.liu at hotmail.com (Nutty.Liu)
Date: Mon, 1 Sep 2025 16:41:16 +0800
Subject: [PATCH v2 2/5] riscv: add ISA extension parsing for Zilsd and Zclsd
In-Reply-To: <20250826162939.1494021-3-pincheng.plct@isrc.iscas.ac.cn>
References: <20250826162939.1494021-1-pincheng.plct@isrc.iscas.ac.cn> <20250826162939.1494021-3-pincheng.plct@isrc.iscas.ac.cn>
Message-ID:

On 8/27/2025 12:29 AM, Pincheng Wang wrote:
> Add parsing for Zilsd and Zclsd ISA extensions which were ratified in
> commit f88abf1 ("Integrating load/store pair for RV32 with the
> main manual") of the riscv-isa-manual.
>
> Signed-off-by: Pincheng Wang
> ---
>  arch/riscv/include/asm/hwcap.h |  2 ++
>  arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++++++++++
>  2 files changed, 26 insertions(+)

Reviewed-by: Nutty Liu

Thanks.
From nutty.liu at hotmail.com Mon Sep 1 01:46:39 2025
From: nutty.liu at hotmail.com (Nutty.Liu)
Date: Mon, 1 Sep 2025 16:46:39 +0800
Subject: [PATCH v2 1/5] dt-bindings: riscv: add Zilsd and Zclsd extension descriptions
In-Reply-To: <20250826162939.1494021-2-pincheng.plct@isrc.iscas.ac.cn>
References: <20250826162939.1494021-1-pincheng.plct@isrc.iscas.ac.cn> <20250826162939.1494021-2-pincheng.plct@isrc.iscas.ac.cn>
Message-ID:

On 8/27/2025 12:29 AM, Pincheng Wang wrote:
> Add descriptions for the Zilsd (Load/Store pair instructions) and
> Zclsd (Compressed Load/Store pair instructions) ISA extensions
> which were ratified in commit f88abf1 ("Integrating load/store
> pair for RV32 with the main manual") of the riscv-isa-manual.
>
> Signed-off-by: Pincheng Wang
> ---
>  .../devicetree/bindings/riscv/extensions.yaml | 36 +++++++++++++++++++
>  1 file changed, 36 insertions(+)

Reviewed-by: Nutty Liu

Thanks.

From bhe at redhat.com Mon Sep 1 02:02:02 2025
From: bhe at redhat.com (Baoquan He)
Date: Mon, 1 Sep 2025 17:02:02 +0800
Subject: [PATCH 0/3] kexec: Fix invalid field access
In-Reply-To:
References: <20250827-kbuf_all-v1-0-1df9882bb01a@debian.org>
Message-ID:

On 09/01/25 at 08:42am, Alexandre Ghiti wrote:
> Hi Breno,
>
> On 8/27/25 12:42, Breno Leitao wrote:
.....snip...
>
> I see that the commit those patches fix is in 6.16 so we should add cc:
> stable.
>
> And who should merge those patches? Should we do it on a per-arch basis?

It's been in Andrew's akpm-mm/mm-hotfixes-unstable.
From hengqi.chen at gmail.com Mon Sep 1 02:14:49 2025
From: hengqi.chen at gmail.com (Hengqi Chen)
Date: Mon, 1 Sep 2025 17:14:49 +0800
Subject: [PATCH] riscv, bpf: Sign extend struct ops return values properly
In-Reply-To: <1be38ff5-ea37-4d5d-9f33-16799d2fe2c5@huawei.com>
References: <20250827120344.6796-1-hengqi.chen@gmail.com> <1be38ff5-ea37-4d5d-9f33-16799d2fe2c5@huawei.com>
Message-ID:

On Mon, Sep 1, 2025 at 4:06 PM Pu Lehui wrote:
>
>
>
> On 2025/8/28 9:53, Pu Lehui wrote:
> >
> > On 2025/8/27 20:03, Hengqi Chen wrote:
> >> The ns_bpf_qdisc selftest triggers a kernel panic:
> >>
> >> Unable to handle kernel paging request at virtual address
> >> ffffffffa38dbf58
> >> Current test_progs pgtable: 4K pagesize, 57-bit VAs,
> >> pgdp=0x00000001109cc000
> >> [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401,
> >> pud=000000011fffd001, pmd=0000000000000000
> >> Oops [#1]
> >> Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1
> >> dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs
> >> blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last
> >> unloaded: bpf_testmod(OE)]
> >> CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W
> >> OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE
> >> Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> >> Hardware name: Unknown Unknown Product/Unknown Product, BIOS
> >> 2024.01+dfsg-1ubuntu5.1 01/01/2024
> >> epc : __qdisc_run+0x82/0x6f0
> >> ra : __qdisc_run+0x6e/0x6f0
> >> epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550
> >> gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180
> >> t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0
> >> s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001
> >> a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000
> >> a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049
> >> s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000
> >> s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0
> >> s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000
> >> s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000
> >> t5 : 0000000000000000 t6 : ff60000093a6a8b6
> >> status: 0000000200000120 badaddr: ffffffffa38dbf58 cause:
> >> 000000000000000d
> >> [] __qdisc_run+0x82/0x6f0
> >> [] __dev_queue_xmit+0x4c0/0x1128
> >> [] neigh_resolve_output+0xd0/0x170
> >> [] ip6_finish_output2+0x226/0x6c8
> >> [] ip6_finish_output+0x10c/0x2a0
> >> [] ip6_output+0x5e/0x178
> >> [] ip6_xmit+0x29a/0x608
> >> [] inet6_csk_xmit+0xe6/0x140
> >> [] __tcp_transmit_skb+0x45c/0xaa8
> >> [] tcp_connect+0x9ce/0xd10
> >> [] tcp_v6_connect+0x4ac/0x5e8
> >> [] __inet_stream_connect+0xd8/0x318
> >> [] inet_stream_connect+0x3e/0x68
> >> [] __sys_connect_file+0x50/0x88
> >> [] __sys_connect+0x96/0xc8
> >> [] __riscv_sys_connect+0x20/0x30
> >> [] do_trap_ecall_u+0x256/0x378
> >> [] handle_exception+0x14a/0x156
> >> Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709
> >> ---[ end trace 0000000000000000 ]---
> >>
> >> The bpf_fifo_dequeue prog returns a skb which is a pointer.
> >> The pointer is treated as a 32bit value and sign extend to
> >> 64bit in epilogue. This behavior is right for most bpf prog
> >> types but wrong for struct ops which requires RISC-V ABI.
> >
> > Hi Hengqi,
> >
> > Nice catch!
> >
> > Actually, I think commit 7112cd26e606c7ba51f9cc5c1905f06039f6f379 looks
> > a little bit wired and related to this issue. I guess I need some time
> > to recall this commit.
>
> Hi Hengqi,
>
> Sorry for late due to busy work. After some backtracking, I dismissed my
> doubts about commit 7112cd26e606.
>
> >
> > Thanks.
> >
> >>
> >> So let's sign extend struct ops return values according to
> >> the return value spec in function model.
> >>
> >> Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized
> >> riscv ftrace framework")
> >> Signed-off-by: Hengqi Chen
> >> ---
> >>   arch/riscv/net/bpf_jit_comp64.c | 33 +++++++++++++++++++++++++++++++++
> >>   1 file changed, 33 insertions(+)
> >>
> >> diff --git a/arch/riscv/net/bpf_jit_comp64.c
> >> b/arch/riscv/net/bpf_jit_comp64.c
> >> index 549c3063c7f1..11ca56320a3f 100644
> >> --- a/arch/riscv/net/bpf_jit_comp64.c
> >> +++ b/arch/riscv/net/bpf_jit_comp64.c
> >> @@ -954,6 +954,33 @@ static int invoke_bpf_prog(struct bpf_tramp_link
> >> *l, int args_off, int retval_of
> >>       return ret;
> >>   }
> >> +/*
> >> + * Sign-extend the register if necessary
> >> + */
> >> +static int sign_extend(struct rv_jit_context *ctx, int r, u8 size)
> >> +{
> >> +    switch (size) {
> >> +    case 1:
> >> +        emit_slli(r, r, 56, ctx);
> >> +        emit_srai(r, r, 56, ctx);
> >> +        break;
> >> +    case 2:
> >> +        emit_slli(r, r, 48, ctx);
> >> +        emit_srai(r, r, 48, ctx);
> >> +        break;
> >> +    case 4:
> >> +        emit_addiw(r, r, 0, ctx);
> >> +        break;
> >> +    case 8:
> >> +        break;
> >> +    default:
> >> +        pr_err("bpf-jit: invalid size %d for sign_extend\n", size);
> >> +        return -EINVAL;
> >> +    }
> >> +
> >> +    return 0;
> >> +}
>
> We don't need to sign-ext when return value is 1 or 2 bytes. As for 4

Could you please elaborate more on this?
IIUC, addiw on 1 byte / 2 byte values is equivalent to zext them.

> bytes, we have already do that in __build_epilogue. So we only need to
> take care of 8 bytes return value. And the real fix would be:
>
> diff --git a/arch/riscv/net/bpf_jit_comp64.c
> b/arch/riscv/net/bpf_jit_comp64.c
> index 2f7188e0340a..08cc641f8b7c 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -1177,6 +1177,9 @@ static int __arch_prepare_bpf_trampoline(struct
> bpf_tramp_image *im,
>          if (save_ret) {
>                  emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
>                  emit_ld(regmap[BPF_REG_0], -(retval_off - 8),
> RV_REG_FP, ctx);
> +                /* Do not truncate return value when it's 8 bytes */
> +                if (is_struct_ops && m->ret_size == 8)
> +                        emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx);
>          }
>
>          emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);
>
> >> +
> >>   static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
> >>                        const struct btf_func_model *m,
> >>                        struct bpf_tramp_links *tlinks,
> >> @@ -1177,6 +1204,12 @@ static int __arch_prepare_bpf_trampoline(struct
> >> bpf_tramp_image *im,
> >>       if (save_ret) {
> >>           emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx);
> >>           emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx);
> >> +        if (is_struct_ops) {
> >> +            emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx);
> >> +            ret = sign_extend(ctx, RV_REG_A0, m->ret_size);
> >> +            if (ret)
> >> +                goto out;
> >> +        }
> >>       }
> >>       emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);

From alex at ghiti.fr Mon Sep 1 02:40:34 2025
From: alex at ghiti.fr (Alexandre Ghiti)
Date: Mon, 1 Sep 2025 11:40:34 +0200
Subject: [PATCH 0/3] kexec: Fix invalid field access
In-Reply-To:
References: <20250827-kbuf_all-v1-0-1df9882bb01a@debian.org>
Message-ID: <8bc8b5b9-30a5-4c2f-b7a2-8074f8321d56@ghiti.fr>

Hi Baoquan,

On 9/1/25 11:02, Baoquan He wrote:
> On 09/01/25 at 08:42am, Alexandre Ghiti wrote:
>> Hi Breno,
>>
>> On 8/27/25 12:42, Breno Leitao wrote:
> .....snip...
>> I see that the commit those patches fix is in 6.16 so we should add cc:
>> stable.
>>
>> And who should merge those patches? Should we do it on a per-arch basis?
> It's been in Andrew's akpm-mm/mm-hotfixes-unstable.

Great, thanks for letting me know.

Alex

>

From geert at linux-m68k.org Mon Sep 1 02:49:45 2025
From: geert at linux-m68k.org (Geert Uytterhoeven)
Date: Mon, 1 Sep 2025 11:49:45 +0200
Subject: [PATCH 000/114] clk: convert drivers from deprecated round_rate() to determine_rate()
In-Reply-To: <7c6cc42c-fc76-4300-b0d2-8dabf54cf337@kernel.org>
References: <20250811-clk-for-stephen-round-rate-v1-0-b3bf97b038dc@redhat.com> <1907e1c7-2b15-4729-8497-a7e6f0526366@kernel.org> <4d31df9e-62c9-4988-9301-2911ff7de229@kernel.org> <7c6cc42c-fc76-4300-b0d2-8dabf54cf337@kernel.org>
Message-ID:

On Sat, 23 Aug 2025 at 18:43, Krzysztof Kozlowski wrote:
> On 22/08/2025 15:09, Brian Masney wrote:
> > On Fri, Aug 22, 2025 at 02:23:50PM +0200, Krzysztof Kozlowski wrote:
> >> On 22/08/2025 13:32, Brian Masney wrote:
> >>> 7 of the 114 patches in this series needs a v2 with a minor fix. I see
> >>> several paths forward to merging this. It's ultimately up to Stephen how
> >>> he wants to proceed.
> >>>
> >>> - I send Stephen a PULL request with all of these patches with the minor
> >>> cleanups to the 7 patches. Depending on the timing, Stephen can merge
> >>> the other work first, and I deal with cleaning up the merge conflicts.
> >>> Or he can if he prefers to instead.
> >>>
> >>> - Stephen applies everyone else's work first to his tree, and then the
> >>> good 107 patches in this series. He skips anything that doesn't apply
> >>> due to other people's work and I follow up with a smaller series.
> >>
> >> Both cause cross tree merge conflicts. Anyway, please document clearly
> >> the dependencies between patches.
> >
> > This series only touches drivers/clk, so it shouldn't cause any issues
> > with other subsystems, unless there's a topic branch somewhere, or I'm
> > missing something?
>
> Individual maintainers handle subdirectories.

FWI(still)W, I have taken the Renesas SoC-specific patches through the
renesas-clk tree...
> > There are some drivers under drivers/clk/ where there is an entry in the
> > MAINTAINERS file that's not Stephen, although it wasn't clear to me if
> > all of those people will send PULL requests to Stephen. I described on
> > the cover how how the series was broken up.
> >
> > - Patches 4-70 are for drivers where there is no clk submaintainer
> > - Patches 71-110 are for drivers where this is an entry in MAINTAINERS
> > (for drivers/clk)
>
> It's hidden between multiple other descriptions of patches, so I really
> would not think that this means that it is okay by individual maintainer
> to take the patch.
>
> This really should be the one most important part of the cover letter
> for something like this.
>

.. It was indeed rather implicit:

"Once all of my conversion patches across the various trees in the kernel
have been merged, I will post a small series that removes the round_rate()
op from the clk core and the documentation. Here's the other patch series
that are currently in flight that need to be merged before we can remove
round_rate() from the core. [...]"

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

From devnull+aleksa.paunovic.htecgroup.com at kernel.org Mon Sep 1 03:38:14 2025
From: devnull+aleksa.paunovic.htecgroup.com at kernel.org (Aleksa Paunovic via B4 Relay)
Date: Mon, 01 Sep 2025 12:38:14 +0200
Subject: [PATCH v3] riscv: Use Zalrsc extension to implement atomic functions
Message-ID: <20250901-p8700-zalrsc-v3-1-ec64fabbe093@htecgroup.com>

From: Chao-ying Fu

Use only LR/SC instructions to implement atomic functions.

Add config ERRATA_MIPS_P8700_AMO_ZALRSC.
Signed-off-by: Chao-ying Fu Signed-off-by: Aleksandar Rikalo Co-developed-by: Aleksa Paunovic Signed-off-by: Aleksa Paunovic --- This patch depends on [1], which implements errata support for the MIPS p8700. Changes in v3: - Use alternatives to replace AMO instructions with LR/SC - Rebase on Alexandre Ghiti's "for-next" branch. - Link to v2: https://lore.kernel.org/linux-riscv/20241225082412.36727-1-arikalo at gmail.com/ [1] https://lore.kernel.org/linux-riscv/20250724-p8700-pause-v5-0-a6cbbe1c3412 at htecgroup.com/ --- arch/riscv/Kconfig.errata | 11 ++ arch/riscv/errata/mips/errata.c | 13 +- arch/riscv/include/asm/atomic.h | 29 ++-- arch/riscv/include/asm/bitops.h | 28 ++-- arch/riscv/include/asm/cmpxchg.h | 9 +- arch/riscv/include/asm/errata_list.h | 215 +++++++++++++++++++++++++++ arch/riscv/include/asm/errata_list_vendors.h | 3 +- arch/riscv/include/asm/futex.h | 41 ++--- arch/riscv/kernel/entry.S | 10 +- 9 files changed, 291 insertions(+), 68 deletions(-) diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata index ac64123433e717d3cd4a6d107f1328d27297f9cc..d0bbdbc4fb753e83432c094d58075a3ec17bb43d 100644 --- a/arch/riscv/Kconfig.errata +++ b/arch/riscv/Kconfig.errata @@ -44,6 +44,17 @@ config ERRATA_MIPS_P8700_PAUSE_OPCODE If you are not using the P8700 processor, say n. +config ERRATA_MIPS_P8700_AMO_ZALRSC + bool "Replace AMO instructions with LR/SC on MIPS P8700" + depends on ERRATA_MIPS && 64BIT + default n + help + The MIPS P8700 does not implement the full A extension, + implementing only Zalrsc. Enabling this will replace + all AMO instructions with LR/SC instructions on the P8700. + + If you are not using the P8700 processor, say n. 
+ config ERRATA_SIFIVE bool "SiFive errata" depends on RISCV_ALTERNATIVE diff --git a/arch/riscv/errata/mips/errata.c b/arch/riscv/errata/mips/errata.c index e984a8152208c34690f89d8101571b097485c360..08c5efd58bf242d3831957f91f6338ef49f61238 100644 --- a/arch/riscv/errata/mips/errata.c +++ b/arch/riscv/errata/mips/errata.c @@ -23,13 +23,22 @@ static inline bool errata_probe_pause(void) return true; } -static u32 mips_errata_probe(void) +static inline bool errata_probe_zalrsc(unsigned long archid) +{ + return archid == 0x8000000000000201; +} + +static u32 mips_errata_probe(unsigned long archid) { u32 cpu_req_errata = 0; if (errata_probe_pause()) cpu_req_errata |= BIT(ERRATA_MIPS_P8700_PAUSE_OPCODE); + if (errata_probe_zalrsc(archid)) + cpu_req_errata |= BIT(ERRATA_MIPS_P8700_ZALRSC); + + return cpu_req_errata; } @@ -38,7 +47,7 @@ void mips_errata_patch_func(struct alt_entry *begin, struct alt_entry *end, unsigned int stage) { struct alt_entry *alt; - u32 cpu_req_errata = mips_errata_probe(); + u32 cpu_req_errata = mips_errata_probe(archid); u32 tmp; BUILD_BUG_ON(ERRATA_MIPS_NUMBER >= RISCV_VENDOR_EXT_ALTERNATIVES_BASE); diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h index 5b96c2f61adb596caf8ee6355d4ee86dbc19903b..fadfbc30ac1a93786bfd32c3980d256361cc1e95 100644 --- a/arch/riscv/include/asm/atomic.h +++ b/arch/riscv/include/asm/atomic.h @@ -54,12 +54,9 @@ static __always_inline void arch_atomic64_set(atomic64_t *v, s64 i) static __always_inline \ void arch_atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \ { \ - __asm__ __volatile__ ( \ - " amo" #asm_op "." 
#asm_type " zero, %1, %0" \ - : "+A" (v->counter) \ - : "r" (I) \ - : "memory"); \ -} \ + register __maybe_unused c_type ret, temp; \ + ALT_ATOMIC_OP(asm_op, I, asm_type, v, ret, temp); \ +} #ifdef CONFIG_GENERIC_ATOMIC64 #define ATOMIC_OPS(op, asm_op, I) \ @@ -89,24 +86,16 @@ static __always_inline \ c_type arch_atomic##prefix##_fetch_##op##_relaxed(c_type i, \ atomic##prefix##_t *v) \ { \ - register c_type ret; \ - __asm__ __volatile__ ( \ - " amo" #asm_op "." #asm_type " %1, %2, %0" \ - : "+A" (v->counter), "=r" (ret) \ - : "r" (I) \ - : "memory"); \ + register __maybe_unused c_type ret, temp; \ + ALT_ATOMIC_FETCH_OP_RELAXED(asm_op, I, asm_type, v, ret, temp); \ return ret; \ } \ static __always_inline \ c_type arch_atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \ -{ \ - register c_type ret; \ - __asm__ __volatile__ ( \ - " amo" #asm_op "." #asm_type ".aqrl %1, %2, %0" \ - : "+A" (v->counter), "=r" (ret) \ - : "r" (I) \ - : "memory"); \ - return ret; \ +{ \ + register __maybe_unused c_type ret, temp; \ + ALT_ATOMIC_FETCH_OP(asm_op, I, asm_type, v, ret, temp); \ + return ret; \ } #define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \ diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h index d59310f74c2ba70caeb7b9b0e9221882117583f5..02955d3d573d92ffca7beb4b75f039df056cd6ec 100644 --- a/arch/riscv/include/asm/bitops.h +++ b/arch/riscv/include/asm/bitops.h @@ -187,30 +187,27 @@ static __always_inline int variable_fls(unsigned int x) #if (BITS_PER_LONG == 64) #define __AMO(op) "amo" #op ".d" +#define __LR "lr.d" +#define __SC "sc.d" #elif (BITS_PER_LONG == 32) #define __AMO(op) "amo" #op ".w" +#define __LR "lr.w" +#define __SC "sc.w" #else #error "Unexpected BITS_PER_LONG" #endif #define __test_and_op_bit_ord(op, mod, nr, addr, ord) \ ({ \ - unsigned long __res, __mask; \ + __maybe_unused unsigned long __res, __mask, __temp; \ __mask = BIT_MASK(nr); \ - __asm__ __volatile__ ( \ - __AMO(op) #ord " %0, %2, %1" 
\ - : "=r" (__res), "+A" (addr[BIT_WORD(nr)]) \ - : "r" (mod(__mask)) \ - : "memory"); \ + ALT_TEST_AND_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __mask, __temp); \ ((__res & __mask) != 0); \ }) #define __op_bit_ord(op, mod, nr, addr, ord) \ - __asm__ __volatile__ ( \ - __AMO(op) #ord " zero, %1, %0" \ - : "+A" (addr[BIT_WORD(nr)]) \ - : "r" (mod(BIT_MASK(nr))) \ - : "memory"); + __maybe_unused unsigned long __res, __temp; \ + ALT_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __temp); #define __test_and_op_bit(op, mod, nr, addr) \ __test_and_op_bit_ord(op, mod, nr, addr, .aqrl) @@ -354,12 +351,9 @@ static __always_inline void arch___clear_bit_unlock( static __always_inline bool arch_xor_unlock_is_negative_byte(unsigned long mask, volatile unsigned long *addr) { - unsigned long res; - __asm__ __volatile__ ( - __AMO(xor) ".rl %0, %2, %1" - : "=r" (res), "+A" (*addr) - : "r" (__NOP(mask)) - : "memory"); + __maybe_unused unsigned long res, temp; + + ALT_ARCH_XOR_UNLOCK(mask, addr, res, temp); return (res & BIT(7)) != 0; } diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index 80bd52363c68690f33bfd54e0cc40399cd60b57b..290745a03d06062714716491668178f37670f85c 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -56,13 +56,8 @@ #define __arch_xchg(sfx, prepend, append, r, p, n) \ ({ \ - __asm__ __volatile__ ( \ - prepend \ - " amoswap" sfx " %0, %2, %1\n" \ - append \ - : "=r" (r), "+A" (*(p)) \ - : "r" (n) \ - : "memory"); \ + __typeof__(*(__ptr)) __maybe_unused temp; \ + ALT_ARCH_XCHG(sfx, prepend, append, r, p, n, temp); \ }) #define _arch_xchg(ptr, new, sc_sfx, swap_sfx, prepend, \ diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h index 6694b5ccdcf85cfe7e767ea4de981b34f2b17b04..481af503e88f4917e54d016b5d875593c016818e 100644 --- a/arch/riscv/include/asm/errata_list.h +++ b/arch/riscv/include/asm/errata_list.h @@ -25,6 +25,7 @@ ALTERNATIVE(__stringify(RISCV_PTR 
do_page_fault), \ __stringify(RISCV_PTR sifive_cip_453_page_fault_trp), \ SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_453, \ CONFIG_ERRATA_SIFIVE_CIP_453) + #else /* !__ASSEMBLER__ */ #define ALT_SFENCE_VMA_ASID(asid) \ @@ -53,6 +54,220 @@ asm(ALTERNATIVE( \ : /* no inputs */ \ : "memory") +#ifdef CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC +#define ALT_ATOMIC_OP(asm_op, I, asm_type, v, ret, temp) \ +asm(ALTERNATIVE( \ + " amo" #asm_op "." #asm_type " zero, %3, %0\n" \ + __nops(3), \ + "1: lr." #asm_type " %1, %0\n" \ + " " #asm_op " %2, %1, %3\n" \ + " sc." #asm_type " %2, %2, %0\n" \ + " bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "+A" (v->counter), "=&r" (ret), "=&r" (temp) \ + : "r" (I) \ + : "memory") + +#define ALT_ATOMIC_FETCH_OP_RELAXED(asm_op, I, asm_type, v, ret, temp) \ +asm(ALTERNATIVE( \ + " amo" #asm_op "." #asm_type " %1, %3, %0\n" \ + __nops(3), \ + "1: lr." #asm_type " %1, %0\n" \ + " " #asm_op " %2, %1, %3\n" \ + " sc." #asm_type " %2, %2, %0\n" \ + " bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "+A" (v->counter), "=&r" (ret), "=&r" (temp) \ + : "r" (I) \ + : "memory") + +#define ALT_ATOMIC_FETCH_OP(asm_op, I, asm_type, v, ret, temp) \ +asm(ALTERNATIVE( \ + " amo" #asm_op "." #asm_type ".aqrl %1, %3, %0\n"\ + __nops(3), \ + "1: lr." #asm_type ".aqrl %1, %0\n" \ + " " #asm_op " %2, %1, %3\n" \ + " sc." 
#asm_type ".aqrl %2, %2, %0\n" \ + " bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "+A" (v->counter), "=&r" (ret), "=&r" (temp) \ + : "r" (I) \ + : "memory") +/* BITOPS.h */ +#define ALT_TEST_AND_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __mask, __temp) \ +asm(ALTERNATIVE( \ + __AMO(op) #ord " zero, %3, %1\n" \ + __nops(3), \ + "1: " __LR #ord " %0, %1\n" \ + #op " %2, %0, %3\n" \ + __SC #ord " %2, %2, %1\n" \ + "bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "=&r" (__res), "+A" (addr[BIT_WORD(nr)]), "=&r" (__temp) \ + : "r" (mod(__mask)) \ + : "memory") + +#define ALT_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __temp) \ +asm(ALTERNATIVE( \ + __AMO(op) #ord " zero, %3, %1\n" \ + __nops(3), \ + "1: " __LR #ord " %0, %1\n" \ + #op " %2, %0, %3\n" \ + __SC #ord " %2, %2, %1\n" \ + "bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "=&r" (__res), "+A" (addr[BIT_WORD(nr)]), "=&r" (__temp) \ + : "r" (mod(BIT_MASK(nr))) \ + : "memory") + +#define ALT_ARCH_XOR_UNLOCK(mask, addr, __res, __temp) \ +asm(ALTERNATIVE( \ + __AMO(xor) ".rl %0, %3, %1\n" \ + __nops(3), \ + "1: " __LR ".rl %0, %1\n" \ + "xor %2, %0, %3\n" \ + __SC ".rl %2, %2, %1\n" \ + "bnez %2, 1b\n", \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "=&r" (__res), "+A" (*addr), "=&r" (__temp) \ + : "r" (__NOP(mask)) \ + : "memory") + +#define ALT_ARCH_XCHG(sfx, prepend, append, r, p, n, temp) \ +asm(ALTERNATIVE( \ + prepend \ + " amoswap" sfx " %0, %3, %1\n" \ + __nops(2) \ + append, \ + prepend \ + "1: lr" sfx " %0, %1\n" \ + " sc" sfx " %2, %3, %1\n" \ + " bnez %2, 1b\n" \ + append, \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : "=&r" (r), "+A" (*(p)), "=&r" (temp) \ + : "r" (n) \ + : "memory") + +/* FUTEX.H */ +#define 
ALT_FUTEX_ATOMIC_OP(insn, ret, oldval, uaddr, oparg, temp) \ +asm(ALTERNATIVE( \ + "1: amo" #insn ".w.aqrl %[ov],%z[op],%[u]\n" \ + __nops(3) \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]), \ + "1: lr.w.aqrl %[ov], %[u]\n" \ + " " #insn" %[t], %[ov], %z[op]\n" \ + " sc.w.aqrl %[t], %[t], %[u]\n" \ + " bnez %[t], 1b\n" \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]), \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : [r] "+r" (ret), [ov] "=&r" (oldval), \ + [t] "=&r" (temp), [u] "+m" (*uaddr) \ + : [op] "Jr" (oparg) \ + : "memory") + +#define ALT_FUTEX_ATOMIC_SWAP(ret, oldval, uaddr, oparg, temp) \ +asm(ALTERNATIVE( \ + "1: amoswap.w.aqrl %[ov],%z[op],%[u]\n" \ + __nops(3) \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]), \ + "1: lr.w.aqrl %[ov], %[u]\n" \ + " mv %[t], %z[op]\n" \ + " sc.w.aqrl %[t], %[t], %[u]\n" \ + " bnez %[t], 1b\n" \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]), \ + MIPS_VENDOR_ID, \ + ERRATA_MIPS_P8700_ZALRSC, \ + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) \ + : [r] "+r" (ret), [ov] "=&r" (oldval), \ + [t] "=&r" (temp), [u] "+m" (*uaddr) \ + : [op] "Jr" (oparg) \ + : "memory") + +#else +#define ALT_ATOMIC_OP(asm_op, I, asm_type, v, ret, temp) \ +asm("amo" #asm_op "." #asm_type " zero, %1, %0" \ + : "+A" (v->counter) \ + : "r" (I) \ + : "memory") + +#define ALT_ATOMIC_FETCH_OP_RELAXED(asm_op, I, asm_type, v, ret, temp) \ +asm("amo" #asm_op "." #asm_type " %1, %2, %0" \ + : "+A" (v->counter), "=r" (ret) \ + : "r" (I) \ + : "memory") + +#define ALT_ATOMIC_FETCH_OP(asm_op, I, asm_type, v, ret, temp) \ +asm("amo" #asm_op "." 
#asm_type ".aqrl %1, %2, %0" \ + : "+A" (v->counter), "=r" (ret) \ + : "r" (I) \ + : "memory") + +#define ALT_TEST_AND_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __mask, __temp) \ +asm(__AMO(op) #ord " %0, %2, %1" \ + : "=r" (__res), "+A" (addr[BIT_WORD(nr)]) \ + : "r" (mod(__mask)) \ + : "memory") + +#define ALT_OP_BIT_ORD(op, mod, nr, addr, ord, __res, __temp) \ +asm(__AMO(op) #ord " zero, %1, %0" \ + : "+A" (addr[BIT_WORD(nr)]) \ + : "r" (mod(BIT_MASK(nr))) \ + : "memory") + +#define ALT_ARCH_XOR_UNLOCK(mask, addr, __res, __temp) \ +asm(__AMO(xor) ".rl %0, %2, %1" \ + : "=r" (res), "+A" (*addr) \ + : "r" (__NOP(mask)) \ + : "memory") + +#define ALT_ARCH_XCHG(sfx, prepend, append, r, p, n, temp) \ +asm(prepend \ + " amoswap" sfx " %0, %2, %1\n" \ + append \ + : "=r" (r), "+A" (*(p)) \ + : "r" (n) \ + : "memory") + +#define ALT_FUTEX_ATOMIC_OP(insn, ret, oldval, uaddr, oparg, temp) \ +asm("1: amo" #insn ".w.aqrl %[ov],%z[op],%[u]\n" \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]) \ + : [r] "+r" (ret), [ov] "=&r" (oldval), \ + [u] "+m" (*uaddr) \ + : [op] "Jr" (oparg) \ + : "memory") + +#define ALT_FUTEX_ATOMIC_SWAP(ret, oldval, uaddr, oparg, temp) \ +asm("1: amoswap.w.aqrl %[ov],%z[op],%[u]\n" \ + "2:\n" \ + _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]) \ + : [r] "+r" (ret), [ov] "=&r" (oldval), \ + [u] "+m" (*uaddr) \ + : [op] "Jr" (oparg) \ + : "memory") +#endif + /* * _val is marked as "will be overwritten", so need to set it to 0 * in the default case. 
diff --git a/arch/riscv/include/asm/errata_list_vendors.h b/arch/riscv/include/asm/errata_list_vendors.h index 9739a70ed69984ba4fc5f51a4967a58c4b02613a..193b930374d6fce99e6e73d749fd3a84151ed304 100644 --- a/arch/riscv/include/asm/errata_list_vendors.h +++ b/arch/riscv/include/asm/errata_list_vendors.h @@ -24,7 +24,8 @@ #ifdef CONFIG_ERRATA_MIPS #define ERRATA_MIPS_P8700_PAUSE_OPCODE 0 -#define ERRATA_MIPS_NUMBER 1 +#define ERRATA_MIPS_P8700_ZALRSC 1 +#define ERRATA_MIPS_NUMBER 2 #endif #endif diff --git a/arch/riscv/include/asm/futex.h b/arch/riscv/include/asm/futex.h index 90c86b115e008a1fb08f3da64382fb4a64d9cc2f..0d64f312f0f495f443306e36a0d487e2456343b0 100644 --- a/arch/riscv/include/asm/futex.h +++ b/arch/riscv/include/asm/futex.h @@ -12,6 +12,7 @@ #include #include #include +#include /* We don't even really need the extable code, but for now keep it simple */ #ifndef CONFIG_MMU @@ -19,48 +20,48 @@ #define __disable_user_access() do { } while (0) #endif -#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \ +#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg, temp) \ { \ __enable_user_access(); \ - __asm__ __volatile__ ( \ - "1: " insn " \n" \ - "2: \n" \ - _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %[r]) \ - : [r] "+r" (ret), [ov] "=&r" (oldval), \ - [u] "+m" (*uaddr) \ - : [op] "Jr" (oparg) \ - : "memory"); \ + ALT_FUTEX_ATOMIC_OP(insn, ret, oldval, uaddr, oparg, temp); \ __disable_user_access(); \ } +#define __futex_atomic_swap(ret, oldval, uaddr, oparg, temp) \ +{ \ + __enable_user_access(); \ + ALT_FUTEX_ATOMIC_SWAP(ret, oldval, uaddr, oparg, temp); \ + __disable_user_access(); \ +} + + static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr) { - int oldval = 0, ret = 0; + int __maybe_unused oldval = 0, ret = 0, temp = 0; if (!access_ok(uaddr, sizeof(u32))) return -EFAULT; switch (op) { case FUTEX_OP_SET: - __futex_atomic_op("amoswap.w.aqrl %[ov],%z[op],%[u]", - ret, oldval, uaddr, oparg); + 
__futex_atomic_swap(ret, oldval, uaddr, oparg, temp); break; case FUTEX_OP_ADD: - __futex_atomic_op("amoadd.w.aqrl %[ov],%z[op],%[u]", - ret, oldval, uaddr, oparg); + __futex_atomic_op(add, + ret, oldval, uaddr, oparg, temp); break; case FUTEX_OP_OR: - __futex_atomic_op("amoor.w.aqrl %[ov],%z[op],%[u]", - ret, oldval, uaddr, oparg); + __futex_atomic_op(or, + ret, oldval, uaddr, oparg, temp); break; case FUTEX_OP_ANDN: - __futex_atomic_op("amoand.w.aqrl %[ov],%z[op],%[u]", - ret, oldval, uaddr, ~oparg); + __futex_atomic_op(and, + ret, oldval, uaddr, oparg, temp); break; case FUTEX_OP_XOR: - __futex_atomic_op("amoxor.w.aqrl %[ov],%z[op],%[u]", - ret, oldval, uaddr, oparg); + __futex_atomic_op(xor, + ret, oldval, uaddr, oparg, temp); break; default: ret = -ENOSYS; diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 3a0ec6fd595691c873717ae1e6af5b3ed9854ca2..708cabd896fb86139eeeff27c4ae73d8c79f0ece 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -72,7 +72,15 @@ beq a2, zero, .Lnew_vmalloc_restore_context /* Atomically reset the current cpu bit in new_vmalloc */ - amoxor.d a0, a1, (a0) + ALTERNATIVE("amoxor.d a0, a1, (a0); \ + .rept 3; nop; .endr;", + "1: lr.d a2, (a0); \ + xor a2, a2, a1; \ + sc.d a2, a2, (a0); \ + bnez a2, 1b;", + MIPS_VENDOR_ID, + ERRATA_MIPS_P8700_ZALRSC, + CONFIG_ERRATA_MIPS_P8700_AMO_ZALRSC) /* Only emit a sfence.vma if the uarch caches invalid entries */ ALTERNATIVE("sfence.vma", "nop", 0, RISCV_ISA_EXT_SVVPTC, 1) --- base-commit: 7d4f659c32a309415078b0bca122fbb26a1e4df5 change-id: 20250714-p8700-zalrsc-f3894be40d06 Best regards, -- Aleksa Paunovic From conor at kernel.org Mon Sep 1 04:04:12 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:12 +0100 Subject: [PATCH v4 0/9] Redo PolarFire SoC's mailbox/clock devicetrees and related code Message-ID: <20250901-rigid-sacrifice-0039c6e6234e@spud> From: Conor Dooley Yo, Stephen - I would really like to know if what I have done with 
the regmap clock is what you were asking for (there's a link to that below to remind yourself). I've been trying to get you to look at this for a while, even just to affirm that I am on the right track!

Cheers,
Conor.

v4:
- unify both regmap clk implementations under one option
- change map_offset to a u32, after Gabriel pointed out that u8 was too restrictive.
- remove locking from regmap portion of reset driver, relying on inherent regmap lock

v3 changes:
- drop simple-mfd (for now) from syscon node

v2 cover letter:

Here's something that I've been mulling over for a while, since I started to understand how devicetree stuff was "meant" to be done. There'd been little reason to actually press forward with it, because it is fairly disruptive. I've finally opted to do it, because a user has come along with a hwmon driver that needs to access the same register region as the mailbox and the author is not keen on using the aux bus, and because I do not want the new pic64gx SoC that's based on PolarFire SoC to use bindings etc that I know to be incorrect.

Given backwards compatibility needs to be maintained, this patch series isn't the prettiest thing I have ever written. The reset driver needs to retain support for the auxiliary bus, which looks a bit messy, but not much can be done there. The mailbox and clock drivers both have to have an "old probe" function to handle the old layout. Thankfully in the clock driver, regmap support can be used to identically handle both old and new devicetree formats - but using a regmap in the mailbox driver was only really possible for the new format, so the code there is unfortunately a bit of an if/else mess that I'm both not proud of, nor really sure is worth "improving".

The series should be pretty splittable per subsystem, only the dts change has some sort of dependency, but I'll not be applying that till everything else is in Linus' tree, so that's not a big deal.
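The old/new layout handling described above can be illustrated with a small user-space sketch. It assumes a plain array standing in for the hardware registers and omits all locking; `reset_ctx`, `update_bits`, `reset_word` and the `reset_*` helpers are hypothetical names for illustration only. The 0x88 offset is the one the reset patch in this series defines as REG_SUBBLK_RESET_CR.

```c
#include <stddef.h>
#include <stdint.h>

#define REG_SUBBLK_RESET_CR 0x88u /* offset taken from the reset patch in this series */

/* Hypothetical context: exactly one of 'reg' (old layout) or 'map' (new layout) is set. */
struct reset_ctx {
	uint32_t *reg; /* old layout: points straight at the reset register */
	uint32_t *map; /* new layout: base of the whole syscon region */
};

/* Read-modify-write in the style of an "update bits" helper: only bits in mask change. */
static void update_bits(uint32_t *word, uint32_t mask, uint32_t val)
{
	*word = (*word & ~mask) | (val & mask);
}

/* Pick the register word according to which layout was found at probe time. */
static uint32_t *reset_word(struct reset_ctx *ctx)
{
	if (ctx->map)
		return &ctx->map[REG_SUBBLK_RESET_CR / sizeof(uint32_t)];
	return ctx->reg;
}

static void reset_assert(struct reset_ctx *ctx, unsigned int id)
{
	update_bits(reset_word(ctx), UINT32_C(1) << id, UINT32_C(1) << id);
}

static void reset_deassert(struct reset_ctx *ctx, unsigned int id)
{
	update_bits(reset_word(ctx), UINT32_C(1) << id, 0);
}

static int reset_status(struct reset_ctx *ctx, unsigned int id)
{
	return (*reset_word(ctx) >> id) & 1u;
}
```

The point of the sketch is that callers of `reset_assert()` and friends never need to know which layout is in use; the choice is made once, when the context is filled in, which mirrors the "old probe" versus syscon probe split in the series.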
I don't really want this stuff in stable, hence a lack of cc: stable anywhere here, since what's currently in the tree works fine for the currently supported hardware. AFAIK, the only other project affected here is U-Boot, which I have already modified to support the new format. I previously submitted this as an RFC, only to Lee and the dt list, in order to get some feedback on the syscon/mfd bindings: https://lore.kernel.org/all/20240815-shindig-bunny-fd42792d638a at spud/ I'm not really going to bother with a proper changelog, since that was submitted with lots of WIP code to get answers to some questions. The main change was "removing" some of the child nodes of the syscons. And as a "real" series where discussion lead to me dropping use of the amlogic clk-regmap support: https://lore.kernel.org/linux-clk/20241002-private-unequal-33cfa6101338 at spud/ As a result of that, I've implemented what I think Stephen was asking for - but I'm not at all sure that it is.. CC: Conor Dooley CC: Daire McNamara CC: pierre-henry.moussay at microchip.com CC: valentina.fernandezalanis at microchip.com CC: Michael Turquette CC: Stephen Boyd CC: Rob Herring CC: Krzysztof Kozlowski CC: Jassi Brar CC: Lee Jones CC: Paul Walmsley CC: Palmer Dabbelt CC: Philipp Zabel CC: linux-riscv at lists.infradead.org CC: linux-clk at vger.kernel.org CC: devicetree at vger.kernel.org CC: linux-kernel at vger.kernel.org CC: Gabriel FERNANDEZ Conor Dooley (9): dt-bindings: mfd: syscon document the control-scb syscon on PolarFire SoC dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC reset: mpfs: add non-auxiliary bus probing dt-bindings: clk: microchip: mpfs: remove first reg region riscv: dts: microchip: fix mailbox description riscv: dts: microchip: convert clock and reset to use syscon clk: divider, gate: create regmap-backed copies of gate and divider clocks clk: microchip: mpfs: use regmap clock 
types .../bindings/clock/microchip,mpfs-clkcfg.yaml | 36 ++- .../devicetree/bindings/mfd/syscon.yaml | 2 + .../microchip,mpfs-mss-top-sysreg.yaml | 47 +++ arch/riscv/boot/dts/microchip/mpfs.dtsi | 34 ++- drivers/clk/Kconfig | 4 + drivers/clk/Makefile | 2 + drivers/clk/clk-divider-regmap.c | 271 ++++++++++++++++++ drivers/clk/clk-gate-regmap.c | 254 ++++++++++++++++ drivers/clk/microchip/Kconfig | 3 + drivers/clk/microchip/clk-mpfs.c | 151 ++++++---- drivers/reset/reset-mpfs.c | 83 ++++-- drivers/soc/microchip/Kconfig | 13 + drivers/soc/microchip/Makefile | 1 + drivers/soc/microchip/mpfs-control-scb.c | 45 +++ drivers/soc/microchip/mpfs-mss-top-sysreg.c | 48 ++++ include/linux/clk-provider.h | 119 ++++++++ 16 files changed, 1018 insertions(+), 95 deletions(-) create mode 100644 Documentation/devicetree/bindings/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml create mode 100644 drivers/clk/clk-divider-regmap.c create mode 100644 drivers/clk/clk-gate-regmap.c create mode 100644 drivers/soc/microchip/mpfs-control-scb.c create mode 100644 drivers/soc/microchip/mpfs-mss-top-sysreg.c -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:13 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:13 +0100 Subject: [PATCH v4 1/9] dt-bindings: mfd: syscon document the control-scb syscon on PolarFire SoC In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-shorten-yahoo-223aeaecd290@spud> From: Conor Dooley The "control-scb" region, contains the "tvs" temperature and voltage sensors and the control/status registers for the system controller's mailbox. The mailbox has a dedicated node, so there's no need for a child node describing it, looking the syscon up by compatible is sufficient. Acked-by: Krzysztof Kozlowski Signed-off-by: Conor Dooley --- v2: add the control-scb syscon here too, since it doesn't have any children. 
--- Documentation/devicetree/bindings/mfd/syscon.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/devicetree/bindings/mfd/syscon.yaml b/Documentation/devicetree/bindings/mfd/syscon.yaml index 27672adeb1fed..d18be50dd7127 100644 --- a/Documentation/devicetree/bindings/mfd/syscon.yaml +++ b/Documentation/devicetree/bindings/mfd/syscon.yaml @@ -90,6 +90,7 @@ select: - mediatek,mt8173-pctl-a-syscfg - mediatek,mt8365-syscfg - microchip,lan966x-cpu-syscon + - microchip,mpfs-control-scb - microchip,mpfs-sysreg-scb - microchip,sam9x60-sfr - microchip,sama7d65-ddr3phy @@ -197,6 +198,7 @@ properties: - mediatek,mt8365-infracfg-nao - mediatek,mt8365-syscfg - microchip,lan966x-cpu-syscon + - microchip,mpfs-control-scb - microchip,mpfs-sysreg-scb - microchip,sam9x60-sfr - microchip,sama7d65-ddr3phy -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:14 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:14 +0100 Subject: [PATCH v4 2/9] dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-garbage-hardship-027861fb3380@spud> From: Conor Dooley "mss-top-sysreg" contains clocks, pinctrl, resets, an interrupt controller and more. At this point, only the reset controller child is described as that's all that is described by the existing bindings. The clock controller already has a dedicated node, and will retain it as there are other clock regions, so like the mailbox, a compatible-based lookup of the syscon is sufficient to keep the clock driver working as before, so no child is needed. There's also an interrupt multiplexing service provided by this syscon, for which there is work in progress at [1]. 
Link: https://lore.kernel.org/linux-gpio/20240723-uncouple-enforcer-7c48e4a4fefe at wendy/ [1] Reviewed-by: Krzysztof Kozlowski Signed-off-by: Conor Dooley --- v3: - drop simple-mfd at Krzysztof's request since the child nodes do not yet exist. v2: - clean up various minor comments from Rob on mpfs-mss-top-sysreg - remove mpfs-control-scb from this patch --- .../microchip,mpfs-mss-top-sysreg.yaml | 47 +++++++++++++++++++ 1 file changed, 47 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml diff --git a/Documentation/devicetree/bindings/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml b/Documentation/devicetree/bindings/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml new file mode 100644 index 0000000000000..1ab691db87950 --- /dev/null +++ b/Documentation/devicetree/bindings/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml @@ -0,0 +1,47 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/microchip/microchip,mpfs-mss-top-sysreg.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Microchip PolarFire SoC Microprocessor Subsystem (MSS) sysreg register region + +maintainers: + - Conor Dooley + +description: + An wide assortment of registers that control elements of the MSS on PolarFire + SoC, including pinmuxing, resets and clocks among others. + +properties: + compatible: + items: + - const: microchip,mpfs-mss-top-sysreg + - const: syscon + + reg: + maxItems: 1 + + '#reset-cells': + description: + The AHB/AXI peripherals on the PolarFire SoC have reset support, so + from CLK_ENVM to CLK_CFM. The reset consumer should specify the + desired peripheral via the clock ID in its "resets" phandle cell. + See include/dt-bindings/clock/microchip,mpfs-clock.h for the full list + of PolarFire clock/reset IDs. 
+ const: 1 + +required: + - compatible + - reg + +additionalProperties: false + +examples: + - | + syscon at 20002000 { + compatible = "microchip,mpfs-mss-top-sysreg", "syscon"; + reg = <0x20002000 0x1000>; + #reset-cells = <1>; + }; + -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:15 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:15 +0100 Subject: [PATCH v4 3/9] soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-contempt-smelting-04aa3b08e112@spud> From: Conor Dooley The control-scb and mss-top-sysreg regions on PolarFire SoC both fulfill multiple purposes. The former is used for mailbox functions in addition to the temperature & voltage sensor while the latter is used for clocks, resets, interrupt muxing and pinctrl. Signed-off-by: Conor Dooley --- drivers/soc/microchip/Kconfig | 13 ++++++ drivers/soc/microchip/Makefile | 1 + drivers/soc/microchip/mpfs-control-scb.c | 45 +++++++++++++++++++ drivers/soc/microchip/mpfs-mss-top-sysreg.c | 48 +++++++++++++++++++++ 4 files changed, 107 insertions(+) create mode 100644 drivers/soc/microchip/mpfs-control-scb.c create mode 100644 drivers/soc/microchip/mpfs-mss-top-sysreg.c diff --git a/drivers/soc/microchip/Kconfig b/drivers/soc/microchip/Kconfig index 19f4b576f822b..31d188311e05f 100644 --- a/drivers/soc/microchip/Kconfig +++ b/drivers/soc/microchip/Kconfig @@ -9,3 +9,16 @@ config POLARFIRE_SOC_SYS_CTRL module will be called mpfs_system_controller. If unsure, say N. + +config POLARFIRE_SOC_SYSCONS + bool "PolarFire SoC (MPFS) syscon drivers" + default y + depends on ARCH_MICROCHIP + select MFD_CORE + help + These drivers add support for the syscons on PolarFire SoC (MPFS). + Without these drivers core parts of the kernel such as clocks + and resets will not function correctly. + + If unsure, and on a PolarFire SoC, say y. 
+ diff --git a/drivers/soc/microchip/Makefile b/drivers/soc/microchip/Makefile index 14489919fe4b3..1a3a1594b089b 100644 --- a/drivers/soc/microchip/Makefile +++ b/drivers/soc/microchip/Makefile @@ -1 +1,2 @@ obj-$(CONFIG_POLARFIRE_SOC_SYS_CTRL) += mpfs-sys-controller.o +obj-$(CONFIG_POLARFIRE_SOC_SYSCONS) += mpfs-control-scb.o mpfs-mss-top-sysreg.o diff --git a/drivers/soc/microchip/mpfs-control-scb.c b/drivers/soc/microchip/mpfs-control-scb.c new file mode 100644 index 0000000000000..d1a8e79c232e3 --- /dev/null +++ b/drivers/soc/microchip/mpfs-control-scb.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include + +static const struct mfd_cell mpfs_control_scb_devs[] = { + { .name = "mpfs-tvs", }, +}; + +static int mpfs_control_scb_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + int ret; + + ret = mfd_add_devices(dev, PLATFORM_DEVID_NONE, mpfs_control_scb_devs, + 1, NULL, 0, NULL); + if (ret) + return ret; + + return 0; +} + +static const struct of_device_id mpfs_control_scb_of_match[] = { + {.compatible = "microchip,mpfs-control-scb", }, + {}, +}; +MODULE_DEVICE_TABLE(of, mpfs_control_scb_of_match); + +static struct platform_driver mpfs_control_scb_driver = { + .driver = { + .name = "mpfs-control-scb", + .of_match_table = mpfs_control_scb_of_match, + }, + .probe = mpfs_control_scb_probe, +}; +module_platform_driver(mpfs_control_scb_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Conor Dooley "); +MODULE_DESCRIPTION("PolarFire SoC control scb driver"); diff --git a/drivers/soc/microchip/mpfs-mss-top-sysreg.c b/drivers/soc/microchip/mpfs-mss-top-sysreg.c new file mode 100644 index 0000000000000..9b2e7b84cdba2 --- /dev/null +++ b/drivers/soc/microchip/mpfs-mss-top-sysreg.c @@ -0,0 +1,48 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include + +static const struct mfd_cell mpfs_mss_top_sysreg_devs[] = { + { 
.name = "mpfs-reset", }, +}; + +static int mpfs_mss_top_sysreg_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + int ret; + + ret = mfd_add_devices(dev, PLATFORM_DEVID_NONE, mpfs_mss_top_sysreg_devs, + 1, NULL, 0, NULL); + if (ret) + return ret; + + if (devm_of_platform_populate(dev)) + dev_err(dev, "Error populating children\n"); + + return 0; +} + +static const struct of_device_id mpfs_mss_top_sysreg_of_match[] = { + {.compatible = "microchip,mpfs-mss-top-sysreg", }, + {}, +}; +MODULE_DEVICE_TABLE(of, mpfs_mss_top_sysreg_of_match); + +static struct platform_driver mpfs_mss_top_sysreg_driver = { + .driver = { + .name = "mpfs-mss-top-sysreg", + .of_match_table = mpfs_mss_top_sysreg_of_match, + }, + .probe = mpfs_mss_top_sysreg_probe, +}; +module_platform_driver(mpfs_mss_top_sysreg_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Conor Dooley "); +MODULE_DESCRIPTION("PolarFire SoC mss top sysreg driver"); -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:16 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:16 +0100 Subject: [PATCH v4 4/9] reset: mpfs: add non-auxiliary bus probing In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-caption-outsell-d1824ab2485a@spud> From: Conor Dooley While the auxiliary bus was a nice bandaid, and meant that re-writing the representation of the clock regions in devicetree was not required, it has run its course. The "mss_top_sysreg" region that contains the clock and reset regions also contains pinctrl and an interrupt controller, so the time has come to rewrite the devicetree and probe the reset controller from an mfd devicetree node, rather than implement those drivers using the auxiliary bus. Wanting to avoid propagating this naive/incorrect description of the hardware to the new pic64gx SoC is a major motivating factor here.
Signed-off-by: Conor Dooley --- v4: - Only use driver specific lock for non-regmap writes v2: - Implement the request to use regmap_update_bits(). I found that I then hated the read/write helpers since they were just bloat, so I ripped them out. I replaced the regular spin_lock_irqsave() stuff with a guard(spinlock_irqsave), since that's a simpler way of handling the two different paths through such a trivial pair of functions. --- drivers/reset/reset-mpfs.c | 83 ++++++++++++++++++++++++++++++-------- 1 file changed, 66 insertions(+), 17 deletions(-) diff --git a/drivers/reset/reset-mpfs.c b/drivers/reset/reset-mpfs.c index f6fa10e03ea88..8e5ed4deecf37 100644 --- a/drivers/reset/reset-mpfs.c +++ b/drivers/reset/reset-mpfs.c @@ -7,13 +7,16 @@ * */ #include +#include #include #include +#include #include #include #include -#include +#include #include +#include #include #include @@ -27,11 +30,14 @@ #define MPFS_SLEEP_MIN_US 100 #define MPFS_SLEEP_MAX_US 200 +#define REG_SUBBLK_RESET_CR 0x88u + /* block concurrent access to the soft reset register */ static DEFINE_SPINLOCK(mpfs_reset_lock); struct mpfs_reset { void __iomem *base; + struct regmap *regmap; struct reset_controller_dev rcdev; }; @@ -46,41 +52,50 @@ static inline struct mpfs_reset *to_mpfs_reset(struct reset_controller_dev *rcde static int mpfs_assert(struct reset_controller_dev *rcdev, unsigned long id) { struct mpfs_reset *rst = to_mpfs_reset(rcdev); - unsigned long flags; u32 reg; - spin_lock_irqsave(&mpfs_reset_lock, flags); + if (rst->regmap) { + regmap_update_bits(rst->regmap, REG_SUBBLK_RESET_CR, BIT(id), BIT(id)); + return 0; + } + + guard(spinlock_irqsave)(&mpfs_reset_lock); reg = readl(rst->base); reg |= BIT(id); writel(reg, rst->base); - spin_unlock_irqrestore(&mpfs_reset_lock, flags); - return 0; } static int mpfs_deassert(struct reset_controller_dev *rcdev, unsigned long id) { struct mpfs_reset *rst = to_mpfs_reset(rcdev); - unsigned long flags; u32 reg; - spin_lock_irqsave(&mpfs_reset_lock, 
flags); + if (rst->regmap) { + regmap_update_bits(rst->regmap, REG_SUBBLK_RESET_CR, BIT(id), 0); + return 0; + } + + guard(spinlock_irqsave)(&mpfs_reset_lock); reg = readl(rst->base); reg &= ~BIT(id); writel(reg, rst->base); - spin_unlock_irqrestore(&mpfs_reset_lock, flags); - return 0; } static int mpfs_status(struct reset_controller_dev *rcdev, unsigned long id) { struct mpfs_reset *rst = to_mpfs_reset(rcdev); - u32 reg = readl(rst->base); + u32 reg; + + if (rst->regmap) + regmap_read(rst->regmap, REG_SUBBLK_RESET_CR, &reg); + else + reg = readl(rst->base); /* * It is safe to return here as MPFS_NUM_RESETS makes sure the sign bit @@ -130,11 +145,45 @@ static int mpfs_reset_xlate(struct reset_controller_dev *rcdev, return index - MPFS_PERIPH_OFFSET; } -static int mpfs_reset_probe(struct auxiliary_device *adev, - const struct auxiliary_device_id *id) +static int mpfs_reset_mfd_probe(struct platform_device *pdev) { - struct device *dev = &adev->dev; struct reset_controller_dev *rcdev; + struct device *dev = &pdev->dev; + struct mpfs_reset *rst; + + rst = devm_kzalloc(dev, sizeof(*rst), GFP_KERNEL); + if (!rst) + return -ENOMEM; + + rcdev = &rst->rcdev; + rcdev->dev = dev; + rcdev->ops = &mpfs_reset_ops; + + rcdev->of_node = pdev->dev.parent->of_node; + rcdev->of_reset_n_cells = 1; + rcdev->of_xlate = mpfs_reset_xlate; + rcdev->nr_resets = MPFS_NUM_RESETS; + + rst->regmap = device_node_to_regmap(pdev->dev.parent->of_node); + if (IS_ERR(rst->regmap)) + dev_err_probe(dev, PTR_ERR(rst->regmap), "Failed to find syscon regmap\n"); + + return devm_reset_controller_register(dev, rcdev); +} + +static struct platform_driver mpfs_reset_mfd_driver = { + .probe = mpfs_reset_mfd_probe, + .driver = { + .name = "mpfs-reset", + }, +}; +module_platform_driver(mpfs_reset_mfd_driver); + +static int mpfs_reset_adev_probe(struct auxiliary_device *adev, + const struct auxiliary_device_id *id) +{ + struct reset_controller_dev *rcdev; + struct device *dev = &adev->dev; struct mpfs_reset *rst;
rst = devm_kzalloc(dev, sizeof(*rst), GFP_KERNEL); @@ -145,8 +194,8 @@ static int mpfs_reset_probe(struct auxiliary_device *adev, rcdev = &rst->rcdev; rcdev->dev = dev; - rcdev->dev->parent = dev->parent; rcdev->ops = &mpfs_reset_ops; + rcdev->of_node = dev->parent->of_node; rcdev->of_reset_n_cells = 1; rcdev->of_xlate = mpfs_reset_xlate; @@ -176,12 +225,12 @@ static const struct auxiliary_device_id mpfs_reset_ids[] = { }; MODULE_DEVICE_TABLE(auxiliary, mpfs_reset_ids); -static struct auxiliary_driver mpfs_reset_driver = { - .probe = mpfs_reset_probe, +static struct auxiliary_driver mpfs_reset_aux_driver = { + .probe = mpfs_reset_adev_probe, .id_table = mpfs_reset_ids, }; -module_auxiliary_driver(mpfs_reset_driver); +module_auxiliary_driver(mpfs_reset_aux_driver); MODULE_DESCRIPTION("Microchip PolarFire SoC Reset Driver"); MODULE_AUTHOR("Conor Dooley "); -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:17 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:17 +0100 Subject: [PATCH v4 5/9] dt-bindings: clk: microchip: mpfs: remove first reg region In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-properly-banister-ccf27886c8e6@spud> From: Conor Dooley The first reg region in this binding is not exclusively for clocks, as evidenced by the dual role of this device as a reset controller at present. The first region is however better described by a simple-mfd syscon, but this would have required a significant re-write of the devicetree for the platform, so the easy way out was chosen when reset support was first introduced. The region doesn't just contain clock and reset registers, it also contains pinctrl and interrupt controller functionality, so drop the region from the clock binding so that it can be described instead by a simple-mfd syscon rather than propagate this incorrect description of the hardware to the new pic64gx SoC.
Acked-by: Rob Herring (Arm) Signed-off-by: Conor Dooley --- .../bindings/clock/microchip,mpfs-clkcfg.yaml | 36 +++++++++++-------- 1 file changed, 22 insertions(+), 14 deletions(-) diff --git a/Documentation/devicetree/bindings/clock/microchip,mpfs-clkcfg.yaml b/Documentation/devicetree/bindings/clock/microchip,mpfs-clkcfg.yaml index e4e1c31267d2a..ee4f31596d978 100644 --- a/Documentation/devicetree/bindings/clock/microchip,mpfs-clkcfg.yaml +++ b/Documentation/devicetree/bindings/clock/microchip,mpfs-clkcfg.yaml @@ -22,16 +22,23 @@ properties: const: microchip,mpfs-clkcfg reg: - items: - - description: | - clock config registers: - These registers contain enable, reset & divider tables for the, cpu, - axi, ahb and rtc/mtimer reference clocks as well as enable and reset - for the peripheral clocks. - - description: | - mss pll dri registers: - Block of registers responsible for dynamic reconfiguration of the mss - pll + oneOf: + - items: + - description: | + clock config registers: + These registers contain enable, reset & divider tables for the, cpu, + axi, ahb and rtc/mtimer reference clocks as well as enable and reset + for the peripheral clocks. 
+ - description: | + mss pll dri registers: + Block of registers responsible for dynamic reconfiguration of the mss + pll + deprecated: true + - items: + - description: | + mss pll dri registers: + Block of registers responsible for dynamic reconfiguration of the mss + pll clocks: maxItems: 1 @@ -69,11 +76,12 @@ examples: - | #include soc { - #address-cells = <2>; - #size-cells = <2>; - clkcfg: clock-controller@20002000 { + #address-cells = <1>; + #size-cells = <1>; + + clkcfg: clock-controller@3E001000 { compatible = "microchip,mpfs-clkcfg"; - reg = <0x0 0x20002000 0x0 0x1000>, <0x0 0x3E001000 0x0 0x1000>; + reg = <0x3E001000 0x1000>; clocks = <&ref>; #clock-cells = <1>; }; -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:18 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:18 +0100 Subject: [PATCH v4 6/9] riscv: dts: microchip: fix mailbox description In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-excretion-employed-1e497728e00e@spud> From: Conor Dooley When the binding for the mailbox on PolarFire SoC was originally written, and later modified, mistakes were made - and the precise nature of the later modification should have been a giveaway, but alas I was naive at the time. A more correct modelling of the hardware is to use two syscons and have a single reg entry for the mailbox, containing the mailbox region. The two syscons contain the general control/status registers for the mailbox and the interrupt related registers respectively. The reason for two syscons is that the same mailbox is present on the non-SoC version of the FPGA, which has no interrupt controller, and the shared part of the rtl was unchanged between devices.
Signed-off-by: Conor Dooley --- arch/riscv/boot/dts/microchip/mpfs.dtsi | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/riscv/boot/dts/microchip/mpfs.dtsi b/arch/riscv/boot/dts/microchip/mpfs.dtsi index 9883ca3554c50..f9d6bf08e7170 100644 --- a/arch/riscv/boot/dts/microchip/mpfs.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi @@ -259,6 +259,11 @@ clkcfg: clkcfg@20002000 { #reset-cells = <1>; }; + sysreg_scb: syscon@20003000 { + compatible = "microchip,mpfs-sysreg-scb", "syscon"; + reg = <0x0 0x20003000 0x0 0x1000>; + }; + ccc_se: clock-controller@38010000 { compatible = "microchip,mpfs-ccc"; reg = <0x0 0x38010000 0x0 0x1000>, <0x0 0x38020000 0x0 0x1000>, @@ -521,10 +526,14 @@ usb: usb@20201000 { status = "disabled"; }; - mbox: mailbox@37020000 { + control_scb: syscon@37020000 { + compatible = "microchip,mpfs-control-scb", "syscon"; + reg = <0x0 0x37020000 0x0 0x100>; + }; + + mbox: mailbox@37020800 { compatible = "microchip,mpfs-mailbox"; - reg = <0x0 0x37020000 0x0 0x58>, <0x0 0x2000318C 0x0 0x40>, - <0x0 0x37020800 0x0 0x100>; + reg = <0x0 0x37020800 0x0 0x1000>; interrupt-parent = <&plic>; interrupts = <96>; #mbox-cells = <1>; -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:19 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:19 +0100 Subject: [PATCH v4 7/9] riscv: dts: microchip: convert clock and reset to use syscon In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-famine-turf-deaa34bba81c@spud> From: Conor Dooley The "subblock" clocks and reset registers on PolarFire SoC are located in the mss-top-sysreg region, alongside pinctrl and interrupt control functionality. Re-write the devicetree to describe the sys explicitly, as its own node, rather than as a region of the clock node. Correspondingly, the phandles to the reset controller must be updated to the new provider.
The drivers will continue to support the old way of doing things. Signed-off-by: Conor Dooley --- arch/riscv/boot/dts/microchip/mpfs.dtsi | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/arch/riscv/boot/dts/microchip/mpfs.dtsi b/arch/riscv/boot/dts/microchip/mpfs.dtsi index f9d6bf08e7170..5c2963e269b83 100644 --- a/arch/riscv/boot/dts/microchip/mpfs.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi @@ -251,11 +251,9 @@ pdma: dma-controller@3000000 { #dma-cells = <1>; }; - clkcfg: clkcfg@20002000 { - compatible = "microchip,mpfs-clkcfg"; - reg = <0x0 0x20002000 0x0 0x1000>, <0x0 0x3E001000 0x0 0x1000>; - clocks = <&refclk>; - #clock-cells = <1>; + mss_top_sysreg: syscon@20002000 { + compatible = "microchip,mpfs-mss-top-sysreg", "syscon", "simple-mfd"; + reg = <0x0 0x20002000 0x0 0x1000>; #reset-cells = <1>; }; @@ -452,7 +450,7 @@ mac0: ethernet@20110000 { local-mac-address = [00 00 00 00 00 00]; clocks = <&clkcfg CLK_MAC0>, <&clkcfg CLK_AHB>; clock-names = "pclk", "hclk"; - resets = <&clkcfg CLK_MAC0>; + resets = <&mss_top_sysreg CLK_MAC0>; status = "disabled"; }; @@ -466,7 +464,7 @@ mac1: ethernet@20112000 { local-mac-address = [00 00 00 00 00 00]; clocks = <&clkcfg CLK_MAC1>, <&clkcfg CLK_AHB>; clock-names = "pclk", "hclk"; - resets = <&clkcfg CLK_MAC1>; + resets = <&mss_top_sysreg CLK_MAC1>; status = "disabled"; }; @@ -550,5 +548,12 @@ syscontroller_qspi: spi@37020100 { clocks = <&scbclk>; status = "disabled"; }; + + clkcfg: clkcfg@3e001000 { + compatible = "microchip,mpfs-clkcfg"; + reg = <0x0 0x3e001000 0x0 0x1000>; + clocks = <&refclk>; + #clock-cells = <1>; + }; }; }; -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:20 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:20 +0100 Subject: [PATCH v4 8/9] clk: divider, gate: create regmap-backed copies of gate and divider clocks In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References:
<20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-yearling-reconcile-99d06fe7868e@spud> From: Conor Dooley Implement regmap-backed copies of gate and divider clocks by replacing the iomem pointer to the clock registers with a regmap and offset within. Signed-off-by: Conor Dooley --- v4: - increase map_offset to a u32 - use a single Kconfig option for both divider and gate regmap implementations --- drivers/clk/Kconfig | 4 + drivers/clk/Makefile | 2 + drivers/clk/clk-divider-regmap.c | 271 +++++++++++++++++++++++++++++++ drivers/clk/clk-gate-regmap.c | 254 +++++++++++++++++++++++++++++ include/linux/clk-provider.h | 119 ++++++++++++++ 5 files changed, 650 insertions(+) create mode 100644 drivers/clk/clk-divider-regmap.c create mode 100644 drivers/clk/clk-gate-regmap.c diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig index 4d56475f94fc1..3490b6fe6b9b2 100644 --- a/drivers/clk/Kconfig +++ b/drivers/clk/Kconfig @@ -33,6 +33,10 @@ menuconfig COMMON_CLK if COMMON_CLK +config COMMON_CLK_REGMAP + bool + select REGMAP + config COMMON_CLK_WM831X tristate "Clock driver for WM831x/2x PMICs" depends on MFD_WM831X diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile index 18ed29cfdc113..b93b9e3e5fc07 100644 --- a/drivers/clk/Makefile +++ b/drivers/clk/Makefile @@ -21,11 +21,13 @@ clk-test-y := clk_test.o \ kunit_clk_hw_get_dev_of_node.dtbo.o \ kunit_clk_parent_data_test.dtbo.o obj-$(CONFIG_COMMON_CLK) += clk-divider.o +obj-$(CONFIG_COMMON_CLK_REGMAP) += clk-divider-regmap.o obj-$(CONFIG_COMMON_CLK) += clk-fixed-factor.o obj-$(CONFIG_COMMON_CLK) += clk-fixed-rate.o obj-$(CONFIG_CLK_FIXED_RATE_KUNIT_TEST) += clk-fixed-rate-test.o clk-fixed-rate-test-y := clk-fixed-rate_test.o kunit_clk_fixed_rate_test.dtbo.o obj-$(CONFIG_COMMON_CLK) += clk-gate.o +obj-$(CONFIG_COMMON_CLK_REGMAP) += clk-gate-regmap.o obj-$(CONFIG_CLK_GATE_KUNIT_TEST) += clk-gate_test.o obj-$(CONFIG_COMMON_CLK) += clk-multiplier.o obj-$(CONFIG_COMMON_CLK) += clk-mux.o diff --git 
a/drivers/clk/clk-divider-regmap.c b/drivers/clk/clk-divider-regmap.c new file mode 100644 index 0000000000000..e2f9489ad9ef9 --- /dev/null +++ b/drivers/clk/clk-divider-regmap.c @@ -0,0 +1,271 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include + +static inline u32 clk_div_regmap_readl(struct clk_divider_regmap *divider) +{ + u32 val; + + regmap_read(divider->regmap, divider->map_offset, &val); + + return val; +} + +static inline void clk_div_regmap_writel(struct clk_divider_regmap *divider, u32 val) +{ + regmap_write(divider->regmap, divider->map_offset, val); + +} + +static unsigned long clk_divider_regmap_recalc_rate(struct clk_hw *hw, + unsigned long parent_rate) +{ + struct clk_divider_regmap *divider = to_clk_divider_regmap(hw); + unsigned int val; + + val = clk_div_regmap_readl(divider) >> divider->shift; + val &= clk_div_mask(divider->width); + + return divider_recalc_rate(hw, parent_rate, val, divider->table, + divider->flags, divider->width); +} + +static long clk_divider_regmap_round_rate(struct clk_hw *hw, unsigned long rate, + unsigned long *prate) +{ + struct clk_divider_regmap *divider = to_clk_divider_regmap(hw); + + /* if read only, just return current value */ + if (divider->flags & CLK_DIVIDER_READ_ONLY) { + u32 val; + + val = clk_div_regmap_readl(divider) >> divider->shift; + val &= clk_div_mask(divider->width); + + return divider_ro_round_rate(hw, rate, prate, divider->table, + divider->width, divider->flags, + val); + } + + return divider_round_rate(hw, rate, prate, divider->table, + divider->width, divider->flags); +} + +static int clk_divider_regmap_determine_rate(struct clk_hw *hw, + struct clk_rate_request *req) +{ + struct clk_divider_regmap *divider = to_clk_divider_regmap(hw); + + /* if read only, just return current value */ + if (divider->flags & CLK_DIVIDER_READ_ONLY) { + u32 val; + + val = clk_div_regmap_readl(divider) >> divider->shift; + val &= clk_div_mask(divider->width); + + return 
divider_ro_determine_rate(hw, req, divider->table, + divider->width, + divider->flags, val); + } + + return divider_determine_rate(hw, req, divider->table, divider->width, + divider->flags); +} + +static int clk_divider_regmap_set_rate(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate) +{ + struct clk_divider_regmap *divider = to_clk_divider_regmap(hw); + int value; + unsigned long flags = 0; + u32 val; + + value = divider_get_val(rate, parent_rate, divider->table, + divider->width, divider->flags); + if (value < 0) + return value; + + if (divider->lock) + spin_lock_irqsave(divider->lock, flags); + else + __acquire(divider->lock); + + if (divider->flags & CLK_DIVIDER_HIWORD_MASK) { + val = clk_div_mask(divider->width) << (divider->shift + 16); + } else { + val = clk_div_regmap_readl(divider); + val &= ~(clk_div_mask(divider->width) << divider->shift); + } + val |= (u32)value << divider->shift; + clk_div_regmap_writel(divider, val); + + if (divider->lock) + spin_unlock_irqrestore(divider->lock, flags); + else + __release(divider->lock); + + return 0; +} + +const struct clk_ops clk_divider_regmap_ops = { + .recalc_rate = clk_divider_regmap_recalc_rate, + .round_rate = clk_divider_regmap_round_rate, + .determine_rate = clk_divider_regmap_determine_rate, + .set_rate = clk_divider_regmap_set_rate, +}; +EXPORT_SYMBOL_GPL(clk_divider_regmap_ops); + +const struct clk_ops clk_divider_regmap_ro_ops = { + .recalc_rate = clk_divider_regmap_recalc_rate, + .round_rate = clk_divider_regmap_round_rate, + .determine_rate = clk_divider_regmap_determine_rate, +}; +EXPORT_SYMBOL_GPL(clk_divider_regmap_ro_ops); + +struct clk_hw *__clk_hw_register_divider_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, unsigned long flags, + struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + 
spinlock_t *lock) +{ + struct clk_divider_regmap *div; + struct clk_hw *hw; + struct clk_init_data init = {}; + int ret; + + if (clk_divider_flags & CLK_DIVIDER_HIWORD_MASK) { + if (width + shift > 16) { + pr_warn("divider value exceeds LOWORD field\n"); + return ERR_PTR(-EINVAL); + } + } + + /* allocate the divider */ + div = kzalloc(sizeof(*div), GFP_KERNEL); + if (!div) + return ERR_PTR(-ENOMEM); + + init.name = name; + if (clk_divider_flags & CLK_DIVIDER_READ_ONLY) + init.ops = &clk_divider_regmap_ro_ops; + else + init.ops = &clk_divider_regmap_ops; + init.flags = flags; + init.parent_names = parent_name ? &parent_name : NULL; + init.parent_hws = parent_hw ? &parent_hw : NULL; + init.parent_data = parent_data; + if (parent_name || parent_hw || parent_data) + init.num_parents = 1; + else + init.num_parents = 0; + + /* struct clk_divider assignments */ + div->regmap = regmap; + div->map_offset = map_offset; + div->shift = shift; + div->width = width; + div->flags = clk_divider_flags; + div->lock = lock; + div->hw.init = &init; + div->table = table; + + /* register the clock */ + hw = &div->hw; + ret = clk_hw_register(dev, hw); + if (ret) { + kfree(div); + hw = ERR_PTR(ret); + } + + return hw; +} +EXPORT_SYMBOL_GPL(__clk_hw_register_divider_regmap); + +struct clk *clk_register_divider_regmap_table(struct device *dev, const char *name, + const char *parent_name, unsigned long flags, + struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + spinlock_t *lock) +{ + struct clk_hw *hw; + + hw = __clk_hw_register_divider_regmap(dev, NULL, name, parent_name, NULL, + NULL, flags, regmap, map_offset, + shift, width, clk_divider_flags, + table, lock); + if (IS_ERR(hw)) + return ERR_CAST(hw); + return hw->clk; +} +EXPORT_SYMBOL_GPL(clk_register_divider_regmap_table); + +void clk_unregister_divider_regmap(struct clk *clk) +{ + struct clk_divider_regmap *div; + struct clk_hw *hw; + + hw = __clk_get_hw(clk); + if 
(!hw) + return; + + div = to_clk_divider_regmap(hw); + + clk_unregister(clk); + kfree(div); +} +EXPORT_SYMBOL_GPL(clk_unregister_divider_regmap); + +/** + * clk_hw_unregister_divider_regmap - unregister a clk divider + * @hw: hardware-specific clock data to unregister + */ +void clk_hw_unregister_divider_regmap(struct clk_hw *hw) +{ + struct clk_divider_regmap *div; + + div = to_clk_divider_regmap(hw); + + clk_hw_unregister(hw); + kfree(div); +} +EXPORT_SYMBOL_GPL(clk_hw_unregister_divider_regmap); + +static void devm_clk_hw_release_divider_regmap(struct device *dev, void *res) +{ + clk_hw_unregister_divider_regmap(*(struct clk_hw **)res); +} + +struct clk_hw *__devm_clk_hw_register_divider_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, unsigned long flags, + struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + spinlock_t *lock) +{ + struct clk_hw **ptr, *hw; + + ptr = devres_alloc(devm_clk_hw_release_divider_regmap, sizeof(*ptr), GFP_KERNEL); + if (!ptr) + return ERR_PTR(-ENOMEM); + + hw = __clk_hw_register_divider_regmap(dev, np, name, parent_name, parent_hw, + parent_data, flags, regmap, map_offset, + shift, width, clk_divider_flags, table, + lock); + + if (!IS_ERR(hw)) { + *ptr = hw; + devres_add(dev, ptr); + } else { + devres_free(ptr); + } + + return hw; +} +EXPORT_SYMBOL_GPL(__devm_clk_hw_register_divider_regmap); diff --git a/drivers/clk/clk-gate-regmap.c b/drivers/clk/clk-gate-regmap.c new file mode 100644 index 0000000000000..54105738909c7 --- /dev/null +++ b/drivers/clk/clk-gate-regmap.c @@ -0,0 +1,254 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include +#include + +/** + * DOC: basic gatable clock which can gate and ungate its output + * + * Traits of this clock: + * prepare - clk_(un)prepare only 
ensures parent is (un)prepared + * enable - clk_enable and clk_disable are functional & control gating + * rate - inherits rate from parent. No clk_set_rate support + * parent - fixed parent. No clk_set_parent support + */ + +static inline u32 clk_gate_regmap_readl(struct clk_gate_regmap *gate) +{ + u32 val; + + regmap_read(gate->map, gate->map_offset, &val); + + return val; +} + +static inline void clk_gate_regmap_writel(struct clk_gate_regmap *gate, u32 val) +{ + regmap_write(gate->map, gate->map_offset, val); + +} + +/* + * It works on following logic: + * + * For enabling clock, enable = 1 + * set2dis = 1 -> clear bit -> set = 0 + * set2dis = 0 -> set bit -> set = 1 + * + * For disabling clock, enable = 0 + * set2dis = 1 -> set bit -> set = 1 + * set2dis = 0 -> clear bit -> set = 0 + * + * So, result is always: enable xor set2dis. + */ +static void clk_gate_regmap_endisable(struct clk_hw *hw, int enable) +{ + struct clk_gate_regmap *gate = to_clk_gate_regmap(hw); + int set = gate->flags & CLK_GATE_SET_TO_DISABLE ? 
1 : 0; + unsigned long flags; + u32 reg; + + set ^= enable; + + if (gate->lock) + spin_lock_irqsave(gate->lock, flags); + else + __acquire(gate->lock); + + if (gate->flags & CLK_GATE_HIWORD_MASK) { + reg = BIT(gate->bit_idx + 16); + if (set) + reg |= BIT(gate->bit_idx); + } else { + reg = clk_gate_regmap_readl(gate); + + if (set) + reg |= BIT(gate->bit_idx); + else + reg &= ~BIT(gate->bit_idx); + } + + clk_gate_regmap_writel(gate, reg); + + if (gate->lock) + spin_unlock_irqrestore(gate->lock, flags); + else + __release(gate->lock); +} + +static int clk_gate_regmap_enable(struct clk_hw *hw) +{ + clk_gate_regmap_endisable(hw, 1); + + return 0; +} + +static void clk_gate_regmap_disable(struct clk_hw *hw) +{ + clk_gate_regmap_endisable(hw, 0); +} + +int clk_gate_regmap_is_enabled(struct clk_hw *hw) +{ + u32 reg; + struct clk_gate_regmap *gate = to_clk_gate_regmap(hw); + + reg = clk_gate_regmap_readl(gate); + + /* if a set bit disables this clk, flip it before masking */ + if (gate->flags & CLK_GATE_SET_TO_DISABLE) + reg ^= BIT(gate->bit_idx); + + reg &= BIT(gate->bit_idx); + + return reg ? 
1 : 0; +} +EXPORT_SYMBOL_GPL(clk_gate_regmap_is_enabled); + +const struct clk_ops clk_gate_regmap_ops = { + .enable = clk_gate_regmap_enable, + .disable = clk_gate_regmap_disable, + .is_enabled = clk_gate_regmap_is_enabled, +}; +EXPORT_SYMBOL_GPL(clk_gate_regmap_ops); + +struct clk_hw *__clk_hw_register_gate_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, + unsigned long flags, + struct regmap *map, u32 map_offset, u8 bit_idx, + u8 clk_gate_flags, spinlock_t *lock) +{ + struct clk_gate_regmap *gate; + struct clk_hw *hw; + struct clk_init_data init = {}; + int ret = -EINVAL; + + if (clk_gate_flags & CLK_GATE_HIWORD_MASK) { + if (bit_idx > 15) { + pr_err("gate bit exceeds LOWORD field\n"); + return ERR_PTR(-EINVAL); + } + } + + /* allocate the gate */ + gate = kzalloc(sizeof(*gate), GFP_KERNEL); + if (!gate) + return ERR_PTR(-ENOMEM); + + init.name = name; + init.ops = &clk_gate_regmap_ops; + init.flags = flags; + init.parent_names = parent_name ? &parent_name : NULL; + init.parent_hws = parent_hw ? 
&parent_hw : NULL; + init.parent_data = parent_data; + if (parent_name || parent_hw || parent_data) + init.num_parents = 1; + else + init.num_parents = 0; + + /* struct clk_gate_regmap assignments */ + gate->map = map; + gate->map_offset = map_offset; + gate->bit_idx = bit_idx; + gate->flags = clk_gate_flags; + gate->lock = lock; + gate->hw.init = &init; + + hw = &gate->hw; + if (dev || !np) + ret = clk_hw_register(dev, hw); + else if (np) + ret = of_clk_hw_register(np, hw); + if (ret) { + kfree(gate); + hw = ERR_PTR(ret); + } + + return hw; + +} +EXPORT_SYMBOL_GPL(__clk_hw_register_gate_regmap); + +struct clk *clk_register_gate_regmap(struct device *dev, const char *name, + const char *parent_name, unsigned long flags, struct regmap *map, + u32 map_offset, u8 bit_idx, u8 clk_gate_flags, spinlock_t *lock) +{ + struct clk_hw *hw; + + hw = __clk_hw_register_gate_regmap(dev, NULL, name, parent_name, NULL, + NULL, flags, map, map_offset, bit_idx, + clk_gate_flags, lock); + if (IS_ERR(hw)) + return ERR_CAST(hw); + return hw->clk; +} +EXPORT_SYMBOL_GPL(clk_register_gate_regmap); + +void clk_unregister_gate_regmap(struct clk *clk) +{ + struct clk_gate_regmap *gate; + struct clk_hw *hw; + + hw = __clk_get_hw(clk); + if (!hw) + return; + + gate = to_clk_gate_regmap(hw); + + clk_unregister(clk); + kfree(gate); +} +EXPORT_SYMBOL_GPL(clk_unregister_gate_regmap); + +void clk_hw_unregister_gate_regmap(struct clk_hw *hw) +{ + struct clk_gate_regmap *gate; + + gate = to_clk_gate_regmap(hw); + + clk_hw_unregister(hw); + kfree(gate); +} +EXPORT_SYMBOL_GPL(clk_hw_unregister_gate_regmap); + +static void devm_clk_hw_release_gate_regmap(struct device *dev, void *res) +{ + clk_hw_unregister_gate_regmap(*(struct clk_hw **)res); +} + +struct clk_hw *__devm_clk_hw_register_gate_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, + unsigned long flags, struct regmap 
*map, + u32 map_offset, u8 bit_idx, + u8 clk_gate_flags, spinlock_t *lock) +{ + struct clk_hw **ptr, *hw; + + ptr = devres_alloc(devm_clk_hw_release_gate_regmap, sizeof(*ptr), GFP_KERNEL); + if (!ptr) + return ERR_PTR(-ENOMEM); + + hw = __clk_hw_register_gate_regmap(dev, np, name, parent_name, parent_hw, + parent_data, flags, map, map_offset, + bit_idx, clk_gate_flags, lock); + + if (!IS_ERR(hw)) { + *ptr = hw; + devres_add(dev, ptr); + } else { + devres_free(ptr); + } + + return hw; +} +EXPORT_SYMBOL_GPL(__devm_clk_hw_register_gate_regmap); diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h index 630705a471294..0d9ef5c8bf960 100644 --- a/include/linux/clk-provider.h +++ b/include/linux/clk-provider.h @@ -8,6 +8,7 @@ #include #include +#include /* * flags used across common struct clk. these flags should only affect the @@ -538,6 +539,37 @@ struct clk_gate { #define CLK_GATE_BIG_ENDIAN BIT(2) extern const struct clk_ops clk_gate_ops; + +#ifdef CONFIG_COMMON_CLK_REGMAP +/** + * struct clk_gate_regmap - gating clock via regmap + * + * @hw: handle between common and hardware-specific interfaces + * @map: regmap controlling gate + * @map_offset: register offset within the regmap controlling gate + * @bit_idx: single bit controlling gate + * @flags: hardware-specific flags + * @lock: register lock + * + * Clock which can gate its output. 
Implements .enable & .disable + * + * Flags: + * See clk_gate + */ +struct clk_gate_regmap { + struct clk_hw hw; + struct regmap *map; + u32 map_offset; + u8 bit_idx; + u8 flags; + spinlock_t *lock; +}; + +#define to_clk_gate_regmap(_hw) container_of(_hw, struct clk_gate_regmap, hw) + +extern const struct clk_ops clk_gate_regmap_ops; +#endif + struct clk_hw *__clk_hw_register_gate(struct device *dev, struct device_node *np, const char *name, const char *parent_name, const struct clk_hw *parent_hw, @@ -663,6 +695,31 @@ void clk_unregister_gate(struct clk *clk); void clk_hw_unregister_gate(struct clk_hw *hw); int clk_gate_is_enabled(struct clk_hw *hw); +#ifdef CONFIG_COMMON_CLK_REGMAP +struct clk_hw *__clk_hw_register_gate_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, + unsigned long flags, + struct regmap *map, u32 map_offset, u8 bit_idx, + u8 clk_gate_flags, spinlock_t *lock); +struct clk_hw *__devm_clk_hw_register_gate_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, + unsigned long flags, + struct regmap *map, u32 map_offset, u8 bit_idx, + u8 clk_gate_flags, spinlock_t *lock); +struct clk *clk_register_gate_regmap(struct device *dev, const char *name, + const char *parent_name, unsigned long flags, + struct regmap *map, u32 map_offset, u8 bit_idx, + u8 clk_gate_flags, spinlock_t *lock); + +void clk_unregister_gate_regmap(struct clk *clk); +void clk_hw_unregister_gate_regmap(struct clk_hw *hw); +int clk_gate_regmap_is_enabled(struct clk_hw *hw); +#endif + struct clk_div_table { unsigned int val; unsigned int div; @@ -736,6 +793,41 @@ struct clk_divider { extern const struct clk_ops clk_divider_ops; extern const struct clk_ops clk_divider_ro_ops; +#ifdef CONFIG_COMMON_CLK_REGMAP +/** + * struct clk_divider_regmap - 
adjustable divider clock via regmap + * + * @hw: handle between common and hardware-specific interfaces + * @map: regmap containing the divider + * @map_offset: register offset within the regmap containing the divider + * @shift: shift to the divider bit field + * @width: width of the divider bit field + * @table: array of value/divider pairs, last entry should have div = 0 + * @lock: register lock + * + * Clock with an adjustable divider affecting its output frequency. Implements + * .recalc_rate, .set_rate and .round_rate + * + * @flags: + * See clk_divider + */ +struct clk_divider_regmap { + struct clk_hw hw; + struct regmap *regmap; + u32 map_offset; + u8 shift; + u8 width; + u8 flags; + const struct clk_div_table *table; + spinlock_t *lock; +}; + +#define to_clk_divider_regmap(_hw) container_of(_hw, struct clk_divider_regmap, hw) + +extern const struct clk_ops clk_divider_regmap_ops; +extern const struct clk_ops clk_divider_regmap_ro_ops; +#endif + unsigned long divider_recalc_rate(struct clk_hw *hw, unsigned long parent_rate, unsigned int val, const struct clk_div_table *table, unsigned long flags, unsigned long width); @@ -972,6 +1064,33 @@ struct clk *clk_register_divider_table(struct device *dev, const char *name, void clk_unregister_divider(struct clk *clk); void clk_hw_unregister_divider(struct clk_hw *hw); +#ifdef CONFIG_COMMON_CLK_REGMAP +struct clk_hw *__clk_hw_register_divider_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, unsigned long flags, + struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + spinlock_t *lock); + +struct clk_hw *__devm_clk_hw_register_divider_regmap(struct device *dev, + struct device_node *np, const char *name, + const char *parent_name, const struct clk_hw *parent_hw, + const struct clk_parent_data *parent_data, unsigned long flags, + 
struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + spinlock_t *lock); + +struct clk *clk_register_divider_regmap_table(struct device *dev, + const char *name, const char *parent_name, unsigned long flags, + struct regmap *regmap, u32 map_offset, u8 shift, u8 width, + u8 clk_divider_flags, const struct clk_div_table *table, + spinlock_t *lock); + +void clk_unregister_divider_regmap(struct clk *clk); +void clk_hw_unregister_divider_regmap(struct clk_hw *hw); +#endif + /** * struct clk_mux - multiplexer clock * -- 2.47.2 From conor at kernel.org Mon Sep 1 04:04:21 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 12:04:21 +0100 Subject: [PATCH v4 9/9] clk: microchip: mpfs: use regmap clock types In-Reply-To: <20250901-rigid-sacrifice-0039c6e6234e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> Message-ID: <20250901-handful-cardinal-c988f42ac8d9@spud> From: Conor Dooley Convert the PolarFire SoC clock driver to use regmap clock types as preparatory work for supporting the new binding for this device, which will only provide the second of the two register regions and will require the use of a syscon regmap to access the "cfg" and "periph" clocks currently supported by the driver.
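As a rough userspace illustration of what a regmap-backed table divider has to do with map_offset, shift, width and table, the sketch below reads a bit field out of a register and looks up the divisor in a value/divider table terminated by div = 0. It is only a sketch under stated assumptions: fake_regmap, fake_regmap_read(), div_entry and sketch_recalc_rate() are hypothetical names invented for the example, not the kernel's regmap or clk API, and returning 0 for an unknown field value is a choice made here, not something taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a regmap: a few 32-bit registers, 4-byte stride. */
struct fake_regmap {
	uint32_t regs[8];
};

/* Hypothetical value/divider pair; a table ends with an entry whose div is 0. */
struct div_entry {
	unsigned int val;
	unsigned int div;
};

/* Read one 32-bit register at a byte offset (assumed to be 4-byte aligned). */
static uint32_t fake_regmap_read(const struct fake_regmap *map, uint32_t offset)
{
	return map->regs[offset / 4];
}

/* Extract the divider bit field described by shift and width. */
static uint32_t read_div_field(const struct fake_regmap *map, uint32_t map_offset,
			       uint8_t shift, uint8_t width)
{
	uint32_t mask = width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;

	return (fake_regmap_read(map, map_offset) >> shift) & mask;
}

/* Walk the table until the div = 0 terminator; 0 means "no entry found". */
static unsigned int lookup_div(const struct div_entry *table, unsigned int val)
{
	for (; table->div; table++)
		if (table->val == val)
			return table->div;
	return 0;
}

/* Sketch of a recalc step: parent rate divided by the looked-up divisor. */
static unsigned long sketch_recalc_rate(unsigned long parent_rate,
					const struct fake_regmap *map,
					uint32_t map_offset, uint8_t shift,
					uint8_t width,
					const struct div_entry *table)
{
	unsigned int div = lookup_div(table, read_div_field(map, map_offset,
							     shift, width));

	return div ? parent_rate / div : 0;
}
```

With a register at offset 4 holding 0x200, shift 8 and width 2, the field reads as 2; a table mapping 2 to a divisor of 4 then turns a 600 MHz parent into 150 MHz.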
Signed-off-by: Conor Dooley --- drivers/clk/microchip/Kconfig | 3 + drivers/clk/microchip/clk-mpfs.c | 151 ++++++++++++++++++++----------- 2 files changed, 100 insertions(+), 54 deletions(-) diff --git a/drivers/clk/microchip/Kconfig b/drivers/clk/microchip/Kconfig index 0724ce65898f3..72da1e0f437d9 100644 --- a/drivers/clk/microchip/Kconfig +++ b/drivers/clk/microchip/Kconfig @@ -7,6 +7,9 @@ config MCHP_CLK_MPFS bool "Clk driver for PolarFire SoC" depends on ARCH_MICROCHIP_POLARFIRE || COMPILE_TEST default ARCH_MICROCHIP_POLARFIRE + depends on MFD_SYSCON select AUXILIARY_BUS + select COMMON_CLK_REGMAP + select REGMAP_MMIO help Supports Clock Configuration for PolarFire SoC diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c index c22632a7439c5..c7fec0fcbe379 100644 --- a/drivers/clk/microchip/clk-mpfs.c +++ b/drivers/clk/microchip/clk-mpfs.c @@ -6,8 +6,10 @@ */ #include #include +#include #include #include +#include #include #include @@ -30,6 +32,14 @@ #define MSSPLL_POSTDIV_WIDTH 0x07u #define MSSPLL_FIXED_DIV 4u +static const struct regmap_config clk_mpfs_regmap_config = { + .reg_bits = 32, + .reg_stride = 4, + .val_bits = 32, + .val_format_endian = REGMAP_ENDIAN_LITTLE, + .max_register = REG_SUBBLK_CLOCK_CR, +}; + /* * This clock ID is defined here, rather than the binding headers, as it is an * internal clock only, and therefore has no consumers in other peripheral @@ -39,6 +49,7 @@ struct mpfs_clock_data { struct device *dev; + struct regmap *regmap; void __iomem *base; void __iomem *msspll_base; struct clk_hw_onecell_data hw_data; @@ -68,14 +79,12 @@ struct mpfs_msspll_out_hw_clock { #define to_mpfs_msspll_out_clk(_hw) container_of(_hw, struct mpfs_msspll_out_hw_clock, hw) struct mpfs_cfg_hw_clock { - struct clk_divider cfg; - struct clk_init_data init; + struct clk_divider_regmap divider; unsigned int id; - u32 reg_offset; }; struct mpfs_periph_hw_clock { - struct clk_gate periph; + struct clk_gate_regmap gate; unsigned int id; 
}; @@ -172,15 +181,15 @@ static int mpfs_clk_register_mssplls(struct device *dev, struct mpfs_msspll_hw_c * MSS PLL output clocks */ -#define CLK_PLL_OUT(_id, _name, _parent, _flags, _shift, _width, _offset) { \ - .id = _id, \ - .output.shift = _shift, \ - .output.width = _width, \ - .output.table = NULL, \ - .reg_offset = _offset, \ - .output.flags = _flags, \ - .output.hw.init = CLK_HW_INIT(_name, _parent, &clk_divider_ops, 0), \ - .output.lock = &mpfs_clk_lock, \ +#define CLK_PLL_OUT(_id, _name, _parent, _flags, _shift, _width, _offset) { \ + .id = _id, \ + .output.shift = _shift, \ + .output.width = _width, \ + .output.table = NULL, \ + .reg_offset = _offset, \ + .output.flags = _flags, \ + .output.hw.init = CLK_HW_INIT(_name, _parent, &clk_divider_regmap_ops, 0), \ + .output.lock = &mpfs_clk_lock, \ } static struct mpfs_msspll_out_hw_clock mpfs_msspll_out_clks[] = { @@ -220,15 +229,14 @@ static int mpfs_clk_register_msspll_outs(struct device *dev, * "CFG" clocks */ -#define CLK_CFG(_id, _name, _parent, _shift, _width, _table, _flags, _offset) { \ - .id = _id, \ - .cfg.shift = _shift, \ - .cfg.width = _width, \ - .cfg.table = _table, \ - .reg_offset = _offset, \ - .cfg.flags = _flags, \ - .cfg.hw.init = CLK_HW_INIT(_name, _parent, &clk_divider_ops, 0), \ - .cfg.lock = &mpfs_clk_lock, \ +#define CLK_CFG(_id, _name, _parent, _shift, _width, _table, _flags, _offset) { \ + .id = _id, \ + .divider.shift = _shift, \ + .divider.width = _width, \ + .divider.table = _table, \ + .divider.map_offset = _offset, \ + .divider.flags = _flags, \ + .divider.hw.init = CLK_HW_INIT(_name, _parent, &clk_divider_regmap_ops, 0), \ } #define CLK_CPU_OFFSET 0u @@ -245,13 +253,13 @@ static struct mpfs_cfg_hw_clock mpfs_cfg_clks[] = { REG_CLOCK_CONFIG_CR), { .id = CLK_RTCREF, - .cfg.shift = 0, - .cfg.width = 12, - .cfg.table = mpfs_div_rtcref_table, - .reg_offset = REG_RTC_CLOCK_CR, - .cfg.flags = CLK_DIVIDER_ONE_BASED, - .cfg.hw.init = - CLK_HW_INIT_PARENTS_DATA("clk_rtcref", 
mpfs_ext_ref, &clk_divider_ops, 0), + .divider.shift = 0, + .divider.width = 12, + .divider.table = mpfs_div_rtcref_table, + .divider.map_offset = REG_RTC_CLOCK_CR, + .divider.flags = CLK_DIVIDER_ONE_BASED, + .divider.hw.init = + CLK_HW_INIT_PARENTS_DATA("clk_rtcref", mpfs_ext_ref, &clk_divider_regmap_ops, 0), } }; @@ -264,14 +272,14 @@ static int mpfs_clk_register_cfgs(struct device *dev, struct mpfs_cfg_hw_clock * for (i = 0; i < num_clks; i++) { struct mpfs_cfg_hw_clock *cfg_hw = &cfg_hws[i]; - cfg_hw->cfg.reg = data->base + cfg_hw->reg_offset; - ret = devm_clk_hw_register(dev, &cfg_hw->cfg.hw); + cfg_hw->divider.regmap = data->regmap; + ret = devm_clk_hw_register(dev, &cfg_hw->divider.hw); if (ret) return dev_err_probe(dev, ret, "failed to register clock id: %d\n", cfg_hw->id); id = cfg_hw->id; - data->hw_data.hws[id] = &cfg_hw->cfg.hw; + data->hw_data.hws[id] = &cfg_hw->divider.hw; } return 0; @@ -281,15 +289,14 @@ static int mpfs_clk_register_cfgs(struct device *dev, struct mpfs_cfg_hw_clock * * peripheral clocks - devices connected to axi or ahb buses. 
*/ -#define CLK_PERIPH(_id, _name, _parent, _shift, _flags) { \ - .id = _id, \ - .periph.bit_idx = _shift, \ - .periph.hw.init = CLK_HW_INIT_HW(_name, _parent, &clk_gate_ops, \ - _flags), \ - .periph.lock = &mpfs_clk_lock, \ +#define CLK_PERIPH(_id, _name, _parent, _shift, _flags) { \ + .id = _id, \ + .gate.map_offset = REG_SUBBLK_CLOCK_CR, \ + .gate.bit_idx = _shift, \ + .gate.hw.init = CLK_HW_INIT_HW(_name, _parent, &clk_gate_regmap_ops, _flags), \ } -#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT##_OFFSET].cfg.hw) +#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT##_OFFSET].divider.hw) /* * Critical clocks: @@ -346,19 +353,60 @@ static int mpfs_clk_register_periphs(struct device *dev, struct mpfs_periph_hw_c for (i = 0; i < num_clks; i++) { struct mpfs_periph_hw_clock *periph_hw = &periph_hws[i]; - periph_hw->periph.reg = data->base + REG_SUBBLK_CLOCK_CR; - ret = devm_clk_hw_register(dev, &periph_hw->periph.hw); + periph_hw->gate.map = data->regmap; + ret = devm_clk_hw_register(dev, &periph_hw->gate.hw); if (ret) return dev_err_probe(dev, ret, "failed to register clock id: %d\n", periph_hw->id); id = periph_hws[i].id; - data->hw_data.hws[id] = &periph_hw->periph.hw; + data->hw_data.hws[id] = &periph_hw->gate.hw; } return 0; } +static inline int mpfs_clk_syscon_probe(struct mpfs_clock_data *clk_data, + struct platform_device *pdev) +{ + clk_data->regmap = syscon_regmap_lookup_by_compatible("microchip,mpfs-mss-top-sysreg"); + if (IS_ERR(clk_data->regmap)) + return PTR_ERR(clk_data->regmap); + + clk_data->msspll_base = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(clk_data->msspll_base)) + return PTR_ERR(clk_data->msspll_base); + + return 0; +} + +static inline int mpfs_clk_old_format_probe(struct mpfs_clock_data *clk_data, + struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + int ret; + + dev_warn(&pdev->dev, "falling back to old devicetree format"); + + clk_data->base = devm_platform_ioremap_resource(pdev, 0); + if 
(IS_ERR(clk_data->base)) + return PTR_ERR(clk_data->base); + + clk_data->msspll_base = devm_platform_ioremap_resource(pdev, 1); + if (IS_ERR(clk_data->msspll_base)) + return PTR_ERR(clk_data->msspll_base); + + clk_data->regmap = devm_regmap_init_mmio(dev, clk_data->base, &clk_mpfs_regmap_config); + if (IS_ERR(clk_data->regmap)) + return PTR_ERR(clk_data->regmap); + + ret = mpfs_reset_controller_register(dev, clk_data->base + REG_SUBBLK_RESET_CR); + if (ret) + return ret; + + return 0; +} + static int mpfs_clk_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -374,13 +422,12 @@ static int mpfs_clk_probe(struct platform_device *pdev) if (!clk_data) return -ENOMEM; - clk_data->base = devm_platform_ioremap_resource(pdev, 0); - if (IS_ERR(clk_data->base)) - return PTR_ERR(clk_data->base); - - clk_data->msspll_base = devm_platform_ioremap_resource(pdev, 1); - if (IS_ERR(clk_data->msspll_base)) - return PTR_ERR(clk_data->msspll_base); + ret = mpfs_clk_syscon_probe(clk_data, pdev); + if (ret) { + ret = mpfs_clk_old_format_probe(clk_data, pdev); + if (ret) + return ret; + } clk_data->hw_data.num = num_clks; clk_data->dev = dev; @@ -406,11 +453,7 @@ static int mpfs_clk_probe(struct platform_device *pdev) if (ret) return ret; - ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, &clk_data->hw_data); - if (ret) - return ret; - - return mpfs_reset_controller_register(dev, clk_data->base + REG_SUBBLK_RESET_CR); + return devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, &clk_data->hw_data); } static const struct of_device_id mpfs_clk_of_match_table[] = { -- 2.47.2 From matt.coster at imgtec.com Mon Sep 1 04:13:32 2025 From: matt.coster at imgtec.com (Matt Coster) Date: Mon, 1 Sep 2025 12:13:32 +0100 Subject: (subset) [PATCH v13 0/4] Add TH1520 GPU support with power sequencing In-Reply-To: <20250822-apr_14_for_sending-v13-0-af656f7cc6c3@samsung.com> References: <20250822-apr_14_for_sending-v13-0-af656f7cc6c3@samsung.com> Message-ID: 
<175672521205.30950.2944854121832397083.b4-ty@imgtec.com> On Fri, 22 Aug 2025 00:20:14 +0200, Michal Wilczynski wrote: > This patch series introduces support for the Imagination IMG BXM-4-64 > GPU found on the T-HEAD TH1520 SoC. A key aspect of this support is > managing the GPU's complex power-up and power-down sequence, which > involves multiple clocks and resets. > > The TH1520 GPU requires a specific sequence to be followed for its > clocks and resets to ensure correct operation. Initial discussions and > an earlier version of this series explored managing this via the generic > power domain (genpd) framework. However, following further discussions > with kernel maintainers [1], the approach has been reworked to utilize > the dedicated power sequencing (pwrseq) framework. > > [...] Applied, thanks! [1/4] drm/imagination: Use pwrseq for TH1520 GPU power management commit: e38e8391f30b41c5a24bb46dc6ef4161921e782d [2/4] dt-bindings: gpu: img,powervr-rogue: Add TH1520 GPU support commit: 337ebfda8a4f2627bf52e200cacf6f3a2f5ccf48 [4/4] drm/imagination: Enable PowerVR driver for RISC-V commit: 6b53cf48d9339c75fa51927b0a67d8a6751066bd Best regards, -- Matt Coster From Matt.Coster at imgtec.com Mon Sep 1 04:16:18 2025 From: Matt.Coster at imgtec.com (Matt Coster) Date: Mon, 1 Sep 2025 11:16:18 +0000 Subject: [PATCH v13 3/4] riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node In-Reply-To: References: <20250822-apr_14_for_sending-v13-0-af656f7cc6c3@samsung.com> <20250822-apr_14_for_sending-v13-3-af656f7cc6c3@samsung.com> Message-ID: On 28/08/2025 22:56, Drew Fustini wrote: > On Wed, Aug 27, 2025 at 03:08:01PM -0700, Drew Fustini wrote: >> On Fri, Aug 22, 2025 at 01:43:53PM -0700, Drew Fustini wrote: >>> On Fri, Aug 22, 2025 at 12:20:17AM +0200, Michal Wilczynski wrote: >>>> Add a device tree node for the IMG BXM-4-64 GPU present in the T-HEAD >>>> TH1520 SoC used by the Lichee Pi 4A board. This node enables support for >>>> the GPU using the drm/imagination driver. 
>>>> >>>> By adding this node, the kernel can recognize and initialize the GPU, >>>> providing graphics acceleration capabilities on the Lichee Pi 4A and >>>> other boards based on the TH1520 SoC. >>>> >>>> Add fixed clock gpu_mem_clk, as the MEM clock on the T-HEAD SoC can't be >>>> controlled programmatically. >>>> >>>> Reviewed-by: Ulf Hansson >>>> Reviewed-by: Drew Fustini >>>> Reviewed-by: Bartosz Golaszewski >>>> Acked-by: Matt Coster >>>> Signed-off-by: Michal Wilczynski >>>> --- >>>> arch/riscv/boot/dts/thead/th1520.dtsi | 21 +++++++++++++++++++++ >>>> 1 file changed, 21 insertions(+) >>> >>> I've applied this to thead-dt-for-next [1]: >>> >>> 0f78e44fb857 ("riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node") >>> >>> Thanks, >>> Drew >>> >>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git/log/?h=thead-dt-for-next >> >> Hi Matt, >> >> Do you know when the dt binding patch will be applied to >> the drm-misc/for-linux-next tree? >> >> I applied the dts patch but it is creating a warning in next right now. >> If the binding won't show up soon in drm-misc, then I'll remove this dts >> patch from next as dtbs_check is now failing in next. I can add it back >> once the binding makes it to next. > > I've now removed this patch from thead-dt-for-next and will add it back > once the bindings show up in next. Hi Drew, Apologies for the delay, I was on holiday last week. I've just applied the non-dts patches to drm-misc-next [1], would you mind re-adding the dts patch to thead-dt-for-next? Cheers, Matt [1]: https://lore.kernel.org/r/175672521205.30950.2944854121832397083.b4-ty at imgtec.com > > Thanks, > Drew -- Matt Coster E: matt.coster at imgtec.com
From luxu.kernel at bytedance.com Mon Sep 1 04:30:18 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:30:18 +0800 Subject: [PATCH 0/4] riscv: Add Zalasr ISA exntesion support Message-ID: <20250901113022.3812-1-luxu.kernel@bytedance.com> This patch series adds support for the Zalasr ISA extension, which provides real load-acquire/store-release instructions. The specification can be found here: https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc This patch series has been tested with LTP on QEMU with Brensan's zalasr support patch[1]. Some false-positive spacing errors are reported during patch checking, so I have CCed the maintainers of checkpatch.pl as well. [1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w at mail.gmail.com/ Xu Lu (4): riscv: add ISA extension parsing for Zalasr dt-bindings: riscv: Add Zalasr ISA extension description riscv: Instroduce Zalasr instructions riscv: Use Zalasr for smp_load_acquire/smp_store_release .../devicetree/bindings/riscv/extensions.yaml | 5 ++ arch/riscv/include/asm/barrier.h | 79 ++++++++++++++++--- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/insn-def.h | 79 +++++++++++++++++++ arch/riscv/kernel/cpufeature.c | 1 + 5 files changed, 154 insertions(+), 11 deletions(-) -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:30:19 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:30:19 +0800 Subject: [PATCH 1/4] riscv: add ISA extension parsing for Zalasr In-Reply-To: <20250901113022.3812-1-luxu.kernel@bytedance.com> References: <20250901113022.3812-1-luxu.kernel@bytedance.com> Message-ID: <20250901113022.3812-2-luxu.kernel@bytedance.com> Add parsing for the Zalasr ISA extension.
Signed-off-by: Xu Lu --- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a3..ae3852c4f2ca2 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_ZALASR 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572e..bf9d3d92bf372 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -472,6 +472,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(zaamo, RISCV_ISA_EXT_ZAAMO), __RISCV_ISA_EXT_DATA(zabha, RISCV_ISA_EXT_ZABHA), __RISCV_ISA_EXT_DATA(zacas, RISCV_ISA_EXT_ZACAS), + __RISCV_ISA_EXT_DATA(zalasr, RISCV_ISA_EXT_ZALASR), __RISCV_ISA_EXT_DATA(zalrsc, RISCV_ISA_EXT_ZALRSC), __RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS), __RISCV_ISA_EXT_DATA(zfa, RISCV_ISA_EXT_ZFA), -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:30:20 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:30:20 +0800 Subject: [PATCH 2/4] dt-bindings: riscv: Add Zalasr ISA extension description In-Reply-To: <20250901113022.3812-1-luxu.kernel@bytedance.com> References: <20250901113022.3812-1-luxu.kernel@bytedance.com> Message-ID: <20250901113022.3812-3-luxu.kernel@bytedance.com> Add description for the Zalasr ISA extension Signed-off-by: Xu Lu --- Documentation/devicetree/bindings/riscv/extensions.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index ede6a58ccf534..6b8c21807a2da 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ 
b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -248,6 +248,11 @@ properties: ratified at commit e87412e621f1 ("integrate Zaamo and Zalrsc text (#1304)") of the unprivileged ISA specification. + - const: zalasr + description: | + The standard Zalasr extension for load-acquire/store-release as frozen + at commit 194f0094 ("Version 0.9 for freeze") of riscv-zalasr. + - const: zawrs description: | The Zawrs extension for entering a low-power state or for trapping -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:30:21 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:30:21 +0800 Subject: [PATCH 3/4] riscv: Instroduce Zalasr instructions In-Reply-To: <20250901113022.3812-1-luxu.kernel@bytedance.com> References: <20250901113022.3812-1-luxu.kernel@bytedance.com> Message-ID: <20250901113022.3812-4-luxu.kernel@bytedance.com> Introduce l{b|h|w|d}.{aq|aqrl} and s{b|h|w|d}.{rl|aqrl} instruction encodings. Signed-off-by: Xu Lu --- arch/riscv/include/asm/insn-def.h | 79 +++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h index d5adbaec1d010..3fec7e66ce50f 100644 --- a/arch/riscv/include/asm/insn-def.h +++ b/arch/riscv/include/asm/insn-def.h @@ -179,6 +179,7 @@ #define RV___RS1(v) __RV_REG(v) #define RV___RS2(v) __RV_REG(v) +#define RV_OPCODE_AMO RV_OPCODE(47) #define RV_OPCODE_MISC_MEM RV_OPCODE(15) #define RV_OPCODE_OP_IMM RV_OPCODE(19) #define RV_OPCODE_SYSTEM RV_OPCODE(115) @@ -208,6 +209,84 @@ __ASM_STR(.error "hlv.d requires 64-bit support") #endif +#define LB_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LB_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LH_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LH_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), 
FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LW_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LW_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define SB_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SB_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#define SH_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SH_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#define SW_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SW_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#ifdef CONFIG_64BIT +#define LD_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LD_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define SD_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SD_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) +#else +#define LD_AQ(dest, addr) \ + __ASM_STR(.error "ld.aq requires 64-bit support") + +#define LD_AQRL(dest, addr) \ + __ASM_STR(.error "ld.aqrl requires 64-bit support") + +#define SD_RL(dest, addr) \ + __ASM_STR(.error "sd.rl requires 64-bit support") + +#define SD_AQRL(dest, addr) \ + __ASM_STR(.error "sd.aqrl requires 64-bit support") +#endif + #define SINVAL_VMA(vaddr, asid) \ INSN_R(OPCODE_SYSTEM, FUNC3(0), FUNC7(11), \ __RD(0), RS1(vaddr), RS2(asid)) -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:30:22 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:30:22 +0800 Subject: [PATCH 4/4] 
riscv: Use Zalasr for smp_load_acquire/smp_store_release In-Reply-To: <20250901113022.3812-1-luxu.kernel@bytedance.com> References: <20250901113022.3812-1-luxu.kernel@bytedance.com> Message-ID: <20250901113022.3812-5-luxu.kernel@bytedance.com> Replace fence instructions with Zalasr instructions during acquire or release operations. Signed-off-by: Xu Lu --- arch/riscv/include/asm/barrier.h | 79 +++++++++++++++++++++++++++----- 1 file changed, 68 insertions(+), 11 deletions(-) diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h index b8c5726d86acb..b1d2a9a85256d 100644 --- a/arch/riscv/include/asm/barrier.h +++ b/arch/riscv/include/asm/barrier.h @@ -51,19 +51,76 @@ */ #define smp_mb__after_spinlock() RISCV_FENCE(iorw, iorw) -#define __smp_store_release(p, v) \ -do { \ - compiletime_assert_atomic_type(*p); \ - RISCV_FENCE(rw, w); \ - WRITE_ONCE(*p, v); \ +extern void __bad_size_call_parameter(void); + +#define __smp_store_release(p, v) \ +do { \ + compiletime_assert_atomic_type(*p); \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsb %0, 0(%1)\t\n", \ + SB_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 2: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsh %0, 0(%1)\t\n", \ + SH_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 4: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsw %0, 0(%1)\t\n", \ + SW_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 8: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsd %0, 0(%1)\t\n", \ + SD_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + default: \ + __bad_size_call_parameter(); \ + break; \ + } \ } while (0) -#define __smp_load_acquire(p) \ -({ \ - typeof(*p) ___p1 = READ_ONCE(*p); \ - 
compiletime_assert_atomic_type(*p); \ - RISCV_FENCE(r, rw); \ - ___p1; \ +#define __smp_load_acquire(p) \ +({ \ + TYPEOF_UNQUAL(*p) val; \ + compiletime_assert_atomic_type(*p); \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ + LB_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 2: \ + asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ + LH_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 4: \ + asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ + LW_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 8: \ + asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ + LD_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + default: \ + __bad_size_call_parameter(); \ + break; \ + } \ + val; \ }) #ifdef CONFIG_RISCV_ISA_ZAWRS -- 2.20.1 From david at redhat.com Mon Sep 1 04:35:05 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 13:35:05 +0200 Subject: [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages In-Reply-To: <8a26ae97-9a78-4db5-be98-9c1f6e4fb403@lucifer.local> References: <20250827220141.262669-1-david@redhat.com> <20250827220141.262669-19-david@redhat.com> <632fea32-28aa-4993-9eff-99fc291c64f2@redhat.com> <8a26ae97-9a78-4db5-be98-9c1f6e4fb403@lucifer.local> Message-ID: <44072455-fc68-430d-ad38-0b9ce6a10b8d@redhat.com> >> >> >> The nice thing is that we only record pages in the array if they actually passed our tests. > > Yeah that's nice actually. > > This is fine (not the meme :P) :D > > So yes let's do this! 
That leaves us with the following on top of this patch: From 4533c6e3590cab0c53e81045624d5949e0ad9015 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Fri, 29 Aug 2025 15:41:45 +0200 Subject: [PATCH] mm/gup: remove record_subpages() We can clean up the code by calculating the number of refs earlier, which lets us inline what remains of record_subpages(). Calculate the number of references/pages ahead of time, and record them only once all our tests have passed. Signed-off-by: David Hildenbrand --- mm/gup.c | 25 ++++++++----------------- 1 file changed, 8 insertions(+), 17 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 89ca0813791ab..5a72a135ec70b 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) #ifdef CONFIG_MMU #ifdef CONFIG_HAVE_GUP_FAST -static int record_subpages(struct page *page, unsigned long sz, - unsigned long addr, unsigned long end, - struct page **pages) -{ - int nr; - - page += (addr & (sz - 1)) >> PAGE_SHIFT; - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) - pages[nr] = page++; - - return nr; -} - /** * try_grab_folio_fast() - Attempt to get or pin a folio in fast path.
* @page: pointer to page to be grabbed @@ -2963,8 +2950,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, if (pmd_special(orig)) return 0; - page = pmd_page(orig); - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); + refs = (end - addr) >> PAGE_SHIFT; + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); folio = try_grab_folio_fast(page, refs, flags); if (!folio) @@ -2985,6 +2972,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, } *nr += refs; + for (; refs; refs--) + *(pages++) = page++; folio_set_referenced(folio); return 1; } @@ -3003,8 +2992,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, if (pud_special(orig)) return 0; - page = pud_page(orig); - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); + refs = (end - addr) >> PAGE_SHIFT; + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); folio = try_grab_folio_fast(page, refs, flags); if (!folio) @@ -3026,6 +3015,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, } *nr += refs; + for (; refs; refs--) + *(pages++) = page++; folio_set_referenced(folio); return 1; } -- 2.50.1 -- Cheers David / dhildenb From luxu.kernel at bytedance.com Mon Sep 1 04:41:39 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:41:39 +0800 Subject: [PATCH RESEND 0/2] riscv: mm: Some optimizations for tlb flush Message-ID: <20250901114141.5438-1-luxu.kernel@bytedance.com> Some optimizations for TLB flushing on RISC-V SMP: 1. Apply Svinval in update_mmu_cache() to avoid flushing irrelevant TLB entries. 2. Clear the current CPU's bit in mm_cpumask after local_flush_tlb_all_asid() to avoid potential IPIs in the future. We saw the number of IPIs drop from ~98k to 268 on the mmapstress01 benchmark. Some false-positive spacing errors are reported during patch checking, so I have CCed the maintainers of checkpatch.pl as well.
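The second optimization can be pictured with a small userspace model: once a CPU has flushed every entry for an address space that is not currently active on it, that CPU no longer needs to be a target of later flushes, so its bit can be dropped from the mask. The sketch below is only an illustration under stated assumptions; mm_model, remote_ipis_needed() and full_flush_on_cpu() are hypothetical names made up for the example, a 64-bit word stands in for mm_cpumask, and no real TLB operation is performed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU that may still cache entries for an mm. */
struct mm_model {
	uint64_t cpumask;
};

/* Count the remote CPUs that would need an IPI for a flush issued on 'self'. */
static int remote_ipis_needed(const struct mm_model *mm, int self)
{
	uint64_t remote = mm->cpumask & ~(UINT64_C(1) << self);
	int n = 0;

	/* Clear the lowest set bit on each iteration and count the steps. */
	for (; remote; remote &= remote - 1)
		n++;
	return n;
}

/*
 * Model of a full per-ASID flush on 'cpu'. If the mm is not the active one
 * on that CPU, nothing of it remains cached there afterwards, so the CPU can
 * be removed from the mask and skipped by later flushes.
 */
static void full_flush_on_cpu(struct mm_model *mm, int cpu, bool mm_is_active)
{
	/* The actual flush would happen here; this model only tracks the mask. */
	if (!mm_is_active)
		mm->cpumask &= ~(UINT64_C(1) << cpu);
}
```

Starting with CPUs 1, 2 and 3 in the mask, a flush from CPU 1 targets two remote CPUs; after CPU 3 performs a full flush while running a different mm, only CPU 2 remains a target.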
Xu Lu (2): riscv: mm: Apply svinval in update_mmu_cache() riscv: mm: Clear cpu in mm_cpumask after local_flush_tlb_all_asid arch/riscv/include/asm/pgtable.h | 16 +++++++- arch/riscv/include/asm/tlbflush.h | 23 +++++++++++ arch/riscv/mm/tlbflush.c | 64 ++++++++++++------------------- 3 files changed, 63 insertions(+), 40 deletions(-) -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:41:40 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:41:40 +0800 Subject: [PATCH RESEND 1/2] riscv: mm: Apply svinval in update_mmu_cache() In-Reply-To: <20250901114141.5438-1-luxu.kernel@bytedance.com> References: <20250901114141.5438-1-luxu.kernel@bytedance.com> Message-ID: <20250901114141.5438-2-luxu.kernel@bytedance.com> Only flush tlb of the specified mm, and apply svinval if available. Signed-off-by: Xu Lu --- arch/riscv/include/asm/pgtable.h | 16 +++++++++++++++- arch/riscv/include/asm/tlbflush.h | 23 +++++++++++++++++++++++ arch/riscv/mm/tlbflush.c | 23 ----------------------- 3 files changed, 38 insertions(+), 24 deletions(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 91697fbf1f901..165cd02d51629 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -495,9 +495,15 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long address, pte_t *ptep, unsigned int nr) { + int i; + unsigned long asid = get_mm_asid(vma->vm_mm); + asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1) : : : : svvptc); + asm goto(ALTERNATIVE("nop", "j %l[svinval]", 0, RISCV_ISA_EXT_SVINVAL, 1) + : : : : svinval); + /* * The kernel assumes that TLBs don't cache invalid entries, but * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a @@ -506,7 +512,15 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf, * the extra traps reduce performance. So, eagerly SFENCE.VMA. 
*/ while (nr--) - local_flush_tlb_page(address + nr * PAGE_SIZE); + local_flush_tlb_page_asid(address + nr * PAGE_SIZE, asid); + return; + +svinval: + local_sfence_w_inval(); + for (i = 0; i < nr; i++) + local_sinval_vma(address + i * PAGE_SIZE, asid); + local_sfence_inval_ir(); + return; svvptc:; /* diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h index eed0abc405143..9636d07fe9eed 100644 --- a/arch/riscv/include/asm/tlbflush.h +++ b/arch/riscv/include/asm/tlbflush.h @@ -15,6 +15,29 @@ #define FLUSH_TLB_NO_ASID ((unsigned long)-1) #ifdef CONFIG_MMU +static inline unsigned long get_mm_asid(struct mm_struct *mm) +{ + return mm ? cntx2asid(atomic_long_read(&mm->context.id)) : FLUSH_TLB_NO_ASID; +} + +static inline void local_sfence_inval_ir(void) +{ + asm volatile(SFENCE_INVAL_IR() ::: "memory"); +} + +static inline void local_sfence_w_inval(void) +{ + asm volatile(SFENCE_W_INVAL() ::: "memory"); +} + +static inline void local_sinval_vma(unsigned long vma, unsigned long asid) +{ + if (asid != FLUSH_TLB_NO_ASID) + asm volatile(SINVAL_VMA(%0, %1) : : "r" (vma), "r" (asid) : "memory"); + else + asm volatile(SINVAL_VMA(%0, zero) : : "r" (vma) : "memory"); +} + static inline void local_flush_tlb_all(void) { __asm__ __volatile__ ("sfence.vma" : : : "memory"); diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c index 8404530ec00f9..962db300a1665 100644 --- a/arch/riscv/mm/tlbflush.c +++ b/arch/riscv/mm/tlbflush.c @@ -11,24 +11,6 @@ #define has_svinval() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVINVAL) -static inline void local_sfence_inval_ir(void) -{ - asm volatile(SFENCE_INVAL_IR() ::: "memory"); -} - -static inline void local_sfence_w_inval(void) -{ - asm volatile(SFENCE_W_INVAL() ::: "memory"); -} - -static inline void local_sinval_vma(unsigned long vma, unsigned long asid) -{ - if (asid != FLUSH_TLB_NO_ASID) - asm volatile(SINVAL_VMA(%0, %1) : : "r" (vma), "r" (asid) : "memory"); - else - asm volatile(SINVAL_VMA(%0,
zero) : : "r" (vma) : "memory"); -} - /* * Flush entire TLB if number of entries to be flushed is greater * than the threshold below. @@ -110,11 +92,6 @@ static void __ipi_flush_tlb_range_asid(void *info) local_flush_tlb_range_asid(d->start, d->size, d->stride, d->asid); } -static inline unsigned long get_mm_asid(struct mm_struct *mm) -{ - return mm ? cntx2asid(atomic_long_read(&mm->context.id)) : FLUSH_TLB_NO_ASID; -} - static void __flush_tlb_range(struct mm_struct *mm, const struct cpumask *cmask, unsigned long start, unsigned long size, -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 04:41:41 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Mon, 1 Sep 2025 19:41:41 +0800 Subject: [PATCH RESEND 2/2] riscv: mm: Clear cpu in mm_cpumask after local_flush_tlb_all_asid In-Reply-To: <20250901114141.5438-1-luxu.kernel@bytedance.com> References: <20250901114141.5438-1-luxu.kernel@bytedance.com> Message-ID: <20250901114141.5438-3-luxu.kernel@bytedance.com> Clear corresponding bit of current cpu in mm_cpumask after executing local_flush_tlb_all_asid(). 
This reduces the number of IPI due to tlb flush: * ltp - mmapstress01 Before: ~98k After: 268 Signed-off-by: Xu Lu --- arch/riscv/mm/tlbflush.c | 41 ++++++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 16 deletions(-) diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c index 962db300a1665..571358f385879 100644 --- a/arch/riscv/mm/tlbflush.c +++ b/arch/riscv/mm/tlbflush.c @@ -17,7 +17,8 @@ */ unsigned long tlb_flush_all_threshold __read_mostly = 64; -static void local_flush_tlb_range_threshold_asid(unsigned long start, +static void local_flush_tlb_range_threshold_asid(struct mm_struct *mm, + unsigned long start, unsigned long size, unsigned long stride, unsigned long asid) @@ -27,6 +28,8 @@ static void local_flush_tlb_range_threshold_asid(unsigned long start, if (nr_ptes_in_range > tlb_flush_all_threshold) { local_flush_tlb_all_asid(asid); + if (mm && mm != current->active_mm) + cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm)); return; } @@ -46,21 +49,28 @@ static void local_flush_tlb_range_threshold_asid(unsigned long start, } } -static inline void local_flush_tlb_range_asid(unsigned long start, - unsigned long size, unsigned long stride, unsigned long asid) +static inline void local_flush_tlb_range_mm(struct mm_struct *mm, + unsigned long start, + unsigned long size, + unsigned long stride) { - if (size <= stride) + unsigned long asid = get_mm_asid(mm); + + if (size <= stride) { local_flush_tlb_page_asid(start, asid); - else if (size == FLUSH_TLB_MAX_SIZE) + } else if (size == FLUSH_TLB_MAX_SIZE) { local_flush_tlb_all_asid(asid); - else - local_flush_tlb_range_threshold_asid(start, size, stride, asid); + if (mm && mm != current->active_mm) + cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm)); + } else { + local_flush_tlb_range_threshold_asid(mm, start, size, stride, asid); + } } /* Flush a range of kernel pages without broadcasting */ void local_flush_tlb_kernel_range(unsigned long start, unsigned long end) { - 
local_flush_tlb_range_asid(start, end - start, PAGE_SIZE, FLUSH_TLB_NO_ASID); + local_flush_tlb_range_mm(NULL, start, end - start, PAGE_SIZE); } static void __ipi_flush_tlb_all(void *info) @@ -79,17 +89,17 @@ void flush_tlb_all(void) } struct flush_tlb_range_data { - unsigned long asid; + struct mm_struct *mm; unsigned long start; unsigned long size; unsigned long stride; }; -static void __ipi_flush_tlb_range_asid(void *info) +static void __ipi_flush_tlb_range_mm(void *info) { struct flush_tlb_range_data *d = info; - local_flush_tlb_range_asid(d->start, d->size, d->stride, d->asid); + local_flush_tlb_range_mm(d->mm, d->start, d->size, d->stride); } static void __flush_tlb_range(struct mm_struct *mm, @@ -97,7 +107,6 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long size, unsigned long stride) { - unsigned long asid = get_mm_asid(mm); unsigned int cpu; if (cpumask_empty(cmask)) @@ -107,17 +116,17 @@ static void __flush_tlb_range(struct mm_struct *mm, /* Check if the TLB flush needs to be sent to other CPUs. 
*/ if (cpumask_any_but(cmask, cpu) >= nr_cpu_ids) { - local_flush_tlb_range_asid(start, size, stride, asid); + local_flush_tlb_range_mm(mm, start, size, stride); } else if (riscv_use_sbi_for_rfence()) { - sbi_remote_sfence_vma_asid(cmask, start, size, asid); + sbi_remote_sfence_vma_asid(cmask, start, size, get_mm_asid(mm)); } else { struct flush_tlb_range_data ftd; - ftd.asid = asid; + ftd.mm = mm; ftd.start = start; ftd.size = size; ftd.stride = stride; - on_each_cpu_mask(cmask, __ipi_flush_tlb_range_asid, &ftd, 1); + on_each_cpu_mask(cmask, __ipi_flush_tlb_range_mm, &ftd, 1); } put_cpu(); -- 2.20.1 From pulehui at huawei.com Mon Sep 1 06:23:32 2025 From: pulehui at huawei.com (Pu Lehui) Date: Mon, 1 Sep 2025 21:23:32 +0800 Subject: [PATCH] riscv, bpf: Sign extend struct ops return values properly In-Reply-To: References: <20250827120344.6796-1-hengqi.chen@gmail.com> <1be38ff5-ea37-4d5d-9f33-16799d2fe2c5@huawei.com> Message-ID: On 2025/9/1 17:14, Hengqi Chen wrote: > On Mon, Sep 1, 2025 at 4:06 PM Pu Lehui wrote: >> >> >> >> On 2025/8/28 9:53, Pu Lehui wrote: >>> >>> On 2025/8/27 20:03, Hengqi Chen wrote: >>>> The ns_bpf_qdisc selftest triggers a kernel panic: >>>> >>>> Unable to handle kernel paging request at virtual address >>>> ffffffffa38dbf58 >>>> Current test_progs pgtable: 4K pagesize, 57-bit VAs, >>>> pgdp=0x00000001109cc000 >>>> [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, >>>> pud=000000011fffd001, pmd=0000000000000000 >>>> Oops [#1] >>>> Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 >>>> dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs >>>> blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last >>>> unloaded: bpf_testmod(OE)] >>>> CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W >>>> OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE >>>> Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE >>>> Hardware name: Unknown Unknown Product/Unknown Product, BIOS >>>> 2024.01+dfsg-1ubuntu5.1 01/01/2024 
>>>> epc : __qdisc_run+0x82/0x6f0 >>>> ra : __qdisc_run+0x6e/0x6f0 >>>> epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 >>>> gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 >>>> t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 >>>> s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 >>>> a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 >>>> a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 >>>> s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 >>>> s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 >>>> s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 >>>> s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 >>>> t5 : 0000000000000000 t6 : ff60000093a6a8b6 >>>> status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: >>>> 000000000000000d >>>> [] __qdisc_run+0x82/0x6f0 >>>> [] __dev_queue_xmit+0x4c0/0x1128 >>>> [] neigh_resolve_output+0xd0/0x170 >>>> [] ip6_finish_output2+0x226/0x6c8 >>>> [] ip6_finish_output+0x10c/0x2a0 >>>> [] ip6_output+0x5e/0x178 >>>> [] ip6_xmit+0x29a/0x608 >>>> [] inet6_csk_xmit+0xe6/0x140 >>>> [] __tcp_transmit_skb+0x45c/0xaa8 >>>> [] tcp_connect+0x9ce/0xd10 >>>> [] tcp_v6_connect+0x4ac/0x5e8 >>>> [] __inet_stream_connect+0xd8/0x318 >>>> [] inet_stream_connect+0x3e/0x68 >>>> [] __sys_connect_file+0x50/0x88 >>>> [] __sys_connect+0x96/0xc8 >>>> [] __riscv_sys_connect+0x20/0x30 >>>> [] do_trap_ecall_u+0x256/0x378 >>>> [] handle_exception+0x14a/0x156 >>>> Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 >>>> ---[ end trace 0000000000000000 ]--- >>>> >>>> The bpf_fifo_dequeue prog returns a skb which is a pointer. >>>> The pointer is treated as a 32bit value and sign extend to >>>> 64bit in epilogue. This behavior is right for most bpf prog >>>> types but wrong for struct ops which requires RISC-V ABI. >>> >>> Hi Hengqi, >>> >>> Nice catch! 
>>> >>> Actually, I think commit 7112cd26e606c7ba51f9cc5c1905f06039f6f379 looks >>> a little bit weird and related to this issue. I guess I need some time >>> to recall this commit. >> >> Hi Hengqi, >> >> Sorry for the late reply; I have been busy with work. After some backtracking, I dismissed my >> doubts about commit 7112cd26e606. >> >>> >>> Thanks. >>> >>>> >>>> So let's sign extend struct ops return values according to >>>> the return value spec in function model. >>>> >>>> Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized >>>> riscv ftrace framework") >>>> Signed-off-by: Hengqi Chen >>>> --- >>>> arch/riscv/net/bpf_jit_comp64.c | 33 +++++++++++++++++++++++++++++++++ >>>> 1 file changed, 33 insertions(+) >>>> >>>> diff --git a/arch/riscv/net/bpf_jit_comp64.c >>>> b/arch/riscv/net/bpf_jit_comp64.c >>>> index 549c3063c7f1..11ca56320a3f 100644 >>>> --- a/arch/riscv/net/bpf_jit_comp64.c >>>> +++ b/arch/riscv/net/bpf_jit_comp64.c >>>> @@ -954,6 +954,33 @@ static int invoke_bpf_prog(struct bpf_tramp_link >>>> *l, int args_off, int retval_of >>>> return ret; >>>> } >>>> +/* >>>> + * Sign-extend the register if necessary >>>> + */ >>>> +static int sign_extend(struct rv_jit_context *ctx, int r, u8 size) Putting `ctx` as the last param would be more consistent with the other functions. >>>> +{ >>>> + switch (size) { >>>> + case 1: >>>> + emit_slli(r, r, 56, ctx); >>>> + emit_srai(r, r, 56, ctx); >>>> + break; >>>> + case 2: >>>> + emit_slli(r, r, 48, ctx); >>>> + emit_srai(r, r, 48, ctx); >>>> + break; >>>> + case 4: >>>> + emit_addiw(r, r, 0, ctx); Please use the emit_sextb/h/w() helpers. >>>> + break; >>>> + case 8: >>>> + break; >>>> + default: >>>> + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); >>>> + return -EINVAL; >>>> + } >>>> + >>>> + return 0; >>>> +} >> >> We don't need to sign-ext when return value is 1 or 2 bytes. As for 4 > > Could you please elaborate on this? Indeed, you pointed out my misunderstanding. 
According to the RISC-V calling convention [0], signed char and short values need sign extension, but unsigned ones do not. So for 1- or 2-byte return values, we only need to sign-extend the signed types. Link: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf [0] > IIUC, addiw on 1 byte / 2 byte values is equivalent to zero-extending them. > >> bytes, we already do that in __build_epilogue. So we only need to >> take care of 8-byte return values. And the real fix would be: >> >> diff --git a/arch/riscv/net/bpf_jit_comp64.c >> b/arch/riscv/net/bpf_jit_comp64.c >> index 2f7188e0340a..08cc641f8b7c 100644 >> --- a/arch/riscv/net/bpf_jit_comp64.c >> +++ b/arch/riscv/net/bpf_jit_comp64.c >> @@ -1177,6 +1177,9 @@ static int __arch_prepare_bpf_trampoline(struct >> bpf_tramp_image *im, >> if (save_ret) { >> emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); >> emit_ld(regmap[BPF_REG_0], -(retval_off - 8), >> RV_REG_FP, ctx); >> + /* Do not truncate return value when it's 8 bytes */ >> + if (is_struct_ops && m->ret_size == 8) >> + emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx); >> } >> >> emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); >> >>>> + >>>> static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, >>>> const struct btf_func_model *m, >>>> struct bpf_tramp_links *tlinks, >>>> @@ -1177,6 +1204,12 @@ static int __arch_prepare_bpf_trampoline(struct >>>> bpf_tramp_image *im, >>>> if (save_ret) { >>>> emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); >>>> emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); >>>> + if (is_struct_ops) { >>>> + emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx); This could be omitted by combining it with the sign-extension insn, like `sextb(rd, rs, ctx)`. 
>>>> + ret = sign_extend(ctx, RV_REG_A0, m->ret_size); >>>> + if (ret) >>>> + goto out; >>>> + } >>>> } >>>> emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); From david at redhat.com Mon Sep 1 06:24:59 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 15:24:59 +0200 Subject: [PATCH v2 1/4] copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) In-Reply-To: <20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com> Message-ID: <94ec640b-76cd-478e-9ee7-ff8597d1fafc@redhat.com> On 01.09.25 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit. However, the signature of the copy_* > helper functions (e.g., copy_sighand) used by copy_process was not > adapted. > > As such, they truncate the flags on any 32-bit architectures that > supports clone3 (arc, arm, csky, m68k, microblaze, mips32, openrisc, > parisc32, powerpc32, riscv32, x86-32 and xtensa). > > For copy_sighand with CLONE_CLEAR_SIGHAND being an actual u64 > constant, this triggers an observable bug in kernel selftest > clone3_clear_sighand: > > if (clone_flags & CLONE_CLEAR_SIGHAND) > > in function copy_sighand within fork.c will always fail given: > > unsigned long /* == uint32_t */ clone_flags > #define CLONE_CLEAR_SIGHAND 0x100000000ULL > > This commit fixes the bug by always passing clone_flags to copy_sighand > via their declared u64 type, invariant of architecture-dependent integer > sizes. 
> > Fixes: b612e5df4587 ("clone3: add CLONE_CLEAR_SIGHAND") > Cc: stable at vger.kernel.org # linux-5.5+ > Signed-off-by: Simon Schuster > Reviewed-by: Lorenzo Stoakes > --- (stripping To list) Acked-by: David Hildenbrand -- Cheers David / dhildenb From ni_liqiang at 126.com Mon Sep 1 06:36:29 2025 From: ni_liqiang at 126.com (niliqiang) Date: Mon, 1 Sep 2025 21:36:29 +0800 Subject: [RFC PATCH v2 00/10] RISC-V IOMMU HPM and nested IOMMU support In-Reply-To: <20240614142156.29420-3-zong.li@sifive.com> References: <20240614142156.29420-3-zong.li@sifive.com> Message-ID: <20250901133629.87310-1-ni_liqiang@126.com> Hi Zong Fri, 14 Jun 2024 22:21:48 +0800, Zong Li wrote: > This patch initialize the pmu stuff and uninitialize it when driver > removing. The interrupt handling is also provided, this handler need to > be primary handler instead of thread function, because pt_regs is empty > when threading the IRQ, but pt_regs is necessary by perf_event_overflow. > > Signed-off-by: Zong Li > --- > drivers/iommu/riscv/iommu.c | 65 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 65 insertions(+) > > diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c > index 8b6a64c1ad8d..1716b2251f38 100644 > --- a/drivers/iommu/riscv/iommu.c > +++ b/drivers/iommu/riscv/iommu.c > @@ -540,6 +540,62 @@ static irqreturn_t riscv_iommu_fltq_process(int irq, void *data) > return IRQ_HANDLED; > } > > +/* > + * IOMMU Hardware performance monitor > + */ > + > +/* HPM interrupt primary handler */ > +static irqreturn_t riscv_iommu_hpm_irq_handler(int irq, void *dev_id) > +{ > + struct riscv_iommu_device *iommu = (struct riscv_iommu_device *)dev_id; > + > + /* Process pmu irq */ > + riscv_iommu_pmu_handle_irq(&iommu->pmu); > + > + /* Clear performance monitoring interrupt pending */ > + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_IPSR, RISCV_IOMMU_IPSR_PMIP); > + > + return IRQ_HANDLED; > +} > + > +/* HPM initialization */ > +static int riscv_iommu_hpm_enable(struct 
riscv_iommu_device *iommu) > +{ > + int rc; > + > + if (!(iommu->caps & RISCV_IOMMU_CAPABILITIES_HPM)) > + return 0; > + > + /* > + * pt_regs is empty when threading the IRQ, but pt_regs is necessary > + * by perf_event_overflow. Use primary handler instead of thread > + * function for PM IRQ. > + * > + * Set the IRQF_ONESHOT flag because this IRQ might be shared with > + * other threaded IRQs by other queues. > + */ > + rc = devm_request_irq(iommu->dev, > + iommu->irqs[riscv_iommu_queue_vec(iommu, RISCV_IOMMU_IPSR_PMIP)], > + riscv_iommu_hpm_irq_handler, IRQF_ONESHOT | IRQF_SHARED, NULL, iommu); > + if (rc) > + return rc; > + > + return riscv_iommu_pmu_init(&iommu->pmu, iommu->reg, dev_name(iommu->dev)); > +} > + What are the benefits of initializing the iommu-pmu driver in the iommu driver? It might be better for the RISC-V IOMMU PMU driver to be loaded as a separate module, as this would allow greater flexibility since different vendors may need to add custom events. Also, I'm not quite clear on how custom events should be added if the RISC-V iommu-pmu is placed within the iommu driver. Best regards, Liqiang From david at redhat.com Mon Sep 1 06:38:42 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 15:38:42 +0200 Subject: [PATCH v2 2/4] copy_process: pass clone_flags as u64 across calltree In-Reply-To: <20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com> Message-ID: <48ee3dd0-9af3-4513-aef2-25e185cce349@redhat.com> On 01.09.25 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. 
> However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of callees to > sys_clone3/copy_process (excluding the architecture-specific > copy_thread) to consistently pass clone_flags as u64, so that > no truncation to 32-bit integers occurs on 32-bit architectures. > > Signed-off-by: Simon Schuster > Reviewed-by: Lorenzo Stoakes > --- > block/blk-ioc.c | 2 +- > fs/namespace.c | 2 +- > include/linux/cgroup.h | 4 ++-- > include/linux/cred.h | 2 +- > include/linux/iocontext.h | 6 +++--- > include/linux/ipc_namespace.h | 4 ++-- > include/linux/lsm_hook_defs.h | 2 +- > include/linux/mnt_namespace.h | 2 +- > include/linux/nsproxy.h | 2 +- > include/linux/pid_namespace.h | 4 ++-- > include/linux/rseq.h | 4 ++-- > include/linux/sched/task.h | 2 +- > include/linux/security.h | 4 ++-- > include/linux/sem.h | 4 ++-- > include/linux/time_namespace.h | 4 ++-- > include/linux/uprobes.h | 4 ++-- > include/linux/user_events.h | 4 ++-- > include/linux/utsname.h | 4 ++-- > include/net/net_namespace.h | 4 ++-- > include/trace/events/task.h | 6 +++--- > ipc/namespace.c | 2 +- > ipc/sem.c | 2 +- > kernel/cgroup/namespace.c | 2 +- > kernel/cred.c | 2 +- > kernel/events/uprobes.c | 2 +- > kernel/fork.c | 8 ++++---- > kernel/nsproxy.c | 4 ++-- > kernel/pid_namespace.c | 2 +- > kernel/sched/core.c | 4 ++-- > kernel/sched/fair.c | 2 +- > kernel/sched/sched.h | 4 ++-- > kernel/time/namespace.c | 2 +- > kernel/utsname.c | 2 +- > net/core/net_namespace.c | 2 +- > security/apparmor/lsm.c | 2 +- > security/security.c | 2 +- > security/selinux/hooks.c | 2 +- > security/tomoyo/tomoyo.c | 2 +- > 38 files changed, 59 insertions(+), 59 deletions(-) > > diff --git a/block/blk-ioc.c 
b/block/blk-ioc.c > index 9fda3906e5f5..d15918d7fabb 100644 > --- a/block/blk-ioc.c (adjust To: list) Hopefully we caught most of them. The ones not called "clone_flags" are a bit nasty. We could have split of some changes (e.g., trace event), but likely not worth it. Thanks! Acked-by: David Hildenbrand -- Cheers David / dhildenb From david at redhat.com Mon Sep 1 06:39:44 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 15:39:44 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: <72558e21-2ebf-448a-a93a-3d1a3181a592@redhat.com> On 01.09.25 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of the copy_thread > function that is called from copy_process to consistently pass > clone_flags as u64, so that no truncation to 32-bit integers occurs on > 32-bit architectures. 
> > Signed-off-by: Simon Schuster > --- > arch/alpha/kernel/process.c | 2 +- > arch/arc/kernel/process.c | 2 +- > arch/arm/kernel/process.c | 2 +- > arch/arm64/kernel/process.c | 2 +- > arch/csky/kernel/process.c | 2 +- > arch/hexagon/kernel/process.c | 2 +- > arch/loongarch/kernel/process.c | 2 +- > arch/m68k/kernel/process.c | 2 +- > arch/microblaze/kernel/process.c | 2 +- > arch/mips/kernel/process.c | 2 +- > arch/nios2/kernel/process.c | 2 +- > arch/openrisc/kernel/process.c | 2 +- > arch/parisc/kernel/process.c | 2 +- > arch/powerpc/kernel/process.c | 2 +- > arch/riscv/kernel/process.c | 2 +- > arch/s390/kernel/process.c | 2 +- > arch/sh/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_64.c | 2 +- > arch/um/kernel/process.c | 2 +- > arch/x86/include/asm/fpu/sched.h | 2 +- > arch/x86/include/asm/shstk.h | 4 ++-- > arch/x86/kernel/fpu/core.c | 2 +- > arch/x86/kernel/process.c | 2 +- > arch/x86/kernel/shstk.c | 2 +- > arch/xtensa/kernel/process.c | 2 +- > 26 files changed, 27 insertions(+), 27 deletions(-) > (Adjust To: list) Thanks! Acked-by: David Hildenbrand -- Cheers David / dhildenb From svetlana.parfenova at syntacore.com Mon Sep 1 06:53:50 2025 From: svetlana.parfenova at syntacore.com (Svetlana Parfenova) Date: Mon, 1 Sep 2025 20:53:50 +0700 Subject: [RFC RESEND v3] binfmt_elf: preserve original ELF e_flags for core dumps In-Reply-To: <20250806161814.607668-1-svetlana.parfenova@syntacore.com> References: <20250806161814.607668-1-svetlana.parfenova@syntacore.com> Message-ID: <20250901135350.619485-1-svetlana.parfenova@syntacore.com> Some architectures, such as RISC-V, use the ELF e_flags field to encode ABI-specific information (e.g., ISA extensions, fpu support). Debuggers like GDB rely on these flags in core dumps to correctly interpret optional register sets. If the flags are missing or incorrect, GDB may warn and ignore valid data, for example: warning: Unexpected size of section '.reg2/213' in core file. 
This can prevent access to fpu or other architecture-specific registers even when they were dumped. Save the e_flags field during ELF binary loading (in load_elf_binary()) into the mm_struct, and later retrieve it during core dump generation (in fill_note_info()). Kconfig option CONFIG_ARCH_HAS_ELF_CORE_EFLAGS is introduced for architectures that require this behaviour. Signed-off-by: Svetlana Parfenova --- Changes in v3: - Introduce CONFIG_ARCH_HAS_ELF_CORE_EFLAGS Kconfig option instead of arch-specific ELF_CORE_USE_PROCESS_EFLAGS define. - Add helper functions to set/get e_flags in mm_struct. - Wrap saved_e_flags field of mm_struct with #ifdef CONFIG_ARCH_HAS_ELF_CORE_EFLAGS. Changes in v2: - Remove usage of Kconfig option. - Add an architecture-optional macro to set process e_flags. Enabled by defining ELF_CORE_USE_PROCESS_EFLAGS. Defaults to no-op if not used. arch/riscv/Kconfig | 1 + fs/Kconfig.binfmt | 9 +++++++++ fs/binfmt_elf.c | 40 ++++++++++++++++++++++++++++++++++------ include/linux/mm_types.h | 5 +++++ 4 files changed, 49 insertions(+), 6 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..1bef00208bdd 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -224,6 +224,7 @@ config RISCV select VDSO_GETRANDOM if HAVE_GENERIC_VDSO select USER_STACKTRACE_SUPPORT select ZONE_DMA32 if 64BIT + select ARCH_HAS_ELF_CORE_EFLAGS config RUSTC_SUPPORTS_RISCV def_bool y diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt index bd2f530e5740..1949e25c7741 100644 --- a/fs/Kconfig.binfmt +++ b/fs/Kconfig.binfmt @@ -184,4 +184,13 @@ config EXEC_KUNIT_TEST This builds the exec KUnit tests, which tests boundary conditions of various aspects of the exec internals. 
+config ARCH_HAS_ELF_CORE_EFLAGS + bool + depends on BINFMT_ELF && ELF_CORE + default n + help + Select this option if the architecture makes use of the e_flags + field in the ELF header to store ABI or other architecture-specific + information that should be preserved in core dumps. + endmenu diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 4aacf9c9cc2d..e4653bb99946 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -103,6 +103,21 @@ static struct linux_binfmt elf_format = { #define BAD_ADDR(x) (unlikely((unsigned long)(x) >= TASK_SIZE)) +static inline void elf_coredump_set_mm_eflags(struct mm_struct *mm, u32 flags) +{ +#ifdef CONFIG_ARCH_HAS_ELF_CORE_EFLAGS + mm->saved_e_flags = flags; +#endif +} + +static inline u32 elf_coredump_get_mm_eflags(struct mm_struct *mm, u32 flags) +{ +#ifdef CONFIG_ARCH_HAS_ELF_CORE_EFLAGS + flags = mm->saved_e_flags; +#endif + return flags; +} + /* * We need to explicitly zero any trailing portion of the page that follows * p_filesz when it ends before the page ends (e.g. bss), otherwise this @@ -1290,6 +1305,8 @@ static int load_elf_binary(struct linux_binprm *bprm) mm->end_data = end_data; mm->start_stack = bprm->p; + elf_coredump_set_mm_eflags(mm, elf_ex->e_flags); + /** * DOC: "brk" handling * @@ -1804,6 +1821,8 @@ static int fill_note_info(struct elfhdr *elf, int phdrs, struct elf_thread_core_info *t; struct elf_prpsinfo *psinfo; struct core_thread *ct; + u16 machine; + u32 flags; psinfo = kmalloc(sizeof(*psinfo), GFP_KERNEL); if (!psinfo) @@ -1831,17 +1850,26 @@ static int fill_note_info(struct elfhdr *elf, int phdrs, return 0; } - /* - * Initialize the ELF file header. 
- */ - fill_elf_header(elf, phdrs, - view->e_machine, view->e_flags); + machine = view->e_machine; + flags = view->e_flags; #else view = NULL; info->thread_notes = 2; - fill_elf_header(elf, phdrs, ELF_ARCH, ELF_CORE_EFLAGS); + machine = ELF_ARCH; + flags = ELF_CORE_EFLAGS; #endif + /* + * Override ELF e_flags with value taken from process, + * if arch needs that. + */ + flags = elf_coredump_get_mm_eflags(dump_task->mm, flags); + + /* + * Initialize the ELF file header. + */ + fill_elf_header(elf, phdrs, machine, flags); + /* * Allocate a structure for each thread. */ diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 08bc2442db93..04a2857f12f2 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1102,6 +1102,11 @@ struct mm_struct { unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */ +#ifdef CONFIG_ARCH_HAS_ELF_CORE_EFLAGS + /* the ABI-related flags from the ELF header. Used for core dump */ + unsigned long saved_e_flags; +#endif + struct percpu_counter rss_stat[NR_MM_COUNTERS]; struct linux_binfmt *binfmt; -- 2.51.0 From conor at kernel.org Mon Sep 1 07:08:31 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 15:08:31 +0100 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: <20250830-cheesy-prone-ee5fae406c22@spud> References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> Message-ID: <20250901-lasso-kabob-de32b8fcede8@spud> On Sat, Aug 30, 2025 at 07:17:48PM +0100, Conor Dooley wrote: > On Sat, Aug 30, 2025 at 01:00:56PM +0800, Asuna Yang wrote: > > I noticed that GCC+Rust builds for RISC-V were disabled about a year ago, as > > discussed in > > https://lore.kernel.org/all/20240917000848.720765-1-jmontleo at redhat.com/ > > > > I'm a bit lost here. What are the main obstacles to re-enabling GCC builds > > now? > > > > Conor said: > > > Okay. 
Short term then is deny gcc + rust, longer term is allow it with the > > same caveats as the aforementioned mixed stuff. > > "the same caveats" means detecting what specifically? > > There's "code" in the riscv Kconfig/Makefile that makes sure that the > assembler has the same understanding of what extensions are enabled as > the compiler. This is done by detecting which version of the tools are > in use, and adjusting march etc as a result. See > TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI for an example. When I wrote the > comment you're citing, there was no "off the shelf" way to figure out > the version of libclang in use to ensure that it has the same > understanding of -march as the version of gcc being used on the c side > does. For clang build, it's not a concern since it's almost certainly > the exact same as the compiler building the c side. > > > We have a RISC-V PWM driver being written in Rust. Currently, GCC being > > disabled for building the kernel with Rust for RISC-V is the primary blocker > > for including these drivers in RISC-V distros. Therefore, I'd like to push > > forward and contribute to the re-enabling of GCC builds. Is there a more > > detailed direction on what I can do here? > > Add the version of libclang as a Kconfig symbol, so that the kernel's > build system can ensure that both sides are built using the same > configuration. Off the top of my head, using a pre-17 libclang with a > new gcc would require having zicsr in -march for the c side and it > removed for rust. It's been a while (1 year+) since I fiddled with this > though, so my recollection there could well be inaccurate. Hmm, while I think of it, there's some other things that are problematic that are not currently checked but would have to be. For example, there's a check in the riscv Kconfig menu to see if stack-protector-guard=tls can be used via a cc-option check. 
If that check passes with gcc as the compiler that option will be passed to the rust side of the build, where llvm might not support it. Similarly, turning on an extension like Zacas via a cc-option check could pass for gcc but not be usable when passed to the rust side, causing errors. These sorts of things should be prevented via Kconfig, not show up as confusing build errors. From david at redhat.com Mon Sep 1 08:03:22 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:22 +0200 Subject: [PATCH v2 01/37] mm: stop making SPARSEMEM_VMEMMAP user-selectable In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-2-david@redhat.com> In an ideal world, we wouldn't have to deal with SPARSEMEM without SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is considered too costly and consequently not supported. However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP, let's forbid the user to disable VMEMMAP: just like we already do for arm64, s390 and x86. So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without SPARSEMEM_VMEMMAP. This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big downside to using the VMEMMAP (quite the contrary). This is a preparation for not supporting (1) folio sizes that exceed a single memory section (2) CMA allocations of non-contiguous page ranges in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit possible impact as much as possible (e.g., gigantic hugetlb page allocations suddenly fails). 
Acked-by: Zi Yan Acked-by: Mike Rapoport (Microsoft) Acked-by: SeongJae Park Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Cc: Huacai Chen Cc: WANG Xuerui Cc: Madhavan Srinivasan Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Alexandre Ghiti Cc: "David S. Miller" Cc: Andreas Larsson Signed-off-by: David Hildenbrand --- mm/Kconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index 4108bcd967848..330d0e698ef96 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -439,9 +439,8 @@ config SPARSEMEM_VMEMMAP_ENABLE bool config SPARSEMEM_VMEMMAP - bool "Sparse Memory virtual memmap" + def_bool y depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE - default y help SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise pfn_to_page and page_to_pfn operations. This is the most -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:23 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:23 +0200 Subject: [PATCH v2 02/37] arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-3-david@redhat.com> Now handled by the core automatically once SPARSEMEM_VMEMMAP_ENABLE is selected. Reviewed-by: Mike Rapoport (Microsoft) Acked-by: Catalin Marinas Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. 
Howlett Cc: Will Deacon Signed-off-by: David Hildenbrand --- arch/arm64/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e9bbfacc35a64..b1d1f2ff2493b 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1570,7 +1570,6 @@ source "kernel/Kconfig.hz" config ARCH_SPARSEMEM_ENABLE def_bool y select SPARSEMEM_VMEMMAP_ENABLE - select SPARSEMEM_VMEMMAP config HW_PERF_EVENTS def_bool y -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:24 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:24 +0200 Subject: [PATCH v2 03/37] s390/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-4-david@redhat.com> Now handled by the core automatically once SPARSEMEM_VMEMMAP_ENABLE is selected. Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Alexander Gordeev Cc: Christian Borntraeger Cc: Sven Schnelle Signed-off-by: David Hildenbrand --- arch/s390/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index bf680c26a33cf..145ca23c2fff6 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -710,7 +710,6 @@ menu "Memory setup" config ARCH_SPARSEMEM_ENABLE def_bool y select SPARSEMEM_VMEMMAP_ENABLE - select SPARSEMEM_VMEMMAP config ARCH_SPARSEMEM_DEFAULT def_bool y -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:25 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:25 +0200 Subject: [PATCH v2 04/37] x86/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-5-david@redhat.com> Now handled by the core automatically once SPARSEMEM_VMEMMAP_ENABLE is selected. 
Reviewed-by: Mike Rapoport (Microsoft) Acked-by: Dave Hansen Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Signed-off-by: David Hildenbrand --- arch/x86/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 58d890fe2100e..e431d1c06fecd 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1552,7 +1552,6 @@ config ARCH_SPARSEMEM_ENABLE def_bool y select SPARSEMEM_STATIC if X86_32 select SPARSEMEM_VMEMMAP_ENABLE if X86_64 - select SPARSEMEM_VMEMMAP if X86_64 config ARCH_SPARSEMEM_DEFAULT def_bool X86_64 || (NUMA && X86_32) -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:26 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:26 +0200 Subject: [PATCH v2 05/37] wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-6-david@redhat.com> It's no longer user-selectable (and the default was already "y"), so let's just drop it. It was never really relevant to the wireguard selftests either way. Acked-by: Mike Rapoport (Microsoft) Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Cc: "Jason A. 
Donenfeld" Cc: Shuah Khan Signed-off-by: David Hildenbrand --- tools/testing/selftests/wireguard/qemu/kernel.config | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index 0a5381717e9f4..1149289f4b30f 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -48,7 +48,6 @@ CONFIG_JUMP_LABEL=y CONFIG_FUTEX=y CONFIG_SHMEM=y CONFIG_SLUB=y -CONFIG_SPARSEMEM_VMEMMAP=y CONFIG_SMP=y CONFIG_SCHED_SMT=y CONFIG_SCHED_MC=y -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:27 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:27 +0200 Subject: [PATCH v2 06/37] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-7-david@redhat.com> Let's reject them early, which in turn makes folio_alloc_gigantic() reject them properly. To avoid converting from order to nr_pages, let's just add MAX_FOLIO_ORDER and calculate MAX_FOLIO_NR_PAGES based on that. While at it, let's just make the order a "const unsigned order". Reviewed-by: Zi Yan Acked-by: SeongJae Park Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. 
Howlett Signed-off-by: David Hildenbrand --- include/linux/mm.h | 6 ++++-- mm/page_alloc.c | 10 +++++++++- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 00c8a54127d37..77737cbf2216a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2055,11 +2055,13 @@ static inline long folio_nr_pages(const struct folio *folio) /* Only hugetlbfs can allocate folios larger than MAX_ORDER */ #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE -#define MAX_FOLIO_NR_PAGES (1UL << PUD_ORDER) +#define MAX_FOLIO_ORDER PUD_ORDER #else -#define MAX_FOLIO_NR_PAGES MAX_ORDER_NR_PAGES +#define MAX_FOLIO_ORDER MAX_PAGE_ORDER #endif +#define MAX_FOLIO_NR_PAGES (1UL << MAX_FOLIO_ORDER) + /* * compound_nr() returns the number of pages in this potentially compound * page. compound_nr() can be called on a tail page, and is defined to diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 27ea4c7acd158..7e96c69a06ccb 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6841,6 +6841,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask) int alloc_contig_range_noprof(unsigned long start, unsigned long end, acr_flags_t alloc_flags, gfp_t gfp_mask) { + const unsigned int order = ilog2(end - start); unsigned long outer_start, outer_end; int ret = 0; @@ -6858,6 +6859,14 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end, PB_ISOLATE_MODE_CMA_ALLOC : PB_ISOLATE_MODE_OTHER; + /* + * In contrast to the buddy, we allow for orders here that exceed + * MAX_PAGE_ORDER, so we must manually make sure that we are not + * exceeding the maximum folio order. 
+ */ + if (WARN_ON_ONCE((gfp_mask & __GFP_COMP) && order > MAX_FOLIO_ORDER)) + return -EINVAL; + gfp_mask = current_gfp_context(gfp_mask); if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask)) return -EINVAL; @@ -6955,7 +6964,6 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end, free_contig_range(end, outer_end - end); } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) { struct page *head = pfn_to_page(start); - int order = ilog2(end - start); check_new_pages(head, order); prep_new_page(head, order, gfp_mask, 0); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:28 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:28 +0200 Subject: [PATCH v2 07/37] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-8-david@redhat.com> Let's reject unreasonable folio sizes early, where we can still fail. We'll add sanity checks to prep_compound_head()/prep_compound_page() next. Is there a way to configure a system such that unreasonable folio sizes would be possible? It would already be rather questionable. If so, we'd probably want to bail out earlier, where we can avoid a WARN and just report a proper error message that indicates where something went wrong such that we messed up. Acked-by: SeongJae Park Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R.
Howlett Signed-off-by: David Hildenbrand --- mm/memremap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/memremap.c b/mm/memremap.c index b0ce0d8254bd8..a2d4bb88f64b6 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -275,6 +275,9 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid) if (WARN_ONCE(!nr_range, "nr_range must be specified\n")) return ERR_PTR(-EINVAL); + if (WARN_ONCE(pgmap->vmemmap_shift > MAX_FOLIO_ORDER, + "requested folio size unsupported\n")) + return ERR_PTR(-EINVAL); switch (pgmap->type) { case MEMORY_DEVICE_PRIVATE: -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:29 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:29 +0200 Subject: [PATCH v2 08/37] mm/hugetlb: check for unreasonable folio sizes when registering hstate In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-9-david@redhat.com> Let's check that no hstate that corresponds to an unreasonable folio size is registered by an architecture. If we were to succeed registering, we could later try allocating an unsupported gigantic folio size. Further, let's add a BUILD_BUG_ON() for checking that HUGETLB_PAGE_ORDER is sane at build time. As HUGETLB_PAGE_ORDER is dynamic on powerpc, we have to use a BUILD_BUG_ON_INVALID() to make it compile. No existing kernel configuration should be able to trigger this check: either SPARSEMEM without SPARSEMEM_VMEMMAP cannot be configured or gigantic folios will not exceed a memory section (the case on sparse). Reviewed-by: Zi Yan Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. 
Howlett Signed-off-by: David Hildenbrand --- mm/hugetlb.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1e777cc51ad04..d3542e92a712e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4657,6 +4657,7 @@ static int __init hugetlb_init(void) BUILD_BUG_ON(sizeof_field(struct page, private) * BITS_PER_BYTE < __NR_HPAGEFLAGS); + BUILD_BUG_ON_INVALID(HUGETLB_PAGE_ORDER > MAX_FOLIO_ORDER); if (!hugepages_supported()) { if (hugetlb_max_hstate || default_hstate_max_huge_pages) @@ -4740,6 +4741,7 @@ void __init hugetlb_add_hstate(unsigned int order) } BUG_ON(hugetlb_max_hstate >= HUGE_MAX_HSTATE); BUG_ON(order < order_base_2(__NR_USED_SUBPAGE)); + WARN_ON(order > MAX_FOLIO_ORDER); h = &hstates[hugetlb_max_hstate++]; __mutex_init(&h->resize_lock, "resize mutex", &h->resize_key); h->order = order; -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:30 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:30 +0200 Subject: [PATCH v2 09/37] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-10-david@redhat.com> Grepping for "prep_compound_page" leaves one clueless as to how devdax gets its compound pages initialized. Let's add a comment that might help find this open-coded prep_compound_page() initialization more easily. Further, let's be less smart about the ordering of initialization and just perform the prep_compound_head() call after all tail pages were initialized: just like prep_compound_page() does. No need for a comment to describe the initialization order: again, just like prep_compound_page(). Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Acked-by: Liam R.
Howlett Signed-off-by: David Hildenbrand --- mm/mm_init.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/mm/mm_init.c b/mm/mm_init.c index 5c21b3af216b2..df614556741a4 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1091,6 +1091,12 @@ static void __ref memmap_init_compound(struct page *head, unsigned long pfn, end_pfn = head_pfn + nr_pages; unsigned int order = pgmap->vmemmap_shift; + /* + * We have to initialize the pages, including setting up page links. + * prep_compound_page() does not take care of that, so instead we + * open-code prep_compound_page() so we can take care of initializing + * the pages in the same go. + */ __SetPageHead(head); for (pfn = head_pfn + 1; pfn < end_pfn; pfn++) { struct page *page = pfn_to_page(pfn); @@ -1098,15 +1104,8 @@ static void __ref memmap_init_compound(struct page *head, __init_zone_device_page(page, pfn, zone_idx, nid, pgmap); prep_compound_tail(head, pfn - head_pfn); set_page_count(page, 0); - - /* - * The first tail page stores important compound page info. - * Call prep_compound_head() after the first tail page has - * been initialized, to not have the data overwritten. - */ - if (pfn == head_pfn + 1) - prep_compound_head(head, order); } + prep_compound_head(head, order); } void __ref memmap_init_zone_device(struct zone *zone, -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:31 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:31 +0200 Subject: [PATCH v2 10/37] mm: sanity-check maximum folio size in folio_set_order() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-11-david@redhat.com> Let's sanity-check in folio_set_order() whether we would be trying to create a folio with an order that would make it exceed MAX_FOLIO_ORDER. 
This will enable the check whenever a folio/compound page is initialized through prep_compound_head() / prep_compound_page() with CONFIG_DEBUG_VM set. Reviewed-by: Zi Yan Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Signed-off-by: David Hildenbrand --- mm/internal.h | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/internal.h b/mm/internal.h index 45da9ff5694f6..9b0129531d004 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -755,6 +755,7 @@ static inline void folio_set_order(struct folio *folio, unsigned int order) { if (WARN_ON_ONCE(!order || !folio_test_large(folio))) return; + VM_WARN_ON_ONCE(order > MAX_FOLIO_ORDER); folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order; #ifdef NR_PAGES_IN_LARGE_FOLIO -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:32 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:32 +0200 Subject: [PATCH v2 11/37] mm: limit folio/compound page sizes in problematic kernel configs In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-12-david@redhat.com> Let's limit the maximum folio size to a single memory section in problematic kernel configs, where the memmap is allocated per memory section (SPARSEMEM without SPARSEMEM_VMEMMAP). Currently, only a single architecture supports ARCH_HAS_GIGANTIC_PAGE but not SPARSEMEM_VMEMMAP: sh. Fortunately, the biggest hugetlb size sh supports is 64 MiB (HUGETLB_PAGE_SIZE_64MB) and the section size is at least 64 MiB (SECTION_SIZE_BITS == 26), so its use case is not degraded. As folios and memory sections are naturally aligned to their power-of-two size in memory, a single folio can consequently no longer span multiple memory sections on these problematic kernel configs. nth_page() is no longer required when operating within a single compound page / folio.
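The containment argument above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: the SKETCH_* constants and sketch_* helpers are hypothetical names, and the values (4 KiB pages, 64 MiB sections) are assumed so that they mirror the sh numbers quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed, illustrative values: 4 KiB pages and 64 MiB memory sections. */
#define SKETCH_PAGE_SHIFT        12
#define SKETCH_SECTION_SIZE_BITS 26
#define SKETCH_PFN_SECTION_SHIFT (SKETCH_SECTION_SIZE_BITS - SKETCH_PAGE_SHIFT)

/* Number of the memory section that a page frame number falls into. */
static unsigned long sketch_pfn_to_section(unsigned long pfn)
{
	return pfn >> SKETCH_PFN_SECTION_SHIFT;
}

/*
 * Returns true if a folio of the given order that starts at start_pfn lies
 * entirely within one memory section. The caller must pass a start_pfn that
 * is naturally aligned to the folio size, as folios are.
 */
static bool sketch_folio_in_one_section(unsigned long start_pfn,
					unsigned int order)
{
	unsigned long nr_pages = 1UL << order;
	unsigned long last_pfn = start_pfn + nr_pages - 1;

	/* Natural alignment: the start is a multiple of the folio size. */
	assert((start_pfn & (nr_pages - 1)) == 0);

	return sketch_pfn_to_section(start_pfn) ==
	       sketch_pfn_to_section(last_pfn);
}
```

With these assumed values, an order-14 folio (64 MiB) exactly fills one section, while an order-15 folio would necessarily cross a section boundary; the latter is the case that capping MAX_FOLIO_ORDER at PFN_SECTION_SHIFT rules out.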
Reviewed-by: Zi Yan Acked-by: Mike Rapoport (Microsoft) Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Signed-off-by: David Hildenbrand --- include/linux/mm.h | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 77737cbf2216a..2dee79fa2efcf 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2053,11 +2053,25 @@ static inline long folio_nr_pages(const struct folio *folio) return folio_large_nr_pages(folio); } -/* Only hugetlbfs can allocate folios larger than MAX_ORDER */ -#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE -#define MAX_FOLIO_ORDER PUD_ORDER -#else +#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE) +/* + * We don't expect any folios that exceed buddy sizes (and consequently + * memory sections). + */ #define MAX_FOLIO_ORDER MAX_PAGE_ORDER +#elif defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP) +/* + * Only pages within a single memory section are guaranteed to be + * contiguous. By limiting folios to a single memory section, all folio + * pages are guaranteed to be contiguous. + */ +#define MAX_FOLIO_ORDER PFN_SECTION_SHIFT +#else +/* + * There is no real limit on the folio size. We limit them to the maximum we + * currently expect (e.g., hugetlb, dax). + */ +#define MAX_FOLIO_ORDER PUD_ORDER #endif #define MAX_FOLIO_NR_PAGES (1UL << MAX_FOLIO_ORDER) -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:33 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:33 +0200 Subject: [PATCH v2 12/37] mm: simplify folio_page() and folio_page_idx() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-13-david@redhat.com> Now that a single folio/compound page can no longer span memory sections in problematic kernel configurations, we can stop using nth_page() in folio_page() and folio_page_idx(). 
While at it, turn both macros into static inline functions and add kernel doc for folio_page_idx(). Reviewed-by: Zi Yan Reviewed-by: Wei Yang Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- include/linux/mm.h | 16 ++++++++++++++-- include/linux/page-flags.h | 5 ++++- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 2dee79fa2efcf..f6880e3225c5c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -210,10 +210,8 @@ extern unsigned long sysctl_admin_reserve_kbytes; #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP) #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n)) -#define folio_page_idx(folio, p) (page_to_pfn(p) - folio_pfn(folio)) #else #define nth_page(page,n) ((page) + (n)) -#define folio_page_idx(folio, p) ((p) - &(folio)->page) #endif /* to align the pointer to the (next) page boundary */ @@ -225,6 +223,20 @@ extern unsigned long sysctl_admin_reserve_kbytes; /* test whether an address (unsigned long or pointer) is aligned to PAGE_SIZE */ #define PAGE_ALIGNED(addr) IS_ALIGNED((unsigned long)(addr), PAGE_SIZE) +/** + * folio_page_idx - Return the number of a page in a folio. + * @folio: The folio. + * @page: The folio page. + * + * This function expects that the page is actually part of the folio. + * The returned number is relative to the start of the folio. 
+ */ +static inline unsigned long folio_page_idx(const struct folio *folio, + const struct page *page) +{ + return page - &folio->page; +} + static inline struct folio *lru_to_folio(struct list_head *head) { return list_entry((head)->prev, struct folio, lru); diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 5ee6ffbdbf831..faf17ca211b4f 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -316,7 +316,10 @@ static __always_inline unsigned long _compound_head(const struct page *page) * check that the page number lies within @folio; the caller is presumed * to have a reference to the page. */ -#define folio_page(folio, n) nth_page(&(folio)->page, n) +static inline struct page *folio_page(struct folio *folio, unsigned long n) +{ + return &folio->page + n; +} static __always_inline int PageTail(const struct page *page) { -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:34 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:34 +0200 Subject: [PATCH v2 13/37] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-14-david@redhat.com> We can now safely iterate over all pages in a folio, so no need for the pfn_to_page(). Also, as we already force the refcount in __init_single_page() to 1 through init_page_count(), we can just set the refcount to 0 and avoid page_ref_freeze() + VM_BUG_ON. Likely, in the future, we would just want to tell __init_single_page() to which value to initialize the refcount. Further, adjust the comments to highlight that we are dealing with an open-coded prep_compound_page() variant, and add another comment explaining why we really need the __init_single_page() only on the tail pages. 
Note that the current code was likely problematic, but we never ran into it: prep_compound_tail() would have been called with an offset that might exceed a memory section, and prep_compound_tail() would have simply added that offset to the page pointer -- which would not have done the right thing on sparsemem without vmemmap. Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Lorenzo Stoakes Acked-by: Liam R. Howlett Signed-off-by: David Hildenbrand --- mm/hugetlb.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d3542e92a712e..56e6d2af08434 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio, { enum zone_type zone = zone_idx(folio_zone(folio)); int nid = folio_nid(folio); + struct page *page = folio_page(folio, start_page_number); unsigned long head_pfn = folio_pfn(folio); unsigned long pfn, end_pfn = head_pfn + end_page_number; - int ret; - - for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) { - struct page *page = pfn_to_page(pfn); + /* + * As we marked all tail pages with memblock_reserved_mark_noinit(), + * we must initialize them ourselves here. + */ + for (pfn = head_pfn + start_page_number; pfn < end_pfn; page++, pfn++) { __init_single_page(page, pfn, zone, nid); prep_compound_tail((struct page *)folio, pfn - head_pfn); - ret = page_ref_freeze(page, 1); - VM_BUG_ON(!ret); + set_page_count(page, 0); } } @@ -3257,12 +3258,15 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio, { int ret; - /* Prepare folio head */ + /* + * This is an open-coded prep_compound_page() whereby we avoid + * walking pages twice by initializing/preparing+freezing them in the + * same go. 
+ */ __folio_clear_reserved(folio); __folio_set_head(folio); ret = folio_ref_freeze(folio, 1); VM_BUG_ON(!ret); - /* Initialize the necessary tail struct pages */ hugetlb_folio_init_tail_vmemmap(folio, 1, nr_pages); prep_compound_head((struct page *)folio, huge_page_order(h)); } -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:35 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:35 +0200 Subject: [PATCH v2 14/37] mm/mm/percpu-km: drop nth_page() usage within single allocation In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-15-david@redhat.com> We're allocating a higher-order page from the buddy. For these pages (that are guaranteed to not exceed a single memory section) there is no need to use nth_page(). Reviewed-by: Lorenzo Stoakes Acked-by: Liam R. Howlett Signed-off-by: David Hildenbrand --- mm/percpu-km.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/percpu-km.c b/mm/percpu-km.c index fe31aa19db81a..4efa74a495cb6 100644 --- a/mm/percpu-km.c +++ b/mm/percpu-km.c @@ -69,7 +69,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp) } for (i = 0; i < nr_pages; i++) - pcpu_set_page_chunk(nth_page(pages, i), chunk); + pcpu_set_page_chunk(pages + i, chunk); chunk->data = pages; chunk->base_addr = page_address(pages); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:36 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:36 +0200 Subject: [PATCH v2 15/37] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-16-david@redhat.com> The nth_page() is not really required anymore, so let's remove it. 
Reviewed-by: Zi Yan Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- fs/hugetlbfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 34d496a2b7de6..c5a46d10afaa0 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -217,7 +217,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset, break; offset += n; if (offset == PAGE_SIZE) { - page = nth_page(page, 1); + page++; offset = 0; } } -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:37 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:37 +0200 Subject: [PATCH v2 16/37] fs: hugetlbfs: cleanup folio in adjust_range_hwpoison() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-17-david@redhat.com> Let's cleanup and simplify the function a bit. Reviewed-by: Zi Yan Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- fs/hugetlbfs/inode.c | 36 ++++++++++++------------------------ 1 file changed, 12 insertions(+), 24 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index c5a46d10afaa0..3cfdf4091001f 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -192,37 +192,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, * Someone wants to read @bytes from a HWPOISON hugetlb @folio from @offset. * Returns the maximum number of bytes one can read without touching the 1st raw * HWPOISON page. - * - * The implementation borrows the iteration logic from copy_page_to_iter*. */ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset, size_t bytes) { - struct page *page; - size_t n = 0; - size_t res = 0; - - /* First page to start the loop. 
*/ - page = folio_page(folio, offset / PAGE_SIZE); - offset %= PAGE_SIZE; - while (1) { - if (is_raw_hwpoison_page_in_hugepage(page)) - break; + struct page *page = folio_page(folio, offset / PAGE_SIZE); + size_t safe_bytes; + + if (is_raw_hwpoison_page_in_hugepage(page)) + return 0; + /* Safe to read the remaining bytes in this page. */ + safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE); + page++; - /* Safe to read n bytes without touching HWPOISON subpage. */ - n = min(bytes, (size_t)PAGE_SIZE - offset); - res += n; - bytes -= n; - if (!bytes || !n) + /* Check each remaining page as long as we are not done yet. */ + for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++) + if (is_raw_hwpoison_page_in_hugepage(page)) break; - offset += n; - if (offset == PAGE_SIZE) { - page++; - offset = 0; - } - } - return res; + return min(safe_bytes, bytes); } /* -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:38 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:38 +0200 Subject: [PATCH v2 17/37] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-18-david@redhat.com> It's no longer required to use nth_page() within a folio, so let's just drop the nth_page() in folio_walk_start(). Reviewed-by: Lorenzo Stoakes Reviewed-by: Liam R. Howlett Signed-off-by: David Hildenbrand --- mm/pagewalk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index c6753d370ff4e..9e4225e5fcf5c 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -1004,7 +1004,7 @@ struct folio *folio_walk_start(struct folio_walk *fw, found: if (expose_page) /* Note: Offset from the mapped page, not the folio start. 
*/ - fw->page = nth_page(page, (addr & (entry_size - 1)) >> PAGE_SHIFT); + fw->page = page + ((addr & (entry_size - 1)) >> PAGE_SHIFT); else fw->page = NULL; fw->ptl = ptl; -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:39 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:39 +0200 Subject: [PATCH v2 18/37] mm/gup: drop nth_page() usage within folio when recording subpages In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-19-david@redhat.com> nth_page() is no longer required when iterating over pages within a single folio, so let's just drop it when recording subpages. Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- mm/gup.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 8157197a19f77..c10cd969c1a3b 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -488,12 +488,11 @@ static int record_subpages(struct page *page, unsigned long sz, unsigned long addr, unsigned long end, struct page **pages) { - struct page *start_page; int nr; - start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT); + page += (addr & (sz - 1)) >> PAGE_SHIFT; for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) - pages[nr] = nth_page(start_page, nr); + pages[nr] = page++; return nr; } @@ -1512,7 +1511,7 @@ static long __get_user_pages(struct mm_struct *mm, } for (j = 0; j < page_increm; j++) { - subpage = nth_page(page, j); + subpage = page + j; pages[i + j] = subpage; flush_anon_page(vma, subpage, start + j * PAGE_SIZE); flush_dcache_page(subpage); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:40 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:40 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-20-david@redhat.com> 
We can just clean up the code by calculating the #refs earlier, so we can just inline what remains of record_subpages(). Calculate the number of references/pages ahead of time, and record them only once all our tests have passed. Signed-off-by: David Hildenbrand --- mm/gup.c | 25 ++++++++----------------- 1 file changed, 8 insertions(+), 17 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index c10cd969c1a3b..f0f4d1a68e094 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) #ifdef CONFIG_MMU #ifdef CONFIG_HAVE_GUP_FAST -static int record_subpages(struct page *page, unsigned long sz, - unsigned long addr, unsigned long end, - struct page **pages) -{ - int nr; - - page += (addr & (sz - 1)) >> PAGE_SHIFT; - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) - pages[nr] = page++; - - return nr; -} - /** * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. * @page: pointer to page to be grabbed @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, if (pmd_special(orig)) return 0; - page = pmd_page(orig); - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); + refs = (end - addr) >> PAGE_SHIFT; + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); folio = try_grab_folio_fast(page, refs, flags); if (!folio) @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, } *nr += refs; + for (; refs; refs--) + *(pages++) = page++; folio_set_referenced(folio); return 1; } @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, if (pud_special(orig)) return 0; - page = pud_page(orig); - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); + refs = (end - addr) >> PAGE_SHIFT; + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); folio = try_grab_folio_fast(page, refs, flags); if (!folio) @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t
orig, pud_t *pudp, unsigned long addr, } *nr += refs; + for (; refs; refs--) + *(pages++) = page++; folio_set_referenced(folio); return 1; } -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:41 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:41 +0200 Subject: [PATCH v2 20/37] io_uring/zcrx: remove nth_page() usage within folio In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-21-david@redhat.com> Within a folio/compound page, nth_page() is no longer required. Given that we call folio_test_partial_kmap()+kmap_local_page(), the code would already be problematic if the pages would span multiple folios. So let's just assume that all src pages belong to a single folio/compound page and can be iterated ordinarily. The dst page is currently always a single page, so we're not actually iterating anything. Reviewed-by: Pavel Begunkov Reviewed-by: Lorenzo Stoakes Cc: Jens Axboe Cc: Pavel Begunkov Signed-off-by: David Hildenbrand --- io_uring/zcrx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c index e5ff49f3425e0..18c12f4b56b6c 100644 --- a/io_uring/zcrx.c +++ b/io_uring/zcrx.c @@ -975,9 +975,9 @@ static ssize_t io_copy_page(struct io_copy_cache *cc, struct page *src_page, if (folio_test_partial_kmap(page_folio(dst_page)) || folio_test_partial_kmap(page_folio(src_page))) { - dst_page = nth_page(dst_page, dst_offset / PAGE_SIZE); + dst_page += dst_offset / PAGE_SIZE; dst_offset = offset_in_page(dst_offset); - src_page = nth_page(src_page, src_offset / PAGE_SIZE); + src_page += src_offset / PAGE_SIZE; src_offset = offset_in_page(src_offset); n = min(PAGE_SIZE - src_offset, PAGE_SIZE - dst_offset); n = min(n, len); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:42 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:42 +0200 Subject: [PATCH v2 21/37] mips: mm: 
convert __flush_dcache_pages() to __flush_dcache_folio_pages() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-22-david@redhat.com> Let's make it clearer that we are operating within a single folio by providing both the folio and the page. This implies that for flush_dcache_folio() we'll now avoid one more page->folio lookup, and that we can safely drop the "nth_page" usage. While at it, drop the "extern" from the function declaration. Cc: Thomas Bogendoerfer Signed-off-by: David Hildenbrand --- arch/mips/include/asm/cacheflush.h | 11 +++++++---- arch/mips/mm/cache.c | 8 ++++---- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h index 5d283ef89d90d..5099c1b65a584 100644 --- a/arch/mips/include/asm/cacheflush.h +++ b/arch/mips/include/asm/cacheflush.h @@ -50,13 +50,14 @@ extern void (*flush_cache_mm)(struct mm_struct *mm); extern void (*flush_cache_range)(struct vm_area_struct *vma, unsigned long start, unsigned long end); extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn); -extern void __flush_dcache_pages(struct page *page, unsigned int nr); +void __flush_dcache_folio_pages(struct folio *folio, struct page *page, unsigned int nr); #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1 static inline void flush_dcache_folio(struct folio *folio) { if (cpu_has_dc_aliases) - __flush_dcache_pages(&folio->page, folio_nr_pages(folio)); + __flush_dcache_folio_pages(folio, folio_page(folio, 0), + folio_nr_pages(folio)); else if (!cpu_has_ic_fills_f_dc) folio_set_dcache_dirty(folio); } @@ -64,10 +65,12 @@ static inline void flush_dcache_folio(struct folio *folio) static inline void flush_dcache_page(struct page *page) { + struct folio *folio = page_folio(page); + if (cpu_has_dc_aliases) - __flush_dcache_pages(page, 1); + __flush_dcache_folio_pages(folio, 
page, 1); else if (!cpu_has_ic_fills_f_dc) - folio_set_dcache_dirty(page_folio(page)); + folio_set_dcache_dirty(folio); } #define flush_dcache_mmap_lock(mapping) do { } while (0) diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c index bf9a37c60e9f0..e3b4224c9a406 100644 --- a/arch/mips/mm/cache.c +++ b/arch/mips/mm/cache.c @@ -99,9 +99,9 @@ SYSCALL_DEFINE3(cacheflush, unsigned long, addr, unsigned long, bytes, return 0; } -void __flush_dcache_pages(struct page *page, unsigned int nr) +void __flush_dcache_folio_pages(struct folio *folio, struct page *page, + unsigned int nr) { - struct folio *folio = page_folio(page); struct address_space *mapping = folio_flush_mapping(folio); unsigned long addr; unsigned int i; @@ -117,12 +117,12 @@ void __flush_dcache_pages(struct page *page, unsigned int nr) * get faulted into the tlb (and thus flushed) anyways. */ for (i = 0; i < nr; i++) { - addr = (unsigned long)kmap_local_page(nth_page(page, i)); + addr = (unsigned long)kmap_local_page(page + i); flush_data_cache_page(addr); kunmap_local((void *)addr); } } -EXPORT_SYMBOL(__flush_dcache_pages); +EXPORT_SYMBOL(__flush_dcache_folio_pages); void __flush_anon_page(struct page *page, unsigned long vmaddr) { -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:43 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:43 +0200 Subject: [PATCH v2 22/37] mm/cma: refuse handing out non-contiguous page ranges In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-23-david@redhat.com> Let's disallow handing out PFN ranges with non-contiguous pages, so we can remove the nth-page usage in __cma_alloc(), and so any callers don't have to worry about that either when wanting to blindly iterate pages. This is really only a problem in configs with SPARSEMEM but without SPARSEMEM_VMEMMAP, and only when we would cross memory sections in some cases. Will this cause harm? 
Probably not, because it's mostly 32bit that does not support SPARSEMEM_VMEMMAP. If this ever becomes a problem we could look into allocating the memmap for the memory sections spanned by a single CMA region in one go from memblock. Reviewed-by: Alexandru Elisei Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- include/linux/mm.h | 6 ++++++ mm/cma.c | 39 ++++++++++++++++++++++++--------------- mm/util.c | 35 +++++++++++++++++++++++++++++++++++ 3 files changed, 65 insertions(+), 15 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index f6880e3225c5c..2ca1eb2db63ec 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -209,9 +209,15 @@ extern unsigned long sysctl_user_reserve_kbytes; extern unsigned long sysctl_admin_reserve_kbytes; #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP) +bool page_range_contiguous(const struct page *page, unsigned long nr_pages); #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n)) #else #define nth_page(page,n) ((page) + (n)) +static inline bool page_range_contiguous(const struct page *page, + unsigned long nr_pages) +{ + return true; +} #endif /* to align the pointer to the (next) page boundary */ diff --git a/mm/cma.c b/mm/cma.c index e56ec64d0567e..813e6dc7b0954 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -780,10 +780,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, unsigned long count, unsigned int align, struct page **pagep, gfp_t gfp) { - unsigned long mask, offset; - unsigned long pfn = -1; - unsigned long start = 0; unsigned long bitmap_maxno, bitmap_no, bitmap_count; + unsigned long start, pfn, mask, offset; int ret = -EBUSY; struct page *page = NULL; @@ -795,7 +793,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, if (bitmap_count > bitmap_maxno) goto out; - for (;;) { + for (start = 0; ; start = bitmap_no + mask + 1) { spin_lock_irq(&cma->lock); /* * If the request is larger than the available number @@ -812,6 
+810,22 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, spin_unlock_irq(&cma->lock); break; } + + pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit); + page = pfn_to_page(pfn); + + /* + * Do not hand out page ranges that are not contiguous, so + * callers can just iterate the pages without having to worry + * about these corner cases. + */ + if (!page_range_contiguous(page, count)) { + spin_unlock_irq(&cma->lock); + pr_warn_ratelimited("%s: %s: skipping incompatible area [0x%lx-0x%lx]", + __func__, cma->name, pfn, pfn + count - 1); + continue; + } + bitmap_set(cmr->bitmap, bitmap_no, bitmap_count); cma->available_count -= count; /* @@ -821,29 +835,24 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, */ spin_unlock_irq(&cma->lock); - pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit); mutex_lock(&cma->alloc_mutex); ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp); mutex_unlock(&cma->alloc_mutex); - if (ret == 0) { - page = pfn_to_page(pfn); + if (!ret) break; - } cma_clear_bitmap(cma, cmr, pfn, count); if (ret != -EBUSY) break; pr_debug("%s(): memory range at pfn 0x%lx %p is busy, retrying\n", - __func__, pfn, pfn_to_page(pfn)); + __func__, pfn, page); - trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn), - count, align); - /* try again with a bit different memory target */ - start = bitmap_no + mask + 1; + trace_cma_alloc_busy_retry(cma->name, pfn, page, count, align); } out: - *pagep = page; + if (!ret) + *pagep = page; return ret; } @@ -882,7 +891,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, */ if (page) { for (i = 0; i < count; i++) - page_kasan_tag_reset(nth_page(page, i)); + page_kasan_tag_reset(page + i); } if (ret && !(gfp & __GFP_NOWARN)) { diff --git a/mm/util.c b/mm/util.c index d235b74f7aff7..fbdb73aaf35fe 100644 --- a/mm/util.c +++ b/mm/util.c @@ -1280,4 +1280,39 @@ unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte, { 
return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, 0); } + +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP) +/** + * page_range_contiguous - test whether the page range is contiguous + * @page: the start of the page range. + * @nr_pages: the number of pages in the range. + * + * Test whether the page range is contiguous, such that they can be iterated + * naively, corresponding to iterating a contiguous PFN range. + * + * This function should primarily only be used for debug checks, or when + * working with page ranges that are not naturally contiguous (e.g., pages + * within a folio are). + * + * Returns true if contiguous, otherwise false. + */ +bool page_range_contiguous(const struct page *page, unsigned long nr_pages) +{ + const unsigned long start_pfn = page_to_pfn(page); + const unsigned long end_pfn = start_pfn + nr_pages; + unsigned long pfn; + + /* + * The memmap is allocated per memory section, so no need to check + * within the first section. However, we need to check each other + * spanned memory section once, making sure the first page in a + * section could similarly be reached by just iterating pages. + */ + for (pfn = ALIGN(start_pfn, PAGES_PER_SECTION); + pfn < end_pfn; pfn += PAGES_PER_SECTION) + if (unlikely(page + (pfn - start_pfn) != pfn_to_page(pfn))) + return false; + return true; +} +#endif #endif /* CONFIG_MMU */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:44 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:44 +0200 Subject: [PATCH v2 23/37] dma-remap: drop nth_page() in dma_common_contiguous_remap() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-24-david@redhat.com> dma_common_contiguous_remap() is used to remap an "allocated contiguous region". Within a single allocation, there is no need to use nth_page() anymore. 
Neither the buddy, nor hugetlb, nor CMA will hand out problematic page ranges. Acked-by: Marek Szyprowski Reviewed-by: Lorenzo Stoakes Cc: Robin Murphy Signed-off-by: David Hildenbrand --- kernel/dma/remap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c index 9e2afad1c6152..b7c1c0c92d0c8 100644 --- a/kernel/dma/remap.c +++ b/kernel/dma/remap.c @@ -49,7 +49,7 @@ void *dma_common_contiguous_remap(struct page *page, size_t size, if (!pages) return NULL; for (i = 0; i < count; i++) - pages[i] = nth_page(page, i); + pages[i] = page++; vaddr = vmap(pages, count, VM_DMA_COHERENT, prot); kvfree(pages); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:45 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:45 +0200 Subject: [PATCH v2 24/37] scatterlist: disallow non-contiguous page ranges in a single SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-25-david@redhat.com> The expectation is that there is currently no user that would pass in non-contiguous page ranges: no allocator, not even VMA, will hand these out. The only problematic part would be if someone would provide a range obtained directly from memblock, or manually merge problematic ranges. If we find such cases, we should fix them to create separate SG entries. Let's check in sg_set_page() that this is really the case. No need to check in sg_set_folio(), as pages in a folio are guaranteed to be contiguous. As sg_set_page() gets inlined into modules, we have to export the page_range_contiguous() helper -- use EXPORT_SYMBOL, there is nothing special about this helper such that we would want to enforce GPL-only modules. We can now drop the nth_page() usage in sg_page_iter_page(). 
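(Annotation, not part of the original posting.) The sg_set_page() check has to know how many pages an SG entry touches, given its byte length and its byte offset into the first page. A minimal userspace sketch of that arithmetic follows. It assumes a 4 KiB page size, and the names SKETCH_PAGE_SIZE, SKETCH_ALIGN_UP and sg_entry_nr_pages() are hypothetical stand-ins for illustration, not kernel API.

```c
/* Assumption: 4 KiB pages; the kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical stand-in for rounding x up to a multiple of a. */
#define SKETCH_ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/*
 * Number of pages spanned by an SG entry of `len` bytes that starts
 * `offset` bytes into its first page: round the end position up to a
 * page boundary, then divide by the page size.
 */
static unsigned long sg_entry_nr_pages(unsigned int len, unsigned int offset)
{
	unsigned long end = (unsigned long)len + offset;

	return SKETCH_ALIGN_UP(end, SKETCH_PAGE_SIZE) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions, an entry of 2 bytes starting at offset 4095 spans two pages, which is why the offset must be included before rounding up.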
Acked-by: Marek Szyprowski Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- include/linux/scatterlist.h | 3 ++- mm/util.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 6f8a4965f9b98..29f6ceb98d74b 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -158,6 +158,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page) static inline void sg_set_page(struct scatterlist *sg, struct page *page, unsigned int len, unsigned int offset) { + VM_WARN_ON_ONCE(!page_range_contiguous(page, ALIGN(len + offset, PAGE_SIZE) / PAGE_SIZE)); sg_assign_page(sg, page); sg->offset = offset; sg->length = len; @@ -600,7 +601,7 @@ void __sg_page_iter_start(struct sg_page_iter *piter, */ static inline struct page *sg_page_iter_page(struct sg_page_iter *piter) { - return nth_page(sg_page(piter->sg), piter->sg_pgoffset); + return sg_page(piter->sg) + piter->sg_pgoffset; } /** diff --git a/mm/util.c b/mm/util.c index fbdb73aaf35fe..bb4b47cd67091 100644 --- a/mm/util.c +++ b/mm/util.c @@ -1314,5 +1314,6 @@ bool page_range_contiguous(const struct page *page, unsigned long nr_pages) return false; return true; } +EXPORT_SYMBOL(page_range_contiguous); #endif #endif /* CONFIG_MMU */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:46 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:46 +0200 Subject: [PATCH v2 25/37] ata: libata-sff: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-26-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Acked-by: Damien Le Moal Reviewed-by: Lorenzo Stoakes Cc: Niklas Cassel Signed-off-by: David Hildenbrand --- drivers/ata/libata-sff.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c index 7fc407255eb46..1e2a2c33cdc80 100644 --- a/drivers/ata/libata-sff.c +++ b/drivers/ata/libata-sff.c @@ -614,7 +614,7 @@ static void ata_pio_sector(struct ata_queued_cmd *qc) offset = qc->cursg->offset + qc->cursg_ofs; /* get the current page and offset */ - page = nth_page(page, (offset >> PAGE_SHIFT)); + page += offset >> PAGE_SHIFT; offset %= PAGE_SIZE; /* don't overrun current sg */ @@ -631,7 +631,7 @@ static void ata_pio_sector(struct ata_queued_cmd *qc) unsigned int split_len = PAGE_SIZE - offset; ata_pio_xfer(qc, page, offset, split_len); - ata_pio_xfer(qc, nth_page(page, 1), 0, count - split_len); + ata_pio_xfer(qc, page + 1, 0, count - split_len); } else { ata_pio_xfer(qc, page, offset, count); } @@ -751,7 +751,7 @@ static int __atapi_pio_bytes(struct ata_queued_cmd *qc, unsigned int bytes) offset = sg->offset + qc->cursg_ofs; /* get the current page and offset */ - page = nth_page(page, (offset >> PAGE_SHIFT)); + page += offset >> PAGE_SHIFT; offset %= PAGE_SIZE; /* don't overrun current sg */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:47 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:47 +0200 Subject: [PATCH v2 26/37] drm/i915/gem: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-27-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Reviewed-by: Lorenzo Stoakes Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Tvrtko Ursulin Cc: David Airlie Cc: Simona Vetter Signed-off-by: David Hildenbrand --- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index c16a57160b262..031d7acc16142 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -779,7 +779,7 @@ __i915_gem_object_get_page(struct drm_i915_gem_object *obj, pgoff_t n) GEM_BUG_ON(!i915_gem_object_has_struct_page(obj)); sg = i915_gem_object_get_sg(obj, n, &offset); - return nth_page(sg_page(sg), offset); + return sg_page(sg) + offset; } /* Like i915_gem_object_get_page(), but mark the returned page dirty */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:48 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:48 +0200 Subject: [PATCH v2 27/37] mspro_block: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-28-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Acked-by: Ulf Hansson Reviewed-by: Lorenzo Stoakes Cc: Maxim Levitsky Cc: Alex Dubov Signed-off-by: David Hildenbrand --- drivers/memstick/core/mspro_block.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/memstick/core/mspro_block.c b/drivers/memstick/core/mspro_block.c index c9853d887d282..d3f160dc0da4c 100644 --- a/drivers/memstick/core/mspro_block.c +++ b/drivers/memstick/core/mspro_block.c @@ -560,8 +560,7 @@ static int h_mspro_block_transfer_data(struct memstick_dev *card, t_offset += msb->current_page * msb->page_size; sg_set_page(&t_sg, - nth_page(sg_page(&(msb->req_sg[msb->current_seg])), - t_offset >> PAGE_SHIFT), + sg_page(&(msb->req_sg[msb->current_seg])) + (t_offset >> PAGE_SHIFT), msb->page_size, offset_in_page(t_offset)); memstick_init_req_sg(*mrq, msb->data_dir == READ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:49 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:49 +0200 Subject: [PATCH v2 28/37] memstick: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-29-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Acked-by: Ulf Hansson Reviewed-by: Lorenzo Stoakes Cc: Maxim Levitsky Cc: Alex Dubov Signed-off-by: David Hildenbrand --- drivers/memstick/host/jmb38x_ms.c | 3 +-- drivers/memstick/host/tifm_ms.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/memstick/host/jmb38x_ms.c b/drivers/memstick/host/jmb38x_ms.c index cddddb3a5a27f..79e66e30417c1 100644 --- a/drivers/memstick/host/jmb38x_ms.c +++ b/drivers/memstick/host/jmb38x_ms.c @@ -317,8 +317,7 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host *host) unsigned int p_off; if (host->req->long_data) { - pg = nth_page(sg_page(&host->req->sg), - off >> PAGE_SHIFT); + pg = sg_page(&host->req->sg) + (off >> PAGE_SHIFT); p_off = offset_in_page(off); p_cnt = PAGE_SIZE - p_off; p_cnt = min(p_cnt, length); diff --git a/drivers/memstick/host/tifm_ms.c b/drivers/memstick/host/tifm_ms.c index db7f3a088fb09..0b6a90661eee5 100644 --- a/drivers/memstick/host/tifm_ms.c +++ b/drivers/memstick/host/tifm_ms.c @@ -201,8 +201,7 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms *host) unsigned int p_off; if (host->req->long_data) { - pg = nth_page(sg_page(&host->req->sg), - off >> PAGE_SHIFT); + pg = sg_page(&host->req->sg) + (off >> PAGE_SHIFT); p_off = offset_in_page(off); p_cnt = PAGE_SIZE - p_off; p_cnt = min(p_cnt, length); -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:50 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:50 +0200 Subject: [PATCH v2 29/37] mmc: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-30-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Acked-by: Ulf Hansson Reviewed-by: Lorenzo Stoakes Cc: Alex Dubov Cc: Jesper Nilsson Cc: Lars Persson Signed-off-by: David Hildenbrand --- drivers/mmc/host/tifm_sd.c | 4 ++-- drivers/mmc/host/usdhi6rol0.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/mmc/host/tifm_sd.c b/drivers/mmc/host/tifm_sd.c index ac636efd911d3..2cd69c9e9571b 100644 --- a/drivers/mmc/host/tifm_sd.c +++ b/drivers/mmc/host/tifm_sd.c @@ -191,7 +191,7 @@ static void tifm_sd_transfer_data(struct tifm_sd *host) } off = sg[host->sg_pos].offset + host->block_pos; - pg = nth_page(sg_page(&sg[host->sg_pos]), off >> PAGE_SHIFT); + pg = sg_page(&sg[host->sg_pos]) + (off >> PAGE_SHIFT); p_off = offset_in_page(off); p_cnt = PAGE_SIZE - p_off; p_cnt = min(p_cnt, cnt); @@ -240,7 +240,7 @@ static void tifm_sd_bounce_block(struct tifm_sd *host, struct mmc_data *r_data) } off = sg[host->sg_pos].offset + host->block_pos; - pg = nth_page(sg_page(&sg[host->sg_pos]), off >> PAGE_SHIFT); + pg = sg_page(&sg[host->sg_pos]) + (off >> PAGE_SHIFT); p_off = offset_in_page(off); p_cnt = PAGE_SIZE - p_off; p_cnt = min(p_cnt, cnt); diff --git a/drivers/mmc/host/usdhi6rol0.c b/drivers/mmc/host/usdhi6rol0.c index 85b49c07918b3..3bccf800339ba 100644 --- a/drivers/mmc/host/usdhi6rol0.c +++ b/drivers/mmc/host/usdhi6rol0.c @@ -323,7 +323,7 @@ static void usdhi6_blk_bounce(struct usdhi6_host *host, host->head_pg.page = host->pg.page; host->head_pg.mapped = host->pg.mapped; - host->pg.page = nth_page(host->pg.page, 1); + host->pg.page = host->pg.page + 1; host->pg.mapped = kmap(host->pg.page); host->blk_page = host->bounce_buf; @@ -503,7 +503,7 @@ static void usdhi6_sg_advance(struct usdhi6_host *host) /* We cannot get here after crossing a page border */ /* Next page in the same SG */ - host->pg.page = nth_page(sg_page(host->sg), host->page_idx); + host->pg.page = sg_page(host->sg) + host->page_idx; host->pg.mapped = kmap(host->pg.page); host->blk_page = host->pg.mapped; -- 2.50.1 From david at 
redhat.com Mon Sep 1 08:03:51 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:51 +0200 Subject: [PATCH v2 30/37] scsi: scsi_lib: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-31-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. Reviewed-by: Bart Van Assche Reviewed-by: Lorenzo Stoakes Reviewed-by: Martin K. Petersen Cc: "James E.J. Bottomley" Signed-off-by: David Hildenbrand --- drivers/scsi/scsi_lib.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 0c65ecfedfbd6..d7e42293b8645 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -3148,8 +3148,7 @@ void *scsi_kmap_atomic_sg(struct scatterlist *sgl, int sg_count, /* Offset starting from the beginning of first page in this sg-entry */ *offset = *offset - len_complete + sg->offset; - /* Assumption: contiguous pages can be accessed as "page + i" */ - page = nth_page(sg_page(sg), (*offset >> PAGE_SHIFT)); + page = sg_page(sg) + (*offset >> PAGE_SHIFT); *offset &= ~PAGE_MASK; /* Bytes in this sg-entry from *offset to the end of the page */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:52 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:52 +0200 Subject: [PATCH v2 31/37] scsi: sg: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-32-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. Reviewed-by: Bart Van Assche Reviewed-by: Lorenzo Stoakes Reviewed-by: Martin K. Petersen Cc: Doug Gilbert Cc: "James E.J. 
Bottomley" Signed-off-by: David Hildenbrand --- drivers/scsi/sg.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 3c02a5f7b5f39..4c62c597c7be9 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1235,8 +1235,7 @@ sg_vma_fault(struct vm_fault *vmf) len = vma->vm_end - sa; len = (len < length) ? len : length; if (offset < len) { - struct page *page = nth_page(rsv_schp->pages[k], - offset >> PAGE_SHIFT); + struct page *page = rsv_schp->pages[k] + (offset >> PAGE_SHIFT); get_page(page); /* increment page count */ vmf->page = page; return 0; /* success */ -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:53 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:53 +0200 Subject: [PATCH v2 32/37] vfio/pci: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-33-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. 
Reviewed-by: Lorenzo Stoakes Reviewed-by: Alex Williamson Reviewed-by: Brett Creeley Cc: Jason Gunthorpe Cc: Yishai Hadas Cc: Shameer Kolothum Cc: Kevin Tian Signed-off-by: David Hildenbrand --- drivers/vfio/pci/pds/lm.c | 3 +-- drivers/vfio/pci/virtio/migrate.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/pci/pds/lm.c b/drivers/vfio/pci/pds/lm.c index f2673d395236a..4d70c833fa32e 100644 --- a/drivers/vfio/pci/pds/lm.c +++ b/drivers/vfio/pci/pds/lm.c @@ -151,8 +151,7 @@ static struct page *pds_vfio_get_file_page(struct pds_vfio_lm_file *lm_file, lm_file->last_offset_sg = sg; lm_file->sg_last_entry += i; lm_file->last_offset = cur_offset; - return nth_page(sg_page(sg), - (offset - cur_offset) / PAGE_SIZE); + return sg_page(sg) + (offset - cur_offset) / PAGE_SIZE; } cur_offset += sg->length; } diff --git a/drivers/vfio/pci/virtio/migrate.c b/drivers/vfio/pci/virtio/migrate.c index ba92bb4e9af94..7dd0ac866461d 100644 --- a/drivers/vfio/pci/virtio/migrate.c +++ b/drivers/vfio/pci/virtio/migrate.c @@ -53,8 +53,7 @@ virtiovf_get_migration_page(struct virtiovf_data_buffer *buf, buf->last_offset_sg = sg; buf->sg_last_entry += i; buf->last_offset = cur_offset; - return nth_page(sg_page(sg), - (offset - cur_offset) / PAGE_SIZE); + return sg_page(sg) + (offset - cur_offset) / PAGE_SIZE; } cur_offset += sg->length; } -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:54 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:54 +0200 Subject: [PATCH v2 33/37] crypto: remove nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-34-david@redhat.com> It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. Reviewed-by: Lorenzo Stoakes Acked-by: Herbert Xu Cc: "David S. 
Miller" Signed-off-by: David Hildenbrand --- crypto/ahash.c | 4 ++-- crypto/scompress.c | 8 ++++---- include/crypto/scatterwalk.h | 4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/crypto/ahash.c b/crypto/ahash.c index a227793d2c5b5..dfb4f5476428f 100644 --- a/crypto/ahash.c +++ b/crypto/ahash.c @@ -88,7 +88,7 @@ static int hash_walk_new_entry(struct crypto_hash_walk *walk) sg = walk->sg; walk->offset = sg->offset; - walk->pg = nth_page(sg_page(walk->sg), (walk->offset >> PAGE_SHIFT)); + walk->pg = sg_page(walk->sg) + (walk->offset >> PAGE_SHIFT); walk->offset = offset_in_page(walk->offset); walk->entrylen = sg->length; @@ -226,7 +226,7 @@ int shash_ahash_digest(struct ahash_request *req, struct shash_desc *desc) if (!IS_ENABLED(CONFIG_HIGHMEM)) return crypto_shash_digest(desc, data, nbytes, req->result); - page = nth_page(page, offset >> PAGE_SHIFT); + page += offset >> PAGE_SHIFT; offset = offset_in_page(offset); if (nbytes > (unsigned int)PAGE_SIZE - offset) diff --git a/crypto/scompress.c b/crypto/scompress.c index c651e7f2197a9..1a7ed8ae65b07 100644 --- a/crypto/scompress.c +++ b/crypto/scompress.c @@ -198,7 +198,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) } else return -ENOSYS; - dpage = nth_page(dpage, doff / PAGE_SIZE); + dpage += doff / PAGE_SIZE; doff = offset_in_page(doff); n = (dlen - 1) / PAGE_SIZE; @@ -220,12 +220,12 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) } else break; - spage = nth_page(spage, soff / PAGE_SIZE); + spage = spage + soff / PAGE_SIZE; soff = offset_in_page(soff); n = (slen - 1) / PAGE_SIZE; n += (offset_in_page(slen - 1) + soff) / PAGE_SIZE; - if (PageHighMem(nth_page(spage, n)) && + if (PageHighMem(spage + n) && size_add(soff, slen) > PAGE_SIZE) break; src = kmap_local_page(spage) + soff; @@ -270,7 +270,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) if (dlen <= PAGE_SIZE) break; dlen -= PAGE_SIZE; - dpage = nth_page(dpage, 1); + 
dpage++; } } diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h index 15ab743f68c8f..83d14376ff2bc 100644 --- a/include/crypto/scatterwalk.h +++ b/include/crypto/scatterwalk.h @@ -159,7 +159,7 @@ static inline void scatterwalk_map(struct scatter_walk *walk) if (IS_ENABLED(CONFIG_HIGHMEM)) { struct page *page; - page = nth_page(base_page, offset >> PAGE_SHIFT); + page = base_page + (offset >> PAGE_SHIFT); offset = offset_in_page(offset); addr = kmap_local_page(page) + offset; } else { @@ -259,7 +259,7 @@ static inline void scatterwalk_done_dst(struct scatter_walk *walk, end += (offset_in_page(offset) + offset_in_page(nbytes) + PAGE_SIZE - 1) >> PAGE_SHIFT; for (i = start; i < end; i++) - flush_dcache_page(nth_page(base_page, i)); + flush_dcache_page(base_page + i); } scatterwalk_advance(walk, nbytes); } -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:55 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:55 +0200 Subject: [PATCH v2 34/37] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-35-david@redhat.com> There is the concern that unpin_user_page_range_dirty_lock() might do some weird merging of PFN ranges -- either now or in the future -- such that the PFN range is contiguous but the page range might not be. Let's sanity-check for that and drop the nth_page() usage. 
Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- mm/gup.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/gup.c b/mm/gup.c index f0f4d1a68e094..010fe56f6e132 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -237,7 +237,7 @@ void folio_add_pin(struct folio *folio) static inline struct folio *gup_folio_range_next(struct page *start, unsigned long npages, unsigned long i, unsigned int *ntails) { - struct page *next = nth_page(start, i); + struct page *next = start + i; struct folio *folio = page_folio(next); unsigned int nr = 1; @@ -342,6 +342,10 @@ EXPORT_SYMBOL(unpin_user_pages_dirty_lock); * "gup-pinned page range" refers to a range of pages that has had one of the * pin_user_pages() variants called on that page. * + * The page range must be truly physically contiguous: the page range + * corresponds to a contiguous PFN range and all pages can be iterated + * naturally. + * * For the page ranges defined by [page .. page+npages], make that range (or * its head pages, if a compound page) dirty, if @make_dirty is true, and if the * page range was previously listed as clean. @@ -359,6 +363,8 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages, struct folio *folio; unsigned int nr; + VM_WARN_ON_ONCE(!page_range_contiguous(page, npages)); + for (i = 0; i < npages; i += nr) { folio = gup_folio_range_next(page, npages, i, &nr); if (make_dirty && !folio_test_dirty(folio)) { -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:56 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:56 +0200 Subject: [PATCH v2 35/37] kfence: drop nth_page() usage In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-36-david@redhat.com> We want to get rid of nth_page(), and kfence init code is the last user. 
Unfortunately, we might actually walk a PFN range where the pages are not contiguous, because we might be allocating an area from memblock that could span memory sections in problematic kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP). We could check whether the page range is contiguous using page_range_contiguous() and fail kfence init, or make kfence incompatible with these problematic kernel configs. Let's keep it simple and simply use pfn_to_page() by iterating PFNs. Reviewed-by: Marco Elver Reviewed-by: Lorenzo Stoakes Cc: Alexander Potapenko Cc: Dmitry Vyukov Signed-off-by: David Hildenbrand --- mm/kfence/core.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/mm/kfence/core.c b/mm/kfence/core.c index 0ed3be100963a..727c20c94ac59 100644 --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -594,15 +594,14 @@ static void rcu_guarded_free(struct rcu_head *h) */ static unsigned long kfence_init_pool(void) { - unsigned long addr; - struct page *pages; + unsigned long addr, start_pfn; int i; if (!arch_kfence_init_pool()) return (unsigned long)__kfence_pool; addr = (unsigned long)__kfence_pool; - pages = virt_to_page(__kfence_pool); + start_pfn = PHYS_PFN(virt_to_phys(__kfence_pool)); /* * Set up object pages: they must have PGTY_slab set to avoid freeing @@ -613,11 +612,12 @@ static unsigned long kfence_init_pool(void) * enters __slab_free() slow-path. 
*/ for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) { - struct slab *slab = page_slab(nth_page(pages, i)); + struct slab *slab; if (!i || (i % 2)) continue; + slab = page_slab(pfn_to_page(start_pfn + i)); __folio_set_slab(slab_folio(slab)); #ifdef CONFIG_MEMCG slab->obj_exts = (unsigned long)&kfence_metadata_init[i / 2 - 1].obj_exts | @@ -665,10 +665,12 @@ static unsigned long kfence_init_pool(void) reset_slab: for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) { - struct slab *slab = page_slab(nth_page(pages, i)); + struct slab *slab; if (!i || (i % 2)) continue; + + slab = page_slab(pfn_to_page(start_pfn + i)); #ifdef CONFIG_MEMCG slab->obj_exts = 0; #endif -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:57 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:57 +0200 Subject: [PATCH v2 36/37] block: update comment of "struct bio_vec" regarding nth_page() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-37-david@redhat.com> Ever since commit 858c708d9efb ("block: move the bi_size update out of __bio_try_merge_page"), page_is_mergeable() no longer exists, and the logic in bvec_try_merge_page() is now a simple page pointer comparison. Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- include/linux/bvec.h | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/include/linux/bvec.h b/include/linux/bvec.h index 0a80e1f9aa201..3fc0efa0825b1 100644 --- a/include/linux/bvec.h +++ b/include/linux/bvec.h @@ -22,11 +22,8 @@ struct page; * @bv_len: Number of bytes in the address range. * @bv_offset: Start of the address range relative to the start of @bv_page. * - * The following holds for a bvec if n * PAGE_SIZE < bv_offset + bv_len: - * - * nth_page(@bv_page, n) == @bv_page + n - * - * This holds because page_is_mergeable() checks the above property. 
+ * All pages within a bio_vec starting from @bv_page are contiguous and + * can simply be iterated (see bvec_advance()). */ struct bio_vec { struct page *bv_page; -- 2.50.1 From david at redhat.com Mon Sep 1 08:03:58 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:58 +0200 Subject: [PATCH v2 37/37] mm: remove nth_page() In-Reply-To: <20250901150359.867252-1-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> Message-ID: <20250901150359.867252-38-david@redhat.com> Now that all users are gone, let's remove it. Reviewed-by: Lorenzo Stoakes Signed-off-by: David Hildenbrand --- include/linux/mm.h | 2 -- tools/testing/scatterlist/linux/mm.h | 1 - 2 files changed, 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 2ca1eb2db63ec..b26ca8b2162d9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -210,9 +210,7 @@ extern unsigned long sysctl_admin_reserve_kbytes; #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP) bool page_range_contiguous(const struct page *page, unsigned long nr_pages); -#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n)) #else -#define nth_page(page,n) ((page) + (n)) static inline bool page_range_contiguous(const struct page *page, unsigned long nr_pages) { diff --git a/tools/testing/scatterlist/linux/mm.h b/tools/testing/scatterlist/linux/mm.h index 5bd9e6e806254..121ae78d6e885 100644 --- a/tools/testing/scatterlist/linux/mm.h +++ b/tools/testing/scatterlist/linux/mm.h @@ -51,7 +51,6 @@ static inline unsigned long page_to_phys(struct page *page) #define page_to_pfn(page) ((unsigned long)(page) / PAGE_SIZE) #define pfn_to_page(pfn) (void *)((pfn) * PAGE_SIZE) -#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n)) #define __min(t1, t2, min1, min2, x, y) ({ \ t1 min1 = (x); \ -- 2.50.1 From Valentina.FernandezAlanis at microchip.com Mon Sep 1 08:28:22 2025 From: Valentina.FernandezAlanis at microchip.com (Valentina.FernandezAlanis at 
microchip.com) Date: Mon, 1 Sep 2025 15:28:22 +0000 Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org> References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com> <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com> <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org> Message-ID: On 28/08/2025 18:46, Krzysztof Kozlowski wrote: > On 25/08/2025 18:19, Valentina Fernandez wrote: >> +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi >> @@ -0,0 +1,58 @@ >> +// SPDX-License-Identifier: (GPL-2.0 OR MIT) >> +/* Copyright (c) 2020-2025 Microchip Technology Inc */ >> + >> +/ { >> + core_pwm0: pwm@40000000 { >> + compatible = "microchip,corepwm-rtl-v4"; >> + reg = <0x0 0x40000000 0x0 0xF0>; >> + microchip,sync-update-mask = /bits/ 32 <0>; >> + #pwm-cells = <3>; >> + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; >> + status = "disabled"; >> + }; >> + >> + i2c2: i2c@40000200 { >> + compatible = "microchip,corei2c-rtl-v7"; >> + reg = <0x0 0x40000200 0x0 0x100>; >> + #address-cells = <1>; >> + #size-cells = <0>; >> + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; >> + interrupt-parent = <&plic>; >> + interrupts = <122>; >> + clock-frequency = <100000>; >> + status = "disabled"; >> + }; >> + >> + ihc: mailbox { >> + compatible = "microchip,sbi-ipc"; >> + interrupt-parent = <&plic>; >> + interrupts = <180>, <179>, <178>, <177>; >> + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; >> + #mbox-cells = <1>; >> + status = "disabled"; >> + }; >> + >> + mailbox@50000000 { >> + compatible = "microchip,miv-ihc-rtl-v2"; >> + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; > > Does not look like following DTS coding style - order of properties. 
> >> + reg = <0x0 0x50000000 0x0 0x1c000>; >> + interrupt-parent = <&plic>; >> + interrupts = <180>, <179>, <178>, <177>; >> + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; >> + #mbox-cells = <1>; >> + status = "disabled"; >> + }; >> + >> + refclk_ccc: cccrefclk { > > Please use name for all fixed clocks which matches current format > recommendation: 'clock-' (see also the pattern in the binding for > any other options). > > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml The fabric dtsi describes elements configured by the FPGA bitstream. This node is named as such because the Clock Conditioner Circuit CCC's reference clock source is set by the FPGA bitstream, while its frequency is determined by an on-board oscillator. Hope this clarifies the rationale behind the node name. Thanks, Valentina > > Or anything more reasonable than just bunch of letters. > >> + compatible = "fixed-clock"; >> + #clock-cells = <0>; > > >> + }; >> +}; >> + >> +&ccc_sw { >> + clocks = <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, >> + <&refclk_ccc>, <&refclk_ccc>; >> + clock-names = "pll0_ref0", "pll0_ref1", "pll1_ref0", "pll1_ref1", >> + "dll0_ref", "dll1_ref"; >> + status = "okay"; >> +}; >> diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts >> new file mode 100644 >> index 000000000000..742369470ab0 >> --- /dev/null >> +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts >> @@ -0,0 +1,191 @@ >> +// SPDX-License-Identifier: (GPL-2.0 OR MIT) >> +/* Copyright (c) 2020-2025 Microchip Technology Inc */ >> + >> +/dts-v1/; >> + >> +#include "mpfs.dtsi" >> +#include "mpfs-disco-kit-fabric.dtsi" >> +#include >> +#include >> + >> +/ { >> + model = "Microchip PolarFire-SoC Discovery Kit"; >> + compatible = "microchip,mpfs-disco-kit-reference-rtl-v2507", >> + "microchip,mpfs-disco-kit", >> + "microchip,mpfs"; >> + >> + 
aliases { >> + ethernet0 = &mac0; >> + serial4 = &mmuart4; >> + }; >> + >> + chosen { >> + stdout-path = "serial4:115200n8"; >> + }; >> + >> + leds { >> + compatible = "gpio-leds"; >> + >> + led-1 { >> + gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led1"; >> + }; >> + >> + led-2 { >> + gpios = <&gpio2 18 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led2"; >> + }; >> + >> + led-3 { >> + gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led3"; >> + }; >> + >> + led-4 { >> + gpios = <&gpio2 20 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led4"; >> + }; >> + >> + led-5 { >> + gpios = <&gpio2 21 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led5"; >> + }; >> + >> + led-6 { >> + gpios = <&gpio2 22 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led6"; >> + }; >> + >> + led-7 { >> + gpios = <&gpio2 23 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led7"; >> + }; >> + >> + led-8 { >> + gpios = <&gpio1 9 GPIO_ACTIVE_HIGH>; >> + color = ; >> + label = "led8"; >> + }; >> + }; >> + >> + ddrc_cache_lo: memory@80000000 { >> + device_type = "memory"; >> + reg = <0x0 0x80000000 0x0 0x40000000>; >> + status = "okay"; > > Why? Did you disable it anywhere? > >> + }; >> + >> + reserved-memory { >> + #address-cells = <2>; >> + #size-cells = <2>; >> + ranges; >> + >> + hss_payload: region@BFC00000 { > > Don't mix cases. Should be lowercase hex everywhere. > > Best regards, > Krzysztof From spriteovo at gmail.com Mon Sep 1 10:19:42 2025 From: spriteovo at gmail.com (Asuna) Date: Tue, 2 Sep 2025 01:19:42 +0800 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: <20250901-lasso-kabob-de32b8fcede8@spud> References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> <20250901-lasso-kabob-de32b8fcede8@spud> Message-ID: > For example, there's a check in the riscv Kconfig menu to see if > stack-protector-guard=tls can be used via a cc-option check. 
If that > check passes with gcc as the compiler that option will be passed to > the rust side of the build, where llvm might not support it. If I understand correctly, the `-mstack-protector-guard` option is already always filtered out by `bindgen_skip_c_flags` in `rust/Makefile`, regardless of architecture. Therefore, we don't need to do anything more, right? > Similarly, turning on an extension like Zacas via a cc-option check > could pass for gcc but not be usable when passed to the rust side, > causing errors. That makes sense. I might need to check the version of libclang for each extension that passes the cc-option check for GCC to ensure it supports them. > These sorts of things should be prevented via Kconfig, not show up as > confusing build errors. I'm working on a patch, and intend to output an error message in `arch/riscv/Makefile` then exit 1 when detecting an incompatible gcc+libclang mix in use. From conor at kernel.org Mon Sep 1 11:04:04 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 19:04:04 +0100 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> <20250901-lasso-kabob-de32b8fcede8@spud> Message-ID: <20250901-unseemly-blimp-a74e3c77e780@spud> On Tue, Sep 02, 2025 at 01:19:42AM +0800, Asuna wrote: > > For example, there's a check in the riscv Kconfig menu to see if > > stack-protector-guard=tls can be used via a cc-option check. If that > > check passes with gcc as the compiler that option will be passed to the > > rust side of the build, where llvm might not support it. > If I understand correctly, the `-mstack-protector-guard` option is already > always filtered out by `bindgen_skip_c_flags` in `rust/Makefile`, regardless > of architecture. Therefore, we don't need to do anything more, right? 
That particular one might be a problem not because of -mstack-protector-guard itself, but rather three options get added at once: $(eval KBUILD_CFLAGS += -mstack-protector-guard=tls \ -mstack-protector-guard-reg=tp \ -mstack-protector-guard-offset=$(shell \ awk '{if ($$2 == "TSK_STACK_CANARY") print $$3;}' \ $(objtree)/include/generated/asm-offsets.h)) and the other ones might be responsible for the error. Similarly, something like -Wno-unterminated-string-initialization could cause a problem if gcc supports it but not libclang. I was doing some debugging today of another problem, and was able to trigger both of those errors with llvm-21 and libclang-19, so they definitely have the potential to be problems if there's a mismatch - I just don't know how many of those issues affect a mixed build with rustc and the gnu tools, mixing llvm and libclang versions already produces a warning about it being a Bad IdeaTM (a warning that I think should be an error). > > Similarly, turning on an extension like Zacas via a cc-option check > > could pass for gcc but not be usable when passed to the rust side, > > causing errors. > That makes sense. I might need to check the version of libclang for each > extension that passes the cc-option check for GCC to ensure it supports > them. > > > These sorts of things should be prevented via Kconfig, not show up as > > confusing build errors. > I'm working on a patch, and intend to output an error message in > `arch/riscv/Makefile` then exit 1 when detecting an incompatible > gcc+libclang mix in use. I think you're mostly better off catching that sort of thing in Kconfig, where possible and just make incompatible mixes invalid. What's actually incompatible is likely going to depend heavily on what options are enabled. 
From conor at kernel.org Mon Sep 1 11:44:59 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 1 Sep 2025 19:44:59 +0100 Subject: [PATCH 2/4] dt-bindings: riscv: Add Zalasr ISA extension description In-Reply-To: <20250901113022.3812-3-luxu.kernel@bytedance.com> References: <20250901113022.3812-1-luxu.kernel@bytedance.com> <20250901113022.3812-3-luxu.kernel@bytedance.com> Message-ID: <20250901-caravan-traps-9fb18046b458@spud> On Mon, Sep 01, 2025 at 07:30:20PM +0800, Xu Lu wrote: > Add description for the Zalasr ISA extension > > Signed-off-by: Xu Lu > --- > Documentation/devicetree/bindings/riscv/extensions.yaml | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml > index ede6a58ccf534..6b8c21807a2da 100644 > --- a/Documentation/devicetree/bindings/riscv/extensions.yaml > +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml > @@ -248,6 +248,11 @@ properties: > ratified at commit e87412e621f1 ("integrate Zaamo and Zalrsc text > (#1304)") of the unprivileged ISA specification. > > + - const: zalasr This is out of order, no? zalrsc would come after zalasr. > + description: | > + The standard Zalasr extension for load-acquire/store-release as frozen > + at commit 194f0094 ("Version 0.9 for freeze") of riscv-zalasr. > + > - const: zawrs > description: | > The Zawrs extension for entering a low-power state or for trapping > -- > 2.20.1 > 
From fustini at kernel.org Mon Sep 1 13:53:27 2025 From: fustini at kernel.org (Drew Fustini) Date: Mon, 1 Sep 2025 13:53:27 -0700 Subject: [PATCH v13 3/4] riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node In-Reply-To: References: <20250822-apr_14_for_sending-v13-0-af656f7cc6c3@samsung.com> <20250822-apr_14_for_sending-v13-3-af656f7cc6c3@samsung.com> Message-ID: On Mon, Sep 01, 2025 at 11:16:18AM +0000, Matt Coster wrote: > Hi Drew, > > Apologies for the delay, I was on holiday last week. > > I've just applied the non-dts patches to drm-misc-next [1], would you > mind re-adding the dts patch to thead-dt-for-next? Thanks for the update. I've now pushed the dts patch back to thead-dt-for-next: [3/4] riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node commit: 5052d5cf1359e9057ec311788c12997406fdb2fc -Drew From inochiama at gmail.com Mon Sep 1 15:16:15 2025 From: inochiama at gmail.com (Inochi Amaoto) Date: Tue, 2 Sep 2025 06:16:15 +0800 Subject: [PATCH v2 0/3] irqchip/sg2042-msi: Set irq type according to DT configuration In-Reply-To: References: Message-ID: On Tue, Aug 26, 2025 at 09:09:13AM +0800, Chen Wang wrote: > From: Chen Wang > > The original MSI interrupt type was hard-coded, which was not a good idea. > Now it is changed to read the device tree configuration and then set the > interrupt type. > > This patchset is based on irq/drivers branch of tip. > > --- > > Changes in v2: > The patch series is based on irq/drivers branch of tip. > > Reverted the change to obtain params of "msi-ranges"; it's better not to > assume the value of "#interrupt-cells" is 2, even though it's known to be > the case. Thanks to Inochi for the comments. 
> > Changes in v1: > The patch series is based on irq/drivers branch of tip. You can simply review > or test the patches at the link [1]. > > Link: https://lore.kernel.org/linux-riscv/cover.1756103516.git.unicorn_wang at outlook.com/ [1] > --- > > Chen Wang (3): > irqchip/sg2042-msi: Set irq type according to DT configuration > riscv: sophgo: dts: sg2042: change msi irq type to > IRQ_TYPE_EDGE_RISING > riscv: sophgo: dts: sg2044: change msi irq type to > IRQ_TYPE_EDGE_RISING > > arch/riscv/boot/dts/sophgo/sg2042.dtsi | 2 +- > arch/riscv/boot/dts/sophgo/sg2044.dtsi | 2 +- > drivers/irqchip/irq-sg2042-msi.c | 7 +++++-- > 3 files changed, 7 insertions(+), 4 deletions(-) > > > base-commit: 8ff1c16c753e293c3ba20583cb64f81ea7b9a451 > -- > 2.34.1 > Tested-by: Inochi Amaoto # Sophgo SRD3-10 From unicorn_wang at outlook.com Mon Sep 1 16:59:28 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Tue, 2 Sep 2025 07:59:28 +0800 Subject: [PATCH v2 0/3] irqchip/sg2042-msi: Set irq type according to DT configuration In-Reply-To: References: Message-ID: Hi, Thomas, Would you please pick this patchset? P.S. Since the modification of the DTS part is closely dependent on the modification of the driver part, I am not sure whether you are willing to pick these three patches together, or just pick the driver part and leave the DTS part to me? Thanks, Chen On 8/26/2025 9:09 AM, Chen Wang wrote: > From: Chen Wang > > The original MSI interrupt type was hard-coded, which was not a good idea. > Now it is changed to read the device tree configuration and then set the > interrupt type. > > This patchset is based on irq/drivers branch of tip. > > --- > > Changes in v2: > The patch series is based on irq/drivers branch of tip. > > Reverted the change to obtain params of "msi-ranges"; it's better not to > assume the value of "#interrupt-cells" is 2, even though it's known to be > the case. Thanks to Inochi for the comments. 
> > Changes in v1: > The patch series is based on irq/drivers branch of tip. You can simply review > or test the patches at the link [1]. > > Link: https://lore.kernel.org/linux-riscv/cover.1756103516.git.unicorn_wang at outlook.com/ [1] > --- > > Chen Wang (3): > irqchip/sg2042-msi: Set irq type according to DT configuration > riscv: sophgo: dts: sg2042: change msi irq type to > IRQ_TYPE_EDGE_RISING > riscv: sophgo: dts: sg2044: change msi irq type to > IRQ_TYPE_EDGE_RISING > > arch/riscv/boot/dts/sophgo/sg2042.dtsi | 2 +- > arch/riscv/boot/dts/sophgo/sg2044.dtsi | 2 +- > drivers/irqchip/irq-sg2042-msi.c | 7 +++++-- > 3 files changed, 7 insertions(+), 4 deletions(-) > > > base-commit: 8ff1c16c753e293c3ba20583cb64f81ea7b9a451 From troy.mitchell at linux.spacemit.com Mon Sep 1 18:36:42 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Tue, 2 Sep 2025 09:36:42 +0800 Subject: [PATCH v2 1/2] ASoC: dt-bindings: Add bindings for SpacemiT K1 In-Reply-To: <20250829171624.GA1027608-robh@kernel.org> References: <20250828-k1-i2s-v2-0-09e7b40f002c@linux.spacemit.com> <20250828-k1-i2s-v2-1-09e7b40f002c@linux.spacemit.com> <20250829171624.GA1027608-robh@kernel.org> Message-ID: On Fri, Aug 29, 2025 at 12:16:24PM -0500, Rob Herring wrote: > On Thu, Aug 28, 2025 at 11:37:32AM +0800, Troy Mitchell wrote: > > Add dt-binding for the i2s driver of SpacemiT's K1 SoC. > > > > Signed-off-by: Troy Mitchell > > --- > > .../devicetree/bindings/sound/spacemit,k1-i2s.yaml | 88 ++++++++++++++++++++++ > > 1 file changed, 88 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml b/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml > > new file mode 100644 > > index 0000000000000000000000000000000000000000..042001c38ed8d434889183831e44289ea9c5aef2 > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml [...] 
> > + dmas: > > + minItems: 1 > > + maxItems: 2 > > + > > + dma-names: > > + oneOf: > > + - const: rx > > + - items: > > + - const: tx > > + - const: rx > > If tx is optional, wouldn't this be simpler: > > minItems: 1 > items: > - const: rx > - const: tx > Thanks! I will simplify this in the next version - Troy > > > + > > + resets: > > + maxItems: 1 > > + > > + port: > > + $ref: audio-graph-port.yaml# > > + unevaluatedProperties: false > > + > > + "#sound-dai-cells": > > + const: 0 > > + > > +required: > > + - compatible > > + - reg > > + - clocks > > + - clock-names > > + - dmas > > + - dma-names > > + - resets > > + - "#sound-dai-cells" > > + > > +unevaluatedProperties: false > > + > > +examples: > > + - | > > + #include > > + i2s@d4026000 { > > + compatible = "spacemit,k1-i2s"; > > + reg = <0xd4026000 0x30>; > > + clocks = <&syscon_mpmu CLK_I2S_SYSCLK>, > > + <&syscon_mpmu CLK_I2S_BCLK>, > > + <&syscon_apbc CLK_SSPA0_BUS>, > > + <&syscon_apbc CLK_SSPA0>; > > + clock-names = "sysclk", "bclk", "bus", "func"; > > + dmas = <&pdma0 21>, <&pdma0 22>; > > + dma-names = "tx", "rx"; > > + resets = <&syscon_apbc RESET_SSPA0>; > > + #sound-dai-cells = <0>; > > + }; > > > > -- > > 2.50.1 > > > From troy.mitchell at linux.spacemit.com Mon Sep 1 18:39:40 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Tue, 2 Sep 2025 09:39:40 +0800 Subject: [PATCH v2 2/2] ASoC: spacemit: add i2s support for K1 SoC In-Reply-To: References: <20250828-k1-i2s-v2-0-09e7b40f002c@linux.spacemit.com> <20250828-k1-i2s-v2-2-09e7b40f002c@linux.spacemit.com> Message-ID: On Thu, Aug 28, 2025 at 11:04:19AM +0200, Mark Brown wrote: > On Thu, Aug 28, 2025 at 11:37:33AM +0800, Troy Mitchell wrote: > > > + switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { > > + case SND_SOC_DAIFMT_DSP_A: > > + case SND_SOC_DAIFMT_DSP_B: > > + cpu_dai->driver->playback.channels_min = 1; > > + cpu_dai->driver->playback.channels_max = 1; > > > + if ((fmt & SND_SOC_DAIFMT_FORMAT_MASK) == SND_SOC_DAIFMT_DSP_A) > 
> + sspsp_val |= SSPSP_FSRT; > > It's weird and confusing that this isn't part of the above switch case. Thanks! I'll change it. > > > +static void spacemit_i2s_remove(struct platform_device *pdev) > > +{ > > + /* resources auto-freed by devm_ */ > > +} > > If this can be empty remove it. Yes, I will remove it in the next version. Best regards, Troy From zong.li at sifive.com Mon Sep 1 21:01:19 2025 From: zong.li at sifive.com (Zong Li) Date: Tue, 2 Sep 2025 12:01:19 +0800 Subject: [RFC PATCH v2 00/10] RISC-V IOMMU HPM and nested IOMMU support In-Reply-To: <20250901133629.87310-1-ni_liqiang@126.com> References: <20240614142156.29420-3-zong.li@sifive.com> <20250901133629.87310-1-ni_liqiang@126.com> Message-ID: On Mon, Sep 1, 2025 at 9:37?PM niliqiang wrote: > > Hi Zong > > Fri, 14 Jun 2024 22:21:48 +0800, Zong Li wrote: > > > This patch initialize the pmu stuff and uninitialize it when driver > > removing. The interrupt handling is also provided, this handler need to > > be primary handler instead of thread function, because pt_regs is empty > > when threading the IRQ, but pt_regs is necessary by perf_event_overflow. 
> > > > Signed-off-by: Zong Li > > --- > > drivers/iommu/riscv/iommu.c | 65 +++++++++++++++++++++++++++++++++++++ > > 1 file changed, 65 insertions(+) > > > > diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c > > index 8b6a64c1ad8d..1716b2251f38 100644 > > --- a/drivers/iommu/riscv/iommu.c > > +++ b/drivers/iommu/riscv/iommu.c > > @@ -540,6 +540,62 @@ static irqreturn_t riscv_iommu_fltq_process(int irq, void *data) > > return IRQ_HANDLED; > > } > > > > +/* > > + * IOMMU Hardware performance monitor > > + */ > > + > > +/* HPM interrupt primary handler */ > > +static irqreturn_t riscv_iommu_hpm_irq_handler(int irq, void *dev_id) > > +{ > > + struct riscv_iommu_device *iommu = (struct riscv_iommu_device *)dev_id; > > + > > + /* Process pmu irq */ > > + riscv_iommu_pmu_handle_irq(&iommu->pmu); > > + > > + /* Clear performance monitoring interrupt pending */ > > + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_IPSR, RISCV_IOMMU_IPSR_PMIP); > > + > > + return IRQ_HANDLED; > > +} > > + > > +/* HPM initialization */ > > +static int riscv_iommu_hpm_enable(struct riscv_iommu_device *iommu) > > +{ > > + int rc; > > + > > + if (!(iommu->caps & RISCV_IOMMU_CAPABILITIES_HPM)) > > + return 0; > > + > > + /* > > + * pt_regs is empty when threading the IRQ, but pt_regs is necessary > > + * by perf_event_overflow. Use primary handler instead of thread > > + * function for PM IRQ. > > + * > > + * Set the IRQF_ONESHOT flag because this IRQ might be shared with > > + * other threaded IRQs by other queues. > > + */ > > + rc = devm_request_irq(iommu->dev, > > + iommu->irqs[riscv_iommu_queue_vec(iommu, RISCV_IOMMU_IPSR_PMIP)], > > + riscv_iommu_hpm_irq_handler, IRQF_ONESHOT | IRQF_SHARED, NULL, iommu); > > + if (rc) > > + return rc; > > + > > + return riscv_iommu_pmu_init(&iommu->pmu, iommu->reg, dev_name(iommu->dev)); > > +} > > + > > What are the benefits of initializing the iommu-pmu driver in the iommu driver? 
> > It might be better for the RISC-V IOMMU PMU driver to be loaded as a separate module, as this would allow greater flexibility since different vendors may need to add custom events. > > Also, I'm not quite clear on how custom events should be added if the RISC-V iommu-pmu is placed within the iommu driver. Hi Liqiang, My original idea is that, since the IOMMU HPM is not always present, it depends on the capability.HPM bit, if we separate HPM into an individual module, I assume that the PMU driver may not have access to the IOMMU's complete MMIO region. I'm not sure how we would check the capability register in the PMU driver and avoid the following situation: capability.HPM is zero, but the IOMMU-PMU driver is still loaded because the PMU node is present in the DTS. It will be helpful if you have any suggestions on this. Regarding custom events, since we don't have the driver data, my current rough idea is to add a vendor event map table to list the vendor events and use Kconfig to define them respectively. This is just an initial thought and may not be a good solution, so feel free to share any recommendations. Of course, if we eventually decide to move it to drivers/perf as an individual module, then we could use the driver data for custom events, similar to what ARM does. Thanks > > > Best regards, > Liqiang > From luxu.kernel at bytedance.com Mon Sep 1 21:24:28 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Tue, 2 Sep 2025 12:24:28 +0800 Subject: [PATCH v2 0/4] riscv: Add Zalasr ISA extension support Message-ID: <20250902042432.78960-1-luxu.kernel@bytedance.com> This patch adds support for the Zalasr ISA extension, which supplies the real load acquire/store release instructions. The specification can be found here: https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc This patch series has been tested with ltp on Qemu with Brensan's zalasr support patch[1]. Some false positive spacing errors happen during patch checking. 
Thus I CCed maintainers of checkpatch.pl as well. [1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w at mail.gmail.com/ v2: - Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to Conor. Xu Lu (4): riscv: add ISA extension parsing for Zalasr dt-bindings: riscv: Add Zalasr ISA extension description riscv: Instroduce Zalasr instructions riscv: Use Zalasr for smp_load_acquire/smp_store_release .../devicetree/bindings/riscv/extensions.yaml | 5 ++ arch/riscv/include/asm/barrier.h | 79 ++++++++++++++++--- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/insn-def.h | 79 +++++++++++++++++++ arch/riscv/kernel/cpufeature.c | 1 + 5 files changed, 154 insertions(+), 11 deletions(-) -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 21:24:29 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Tue, 2 Sep 2025 12:24:29 +0800 Subject: [PATCH v2 1/4] riscv: add ISA extension parsing for Zalasr In-Reply-To: <20250902042432.78960-1-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: <20250902042432.78960-2-luxu.kernel@bytedance.com> Add parsing for Zalasr ISA extension. 
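As a rough illustration of what extension parsing amounts to, the sketch below maps an extension name to a numeric ID through a small, alphabetically ordered table. It is a toy model under stated assumptions, not the kernel implementation: `struct toy_isa_ext`, `toy_isa_ext_table` and `toy_isa_ext_lookup()` are hypothetical names, and only the three IDs that appear in this patch's hwcap.h hunk are used.

```c
#include <stddef.h>
#include <string.h>

/* IDs taken from the hwcap.h hunk of this patch; the TOY_ prefix marks them as illustrative. */
#define TOY_ISA_EXT_ZAAMO   97
#define TOY_ISA_EXT_ZALRSC  98
#define TOY_ISA_EXT_ZALASR  100

/* Hypothetical table entry: extension name as it appears in an ISA list, plus its ID. */
struct toy_isa_ext {
	const char *name;
	int id;
};

/* Kept in alphabetical order, so "zalasr" sorts before "zalrsc". */
static const struct toy_isa_ext toy_isa_ext_table[] = {
	{ "zaamo",  TOY_ISA_EXT_ZAAMO  },
	{ "zalasr", TOY_ISA_EXT_ZALASR },
	{ "zalrsc", TOY_ISA_EXT_ZALRSC },
};

/* Return the ID for a known extension name, or -1 if the name is not in the table. */
int toy_isa_ext_lookup(const char *name)
{
	size_t n = sizeof(toy_isa_ext_table) / sizeof(toy_isa_ext_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(toy_isa_ext_table[i].name, name) == 0)
			return toy_isa_ext_table[i].id;
	}
	return -1;
}
```

With a table like this, recognising a new extension only needs a new ID and one new table row, which matches the two one-line additions in the diffstat of this patch.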
Signed-off-by: Xu Lu --- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a3..ae3852c4f2ca2 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_ZALASR 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572e..bf9d3d92bf372 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -472,6 +472,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(zaamo, RISCV_ISA_EXT_ZAAMO), __RISCV_ISA_EXT_DATA(zabha, RISCV_ISA_EXT_ZABHA), __RISCV_ISA_EXT_DATA(zacas, RISCV_ISA_EXT_ZACAS), + __RISCV_ISA_EXT_DATA(zalasr, RISCV_ISA_EXT_ZALASR), __RISCV_ISA_EXT_DATA(zalrsc, RISCV_ISA_EXT_ZALRSC), __RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS), __RISCV_ISA_EXT_DATA(zfa, RISCV_ISA_EXT_ZFA), -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 21:24:30 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Tue, 2 Sep 2025 12:24:30 +0800 Subject: [PATCH v2 2/4] dt-bindings: riscv: Add Zalasr ISA extension description In-Reply-To: <20250902042432.78960-1-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: <20250902042432.78960-3-luxu.kernel@bytedance.com> Add description for the Zalasr ISA extension Signed-off-by: Xu Lu --- Documentation/devicetree/bindings/riscv/extensions.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index ede6a58ccf534..100fe53fb0731 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ 
b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -242,6 +242,11 @@ properties: is supported as ratified at commit 5059e0ca641c ("update to ratified") of the riscv-zacas. + - const: zalasr + description: | + The standard Zalasr extension for load-acquire/store-release as frozen + at commit 194f0094 ("Version 0.9 for freeze") of riscv-zalasr. + - const: zalrsc description: | The standard Zalrsc extension for load-reserved/store-conditional as -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 21:24:31 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Tue, 2 Sep 2025 12:24:31 +0800 Subject: [PATCH v2 3/4] riscv: Instroduce Zalasr instructions In-Reply-To: <20250902042432.78960-1-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: <20250902042432.78960-4-luxu.kernel@bytedance.com> Introduce l{b|h|w|d}.{aq|aqrl} and s{b|h|w|d}.{rl|aqrl} instruction encodings. Signed-off-by: Xu Lu --- arch/riscv/include/asm/insn-def.h | 79 +++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h index d5adbaec1d010..3fec7e66ce50f 100644 --- a/arch/riscv/include/asm/insn-def.h +++ b/arch/riscv/include/asm/insn-def.h @@ -179,6 +179,7 @@ #define RV___RS1(v) __RV_REG(v) #define RV___RS2(v) __RV_REG(v) +#define RV_OPCODE_AMO RV_OPCODE(47) #define RV_OPCODE_MISC_MEM RV_OPCODE(15) #define RV_OPCODE_OP_IMM RV_OPCODE(19) #define RV_OPCODE_SYSTEM RV_OPCODE(115) @@ -208,6 +209,84 @@ __ASM_STR(.error "hlv.d requires 64-bit support") #endif +#define LB_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LB_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LH_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LH_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(27), \ + 
RD(dest), RS1(addr), __RS2(0)) + +#define LW_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LW_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define SB_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SB_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(0), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#define SH_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SH_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(1), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#define SW_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SW_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(2), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) + +#ifdef CONFIG_64BIT +#define LD_AQ(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(26), \ + RD(dest), RS1(addr), __RS2(0)) + +#define LD_AQRL(dest, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(27), \ + RD(dest), RS1(addr), __RS2(0)) + +#define SD_RL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(29), \ + __RD(0), RS1(addr), RS2(src)) + +#define SD_AQRL(src, addr) \ + INSN_R(OPCODE_AMO, FUNC3(3), FUNC7(31), \ + __RD(0), RS1(addr), RS2(src)) +#else +#define LD_AQ(dest, addr) \ + __ASM_STR(.error "ld.aq requires 64-bit support") + +#define LD_AQRL(dest, addr) \ + __ASM_STR(.error "ld.aqrl requires 64-bit support") + +#define SD_RL(dest, addr) \ + __ASM_STR(.error "sd.rl requires 64-bit support") + +#define SD_AQRL(dest, addr) \ + __ASM_STR(.error "sd.aqrl requires 64-bit support") +#endif + #define SINVAL_VMA(vaddr, asid) \ INSN_R(OPCODE_SYSTEM, FUNC3(0), FUNC7(11), \ __RD(0), RS1(vaddr), RS2(asid)) -- 2.20.1 From luxu.kernel at bytedance.com Mon Sep 1 21:24:32 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Tue, 2 Sep 2025 12:24:32 +0800 Subject: [PATCH v2 4/4] riscv: Use Zalasr 
for smp_load_acquire/smp_store_release In-Reply-To: <20250902042432.78960-1-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: <20250902042432.78960-5-luxu.kernel@bytedance.com> Replace fence instructions with Zalasr instructions during acquire or release operations. Signed-off-by: Xu Lu --- arch/riscv/include/asm/barrier.h | 79 +++++++++++++++++++++++++++----- 1 file changed, 68 insertions(+), 11 deletions(-) diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h index b8c5726d86acb..b1d2a9a85256d 100644 --- a/arch/riscv/include/asm/barrier.h +++ b/arch/riscv/include/asm/barrier.h @@ -51,19 +51,76 @@ */ #define smp_mb__after_spinlock() RISCV_FENCE(iorw, iorw) -#define __smp_store_release(p, v) \ -do { \ - compiletime_assert_atomic_type(*p); \ - RISCV_FENCE(rw, w); \ - WRITE_ONCE(*p, v); \ +extern void __bad_size_call_parameter(void); + +#define __smp_store_release(p, v) \ +do { \ + compiletime_assert_atomic_type(*p); \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsb %0, 0(%1)\t\n", \ + SB_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 2: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsh %0, 0(%1)\t\n", \ + SH_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 4: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsw %0, 0(%1)\t\n", \ + SW_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + case 8: \ + asm volatile(ALTERNATIVE("fence rw, w;\t\nsd %0, 0(%1)\t\n", \ + SD_RL(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : : "r" (v), "r" (p) : "memory"); \ + break; \ + default: \ + __bad_size_call_parameter(); \ + break; \ + } \ } while (0) -#define __smp_load_acquire(p) \ -({ \ - typeof(*p) ___p1 = READ_ONCE(*p); \ - compiletime_assert_atomic_type(*p); \ - 
RISCV_FENCE(r, rw); \ - ___p1; \ +#define __smp_load_acquire(p) \ +({ \ + TYPEOF_UNQUAL(*p) val; \ + compiletime_assert_atomic_type(*p); \ + switch (sizeof(*p)) { \ + case 1: \ + asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ + LB_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 2: \ + asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ + LH_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 4: \ + asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ + LW_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + case 8: \ + asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ + LD_AQ(%0, %1) "\t\nnop\t\n", \ + 0, RISCV_ISA_EXT_ZALASR, 1) \ + : "=r" (val) : "r" (p) : "memory"); \ + break; \ + default: \ + __bad_size_call_parameter(); \ + break; \ + } \ + val; \ }) #ifdef CONFIG_RISCV_ISA_ZAWRS -- 2.20.1 From dqfext at gmail.com Mon Sep 1 22:13:12 2025 From: dqfext at gmail.com (Qingfang Deng) Date: Tue, 2 Sep 2025 13:13:12 +0800 Subject: [PATCH 4/8] riscv: Introduce support for hardware break/watchpoints In-Reply-To: <20250822174715.1269138-5-jesse@rivosinc.com> References: <20250822174715.1269138-1-jesse@rivosinc.com> <20250822174715.1269138-5-jesse@rivosinc.com> Message-ID: <20250822174715.1269138-5-jesse@rivosinc.com> Hi Jesse and Charlie, On Fri, 22 Aug 2025 10:47:11 -0700, Jesse Taube wrote: > +static int arch_smp_setup_sbi_shmem(unsigned int cpu) > +{ > + union sbi_dbtr_shmem_entry *dbtr_shmem; > + unsigned long shmem_pa; > + struct sbiret ret; > + int rc; > + > + dbtr_shmem = per_cpu_ptr(&sbi_dbtr_shmem, cpu); > + if (!dbtr_shmem) { > + pr_err("Invalid per-cpu shared memory for debug triggers\n"); > + return -ENODEV; > + } > + > + shmem_pa = virt_to_phys(dbtr_shmem); > + > + ret = sbi_ecall(SBI_EXT_DBTR, SBI_EXT_DBTR_SETUP_SHMEM, 
> +			SBI_SHMEM_LO(shmem_pa), SBI_SHMEM_HI(shmem_pa), 0, 0, 0, 0);
> +	if (ret.error) {
> +		pr_warn("%s: failed to setup shared memory. error: %ld\n", __func__, ret.error);
> +		return sbi_err_map_linux_errno(ret.error);
> +	}
> +
> +	pr_debug("CPU %d: HW Breakpoint shared memory registered.\n", cpu);
> +
> +	return rc;

rc is uninitialized. You may remove the variable and just return 0 here.

> +}

Regards,
Qingfang

From krzk at kernel.org Mon Sep 1 23:22:02 2025
From: krzk at kernel.org (Krzysztof Kozlowski)
Date: Tue, 2 Sep 2025 08:22:02 +0200
Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit
In-Reply-To:
References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com> <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com> <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org>
Message-ID: <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org>

On 01/09/2025 17:28, Valentina.FernandezAlanis at microchip.com wrote:
> On 28/08/2025 18:46, Krzysztof Kozlowski wrote:
>> On 25/08/2025 18:19, Valentina Fernandez wrote:
>>> +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi
>>> @@ -0,0 +1,58 @@
>>> +// SPDX-License-Identifier: (GPL-2.0 OR MIT)
>>> +/* Copyright (c) 2020-2025 Microchip Technology Inc */
>>> +
>>> +/ {
>>> +	core_pwm0: pwm@40000000 {
>>> +		compatible = "microchip,corepwm-rtl-v4";
>>> +		reg = <0x0 0x40000000 0x0 0xF0>;
>>> +		microchip,sync-update-mask = /bits/ 32 <0>;
>>> +		#pwm-cells = <3>;
>>> +		clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>;
>>> +		status = "disabled";
>>> +	};
>>> +
>>> +	i2c2: i2c@40000200 {
>>> +		compatible = "microchip,corei2c-rtl-v7";
>>> +		reg = <0x0 0x40000200 0x0 0x100>;
>>> +		#address-cells = <1>;
>>> +		#size-cells = <0>;
>>> +		clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>;
>>> +		interrupt-parent = <&plic>;
>>> +		interrupts = <122>;
>>> +		clock-frequency = <100000>;
>>> +		status = "disabled";
>>> +	};
>>> +
>>> +	ihc: mailbox {
>>> +		compatible = "microchip,sbi-ipc";
>>> +		interrupt-parent = <&plic>;
>>> +		interrupts = <180>, <179>, <178>, <177>;
>>> +		interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4";
>>> +		#mbox-cells = <1>;
>>> +		status = "disabled";
>>> +	};
>>> +
>>> +	mailbox@50000000 {
>>> +		compatible = "microchip,miv-ihc-rtl-v2";
>>> +		microchip,ihc-chan-disabled-mask = /bits/ 16 <0>;
>>
>> Does not look like following DTS coding style - order of properties.
>>
>>> +		reg = <0x0 0x50000000 0x0 0x1c000>;
>>> +		interrupt-parent = <&plic>;
>>> +		interrupts = <180>, <179>, <178>, <177>;
>>> +		interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4";
>>> +		#mbox-cells = <1>;
>>> +		status = "disabled";
>>> +	};
>>> +
>>> +	refclk_ccc: cccrefclk {
>>
>> Please use name for all fixed clocks which matches current format
>> recommendation: 'clock-<freq>' (see also the pattern in the binding for
>> any other options).
>>
>> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml
> The fabric dtsi describes elements configured by the FPGA bitstream.
> This node is named as such because the Clock Conditioner Circuit CCC's
> reference clock source is set by the FPGA bitstream, while its frequency
> is determined by an on-board oscillator.
>
> Hope this clarifies the rationale behind the node name.

No, because there is no style naming clocks like this. Neither proper
suffix, nor prefix. Use standard naming.

And all other comments you ignored?
Best regards,
Krzysztof

From alex at ghiti.fr Mon Sep 1 23:55:24 2025
From: alex at ghiti.fr (Alexandre Ghiti)
Date: Tue, 2 Sep 2025 08:55:24 +0200
Subject: [PATCH] riscv: cacheinfo: init cache levels via fetch_cache_info when SMP disabled
In-Reply-To: <91570387-4da1-4b26-a274-bed1c59ef12f@ghiti.fr>
References: <20250814092936030rQLylo3a7HXUWKIniqFy1@zte.com.cn> <91570387-4da1-4b26-a274-bed1c59ef12f@ghiti.fr>
Message-ID:

Hi Jessica,

On 8/14/25 10:16, Alexandre Ghiti wrote:
> Hi Jessica,
>
> On 8/14/25 03:29, liu.xuemei1 at zte.com.cn wrote:
>> >> Hi Alex,
>> >>
>> >> Hi Jessica,
>> >>
>> >> On 8/1/25 03:32, liu.xuemei1 at zte.com.cn wrote:
>> >> >>>
>> >> >>> On 7/31/25 21:29, alex at ghiti.fr wrote:
>> >> >>>
>> >> >>> > > From: Jessica Liu
>> >> >>> > >
>> >> >>> > > As described in commit 1845d381f280 ("riscv: cacheinfo: Add back
>> >> >>> > > init_cache_level() function"), when CONFIG_SMP is undefined, the cache
>> >> >>> > > hierarchy detection needs to be performed through the init_cache_level(),
>> >> >>> > > whereas when CONFIG_SMP is defined, this detection is handled during the
>> >> >>> > > init_cpu_topology() process.
>> >> >>> > >
>> >> >>> > > Furthermore, while commit 66381d36771e ("RISC-V: Select ACPI PPTT drivers")
>> >> >>> > > enables cache information retrieval through the ACPI PPTT table, the
>> >> >>> > > init_of_cache_level() called within init_cache_level() cannot support cache
>> >> >>> > > hierarchy detection through ACPI PPTT. Therefore, when CONFIG_SMP is
>> >> >>> > > undefined, we directly invoke the fetch_cache_info function to initialize
>> >> >>> > > the cache levels.
>> >> >>> > >
>> >> >>> > > Signed-off-by: Jessica Liu
>> >> >>> > > ---
>> >> >>> > >   arch/riscv/kernel/cacheinfo.c | 6 +++++-
>> >> >>> > >   1 file changed, 5 insertions(+), 1 deletion(-)
>> >> >>> > >
>> >> >>> > > diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>> >> >>> > > index 26b085dbdd07..f81ca963d177 100644
>> >> >>> > > --- a/arch/riscv/kernel/cacheinfo.c
>> >> >>> > > +++ b/arch/riscv/kernel/cacheinfo.c
>> >> >>> > > @@ -73,7 +73,11 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
>> >> >>> > >
>> >> >>> > >   int init_cache_level(unsigned int cpu)
>> >> >>> > >   {
>> >> >>> > > -    return init_of_cache_level(cpu);
>> >> >>> > > +#ifdef CONFIG_SMP
>> >> >>> > > +    return 0;
>> >> >>> > > +#endif
>> >> >>> > > +
>> >> >>> > > +    return fetch_cache_info(cpu);
>> >> >>> > >   }
>> >> >>> > >
>> >> >>> > >   int populate_cache_leaves(unsigned int cpu)
>> >> >>> >
>> >> >>> > Is the current behaviour wrong or just redundant? If wrong, I'll add a
>> >> >>> > Fixes tag to backport, otherwise I won't.
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> >
>> >> >>> > Alex
>> >> >>>
>> >> >>> Hi Alex,
>> >> >>>
>> >> >>> The current behavior is actually wrong when using ACPI on !CONFIG_SMP
>> >> >>> systems. The original init_of_cache_level() cannot detect cache hierarchy
>> >> >>> through ACPI PPTT table, which means cache information would be missing
>> >> >>> in this configuration.
>> >> >>>
>> >> >>> The patch fixes this by directly calling fetch_cache_info() when
>> >> >>> CONFIG_SMP is undefined, which properly handles both DT and ACPI cases.
>> >> >>>
>> >> >>> So yes, it would be appropriate to add a Fixes tag. The commit being
>> >> >>> fixed is 1845d381f280 ("riscv: cacheinfo: Add back init_cache_level() function").
>> >> >>>
>> >> >>> Please let me know if you need any additional information.
>> >> >>>
>> >> >>
>> >> >> I'm about to send my first PR for 6.17 so I'll delay merging this one
>> >> >> for the first rc.
>> >> >
>> >> >
>> >> >So I took the time this morning to look into this, and I don't really
>> >> >like the different treatment for smp, can't we just move
>> >> >init_cpu_topology() call to setup_arch() (or else) for both !smp and smp?
>> >> >
>> >> >Thanks,
>> >> >
>> >> >Alex
>>
>> >> Thank you for your feedback and suggestion. I understand your desire
>> >> to have a unified approach for both SMP and !SMP. However, after
>> >> careful consideration, I still believe that handling them separately
>> >> is the more appropriate solution.
>>
>> >> The current method of obtaining cache information in
>> >> `init_cpu_topology()` is specific to RISC-V and ARM64. If we move
>> >> `init_cpu_topology()` to cover both SMP and !SMP, it may require
>> >> modifying the generic boot sequence. This could inadvertently affect
>> >> other architectures that do not rely on `init_cpu_topology()` for
>> >> cache initialization, leading to potential regressions and maintenance
>> >> issues.
>>
>> >> The `setup_arch()` function is called early in the boot process,
>> >> and at this stage, the ACPI subsystem has not been fully initialized.
>> >> Specifically, the ACPI tables (including PPTT) are not yet parsed.
>> >> Therefore, if we call `init_cpu_topology()` from `setup_arch()`, it
>> >> would not be able to retrieve cache information from the ACPI PPTT table.
>>
>> >> I hope this clarifies my train of thought. I'm open to further discussion and
>> >> alternative suggestions that can address the issue properly.
>> > > To me it does not make sense to retrieve the cache info at 2 different > points in time if the system is smp or not. I still think we should > find a common place where init_cpu_topology() can be called for both > smp and up, setup_arch() could not be the right place for the reasons > you gave, but we just need to find the right one :) > > Thanks for working on this, I don't mean to pressure you, I know it's the end of summer and people are still on vacations or just back from vacation. I just wanted to know if you had time to look into what I asked above? Thanks, Alex > > Alex > > >> >> Best regards, >> >> Jessica >> >> >> >> >> >> > > _______________________________________________ > linux-riscv mailing list > linux-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:43 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:43 +0100 Subject: [PATCH v2 0/5] Icicle Kit with prod device and Discovery Kit support Message-ID: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Hi all, With the introduction of the Icicle Kit with the production device (MPFS250T) to the market, it's necessary to distinguish it from the engineering sample (-es) variant. This is because engineering samples cannot write to flash from the MSS, as noted in the PolarFire SoC FPGA ES errata. This series adds a common board DTSI for the Icicle Kit, containing hardware shared by both the engineering sample and production versions, as well as a DTS for each Icicle Kit variant. The last two patches add support for the PolarFire SoC Discovery Kit board. 
Changes since v1: - fix order of properties in mailbox nodes - drop redundant status property from ddrc_cache nodes - fix lowecase hex in reserved memory regions Thanks, Valentina Valentina Fernandez (5): riscv: dts: microchip: add common board dtsi for icicle kit variants dt-bindings: riscv: microchip: document icicle kit with production device riscv: dts: microchip: add icicle kit with production device dt-bindings: riscv: microchip: document Discovery Kit riscv: dts: microchip: add a device tree for Discovery Kit .../devicetree/bindings/riscv/microchip.yaml | 13 + arch/riscv/boot/dts/microchip/Makefile | 2 + .../dts/microchip/mpfs-disco-kit-fabric.dtsi | 58 ++++ .../boot/dts/microchip/mpfs-disco-kit.dts | 190 +++++++++++++ .../dts/microchip/mpfs-icicle-kit-common.dtsi | 249 ++++++++++++++++++ .../dts/microchip/mpfs-icicle-kit-fabric.dtsi | 23 +- .../dts/microchip/mpfs-icicle-kit-prod.dts | 23 ++ .../boot/dts/microchip/mpfs-icicle-kit.dts | 244 +---------------- 8 files changed, 558 insertions(+), 244 deletions(-) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts -- 2.34.1 From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:44 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:44 +0100 Subject: [PATCH v2 1/5] riscv: dts: microchip: add common board dtsi for icicle kit variants In-Reply-To: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250902075548.1967613-2-valentina.fernandezalanis@microchip.com> In preparation for supporting the Icicle Kit with production silicon, add a common board dtsi for the icicle kit with hardware shared by both the 
engineering sample and production versions. Signed-off-by: Valentina Fernandez --- .../dts/microchip/mpfs-icicle-kit-common.dtsi | 247 ++++++++++++++++++ .../boot/dts/microchip/mpfs-icicle-kit.dts | 241 +---------------- 2 files changed, 248 insertions(+), 240 deletions(-) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi new file mode 100644 index 000000000000..eafea3b69cd7 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi @@ -0,0 +1,247 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs.dtsi" +#include "mpfs-icicle-kit-fabric.dtsi" +#include +#include + +/ { + aliases { + ethernet0 = &mac1; + serial0 = &mmuart0; + serial1 = &mmuart1; + serial2 = &mmuart2; + serial3 = &mmuart3; + serial4 = &mmuart4; + }; + + chosen { + stdout-path = "serial1:115200n8"; + }; + + leds { + compatible = "gpio-leds"; + + led-1 { + gpios = <&gpio2 16 GPIO_ACTIVE_HIGH>; + color = ; + label = "led1"; + }; + + led-2 { + gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; + color = ; + label = "led2"; + }; + + led-3 { + gpios = <&gpio2 18 GPIO_ACTIVE_HIGH>; + color = ; + label = "led3"; + }; + + led-4 { + gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; + color = ; + label = "led4"; + }; + }; + + ddrc_cache_lo: memory at 80000000 { + device_type = "memory"; + reg = <0x0 0x80000000 0x0 0x40000000>; + status = "okay"; + }; + + ddrc_cache_hi: memory at 1040000000 { + device_type = "memory"; + reg = <0x10 0x40000000 0x0 0x40000000>; + status = "okay"; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + hss_payload: region at BFC00000 { + reg = <0x0 0xBFC00000 0x0 0x400000>; + no-map; + }; + }; +}; + +&core_pwm0 { + status = "okay"; +}; + +&gpio2 { + interrupts = <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, 
<53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>; + status = "okay"; +}; + +&i2c0 { + status = "okay"; +}; + +&i2c1 { + status = "okay"; + + power-monitor at 10 { + compatible = "microchip,pac1934"; + reg = <0x10>; + + #address-cells = <1>; + #size-cells = <0>; + + channel at 1 { + reg = <0x1>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDREG"; + }; + + channel at 2 { + reg = <0x2>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDA25"; + }; + + channel at 3 { + reg = <0x3>; + shunt-resistor-micro-ohms = <10000>; + label = "VDD25"; + }; + + channel at 4 { + reg = <0x4>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDA_REG"; + }; + }; +}; + +&i2c2 { + status = "okay"; +}; + +&mac0 { + phy-mode = "sgmii"; + phy-handle = <&phy0>; + status = "okay"; +}; + +&mac1 { + phy-mode = "sgmii"; + phy-handle = <&phy1>; + status = "okay"; + + phy1: ethernet-phy at 9 { + reg = <9>; + }; + + phy0: ethernet-phy at 8 { + reg = <8>; + }; +}; + +&mbox { + status = "okay"; +}; + +&mmc { + bus-width = <4>; + disable-wp; + cap-sd-highspeed; + cap-mmc-highspeed; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + sd-uhs-sdr12; + sd-uhs-sdr25; + sd-uhs-sdr50; + sd-uhs-sdr104; + status = "okay"; +}; + +&mmuart1 { + status = "okay"; +}; + +&mmuart2 { + status = "okay"; +}; + +&mmuart3 { + status = "okay"; +}; + +&mmuart4 { + status = "okay"; +}; + +&pcie { + status = "okay"; +}; + +&qspi { + status = "okay"; +}; + +&refclk { + clock-frequency = <125000000>; +}; + +&refclk_ccc { + clock-frequency = <50000000>; +}; + +&rtc { + status = "okay"; +}; + +&spi0 { + status = "okay"; +}; + +&spi1 { + status = "okay"; +}; + +&syscontroller { + status = "okay"; +}; + +&syscontroller_qspi { + /* + * The flash *is* there, but Icicle kits that have engineering sample + * silicon (write?) access to this flash to non-functional. 
The system + * controller itself can actually access it, but the MSS cannot write + * an image there. Instantiating a coreQSPI in the fabric & connecting + * it to the flash instead should work though. Pre-production or later + * silicon does not have this issue. + */ + status = "disabled"; + + sys_ctrl_flash: flash at 0 { // MT25QL01GBBB8ESF-0SIT + compatible = "jedec,spi-nor"; + #address-cells = <1>; + #size-cells = <1>; + spi-max-frequency = <20000000>; + spi-rx-bus-width = <1>; + reg = <0>; + }; +}; + +&usb { + status = "okay"; + dr_mode = "host"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts index f80df225f72b..2cb08ed0946d 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts @@ -3,249 +3,10 @@ /dts-v1/; -#include "mpfs.dtsi" -#include "mpfs-icicle-kit-fabric.dtsi" -#include -#include +#include "mpfs-icicle-kit-common.dtsi" / { model = "Microchip PolarFire-SoC Icicle Kit"; compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", "microchip,mpfs"; - - aliases { - ethernet0 = &mac1; - serial0 = &mmuart0; - serial1 = &mmuart1; - serial2 = &mmuart2; - serial3 = &mmuart3; - serial4 = &mmuart4; - }; - - chosen { - stdout-path = "serial1:115200n8"; - }; - - leds { - compatible = "gpio-leds"; - - led-1 { - gpios = <&gpio2 16 GPIO_ACTIVE_HIGH>; - color = ; - label = "led1"; - }; - - led-2 { - gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; - color = ; - label = "led2"; - }; - - led-3 { - gpios = <&gpio2 18 GPIO_ACTIVE_HIGH>; - color = ; - label = "led3"; - }; - - led-4 { - gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; - color = ; - label = "led4"; - }; - }; - - ddrc_cache_lo: memory at 80000000 { - device_type = "memory"; - reg = <0x0 0x80000000 0x0 0x40000000>; - status = "okay"; - }; - - ddrc_cache_hi: memory at 1040000000 { - device_type = "memory"; - reg = <0x10 0x40000000 0x0 0x40000000>; - status = "okay"; - }; - - 
reserved-memory { - #address-cells = <2>; - #size-cells = <2>; - ranges; - - hss_payload: region at BFC00000 { - reg = <0x0 0xBFC00000 0x0 0x400000>; - no-map; - }; - }; -}; - -&core_pwm0 { - status = "okay"; -}; - -&gpio2 { - interrupts = <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>; - status = "okay"; -}; - -&i2c0 { - status = "okay"; -}; - -&i2c1 { - status = "okay"; - - power-monitor at 10 { - compatible = "microchip,pac1934"; - reg = <0x10>; - - #address-cells = <1>; - #size-cells = <0>; - - channel at 1 { - reg = <0x1>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDREG"; - }; - - channel at 2 { - reg = <0x2>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDA25"; - }; - - channel at 3 { - reg = <0x3>; - shunt-resistor-micro-ohms = <10000>; - label = "VDD25"; - }; - - channel at 4 { - reg = <0x4>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDA_REG"; - }; - }; -}; - -&i2c2 { - status = "okay"; -}; - -&mac0 { - phy-mode = "sgmii"; - phy-handle = <&phy0>; - status = "okay"; -}; - -&mac1 { - phy-mode = "sgmii"; - phy-handle = <&phy1>; - status = "okay"; - - phy1: ethernet-phy at 9 { - reg = <9>; - }; - - phy0: ethernet-phy at 8 { - reg = <8>; - }; -}; - -&mbox { - status = "okay"; -}; - -&mmc { - bus-width = <4>; - disable-wp; - cap-sd-highspeed; - cap-mmc-highspeed; - mmc-ddr-1_8v; - mmc-hs200-1_8v; - sd-uhs-sdr12; - sd-uhs-sdr25; - sd-uhs-sdr50; - sd-uhs-sdr104; - status = "okay"; -}; - -&mmuart1 { - status = "okay"; -}; - -&mmuart2 { - status = "okay"; -}; - -&mmuart3 { - status = "okay"; -}; - -&mmuart4 { - status = "okay"; -}; - -&pcie { - status = "okay"; -}; - -&qspi { - status = "okay"; -}; - -&refclk { - clock-frequency = <125000000>; -}; - -&refclk_ccc { - clock-frequency = <50000000>; -}; - -&rtc { - status = "okay"; -}; - -&spi0 { - status = "okay"; -}; - -&spi1 { - status = 
"okay"; -}; - -&syscontroller { - status = "okay"; -}; - -&syscontroller_qspi { - /* - * The flash *is* there, but Icicle kits that have engineering sample - * silicon (write?) access to this flash to non-functional. The system - * controller itself can actually access it, but the MSS cannot write - * an image there. Instantiating a coreQSPI in the fabric & connecting - * it to the flash instead should work though. Pre-production or later - * silicon does not have this issue. - */ - status = "disabled"; - - sys_ctrl_flash: flash at 0 { // MT25QL01GBBB8ESF-0SIT - compatible = "jedec,spi-nor"; - #address-cells = <1>; - #size-cells = <1>; - spi-max-frequency = <20000000>; - spi-rx-bus-width = <1>; - reg = <0>; - }; -}; - -&usb { - status = "okay"; - dr_mode = "host"; }; -- 2.34.1 From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:45 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:45 +0100 Subject: [PATCH v2 2/5] dt-bindings: riscv: microchip: document icicle kit with production device In-Reply-To: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250902075548.1967613-3-valentina.fernandezalanis@microchip.com> With the introduction of the Icicle Kit using the production MPFS250T device, it's necessary to distinguish it from the engineering sample (-es) variant. Engineering samples cannot write to flash from the MSS, as noted in the PolarFire SoC FPGA ES errata. Add specific compatibles for the Icicle Kit with Production device (MPFS250T) and Icicle Kit with Engineering Sample (MPFS250T_ES). The icicle kit reference designs in the v2025.07 release include the Mi-V IHC IP v2, used to send/receive data between clusters when using Asymmetric Multiprocessing (AMP) mode. 
In reference design releases prior to v2025.07, the MI-V IHC subsystem was included as a proof of concept in the design prior to becoming an IP available in the Libero catalog. Among other improvements, the new Mi-V IHC IP v2 includes some changes to the register map. For this reason, make use of a new reference design compatible to denote that v2025.07 reference design releases are not backwards compatible. Signed-off-by: Valentina Fernandez --- Documentation/devicetree/bindings/riscv/microchip.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/microchip.yaml b/Documentation/devicetree/bindings/riscv/microchip.yaml index 78ce76ae1b6d..8ddc5c02973e 100644 --- a/Documentation/devicetree/bindings/riscv/microchip.yaml +++ b/Documentation/devicetree/bindings/riscv/microchip.yaml @@ -18,10 +18,18 @@ properties: const: '/' compatible: oneOf: + - items: + - const: microchip,mpfs-icicle-prod-reference-rtl-v2507 + - const: microchip,mpfs-icicle-kit-prod + - const: microchip,mpfs-icicle-kit + - const: microchip,mpfs-prod + - const: microchip,mpfs + - items: - enum: - microchip,mpfs-icicle-reference-rtlv2203 - microchip,mpfs-icicle-reference-rtlv2210 + - microchip,mpfs-icicle-es-reference-rtl-v2507 - const: microchip,mpfs-icicle-kit - const: microchip,mpfs -- 2.34.1 From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:46 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:46 +0100 Subject: [PATCH v2 3/5] riscv: dts: microchip: add icicle kit with production device In-Reply-To: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250902075548.1967613-4-valentina.fernandezalanis@microchip.com> With the introduction of the Icicle Kit using the production MPFS250T device, it's necessary to distinguish it from the engineering sample (-es) variant. 
Engineering samples cannot write to flash from the MSS, as noted in the PolarFire SoC FPGA ES errata. Add a new device tree (mpfs-icicle-kit-prod.dts) for the production board which includes the icicle kit common dtsi and enable the system controller SPI flash, which is only accessible on production silicon. Remove redundant board compatible from fabric dtsi and update board compatibles for v2025.07 release, which includes Mi-V IHC v2 for AMP cluster communication. Fix formatting by using lowercase hex everywhere and remove redundant status properties from common dtsi. Signed-off-by: Valentina Fernandez --- arch/riscv/boot/dts/microchip/Makefile | 1 + .../dts/microchip/mpfs-icicle-kit-common.dtsi | 10 ++++---- .../dts/microchip/mpfs-icicle-kit-fabric.dtsi | 23 ++++++++++++++++--- .../dts/microchip/mpfs-icicle-kit-prod.dts | 23 +++++++++++++++++++ .../boot/dts/microchip/mpfs-icicle-kit.dts | 3 ++- 5 files changed, 52 insertions(+), 8 deletions(-) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts diff --git a/arch/riscv/boot/dts/microchip/Makefile b/arch/riscv/boot/dts/microchip/Makefile index f51aeeb9fd3b..1e2f4e41bf0d 100644 --- a/arch/riscv/boot/dts/microchip/Makefile +++ b/arch/riscv/boot/dts/microchip/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-beaglev-fire.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit.dtb +dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit-prod.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-m100pfsevp.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-polarberry.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-sev-kit.dtb diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi index eafea3b69cd7..e01a216e6c3a 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi @@ -53,13 +53,11
@@ led-4 { ddrc_cache_lo: memory@80000000 { device_type = "memory"; reg = <0x0 0x80000000 0x0 0x40000000>; - status = "okay"; }; ddrc_cache_hi: memory@1040000000 { device_type = "memory"; reg = <0x10 0x40000000 0x0 0x40000000>; - status = "okay"; }; reserved-memory { @@ -67,8 +65,8 @@ reserved-memory { #size-cells = <2>; ranges; - hss_payload: region@BFC00000 { - reg = <0x0 0xBFC00000 0x0 0x400000>; + hss_payload: region@bfc00000 { + reg = <0x0 0xbfc00000 0x0 0x400000>; no-map; }; }; @@ -134,6 +132,10 @@ &i2c2 { status = "okay"; }; +&ihc { + status = "okay"; +}; + &mac0 { phy-mode = "sgmii"; phy-handle = <&phy0>; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi index a6dda55a2d1d..e673b676fd1a 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi @@ -2,9 +2,6 @@ /* Copyright (c) 2020-2021 Microchip Technology Inc */ / { - compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", - "microchip,mpfs"; - core_pwm0: pwm@40000000 { compatible = "microchip,corepwm-rtl-v4"; reg = <0x0 0x40000000 0x0 0xF0>; @@ -26,6 +23,26 @@ i2c2: i2c@40000200 { status = "disabled"; }; + ihc: mailbox { + compatible = "microchip,sbi-ipc"; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + status = "disabled"; + }; + + mailbox@50000000 { + compatible = "microchip,miv-ihc-rtl-v2"; + reg = <0x0 0x50000000 0x0 0x1c000>; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; + status = "disabled"; + }; + pcie: pcie@3000000000 { compatible = "microchip,pcie-host-1.0"; #address-cells = <0x3>; diff --git
a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts new file mode 100644 index 000000000000..8afedece89d1 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs-icicle-kit-common.dtsi" + +/ { + model = "Microchip PolarFire-SoC Icicle Kit (Production Silicon)"; + compatible = "microchip,mpfs-icicle-prod-reference-rtl-v2507", + "microchip,mpfs-icicle-kit-prod", + "microchip,mpfs-icicle-kit", + "microchip,mpfs-prod", + "microchip,mpfs"; +}; + +&syscontroller { + microchip,bitstream-flash = <&sys_ctrl_flash>; +}; + +&syscontroller_qspi { + status = "okay"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts index 2cb08ed0946d..556aa9638282 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts @@ -7,6 +7,7 @@ / { model = "Microchip PolarFire-SoC Icicle Kit"; - compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", + compatible = "microchip,mpfs-icicle-es-reference-rtl-v2507", + "microchip,mpfs-icicle-kit", "microchip,mpfs"; }; -- 2.34.1 From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:47 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:47 +0100 Subject: [PATCH v2 4/5] dt-bindings: riscv: microchip: document Discovery Kit In-Reply-To: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250902075548.1967613-5-valentina.fernandezalanis@microchip.com> The Discovery Kit (MPFS-DISCO-KIT) is a development board featuring a Microchip PolarFire SoC MPFS095T. 
Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit Signed-off-by: Valentina Fernandez --- Documentation/devicetree/bindings/riscv/microchip.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/microchip.yaml b/Documentation/devicetree/bindings/riscv/microchip.yaml index 8ddc5c02973e..381d6eb6672e 100644 --- a/Documentation/devicetree/bindings/riscv/microchip.yaml +++ b/Documentation/devicetree/bindings/riscv/microchip.yaml @@ -33,6 +33,11 @@ properties: - const: microchip,mpfs-icicle-kit - const: microchip,mpfs + - items: + - const: microchip,mpfs-disco-kit-reference-rtl-v2507 + - const: microchip,mpfs-disco-kit + - const: microchip,mpfs + - items: - enum: - aldec,tysom-m-mpfs250t-rev2 -- 2.34.1 From valentina.fernandezalanis at microchip.com Tue Sep 2 00:55:48 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Tue, 2 Sep 2025 08:55:48 +0100 Subject: [PATCH v2 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250902075548.1967613-6-valentina.fernandezalanis@microchip.com> Add a minimal device tree for the Microchip PolarFire SoC Discovery Kit. 
The Discovery Kit is a cost-optimized board based on PolarFire SoC MPFS095T and features: - 1 GB DDR4x16 - 1x Gigabit Ethernet - 3x UARTs - Raspberry Pi connector - mikroBus connector - microSD card connector Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit Signed-off-by: Valentina Fernandez --- arch/riscv/boot/dts/microchip/Makefile | 1 + .../dts/microchip/mpfs-disco-kit-fabric.dtsi | 58 ++++++ .../boot/dts/microchip/mpfs-disco-kit.dts | 190 ++++++++++++++++++ 3 files changed, 249 insertions(+) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts diff --git a/arch/riscv/boot/dts/microchip/Makefile b/arch/riscv/boot/dts/microchip/Makefile index 1e2f4e41bf0d..345ed7a48cc1 100644 --- a/arch/riscv/boot/dts/microchip/Makefile +++ b/arch/riscv/boot/dts/microchip/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-beaglev-fire.dtb +dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-disco-kit.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit-prod.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-m100pfsevp.dtb diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi new file mode 100644 index 000000000000..03900e634fe2 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2020-2025 Microchip Technology Inc */ + +/ { + core_pwm0: pwm@40000000 { + compatible = "microchip,corepwm-rtl-v4"; + reg = <0x0 0x40000000 0x0 0xF0>; + microchip,sync-update-mask = /bits/ 32 <0>; + #pwm-cells = <3>; + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; + status = "disabled"; + }; + + i2c2: i2c@40000200 { + compatible = "microchip,corei2c-rtl-v7"; + reg = <0x0 0x40000200 0x0 0x100>; + 
#address-cells = <1>; + #size-cells = <0>; + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; + interrupt-parent = <&plic>; + interrupts = <122>; + clock-frequency = <100000>; + status = "disabled"; + }; + + ihc: mailbox { + compatible = "microchip,sbi-ipc"; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + status = "disabled"; + }; + + mailbox@50000000 { + compatible = "microchip,miv-ihc-rtl-v2"; + reg = <0x0 0x50000000 0x0 0x1c000>; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; + status = "disabled"; + }; + + refclk_ccc: cccrefclk { + compatible = "fixed-clock"; + #clock-cells = <0>; + }; +}; + +&ccc_sw { + clocks = <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, + <&refclk_ccc>, <&refclk_ccc>; + clock-names = "pll0_ref0", "pll0_ref1", "pll1_ref0", "pll1_ref1", + "dll0_ref", "dll1_ref"; + status = "okay"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts new file mode 100644 index 000000000000..c068b9bb5bfd --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts @@ -0,0 +1,190 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2020-2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs.dtsi" +#include "mpfs-disco-kit-fabric.dtsi" +#include +#include + +/ { + model = "Microchip PolarFire-SoC Discovery Kit"; + compatible = "microchip,mpfs-disco-kit-reference-rtl-v2507", + "microchip,mpfs-disco-kit", + "microchip,mpfs"; + + aliases { + ethernet0 = &mac0; + serial4 = &mmuart4; + }; + + chosen { + stdout-path = "serial4:115200n8"; + }; + + leds { + compatible = "gpio-leds"; + + led-1 { + gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; + color = ; + label = "led1"; + }; + + led-2 { + gpios = <&gpio2 18
GPIO_ACTIVE_HIGH>; + color = ; + label = "led2"; + }; + + led-3 { + gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; + color = ; + label = "led3"; + }; + + led-4 { + gpios = <&gpio2 20 GPIO_ACTIVE_HIGH>; + color = ; + label = "led4"; + }; + + led-5 { + gpios = <&gpio2 21 GPIO_ACTIVE_HIGH>; + color = ; + label = "led5"; + }; + + led-6 { + gpios = <&gpio2 22 GPIO_ACTIVE_HIGH>; + color = ; + label = "led6"; + }; + + led-7 { + gpios = <&gpio2 23 GPIO_ACTIVE_HIGH>; + color = ; + label = "led7"; + }; + + led-8 { + gpios = <&gpio1 9 GPIO_ACTIVE_HIGH>; + color = ; + label = "led8"; + }; + }; + + ddrc_cache_lo: memory@80000000 { + device_type = "memory"; + reg = <0x0 0x80000000 0x0 0x40000000>; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + hss_payload: region@bfc00000 { + reg = <0x0 0xbfc00000 0x0 0x400000>; + no-map; + }; + }; +}; + +&core_pwm0 { + status = "okay"; +}; + +&gpio1 { + interrupts = <27>, <28>, <29>, <30>, + <31>, <32>, <33>, <47>, + <35>, <36>, <37>, <38>, + <39>, <40>, <41>, <42>, + <43>, <44>, <45>, <46>, + <47>, <48>, <49>, <50>; + status = "okay"; +}; + +&gpio2 { + interrupts = <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>; + status = "okay"; +}; + +&i2c0 { + status = "okay"; +}; + +&i2c2 { + status = "okay"; +}; + +&ihc { + status = "okay"; +}; + +&mac0 { + phy-mode = "sgmii"; + phy-handle = <&phy0>; + status = "okay"; + + phy0: ethernet-phy@b { + reg = <0xb>; + }; +}; + +&mbox { + status = "okay"; +}; + +&mmc { + bus-width = <4>; + disable-wp; + cap-sd-highspeed; + cap-mmc-highspeed; + sd-uhs-sdr12; + sd-uhs-sdr25; + sd-uhs-sdr50; + sd-uhs-sdr104; + no-1-8-v; + status = "okay"; +}; + +&mmuart1 { + status = "okay"; +}; + +&mmuart4 { + status = "okay"; +}; + +&refclk { + clock-frequency = <125000000>; +}; + +&refclk_ccc { + clock-frequency = <50000000>; +}; + 
+&rtc { + status = "okay"; +}; + +&spi0 { + status = "okay"; +}; + +&spi1 { + status = "okay"; +}; + +&syscontroller { + status = "okay"; +}; -- 2.34.1 From alex at ghiti.fr Tue Sep 2 01:07:16 2025 From: alex at ghiti.fr (Alexandre Ghiti) Date: Tue, 2 Sep 2025 10:07:16 +0200 Subject: [GIT PULL] RISC-V Fixes for 6.17-rc5 Message-ID: <905a306b-6a92-4335-b31d-51802f7845af@ghiti.fr> The following changes since commit b320789d6883cc00ac78ce83bccbfe7ed58afcf0: Linux 6.17-rc4 (2025-08-31 15:33:07 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux tags/riscv-fixes-6.17-rc5 for you to fetch changes up to 1fa00f3deacafe202eba6887deba74ea6402c883: riscv: Use an atomic xchg in pudp_huge_get_and_clear() (2025-09-02 07:34:21 +0000) ---------------------------------------------------------------- riscv fixes for 6.17-rc5 - A fix for a link error by disabling LTO on medlow code model - 4 fixes where we used xlen-bits wide loads on 32-bit values - A fix in user access routines where we should have written the size of the destination, not the size of the source, which appeared in glibc testsuite - A fix in ACPI riscv csr read routines where the error code was incorrect - A fix for THP PUD to prevent returning an old pte value ---------------------------------------------------------------- Alexandre Ghiti (1): riscv: Use an atomic xchg in pudp_huge_get_and_clear() Anup Patel (1): ACPI: RISC-V: Fix FFH_CPPC_CSR error handling Aurelien Jarno (1): riscv: uaccess: fix __put_user_nocheck for unaligned accesses Nathan Chancellor (1): riscv: Only allow LTO with CMODEL_MEDANY Radim Krčmář (4): riscv: use lw when reading int cpu in new_vmalloc_check riscv: use lw when reading int cpu in asm_per_cpu riscv, bpf: use lw when reading int cpu in BPF_MOV64_PERCPU_REG riscv, bpf: use lw when reading int cpu in bpf_get_smp_processor_id arch/riscv/Kconfig | 
2 +- arch/riscv/include/asm/asm.h | 2 +- arch/riscv/include/asm/pgtable.h | 11 +++++++++++ arch/riscv/include/asm/uaccess.h | 2 +- arch/riscv/kernel/entry.S | 2 +- arch/riscv/net/bpf_jit_comp64.c | 4 ++-- drivers/acpi/riscv/cppc.c | 4 ++-- 7 files changed, 19 insertions(+), 8 deletions(-) From krzk at kernel.org Tue Sep 2 01:27:53 2025 From: krzk at kernel.org (Krzysztof Kozlowski) Date: Tue, 2 Sep 2025 10:27:53 +0200 Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys In-Reply-To: <20250901042320.22865-2-ziyao@disroot.org> References: <20250901042320.22865-1-ziyao@disroot.org> <20250901042320.22865-2-ziyao@disroot.org> Message-ID: <20250902-peach-jackal-of-judgment-8aee13@kuoka> On Mon, Sep 01, 2025 at 04:23:17AM +0000, Yao Zi wrote: > +/* VO Subsystem */ > #define TH1520_RESET_ID_GPU 0 > #define TH1520_RESET_ID_GPU_CLKGEN 1 > -#define TH1520_RESET_ID_NPU 2 > -#define TH1520_RESET_ID_WDT0 3 > -#define TH1520_RESET_ID_WDT1 4 This is ABI break and deserves explanation and its own patchset. Best regards, Krzysztof From conor at kernel.org Tue Sep 2 01:31:55 2025 From: conor at kernel.org (Conor Dooley) Date: Tue, 2 Sep 2025 09:31:55 +0100 Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org> References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com> <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com> <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org> <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org> Message-ID: <20250902-affair-scrambler-2771df16372e@spud> On Tue, Sep 02, 2025 at 08:22:02AM +0200, Krzysztof Kozlowski wrote: > >>> + refclk_ccc: cccrefclk { > >> > >> Please use name for all fixed clocks which matches current format > >> recommendation: 'clock-' (see also the pattern in the binding for > >> any other options). 
> >> > >> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml > > The fabric dtsi describes elements configured by the FPGA bitstream. > > This node is named as such because the Clock Conditioner Circuit CCC's > > reference clock source is set by the FPGA bitstream, while its frequency > > is determined by an on-board oscillator. > > > > Hope this clarifies the rationale behind the node name. > No, because there is no style naming clocks like this. Neither proper > suffix, nor prefix. Use standard naming. So you want all fixed frequency clocks to be named "clk-foo" when "clk-" is not suitable? Fine if you do, but I didn't realise that it was required and haven't been keeping an eye out for it. From ziyao at disroot.org Tue Sep 2 01:32:36 2025 From: ziyao at disroot.org (Yao Zi) Date: Tue, 2 Sep 2025 08:32:36 +0000 Subject: [PATCH v2 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <20250902075548.1967613-6-valentina.fernandezalanis@microchip.com> References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> <20250902075548.1967613-6-valentina.fernandezalanis@microchip.com> Message-ID: On Tue, Sep 02, 2025 at 08:55:48AM +0100, Valentina Fernandez wrote: > Add a minimal device tree for the Microchip PolarFire SoC Discovery Kit. 
> The Discovery Kit is a cost-optimized board based on PolarFire SoC > MPFS095T and features: > > - 1 GB DDR4x16 > - 1x Gigabit Ethernet > - 3x UARTs > - Raspberry Pi connector > - mikroBus connector > - microSD card connector > > Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit > Signed-off-by: Valentina Fernandez > --- > arch/riscv/boot/dts/microchip/Makefile | 1 + > .../dts/microchip/mpfs-disco-kit-fabric.dtsi | 58 ++++++ > .../boot/dts/microchip/mpfs-disco-kit.dts | 190 ++++++++++++++++++ > 3 files changed, 249 insertions(+) > create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi > create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts ... > diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts > new file mode 100644 > index 000000000000..c068b9bb5bfd > --- /dev/null > +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts ... > +&mbox { > + status = "okay"; > +}; > + > +&mmc { > + bus-width = <4>; > + disable-wp; > + cap-sd-highspeed; > + cap-mmc-highspeed; > + sd-uhs-sdr12; > + sd-uhs-sdr25; > + sd-uhs-sdr50; > + sd-uhs-sdr104; I think sd-uhs-sdr104 implies sd-uhs-sdr{12,25,50}, thus the latter three properties could be dropped. 
> + no-1-8-v; > + status = "okay"; > +}; Best regards, Yao Zi From Valentina.FernandezAlanis at microchip.com Tue Sep 2 01:39:53 2025 From: Valentina.FernandezAlanis at microchip.com (Valentina.FernandezAlanis at microchip.com) Date: Tue, 2 Sep 2025 08:39:53 +0000 Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org> References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com> <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com> <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org> <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org> Message-ID: On 02/09/2025 07:22, Krzysztof Kozlowski wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > On 01/09/2025 17:28, Valentina.FernandezAlanis at microchip.com wrote: >> On 28/08/2025 18:46, Krzysztof Kozlowski wrote: >>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe >>> >>> On 25/08/2025 18:19, Valentina Fernandez wrote: >>>> +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi >>>> @@ -0,0 +1,58 @@ >>>> +// SPDX-License-Identifier: (GPL-2.0 OR MIT) >>>> +/* Copyright (c) 2020-2025 Microchip Technology Inc */ >>>> + >>>> +/ { >>>> + core_pwm0: pwm at 40000000 { >>>> + compatible = "microchip,corepwm-rtl-v4"; >>>> + reg = <0x0 0x40000000 0x0 0xF0>; >>>> + microchip,sync-update-mask = /bits/ 32 <0>; >>>> + #pwm-cells = <3>; >>>> + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; >>>> + status = "disabled"; >>>> + }; >>>> + >>>> + i2c2: i2c at 40000200 { >>>> + compatible = "microchip,corei2c-rtl-v7"; >>>> + reg = <0x0 0x40000200 0x0 0x100>; >>>> + #address-cells = <1>; >>>> + #size-cells = <0>; >>>> + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; >>>> + interrupt-parent = <&plic>; >>>> + interrupts = <122>; >>>> + clock-frequency = <100000>; >>>> + status = "disabled"; >>>> + }; >>>> + >>>> + ihc: mailbox { >>>> + compatible = 
"microchip,sbi-ipc"; >>>> + interrupt-parent = <&plic>; >>>> + interrupts = <180>, <179>, <178>, <177>; >>>> + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; >>>> + #mbox-cells = <1>; >>>> + status = "disabled"; >>>> + }; >>>> + >>>> + mailbox at 50000000 { >>>> + compatible = "microchip,miv-ihc-rtl-v2"; >>>> + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; >>> >>> Does not look like following DTS coding style - order of properties. >>> >>>> + reg = <0x0 0x50000000 0x0 0x1c000>; >>>> + interrupt-parent = <&plic>; >>>> + interrupts = <180>, <179>, <178>, <177>; >>>> + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; >>>> + #mbox-cells = <1>; >>>> + status = "disabled"; >>>> + }; >>>> + >>>> + refclk_ccc: cccrefclk { >>> >>> Please use name for all fixed clocks which matches current format >>> recommendation: 'clock-' (see also the pattern in the binding for >>> any other options). >>> >>> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml >> The fabric dtsi describes elements configured by the FPGA bitstream. >> This node is named as such because the Clock Conditioner Circuit CCC's >> reference clock source is set by the FPGA bitstream, while its frequency >> is determined by an on-board oscillator. >> >> Hope this clarifies the rationale behind the node name. > No, because there is no style naming clocks like this. Neither proper > suffix, nor prefix. Use standard naming. > > And all other comments you ignored? I sent a v2 with the rest of the comments addressed. I didn't notice you were still not happy with the clock node name, please ignore the v2. 
> > Best regards, > Krzysztof From ziyao at disroot.org Tue Sep 2 02:04:23 2025 From: ziyao at disroot.org (Yao Zi) Date: Tue, 2 Sep 2025 09:04:23 +0000 Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys In-Reply-To: <20250902-peach-jackal-of-judgment-8aee13@kuoka> References: <20250901042320.22865-1-ziyao@disroot.org> <20250901042320.22865-2-ziyao@disroot.org> <20250902-peach-jackal-of-judgment-8aee13@kuoka> Message-ID: On Tue, Sep 02, 2025 at 10:27:53AM +0200, Krzysztof Kozlowski wrote: > On Mon, Sep 01, 2025 at 04:23:17AM +0000, Yao Zi wrote: > > +/* VO Subsystem */ > > #define TH1520_RESET_ID_GPU 0 > > #define TH1520_RESET_ID_GPU_CLKGEN 1 > > -#define TH1520_RESET_ID_NPU 2 > > -#define TH1520_RESET_ID_WDT0 3 > > -#define TH1520_RESET_ID_WDT1 4 > > This is ABI break and deserves explanation and its own patchset. The registers in control of TH1520_RESET_ID_{NPU,WDT0,WDT1} don't belong to the VO reset controller (documented as "thead,th1520-reset"), and thus cannot be implemented by it. They're in fact AP subsystem resets, which gets supported in Linux with this series. Is it okay for you to separate a patch to delete these wrong IDs and add them back for the AP reset controller latter? Anyway, I should have provided more information about these three resets. Thanks for catching this. 
> Best regards, > Krzysztof > Best regards, Yao Zi From tursulin at ursulin.net Tue Sep 2 02:22:09 2025 From: tursulin at ursulin.net (Tvrtko Ursulin) Date: Tue, 2 Sep 2025 10:22:09 +0100 Subject: [PATCH v2 26/37] drm/i915/gem: drop nth_page() usage within SG entry In-Reply-To: <20250901150359.867252-27-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-27-david@redhat.com> Message-ID: <4bbf5590-7591-4dfc-a23e-0bda6cb31a80@ursulin.net> On 01/09/2025 16:03, David Hildenbrand wrote: > It's no longer required to use nth_page() when iterating pages within a > single SG entry, so let's drop the nth_page() usage. > > Reviewed-by: Lorenzo Stoakes > Cc: Jani Nikula > Cc: Joonas Lahtinen > Cc: Rodrigo Vivi > Cc: Tvrtko Ursulin > Cc: David Airlie > Cc: Simona Vetter > Signed-off-by: David Hildenbrand > --- > drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c > index c16a57160b262..031d7acc16142 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c > @@ -779,7 +779,7 @@ __i915_gem_object_get_page(struct drm_i915_gem_object *obj, pgoff_t n) > GEM_BUG_ON(!i915_gem_object_has_struct_page(obj)); > > sg = i915_gem_object_get_sg(obj, n, &offset); > - return nth_page(sg_page(sg), offset); > + return sg_page(sg) + offset; > } > > /* Like i915_gem_object_get_page(), but mark the returned page dirty */ LGTM. If you want an ack to merge via a tree other than i915 you have it. I suspect it might be easier to coordinate like that. 
Regards, Tvrtko From david at redhat.com Tue Sep 2 02:42:30 2025 From: david at redhat.com (David Hildenbrand) Date: Tue, 2 Sep 2025 11:42:30 +0200 Subject: [PATCH v2 26/37] drm/i915/gem: drop nth_page() usage within SG entry In-Reply-To: <4bbf5590-7591-4dfc-a23e-0bda6cb31a80@ursulin.net> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-27-david@redhat.com> <4bbf5590-7591-4dfc-a23e-0bda6cb31a80@ursulin.net> Message-ID: <22019944-2ef2-4463-9b3f-23c9e7c70b2f@redhat.com> On 02.09.25 11:22, Tvrtko Ursulin wrote: > > On 01/09/2025 16:03, David Hildenbrand wrote: >> It's no longer required to use nth_page() when iterating pages within a >> single SG entry, so let's drop the nth_page() usage. >> >> Reviewed-by: Lorenzo Stoakes >> Cc: Jani Nikula >> Cc: Joonas Lahtinen >> Cc: Rodrigo Vivi >> Cc: Tvrtko Ursulin >> Cc: David Airlie >> Cc: Simona Vetter >> Signed-off-by: David Hildenbrand >> --- >> drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c >> index c16a57160b262..031d7acc16142 100644 >> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c >> @@ -779,7 +779,7 @@ __i915_gem_object_get_page(struct drm_i915_gem_object *obj, pgoff_t n) >> GEM_BUG_ON(!i915_gem_object_has_struct_page(obj)); >> >> sg = i915_gem_object_get_sg(obj, n, &offset); >> - return nth_page(sg_page(sg), offset); >> + return sg_page(sg) + offset; >> } >> >> /* Like i915_gem_object_get_page(), but mark the returned page dirty */ > > LGTM. If you want an ack to merge via a tree other than i915 you have > it. I suspect it might be easier to coordinate like that. Yeah, it would be best to route all of that through the MM tree. Thanks! 
--
Cheers

David / dhildenb

From vkoul at kernel.org Tue Sep 2 02:50:00 2025
From: vkoul at kernel.org (Vinod Koul)
Date: Tue, 02 Sep 2025 15:20:00 +0530
Subject: (subset) [PATCH v5 0/8] dmaengine: mmp_pdma: Add SpacemiT K1 SoC support with 64-bit addressing
In-Reply-To: <20250822-working_dma_0701_v2-v5-0-f5c0eda734cc@riscstar.com>
References: <20250822-working_dma_0701_v2-v5-0-f5c0eda734cc@riscstar.com>
Message-ID: <175680660058.246694.5045747556533020350.b4-ty@kernel.org>

On Fri, 22 Aug 2025 11:06:26 +0800, Guodong Xu wrote:
> This patchset adds support for SpacemiT K1 PDMA controller to the existing
> mmp_pdma driver. The K1 PDMA controller is compatible with Marvell MMP PDMA
> but extends it with 64-bit addressing capabilities through LPAE (Long
> Physical Address Extension) bit and higher 32-bit address registers (DDADRH,
> DSADRH and DTADRH).
>
> In v5, two smatch warnings reported by kernel test bot and Dan Carpenter were
> fixed.
>
> [...]

Applied, thanks!

[1/8] dt-bindings: dma: Add SpacemiT K1 PDMA controller
      commit: 39ce725e621b256188550492b4b53fb02bfc872e
[2/8] dmaengine: mmp_pdma: Add clock support
      commit: e73a9a13c99c5a55abfdb8c273651509be1eb5bb
[3/8] dmaengine: mmp_pdma: Add reset controller support
      commit: fc72462bc6107b8babda05cad5bf8f7daf8bec20
[4/8] dmaengine: mmp_pdma: Add operations structure for controller abstraction
      commit: 35e40bf761fcb24b1355d6a8d48b5b10683fe1a3
[5/8] dmaengine: mmp_pdma: Add SpacemiT K1 PDMA support with 64-bit addressing
      commit: 5cfe585d8624f7482505183dd0e4c534b061e822

Best regards,
--
~Vinod

From lkp at intel.com Tue Sep 2 03:50:38 2025
From: lkp at intel.com (kernel test robot)
Date: Tue, 2 Sep 2025 18:50:38 +0800
Subject: [PATCH 3/4] reset: th1520: Support reset controllers in more subsystems
In-Reply-To: <20250901042320.22865-4-ziyao@disroot.org>
References: <20250901042320.22865-4-ziyao@disroot.org>
Message-ID: <202509021804.FXl7up6q-lkp@intel.com>

Hi Yao,

kernel test robot noticed the following build warnings:

[auto build test WARNING on pza/reset/next]
[also build test WARNING on next-20250902]
[cannot apply to robh/for-next pza/imx-drm/next linus/master v6.17-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Yao-Zi/dt-bindings-reset-thead-th1520-reset-Add-controllers-for-more-subsys/20250901-122656
base: https://git.pengutronix.de/git/pza/linux reset/next
patch link: https://lore.kernel.org/r/20250901042320.22865-4-ziyao%40disroot.org
patch subject: [PATCH 3/4] reset: th1520: Support reset controllers in more subsystems
config: alpha-randconfig-r133-20250902 (https://download.01.org/0day-ci/archive/20250902/202509021804.FXl7up6q-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 8.5.0
reproduce: (https://download.01.org/0day-ci/archive/20250902/202509021804.FXl7up6q-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot
| Closes: https://lore.kernel.org/oe-kbuild-all/202509021804.FXl7up6q-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> drivers/reset/reset-th1520.c:157:10: sparse: sparse: Initializer entry defined twice
   drivers/reset/reset-th1520.c:161:10: sparse: also defined here
   drivers/reset/reset-th1520.c:808:10: sparse: sparse: Initializer entry defined twice
   drivers/reset/reset-th1520.c:820:10: sparse: also defined here

vim +157 drivers/reset/reset-th1520.c

   135
   136	static const struct th1520_reset_map th1520_ap_resets[] = {
   137		[TH1520_RESET_ID_BROM] = {
   138			.bit = BIT(0),
   139			.reg = TH1520_BROM_RST_CFG,
   140		},
   141		[TH1520_RESET_ID_C910_TOP] = {
   142			.bit = BIT(0),
   143			.reg = TH1520_C910_RST_CFG,
   144		},
   145		[TH1520_RESET_ID_NPU] = {
   146			.bit = BIT(0),
   147			.reg = TH1520_IMG_NNA_RST_CFG,
   148		},
   149		[TH1520_RESET_ID_WDT0] = {
   150			.bit = BIT(0),
   151			.reg = TH1520_WDT0_RST_CFG,
   152		},
   153		[TH1520_RESET_ID_WDT1] = {
   154			.bit = BIT(0),
   155			.reg = TH1520_WDT1_RST_CFG,
   156		},
 > 157		[TH1520_RESET_ID_C910_C0] = {
   158			.bit = BIT(1),
   159			.reg = TH1520_C910_RST_CFG,
   160		},
   161		[TH1520_RESET_ID_C910_C1] = {
   162			.bit = BIT(2),
   163			.reg = TH1520_C910_RST_CFG,
   164		},
   165		[TH1520_RESET_ID_C910_C2] = {
   166			.bit = BIT(3),
   167			.reg = TH1520_C910_RST_CFG,
   168		},
   169		[TH1520_RESET_ID_C910_C3] = {
   170			.bit = BIT(4),
   171			.reg = TH1520_C910_RST_CFG,
   172		},
   173		[TH1520_RESET_ID_CHIP_DBG_CORE] = {
   174			.bit = BIT(0),
   175			.reg = TH1520_CHIP_DBG_RST_CFG,
   176		},
   177		[TH1520_RESET_ID_CHIP_DBG_AXI] = {
   178			.bit = BIT(1),
   179			.reg = TH1520_CHIP_DBG_RST_CFG,
   180		},
   181		[TH1520_RESET_ID_AXI4_CPUSYS2_AXI] = {
   182			.bit = BIT(0),
   183			.reg = TH1520_AXI4_CPUSYS2_RST_CFG,
   184		},
   185		[TH1520_RESET_ID_AXI4_CPUSYS2_APB] = {
   186			.bit = BIT(1),
   187			.reg = TH1520_AXI4_CPUSYS2_RST_CFG,
   188		},
   189		[TH1520_RESET_ID_X2H_CPUSYS] = {
   190			.bit = BIT(0),
   191			.reg = TH1520_X2H_CPUSYS_RST_CFG,
   192		},
   193		[TH1520_RESET_ID_AHB2_CPUSYS] = {
   194			.bit = BIT(0),
   195			.reg = TH1520_AHB2_CPUSYS_RST_CFG,
   196		},
   197		[TH1520_RESET_ID_APB3_CPUSYS] = {
   198			.bit = BIT(0),
   199			.reg = TH1520_APB3_CPUSYS_RST_CFG,
   200		},
   201		[TH1520_RESET_ID_MBOX0_APB] = {
   202			.bit = BIT(0),
   203			.reg = TH1520_MBOX0_RST_CFG,
   204		},
   205		[TH1520_RESET_ID_MBOX1_APB] = {
   206			.bit = BIT(0),
   207			.reg = TH1520_MBOX1_RST_CFG,
   208		},
   209		[TH1520_RESET_ID_MBOX2_APB] = {
   210			.bit = BIT(0),
   211			.reg = TH1520_MBOX2_RST_CFG,
   212		},
   213		[TH1520_RESET_ID_MBOX3_APB] = {
   214			.bit = BIT(0),
   215			.reg = TH1520_MBOX3_RST_CFG,
   216		},
   217		[TH1520_RESET_ID_TIMER0_APB] = {
   218			.bit = BIT(0),
   219			.reg = TH1520_TIMER0_RST_CFG,
   220		},
   221		[TH1520_RESET_ID_TIMER0_CORE] = {
   222			.bit = BIT(1),
   223			.reg = TH1520_TIMER0_RST_CFG,
   224		},
   225		[TH1520_RESET_ID_TIMER1_APB] = {
   226			.bit = BIT(0),
   227			.reg = TH1520_TIMER1_RST_CFG,
   228		},
   229		[TH1520_RESET_ID_TIMER1_CORE] = {
   230			.bit = BIT(1),
   231			.reg = TH1520_TIMER1_RST_CFG,
   232		},
   233		[TH1520_RESET_ID_PERISYS_AHB] = {
   234			.bit = BIT(0),
   235			.reg = TH1520_PERISYS_AHB_RST_CFG,
   236		},
   237		[TH1520_RESET_ID_PERISYS_APB1] = {
   238			.bit = BIT(0),
   239			.reg = TH1520_PERISYS_APB1_RST_CFG,
   240		},
   241		[TH1520_RESET_ID_PERISYS_APB2] = {
   242			.bit = BIT(0),
   243			.reg = TH1520_PERISYS_APB2_RST_CFG,
   244		},
   245		[TH1520_RESET_ID_GMAC0_APB] = {
   246			.bit = BIT(0),
   247			.reg = TH1520_GMAC0_RST_CFG,
   248		},
   249		[TH1520_RESET_ID_GMAC0_AHB] = {
   250			.bit = BIT(1),
   251			.reg = TH1520_GMAC0_RST_CFG,
   252		},
   253		[TH1520_RESET_ID_GMAC0_CLKGEN] = {
   254			.bit = BIT(2),
   255			.reg = TH1520_GMAC0_RST_CFG,
   256		},
   257		[TH1520_RESET_ID_GMAC0_AXI] = {
   258			.bit = BIT(3),
   259			.reg = TH1520_GMAC0_RST_CFG,
   260		},
   261		[TH1520_RESET_ID_UART0_APB] = {
   262			.bit = BIT(0),
   263			.reg = TH1520_UART0_RST_CFG,
   264		},
   265		[TH1520_RESET_ID_UART0_IF] = {
   266			.bit = BIT(1),
   267			.reg = TH1520_UART0_RST_CFG,
   268		},
   269		[TH1520_RESET_ID_UART1_APB] = {
   270			.bit = BIT(0),
   271			.reg = TH1520_UART1_RST_CFG,
   272		},
   273		[TH1520_RESET_ID_UART1_IF] = {
   274			.bit = BIT(1),
   275			.reg = TH1520_UART1_RST_CFG,
   276		},
   277		[TH1520_RESET_ID_UART2_APB] = {
   278			.bit = BIT(0),
   279			.reg = TH1520_UART2_RST_CFG,
   280		},
   281		[TH1520_RESET_ID_UART2_IF] = {
   282			.bit = BIT(1),
   283			.reg = TH1520_UART2_RST_CFG,
   284		},
   285		[TH1520_RESET_ID_UART3_APB] = {
   286			.bit = BIT(0),
   287			.reg = TH1520_UART3_RST_CFG,
   288		},
   289		[TH1520_RESET_ID_UART3_IF] = {
   290			.bit = BIT(1),
   291			.reg = TH1520_UART3_RST_CFG,
   292		},
   293		[TH1520_RESET_ID_UART4_APB] = {
   294			.bit = BIT(0),
   295			.reg = TH1520_UART4_RST_CFG,
   296		},
   297		[TH1520_RESET_ID_UART4_IF] = {
   298			.bit = BIT(1),
   299			.reg = TH1520_UART4_RST_CFG,
   300		},
   301		[TH1520_RESET_ID_UART5_APB] = {
   302			.bit = BIT(0),
   303			.reg = TH1520_UART5_RST_CFG,
   304		},
   305		[TH1520_RESET_ID_UART5_IF] = {
   306			.bit = BIT(1),
   307			.reg = TH1520_UART5_RST_CFG,
   308		},
   309		[TH1520_RESET_ID_QSPI0_IF] = {
   310			.bit = BIT(0),
   311			.reg = TH1520_QSPI0_RST_CFG,
   312		},
   313		[TH1520_RESET_ID_QSPI0_APB] = {
   314			.bit = BIT(1),
   315			.reg = TH1520_QSPI0_RST_CFG,
   316		},
   317		[TH1520_RESET_ID_QSPI1_IF] = {
   318			.bit = BIT(0),
   319			.reg = TH1520_QSPI1_RST_CFG,
   320		},
   321		[TH1520_RESET_ID_QSPI1_APB] = {
   322			.bit = BIT(1),
   323			.reg = TH1520_QSPI1_RST_CFG,
   324		},
   325		[TH1520_RESET_ID_SPI_IF] = {
   326			.bit = BIT(0),
   327			.reg = TH1520_SPI_RST_CFG,
   328		},
   329		[TH1520_RESET_ID_SPI_APB] = {
   330			.bit = BIT(1),
   331			.reg = TH1520_SPI_RST_CFG,
   332		},
   333		[TH1520_RESET_ID_I2C0_APB] = {
   334			.bit = BIT(0),
   335			.reg = TH1520_I2C0_RST_CFG,
   336		},
   337		[TH1520_RESET_ID_I2C0_CORE] = {
   338			.bit = BIT(1),
   339			.reg = TH1520_I2C0_RST_CFG,
   340		},
   341		[TH1520_RESET_ID_I2C1_APB] = {
   342			.bit = BIT(0),
   343			.reg = TH1520_I2C1_RST_CFG,
   344		},
   345		[TH1520_RESET_ID_I2C1_CORE] = {
   346			.bit = BIT(1),
   347			.reg = TH1520_I2C1_RST_CFG,
   348		},
   349		[TH1520_RESET_ID_I2C2_APB] = {
   350			.bit = BIT(0),
   351			.reg = TH1520_I2C2_RST_CFG,
   352		},
   353		[TH1520_RESET_ID_I2C2_CORE] = {
   354			.bit = BIT(1),
   355			.reg = TH1520_I2C2_RST_CFG,
   356		},
   357		[TH1520_RESET_ID_I2C3_APB] = {
   358			.bit = BIT(0),
   359			.reg = TH1520_I2C3_RST_CFG,
   360		},
   361		[TH1520_RESET_ID_I2C3_CORE] = {
   362			.bit = BIT(1),
   363			.reg = TH1520_I2C3_RST_CFG,
   364		},
   365		[TH1520_RESET_ID_I2C4_APB] = {
   366			.bit = BIT(0),
   367			.reg = TH1520_I2C4_RST_CFG,
   368		},
   369		[TH1520_RESET_ID_I2C4_CORE] = {
   370			.bit = BIT(1),
   371			.reg = TH1520_I2C4_RST_CFG,
   372		},
   373		[TH1520_RESET_ID_I2C5_APB] = {
   374			.bit = BIT(0),
   375			.reg = TH1520_I2C5_RST_CFG,
   376		},
   377		[TH1520_RESET_ID_I2C5_CORE] = {
   378			.bit = BIT(1),
   379			.reg = TH1520_I2C5_RST_CFG,
   380		},
   381		[TH1520_RESET_ID_GPIO0_DB] = {
   382			.bit = BIT(0),
   383			.reg = TH1520_GPIO0_RST_CFG,
   384		},
   385		[TH1520_RESET_ID_GPIO0_APB] = {
   386			.bit = BIT(1),
   387			.reg = TH1520_GPIO0_RST_CFG,
   388		},
   389		[TH1520_RESET_ID_GPIO1_DB] = {
   390			.bit = BIT(0),
   391			.reg = TH1520_GPIO1_RST_CFG,
   392		},
   393		[TH1520_RESET_ID_GPIO1_APB] = {
   394			.bit = BIT(1),
   395			.reg = TH1520_GPIO1_RST_CFG,
   396		},
   397		[TH1520_RESET_ID_GPIO2_DB] = {
   398			.bit = BIT(0),
   399			.reg = TH1520_GPIO2_RST_CFG,
   400		},
   401		[TH1520_RESET_ID_GPIO2_APB] = {
   402			.bit = BIT(1),
   403			.reg = TH1520_GPIO2_RST_CFG,
   404		},
   405		[TH1520_RESET_ID_PWM_COUNTER] = {
   406			.bit = BIT(0),
   407			.reg = TH1520_PWM_RST_CFG,
   408		},
   409		[TH1520_RESET_ID_PWM_APB] = {
   410			.bit = BIT(1),
   411			.reg = TH1520_PWM_RST_CFG,
   412		},
   413		[TH1520_RESET_ID_PADCTRL0_APB] = {
   414			.bit = BIT(0),
   415			.reg = TH1520_PADCTRL0_APSYS_RST_CFG,
   416		},
   417		[TH1520_RESET_ID_CPU2PERI_X2H] = {
   418			.bit = BIT(1),
   419			.reg = TH1520_CPU2PERI_X2H_RST_CFG,
   420		},
   421		[TH1520_RESET_ID_CPU2AON_X2H] = {
   422			.bit = BIT(0),
   423			.reg = TH1520_CPU2AON_X2H_RST_CFG,
   424		},
   425		[TH1520_RESET_ID_AON2CPU_A2X] = {
   426			.bit = BIT(0),
   427			.reg = TH1520_AON2CPU_A2X_RST_CFG,
   428		},
   429		[TH1520_RESET_ID_NPUSYS_AXI] = {
   430			.bit = BIT(0),
   431			.reg = TH1520_NPUSYS_AXI_RST_CFG,
   432		},
   433		[TH1520_RESET_ID_NPUSYS_AXI_APB] = {
   434			.bit = BIT(1),
   435			.reg = TH1520_NPUSYS_AXI_RST_CFG,
   436		},
   437		[TH1520_RESET_ID_CPU2VP_X2P] = {
   438			.bit = BIT(0),
   439			.reg = TH1520_CPU2VP_X2P_RST_CFG,
   440		},
   441		[TH1520_RESET_ID_CPU2VI_X2H] = {
   442			.bit = BIT(0),
   443			.reg = TH1520_CPU2VI_X2H_RST_CFG,
   444		},
   445		[TH1520_RESET_ID_BMU_AXI] = {
   446			.bit = BIT(0),
   447			.reg = TH1520_BMU_C910_RST_CFG,
   448		},
   449		[TH1520_RESET_ID_BMU_APB] = {
   450			.bit = BIT(1),
   451			.reg = TH1520_BMU_C910_RST_CFG,
   452		},
   453		[TH1520_RESET_ID_DMAC_CPUSYS_AXI] = {
   454			.bit = BIT(0),
   455			.reg = TH1520_DMAC_CPUSYS_RST_CFG,
   456		},
   457		[TH1520_RESET_ID_DMAC_CPUSYS_AHB] = {
   458			.bit = BIT(1),
   459			.reg = TH1520_DMAC_CPUSYS_RST_CFG,
   460		},
   461		[TH1520_RESET_ID_SPINLOCK] = {
   462			.bit = BIT(0),
   463			.reg = TH1520_SPINLOCK_RST_CFG,
   464		},
   465		[TH1520_RESET_ID_CFG2TEE] = {
   466			.bit = BIT(0),
   467			.reg = TH1520_CFG2TEE_X2H_RST_CFG,
   468		},
   469		[TH1520_RESET_ID_DSMART] = {
   470			.bit = BIT(0),
   471			.reg = TH1520_DSMART_RST_CFG,
   472		},
   473		[TH1520_RESET_ID_GPIO3_DB] = {
   474			.bit = BIT(0),
   475			.reg = TH1520_GPIO3_RST_CFG,
   476		},
   477		[TH1520_RESET_ID_GPIO3_APB] = {
   478			.bit = BIT(1),
   479			.reg = TH1520_GPIO3_RST_CFG,
   480		},
   481		[TH1520_RESET_ID_PERI_I2S] = {
   482			.bit = BIT(0),
   483			.reg = TH1520_I2S_RST_CFG,
   484		},
   485		[TH1520_RESET_ID_PERI_APB3] = {
   486			.bit = BIT(0),
   487			.reg = TH1520_PERI_APB3_RST_CFG,
   488		},
   489		[TH1520_RESET_ID_PERI2PERI1_APB] = {
   490			.bit = BIT(1),
   491			.reg = TH1520_PERI_APB3_RST_CFG,
   492		},
   493		[TH1520_RESET_ID_VPSYS_APB] = {
   494			.bit = BIT(0),
   495			.reg = TH1520_VP_SUBSYS_RST_CFG,
   496		},
   497		[TH1520_RESET_ID_PERISYS_APB4] = {
   498			.bit = BIT(0),
   499			.reg = TH1520_PERISYS_APB4_RST_CFG,
   500		},
   501		[TH1520_RESET_ID_GMAC1_APB] = {
   502			.bit = BIT(0),
   503			.reg = TH1520_GMAC1_RST_CFG,
   504		},
   505		[TH1520_RESET_ID_GMAC1_AHB] = {
   506			.bit = BIT(1),
   507			.reg = TH1520_GMAC1_RST_CFG,
   508		},
   509		[TH1520_RESET_ID_GMAC1_CLKGEN] = {
   510			.bit = BIT(2),
   511			.reg = TH1520_GMAC1_RST_CFG,
   512		},
   513		[TH1520_RESET_ID_GMAC1_AXI] = {
   514			.bit = BIT(3),
   515			.reg = TH1520_GMAC1_RST_CFG,
   516		},
   517		[TH1520_RESET_ID_GMAC_AXI] = {
   518			.bit = BIT(0),
   519			.reg = TH1520_GMAC_AXI_RST_CFG,
   520		},
   521		[TH1520_RESET_ID_GMAC_AXI_APB] = {
   522			.bit = BIT(1),
   523			.reg = TH1520_GMAC_AXI_RST_CFG,
   524		},
   525		[TH1520_RESET_ID_PADCTRL1_APB] = {
   526			.bit = BIT(0),
   527			.reg = TH1520_PADCTRL1_APSYS_RST_CFG,
   528		},
   529		[TH1520_RESET_ID_VOSYS_AXI] = {
   530			.bit = BIT(0),
   531			.reg = TH1520_VOSYS_AXI_RST_CFG,
   532		},
   533		[TH1520_RESET_ID_VOSYS_AXI_APB] = {
   534			.bit = BIT(1),
   535			.reg = TH1520_VOSYS_AXI_RST_CFG,
   536		},
   537		[TH1520_RESET_ID_VOSYS_AXI_X2X] = {
   538			.bit = BIT(0),
   539			.reg = TH1520_VOSYS_X2X_RST_CFG,
   540		},
   541		[TH1520_RESET_ID_MISC2VP_X2X] = {
   542			.bit = BIT(0),
   543			.reg = TH1520_MISC2VP_X2X_RST_CFG,
   544		},
   545		[TH1520_RESET_ID_DSPSYS] = {
   546			.bit = BIT(0),
   547			.reg = TH1520_SUBSYS_RST_CFG,
   548		},
   549		[TH1520_RESET_ID_VISYS] = {
   550			.bit = BIT(1),
   551			.reg = TH1520_SUBSYS_RST_CFG,
   552		},
   553		[TH1520_RESET_ID_VOSYS] = {
   554			.bit = BIT(2),
   555			.reg = TH1520_SUBSYS_RST_CFG,
   556		},
   557		[TH1520_RESET_ID_VPSYS] = {
   558			.bit = BIT(3),
   559			.reg = TH1520_SUBSYS_RST_CFG,
   560		},
   561	};
   562

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

From andreas at gaisler.com Tue Sep 2 04:44:56 2025
From: andreas at gaisler.com (Andreas Larsson)
Date: Tue, 2 Sep 2025 13:44:56 +0200
Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64
In-Reply-To: <11a4d0a953e3a9405177d67f287c69379a2b2f8f.camel@physik.fu-berlin.de>
References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com>
 <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com>
 <11a4d0a953e3a9405177d67f287c69379a2b2f8f.camel@physik.fu-berlin.de>
Message-ID: <92bace9a-b5c4-4ea1-a1f7-4742c15a64a0@gaisler.com>

On 2025-09-02 09:15, John Paul Adrian Glaubitz wrote:
>> Thanks for this and for the whole series! Needed foundation for a
>> sparc32 clone3 implementation as well.
>
> Can you implement clone3 for sparc64 as well?

(heavily paring down the To list)

We'll take a look at that as well.
Cheers,
Andreas

From dlan at gentoo.org Tue Sep 2 05:26:58 2025
From: dlan at gentoo.org (Yixun Lan)
Date: Tue, 02 Sep 2025 20:26:58 +0800
Subject: [PATCH v2] riscv: dts: spacemit: uart: remove sec_uart1 device node
Message-ID: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org>

sec_uart1 is not available from Linux, and no clock is implemented in
CCF framework, thus 'make dtbs_check' will pop up this warning message:
  serial@f0612000: 'clock-names' is a required property

Removing the node from device tree to silence the DT check warning.

Signed-off-by: Yixun Lan
---
This patch tries to resolve the DT check warning caused by the clock for
sec_uart1 not being implemented.
---
Changes in v2:
- remove sec_uart1 node instead of marking it as reserved
- Link to v1: https://lore.kernel.org/r/20250718-02-k1-uart-clock-v1-1-698e884aa717@gentoo.org
---
 arch/riscv/boot/dts/spacemit/k1.dtsi | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi
index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..3094f75ed13badfc3db333be2b3195c61f57fddf 100644
--- a/arch/riscv/boot/dts/spacemit/k1.dtsi
+++ b/arch/riscv/boot/dts/spacemit/k1.dtsi
@@ -777,16 +777,7 @@ uart9: serial@d4017800 {
 				status = "disabled";
 			};
 
-			sec_uart1: serial@f0612000 {
-				compatible = "spacemit,k1-uart",
-					     "intel,xscale-uart";
-				reg = <0x0 0xf0612000 0x0 0x100>;
-				interrupts = <43>;
-				clock-frequency = <14857000>;
-				reg-shift = <2>;
-				reg-io-width = <4>;
-				status = "reserved"; /* for TEE usage */
-			};
+			/* sec_uart1: 0xf0612000, not available from Linux */
 		};
 
 		multimedia-bus {

---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250718-02-k1-uart-clock-0beb9ef10fe0

Best regards,
--
Yixun Lan

From wangruikang at iscas.ac.cn Tue Sep 2 05:55:32 2025
From: wangruikang at iscas.ac.cn (Vivian Wang)
Date: Tue, 2 Sep 2025 20:55:32 +0800
Subject: [PATCH v2] riscv: dts: spacemit: uart: remove sec_uart1 device node
In-Reply-To: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org>
References: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org>
Message-ID: <680534b4-27e7-4506-885a-1c3dc9d12b8b@iscas.ac.cn>

On 9/2/25 20:26, Yixun Lan wrote:
> [...]
>
> diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi
> index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..3094f75ed13badfc3db333be2b3195c61f57fddf 100644
> --- a/arch/riscv/boot/dts/spacemit/k1.dtsi
> +++ b/arch/riscv/boot/dts/spacemit/k1.dtsi
> @@ -777,16 +777,7 @@ uart9: serial@d4017800 {
>  				status = "disabled";
>  			};
>
> -			sec_uart1: serial@f0612000 {
> -				compatible = "spacemit,k1-uart",
> -					     "intel,xscale-uart";
> -				reg = <0x0 0xf0612000 0x0 0x100>;
> -				interrupts = <43>;
> -				clock-frequency = <14857000>;
> -				reg-shift = <2>;
> -				reg-io-width = <4>;
> -				status = "reserved"; /* for TEE usage */
> -			};
> +			/* sec_uart1: 0xf0612000, not available from Linux */

I know this is going back and forth a lot but I don't think that's a
good description of what's going on.

My preference is that we just drop this node altogether, just forgetting
that this thing even exists. But if you do think we want to keep the
information we can drop the clock-frequency property too and change its
status to something like:

    status = "disabled"; /* No clock defined */

Which also silences the warning - disabled nodes are allowed to be
incomplete.

My personal opinion is that I think sec_uart1 and TEE support feels too
theoretical to be worth caring about.
Vivian "dramforever" Wang

From tglx at linutronix.de Tue Sep 2 06:11:27 2025
From: tglx at linutronix.de (Thomas Gleixner)
Date: Tue, 02 Sep 2025 15:11:27 +0200
Subject: [PATCH v2 1/3] irqchip/sg2042-msi: Set irq type according to DT configuration
In-Reply-To: <49e70989c2f0a8a67e48527e57b4877262996214.1756169460.git.unicorn_wang@outlook.com>
References: <49e70989c2f0a8a67e48527e57b4877262996214.1756169460.git.unicorn_wang@outlook.com>
Message-ID: <87v7m10zk0.ffs@tglx>

On Tue, Aug 26 2025 at 09:09, Chen Wang wrote:
> From: Chen Wang
>
> The original MSI interrupt type was hard-coded, which was not a good idea.

That's not really helpful, unless you explain WHY it's not a good idea...

Also for correctness sake, you want to change the DTs first and not
after you changed the driver to read it from the stale device tree with
the wrong type.

Thanks,

        tglx

From tglx at linutronix.de Tue Sep 2 06:12:39 2025
From: tglx at linutronix.de (Thomas Gleixner)
Date: Tue, 02 Sep 2025 15:12:39 +0200
Subject: [PATCH v2 0/3] irqchip/sg2042-msi: Set irq type according to DT configuration
In-Reply-To:
References:
Message-ID: <87seh50zi0.ffs@tglx>

On Tue, Sep 02 2025 at 07:59, Chen Wang wrote:

Please don't top-post and trim your replies.

> P.S. Since the modification of the DTS part is closely dependent on the
> modification of the driver part, I am not sure whether you are willing
> to pick these three patches together, or just pick the driver part and
> leave the DTS part to me?

They need to go together in the right order obviously.

From robh at kernel.org Tue Sep 2 06:17:49 2025
From: robh at kernel.org (Rob Herring (Arm))
Date: Tue, 02 Sep 2025 08:17:49 -0500
Subject: [PATCH v8 0/5] Add support for NetCube Systems Nagami SoM and its carrier boards
In-Reply-To: <20250831162536.2380589-1-lukas.schmid@netcube.li>
References: <20250831162536.2380589-1-lukas.schmid@netcube.li>
Message-ID: <175678730955.877897.3145791714848835564.robh@kernel.org>

On Sun, 31 Aug 2025 18:25:29 +0200, Lukas Schmid wrote:
> This series adds support for the NetCube Systems Nagami SoM and its
> associated carrier boards, the Nagami Basic Carrier and the Nagami Keypad
> Carrier.
>
> Changes in v8:
> - Use a gpio-mux instead of the gpio-hog for the USB0_SEC_EN signal
> - Fix the dt-schema issues
>
> Changes in v7:
> - Fix the gpio numbering for the USB_SEC_EN gpio hog
> - Fix the gpio-line-names for the keypad carrier
>
> Changes in v6:
> - Add 'usb0-enable-hog' to the som to enable the USB-OTG port by default
> - Update the keypad carrier dts to match actual board revision
>
> Changes in v5:
> - Re-add the non-removable property to the ESP32 interface
> - Add the mmc-pwrseq node for the ESP32 to initialize the ESP32 correctly
> - Remove the unused ehci0 and ohci0 nodes from the Keypad Carrier since
>   USB port is peripheral only
>
> Changes in v4:
> - Disable the default interfaces on the card-edge but keep the pinctrl
>   definitions for them
> - Split the pinctrl definitions for the SPI interface into the basic spi
>   pins and the hold/wp pins
> - Move some mmc0 properties to the Basic Carrier dts
> - Remove non-removable property from the ESP32 interface
> - Fix typo in the keypad matrix definition
>
> Changes in v3:
> - Add missing dcxo node to the SoM dtsi
> - Rename the multi-led node
> - Change dr_mode to "peripheral" for the Keypad Carrier
>
> Changes in v2:
> - Squash the binding patches into one patch
> - Fix formatting of the phy node in the SoM dtsi
> - Add description on where the phy is located in the SoM dtsi
> - Fix the phy address in the SoM dtsi
> - Move the carrier bindings into the same description as enums
>
> Signed-off-by: Lukas Schmid
> ---
> Lukas Schmid (5):
>   dt-bindings: arm: sunxi: Add NetCube Systems Nagami SoM and carrier
>     board bindings
>   riscv: dts: allwinner: d1s-t113: Add pinctrl's required by NetCube
>     Systems Nagami SoM
>   ARM: dts: sunxi: add support for NetCube Systems Nagami SoM
>   ARM: dts: sunxi: add support for NetCube Systems Nagami Basic Carrier
>   ARM: dts: sunxi: add support for NetCube Systems Nagami Keypad Carrier
>
>  .../devicetree/bindings/arm/sunxi.yaml | 8 +
>  arch/arm/boot/dts/allwinner/Makefile | 3 +
>  ...n8i-t113s-netcube-nagami-basic-carrier.dts | 67 +++++
>  ...8i-t113s-netcube-nagami-keypad-carrier.dts | 129 +++++++++
>  .../allwinner/sun8i-t113s-netcube-nagami.dtsi | 250 ++++++++++++++++++
>  .../boot/dts/allwinner/sunxi-d1s-t113.dtsi | 48 ++++
>  6 files changed, 505 insertions(+)
>  create mode 100644 arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami-basic-carrier.dts
>  create mode 100644 arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami-keypad-carrier.dts
>  create mode 100644 arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami.dtsi
>
> --
> 2.39.5
>
>
>

My bot found new DTB warnings on the .dts files added or changed in this
series.

Some warnings may be from an existing SoC .dtsi. Or perhaps the warnings
are fixed by another series. Ultimately, it is up to the platform
maintainer whether these warnings are acceptable or not. No need to reply
unless the platform maintainer has comments.

If you already ran DT checks and didn't see these error(s), then
make sure dt-schema is up to date:

  pip3 install dtschema --upgrade

This patch series was applied (using b4) to base:
 Base: attempting to guess base-commit...
 Base: tags/next-20250829 (best guess, 2/3 blobs matched)

If this is not the correct base, please add 'base-commit' tag
(or use b4 which does this automatically)

New warnings running 'make CHECK_DTBS=y for arch/arm/boot/dts/allwinner/' for 20250831162536.2380589-1-lukas.schmid@netcube.li:

arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami-keypad-carrier.dtb: /soc/i2c@2502800/keypad@34: failed to match any schema with compatible: ['ti,tca8418']

From dlan at gentoo.org Tue Sep 2 06:44:17 2025
From: dlan at gentoo.org (Yixun Lan)
Date: Tue, 2 Sep 2025 21:44:17 +0800
Subject: [PATCH v2] riscv: dts: spacemit: uart: remove sec_uart1 device node
In-Reply-To: <680534b4-27e7-4506-885a-1c3dc9d12b8b@iscas.ac.cn>
References: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org>
 <680534b4-27e7-4506-885a-1c3dc9d12b8b@iscas.ac.cn>
Message-ID: <20250902134417-GYA1155728@gentoo.org>

Hi Vivian,

On 20:55 Tue 02 Sep , Vivian Wang wrote:
>
> On 9/2/25 20:26, Yixun Lan wrote:
> > [...]
> >
> > diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi
> > index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..3094f75ed13badfc3db333be2b3195c61f57fddf 100644
> > --- a/arch/riscv/boot/dts/spacemit/k1.dtsi
> > +++ b/arch/riscv/boot/dts/spacemit/k1.dtsi
> > @@ -777,16 +777,7 @@ uart9: serial@d4017800 {
> >  				status = "disabled";
> >  			};
> >
> > -			sec_uart1: serial@f0612000 {
> > -				compatible = "spacemit,k1-uart",
> > -					     "intel,xscale-uart";
> > -				reg = <0x0 0xf0612000 0x0 0x100>;
> > -				interrupts = <43>;
> > -				clock-frequency = <14857000>;
> > -				reg-shift = <2>;
> > -				reg-io-width = <4>;
> > -				status = "reserved"; /* for TEE usage */
> > -			};
> > +			/* sec_uart1: 0xf0612000, not available from Linux */
>
> I know this is going back and forth a lot but I don't think that's a
> good description of what's going on.
>
> My preference is that we just drop this node altogether, just forgetting
> that this thing even exists. But if you do think we want to keep the
yes, removing the comment and completely dropping it is an option..

> information we can drop the clock-frequency property too and change its
> status to something like:
>
>     status = "disabled"; /* No clock defined */
>
> Which also silences the warning - disabled nodes are allowed to be
> incomplete.
no, setting it to 'disabled' is simply wrong, it doesn't reflect the meaning
of "unavailable", I remember we rejected this before introducing
the 'sec_uart1' node in the first place

>
> My personal opinion is that I think sec_uart1 and TEE support feels too
> theoretical to be worth caring about.
>
> Vivian "dramforever" Wang
>

--
Yixun Lan (dlan)

From krzk at kernel.org Tue Sep 2 06:44:48 2025
From: krzk at kernel.org (Krzysztof Kozlowski)
Date: Tue, 2 Sep 2025 15:44:48 +0200
Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
In-Reply-To:
References: <20250901042320.22865-1-ziyao@disroot.org>
 <20250901042320.22865-2-ziyao@disroot.org>
 <20250902-peach-jackal-of-judgment-8aee13@kuoka>
Message-ID: <75cafd7e-02a5-41d1-9daf-24bef20dab82@kernel.org>

On 02/09/2025 11:04, Yao Zi wrote:
> On Tue, Sep 02, 2025 at 10:27:53AM +0200, Krzysztof Kozlowski wrote:
>> On Mon, Sep 01, 2025 at 04:23:17AM +0000, Yao Zi wrote:
>>> +/* VO Subsystem */
>>>  #define TH1520_RESET_ID_GPU 0
>>>  #define TH1520_RESET_ID_GPU_CLKGEN 1
>>> -#define TH1520_RESET_ID_NPU 2
>>> -#define TH1520_RESET_ID_WDT0 3
>>> -#define TH1520_RESET_ID_WDT1 4
>>
>> This is ABI break and deserves explanation and its own patchset.
>
> The registers in control of TH1520_RESET_ID_{NPU,WDT0,WDT1} don't belong
> to the VO reset controller (documented as "thead,th1520-reset"), and
> thus cannot be implemented by it. They're in fact AP subsystem resets,
> which gets supported in Linux with this series.
>
> Is it okay for you to separate a patch to delete these wrong IDs and add
> them back for the AP reset controller later? Anyway, I should have
> provided more information about these three resets. Thanks for catching
> this.

So feels like separate patch dropping these resets with above explanation.

Best regards,
Krzysztof

From krzk at kernel.org Tue Sep 2 06:47:56 2025
From: krzk at kernel.org (Krzysztof Kozlowski)
Date: Tue, 2 Sep 2025 15:47:56 +0200
Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit
In-Reply-To: <20250902-affair-scrambler-2771df16372e@spud>
References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com>
 <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com>
 <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org>
 <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org>
 <20250902-affair-scrambler-2771df16372e@spud>
Message-ID: <677aad27-66b9-4c4f-8fbe-6b9aabcd375a@kernel.org>

On 02/09/2025 10:31, Conor Dooley wrote:
> On Tue, Sep 02, 2025 at 08:22:02AM +0200, Krzysztof Kozlowski wrote:
>
>>>>> +	refclk_ccc: cccrefclk {
>>>>
>>>> Please use name for all fixed clocks which matches current format
>>>> recommendation: 'clock-' (see also the pattern in the binding for
>>>> any other options).
>>>>
>>>> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml
>>> The fabric dtsi describes elements configured by the FPGA bitstream.
>>> This node is named as such because the Clock Conditioner Circuit CCC's
>>> reference clock source is set by the FPGA bitstream, while its frequency
>>> is determined by an on-board oscillator.
>>>
>>> Hope this clarifies the rationale behind the node name.
>> No, because there is no style naming clocks like this. Neither proper
>> suffix, nor prefix. Use standard naming.
>
> So you want all fixed frequency clocks to be named "clk-foo" when
> "clk-" is not suitable? Fine if you do, but I didn't realise that
> it was required and haven't been keeping an eye out for it.

Recommended is to just use consistent suffixes or prefixes. Binding asks
for "clock-" so that's what I propose to use here.

Best regards,
Krzysztof

From p.zabel at pengutronix.de Tue Sep 2 06:57:07 2025
From: p.zabel at pengutronix.de (Philipp Zabel)
Date: Tue, 02 Sep 2025 15:57:07 +0200
Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
In-Reply-To: <20250901042320.22865-2-ziyao@disroot.org>
References: <20250901042320.22865-1-ziyao@disroot.org>
 <20250901042320.22865-2-ziyao@disroot.org>
Message-ID: <8de5763d7104a552f32196f04363d071548b7bba.camel@pengutronix.de>

On Mo, 2025-09-01 at 04:23 +0000, Yao Zi wrote:
> TH1520 SoC is divided into several subsystems, most of them have
> distinct reset controllers. Let's document reset controllers other than
> the one for VO subsystem and IDs for their reset signals.
>
> Signed-off-by: Yao Zi
> ---
>  .../bindings/reset/thead,th1520-reset.yaml | 8 +-
>  .../dt-bindings/reset/thead,th1520-reset.h | 219 +++++++++++++++++-
>  2 files changed, 223 insertions(+), 4 deletions(-)
>
[...]
> diff --git a/include/dt-bindings/reset/thead,th1520-reset.h b/include/dt-bindings/reset/thead,th1520-reset.h
> index ee799286c175..927e251edfab 100644
> --- a/include/dt-bindings/reset/thead,th1520-reset.h
> +++ b/include/dt-bindings/reset/thead,th1520-reset.h
> @@ -7,11 +7,186 @@
[...]
> +/* AP Subsystem */
[...]
> +#define TH1520_RESET_ID_C910_C0 5
> +#define TH1520_RESET_ID_C910_C1 5
> +#define TH1520_RESET_ID_C910_C2 5
> +#define TH1520_RESET_ID_C910_C3 5

As the kernel test robot already noticed, this doesn't seem right.
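
To make the collision concrete, here is a minimal sketch (not code from the patch) of what happens when several IDs share one value and are used as indices into a map table. All `demo_*` and `DEMO_*` names, and the register offsets used with them, are hypothetical; only the shared value 5 is taken from the quoted header. Each store to `table[id]` replaces whatever was there, so only the last of the four mappings survives, which is presumably the situation sparse reports as "Initializer entry defined twice".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the four C910 core IDs quoted above.
 * All four share the value 5, as in the posted header.
 */
enum {
	DEMO_ID_C910_C0 = 5,
	DEMO_ID_C910_C1 = 5,
	DEMO_ID_C910_C2 = 5,
	DEMO_ID_C910_C3 = 5,
	DEMO_NR_IDS		/* one past the highest ID, i.e. 6 */
};

struct demo_reset_map {
	unsigned int bit;	/* bit mask inside the register, non-zero when set */
	unsigned int reg;	/* register offset, made up for this demo */
};

struct demo_entry {
	unsigned int id;
	struct demo_reset_map map;
};

/*
 * Store every entry at table[id], the way an ID-indexed initializer
 * would. Returns how many stores landed on an already populated slot.
 */
static size_t demo_fill(struct demo_reset_map *table, size_t nr_ids,
			const struct demo_entry *entries, size_t nr_entries)
{
	size_t clobbered = 0;
	size_t i;

	assert(table != NULL && entries != NULL);

	for (i = 0; i < nr_entries; i++) {
		unsigned int id = entries[i].id;

		if (id >= nr_ids)
			continue;	/* out of range, ignore */
		if (table[id].bit != 0)
			clobbered++;	/* an earlier mapping is lost */
		table[id] = entries[i].map;
	}
	return clobbered;
}
```

Fed the four C0..C3 entries with bits 1 to 4, this reports three overwritten stores and leaves only the C3 mapping at index 5. With distinct ID values the count would be zero.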

regards
Philipp

From p.zabel at pengutronix.de Tue Sep 2 06:57:02 2025
From: p.zabel at pengutronix.de (Philipp Zabel)
Date: Tue, 02 Sep 2025 15:57:02 +0200
Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
In-Reply-To: <75cafd7e-02a5-41d1-9daf-24bef20dab82@kernel.org>
References: <20250901042320.22865-1-ziyao@disroot.org>
 <20250901042320.22865-2-ziyao@disroot.org>
 <20250902-peach-jackal-of-judgment-8aee13@kuoka>
 <75cafd7e-02a5-41d1-9daf-24bef20dab82@kernel.org>
Message-ID: <705de60088f72c1ed575d92e8c4f4b90989385c5.camel@pengutronix.de>

On Di, 2025-09-02 at 15:44 +0200, Krzysztof Kozlowski wrote:
> On 02/09/2025 11:04, Yao Zi wrote:
> > On Tue, Sep 02, 2025 at 10:27:53AM +0200, Krzysztof Kozlowski wrote:
> > > On Mon, Sep 01, 2025 at 04:23:17AM +0000, Yao Zi wrote:
> > > > +/* VO Subsystem */
> > > >  #define TH1520_RESET_ID_GPU 0
> > > >  #define TH1520_RESET_ID_GPU_CLKGEN 1
> > > > -#define TH1520_RESET_ID_NPU 2
> > > > -#define TH1520_RESET_ID_WDT0 3
> > > > -#define TH1520_RESET_ID_WDT1 4
> > >
> > > This is ABI break and deserves explanation and its own patchset.
> >
> > The registers in control of TH1520_RESET_ID_{NPU,WDT0,WDT1} don't belong
> > to the VO reset controller (documented as "thead,th1520-reset"), and
> > thus cannot be implemented by it. They're in fact AP subsystem resets,
> > which gets supported in Linux with this series.
> >
> > Is it okay for you to separate a patch to delete these wrong IDs and add
> > them back for the AP reset controller later? Anyway, I should have
> > provided more information about these three resets. Thanks for catching
> > this.
>
> So feels like separate patch dropping these resets with above explanation.
They happen to be reintroduced with exactly the same values, just for the AP subsystem reset controller: +/* AP Subsystem */ +#define TH1520_RESET_ID_BROM 0 +#define TH1520_RESET_ID_C910_TOP 1 +#define TH1520_RESET_ID_NPU 2 +#define TH1520_RESET_ID_WDT0 3 +#define TH1520_RESET_ID_WDT1 4 [...] +/* VO Subsystem */ #define TH1520_RESET_ID_GPU 0 #define TH1520_RESET_ID_GPU_CLKGEN 1 -#define TH1520_RESET_ID_NPU 2 -#define TH1520_RESET_ID_WDT0 3 -#define TH1520_RESET_ID_WDT1 4 regards Philipp From gabriel.fernandez at foss.st.com Tue Sep 2 07:05:21 2025 From: gabriel.fernandez at foss.st.com (Gabriel FERNANDEZ) Date: Tue, 2 Sep 2025 16:05:21 +0200 Subject: [PATCH v4 8/9] clk: divider, gate: create regmap-backed copies of gate and divider clocks In-Reply-To: <20250901-yearling-reconcile-99d06fe7868e@spud> References: <20250901-rigid-sacrifice-0039c6e6234e@spud> <20250901-yearling-reconcile-99d06fe7868e@spud> Message-ID: <9b669562-ee52-47b6-856e-3184b3e89d28@foss.st.com> On 9/1/25 13:04, Conor Dooley wrote: > From: Conor Dooley > > Implement regmap-backed copies of gate and divider clocks by replacing > the iomem pointer to the clock registers with a regmap and offset > within. > > Signed-off-by: Conor Dooley > --- > v4: > - increase map_offset to a u32 > - use a single Kconfig option for both divider and gate regmap > implementations > --- > drivers/clk/Kconfig | 4 + > drivers/clk/Makefile | 2 + > drivers/clk/clk-divider-regmap.c | 271 +++++++++++++++++++++++++++++++ > drivers/clk/clk-gate-regmap.c | 254 +++++++++++++++++++++++++++++ > include/linux/clk-provider.h | 119 ++++++++++++++ > 5 files changed, 650 insertions(+) > create mode 100644 drivers/clk/clk-divider-regmap.c > create mode 100644 drivers/clk/clk-gate-regmap.c > Hi Conor, i tested the clk_gate_remap part with my code, it works fine. Just a? 
minor remark concerning .round_rate; you can add my Reviewed-by: Gabriel Fernandez > +const struct clk_ops clk_divider_regmap_ops = { > + .recalc_rate = clk_divider_regmap_recalc_rate, > + .round_rate = clk_divider_regmap_round_rate, .round_rate could be removed? > + .determine_rate = clk_divider_regmap_determine_rate, > + .set_rate = clk_divider_regmap_set_rate, > +}; > +EXPORT_SYMBOL_GPL(clk_divider_regmap_ops); > + > +const struct clk_ops clk_divider_regmap_ro_ops = { > + .recalc_rate = clk_divider_regmap_recalc_rate, > + .round_rate = clk_divider_regmap_round_rate, ditto > + .determine_rate = clk_divider_regmap_determine_rate, > +}; > +EXPORT_SYMBOL_GPL(clk_divider_regmap_ro_ops); > + From maud_spierings at hotmail.com Tue Sep 2 07:15:08 2025 From: maud_spierings at hotmail.com (Maud Spierings) Date: Tue, 2 Sep 2025 16:15:08 +0200 Subject: [PATCH 3/4] reset: th1520: Support reset controllers in more subsystems In-Reply-To: <20250901042320.22865-4-ziyao@disroot.org> References: <20250901042320.22865-4-ziyao@disroot.org> Message-ID: Hi Yao, > Introduce reset controllers for AP, MISC, VI, VP and DSP subsystems and > add their reset signal mappings. > > Signed-off-by: Yao Zi > --- /* snip */ > static const struct of_device_id th1520_reset_match[] = { > + { .compatible = "thead,th1520-reset-ap", .data = &th1520_ap_reset_data }, > + { .compatible = "thead,th1520-reset-misc", .data = &th1520_misc_reset_data }, > + { .compatible = "thead,th1520-reset-vi", .data = &th1520_vi_reset_data }, > { .compatible = "thead,th1520-reset", .data = &th1520_reset_data }, > + { .compatible = "thead,th1520-reset-vp", .data = &th1520_vp_reset_data }, > + { .compatible = "thead,th1520-reset-dsp", .data = &th1520_dsp_reset_data }, I believe these should be sorted alphabetically by compatible name.
> { /* sentinel */ } > }; > MODULE_DEVICE_TABLE(of, th1520_reset_match); > -- > 2.50.1 Kind regards, Maud From ajones at ventanamicro.com Tue Sep 2 08:36:10 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Tue, 2 Sep 2025 10:36:10 -0500 Subject: [PATCH v3 0/3] KVM: riscv: selftests: Enable supported test cases In-Reply-To: References: Message-ID: <20250902-9cc0d0dad59ba680062dbbf8@orel> On Mon, Sep 01, 2025 at 03:35:48PM +0800, dayss1224 at gmail.com wrote: > From: Dong Yang > > Add supported KVM test cases and fix the compilation dependencies. > --- > Changes in v3: > - Reorder patches to fix build dependencies > - Sort common supported test cases alphabetically > - Move ucall_common.h include from common header to specific source files > > Changes in v2: > - Delete some repeat KVM test cases on riscv > - Add missing headers to fix the build for new RISC-V KVM selftests > > Dong Yang (1): > KVM: riscv: selftests: Add missing headers for new testcases > > Quan Zhou (2): > KVM: riscv: selftests: Use the existing RISCV_FENCE macro in > `rseq-riscv.h` > KVM: riscv: selftests: Add common supported test cases > > tools/testing/selftests/kvm/Makefile.kvm | 6 ++++++ > tools/testing/selftests/kvm/access_tracking_perf_test.c | 1 + > tools/testing/selftests/kvm/include/riscv/processor.h | 1 + > .../selftests/kvm/memslot_modification_stress_test.c | 1 + > tools/testing/selftests/kvm/memslot_perf_test.c | 1 + > tools/testing/selftests/rseq/rseq-riscv.h | 3 +-- > 6 files changed, 11 insertions(+), 2 deletions(-) > > -- > 2.34.1 In the future please CC previous reviewers on the entire series (particularly when they have reviewed the entire previous series). 
For the series, Reviewed-by: Andrew Jones From uwu at icenowy.me Tue Sep 2 08:40:02 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Tue, 02 Sep 2025 23:40:02 +0800 Subject: [PATCH 6/7] drm/etnaviv: add shared context support for iommuv2 In-Reply-To: <20250816074757.2559055-7-uwu@icenowy.me> References: <20250816074757.2559055-1-uwu@icenowy.me> <20250816074757.2559055-7-uwu@icenowy.me> Message-ID: <05ef7c0df0a2235277030b9e33f34082e8938faa.camel@icenowy.me> On Sat, 2025-08-16 at 15:47 +0800, Icenowy Zheng wrote: > Unfortunately the GC620 GPU seems to have broken PTA capability, and > switching page table ID in command stream after it's running won't > work. > As directly switching mtlb isn't working either, there will be no > reliable way to switch page table in the command stream, and a shared > context, like iommuv1, is needed. > > Add support for this shared context situation. Shared context is set > when the broken PTA is known, and the context allocation code will be > made to short-circuit when a shared context is set. > > Signed-off-by: Icenowy Zheng > --- > drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c | 8 ++++++++ > drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 1 + > drivers/gpu/drm/etnaviv/etnaviv_mmu.h | 2 ++ > 3 files changed, 11 insertions(+) > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c > b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c > index 5654a604c70cf..960ba3d659dc5 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c Well, I forgot to clear shared_context in etnaviv_iommuv2_free() when the shared context is torn down... > @@ -273,6 +273,12 @@ etnaviv_iommuv2_context_alloc(struct > etnaviv_iommu_global *global) > struct etnaviv_iommu_context *context; >
> mutex_lock(&global->lock); > + if (global->shared_context) { > + context = global->shared_context; > + etnaviv_iommu_context_get(context); > + mutex_unlock(&global->lock); > + return context; > + } > > v2_context = vzalloc(sizeof(*v2_context)); > if (!v2_context) > @@ -301,6 +307,8 @@ etnaviv_iommuv2_context_alloc(struct > etnaviv_iommu_global *global) > mutex_init(&context->lock); > INIT_LIST_HEAD(&context->mappings); > drm_mm_init(&context->mm, SZ_4K, (u64)SZ_1G * 4 - SZ_4K); > + if (global->v2.broken_pta) > + global->shared_context = context; > > mutex_unlock(&global->lock); > return context; > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c > b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c > index df5192083b201..a0f9c950504e0 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c > @@ -504,6 +504,7 @@ int etnaviv_iommu_global_init(struct etnaviv_gpu > *gpu) > memset32(global->bad_page_cpu, 0xdead55aa, SZ_4K / > sizeof(u32)); > > if (version == ETNAVIV_IOMMU_V2) { > + global->v2.broken_pta = gpu->identity.model == > chipModel_GC620; > global->v2.pta_cpu = dma_alloc_wc(dev, > ETNAVIV_PTA_SIZE, >
&global->v2.pta_dma, > GFP_KERNEL); > if (!global->v2.pta_cpu) > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h > b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h > index 2ec4acda02bc6..5627d2a0d0237 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h > +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h > @@ -55,6 +55,8 @@ struct etnaviv_iommu_global { > u64 *pta_cpu; > dma_addr_t pta_dma; > DECLARE_BITMAP(pta_alloc, ETNAVIV_PTA_ENTRIES); > + /* Whether runtime switching page table ID will fail > */ > + bool broken_pta; > } v2; > }; > From tglx at linutronix.de Tue Sep 2 08:41:37 2025 From: tglx at linutronix.de (Thomas Gleixner) Date: Tue, 02 Sep 2025 17:41:37 +0200 Subject: [PATCH v2 4/7] entry/kvm: KVM: Move KVM details related to signal/-EINTR into KVM proper In-Reply-To: <20250828000156.23389-5-seanjc@google.com> References: <20250828000156.23389-1-seanjc@google.com> <20250828000156.23389-5-seanjc@google.com> Message-ID: <87wm6gzwsu.ffs@tglx> On Wed, Aug 27 2025 at 17:01, Sean Christopherson wrote: > Move KVM's morphing of pending signals into userspace exits into KVM > proper, and drop the @vcpu param from xfer_to_guest_mode_handle_work(). > How KVM responds to -EINTR is a detail that really belongs in KVM itself, > and invoking kvm_handle_signal_exit() from kernel code creates an inverted > module dependency. E.g. attempting to move kvm_handle_signal_exit() into > kvm_main.c would generate a linker error when building kvm.ko as a module. > > Dropping KVM details will also allow converting the KVM "entry" code into a more > generic virtualization framework so that it can be used when running as a > Hyper-V root partition.
> > Lastly, eliminating usage of "struct kvm_vcpu" outside of KVM is also nice > to have for KVM x86 developers, as keeping the details of kvm_vcpu purely > within KVM allows changing the layout of the structure without having to > boot into a new kernel, e.g. allows rebuilding and reloading kvm.ko with a > modified kvm_vcpu structure as part of debug/development. > > Signed-off-by: Sean Christopherson Reviewed-by: Thomas Gleixner From tglx at linutronix.de Tue Sep 2 08:41:57 2025 From: tglx at linutronix.de (Thomas Gleixner) Date: Tue, 02 Sep 2025 17:41:57 +0200 Subject: [PATCH v2 5/7] entry: Rename "kvm" entry code assets to "virt" to genericize APIs In-Reply-To: <20250828000156.23389-6-seanjc@google.com> References: <20250828000156.23389-1-seanjc@google.com> <20250828000156.23389-6-seanjc@google.com> Message-ID: <87tt1kzwsa.ffs@tglx> On Wed, Aug 27 2025 at 17:01, Sean Christopherson wrote: > Rename the "kvm" entry code files and Kconfigs to use generic "virt" > nomenclature so that the code can be reused by other hypervisors (or > rather, their root/dom0 partition drivers), without incorrectly suggesting > the code somehow relies on and/or involves KVM. > > No functional change intended. 
> > Signed-off-by: Sean Christopherson Reviewed-by: Thomas Gleixner From krzk at kernel.org Tue Sep 2 08:43:43 2025 From: krzk at kernel.org (Krzysztof Kozlowski) Date: Tue, 2 Sep 2025 17:43:43 +0200 Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys In-Reply-To: <705de60088f72c1ed575d92e8c4f4b90989385c5.camel@pengutronix.de> References: <20250901042320.22865-1-ziyao@disroot.org> <20250901042320.22865-2-ziyao@disroot.org> <20250902-peach-jackal-of-judgment-8aee13@kuoka> <75cafd7e-02a5-41d1-9daf-24bef20dab82@kernel.org> <705de60088f72c1ed575d92e8c4f4b90989385c5.camel@pengutronix.de> Message-ID: <8d538aca-a90f-4710-b697-0d7de65bfc4f@kernel.org> On 02/09/2025 15:57, Philipp Zabel wrote: > On Di, 2025-09-02 at 15:44 +0200, Krzysztof Kozlowski wrote: >> On 02/09/2025 11:04, Yao Zi wrote: >>> On Tue, Sep 02, 2025 at 10:27:53AM +0200, Krzysztof Kozlowski wrote: >>>> On Mon, Sep 01, 2025 at 04:23:17AM +0000, Yao Zi wrote: >>>>> +/* VO Subsystem */ >>>>> #define TH1520_RESET_ID_GPU 0 >>>>> #define TH1520_RESET_ID_GPU_CLKGEN 1 >>>>> -#define TH1520_RESET_ID_NPU 2 >>>>> -#define TH1520_RESET_ID_WDT0 3 >>>>> -#define TH1520_RESET_ID_WDT1 4 >>>> >>>> This is ABI break and deserves explanation and its own patchset. >>> >>> The registers in control of TH1520_RESET_ID_{NPU,WDT0,WDT1} don't belong >>> to the VO reset controller (documented as "thead,th1520-reset"), and >>> thus cannot be implemented by it. They're in fact AP subsystem resets, >>> which gets supported in Linux with this series. >>> >>> Is it okay for you to separate a patch to delete these wrong IDs and add >>> them back for the AP reset controller latter? Anyway, I should have >>> provided more information about these three resets. Thanks for catching >>> this. >> >> So feels like separate patch dropping these resets with above explanation. 
> > They happen to be reintroduced with exactly the same values, just for > the AP subsystem reset controller: Yes, I noticed, but that's different reset controller, so previous ABI for that controller stops working. Anyway my comment still stays. Best regards, Krzysztof From conor at kernel.org Tue Sep 2 09:38:44 2025 From: conor at kernel.org (Conor Dooley) Date: Tue, 2 Sep 2025 17:38:44 +0100 Subject: [PATCH v1 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <677aad27-66b9-4c4f-8fbe-6b9aabcd375a@kernel.org> References: <20250825161952.3902672-1-valentina.fernandezalanis@microchip.com> <20250825161952.3902672-6-valentina.fernandezalanis@microchip.com> <2b1eb8fd-2a64-4745-ad93-abc53d240b69@kernel.org> <0d90eeb4-e6ac-459c-a6b1-26368f102e0e@kernel.org> <20250902-affair-scrambler-2771df16372e@spud> <677aad27-66b9-4c4f-8fbe-6b9aabcd375a@kernel.org> Message-ID: <20250902-crucial-hankering-193be936a139@spud> On Tue, Sep 02, 2025 at 03:47:56PM +0200, Krzysztof Kozlowski wrote: > On 02/09/2025 10:31, Conor Dooley wrote: > > On Tue, Sep 02, 2025 at 08:22:02AM +0200, Krzysztof Kozlowski wrote: > > > >>>>> + refclk_ccc: cccrefclk { > >>>> > >>>> Please use name for all fixed clocks which matches current format > >>>> recommendation: 'clock-' (see also the pattern in the binding for > >>>> any other options). > >>>> > >>>> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/clock/fixed-clock.yaml > >>> The fabric dtsi describes elements configured by the FPGA bitstream. > >>> This node is named as such because the Clock Conditioner Circuit CCC's > >>> reference clock source is set by the FPGA bitstream, while its frequency > >>> is determined by an on-board oscillator. > >>> > >>> Hope this clarifies the rationale behind the node name. > >> No, because there is no style naming clocks like this. Neither proper > >> suffix, nor prefix. Use standard naming. 
> > > > So you want all fixed frequency clocks to be named "clk-foo" when > > "clk-" is not suitable? Fine if you do, but I didn't realise that > > it was required and haven't been keeping an eye out for it. > > Recommended is to just use consistent suffixes or prefixes. Binding asks > for "clock-" so that's what I propose to use here. Okay, I'll keep that in mind. From parri.andrea at gmail.com Tue Sep 2 09:59:15 2025 From: parri.andrea at gmail.com (Andrea Parri) Date: Tue, 2 Sep 2025 18:59:15 +0200 Subject: [PATCH v2 0/4] riscv: Add Zalasr ISA extension support In-Reply-To: <20250902042432.78960-1-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: > Xu Lu (4): > riscv: add ISA extension parsing for Zalasr > dt-bindings: riscv: Add Zalasr ISA extension description > riscv: Instroduce Zalasr instructions > riscv: Use Zalasr for smp_load_acquire/smp_store_release Informally put, our (Linux) memory consistency model specifies that any sequence spin_unlock(s); spin_lock(t); behaves "as if it provides at least FENCE.TSO ordering between operations which precede the UNLOCK+LOCK sequence and operations which follow the sequence". Unless I am missing something, the patch set in question breaks this ordering property (on RISC-V): for example, a "release" annotation, .RL (as in spin_unlock() -> smp_store_release(), after patch #4) paired with an "acquire" fence, FENCE R,RW (as could be found in spin_lock() -> atomic_try_cmpxchg_acquire()) does not provide the specified property.
I _think_ some solutions to the issue above include: a) make sure an .RL annotation is always paired with an .AQ annotation and, vice versa, an .AQ annotation is paired with an .RL annotation (this approach matches the current arm64 approach/implementation); b) in the opposite direction, always pair FENCE R,RW (or occasionally FENCE R,R) with FENCE RW,W (this matches the current approach/the current implementation within riscv); c) mix the previous two solutions (resp., annotations and fences), but make sure to "upgrade" any releases to provide (insert) a FENCE.TSO. (a) would align RISC-V and ARM64 (which is a good thing IMO), though it is probably the most invasive approach among the three approaches above (requiring certain changes to arch/riscv/include/asm/{cmpxchg,atomic}.h, which are already relatively messy due to the various ZABHA plus ZACAS switches). Overall, I'm not too excited at the idea of reviewing any of those changes, but if the community opts for it, I'll almost definitely take a closer look with due calm. ;-) Andrea From dianders at chromium.org Tue Sep 2 10:03:55 2025 From: dianders at chromium.org (Doug Anderson) Date: Tue, 2 Sep 2025 10:03:55 -0700 Subject: [External] Re: [PATCH 1/2] watchdog: refactor watchdog_hld functionality In-Reply-To: References: <20250827100959.83023-1-cuiyunhui@bytedance.com> <20250827100959.83023-2-cuiyunhui@bytedance.com> Message-ID: Hi, On Sun, Aug 31, 2025 at 10:57 PM yunhui cui wrote: > > Hi Doug, > > On Sat, Aug 30, 2025 at 5:34 AM Doug Anderson wrote: > > > > Hi, > > > > On Wed, Aug 27, 2025 at 3:10 AM Yunhui Cui wrote: > > > > > > Move watchdog_hld.c to kernel/, and rename arm_pmu_irq_is_nmi() > > > to arch_pmu_irq_is_nmi() for cross-arch reusability.
> > > > > > Signed-off-by: Yunhui Cui > > > --- > > > arch/arm64/kernel/Makefile | 1 - > > > drivers/perf/arm_pmu.c | 2 +- > > > include/linux/nmi.h | 1 + > > > include/linux/perf/arm_pmu.h | 2 -- > > > kernel/Makefile | 2 +- > > > {arch/arm64/kernel => kernel}/watchdog_hld.c | 8 ++++++-- > > > 6 files changed, 9 insertions(+), 7 deletions(-) > > > rename {arch/arm64/kernel => kernel}/watchdog_hld.c (97%) > > > > I'm not a huge fan of the perf hardlockup detector and IMO we should > > maybe just delete it. Thus spreading it to support a new architecture > > isn't my favorite thing to do. Can't you use the buddy hardlockup > > detector? > > Why is there a plan to remove CONFIG_HARDLOCKUP_DETECTOR_PERF? Could > you explain the specific reasons? Is the community's future plan to > favor CONFIG_HARDLOCKUP_DETECTOR_BUDDY? I don't think there are any concrete plans, but there was some discussion here: https://lore.kernel.org/all/CAD=FV=WWUiCi6bZCs_gseFpDDWNkuJMoL6XCftEo6W7q6jRCkg at mail.gmail.com/ -Doug From conor at kernel.org Tue Sep 2 12:46:40 2025 From: conor at kernel.org (Conor Dooley) Date: Tue, 2 Sep 2025 20:46:40 +0100 Subject: [PATCH v2 2/4] dt-bindings: riscv: Add Zalasr ISA extension description In-Reply-To: <20250902042432.78960-3-luxu.kernel@bytedance.com> References: <20250902042432.78960-1-luxu.kernel@bytedance.com> <20250902042432.78960-3-luxu.kernel@bytedance.com> Message-ID: <20250902-embattled-pandemic-254a71360f10@spud> On Tue, Sep 02, 2025 at 12:24:30PM +0800, Xu Lu wrote: > Add description for the Zalasr ISA extension > > Signed-off-by: Xu Lu Acked-by: Conor Dooley
From unicorn_wang at outlook.com Tue Sep 2 17:37:12 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Wed, 3 Sep 2025 08:37:12 +0800 Subject: [PATCH v2 0/3] irqchip/sg2042-msi: Set irq type according to DT configuration In-Reply-To: <87seh50zi0.ffs@tglx> References: <87seh50zi0.ffs@tglx> Message-ID: On 9/2/2025 9:12 PM, Thomas Gleixner wrote: > On Tue, Sep 02 2025 at 07:59, Chen Wang wrote: > > Please don't top-post and trim your replies. > >> P.S. Since the modification of the DTS part is closely dependent on the >> modification of the driver part, I am not sure whether you are willing >> to pick these three patches together, or just pick the driver part and >> leave the DTS part to me? > They need to go together in the right order obviously. OK, I'll adjust the order of the patches and repost a new version. Thanks, Chen From ziyao at disroot.org Tue Sep 2 17:41:57 2025 From: ziyao at disroot.org (Yao Zi) Date: Wed, 3 Sep 2025 00:41:57 +0000 Subject: [PATCH 1/4] dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys In-Reply-To: <8de5763d7104a552f32196f04363d071548b7bba.camel@pengutronix.de> References: <20250901042320.22865-1-ziyao@disroot.org> <20250901042320.22865-2-ziyao@disroot.org> <8de5763d7104a552f32196f04363d071548b7bba.camel@pengutronix.de> Message-ID: On Tue, Sep 02, 2025 at 03:57:07PM +0200, Philipp Zabel wrote: > On Mo, 2025-09-01 at 04:23 +0000, Yao Zi wrote: > > TH1520 SoC is divided into several subsystems, most of them have > > distinct reset controllers. Let's document reset controllers other than > > the one for VO subsystem and IDs for their reset signals. > > > > Signed-off-by: Yao Zi > > --- > > .../bindings/reset/thead,th1520-reset.yaml | 8 +- > > .../dt-bindings/reset/thead,th1520-reset.h | 219 +++++++++++++++++- > > 2 files changed, 223 insertions(+), 4 deletions(-) > > > [...]
> > diff --git a/include/dt-bindings/reset/thead,th1520-reset.h b/include/dt-bindings/reset/thead,th1520-reset.h > > index ee799286c175..927e251edfab 100644 > > --- a/include/dt-bindings/reset/thead,th1520-reset.h > > +++ b/include/dt-bindings/reset/thead,th1520-reset.h > > @@ -7,11 +7,186 @@ > [...] > > +/* AP Subsystem */ > [...] > > +#define TH1520_RESET_ID_C910_C0 5 > > +#define TH1520_RESET_ID_C910_C1 5 > > +#define TH1520_RESET_ID_C910_C2 5 > > +#define TH1520_RESET_ID_C910_C3 5 > > As the kernel test robot already noticed, this doesn't seem right. Yes, this is a copy-paste error. I'll fix it and run static check before sending v2. Thanks. > regards > Philipp Best regards, Yao Zi From ziyao at disroot.org Tue Sep 2 17:44:25 2025 From: ziyao at disroot.org (Yao Zi) Date: Wed, 3 Sep 2025 00:44:25 +0000 Subject: [PATCH 3/4] reset: th1520: Support reset controllers in more subsystems In-Reply-To: References: <20250901042320.22865-4-ziyao@disroot.org> Message-ID: On Tue, Sep 02, 2025 at 04:15:08PM +0200, Maud Spierings wrote: > Hi Yao, > > > Introduce reset controllers for AP, MISC, VI, VP and DSP subsystems and > > add their reset signal mappings. > > > > Signed-off-by: Yao Zi > > --- > > /* snip */ > > > static const struct of_device_id th1520_reset_match[] = { > > + { .compatible = "thead,th1520-reset-ap", .data = &th1520_ap_reset_data }, > > + { .compatible = "thead,th1520-reset-misc", .data = &th1520_misc_reset_data }, > > + { .compatible = "thead,th1520-reset-vi", .data = &th1520_vi_reset_data }, > > { .compatible = "thead,th1520-reset", .data = &th1520_reset_data }, > > + { .compatible = "thead,th1520-reset-vp", .data = &th1520_vp_reset_data }, > > + { .compatible = "thead,th1520-reset-dsp", .data = &th1520_dsp_reset_data }, > > I believe these should be alphabetically sorted on compatible name. This is sorted according to the order they appear in the TRM, but yeah sorting them alphabetically makes more sense. I'll do this in v2. 
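For illustration only, the alphabetical order discussed above could look roughly like the user-space sketch below. This is not part of the posted patch or of any v2: `struct match_entry` is a hypothetical stand-in for the kernel's `struct of_device_id`, the `.data` pointers are left out, and the helper names are made up for this example. Only the six compatible strings come from the quoted table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct of_device_id; .data is omitted. */
struct match_entry {
	const char *compatible;
};

/* The six compatibles from the quoted table, in strcmp() order. */
static const struct match_entry th1520_reset_match_sorted[] = {
	{ "thead,th1520-reset" },
	{ "thead,th1520-reset-ap" },
	{ "thead,th1520-reset-dsp" },
	{ "thead,th1520-reset-misc" },
	{ "thead,th1520-reset-vi" },
	{ "thead,th1520-reset-vp" },
	{ NULL }, /* sentinel */
};

/* Return 1 if the table is in strictly ascending strcmp() order. */
int match_table_is_sorted(const struct match_entry *tbl)
{
	size_t i;

	if (!tbl[0].compatible)
		return 1;
	for (i = 1; tbl[i].compatible; i++)
		if (strcmp(tbl[i - 1].compatible, tbl[i].compatible) >= 0)
			return 0;
	return 1;
}

/* Example check (hypothetical helper): aborts if the table is unsorted. */
void check_th1520_reset_match(void)
{
	assert(match_table_is_sorted(th1520_reset_match_sorted));
}
```

Under plain strcmp() ordering the bare "thead,th1520-reset" entry comes first, because it is a prefix of all the others.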
Thanks, Yao Zi > > { /* sentinel */ } > > }; > > MODULE_DEVICE_TABLE(of, th1520_reset_match); > > -- > > 2.50.1 > > Kind regards, > Maud > > _______________________________________________ > linux-riscv mailing list > linux-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv From spriteovo at gmail.com Tue Sep 2 17:59:29 2025 From: spriteovo at gmail.com (Asuna) Date: Wed, 3 Sep 2025 08:59:29 +0800 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: <20250901-unseemly-blimp-a74e3c77e780@spud> References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> <20250901-lasso-kabob-de32b8fcede8@spud> <20250901-unseemly-blimp-a74e3c77e780@spud> Message-ID: > That particular one might be a problem not because of -mstack-protector-guard itself, but rather three options get added at once: > $(eval KBUILD_CFLAGS += -mstack-protector-guard=tls \ > -mstack-protector-guard-reg=tp \ > -mstack-protector-guard-offset=$(shell \ > awk '{if ($$2 == "TSK_STACK_CANARY") print $$3;}' \ > $(objtree)/include/generated/asm-offsets.h)) > and the other ones might be responsible for the error. I still don't understand the problem here. `bindgen_skip_c_flags` in `rust/Makefile` contains a pattern `-mstack-protector-guard%`, the % at the end enables it to match all those 3 options at the same time, and `filter-out` function removes them before passing to Rust bindgen's libclang. Am I missing something here? > Similarly, something like -Wno-unterminated-string-initialization could cause a problem if gcc supports it but not libclang. Yes. However, this option is only about warnings, not architecture related and does not affect the generated results, so simply adding it into `bindgen_skip_c_flags` patterns should be enough, I think. > I think you're mostly better off catching that sort of thing in Kconfig, where possible and just make incompatible mixes invalid. 
What's actually incompatible is likely going to depend heavily on what options are enabled. Sounds better, I'll go down that path. From lkp at intel.com Tue Sep 2 18:06:18 2025 From: lkp at intel.com (kernel test robot) Date: Wed, 3 Sep 2025 09:06:18 +0800 Subject: [PATCH v2 4/4] riscv: Use Zalasr for smp_load_acquire/smp_store_release In-Reply-To: <20250902042432.78960-5-luxu.kernel@bytedance.com> References: <20250902042432.78960-5-luxu.kernel@bytedance.com> Message-ID: <202509030832.0uHQ24ec-lkp@intel.com> Hi Xu, kernel test robot noticed the following build errors: [auto build test ERROR on robh/for-next] [also build test ERROR on linus/master v6.17-rc4 next-20250902] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Xu-Lu/riscv-add-ISA-extension-parsing-for-Zalasr/20250902-123357 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next patch link: https://lore.kernel.org/r/20250902042432.78960-5-luxu.kernel%40bytedance.com patch subject: [PATCH v2 4/4] riscv: Use Zalasr for smp_load_acquire/smp_store_release config: riscv-randconfig-002-20250903 (https://download.01.org/0day-ci/archive/20250903/202509030832.0uHQ24ec-lkp at intel.com/config) compiler: riscv64-linux-gcc (GCC) 9.5.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250903/202509030832.0uHQ24ec-lkp at intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202509030832.0uHQ24ec-lkp at intel.com/ All errors (new ones prefixed by >>): In file included from include/asm-generic/bitops/generic-non-atomic.h:7, from include/linux/bitops.h:28, from include/linux/thread_info.h:27, from include/asm-generic/preempt.h:5, from ./arch/riscv/include/generated/asm/preempt.h:1, from include/linux/preempt.h:79, from include/linux/spinlock.h:56, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/mm.h:7, from arch/riscv/kernel/asm-offsets.c:8: include/linux/list.h: In function 'list_empty_careful': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/list.h:409:27: note: in expansion of macro 'smp_load_acquire' 409 | struct list_head *next = smp_load_acquire(&head->next); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/list.h:409:27: note: in expansion of macro 'smp_load_acquire' 409 | struct list_head *next = smp_load_acquire(&head->next); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) 
__smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/list.h:409:27: note: in expansion of macro 'smp_load_acquire' 409 | struct list_head *next = smp_load_acquire(&head->next); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/list.h:409:27: note: in expansion of macro 'smp_load_acquire' 409 | struct list_head *next = smp_load_acquire(&head->next); | ^~~~~~~~~~~~~~~~ In file included from include/asm-generic/bitops/generic-non-atomic.h:7, from include/linux/bitops.h:28, from include/linux/thread_info.h:27, from include/asm-generic/preempt.h:5, from ./arch/riscv/include/generated/asm/preempt.h:1, from include/linux/preempt.h:79, from include/linux/spinlock.h:56, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/mm.h:7, from arch/riscv/kernel/asm-offsets.c:8: include/linux/atomic/atomic-arch-fallback.h: In function 'raw_atomic_read_acquire': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:479:9: note: in expansion of macro 'smp_load_acquire' 479 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) 
__smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:479:9: note: in expansion of macro 'smp_load_acquire' 479 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:479:9: note: in expansion of macro 'smp_load_acquire' 479 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:479:9: note: in expansion of macro 'smp_load_acquire' 479 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h: In function 'raw_atomic64_read_acquire': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:2605:9: note: in expansion of macro 'smp_load_acquire' 2605 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of 
macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:2605:9: note: in expansion of macro 'smp_load_acquire' 2605 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:2605:9: note: in expansion of macro 'smp_load_acquire' 2605 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/atomic/atomic-arch-fallback.h:2605:9: note: in expansion of macro 'smp_load_acquire' 2605 | ret = smp_load_acquire(&(v)->counter); | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h: In function '__seqprop_sequence': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:211:9: note: in expansion of macro 'smp_load_acquire' 211 | return smp_load_acquire(&s->sequence); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: 
note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:211:9: note: in expansion of macro 'smp_load_acquire' 211 | return smp_load_acquire(&s->sequence); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:211:9: note: in expansion of macro 'smp_load_acquire' 211 | return smp_load_acquire(&s->sequence); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:211:9: note: in expansion of macro 'smp_load_acquire' 211 | return smp_load_acquire(&s->sequence); | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h: In function '__seqprop_raw_spinlock_sequence': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: 
read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ >> arch/riscv/include/asm/barrier.h:96:3: error: 
read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' 
output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:226:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 226 | SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, raw_spin) | ^~~~~~~~~~~~~~~~~ include/linux/seqlock.h: In function '__seqprop_spinlock_sequence': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: 
read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 
102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:227:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 227 | SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, spin) | ^~~~~~~~~~~~~~~~~ include/linux/seqlock.h: In function '__seqprop_rwlock_sequence': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 
96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 
0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ 
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:228:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 228 | SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, read) | ^~~~~~~~~~~~~~~~~ include/linux/seqlock.h: In function '__seqprop_mutex_sequence': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ 
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:160:17: note: in expansion of macro 'smp_load_acquire' 160 | unsigned seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in 
expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) 
__smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/seqlock.h:173:9: note: in expansion of macro 'smp_load_acquire' 173 | seq = smp_load_acquire(&s->seqcount.sequence); \ | ^~~~~~~~~~~~~~~~ include/linux/seqlock.h:229:1: note: in expansion of macro 'SEQCOUNT_LOCKNAME' 229 | SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex) | ^~~~~~~~~~~~~~~~~ In file included from include/asm-generic/bitops/generic-non-atomic.h:7, from include/linux/bitops.h:28, from include/linux/thread_info.h:27, from include/asm-generic/preempt.h:5, from ./arch/riscv/include/generated/asm/preempt.h:1, from include/linux/preempt.h:79, from include/linux/spinlock.h:56, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/mm.h:7, from arch/riscv/kernel/asm-offsets.c:8: include/linux/key.h: In function 'key_read_state': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/key.h:459:9: note: in expansion of macro 'smp_load_acquire' 459 | return smp_load_acquire(&key->state); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/key.h:459:9: note: in expansion of macro 'smp_load_acquire' 459 | return smp_load_acquire(&key->state); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro 
'__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/key.h:459:9: note: in expansion of macro 'smp_load_acquire' 459 | return smp_load_acquire(&key->state); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/key.h:459:9: note: in expansion of macro 'smp_load_acquire' 459 | return smp_load_acquire(&key->state); | ^~~~~~~~~~~~~~~~ include/linux/fs.h: In function 'i_size_read': >> arch/riscv/include/asm/barrier.h:96:3: error: read-only variable 'val' used as 'asm' output 96 | asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/fs.h:988:9: note: in expansion of macro 'smp_load_acquire' 988 | return smp_load_acquire(&inode->i_size); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:102:3: error: read-only variable 'val' used as 'asm' output 102 | asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/fs.h:988:9: note: in expansion of macro 'smp_load_acquire' 988 | return smp_load_acquire(&inode->i_size); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:108:3: error: read-only variable 'val' used as 'asm' output 108 | asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) 
__smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/fs.h:988:9: note: in expansion of macro 'smp_load_acquire' 988 | return smp_load_acquire(&inode->i_size); | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/barrier.h:114:3: error: read-only variable 'val' used as 'asm' output 114 | asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ | ^~~ include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^~~~~~~~~~~~~~~~~~ include/linux/fs.h:988:9: note: in expansion of macro 'smp_load_acquire' 988 | return smp_load_acquire(&inode->i_size); | ^~~~~~~~~~~~~~~~ make[3]: *** [scripts/Makefile.build:182: arch/riscv/kernel/asm-offsets.s] Error 1 shuffle=1073380763 make[3]: Target 'prepare' not remade because of errors. make[2]: *** [Makefile:1282: prepare0] Error 2 shuffle=1073380763 make[2]: Target 'prepare' not remade because of errors. make[1]: *** [Makefile:248: __sub-make] Error 2 shuffle=1073380763 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:248: __sub-make] Error 2 shuffle=1073380763 make: Target 'prepare' not remade because of errors. 
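Note on the diagnostics above: GCC emits "read-only variable 'val' used as 'asm' output" when the variable bound to an "=r" output operand is const-qualified. Since the macro declares val as TYPEOF_UNQUAL(*p), one possible reading, not confirmed anywhere in this report, is that on this compiler the qualifier of *p was not stripped, so val inherited const from callers that pass a pointer to a const object. The sketch below only illustrates that general mechanism in user-space C11 with GNU __typeof__. TARGET_IS_CONST, UNQUAL_INT_TYPEOF and the function names are hypothetical helpers for this note; UNQUAL_INT_TYPEOF is not the kernel's TYPEOF_UNQUAL.

```c
#include <assert.h>

/* Hypothetical helper: 1 if ptr points to a const int, 0 for a plain int. */
#define TARGET_IS_CONST(ptr) \
	_Generic((ptr), const int *: 1, int *: 0)

/*
 * Hypothetical int-only stand-in for an "unqualified typeof".
 * The controlling expression of _Generic undergoes lvalue conversion,
 * so a 'const int' lvalue selects the 'int' association and the
 * resulting type is a plain int.
 */
#define UNQUAL_INT_TYPEOF(x) __typeof__(_Generic((x), int: (int)0))

/* Plain __typeof__ copies the qualifier: val is 'const int'. */
int plain_typeof_keeps_const(const int *p)
{
	__typeof__(*p) val = *p;

	return TARGET_IS_CONST(&val);
}

/* The unqualified variant yields a writable 'int'. */
int unqual_typeof_drops_const(const int *p)
{
	UNQUAL_INT_TYPEOF(*p) val;

	val = *p;	/* assignment compiles because val is not const */
	return TARGET_IS_CONST(&val);
}

/* Small self-check that exercises both helpers. */
void self_check(void)
{
	const int probe = 7;

	assert(plain_typeof_keeps_const(&probe) == 1);
	assert(unqual_typeof_drops_const(&probe) == 0);
}
```

A variable declared the first way cannot legally be written by an asm statement, which matches the wording of the error; a variable declared the second way can. Whether that is what actually happened in this build would need to be checked against the compiler version and the TYPEOF_UNQUAL definition in the tree under test.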
vim +96 arch/riscv/include/asm/barrier.h 89 90 #define __smp_load_acquire(p) \ 91 ({ \ 92 TYPEOF_UNQUAL(*p) val; \ 93 compiletime_assert_atomic_type(*p); \ 94 switch (sizeof(*p)) { \ 95 case 1: \ > 96 asm volatile(ALTERNATIVE("lb %0, 0(%1)\t\nfence r, rw\t\n", \ 97 LB_AQ(%0, %1) "\t\nnop\t\n", \ 98 0, RISCV_ISA_EXT_ZALASR, 1) \ 99 : "=r" (val) : "r" (p) : "memory"); \ 100 break; \ 101 case 2: \ 102 asm volatile(ALTERNATIVE("lh %0, 0(%1)\t\nfence r, rw\t\n", \ 103 LH_AQ(%0, %1) "\t\nnop\t\n", \ 104 0, RISCV_ISA_EXT_ZALASR, 1) \ 105 : "=r" (val) : "r" (p) : "memory"); \ 106 break; \ 107 case 4: \ 108 asm volatile(ALTERNATIVE("lw %0, 0(%1)\t\nfence r, rw\t\n", \ 109 LW_AQ(%0, %1) "\t\nnop\t\n", \ 110 0, RISCV_ISA_EXT_ZALASR, 1) \ 111 : "=r" (val) : "r" (p) : "memory"); \ 112 break; \ 113 case 8: \ 114 asm volatile(ALTERNATIVE("ld %0, 0(%1)\t\nfence r, rw\t\n", \ 115 LD_AQ(%0, %1) "\t\nnop\t\n", \ 116 0, RISCV_ISA_EXT_ZALASR, 1) \ 117 : "=r" (val) : "r" (p) : "memory"); \ 118 break; \ 119 default: \ 120 __bad_size_call_parameter(); \ 121 break; \ 122 } \ 123 val; \ 124 }) 125 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki From Valentina.FernandezAlanis at microchip.com Wed Sep 3 02:43:46 2025 From: Valentina.FernandezAlanis at microchip.com (Valentina.FernandezAlanis at microchip.com) Date: Wed, 3 Sep 2025 09:43:46 +0000 Subject: [PATCH v2 5/5] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: References: <20250902075548.1967613-1-valentina.fernandezalanis@microchip.com> <20250902075548.1967613-6-valentina.fernandezalanis@microchip.com> Message-ID: <8b371196-c853-4e47-980f-3f2b3525180e@microchip.com> On 02/09/2025 09:32, Yao Zi wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > On Tue, Sep 02, 2025 at 08:55:48AM +0100, Valentina Fernandez wrote: >> Add a minimal device tree for the Microchip PolarFire SoC Discovery Kit. 
>> The Discovery Kit is a cost-optimized board based on PolarFire SoC >> MPFS095T and features: >> >> - 1 GB DDR4x16 >> - 1x Gigabit Ethernet >> - 3x UARTs >> - Raspberry Pi connector >> - mikroBus connector >> - microSD card connector >> >> Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit >> Signed-off-by: Valentina Fernandez >> --- >> arch/riscv/boot/dts/microchip/Makefile | 1 + >> .../dts/microchip/mpfs-disco-kit-fabric.dtsi | 58 ++++++ >> .../boot/dts/microchip/mpfs-disco-kit.dts | 190 ++++++++++++++++++ >> 3 files changed, 249 insertions(+) >> create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi >> create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts > > ... > >> diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts >> new file mode 100644 >> index 000000000000..c068b9bb5bfd >> --- /dev/null >> +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts > > ... > >> +&mbox { >> + status = "okay"; >> +}; >> + >> +&mmc { >> + bus-width = <4>; >> + disable-wp; >> + cap-sd-highspeed; >> + cap-mmc-highspeed; >> + sd-uhs-sdr12; >> + sd-uhs-sdr25; >> + sd-uhs-sdr50; >> + sd-uhs-sdr104; > > I think sd-uhs-sdr104 implies sd-uhs-sdr{12,25,50}, thus the latter > three properties could be dropped. Even though the kernel treats sd-uhs-sdr104 as implying support for sdr12, sdr25, and sdr50, the binding has no such rules about implying other modes. 
For this reason, I thought it could be valid to explicitly list all supported modes to ensure accurate hw representation.> >> + no-1-8-v; >> + status = "okay"; >> +}; > > Best regards, > Yao Zi From hendrik.hamerlinck at hammernet.be Wed Sep 3 03:01:04 2025 From: hendrik.hamerlinck at hammernet.be (Hendrik Hamerlinck) Date: Wed, 3 Sep 2025 12:01:04 +0200 Subject: [PATCH] pinctrl: spacemit: fix typo in PRI_TDI pin name Message-ID: <20250903100104.360637-1-hendrik.hamerlinck@hammernet.be> The datasheet lists this signal as PRI_TDI, not PRI_DTI. Fix the pin name to match the documentation and JTAG naming convention (TDI = Test Data In). Signed-off-by: Hendrik Hamerlinck --- drivers/pinctrl/spacemit/pinctrl-k1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pinctrl/spacemit/pinctrl-k1.c b/drivers/pinctrl/spacemit/pinctrl-k1.c index 9996b1c4a07e..a3f433b611f7 100644 --- a/drivers/pinctrl/spacemit/pinctrl-k1.c +++ b/drivers/pinctrl/spacemit/pinctrl-k1.c @@ -847,7 +847,7 @@ static const struct pinctrl_pin_desc k1_pin_desc[] = { PINCTRL_PIN(67, "GPIO_67"), PINCTRL_PIN(68, "GPIO_68"), PINCTRL_PIN(69, "GPIO_69"), - PINCTRL_PIN(70, "GPIO_70/PRI_DTI"), + PINCTRL_PIN(70, "GPIO_70/PRI_TDI"), PINCTRL_PIN(71, "GPIO_71/PRI_TMS"), PINCTRL_PIN(72, "GPIO_72/PRI_TCK"), PINCTRL_PIN(73, "GPIO_73/PRI_TDO"), -- 2.43.0 From e at freeshell.de Wed Sep 3 03:13:34 2025 From: e at freeshell.de (E Shattow) Date: Wed, 3 Sep 2025 03:13:34 -0700 Subject: [PATCH v1 0/2] riscv: dts: starfive: jh7110-common: drop no-mmc and power-on-delay-ms from mmc interfaces Message-ID: <20250903101346.861076-1-e@freeshell.de> Drop no-mmc and power-on-delay-ms properties. The committer cannot be reached for comment and per discussion [1] and testing there is not any observable problem that is being solved by having these properties for the VisionFive 2 or similar variant boards through the jh7110-common.dtsi include. 
E Shattow (2): riscv: dts: starfive: jh7110-common: drop no-mmc property from mmc1 riscv: dts: starfive: jh7110-common: drop mmc post-power-on-delay-ms arch/riscv/boot/dts/starfive/jh7110-common.dtsi | 3 --- 1 file changed, 3 deletions(-) base-commit: f66eb149b87677da3171a0ed51c77c3599ad55d6 -- 2.50.0 From e at freeshell.de Wed Sep 3 03:13:35 2025 From: e at freeshell.de (E Shattow) Date: Wed, 3 Sep 2025 03:13:35 -0700 Subject: [PATCH v1 1/2] riscv: dts: starfive: jh7110-common: drop no-mmc property from mmc1 In-Reply-To: <20250903101346.861076-1-e@freeshell.de> References: <20250903101346.861076-1-e@freeshell.de> Message-ID: <20250903101346.861076-2-e@freeshell.de> Relax no-mmc restriction on mmc1 for jh7110 boards. The restriction is only needed to block use of commands that would cause a device to malfunction, which by testing and observation [1] is not any problem. 1: https://lore.kernel.org/lkml/NT0PR01MB1312E0D9EE9F158A57B77700E63D2 at NT0PR01MB1312.CHNPR01.prod.partner.outlook.cn/ Signed-off-by: E Shattow Tested-by: Hal Feng --- arch/riscv/boot/dts/starfive/jh7110-common.dtsi | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi index a315113840e5..4fa77ffd54e3 100644 --- a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi +++ b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi @@ -299,7 +299,6 @@ &mmc1 { assigned-clock-rates = <50000000>; bus-width = <4>; bootph-pre-ram; - no-mmc; cd-gpios = <&sysgpio 41 GPIO_ACTIVE_LOW>; disable-wp; cap-sd-highspeed; -- 2.50.0 From e at freeshell.de Wed Sep 3 03:13:36 2025 From: e at freeshell.de (E Shattow) Date: Wed, 3 Sep 2025 03:13:36 -0700 Subject: [PATCH v1 2/2] riscv: dts: starfive: jh7110-common: drop mmc post-power-on-delay-ms In-Reply-To: <20250903101346.861076-1-e@freeshell.de> References: <20250903101346.861076-1-e@freeshell.de> Message-ID: <20250903101346.861076-3-e@freeshell.de> Drop post-power-on-delay-ms from mmc0 mmc1 
interfaces. There is no known reason for these properties to continue, testing appears to be fine without them [1]. 1: https://lore.kernel.org/lkml/NT0PR01MB1312E0D9EE9F158A57B77700E63D2 at NT0PR01MB1312.CHNPR01.prod.partner.outlook.cn/ Signed-off-by: E Shattow Tested-by: Hal Feng --- arch/riscv/boot/dts/starfive/jh7110-common.dtsi | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi index 4fa77ffd54e3..5dc15e48b74b 100644 --- a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi +++ b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi @@ -285,7 +285,6 @@ &mmc0 { mmc-ddr-1_8v; mmc-hs200-1_8v; cap-mmc-hw-reset; - post-power-on-delay-ms = <200>; pinctrl-names = "default"; pinctrl-0 = <&mmc0_pins>; vmmc-supply = <&vcc_3v3>; @@ -302,7 +301,6 @@ &mmc1 { cd-gpios = <&sysgpio 41 GPIO_ACTIVE_LOW>; disable-wp; cap-sd-highspeed; - post-power-on-delay-ms = <200>; pinctrl-names = "default"; pinctrl-0 = <&mmc1_pins>; status = "okay"; -- 2.50.0 From dlan at gentoo.org Wed Sep 3 03:18:33 2025 From: dlan at gentoo.org (Yixun Lan) Date: Wed, 3 Sep 2025 18:18:33 +0800 Subject: [PATCH] pinctrl: spacemit: fix typo in PRI_TDI pin name In-Reply-To: <20250903100104.360637-1-hendrik.hamerlinck@hammernet.be> References: <20250903100104.360637-1-hendrik.hamerlinck@hammernet.be> Message-ID: <20250903101833-GYB1155728@gentoo.org> Hi Hendrik, On 12:01 Wed 03 Sep , Hendrik Hamerlinck wrote: > The datasheet lists this signal as PRI_TDI, not PRI_DTI. > Fix the pin name to match the documentation and JTAG naming > convention (TDI = Test Data In). 
> > Signed-off-by: Hendrik Hamerlinck Reviewed-by: Yixun Lan > --- > drivers/pinctrl/spacemit/pinctrl-k1.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pinctrl/spacemit/pinctrl-k1.c b/drivers/pinctrl/spacemit/pinctrl-k1.c > index 9996b1c4a07e..a3f433b611f7 100644 > --- a/drivers/pinctrl/spacemit/pinctrl-k1.c > +++ b/drivers/pinctrl/spacemit/pinctrl-k1.c > @@ -847,7 +847,7 @@ static const struct pinctrl_pin_desc k1_pin_desc[] = { > PINCTRL_PIN(67, "GPIO_67"), > PINCTRL_PIN(68, "GPIO_68"), > PINCTRL_PIN(69, "GPIO_69"), > - PINCTRL_PIN(70, "GPIO_70/PRI_DTI"), > + PINCTRL_PIN(70, "GPIO_70/PRI_TDI"), > PINCTRL_PIN(71, "GPIO_71/PRI_TMS"), > PINCTRL_PIN(72, "GPIO_72/PRI_TCK"), > PINCTRL_PIN(73, "GPIO_73/PRI_TDO"), > -- > 2.43.0 > -- Yixun Lan (dlan) From e at freeshell.de Wed Sep 3 03:19:42 2025 From: e at freeshell.de (E Shattow) Date: Wed, 3 Sep 2025 03:19:42 -0700 Subject: [PATCH v1 0/2] riscv: dts: starfive: jh7110-common: drop no-mmc and power-on-delay-ms from mmc interfaces In-Reply-To: <20250903101346.861076-1-e@freeshell.de> References: <20250903101346.861076-1-e@freeshell.de> Message-ID: <067174f5-32ea-45df-9a48-96222850e813@freeshell.de> On 9/3/25 03:13, E Shattow wrote: > Drop no-mmc and power-on-delay-ms properties. > > The committer cannot be reached for comment and per discussion [1] and > testing there is not any observable problem that is being solved by > having these properties for the VisionFive 2 or similar variant boards > through the jh7110-common.dtsi include. > > E Shattow (2): > riscv: dts: starfive: jh7110-common: drop no-mmc property from mmc1 > riscv: dts: starfive: jh7110-common: drop mmc post-power-on-delay-ms > > arch/riscv/boot/dts/starfive/jh7110-common.dtsi | 3 --- > 1 file changed, 3 deletions(-) > > > base-commit: f66eb149b87677da3171a0ed51c77c3599ad55d6 P.S. missed the URL for reference [1] in this cover letter. 
It is: 1: https://lore.kernel.org/lkml/NT0PR01MB1312E0D9EE9F158A57B77700E63D2 at NT0PR01MB1312.CHNPR01.prod.partner.outlook.cn/ From luxu.kernel at bytedance.com Wed Sep 3 04:41:57 2025 From: luxu.kernel at bytedance.com (Xu Lu) Date: Wed, 3 Sep 2025 19:41:57 +0800 Subject: [External] Re: [PATCH v2 0/4] riscv: Add Zalasr ISA extension support In-Reply-To: References: <20250902042432.78960-1-luxu.kernel@bytedance.com> Message-ID: Hi Andrea, Great catch! Thanks a lot for your review. The problem comes from the mixed use of acquire/release semantics via fence and via real ld.aq/sd.rl. I would prefer your method (a). The existing atomic acquire/release functions' implementation can be further modified to amocas.aq/amocas.rl/lr.aq/sc.rl. I will send the next version after I finish it and hope you can help with review then. Best regards, Xu Lu On Wed, Sep 3, 2025 at 12:59 AM Andrea Parri wrote: > > > Xu Lu (4): > > riscv: add ISA extension parsing for Zalasr > > dt-bindings: riscv: Add Zalasr ISA extension description > > riscv: Instroduce Zalasr instructions > > riscv: Use Zalasr for smp_load_acquire/smp_store_release > > Informally put, our (Linux) memory consistency model specifies that any > sequence > > spin_unlock(s); > spin_lock(t); > > behaves "as it provides at least FENCE.TSO ordering between operations > which precede the UNLOCK+LOCK sequence and operations which follow the > sequence". Unless I am missing something, the patch set in question breaks > such ordering property (on RISC-V): for example, a "release" annotation, > .RL (as in spin_unlock() -> smp_store_release(), after patch #4) paired > with an "acquire" fence, FENCE R,RW (as could be found in spin_lock() -> > atomic_try_cmpxchg_acquire()) do not provide the specified property.
> > I _think some solutions to the issue above include: > > a) make sure an .RL annotation is always paired with an .AQ annotation > and vice versa an .AQ annotation is paired with an .RL annotation > (this approach matches the current arm64 approach/implementation); > > b) on the opposite direction, always pair FENCE R,RW (or occasionally > FENCE R,R) with FENCE RW,W (this matches the current approach/the > current implementation within riscv); > > c) mix the previous two solutions (resp., annotations and fences), but > make sure to "upgrade" any releases to provide (insert) a FENCE.TSO. > > (a) would align RISC-V and ARM64 (which is a good thing IMO), though it > is probably the most invasive approach among the three approaches above > (requiring certain changes to arch/riscv/include/asm/{cmpxchg,atomic}.h, > which are already relatively messy due to the various ZABHA plus ZACAS > switches). Overall, I'm not too excited at the idea of reviewing any of > those changes, but if the community opts for it, I'll almost definitely > take a closer look with due calm. ;-) > > Andrea From cuiyunhui at bytedance.com Wed Sep 3 04:56:28 2025 From: cuiyunhui at bytedance.com (yunhui cui) Date: Wed, 3 Sep 2025 19:56:28 +0800 Subject: [External] Re: [PATCH 1/2] watchdog: refactor watchdog_hld functionality In-Reply-To: References: <20250827100959.83023-1-cuiyunhui@bytedance.com> <20250827100959.83023-2-cuiyunhui@bytedance.com> Message-ID: Hi Doug, On Wed, Sep 3, 2025 at 1:04 AM Doug Anderson wrote: > > Hi, > > On Sun, Aug 31, 2025 at 10:57 PM yunhui cui wrote: > > > > Hi Doug, > > > > On Sat, Aug 30, 2025 at 5:34 AM Doug Anderson wrote: > > > > > > Hi, > > > > > > On Wed, Aug 27, 2025 at 3:10 AM Yunhui Cui wrote: > > > > > > > > Move watchdog_hld.c to kernel/, and rename arm_pmu_irq_is_nmi() > > > > to arch_pmu_irq_is_nmi() for cross-arch reusability.
> > > > > > > > Signed-off-by: Yunhui Cui > > > > --- > > > > arch/arm64/kernel/Makefile | 1 - > > > > drivers/perf/arm_pmu.c | 2 +- > > > > include/linux/nmi.h | 1 + > > > > include/linux/perf/arm_pmu.h | 2 -- > > > > kernel/Makefile | 2 +- > > > > {arch/arm64/kernel => kernel}/watchdog_hld.c | 8 ++++++-- > > > > 6 files changed, 9 insertions(+), 7 deletions(-) > > > > rename {arch/arm64/kernel => kernel}/watchdog_hld.c (97%) > > > > > > I'm not a huge fan of the perf hardlockup detector and IMO we should > > > maybe just delete it. Thus spreading it to support a new architecture > > > isn't my favorite thing to do. Can't you use the buddy hardlockup > > > detector? > > > > Why is there a plan to remove CONFIG_HARDLOCKUP_DETECTOR_PERF? Could > > you explain the specific reasons? Is the community's future plan to > > favor CONFIG_HARDLOCKUP_DETECTOR_BUDDY? > > I don't think there are any concrete plans, but there was some discussion here: > > https://lore.kernel.org/all/CAD=FV=WWUiCi6bZCs_gseFpDDWNkuJMoL6XCftEo6W7q6jRCkg at mail.gmail.com/ > > -Doug > I've read your linked content, which details the pros and cons of perf watchdog and buddy watchdog. I think everyone will agree on choosing one as the default. It seems there's no kernel/watchdog entry in MAINTAINERS. What's next for these two approaches?
Thanks, Yunhui From andreyknvl at gmail.com Wed Sep 3 06:00:53 2025 From: andreyknvl at gmail.com (Andrey Konovalov) Date: Wed, 3 Sep 2025 15:00:53 +0200 Subject: [PATCH v6 1/2] kasan: introduce ARCH_DEFER_KASAN and unify static key across modes In-Reply-To: <20250810125746.1105476-2-snovitoll@gmail.com> References: <20250810125746.1105476-1-snovitoll@gmail.com> <20250810125746.1105476-2-snovitoll@gmail.com> Message-ID: On Sun, Aug 10, 2025 at 2:58 PM Sabyrzhan Tasbolatov wrote: > > Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need > to defer KASAN initialization until shadow memory is properly set up, > and unify the static key infrastructure across all KASAN modes. > > [1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN. > > The core issue is that different architectures have inconsistent approaches > to KASAN readiness tracking: > - PowerPC, LoongArch, and UML arch, each implement own > kasan_arch_is_ready() > - Only HW_TAGS mode had a unified static key (kasan_flag_enabled) > - Generic and SW_TAGS modes relied on arch-specific solutions or always-on > behavior > > This patch addresses the fragmentation in KASAN initialization > across architectures by introducing a unified approach that eliminates > duplicate static keys and arch-specific kasan_arch_is_ready() > implementations. > > Let's replace kasan_arch_is_ready() with existing kasan_enabled() check, > which examines the static key being enabled if arch selects > ARCH_DEFER_KASAN or has HW_TAGS mode support. > For other arch, kasan_enabled() checks the enablement during compile time. > > Now KASAN users can use a single kasan_enabled() check everywhere.
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049 > Signed-off-by: Sabyrzhan Tasbolatov > --- > Changes in v6: > - Added more details in git commit message > - Fixed commenting format per coding style in UML (Christophe Leroy) > - Changed exporting to GPL for kasan_flag_enabled (Christophe Leroy) > - Converted ARCH_DEFER_KASAN to def_bool depending on KASAN to avoid > arch users to have `if KASAN` condition (Christophe Leroy) > - Forgot to add __init for kasan_init in UML > > Changes in v5: > - Unified patches where arch (powerpc, UML, loongarch) selects > ARCH_DEFER_KASAN in the first patch not to break > bisectability > - Removed kasan_arch_is_ready completely as there is no user > - Removed __wrappers in v4, left only those where it's necessary > due to different implementations > > Changes in v4: > - Fixed HW_TAGS static key functionality (was broken in v3) > - Merged configuration and implementation for atomicity > --- > arch/loongarch/Kconfig | 1 + > arch/loongarch/include/asm/kasan.h | 7 ------ > arch/loongarch/mm/kasan_init.c | 8 +++---- > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/kasan.h | 12 ---------- > arch/powerpc/mm/kasan/init_32.c | 2 +- > arch/powerpc/mm/kasan/init_book3e_64.c | 2 +- > arch/powerpc/mm/kasan/init_book3s_64.c | 6 +---- > arch/um/Kconfig | 1 + > arch/um/include/asm/kasan.h | 5 ++-- > arch/um/kernel/mem.c | 13 ++++++++--- > include/linux/kasan-enabled.h | 32 ++++++++++++++++++-------- > include/linux/kasan.h | 6 +++++ > lib/Kconfig.kasan | 12 ++++++++++ > mm/kasan/common.c | 17 ++++++++++---- > mm/kasan/generic.c | 19 +++++++++++---- > mm/kasan/hw_tags.c | 9 +------- > mm/kasan/kasan.h | 8 ++++++- > mm/kasan/shadow.c | 12 +++++----- > mm/kasan/sw_tags.c | 1 + > mm/kasan/tags.c | 2 +- > 21 files changed, 106 insertions(+), 70 deletions(-) > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig > index f0abc38c40ac..e449e3fcecf9 100644 > --- a/arch/loongarch/Kconfig > +++ b/arch/loongarch/Kconfig > @@ 
-9,6 +9,7 @@ config LOONGARCH > select ACPI_PPTT if ACPI > select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI > select ARCH_BINFMT_ELF_STATE > + select ARCH_NEEDS_DEFER_KASAN > select ARCH_DISABLE_KASAN_INLINE > select ARCH_ENABLE_MEMORY_HOTPLUG > select ARCH_ENABLE_MEMORY_HOTREMOVE > diff --git a/arch/loongarch/include/asm/kasan.h b/arch/loongarch/include/asm/kasan.h > index 62f139a9c87d..0e50e5b5e056 100644 > --- a/arch/loongarch/include/asm/kasan.h > +++ b/arch/loongarch/include/asm/kasan.h > @@ -66,7 +66,6 @@ > #define XKPRANGE_WC_SHADOW_OFFSET (KASAN_SHADOW_START + XKPRANGE_WC_KASAN_OFFSET) > #define XKVRANGE_VC_SHADOW_OFFSET (KASAN_SHADOW_START + XKVRANGE_VC_KASAN_OFFSET) > > -extern bool kasan_early_stage; > extern unsigned char kasan_early_shadow_page[PAGE_SIZE]; > > #define kasan_mem_to_shadow kasan_mem_to_shadow > @@ -75,12 +74,6 @@ void *kasan_mem_to_shadow(const void *addr); > #define kasan_shadow_to_mem kasan_shadow_to_mem > const void *kasan_shadow_to_mem(const void *shadow_addr); > > -#define kasan_arch_is_ready kasan_arch_is_ready > -static __always_inline bool kasan_arch_is_ready(void) > -{ > - return !kasan_early_stage; > -} > - > #define addr_has_metadata addr_has_metadata > static __always_inline bool addr_has_metadata(const void *addr) > { > diff --git a/arch/loongarch/mm/kasan_init.c b/arch/loongarch/mm/kasan_init.c > index d2681272d8f0..170da98ad4f5 100644 > --- a/arch/loongarch/mm/kasan_init.c > +++ b/arch/loongarch/mm/kasan_init.c > @@ -40,11 +40,9 @@ static pgd_t kasan_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); > #define __pte_none(early, pte) (early ? 
pte_none(pte) : \ > ((pte_val(pte) & _PFN_MASK) == (unsigned long)__pa(kasan_early_shadow_page))) > > -bool kasan_early_stage = true; > - > void *kasan_mem_to_shadow(const void *addr) > { > - if (!kasan_arch_is_ready()) { > + if (!kasan_enabled()) { > return (void *)(kasan_early_shadow_page); > } else { > unsigned long maddr = (unsigned long)addr; > @@ -298,7 +296,8 @@ void __init kasan_init(void) > kasan_populate_early_shadow(kasan_mem_to_shadow((void *)VMALLOC_START), > kasan_mem_to_shadow((void *)KFENCE_AREA_END)); > > - kasan_early_stage = false; > + /* Enable KASAN here before kasan_mem_to_shadow(). */ > + kasan_init_generic(); > > /* Populate the linear mapping */ > for_each_mem_range(i, &pa_start, &pa_end) { > @@ -329,5 +328,4 @@ void __init kasan_init(void) > > /* At this point kasan is fully initialized. Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KernelAddressSanitizer initialized.\n"); > } > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 93402a1d9c9f..4730c676b6bf 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -122,6 +122,7 @@ config PPC > # Please keep this list sorted alphabetically. 
> # > select ARCH_32BIT_OFF_T if PPC32 > + select ARCH_NEEDS_DEFER_KASAN if PPC_RADIX_MMU > select ARCH_DISABLE_KASAN_INLINE if PPC_RADIX_MMU > select ARCH_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE > select ARCH_ENABLE_MEMORY_HOTPLUG > diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h > index b5bbb94c51f6..957a57c1db58 100644 > --- a/arch/powerpc/include/asm/kasan.h > +++ b/arch/powerpc/include/asm/kasan.h > @@ -53,18 +53,6 @@ > #endif > > #ifdef CONFIG_KASAN > -#ifdef CONFIG_PPC_BOOK3S_64 > -DECLARE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key); > - > -static __always_inline bool kasan_arch_is_ready(void) > -{ > - if (static_branch_likely(&powerpc_kasan_enabled_key)) > - return true; > - return false; > -} > - > -#define kasan_arch_is_ready kasan_arch_is_ready > -#endif > > void kasan_early_init(void); > void kasan_mmu_init(void); > diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c > index 03666d790a53..1d083597464f 100644 > --- a/arch/powerpc/mm/kasan/init_32.c > +++ b/arch/powerpc/mm/kasan/init_32.c > @@ -165,7 +165,7 @@ void __init kasan_init(void) > > /* At this point kasan is fully initialized. 
Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > > void __init kasan_late_init(void) > diff --git a/arch/powerpc/mm/kasan/init_book3e_64.c b/arch/powerpc/mm/kasan/init_book3e_64.c > index 60c78aac0f63..0d3a73d6d4b0 100644 > --- a/arch/powerpc/mm/kasan/init_book3e_64.c > +++ b/arch/powerpc/mm/kasan/init_book3e_64.c > @@ -127,7 +127,7 @@ void __init kasan_init(void) > > /* Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > > void __init kasan_late_init(void) { } > diff --git a/arch/powerpc/mm/kasan/init_book3s_64.c b/arch/powerpc/mm/kasan/init_book3s_64.c > index 7d959544c077..dcafa641804c 100644 > --- a/arch/powerpc/mm/kasan/init_book3s_64.c > +++ b/arch/powerpc/mm/kasan/init_book3s_64.c > @@ -19,8 +19,6 @@ > #include > #include > > -DEFINE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key); > - > static void __init kasan_init_phys_region(void *start, void *end) > { > unsigned long k_start, k_end, k_cur; > @@ -92,11 +90,9 @@ void __init kasan_init(void) > */ > memset(kasan_early_shadow_page, 0, PAGE_SIZE); > > - static_branch_inc(&powerpc_kasan_enabled_key); > - > /* Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > > void __init kasan_early_init(void) { } > diff --git a/arch/um/Kconfig b/arch/um/Kconfig > index 9083bfdb7735..1d4def0db841 100644 > --- a/arch/um/Kconfig > +++ b/arch/um/Kconfig > @@ -5,6 +5,7 @@ menu "UML-specific options" > config UML > bool > default y > + select ARCH_NEEDS_DEFER_KASAN if STATIC_LINK > select ARCH_WANTS_DYNAMIC_TASK_STRUCT > select ARCH_HAS_CACHE_LINE_SIZE > select ARCH_HAS_CPU_FINALIZE_INIT > diff --git a/arch/um/include/asm/kasan.h b/arch/um/include/asm/kasan.h > index f97bb1f7b851..b54a4e937fd1 100644 > --- a/arch/um/include/asm/kasan.h > +++ b/arch/um/include/asm/kasan.h > @@ -24,10 +24,9 @@ > > #ifdef CONFIG_KASAN > void 
kasan_init(void); > -extern int kasan_um_is_ready; > > -#ifdef CONFIG_STATIC_LINK > -#define kasan_arch_is_ready() (kasan_um_is_ready) > +#if defined(CONFIG_STATIC_LINK) && defined(CONFIG_KASAN_INLINE) > +#error UML does not work in KASAN_INLINE mode with STATIC_LINK enabled! > #endif > #else > static inline void kasan_init(void) { } > diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c > index 76bec7de81b5..32e3b1972dc1 100644 > --- a/arch/um/kernel/mem.c > +++ b/arch/um/kernel/mem.c > @@ -21,10 +21,10 @@ > #include > #include > #include > +#include > > #ifdef CONFIG_KASAN > -int kasan_um_is_ready; > -void kasan_init(void) > +void __init kasan_init(void) > { > /* > * kasan_map_memory will map all of the required address space and > @@ -32,7 +32,11 @@ void kasan_init(void) > */ > kasan_map_memory((void *)KASAN_SHADOW_START, KASAN_SHADOW_SIZE); > init_task.kasan_depth = 0; > - kasan_um_is_ready = true; > + /* > + * Since kasan_init() is called before main(), > + * KASAN is initialized but the enablement is deferred after > + * jump_label_init(). See arch_mm_preinit(). > + */ > } > > static void (*kasan_init_ptr)(void) > @@ -58,6 +62,9 @@ static unsigned long brk_end; > > void __init arch_mm_preinit(void) > { > + /* Safe to call after jump_label_init(). Enables KASAN. */ > + kasan_init_generic(); > + > /* clear the zero-page */ > memset(empty_zero_page, 0, PAGE_SIZE); > > diff --git a/include/linux/kasan-enabled.h b/include/linux/kasan-enabled.h > index 6f612d69ea0c..9eca967d8526 100644 > --- a/include/linux/kasan-enabled.h > +++ b/include/linux/kasan-enabled.h > @@ -4,32 +4,46 @@ > > #include > > -#ifdef CONFIG_KASAN_HW_TAGS > - > +#if defined(CONFIG_ARCH_DEFER_KASAN) || defined(CONFIG_KASAN_HW_TAGS) > +/* > + * Global runtime flag for KASAN modes that need runtime control. > + * Used by ARCH_DEFER_KASAN architectures and HW_TAGS mode. 
> + */ > DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled); > > +/* > + * Runtime control for shadow memory initialization or HW_TAGS mode. > + * Uses static key for architectures that need deferred KASAN or HW_TAGS. > + */ > static __always_inline bool kasan_enabled(void) > { > return static_branch_likely(&kasan_flag_enabled); > } > > -static inline bool kasan_hw_tags_enabled(void) > +static inline void kasan_enable(void) > { > - return kasan_enabled(); > + static_branch_enable(&kasan_flag_enabled); > } > - > -#else /* CONFIG_KASAN_HW_TAGS */ > - > -static inline bool kasan_enabled(void) > +#else > +/* For architectures that can enable KASAN early, use compile-time check. */ > +static __always_inline bool kasan_enabled(void) > { > return IS_ENABLED(CONFIG_KASAN); > } > > +static inline void kasan_enable(void) {} > +#endif /* CONFIG_ARCH_DEFER_KASAN || CONFIG_KASAN_HW_TAGS */ > + > +#ifdef CONFIG_KASAN_HW_TAGS > +static inline bool kasan_hw_tags_enabled(void) > +{ > + return kasan_enabled(); > +} > +#else > static inline bool kasan_hw_tags_enabled(void) > { > return false; > } > - > #endif /* CONFIG_KASAN_HW_TAGS */ > > #endif /* LINUX_KASAN_ENABLED_H */ > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 890011071f2b..51a8293d1af6 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -543,6 +543,12 @@ void kasan_report_async(void); > > #endif /* CONFIG_KASAN_HW_TAGS */ > > +#ifdef CONFIG_KASAN_GENERIC > +void __init kasan_init_generic(void); > +#else > +static inline void kasan_init_generic(void) { } > +#endif > + > #ifdef CONFIG_KASAN_SW_TAGS > void __init kasan_init_sw_tags(void); > #else > diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan > index f82889a830fa..a4bb610a7a6f 100644 > --- a/lib/Kconfig.kasan > +++ b/lib/Kconfig.kasan > @@ -19,6 +19,18 @@ config ARCH_DISABLE_KASAN_INLINE > Disables both inline and stack instrumentation. Selected by > architectures that do not support these instrumentation types. 
> > +config ARCH_NEEDS_DEFER_KASAN > + bool > + > +config ARCH_DEFER_KASAN > + def_bool y > + depends on KASAN && ARCH_NEEDS_DEFER_KASAN > + help > + Architectures should select this if they need to defer KASAN > + initialization until shadow memory is properly set up. This > + enables runtime control via static keys. Otherwise, KASAN uses > + compile-time constants for better performance. > + > config CC_HAS_KASAN_GENERIC > def_bool $(cc-option, -fsanitize=kernel-address) > > diff --git a/mm/kasan/common.c b/mm/kasan/common.c > index 9142964ab9c9..e3765931a31f 100644 > --- a/mm/kasan/common.c > +++ b/mm/kasan/common.c > @@ -32,6 +32,15 @@ > #include "kasan.h" > #include "../slab.h" > > +#if defined(CONFIG_ARCH_DEFER_KASAN) || defined(CONFIG_KASAN_HW_TAGS) > +/* > + * Definition of the unified static key declared in kasan-enabled.h. > + * This provides consistent runtime enable/disable across KASAN modes. > + */ > +DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled); > +EXPORT_SYMBOL_GPL(kasan_flag_enabled); > +#endif > + > struct slab *kasan_addr_to_slab(const void *addr) > { > if (virt_addr_valid(addr)) > @@ -246,7 +255,7 @@ static inline void poison_slab_object(struct kmem_cache *cache, void *object, > bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object, > unsigned long ip) > { > - if (!kasan_arch_is_ready() || is_kfence_address(object)) > + if (is_kfence_address(object)) > return false; Why is the check removed here and in some other places below? This needs to be explained in the commit message.
> return check_slab_allocation(cache, object, ip); > } > @@ -254,7 +263,7 @@ bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object, > bool __kasan_slab_free(struct kmem_cache *cache, void *object, bool init, > bool still_accessible) > { > - if (!kasan_arch_is_ready() || is_kfence_address(object)) > + if (is_kfence_address(object)) > return false; > > /* > @@ -293,7 +302,7 @@ bool __kasan_slab_free(struct kmem_cache *cache, void *object, bool init, > > static inline bool check_page_allocation(void *ptr, unsigned long ip) > { > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return false; > > if (ptr != page_address(virt_to_head_page(ptr))) { > @@ -522,7 +531,7 @@ bool __kasan_mempool_poison_object(void *ptr, unsigned long ip) > return true; > } > > - if (is_kfence_address(ptr) || !kasan_arch_is_ready()) > + if (is_kfence_address(ptr)) > return true; > > slab = folio_slab(folio); > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c > index d54e89f8c3e7..b413c46b3e04 100644 > --- a/mm/kasan/generic.c > +++ b/mm/kasan/generic.c > @@ -36,6 +36,17 @@ > #include "kasan.h" > #include "../slab.h" > > +/* > + * Initialize Generic KASAN and enable runtime checks. > + * This should be called from arch kasan_init() once shadow memory is ready. 
> + */ > +void __init kasan_init_generic(void) > +{ > + kasan_enable(); > + > + pr_info("KernelAddressSanitizer initialized (generic)\n"); > +} > + > /* > * All functions below always inlined so compiler could > * perform better optimizations in each of __asan_loadX/__assn_storeX > @@ -165,7 +176,7 @@ static __always_inline bool check_region_inline(const void *addr, > size_t size, bool write, > unsigned long ret_ip) > { > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return true; > > if (unlikely(size == 0)) > @@ -193,7 +204,7 @@ bool kasan_byte_accessible(const void *addr) > { > s8 shadow_byte; > > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return true; > > shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(addr)); > @@ -495,7 +506,7 @@ static void release_alloc_meta(struct kasan_alloc_meta *meta) > > static void release_free_meta(const void *object, struct kasan_free_meta *meta) > { > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return; > > /* Check if free meta is valid. */ > @@ -562,7 +573,7 @@ void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags) > kasan_save_track(&alloc_meta->alloc_track, flags); > } > > -void kasan_save_free_info(struct kmem_cache *cache, void *object) > +void __kasan_save_free_info(struct kmem_cache *cache, void *object) > { > struct kasan_free_meta *free_meta; > > diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c > index 9a6927394b54..c8289a3feabf 100644 > --- a/mm/kasan/hw_tags.c > +++ b/mm/kasan/hw_tags.c > @@ -45,13 +45,6 @@ static enum kasan_arg kasan_arg __ro_after_init; > static enum kasan_arg_mode kasan_arg_mode __ro_after_init; > static enum kasan_arg_vmalloc kasan_arg_vmalloc __initdata; > > -/* > - * Whether KASAN is enabled at all. > - * The value remains false until KASAN is initialized by kasan_init_hw_tags(). 
> - */ > -DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled); > -EXPORT_SYMBOL(kasan_flag_enabled); > - > /* > * Whether the selected mode is synchronous, asynchronous, or asymmetric. > * Defaults to KASAN_MODE_SYNC. > @@ -260,7 +253,7 @@ void __init kasan_init_hw_tags(void) > kasan_init_tags(); > > /* KASAN is now initialized, enable it. */ > - static_branch_enable(&kasan_flag_enabled); > + kasan_enable(); > > pr_info("KernelAddressSanitizer initialized (hw-tags, mode=%s, vmalloc=%s, stacktrace=%s)\n", > kasan_mode_info(), > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h > index 129178be5e64..8a9d8a6ea717 100644 > --- a/mm/kasan/kasan.h > +++ b/mm/kasan/kasan.h > @@ -398,7 +398,13 @@ depot_stack_handle_t kasan_save_stack(gfp_t flags, depot_flags_t depot_flags); > void kasan_set_track(struct kasan_track *track, depot_stack_handle_t stack); > void kasan_save_track(struct kasan_track *track, gfp_t flags); > void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags); > -void kasan_save_free_info(struct kmem_cache *cache, void *object); > + > +void __kasan_save_free_info(struct kmem_cache *cache, void *object); > +static inline void kasan_save_free_info(struct kmem_cache *cache, void *object) > +{ > + if (kasan_enabled()) > + __kasan_save_free_info(cache, object); > +} What I meant with these __wrappers was that we should add them for the KASAN hooks that are called from non-KASAN code (i.e. for the hooks defined in include/linux/kasan.h). And then move all the kasan_enabled() checks from mm/kasan/* to where the wrappers are defined in include/linux/kasan.h (see kasan_unpoison_range() as an example). kasan_save_free_info is a KASAN internal function that should not need such a wrapper. For now, to make these patches simpler, you can keep kasan_enabled() checks in mm/kasan/*, where they are now. Later we can move them to include/linux/kasan.h with a separate patch.
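For readers following the thread, the wrapper pattern referred to above (the kasan_unpoison_range() style of gating a double-underscore hook behind kasan_enabled()) can be sketched in plain userspace C. This is only an illustrative sketch, not kernel code: the boolean kasan_flag_enabled stands in for the real static key, and unpoison_calls and demo_wrapper_gating() are hypothetical names added here so the effect is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's static key; a plain bool, for illustration only. */
static bool kasan_flag_enabled;

static inline bool kasan_enabled(void)
{
	return kasan_flag_enabled;
}

/* Hypothetical counter so this sketch can observe whether the hook ran. */
static unsigned int unpoison_calls;

/* The "real" hook body, as it would live under mm/kasan/. */
static void __kasan_unpoison_range(const void *addr, size_t size)
{
	(void)addr;
	(void)size;
	unpoison_calls++;
}

/* The public wrapper, as it would live in include/linux/kasan.h. */
static inline void kasan_unpoison_range(const void *addr, size_t size)
{
	if (kasan_enabled())
		__kasan_unpoison_range(addr, size);
}

/* Hypothetical helper: returns how often the real hook ran in total. */
static unsigned int demo_wrapper_gating(void)
{
	char buf[16];

	unpoison_calls = 0;

	kasan_flag_enabled = false;
	kasan_unpoison_range(buf, sizeof(buf));	/* skipped: not enabled yet */
	assert(unpoison_calls == 0);

	kasan_flag_enabled = true;		/* what kasan_enable() would do */
	kasan_unpoison_range(buf, sizeof(buf));	/* forwarded to the real hook */

	return unpoison_calls;
}
```

With this split, non-KASAN callers only ever see the wrapper, so the enablement check sits in one place per hook instead of being repeated inside each mm/kasan/ function body.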
> > #ifdef CONFIG_KASAN_GENERIC > bool kasan_quarantine_put(struct kmem_cache *cache, void *object); > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c > index d2c70cd2afb1..2e126cb21b68 100644 > --- a/mm/kasan/shadow.c > +++ b/mm/kasan/shadow.c > @@ -125,7 +125,7 @@ void kasan_poison(const void *addr, size_t size, u8 value, bool init) > { > void *shadow_start, *shadow_end; > > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return; > > /* > @@ -150,7 +150,7 @@ EXPORT_SYMBOL_GPL(kasan_poison); > #ifdef CONFIG_KASAN_GENERIC > void kasan_poison_last_granule(const void *addr, size_t size) > { > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return; > > if (size & KASAN_GRANULE_MASK) { > @@ -390,7 +390,7 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size) > unsigned long shadow_start, shadow_end; > int ret; > > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return 0; > > if (!is_vmalloc_or_module_addr((void *)addr)) > @@ -560,7 +560,7 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end, > unsigned long region_start, region_end; > unsigned long size; > > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return; > > region_start = ALIGN(start, KASAN_MEMORY_PER_SHADOW_PAGE); > @@ -611,7 +611,7 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, > * with setting memory tags, so the KASAN_VMALLOC_INIT flag is ignored. 
> */ > > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return (void *)start; > > if (!is_vmalloc_or_module_addr(start)) > @@ -636,7 +636,7 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, > */ > void __kasan_poison_vmalloc(const void *start, unsigned long size) > { > - if (!kasan_arch_is_ready()) > + if (!kasan_enabled()) > return; > > if (!is_vmalloc_or_module_addr(start)) > diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c > index b9382b5b6a37..c75741a74602 100644 > --- a/mm/kasan/sw_tags.c > +++ b/mm/kasan/sw_tags.c > @@ -44,6 +44,7 @@ void __init kasan_init_sw_tags(void) > per_cpu(prng_state, cpu) = (u32)get_cycles(); > > kasan_init_tags(); > + kasan_enable(); > > pr_info("KernelAddressSanitizer initialized (sw-tags, stacktrace=%s)\n", > str_on_off(kasan_stack_collection_enabled())); > diff --git a/mm/kasan/tags.c b/mm/kasan/tags.c > index d65d48b85f90..b9f31293622b 100644 > --- a/mm/kasan/tags.c > +++ b/mm/kasan/tags.c > @@ -142,7 +142,7 @@ void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags) > save_stack_info(cache, object, flags, false); > } > > -void kasan_save_free_info(struct kmem_cache *cache, void *object) > +void __kasan_save_free_info(struct kmem_cache *cache, void *object) > { > save_stack_info(cache, object, 0, true); > } > -- > 2.34.1 > From andreyknvl at gmail.com Wed Sep 3 06:01:40 2025 From: andreyknvl at gmail.com (Andrey Konovalov) Date: Wed, 3 Sep 2025 15:01:40 +0200 Subject: [PATCH v6 1/2] kasan: introduce ARCH_DEFER_KASAN and unify static key across modes In-Reply-To: References: <20250810125746.1105476-1-snovitoll@gmail.com> <20250810125746.1105476-2-snovitoll@gmail.com> Message-ID: On Wed, Sep 3, 2025 at 3:00 PM Andrey Konovalov wrote: > > > +void __kasan_save_free_info(struct kmem_cache *cache, void *object); > > +static inline void kasan_save_free_info(struct kmem_cache *cache, void *object) > > +{ > > + if (kasan_enabled()) > > + __kasan_save_free_info(cache, object); > 
> +} > > What I meant with these __wrappers was that we should add them for the > KASAN hooks that are called from non-KASAN code (i.e. for the hooks > defined in include/linux/kasan.h). And then move all the > kasan_enabled() checks from mm/kasan/* to where the wrappers are > defined in include/linux/kasan.h (see kasan_unpoison_range() as an > example). > > kasan_save_free_info is a KASAN internal function that should need > such a wrapper. ... should _not_ need ... > > For now, to make these patches simpler, you can keep kasan_enabled() > checks in mm/kasan/*, where they are now. Later we can move them to > include/linux/kasan.h with a separate patch. From anup at brainfault.org Wed Sep 3 07:31:37 2025 From: anup at brainfault.org (Anup Patel) Date: Wed, 3 Sep 2025 20:01:37 +0530 Subject: [PATCH v3 0/3] KVM: riscv: selftests: Enable supported test cases In-Reply-To: References: Message-ID: On Mon, Sep 1, 2025 at 1:06 PM wrote: > > From: Dong Yang > > Add supported KVM test cases and fix the compilation dependencies. 
> --- > Changes in v3: > - Reorder patches to fix build dependencies > - Sort common supported test cases alphabetically > - Move ucall_common.h include from common header to specific source files > > Changes in v2: > - Delete some repeat KVM test cases on riscv > - Add missing headers to fix the build for new RISC-V KVM selftests > > Dong Yang (1): > KVM: riscv: selftests: Add missing headers for new testcases > > Quan Zhou (2): > KVM: riscv: selftests: Use the existing RISCV_FENCE macro in > `rseq-riscv.h` > KVM: riscv: selftests: Add common supported test cases Queued this series for Linux-6.18 Thanks, Anup From guoren at kernel.org Wed Sep 3 07:42:17 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Wed, 3 Sep 2025 10:42:17 -0400 Subject: [PATCH] iommu/riscv: Use two individual 4-byte accesses for 8-byte register Message-ID: <20250903144217.837448-1-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RISC-V IOMMU memory-mapped register interface defines: The 8-byte IOMMU registers are defined in such a way that software can perform two individual 4-byte accesses. Therefore, use two individual 4-byte accesses for an 8-byte register to make the driver compatible with a 32-bit-wide interconnect. 
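As a rough illustration of the split access described above, here is a minimal userspace sketch. It models the register window as a plain array of 32-bit words rather than real MMIO. The names `fake_regs`, `reg_read32()`, `reg_write32()`, `reg_read64_split()` and `reg_write64_split()` are hypothetical and exist only for this example; they are not the driver's API. The high-word-then-low-word write order simply mirrors what the patch in this mail does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an MMIO window: four 32-bit words (16 bytes). */
static uint32_t fake_regs[4];

/* Model of a 4-byte register read at a 4-byte-aligned byte offset. */
static uint32_t reg_read32(uint16_t off)
{
	assert(off % 4 == 0 && off / 4 < 4);
	return fake_regs[off / 4];
}

/* Model of a 4-byte register write at a 4-byte-aligned byte offset. */
static void reg_write32(uint16_t off, uint32_t val)
{
	assert(off % 4 == 0 && off / 4 < 4);
	fake_regs[off / 4] = val;
}

/* Read an 8-byte register as two 4-byte reads: low word at off, high word at off + 4. */
static uint64_t reg_read64_split(uint16_t off)
{
	uint32_t lo = reg_read32(off);
	uint32_t hi = reg_read32((uint16_t)(off + 4));

	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Write an 8-byte register as two 4-byte writes, high word first. */
static void reg_write64_split(uint16_t off, uint64_t val)
{
	reg_write32((uint16_t)(off + 4), (uint32_t)(val >> 32));
	reg_write32(off, (uint32_t)val);
}
```

Two 4-byte accesses are not one atomic 8-byte access, so a register that hardware can update between the two halves may need extra care; this sketch deliberately ignores that.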
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/iommu/riscv/iommu.c | 7 +++++-- drivers/iommu/riscv/iommu.h | 27 ++++++++++++++++++++------- 2 files changed, 25 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 0eae2f4bdc5e..9a80464ed7be 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -662,9 +662,12 @@ void riscv_iommu_disable(struct riscv_iommu_device *iommu) #define riscv_iommu_read_ddtp(iommu) ({ \ u64 ddtp; \ - riscv_iommu_readq_timeout((iommu), RISCV_IOMMU_REG_DDTP, ddtp, \ - !(ddtp & RISCV_IOMMU_DDTP_BUSY), 10, \ + u32 ddtp_lo, ddtp_hi; \ + riscv_iommu_readl_timeout((iommu), RISCV_IOMMU_REG_DDTP, ddtp_lo, \ + !(ddtp_lo & RISCV_IOMMU_DDTP_BUSY), 10, \ RISCV_IOMMU_DDTP_TIMEOUT); \ + ddtp_hi = riscv_iommu_readl(iommu, RISCV_IOMMU_REG_DDTP + 4); \ + ddtp = ((u64)ddtp_hi << 32) | ddtp_lo; \ ddtp; }) static int riscv_iommu_iodir_alloc(struct riscv_iommu_device *iommu) diff --git a/drivers/iommu/riscv/iommu.h b/drivers/iommu/riscv/iommu.h index 46df79dd5495..698acffff298 100644 --- a/drivers/iommu/riscv/iommu.h +++ b/drivers/iommu/riscv/iommu.h @@ -69,18 +69,31 @@ void riscv_iommu_disable(struct riscv_iommu_device *iommu); #define riscv_iommu_readl(iommu, addr) \ readl_relaxed((iommu)->reg + (addr)) -#define riscv_iommu_readq(iommu, addr) \ - readq_relaxed((iommu)->reg + (addr)) +static inline u64 riscv_iommu_readq(struct riscv_iommu_device *iommu, + u16 addr) +{ + u32 val_lo, val_hi; + + val_lo = readl_relaxed((iommu)->reg + (addr)); + val_hi = readl_relaxed((iommu)->reg + (addr) + 4); + + return (u64) val_lo | ((u64) val_hi << 32); +} #define riscv_iommu_writel(iommu, addr, val) \ writel_relaxed((val), (iommu)->reg + (addr)) -#define riscv_iommu_writeq(iommu, addr, val) \ - writeq_relaxed((val), (iommu)->reg + (addr)) +static inline void riscv_iommu_writeq(struct riscv_iommu_device *iommu, + u16 addr, u64 val) +{ + u32 val_lo, val_hi; -#define 
riscv_iommu_readq_timeout(iommu, addr, val, cond, delay_us, timeout_us) \ - readx_poll_timeout(readq_relaxed, (iommu)->reg + (addr), val, cond, \ - delay_us, timeout_us) + val_hi = (u32) (val >> 32); + val_lo = (u32) val; + + writel_relaxed((val_hi), (iommu)->reg + (addr) + 4); + writel_relaxed((val_lo), (iommu)->reg + (addr)); +} #define riscv_iommu_readl_timeout(iommu, addr, val, cond, delay_us, timeout_us) \ readx_poll_timeout(readl_relaxed, (iommu)->reg + (addr), val, cond, \ -- 2.40.1 From hendrik.hamerlinck at hammernet.be Wed Sep 3 07:53:34 2025 From: hendrik.hamerlinck at hammernet.be (Hendrik Hamerlinck) Date: Wed, 3 Sep 2025 16:53:34 +0200 Subject: [PATCH] riscv: dts: spacemit: add UART pinctrl combinations Message-ID: <20250903145334.425633-1-hendrik.hamerlinck@hammernet.be> This adds UART pinctrl configurations based on the SoC datasheet and the downstream Bianbu Linux tree. The drive strength values were taken from the downstream implementation, which uses medium drive strength. For convenience, the board DTS files have been updated to include all UART instances with their possible pinmux options in a disabled state. Tested this locally on both Orange Pi RV2 and Banana Pi BPI-F3 boards. 
Signed-off-by: Hendrik Hamerlinck --- .../boot/dts/spacemit/k1-bananapi-f3.dts | 18 ++ .../boot/dts/spacemit/k1-orangepi-rv2.dts | 18 ++ arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 276 +++++++++++++++++- 3 files changed, 309 insertions(+), 3 deletions(-) diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts index 6013be258542..661d47d1ce9e 100644 --- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts +++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts @@ -49,3 +49,21 @@ &uart0 { pinctrl-0 = <&uart0_2_cfg>; status = "okay"; }; + +&uart5 { + pinctrl-names = "default"; + pinctrl-0 = <&uart5_3_cfg>; + status = "disabled"; +}; + +&uart8 { + pinctrl-names = "default"; + pinctrl-0 = <&uart8_2_cfg>; + status = "disabled"; +}; + +&uart9 { + pinctrl-names = "default"; + pinctrl-0 = <&uart9_2_cfg>; + status = "disabled"; +}; diff --git a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts index 337240ebb7b7..dc45b75b1ad4 100644 --- a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts +++ b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts @@ -38,3 +38,21 @@ &uart0 { pinctrl-0 = <&uart0_2_cfg>; status = "okay"; }; + +&uart5 { + pinctrl-names = "default"; + pinctrl-0 = <&uart5_3_cfg>; + status = "disabled"; +}; + +&uart8 { + pinctrl-names = "default"; + pinctrl-0 = <&uart8_2_cfg>; + status = "disabled"; +}; + +&uart9 { + pinctrl-names = "default"; + pinctrl-0 = <&uart9_2_cfg>; + status = "disabled"; +}; diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi index 381055737422..43425530b5bf 100644 --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi @@ -11,12 +11,282 @@ #define K1_GPIO(x) (x / 32) (x % 32) &pinctrl { + uart0_0_cfg: uart0-0-cfg { + uart0-0-pins { + pinmux = , /* uart0_txd */ + ; /* uart0_rxd */ + power-source = <3300>; + bias-pull-up; + drive-strength = <19>; + 
}; + }; + + uart0_1_cfg: uart0-1-cfg { + uart0-1-pins { + pinmux = , /* uart0_txd */ + ; /* uart0_rxd */ + power-source = <3300>; + bias-pull-up; + drive-strength = <19>; + }; + }; + uart0_2_cfg: uart0-2-cfg { uart0-2-pins { - pinmux = , - ; + pinmux = , /* uart0_txd */ + ; /* uart0_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; - bias-pull-up = <0>; + uart2_0_cfg: uart2-0-cfg { + uart2-0-pins { + pinmux = , /* uart2_txd */ + , /* uart2_rxd */ + , /* uart2_cts */ + ; /* uart2_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart3_0_cfg: uart3-0-cfg { + uart3-0-pins { + pinmux = , /* uart3_txd */ + , /* uart3_rxd */ + , /* uart3_cts */ + ; /* uart3_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart3_1_cfg: uart3-1-cfg { + uart3-1-pins { + pinmux = , /* uart3_txd */ + , /* uart3_rxd */ + , /* uart3_cts */ + ; /* uart3_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart3_2_cfg: uart3-2-cfg { + uart3-2-pins { + pinmux = , /* uart3_txd */ + , /* uart3_rxd */ + , /* uart3_cts */ + ; /* uart3_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart4_0_cfg: uart4-0-cfg { + uart4-0-pins { + pinmux = , /* uart4_txd */ + ; /* uart4_rxd */ + power-source = <3300>; + bias-pull-up; + drive-strength = <19>; + }; + }; + + uart4_1_cfg: uart4-1-cfg { + uart4-1-pins { + pinmux = , /* uart4_cts */ + , /* uart4_rts */ + , /* uart4_txd */ + ; /* uart4_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart4_2_cfg: uart4-2-cfg { + uart4-2-pins { + pinmux = , /* uart4_txd */ + ; /* uart4_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart4_3_cfg: uart4-3-cfg { + uart4-3-pins { + pinmux = , /* uart4_txd */ + , /* uart4_rxd */ + , /* uart4_cts */ + ; /* uart4_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart4_4_cfg: uart4-4-cfg { + uart4-4-pins { + pinmux = , /* uart4_txd */ + , /* uart4_rxd */ + , /* uart4_cts */ + ; /* uart4_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; 
+ + uart5_0_cfg: uart5-0-cfg { + uart5-0-pins { + pinmux = , /* uart5_txd */ + ; /* uart5_rxd */ + power-source = <3300>; + bias-pull-up; + drive-strength = <19>; + }; + }; + + uart5_1_cfg: uart5-1-cfg { + uart5-1-pins { + pinmux = , /* uart5_txd */ + , /* uart5_rxd */ + , /* uart5_cts */ + ; /* uart5_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart5_2_cfg: uart5-2-cfg { + uart5-2-pins { + pinmux = , /* uart5_txd */ + , /* uart5_rxd */ + , /* uart5_cts */ + ; /* uart5_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart5_3_cfg: uart5-3-cfg { + uart5-3-pins { + pinmux = , /* uart5_txd */ + , /* uart5_rxd */ + , /* uart5_cts */ + ; /* uart5_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart6_0_cfg: uart6-0-cfg { + uart6-0-pins { + pinmux = , /* uart6_cts */ + , /* uart6_txd */ + , /* uart6_rxd */ + ; /* uart6_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart6_1_cfg: uart6-1-cfg { + uart6-1-pins { + pinmux = , /* uart6_txd */ + , /* uart6_rxd */ + , /* uart6_cts */ + ; /* uart6_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart6_2_cfg: uart6-2-cfg { + uart6-2-pins { + pinmux = , /* uart6_txd */ + ; /* uart6_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart7_0_cfg: uart7-0-cfg { + uart7-0-pins { + pinmux = , /* uart7_txd */ + ; /* uart7_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart7_1_cfg: uart7-1-cfg { + uart7-1-pins { + pinmux = , /* uart7_txd */ + , /* uart7_rxd */ + , /* uart7_cts */ + ; /* uart7_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart8_0_cfg: uart8-0-cfg { + uart8-0-pins { + pinmux = , /* uart8_txd */ + ; /* uart8_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart8_1_cfg: uart8-1-cfg { + uart8-1-pins { + pinmux = , /* uart8_txd */ + , /* uart8_rxd */ + , /* uart8_cts */ + ; /* uart8_rts */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart8_2_cfg: uart8-2-cfg { + uart8-2-pins { + pinmux = , /* 
uart8_txd */ + , /* uart8_rxd */ + , /* uart8_cts */ + ; /* uart8_rts */ + power-source = <3300>; + bias-pull-up; + drive-strength = <19>; + }; + }; + + uart9_0_cfg: uart9-0-cfg { + uart9-0-pins { + pinmux = , /* uart9_txd */ + ; /* uart9_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart9_1_cfg: uart9-1-cfg { + uart9-1-pins { + pinmux = , /* uart9_cts */ + , /* uart9_rts */ + , /* uart9_txd */ + ; /* uart9_rxd */ + bias-pull-up; + drive-strength = <32>; + }; + }; + + uart9_2_cfg: uart9-2-cfg { + uart9-2-pins { + pinmux = , /* uart9_txd */ + ; /* uart9_rxd */ + bias-pull-up; drive-strength = <32>; }; }; -- 2.43.0 From spriteovo at gmail.com Wed Sep 3 11:52:24 2025 From: spriteovo at gmail.com (Asuna) Date: Thu, 4 Sep 2025 02:52:24 +0800 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: <20250901-unseemly-blimp-a74e3c77e780@spud> References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> <20250901-lasso-kabob-de32b8fcede8@spud> <20250901-unseemly-blimp-a74e3c77e780@spud> Message-ID: <9de3673c-3b1c-4838-a5fe-e8877a1c3ace@gmail.com> (I apologize if anyone gets this email twice, the first time I mistakenly sent it as HTML and it was rejected by mailing lists) > Similarly, something like -Wno-unterminated-string-initialization could > cause a problem if gcc supports it but not libclang. And the -Wno-unterminated-string-initialization is not supposed to be a problem either, today I noticed that in rust/Makefile there is: # All warnings are inhibited since GCC builds are very experimental, # many GCC warnings are not supported by Clang, they may only appear in # some configurations, with new GCC versions, etc. bindgen_extra_c_flags = -w --target=$(BINDGEN_TARGET) The -w flag inhibits all warnings, even though Clang may not recognize it. I was not able to reproduce any errors related to this. I have a patch ready and will send it later. Please let me know if I'm missing something there. Thanks. 
From alexghiti at rivosinc.com Wed Sep 3 11:53:07 2025 From: alexghiti at rivosinc.com (Alexandre Ghiti) Date: Wed, 03 Sep 2025 18:53:07 +0000 Subject: [PATCH 0/2] Fix riscv sparse warnings Message-ID: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> This series simply fixes 2 recently introduced sparse warnings. Signed-off-by: Alexandre Ghiti --- Alexandre Ghiti (2): riscv: Fix sparse warning in __get_user_error() riscv: Fix sparse warning about different address spaces arch/riscv/include/asm/uaccess.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- base-commit: ae9a687664d965b13eeab276111b2f97dd02e090 change-id: 20250903-dev-alex-sparse_warnings_v1-ecb4a333afdd Best regards, -- Alexandre Ghiti From alexghiti at rivosinc.com Wed Sep 3 11:53:08 2025 From: alexghiti at rivosinc.com (Alexandre Ghiti) Date: Wed, 03 Sep 2025 18:53:08 +0000 Subject: [PATCH 1/2] riscv: Fix sparse warning in __get_user_error() In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> Message-ID: <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> We used to assign 0 to x without an appropriate cast which results in sparse complaining when x is a pointer: >> block/ioctl.c:72:39: sparse: sparse: Using plain integer as NULL pointer So fix this by casting 0 to the correct type of x. 
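The cast described above can be sketched in userspace, assuming a GNU C compatible compiler for `__typeof__`. The `ZERO_ON_FAIL()` macro and `zero_on_fail_demo()` are hypothetical stand-ins for the failure path, not the kernel's `__get_user_error()` itself, and the sparse warning cannot be reproduced by running the code; the sketch only shows that one macro body works for both integer and pointer destinations.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the failure path: zero the destination through
 * a cast to its own type. For a pointer destination this yields a null
 * pointer of the right type instead of a bare integer 0.
 */
#define ZERO_ON_FAIL(x)			\
do {					\
	(x) = (__typeof__(x))0;		\
} while (0)

/* Returns 1 if both an integer and a pointer destination end up zeroed. */
static int zero_on_fail_demo(void)
{
	long val = 42;
	int dummy = 0;
	int *ptr = &dummy;

	assert(ptr != NULL);

	ZERO_ON_FAIL(val);
	ZERO_ON_FAIL(ptr);

	return val == 0 && ptr == NULL;
}
```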
Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202508062321.gHv4kvuY-lkp at intel.com/ Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") Cc: stable at vger.kernel.org Signed-off-by: Alexandre Ghiti --- arch/riscv/include/asm/uaccess.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index 22e3f52a763d1c0350e8185225e4c99aac3fc549..551e7490737effb2c238e6a4db50293ece7c9df9 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -209,7 +209,7 @@ do { \ err = 0; \ break; \ __gu_failed: \ - x = 0; \ + x = (__typeof__(x))0; \ err = -EFAULT; \ } while (0) -- 2.34.1 From alexghiti at rivosinc.com Wed Sep 3 11:53:09 2025 From: alexghiti at rivosinc.com (Alexandre Ghiti) Date: Wed, 03 Sep 2025 18:53:09 +0000 Subject: [PATCH 2/2] riscv: Fix sparse warning about different address spaces In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> Message-ID: <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> We did not propagate the __user attribute of the pointers in __get_kernel_nofault() and __put_kernel_nofault(), which results in sparse complaining: >> mm/maccess.c:41:17: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const [noderef] __user *from @@ got unsigned long long [usertype] * @@ mm/maccess.c:41:17: sparse: expected void const [noderef] __user *from mm/maccess.c:41:17: sparse: got unsigned long long [usertype] * So fix this by correctly casting those pointers. 
Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202508161713.RWu30Lv1-lkp at intel.com/ Suggested-by: Al Viro Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") Cc: stable at vger.kernel.org Signed-off-by: Alexandre Ghiti --- arch/riscv/include/asm/uaccess.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index 551e7490737effb2c238e6a4db50293ece7c9df9..f5f4f7f85543f2a635b18e4bd1c6202b20e3b239 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -438,10 +438,10 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) } #define __get_kernel_nofault(dst, src, type, err_label) \ - __get_user_nocheck(*((type *)(dst)), (type *)(src), err_label) + __get_user_nocheck(*((type *)(dst)), (__force __user type *)(src), err_label) #define __put_kernel_nofault(dst, src, type, err_label) \ - __put_user_nocheck(*((type *)(src)), (type *)(dst), err_label) + __put_user_nocheck(*((type *)(src)), (__force __user type *)(dst), err_label) static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len) { -- 2.34.1 From vishal.moola at gmail.com Wed Sep 3 11:59:14 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:14 -0700 Subject: [PATCH v3 0/7] Cleanup free_pages() misuse Message-ID: <20250903185921.1785167-1-vishal.moola@gmail.com> free_pages() is supposed to be called when we only have a virtual address. __free_pages() is supposed to be called when we have a page. There are a number of callers that use page_address() to get a page's virtual address then call free_pages() on it when they should just call __free_pages() directly. Add kernel-docs for free_pages() to help callers better understand which function they should be calling, and replace the obvious cases of misuse. 
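To make the distinction in the cover letter concrete, here is a small userspace model, not kernel code. It assumes a toy "memory" of eight pages with one descriptor each; `model_page`, `model_page_address()`, `model_virt_to_page()`, `model_free_by_page()` and `model_free_by_addr()` are hypothetical names that stand in for `struct page`, `page_address()`, `virt_to_page()`, `__free_pages()` and `free_pages()` respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NR_PAGES  8u

/* Hypothetical page descriptor: 'freed' counts how often the page was released. */
struct model_page {
	unsigned int freed;
};

static struct model_page model_pages[MODEL_NR_PAGES];
static unsigned char model_memory[MODEL_NR_PAGES * MODEL_PAGE_SIZE];

/* Descriptor -> address (models page_address()). */
static uintptr_t model_page_address(struct model_page *page)
{
	size_t idx = (size_t)(page - model_pages);

	return (uintptr_t)&model_memory[idx * MODEL_PAGE_SIZE];
}

/* Address -> descriptor (models virt_to_page()). */
static struct model_page *model_virt_to_page(uintptr_t addr)
{
	assert(addr >= (uintptr_t)model_memory);
	return &model_pages[(addr - (uintptr_t)model_memory) / MODEL_PAGE_SIZE];
}

/* Models __free_pages(): the caller already holds the descriptor. */
static void model_free_by_page(struct model_page *page, unsigned int order)
{
	for (unsigned int i = 0; i < (1u << order); i++)
		page[i].freed++;
}

/* Models free_pages(): only an address is known, so the page is looked up first. */
static void model_free_by_addr(uintptr_t addr, unsigned int order)
{
	if (addr != 0)
		model_free_by_page(model_virt_to_page(addr), order);
}
```

In this model, calling `model_free_by_addr(model_page_address(page), order)` converts the descriptor to an address only for it to be converted straight back, which is the round trip the series removes by calling the page-based function directly.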
----------------- Based on mm-new, I intend to have all of these taken through the mm tree. I've split the patches into separate subsystems to make it easier to resolve conflicts, but there aren't any functional changes. v3: - Collect some Reviewed-by Tags - Replace remaining free_page() calls in patch 7 (all other patches are unchanged from v2) - Add all appropriate mailing lists that were missing from v2 v2: - Reference __get_free_pages() instead of alloc_pages() in the free_pages() kernel-doc - Get some Reviewed-by tags - cc the subsystem maintainers related to specific patches Vishal Moola (Oracle) (7): mm/page_alloc: Add kernel-docs for free_pages() aoe: Stop calling page_address() in free_page() x86: Stop calling page_address() in free_pages() riscv: Stop calling page_address() in free_pages() powerpc: Stop calling page_address() in free_pages() arm64: Stop calling page_address() in free_pages() virtio_balloon: Stop calling page_address() in free_pages() arch/arm64/mm/mmu.c | 2 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- arch/riscv/mm/init.c | 4 ++-- arch/x86/mm/init_64.c | 2 +- arch/x86/platform/efi/memmap.c | 2 +- drivers/block/aoe/aoecmd.c | 2 +- drivers/virtio/virtio_balloon.c | 8 +++----- mm/page_alloc.c | 9 +++++++++ 8 files changed, 19 insertions(+), 12 deletions(-) -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:15 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:15 -0700 Subject: [PATCH v3 1/7] mm/page_alloc: Add kernel-docs for free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-2-vishal.moola@gmail.com> Add kernel-docs to free_pages(). This will help callers understand when to use it instead of __free_pages(). 
Signed-off-by: Vishal Moola (Oracle) Reviewed-by: Matthew Wilcox (Oracle) Acked-by: SeongJae Park --- mm/page_alloc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c2a254d877f8..0277b86b62ac 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5269,6 +5269,15 @@ void free_pages_nolock(struct page *page, unsigned int order) ___free_pages(page, order, FPI_TRYLOCK); } +/** + * free_pages - Free pages allocated with __get_free_pages(). + * @addr: The virtual address tied to a page returned from __get_free_pages(). + * @order: The order of the allocation. + * + * This function behaves the same as __free_pages(). Use this function + * to free pages when you only have a valid virtual address. If you have + * the page, call __free_pages() instead. + */ void free_pages(unsigned long addr, unsigned int order) { if (addr != 0) { -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:16 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:16 -0700 Subject: [PATCH v3 2/7] aoe: Stop calling page_address() in free_page() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-3-vishal.moola@gmail.com> free_page() should be used when we only have a virtual address. We should call __free_page() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) Reviewed-by: Matthew Wilcox (Oracle) --- drivers/block/aoe/aoecmd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 6298f8e271e3..a9affb7c264d 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -1761,6 +1761,6 @@ aoecmd_exit(void) kfree(kts); kfree(ktiowq); - free_page((unsigned long) page_address(empty_page)); + __free_page(empty_page); empty_page = NULL; } -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:17 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:17 -0700 Subject: [PATCH v3 3/7] x86: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-4-vishal.moola@gmail.com> free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) Acked-by: Dave Hansen --- arch/x86/mm/init_64.c | 2 +- arch/x86/platform/efi/memmap.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index b9426fce5f3e..0e4270e20fad 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1031,7 +1031,7 @@ static void __meminit free_pagetable(struct page *page, int order) free_reserved_pages(page, nr_pages); #endif } else { - free_pages((unsigned long)page_address(page), order); + __free_pages(page, order); } } diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c index 061b8ecc71a1..023697c88910 100644 --- a/arch/x86/platform/efi/memmap.c +++ b/arch/x86/platform/efi/memmap.c @@ -42,7 +42,7 @@ void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags) struct page *p = pfn_to_page(PHYS_PFN(phys)); unsigned int order = get_order(size); - free_pages((unsigned long) page_address(p), order); + __free_pages(p, order); } } -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:18 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:18 -0700 Subject: [PATCH v3 4/7] riscv: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-5-vishal.moola@gmail.com> free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) --- arch/riscv/mm/init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 15683ae13fa5..1056c11d3251 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1624,7 +1624,7 @@ static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d) if (PageReserved(page)) free_reserved_page(page); else - free_pages((unsigned long)page_address(page), 0); + __free_pages(page, 0); p4d_clear(p4d); } @@ -1646,7 +1646,7 @@ static void __meminit free_vmemmap_storage(struct page *page, size_t size, return; } - free_pages((unsigned long)page_address(page), order); + __free_pages(page, order); } static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long addr, unsigned long end, -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:19 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:19 -0700 Subject: [PATCH v3 5/7] powerpc: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-6-vishal.moola@gmail.com> free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) Reviewed-by: Ritesh Harjani (IBM) --- arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index be523e5fe9c5..73977dbabcf2 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -780,7 +780,7 @@ static void __meminit free_vmemmap_pages(struct page *page, while (nr_pages--) free_reserved_page(page++); } else - free_pages((unsigned long)page_address(page), order); + __free_pages(page, order); } static void __meminit remove_pte_table(pte_t *pte_start, unsigned long addr, -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:20 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:20 -0700 Subject: [PATCH v3 6/7] arm64: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-7-vishal.moola@gmail.com> free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) Acked-by: Catalin Marinas --- arch/arm64/mm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 34e5d78af076..e14a75d0dbd3 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -843,7 +843,7 @@ static void free_hotplug_page_range(struct page *page, size_t size, vmem_altmap_free(altmap, size >> PAGE_SHIFT); } else { WARN_ON(PageReserved(page)); - free_pages((unsigned long)page_address(page), get_order(size)); + __free_pages(page, get_order(size)); } } -- 2.51.0 From vishal.moola at gmail.com Wed Sep 3 11:59:21 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Wed, 3 Sep 2025 11:59:21 -0700 Subject: [PATCH v3 7/7] virtio_balloon: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: <20250903185921.1785167-8-vishal.moola@gmail.com> free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. 
Signed-off-by: Vishal Moola (Oracle) --- drivers/virtio/virtio_balloon.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index eae65136cdfb..7f3fd72678eb 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -488,8 +488,7 @@ static unsigned long return_free_pages_to_mm(struct virtio_balloon *vb, page = balloon_page_pop(&vb->free_page_list); if (!page) break; - free_pages((unsigned long)page_address(page), - VIRTIO_BALLOON_HINT_BLOCK_ORDER); + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); } vb->num_free_page_blocks -= num_returned; spin_unlock_irq(&vb->free_page_list_lock); @@ -719,8 +718,7 @@ static int get_free_page_and_send(struct virtio_balloon *vb) if (vq->num_free > 1) { err = virtqueue_add_inbuf(vq, &sg, 1, p, GFP_KERNEL); if (unlikely(err)) { - free_pages((unsigned long)p, - VIRTIO_BALLOON_HINT_BLOCK_ORDER); + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); return err; } virtqueue_kick(vq); @@ -733,7 +731,7 @@ static int get_free_page_and_send(struct virtio_balloon *vb) * The vq has no available entry to add this page block, so * just free it. */ - free_pages((unsigned long)p, VIRTIO_BALLOON_HINT_BLOCK_ORDER); + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); } return 0; -- 2.51.0 From spriteovo at gmail.com Wed Sep 3 12:07:56 2025 From: spriteovo at gmail.com (Asuna Yang) Date: Wed, 3 Sep 2025 21:07:56 +0200 Subject: [PATCH 1/2] rust: get the version of libclang used by bindgen in a separate script In-Reply-To: <20250830-cheesy-prone-ee5fae406c22@spud> References: <20250830-cheesy-prone-ee5fae406c22@spud> Message-ID: <20250903190806.2604757-1-SpriteOvO@gmail.com> Decouple the code for getting the version of libclang used by Rust bindgen from rust_is_available.sh into a separate script so that we can define a symbol for the version in Kconfig that will be used for checking in subsequent patches. 
Signed-off-by: Asuna Yang --- init/Kconfig | 6 ++ rust/Makefile | 2 +- scripts/Kconfig.include | 1 + ...lang.h => rust-bindgen-libclang-version.h} | 0 scripts/rust-bindgen-libclang-version.sh | 94 +++++++++++++++++++ scripts/rust_is_available.sh | 58 +++--------- scripts/rust_is_available_test.py | 22 ++--- 7 files changed, 125 insertions(+), 58 deletions(-) rename scripts/{rust_is_available_bindgen_libclang.h => rust-bindgen-libclang-version.h} (100%) create mode 100755 scripts/rust-bindgen-libclang-version.sh diff --git a/init/Kconfig b/init/Kconfig index 666783eb50ab..322af2ba76cd 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -82,6 +82,12 @@ config RUSTC_LLVM_VERSION int default $(rustc-llvm-version) +config RUST_BINDGEN_LIBCLANG_VERSION + int + default $(rustc-bindgen-libclang-version) + help + This is the version of `libclang` used by the Rust bindings generator. + config CC_CAN_LINK bool default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m64-flag)) if 64BIT diff --git a/rust/Makefile b/rust/Makefile index 115b63b7d1e3..34d0429d50fd 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -300,7 +300,7 @@ bindgen_extra_c_flags = -w --target=$(BINDGEN_TARGET) # https://github.com/llvm/llvm-project/issues/44842 # https://github.com/llvm/llvm-project/blob/llvmorg-16.0.0-rc2/clang/docs/ReleaseNotes.rst#deprecated-compiler-flags ifdef CONFIG_INIT_STACK_ALL_ZERO -libclang_maj_ver=$(shell $(BINDGEN) $(srctree)/scripts/rust_is_available_bindgen_libclang.h 2>&1 | sed -ne 's/.*clang version \([0-9]*\).*/\1/p') +libclang_maj_ver=$(shell $(srctree)/scripts/rust-bindgen-libclang-version.sh --with-non-canonical $(BINDGEN) | sed -ne '2s/\([0-9]*\).*/\1/p') ifeq ($(shell expr $(libclang_maj_ver) \< 16), 1) bindgen_extra_c_flags += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang endif diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include index 33193ca6e803..68df1fed69a1 100644 --- 
a/scripts/Kconfig.include +++ b/scripts/Kconfig.include @@ -67,6 +67,7 @@ m64-flag := $(cc-option-bit,-m64) rustc-version := $(shell,$(srctree)/scripts/rustc-version.sh $(RUSTC)) rustc-llvm-version := $(shell,$(srctree)/scripts/rustc-llvm-version.sh $(RUSTC)) +rustc-bindgen-libclang-version := $(shell,$(srctree)/scripts/rust-bindgen-libclang-version.sh $(BINDGEN) 2>/dev/null) # $(rustc-option,) # Return y if the Rust compiler supports , n otherwise diff --git a/scripts/rust_is_available_bindgen_libclang.h b/scripts/rust-bindgen-libclang-version.h similarity index 100% rename from scripts/rust_is_available_bindgen_libclang.h rename to scripts/rust-bindgen-libclang-version.h diff --git a/scripts/rust-bindgen-libclang-version.sh b/scripts/rust-bindgen-libclang-version.sh new file mode 100755 index 000000000000..45485d0f95c8 --- /dev/null +++ b/scripts/rust-bindgen-libclang-version.sh @@ -0,0 +1,94 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# +# Print the version of `libclang` used by the Rust bindings generator in a 5 or 6-digit form, +# and a non-canonical form if `--with-non-canonical` option is specified. +# Also, perform the minimum version check. + +set -e + +# If the script fails, print 0 to stdout as the version output. +trap 'if [ $? -ne 0 ]; then echo 0; fi' EXIT + +while [ $# -gt 0 ]; do + case "$1" in + --with-non-canonical) + with_non_canonical=1 + ;; + -*) + echo >&2 "Unknown option: $1" + exit 1 + ;; + *) + break + ;; + esac + shift +done + +get_bindgen_libclang_version() +{ + # Invoke `bindgen` to get the `libclang` version found by `bindgen`. This step + # may already fail if, for instance, `libclang` is not found, thus inform the + # user in such a case. + output=$( \ + LC_ALL=C "$@" $(dirname $0)/rust-bindgen-libclang-version.h 2>&1 >/dev/null + ) || code=$? + if [ -n "$code" ]; then + echo >&2 "***" + echo >&2 "*** Running '$@' to check the libclang version (used by the Rust" + echo >&2 "*** bindings generator) failed with code $code. 
This may be caused by" + echo >&2 "*** a failure to locate libclang. See output and docs below for details:" + echo >&2 "***" + echo >&2 "$output" + echo >&2 "***" + exit 1 + fi + + # Unlike other version checks, note that this one does not necessarily appear + # in the first line of the output, thus no `sed` address is provided. + version=$( \ + echo "$output" \ + | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p' + ) + if [ -z "$version" ]; then + echo >&2 "***" + echo >&2 "*** Running '$@' to check the libclang version (used by the Rust" + echo >&2 "*** bindings generator) did not return an expected output. See output" + echo >&2 "*** and docs below for details:" + echo >&2 "***" + echo >&2 "$output" + echo >&2 "***" + exit 1 + fi + echo "$version" +} + +# Convert the version string x.y.z to a canonical 5 or 6-digit form. +get_canonical_version() +{ + IFS=. + set -- $1 + echo $((10000 * $1 + 100 * $2 + $3)) +} + +min_tool_version=$(dirname $0)/min-tool-version.sh + +version=$(get_bindgen_libclang_version "$@") +min_version=$($min_tool_version llvm) +cversion=$(get_canonical_version $version) +min_cversion=$(get_canonical_version $min_version) + +if [ "$cversion" -lt "$min_cversion" ]; then + echo >&2 "***" + echo >&2 "*** libclang (used by the Rust bindings generator '$@') is too old." + echo >&2 "*** Your version: $version" + echo >&2 "*** Minimum version: $min_version" + echo >&2 "***" + exit 1 +fi + +echo "$cversion" +if [ -n "$with_non_canonical" ]; then + echo "$version" +fi diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh index d2323de0692c..ccbd5efe9498 100755 --- a/scripts/rust_is_available.sh +++ b/scripts/rust_is_available.sh @@ -179,55 +179,21 @@ fi # Check that the `libclang` used by the Rust bindings generator is suitable. # -# In order to do that, first invoke `bindgen` to get the `libclang` version -# found by `bindgen`. 
This step may already fail if, for instance, `libclang` -# is not found, thus inform the user in such a case. -bindgen_libclang_output=$( \ - LC_ALL=C "$BINDGEN" $(dirname $0)/rust_is_available_bindgen_libclang.h 2>&1 >/dev/null -) || bindgen_libclang_code=$? -if [ -n "$bindgen_libclang_code" ]; then - echo >&2 "***" - echo >&2 "*** Running '$BINDGEN' to check the libclang version (used by the Rust" - echo >&2 "*** bindings generator) failed with code $bindgen_libclang_code. This may be caused by" - echo >&2 "*** a failure to locate libclang. See output and docs below for details:" - echo >&2 "***" - echo >&2 "$bindgen_libclang_output" - echo >&2 "***" - exit 1 -fi - -# `bindgen` returned successfully, thus use the output to check that the version -# of the `libclang` found by the Rust bindings generator is suitable. -# -# Unlike other version checks, note that this one does not necessarily appear -# in the first line of the output, thus no `sed` address is provided. -bindgen_libclang_version=$( \ - echo "$bindgen_libclang_output" \ - | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p' -) -if [ -z "$bindgen_libclang_version" ]; then - echo >&2 "***" - echo >&2 "*** Running '$BINDGEN' to check the libclang version (used by the Rust" - echo >&2 "*** bindings generator) did not return an expected output. See output" - echo >&2 "*** and docs below for details:" - echo >&2 "***" - echo >&2 "$bindgen_libclang_output" - echo >&2 "***" - exit 1 -fi -bindgen_libclang_min_version=$($min_tool_version llvm) -bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version) -bindgen_libclang_min_cversion=$(get_canonical_version $bindgen_libclang_min_version) -if [ "$bindgen_libclang_cversion" -lt "$bindgen_libclang_min_cversion" ]; then - echo >&2 "***" - echo >&2 "*** libclang (used by the Rust bindings generator '$BINDGEN') is too old." 
- echo >&2 "*** Your version: $bindgen_libclang_version" - echo >&2 "*** Minimum version: $bindgen_libclang_min_version" - echo >&2 "***" +# Get the version, and the minimum version check will be performed internally. +bindgen_libclang_version_output=$( \ + $(dirname $0)/rust-bindgen-libclang-version.sh --with-non-canonical $BINDGEN +) || bindgen_libclang_version_code=$? +if [ -n "$bindgen_libclang_version_code" ]; then + # Detailed error messages have already been output in the script we just called. exit 1 fi -if [ "$bindgen_libclang_cversion" -ge 1900100 ] && +# Getting the version successfully, thus use the output to check that the +# version of the `libclang` found by the Rust bindings generator is suitable. +readarray -t bindgen_libclang_version_array <<<"$bindgen_libclang_version_output" +bindgen_libclang_version=${bindgen_libclang_version_array[1]} +bindgen_libclang_cversion=${bindgen_libclang_version_array[0]} +if [ "$bindgen_libclang_cversion" -ge 190100 ] && [ "$rust_bindings_generator_cversion" -lt 6905 ]; then # Distributions may have patched the issue (e.g. Debian did). if ! 
"$BINDGEN" $(dirname $0)/rust_is_available_bindgen_libclang_concat.h | grep -q foofoo; then diff --git a/scripts/rust_is_available_test.py b/scripts/rust_is_available_test.py index 4fcc319dea84..b7265e2691cf 100755 --- a/scripts/rust_is_available_test.py +++ b/scripts/rust_is_available_test.py @@ -72,7 +72,7 @@ else: return cls.generate_executable(f"""#!/usr/bin/env python3 import sys -if "rust_is_available_bindgen_libclang.h" in " ".join(sys.argv): +if "rust-bindgen-libclang-version.h" in " ".join(sys.argv): {libclang_case} elif "rust_is_available_bindgen_0_66.h" in " ".join(sys.argv): {version_0_66_case} @@ -113,7 +113,7 @@ else: cls.bindgen_default_bindgen_version_stdout = f"bindgen {cls.bindgen_default_version}" cls.bindgen_default_bindgen_libclang_failure_exit_code = 42 - cls.bindgen_default_bindgen_libclang_stderr = f"scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {cls.llvm_default_version} [-W#pragma-messages], err: false" + cls.bindgen_default_bindgen_libclang_stderr = f"scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {cls.llvm_default_version} [-W#pragma-messages], err: false" cls.default_rustc = cls.generate_rustc(f"rustc {cls.rustc_default_version}") cls.default_bindgen = cls.generate_bindgen(cls.bindgen_default_bindgen_version_stdout, cls.bindgen_default_bindgen_libclang_stderr) @@ -265,13 +265,13 @@ else: self.assertIn(f"bindings generator) failed with code {self.bindgen_default_bindgen_libclang_failure_exit_code}. 
This may be caused by", result.stderr) def test_bindgen_libclang_unexpected_version(self): - bindgen = self.generate_bindgen_libclang("scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version unexpected [-W#pragma-messages], err: false") + bindgen = self.generate_bindgen_libclang("scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version unexpected [-W#pragma-messages], err: false") result = self.run_script(self.Expected.FAILURE, { "BINDGEN": bindgen }) self.assertIn(f"Running '{bindgen}' to check the libclang version (used by the Rust", result.stderr) self.assertIn("bindings generator) did not return an expected output. See output", result.stderr) def test_bindgen_libclang_old_version(self): - bindgen = self.generate_bindgen_libclang("scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version 10.0.0 [-W#pragma-messages], err: false") + bindgen = self.generate_bindgen_libclang("scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version 10.0.0 [-W#pragma-messages], err: false") result = self.run_script(self.Expected.FAILURE, { "BINDGEN": bindgen }) self.assertIn(f"libclang (used by the Rust bindings generator '{bindgen}') is too old.", result.stderr) @@ -291,7 +291,7 @@ else: ): with self.subTest(bindgen_version=bindgen_version, libclang_version=libclang_version): cc = self.generate_clang(f"clang version {libclang_version}") - libclang_stderr = f"scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {libclang_version} [-W#pragma-messages], err: false" + libclang_stderr = f"scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {libclang_version} [-W#pragma-messages], err: false" bindgen = self.generate_bindgen(f"bindgen {bindgen_version}", libclang_stderr) result = self.run_script(expected_not_patched, { "BINDGEN": bindgen, "CC": cc }) if expected_not_patched == self.Expected.SUCCESS_WITH_WARNINGS: @@ -301,7 +301,7 @@ else: result = self.run_script(self.Expected.SUCCESS, { "BINDGEN": 
bindgen, "CC": cc }) def test_clang_matches_bindgen_libclang_different_bindgen(self): - bindgen = self.generate_bindgen_libclang("scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version 999.0.0 [-W#pragma-messages], err: false") + bindgen = self.generate_bindgen_libclang("scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version 999.0.0 [-W#pragma-messages], err: false") result = self.run_script(self.Expected.SUCCESS_WITH_WARNINGS, { "BINDGEN": bindgen }) self.assertIn("version does not match Clang's. This may be a problem.", result.stderr) @@ -352,16 +352,16 @@ InstalledDir: /usr/bin def test_success_bindgen_libclang(self): for stderr in ( - f"scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {self.llvm_default_version} (https://github.com/llvm/llvm-project.git 4a2c05b05ed07f1f620e94f6524a8b4b2760a0b1) [-W#pragma-messages], err: false", - f"/home/jd/Documents/dev/kernel-module-flake/linux-6.1/outputs/dev/lib/modules/6.1.0-development/source/scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {self.llvm_default_version} [-W#pragma-messages], err: false", - f"scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {self.llvm_default_version} (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false", + f"scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {self.llvm_default_version} (https://github.com/llvm/llvm-project.git 4a2c05b05ed07f1f620e94f6524a8b4b2760a0b1) [-W#pragma-messages], err: false", + f"/home/jd/Documents/dev/kernel-module-flake/linux-6.1/outputs/dev/lib/modules/6.1.0-development/source/scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {self.llvm_default_version} [-W#pragma-messages], err: false", + f"scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {self.llvm_default_version} (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false", f""" /nix/store/dsd5gz46hdbdk2rfdimqddhq6m8m8fqs-bash-5.1-p16/bin/bash: warning: 
setlocale: LC_ALL: cannot change locale (c) -scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {self.llvm_default_version} [-W#pragma-messages], err: false +scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {self.llvm_default_version} [-W#pragma-messages], err: false """, f""" /nix/store/dsd5gz46hdbdk2rfdimqddhq6m8m8fqs-bash-5.1.0-p16/bin/bash: warning: setlocale: LC_ALL: cannot change locale (c) -/home/jd/Documents/dev/kernel-module-flake/linux-6.1/outputs/dev/lib/modules/6.1.0-development/source/scripts/rust_is_available_bindgen_libclang.h:2:9: warning: clang version {self.llvm_default_version} (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false +/home/jd/Documents/dev/kernel-module-flake/linux-6.1/outputs/dev/lib/modules/6.1.0-development/source/scripts/rust-bindgen-libclang-version.h:2:9: warning: clang version {self.llvm_default_version} (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false """ ): with self.subTest(stderr=stderr): -- 2.51.0 From spriteovo at gmail.com Wed Sep 3 12:07:57 2025 From: spriteovo at gmail.com (Asuna Yang) Date: Wed, 3 Sep 2025 21:07:57 +0200 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250903190806.2604757-1-SpriteOvO@gmail.com> References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> Message-ID: <20250903190806.2604757-2-SpriteOvO@gmail.com> Commit 33549fcf37ec ("RISC-V: disallow gcc + rust builds") disabled GCC + Rust builds for RISC-V due to differences in extension handling compared to LLVM. Add a Kconfig non-visible symbol to ensure that all important RISC-V specific flags that will be used by GCC can be correctly recognized by Rust bindgen's libclang, otherwise config HAVE_RUST will not be selected. 
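The version gating behind the new symbol can be roughly sketched in shell as follows. The minimum versions are the ones that appear in the Kconfig hunks of this patch; the function `libclang_recognizes` is hypothetical, and the sketch ignores both the `!TOOLCHAIN_HAS_*` alternative and the Zicsr/Zifencei symbol, which follows different logic.

```shell
#!/bin/sh
# Sketch of the version gate only: print y if a libclang with the given
# canonical version meets the minimum listed in the patch for an extension,
# n otherwise. `libclang_recognizes` is a hypothetical helper name.
libclang_recognizes()
{
	ext=$1
	cversion=$2
	case "$ext" in
	v|zba|zbb|zbc|zbkb) min=140000 ;;
	zabha)              min=190100 ;;
	zacas)              min=200100 ;;
	*)
		echo >&2 "unknown extension: $ext"
		return 1
		;;
	esac
	if [ "$cversion" -ge "$min" ]; then
		echo y
	else
		echo n
	fi
}

libclang_recognizes zabha 190100   # prints y
libclang_recognizes zacas 190100   # prints n
```

In the patch itself each per-extension symbol is also satisfied when the toolchain does not support the extension at all, so a real check would need that extra condition.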
Signed-off-by: Asuna Yang --- Documentation/rust/arch-support.rst | 2 +- arch/riscv/Kconfig | 62 ++++++++++++++++++++++++++++- rust/Makefile | 7 +++- 3 files changed, 68 insertions(+), 3 deletions(-) diff --git a/Documentation/rust/arch-support.rst b/Documentation/rust/arch-support.rst index 6e6a515d0899..5282e0e174e8 100644 --- a/Documentation/rust/arch-support.rst +++ b/Documentation/rust/arch-support.rst @@ -18,7 +18,7 @@ Architecture Level of support Constraints ``arm`` Maintained ARMv7 Little Endian only. ``arm64`` Maintained Little Endian only. ``loongarch`` Maintained \- -``riscv`` Maintained ``riscv64`` and LLVM/Clang only. +``riscv`` Maintained ``riscv64`` only. ``um`` Maintained \- ``x86`` Maintained ``x86_64`` only. ============= ================ ============================================== diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 1c5544401530..d7f421e0f429 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -195,7 +195,7 @@ config RISCV select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RETHOOK if !XIP_KERNEL select HAVE_RSEQ - select HAVE_RUST if RUSTC_SUPPORTS_RISCV && CC_IS_CLANG + select HAVE_RUST if RUSTC_SUPPORTS_RISCV && RUST_BINDGEN_LIBCLANG_RECOGNIZES_FLAGS select HAVE_SAMPLE_FTRACE_DIRECT select HAVE_SAMPLE_FTRACE_DIRECT_MULTI select HAVE_STACKPROTECTOR @@ -236,6 +236,27 @@ config RUSTC_SUPPORTS_RISCV # -Zsanitizer=shadow-call-stack flag. 
depends on !SHADOW_CALL_STACK || RUSTC_VERSION >= 108200 +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_FLAGS + def_bool y + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_V + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZABHA + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZACAS + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBA + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBB + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBC + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBKB + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI + help + Rust bindgen currently relies on libclang as backend. When a mixed build is + performed (building C code with GCC), GCC flags will be passed to libclang. + However, not all GCC flags are recognized by Clang, so most of the + incompatible flags have been filtered out in rust/Makefile. + + For RISC-V, GCC and Clang are not at the same pace of implementing extensions. + This config ensures that all important RISC-V specific flags that will be + used by GCC can be correctly recognized by Rust bindgen's libclang, otherwise + config HAVE_RUST will not be selected. 
+ config CLANG_SUPPORTS_DYNAMIC_FTRACE def_bool CC_IS_CLANG # https://github.com/ClangBuiltLinux/linux/issues/1817 @@ -634,6 +655,11 @@ config TOOLCHAIN_HAS_V depends on LLD_VERSION >= 140000 || LD_VERSION >= 23800 depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_V + def_bool y + # https://github.com/llvm/llvm-project/commit/e6de53b4de4aecca4ac892500a0907805896ed27 + depends on !TOOLCHAIN_HAS_V || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 + config RISCV_ISA_V bool "Vector extension support" depends on TOOLCHAIN_HAS_V @@ -698,6 +724,11 @@ config TOOLCHAIN_HAS_ZABHA depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zabha) depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZABHA + def_bool y + # https://github.com/llvm/llvm-project/commit/6b7444964a8d028989beee554a1f5c61d16a1cac + depends on !TOOLCHAIN_HAS_ZABHA || RUST_BINDGEN_LIBCLANG_VERSION >= 190100 + config RISCV_ISA_ZABHA bool "Zabha extension support for atomic byte/halfword operations" depends on TOOLCHAIN_HAS_ZABHA @@ -716,6 +747,11 @@ config TOOLCHAIN_HAS_ZACAS depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zacas) depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZACAS + def_bool y + # https://github.com/llvm/llvm-project/commit/614aeda93b2225c6eb42b00ba189ba7ca2585c60 + depends on !TOOLCHAIN_HAS_ZACAS || RUST_BINDGEN_LIBCLANG_VERSION >= 200100 + config RISCV_ISA_ZACAS bool "Zacas extension support for atomic CAS" depends on TOOLCHAIN_HAS_ZACAS @@ -735,6 +771,11 @@ config TOOLCHAIN_HAS_ZBB depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBB + def_bool y + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !TOOLCHAIN_HAS_ZBB || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 + # This symbol indicates that the toolchain supports all v1.0 vector crypto # extensions, including Zvk*, Zvbb, and Zvbc. 
LLVM added all of these at once. # binutils added all except Zvkb, then added Zvkb. So we just check for Zvkb. @@ -750,6 +791,11 @@ config TOOLCHAIN_HAS_ZBA depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBA + def_bool y + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !TOOLCHAIN_HAS_ZBA || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 + config RISCV_ISA_ZBA bool "Zba extension support for bit manipulation instructions" default y @@ -785,6 +831,11 @@ config TOOLCHAIN_HAS_ZBC depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBC + def_bool y + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !TOOLCHAIN_HAS_ZBC || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 + config RISCV_ISA_ZBC bool "Zbc extension support for carry-less multiplication instructions" depends on TOOLCHAIN_HAS_ZBC @@ -808,6 +859,11 @@ config TOOLCHAIN_HAS_ZBKB depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBKB + def_bool y + # https://github.com/llvm/llvm-project/commit/7ee1c162cc53d37f717f9a138276ad64fa6863bc + depends on !TOOLCHAIN_HAS_ZBKB || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 + config RISCV_ISA_ZBKB bool "Zbkb extension support for bit manipulation instructions" depends on TOOLCHAIN_HAS_ZBKB @@ -894,6 +950,10 @@ config TOOLCHAIN_NEEDS_OLD_ISA_SPEC versions of clang and GCC to be passed to GAS, which has the same result as passing zicsr and zifencei to -march. 
+config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI + def_bool y + depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || (TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI && RUST_BINDGEN_LIBCLANG_VERSION >= 170000) + config FPU bool "FPU support" default y diff --git a/rust/Makefile b/rust/Makefile index 34d0429d50fd..7b1055c98146 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -277,20 +277,25 @@ bindgen_skip_c_flags := -mno-fp-ret-in-387 -mpreferred-stack-boundary=% \ -fno-inline-functions-called-once -fsanitize=bounds-strict \ -fstrict-flex-arrays=% -fmin-function-alignment=% \ -fzero-init-padding-bits=% -mno-fdpic \ - --param=% --param asan-% + --param=% --param asan-% -mno-riscv-attribute # Derived from `scripts/Makefile.clang`. BINDGEN_TARGET_x86 := x86_64-linux-gnu BINDGEN_TARGET_arm64 := aarch64-linux-gnu BINDGEN_TARGET_arm := arm-linux-gnueabi BINDGEN_TARGET_loongarch := loongarch64-linux-gnusf +BINDGEN_TARGET_riscv := riscv64-linux-gnu BINDGEN_TARGET_um := $(BINDGEN_TARGET_$(SUBARCH)) BINDGEN_TARGET := $(BINDGEN_TARGET_$(SRCARCH)) +ifeq ($(BINDGEN_TARGET),) +$(error add '--target=' option to rust/Makefile) +else # All warnings are inhibited since GCC builds are very experimental, # many GCC warnings are not supported by Clang, they may only appear in # some configurations, with new GCC versions, etc. 
bindgen_extra_c_flags = -w --target=$(BINDGEN_TARGET) +endif # Auto variable zero-initialization requires an additional special option with # clang that is going to be removed sometime in the future (likely in -- 2.51.0 From david at redhat.com Wed Sep 3 12:13:17 2025 From: david at redhat.com (David Hildenbrand) Date: Wed, 3 Sep 2025 21:13:17 +0200 Subject: [PATCH v3 7/7] virtio_balloon: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-8-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-8-vishal.moola@gmail.com> Message-ID: <8ac6e37b-15f0-40ee-b60c-8893156f8cbd@redhat.com> On 03.09.25 20:59, Vishal Moola (Oracle) wrote: > free_pages() should be used when we only have a virtual address. We > should call __free_pages() directly on our page instead. > > Signed-off-by: Vishal Moola (Oracle) > --- Acked-by: David Hildenbrand -- Cheers David / dhildenb From paul at crapouillou.net Wed Sep 3 12:18:09 2025 From: paul at crapouillou.net (Paul Cercueil) Date: Wed, 03 Sep 2025 21:18:09 +0200 Subject: [PATCH 022/114] clk: ingenic: cgu: convert from round_rate() to determine_rate() In-Reply-To: <20250811-clk-for-stephen-round-rate-v1-22-b3bf97b038dc@redhat.com> References: <20250811-clk-for-stephen-round-rate-v1-0-b3bf97b038dc@redhat.com> <20250811-clk-for-stephen-round-rate-v1-22-b3bf97b038dc@redhat.com> Message-ID: <9387ed0a6d4e4c77ffd0f7aee55eaa1ff6ecd22e.camel@crapouillou.net> Hi Brian, On Monday, 11 August 2025 at 11:18 -0400, Brian Masney via B4 Relay wrote: > From: Brian Masney > > The round_rate() clk ops is deprecated, so migrate this driver from > round_rate() to determine_rate() using the Coccinelle semantic patch > on the cover letter of this series.
> > Signed-off-by: Brian Masney Reviewed-by: Paul Cercueil Cheers, -Paul > --- > drivers/clk/ingenic/cgu.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/clk/ingenic/cgu.c b/drivers/clk/ingenic/cgu.c > index > 0c9c8344ad1103b13337a26e14b0d5d5c340d705..91e7ac0cc3342e3552acb9d2ec0 > 0865a5234ad4f 100644 > --- a/drivers/clk/ingenic/cgu.c > +++ b/drivers/clk/ingenic/cgu.c > @@ -174,14 +174,16 @@ ingenic_pll_calc(const struct > ingenic_cgu_clk_info *clk_info, >  n * od); > } >  > -static long > -ingenic_pll_round_rate(struct clk_hw *hw, unsigned long req_rate, > -  unsigned long *prate) > +static int ingenic_pll_determine_rate(struct clk_hw *hw, > +  struct clk_rate_request *req) > { >  struct ingenic_clk *ingenic_clk = to_ingenic_clk(hw); >  const struct ingenic_cgu_clk_info *clk_info = > to_clk_info(ingenic_clk); >  > - return ingenic_pll_calc(clk_info, req_rate, *prate, NULL, > NULL, NULL); > + req->rate = ingenic_pll_calc(clk_info, req->rate, req- > >best_parent_rate, > +  NULL, NULL, NULL); > + > + return 0; > } >  > static inline int ingenic_pll_check_stable(struct ingenic_cgu *cgu, > @@ -317,7 +319,7 @@ static int ingenic_pll_is_enabled(struct clk_hw > *hw) >  > static const struct clk_ops ingenic_pll_ops = { >  .recalc_rate = ingenic_pll_recalc_rate, > - .round_rate = ingenic_pll_round_rate, > + .determine_rate = ingenic_pll_determine_rate, >  .set_rate = ingenic_pll_set_rate, >  > 
.enable = ingenic_pll_enable, From paul at crapouillou.net Wed Sep 3 12:20:32 2025 From: paul at crapouillou.net (Paul Cercueil) Date: Wed, 03 Sep 2025 21:20:32 +0200 Subject: [PATCH 024/114] clk: ingenic: x1000-cgu: convert from round_rate() to determine_rate() In-Reply-To: <20250811-clk-for-stephen-round-rate-v1-24-b3bf97b038dc@redhat.com> References: <20250811-clk-for-stephen-round-rate-v1-0-b3bf97b038dc@redhat.com> <20250811-clk-for-stephen-round-rate-v1-24-b3bf97b038dc@redhat.com> Message-ID: On Monday, 11 August 2025 at 11:18 -0400, Brian Masney via B4 Relay wrote: > From: Brian Masney > > The round_rate() clk ops is deprecated, so migrate this driver from > round_rate() to determine_rate() using the Coccinelle semantic patch > on the cover letter of this series. > > Signed-off-by: Brian Masney Reviewed-by: Paul Cercueil Cheers, -Paul > --- > drivers/clk/ingenic/x1000-cgu.c | 19 ++++++++++--------- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/drivers/clk/ingenic/x1000-cgu.c > b/drivers/clk/ingenic/x1000-cgu.c > index > feb03eed4fe8c8f617ef98254a522d72d452ac17..d80886caf393309a0c908c06fb5 > aa8b59aced127 100644 > --- a/drivers/clk/ingenic/x1000-cgu.c > +++ b/drivers/clk/ingenic/x1000-cgu.c > @@ -84,16 +84,17 @@ static unsigned long > x1000_otg_phy_recalc_rate(struct clk_hw *hw, >  return parent_rate; > } >  > -static long x1000_otg_phy_round_rate(struct clk_hw *hw, unsigned > long req_rate, > -  unsigned long *parent_rate) > +static int x1000_otg_phy_determine_rate(struct clk_hw *hw, > + struct clk_rate_request > *req) > { > - if (req_rate < 18000000) > - return 12000000; > - > - if (req_rate < 36000000) > - return 24000000; > + if (req->rate < 18000000) > + req->rate = 12000000; > + else if (req->rate < 36000000) > + req->rate = 24000000; > + else > + req->rate = 48000000; >  > - return 48000000; > + return 0; > } >  
> static int x1000_otg_phy_set_rate(struct clk_hw *hw, unsigned long > req_rate, > @@ -161,7 +162,7 @@ static int x1000_usb_phy_is_enabled(struct clk_hw > *hw) >  > static const struct clk_ops x1000_otg_phy_ops = { >  .recalc_rate = x1000_otg_phy_recalc_rate, > - .round_rate = x1000_otg_phy_round_rate, > + .determine_rate = x1000_otg_phy_determine_rate, >  .set_rate = x1000_otg_phy_set_rate, >  >  .enable = x1000_usb_phy_enable, From alexghiti at rivosinc.com Wed Sep 3 12:54:29 2025 From: alexghiti at rivosinc.com (Alexandre Ghiti) Date: Wed, 03 Sep 2025 19:54:29 +0000 Subject: [PATCH RFC] riscv: Do not handle break traps from kernel as nmi Message-ID: <20250903-dev-alex-break_nmi_v1-v1-1-4a3d81c29598@rivosinc.com> kprobe has been broken on riscv for quite some time. There is an attempt [1] to fix that which actually works. This patch works because it enables ARCH_HAVE_NMI_SAFE_CMPXCHG and that makes the ring buffer allocation succeed when handling a kprobe because we handle *all* kprobes in nmi context. We do so because Peter advised us to treat all kernel traps as nmi [2]. But that does not seem right for kprobe handling, so instead, treat break traps from kernel as non-nmi. Link: https://lore.kernel.org/linux-riscv/20250711090443.1688404-1-pulehui@huaweicloud.com/ [1] Link: https://lore.kernel.org/linux-riscv/20250422094419.GC14170@noisy.programming.kicks-ass.net/ [2] Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry") Cc: stable at vger.kernel.org Signed-off-by: Alexandre Ghiti --- This is clearly an RFC and this is likely not the right way to go, it is just a way to trigger a discussion about if handling kprobes in an nmi context is the right way or not.
--- arch/riscv/kernel/traps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 80230de167def3c33db5bc190347ec5f87dbb6e3..90f36bb9b12d4ba0db0f084f87899156e3c7dc6f 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -315,11 +315,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs) local_irq_disable(); irqentry_exit_to_user_mode(regs); } else { - irqentry_state_t state = irqentry_nmi_enter(regs); + irqentry_state_t state = irqentry_enter(regs); handle_break(regs); - irqentry_nmi_exit(regs, state); + irqentry_exit(regs, state); } } --- base-commit: ae9a687664d965b13eeab276111b2f97dd02e090 change-id: 20250903-dev-alex-break_nmi_v1-57c5321f3e80 Best regards, -- Alexandre Ghiti From peterz at infradead.org Wed Sep 3 13:28:03 2025 From: peterz at infradead.org (Peter Zijlstra) Date: Wed, 3 Sep 2025 22:28:03 +0200 Subject: [PATCH RFC] riscv: Do not handle break traps from kernel as nmi In-Reply-To: <20250903-dev-alex-break_nmi_v1-v1-1-4a3d81c29598@rivosinc.com> References: <20250903-dev-alex-break_nmi_v1-v1-1-4a3d81c29598@rivosinc.com> Message-ID: <20250903202803.GQ4067720@noisy.programming.kicks-ass.net> On Wed, Sep 03, 2025 at 07:54:29PM +0000, Alexandre Ghiti wrote: > kprobe has been broken on riscv for quite some time. There is an attempt > [1] to fix that which actually works. This patch works because it enables > ARCH_HAVE_NMI_SAFE_CMPXCHG and that makes the ring buffer allocation > succeed when handling a kprobe because we handle *all* kprobes in nmi > context. We do so because Peter advised us to treat all kernel traps as > nmi [2]. > > But that does not seem right for kprobe handling, so instead, treat > break traps from kernel as non-nmi. You can put a kprobe inside: local_irq_disable(), no? Inside any random spinlock region in fact. How is the probe then not NMI like? 

From miguel.ojeda.sandonis at gmail.com Wed Sep 3 16:24:22 2025
From: miguel.ojeda.sandonis at gmail.com (Miguel Ojeda)
Date: Thu, 4 Sep 2025 01:24:22 +0200
Subject: [PATCH 1/2] rust: get the version of libclang used by bindgen in a separate script
In-Reply-To: <20250903190806.2604757-1-SpriteOvO@gmail.com>
References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com>
Message-ID:

On Wed, Sep 3, 2025 at 9:08 PM Asuna Yang wrote:
>
> Decouple the code for getting the version of libclang used by Rust
> bindgen from rust_is_available.sh into a separate script so that we can
> define a symbol for the version in Kconfig that will be used for
> checking in subsequent patches.

Hmm... I am not sure it is a good idea to move that into another
script. Do we really need to intertwine these two scripts? The rename
isn't great either.

Cc'ing the rust-for-linux list too.

Thanks!

Cheers,
Miguel

From miguel.ojeda.sandonis at gmail.com Wed Sep 3 16:27:03 2025
From: miguel.ojeda.sandonis at gmail.com (Miguel Ojeda)
Date: Thu, 4 Sep 2025 01:27:03 +0200
Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds
In-Reply-To: <20250903190806.2604757-2-SpriteOvO@gmail.com>
References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com>
Message-ID:

On Wed, Sep 3, 2025 at 9:08 PM Asuna Yang wrote:
>
> Commit 33549fcf37ec ("RISC-V: disallow gcc + rust builds") disabled GCC
> + Rust builds for RISC-V due to differences in extension handling
> compared to LLVM.
>
> Add a Kconfig non-visible symbol to ensure that all important RISC-V
> specific flags that will be used by GCC can be correctly recognized by
> Rust bindgen's libclang, otherwise config HAVE_RUST will not be
> selected.

I think the commit message should try to explain each of the changes here
(or to split them). e.g. it doesn't mention the other config symbols
added, nor the extra flag skipped, nor the `error` call.
Cc'ing the rust-for-linux list. Thanks! Cheers, Miguel From cyrilbur at tenstorrent.com Wed Sep 3 17:34:17 2025 From: cyrilbur at tenstorrent.com (Cyril Bur) Date: Thu, 4 Sep 2025 10:34:17 +1000 Subject: [PATCH 1/2] riscv: Fix sparse warning in __get_user_error() In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> Message-ID: These two are on me. Sorry. Thanks for fixing them Alexandre. On 4/9/2025 4:53 am, Alexandre Ghiti wrote: > We used to assign 0 to x without an appropriate cast which results in > sparse complaining when x is a pointer: > >>> block/ioctl.c:72:39: sparse: sparse: Using plain integer as NULL pointer > > So fix this by casting 0 to the correct type of x. > > Reported-by: kernel test robot > Closes: https://lore.kernel.org/oe-kbuild-all/202508062321.gHv4kvuY-lkp at intel.com/ > Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") > Cc: stable at vger.kernel.org > Signed-off-by: Alexandre Ghiti Reviewed-by: Cyril Bur > --- > arch/riscv/include/asm/uaccess.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index 22e3f52a763d1c0350e8185225e4c99aac3fc549..551e7490737effb2c238e6a4db50293ece7c9df9 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -209,7 +209,7 @@ do { \ > err = 0; \ > break; \ > __gu_failed: \ > - x = 0; \ > + x = (__typeof__(x))0; \ > err = -EFAULT; \ > } while (0) > > From ritesh.list at gmail.com Wed Sep 3 17:24:04 2025 From: ritesh.list at gmail.com (Ritesh Harjani (IBM)) Date: Thu, 04 Sep 2025 05:54:04 +0530 Subject: [PATCH v6 1/2] kasan: introduce ARCH_DEFER_KASAN and unify static key across modes In-Reply-To: <20250810125746.1105476-2-snovitoll@gmail.com> References: 
<20250810125746.1105476-1-snovitoll@gmail.com> <20250810125746.1105476-2-snovitoll@gmail.com>
Message-ID: <87ldmv6p5n.ritesh.list@gmail.com>

Sabyrzhan Tasbolatov writes:

> Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need
> to defer KASAN initialization until shadow memory is properly set up,
> and unify the static key infrastructure across all KASAN modes.
>
> [1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN.
>
> The core issue is that different architectures have inconsistent approaches
> to KASAN readiness tracking:
> - PowerPC, LoongArch, and UML arch, each implement own
>   kasan_arch_is_ready()
> - Only HW_TAGS mode had a unified static key (kasan_flag_enabled)
> - Generic and SW_TAGS modes relied on arch-specific solutions or always-on
>   behavior
>
> This patch addresses the fragmentation in KASAN initialization
> across architectures by introducing a unified approach that eliminates
> duplicate static keys and arch-specific kasan_arch_is_ready()
> implementations.
>
> Let's replace kasan_arch_is_ready() with existing kasan_enabled() check,
> which examines the static key being enabled if arch selects
> ARCH_DEFER_KASAN or has HW_TAGS mode support.
> For other arch, kasan_enabled() checks the enablement during compile time.
>
> Now KASAN users can use a single kasan_enabled() check everywhere.
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049 > Signed-off-by: Sabyrzhan Tasbolatov > --- > Changes in v6: > - Added more details in git commit message > - Fixed commenting format per coding style in UML (Christophe Leroy) > - Changed exporting to GPL for kasan_flag_enabled (Christophe Leroy) > - Converted ARCH_DEFER_KASAN to def_bool depending on KASAN to avoid > arch users to have `if KASAN` condition (Christophe Leroy) > - Forgot to add __init for kasan_init in UML > > Changes in v5: > - Unified patches where arch (powerpc, UML, loongarch) selects > ARCH_DEFER_KASAN in the first patch not to break > bisectability > - Removed kasan_arch_is_ready completely as there is no user > - Removed __wrappers in v4, left only those where it's necessary > due to different implementations > > Changes in v4: > - Fixed HW_TAGS static key functionality (was broken in v3) > - Merged configuration and implementation for atomicity > --- > arch/loongarch/Kconfig | 1 + > arch/loongarch/include/asm/kasan.h | 7 ------ > arch/loongarch/mm/kasan_init.c | 8 +++---- > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/kasan.h | 12 ---------- > arch/powerpc/mm/kasan/init_32.c | 2 +- > arch/powerpc/mm/kasan/init_book3e_64.c | 2 +- > arch/powerpc/mm/kasan/init_book3s_64.c | 6 +---- > arch/um/Kconfig | 1 + > arch/um/include/asm/kasan.h | 5 ++-- > arch/um/kernel/mem.c | 13 ++++++++--- > include/linux/kasan-enabled.h | 32 ++++++++++++++++++-------- > include/linux/kasan.h | 6 +++++ > lib/Kconfig.kasan | 12 ++++++++++ > mm/kasan/common.c | 17 ++++++++++---- > mm/kasan/generic.c | 19 +++++++++++---- > mm/kasan/hw_tags.c | 9 +------- > mm/kasan/kasan.h | 8 ++++++- > mm/kasan/shadow.c | 12 +++++----- > mm/kasan/sw_tags.c | 1 + > mm/kasan/tags.c | 2 +- > 21 files changed, 106 insertions(+), 70 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 93402a1d9c9f..4730c676b6bf 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -122,6 
+122,7 @@ config PPC > # Please keep this list sorted alphabetically. > # > select ARCH_32BIT_OFF_T if PPC32 > + select ARCH_NEEDS_DEFER_KASAN if PPC_RADIX_MMU > select ARCH_DISABLE_KASAN_INLINE if PPC_RADIX_MMU > select ARCH_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE > select ARCH_ENABLE_MEMORY_HOTPLUG > diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h > index b5bbb94c51f6..957a57c1db58 100644 > --- a/arch/powerpc/include/asm/kasan.h > +++ b/arch/powerpc/include/asm/kasan.h > @@ -53,18 +53,6 @@ > #endif > > #ifdef CONFIG_KASAN > -#ifdef CONFIG_PPC_BOOK3S_64 > -DECLARE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key); > - > -static __always_inline bool kasan_arch_is_ready(void) > -{ > - if (static_branch_likely(&powerpc_kasan_enabled_key)) > - return true; > - return false; > -} > - > -#define kasan_arch_is_ready kasan_arch_is_ready > -#endif > > void kasan_early_init(void); > void kasan_mmu_init(void); > diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c > index 03666d790a53..1d083597464f 100644 > --- a/arch/powerpc/mm/kasan/init_32.c > +++ b/arch/powerpc/mm/kasan/init_32.c > @@ -165,7 +165,7 @@ void __init kasan_init(void) > > /* At this point kasan is fully initialized. 
Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > > void __init kasan_late_init(void) > diff --git a/arch/powerpc/mm/kasan/init_book3e_64.c b/arch/powerpc/mm/kasan/init_book3e_64.c > index 60c78aac0f63..0d3a73d6d4b0 100644 > --- a/arch/powerpc/mm/kasan/init_book3e_64.c > +++ b/arch/powerpc/mm/kasan/init_book3e_64.c > @@ -127,7 +127,7 @@ void __init kasan_init(void) > > /* Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > > void __init kasan_late_init(void) { } > diff --git a/arch/powerpc/mm/kasan/init_book3s_64.c b/arch/powerpc/mm/kasan/init_book3s_64.c > index 7d959544c077..dcafa641804c 100644 > --- a/arch/powerpc/mm/kasan/init_book3s_64.c > +++ b/arch/powerpc/mm/kasan/init_book3s_64.c > @@ -19,8 +19,6 @@ > #include > #include > > -DEFINE_STATIC_KEY_FALSE(powerpc_kasan_enabled_key); > - > static void __init kasan_init_phys_region(void *start, void *end) > { > unsigned long k_start, k_end, k_cur; > @@ -92,11 +90,9 @@ void __init kasan_init(void) > */ > memset(kasan_early_shadow_page, 0, PAGE_SIZE); > > - static_branch_inc(&powerpc_kasan_enabled_key); > - > /* Enable error messages */ > init_task.kasan_depth = 0; > - pr_info("KASAN init done\n"); > + kasan_init_generic(); > } > Only book3s64 needs static keys here because of radix v/s hash mode selection during runtime. The changes in above for powerpc looks good to me. It's a nice cleanup too. So feel free to take: Reviewed-by: Ritesh Harjani (IBM) #powerpc However I have few comments below... ... > diff --git a/mm/kasan/common.c b/mm/kasan/common.c > index 9142964ab9c9..e3765931a31f 100644 > --- a/mm/kasan/common.c > +++ b/mm/kasan/common.c > @@ -32,6 +32,15 @@ > #include "kasan.h" > #include "../slab.h" > > +#if defined(CONFIG_ARCH_DEFER_KASAN) || defined(CONFIG_KASAN_HW_TAGS) > +/* > + * Definition of the unified static key declared in kasan-enabled.h. 
> + * This provides consistent runtime enable/disable across KASAN modes.
> + */
> +DEFINE_STATIC_KEY_FALSE(kasan_flag_enabled);
> +EXPORT_SYMBOL_GPL(kasan_flag_enabled);
> +#endif
> +
>  struct slab *kasan_addr_to_slab(const void *addr)
>  {
>  	if (virt_addr_valid(addr))
> @@ -246,7 +255,7 @@ static inline void poison_slab_object(struct kmem_cache *cache, void *object,
>  bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object,
>  				unsigned long ip)
>  {
> -	if (!kasan_arch_is_ready() || is_kfence_address(object))
> +	if (is_kfence_address(object))

For changes in mm/kasan/common.c.. you have removed !kasan_enabled()
check at few places. This seems to be partial revert of commit [1]:

  b3c34245756ada "kasan: catch invalid free before SLUB reinitializes the object"

Can you please explain why this needs to be removed?
Also the explanation of the same should be added in the commit msg too.

[1]: https://lore.kernel.org/all/20240809-kasan-tsbrcu-v8-1-aef4593f9532 at google.com/

>  		return false;
>  	return check_slab_allocation(cache, object, ip);
>  }
> @@ -254,7 +263,7 @@ bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object,
>  bool __kasan_slab_free(struct kmem_cache *cache, void *object, bool init,
>  		       bool still_accessible)
>  {
> -	if (!kasan_arch_is_ready() || is_kfence_address(object))
> +	if (is_kfence_address(object))
>  		return false;
>
>  	/*
> @@ -293,7 +302,7 @@ bool __kasan_slab_free(struct kmem_cache *cache, void *object, bool init,
>
>  static inline bool check_page_allocation(void *ptr, unsigned long ip)
>  {
> -	if (!kasan_arch_is_ready())
> +	if (!kasan_enabled())
>  		return false;
>
>  	if (ptr != page_address(virt_to_head_page(ptr))) {
> @@ -522,7 +531,7 @@ bool __kasan_mempool_poison_object(void *ptr, unsigned long ip)
>  		return true;
>  	}
>
> -	if (is_kfence_address(ptr) || !kasan_arch_is_ready())
> +	if (is_kfence_address(ptr))
>  		return true;
>
>  	slab = folio_slab(folio);

-ritesh

From cyrilbur at tenstorrent.com Wed Sep 3 17:42:49 2025
From: cyrilbur
at tenstorrent.com (Cyril Bur) Date: Thu, 4 Sep 2025 10:42:49 +1000 Subject: [PATCH 2/2] riscv: Fix sparse warning about different address spaces In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> Message-ID: <2fb511b8-4841-40e1-a364-e52dad51e300@tenstorrent.com> On 4/9/2025 4:53 am, Alexandre Ghiti wrote: > We did not propagate the __user attribute of the pointers in > __get_kernel_nofault() and __put_kernel_nofault(), which results in > sparse complaining: > >>> mm/maccess.c:41:17: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const [noderef] __user *from @@ got unsigned long long [usertype] * @@ > mm/maccess.c:41:17: sparse: expected void const [noderef] __user *from > mm/maccess.c:41:17: sparse: got unsigned long long [usertype] * > > So fix this by correctly casting those pointers. 
> > Reported-by: kernel test robot > Closes: https://lore.kernel.org/oe-kbuild-all/202508161713.RWu30Lv1-lkp at intel.com/ > Suggested-by: Al Viro > Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") > Cc: stable at vger.kernel.org > Signed-off-by: Alexandre Ghiti Reviewed-by: Cyril Bur > --- > arch/riscv/include/asm/uaccess.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index 551e7490737effb2c238e6a4db50293ece7c9df9..f5f4f7f85543f2a635b18e4bd1c6202b20e3b239 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -438,10 +438,10 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) > } > > #define __get_kernel_nofault(dst, src, type, err_label) \ > - __get_user_nocheck(*((type *)(dst)), (type *)(src), err_label) > + __get_user_nocheck(*((type *)(dst)), (__force __user type *)(src), err_label) > > #define __put_kernel_nofault(dst, src, type, err_label) \ > - __put_user_nocheck(*((type *)(src)), (type *)(dst), err_label) > + __put_user_nocheck(*((type *)(src)), (__force __user type *)(dst), err_label) > > static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len) > { > From dlan at gentoo.org Wed Sep 3 17:48:45 2025 From: dlan at gentoo.org (Yixun Lan) Date: Thu, 4 Sep 2025 08:48:45 +0800 Subject: (subset) [PATCH v5 0/8] dmaengine: mmp_pdma: Add SpacemiT K1 SoC support with 64-bit addressing In-Reply-To: <20250822-working_dma_0701_v2-v5-0-f5c0eda734cc@riscstar.com> References: <20250822-working_dma_0701_v2-v5-0-f5c0eda734cc@riscstar.com> Message-ID: <175681694608.479569.7465779228756094615.b4-ty@gentoo.org> On Fri, 22 Aug 2025 11:06:26 +0800, Guodong Xu wrote: > This patchset adds support for SpacemiT K1 PDMA controller to the existing > mmp_pdma driver. 
The K1 PDMA controller is compatible with Marvell MMP PDMA > but extends it with 64-bit addressing capabilities through LPAE (Long > Physical Address Extension) bit and higher 32-bit address registers (DDADRH, > DSADRH and DTADRH). > > In v5, two smatch warnings reported by kernel test bot and Dan Carpenter were > fixed. > > [...] Applied, thanks! [6/8] riscv: dts: spacemit: Add PDMA node for K1 SoC https://github.com/spacemit-com/linux/commit/81d79ad0ddcaeaf6136abe870b2386bde31b7ed4 [7/8] riscv: dts: spacemit: Enable PDMA on Banana Pi F3 and Milkv Jupiter https://github.com/spacemit-com/linux/commit/0e28eab0ca51282e3d14f3e2dba9fc92e3fddbe6 Best regards, -- Yixun Lan From dlan at gentoo.org Wed Sep 3 17:52:24 2025 From: dlan at gentoo.org (Yixun Lan) Date: Thu, 4 Sep 2025 08:52:24 +0800 Subject: [PATCH v2] riscv: dts: spacemit: uart: remove sec_uart1 device node In-Reply-To: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org> References: <20250902-02-k1-uart-clock-v2-1-f146918d44f6@gentoo.org> Message-ID: <175694709379.32917.16100695238119707451.b4-ty@gentoo.org> On Tue, 02 Sep 2025 20:26:58 +0800, Yixun Lan wrote: > sec_uart1 is not available from Linux, and no clock is implemented in > CCF framework, thus 'make dtbs_check' will pop up this warning message: > > serial at f0612000: 'clock-names' is a required property > > Removing the node from device tree to silence the DT check warning. > > [...] Applied, thanks! 

[1/1] riscv: dts: spacemit: uart: remove sec_uart1 device node
      https://github.com/spacemit-com/linux/commit/0f084b221e2c5ba16eca85b3d2497f9486bd0329

Best regards,
--
Yixun Lan

From hengqi.chen at gmail.com Wed Sep 3 18:45:01 2025
From: hengqi.chen at gmail.com (Hengqi Chen)
Date: Thu, 4 Sep 2025 09:45:01 +0800
Subject: [PATCH] riscv, bpf: Sign extend struct ops return values properly
In-Reply-To:
References: <20250827120344.6796-1-hengqi.chen@gmail.com> <1be38ff5-ea37-4d5d-9f33-16799d2fe2c5@huawei.com>
Message-ID:

On Mon, Sep 1, 2025 at 9:23 PM Pu Lehui wrote:
>
>
>
> On 2025/9/1 17:14, Hengqi Chen wrote:
> > On Mon, Sep 1, 2025 at 4:06 PM Pu Lehui wrote:
> >>
> >>
> >>
> >> On 2025/8/28 9:53, Pu Lehui wrote:
> >>>
> >>> On 2025/8/27 20:03, Hengqi Chen wrote:
> >>>> The ns_bpf_qdisc selftest triggers a kernel panic:
> >>>>
> >>>> Unable to handle kernel paging request at virtual address
> >>>> ffffffffa38dbf58
> >>>> Current test_progs pgtable: 4K pagesize, 57-bit VAs,
> >>>> pgdp=0x00000001109cc000
> >>>> [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401,
> >>>> pud=000000011fffd001, pmd=0000000000000000
> >>>> Oops [#1]
> >>>> Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1
> >>>> dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs
> >>>> blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last
> >>>> unloaded: bpf_testmod(OE)]
> >>>> CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W
> >>>> OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE
> >>>> Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> >>>> Hardware name: Unknown Unknown Product/Unknown Product, BIOS
> >>>> 2024.01+dfsg-1ubuntu5.1 01/01/2024
> >>>> epc : __qdisc_run+0x82/0x6f0
> >>>> ra : __qdisc_run+0x6e/0x6f0
> >>>> epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550
> >>>> gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180
> >>>> t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0
> >>>> s1 :
ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > >>>> a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 > >>>> a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > >>>> s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > >>>> s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 > >>>> s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 > >>>> s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > >>>> t5 : 0000000000000000 t6 : ff60000093a6a8b6 > >>>> status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: > >>>> 000000000000000d > >>>> [] __qdisc_run+0x82/0x6f0 > >>>> [] __dev_queue_xmit+0x4c0/0x1128 > >>>> [] neigh_resolve_output+0xd0/0x170 > >>>> [] ip6_finish_output2+0x226/0x6c8 > >>>> [] ip6_finish_output+0x10c/0x2a0 > >>>> [] ip6_output+0x5e/0x178 > >>>> [] ip6_xmit+0x29a/0x608 > >>>> [] inet6_csk_xmit+0xe6/0x140 > >>>> [] __tcp_transmit_skb+0x45c/0xaa8 > >>>> [] tcp_connect+0x9ce/0xd10 > >>>> [] tcp_v6_connect+0x4ac/0x5e8 > >>>> [] __inet_stream_connect+0xd8/0x318 > >>>> [] inet_stream_connect+0x3e/0x68 > >>>> [] __sys_connect_file+0x50/0x88 > >>>> [] __sys_connect+0x96/0xc8 > >>>> [] __riscv_sys_connect+0x20/0x30 > >>>> [] do_trap_ecall_u+0x256/0x378 > >>>> [] handle_exception+0x14a/0x156 > >>>> Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > >>>> ---[ end trace 0000000000000000 ]--- > >>>> > >>>> The bpf_fifo_dequeue prog returns a skb which is a pointer. > >>>> The pointer is treated as a 32bit value and sign extend to > >>>> 64bit in epilogue. This behavior is right for most bpf prog > >>>> types but wrong for struct ops which requires RISC-V ABI. > >>> > >>> Hi Hengqi, > >>> > >>> Nice catch! > >>> > >>> Actually, I think commit 7112cd26e606c7ba51f9cc5c1905f06039f6f379 looks > >>> a little bit wired and related to this issue. I guess I need some time > >>> to recall this commit. 
> >> > >> Hi Hengqi, > >> > >> Sorry for late due to busy work. After some backtracking, I dismissed my > >> doubts about commit 7112cd26e606. > >> > >>> > >>> Thanks. > >>> > >>>> > >>>> So let's sign extend struct ops return values according to > >>>> the return value spec in function model. > >>>> > >>>> Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized > >>>> riscv ftrace framework") > >>>> Signed-off-by: Hengqi Chen > >>>> --- > >>>> arch/riscv/net/bpf_jit_comp64.c | 33 +++++++++++++++++++++++++++++++++ > >>>> 1 file changed, 33 insertions(+) > >>>> > >>>> diff --git a/arch/riscv/net/bpf_jit_comp64.c > >>>> b/arch/riscv/net/bpf_jit_comp64.c > >>>> index 549c3063c7f1..11ca56320a3f 100644 > >>>> --- a/arch/riscv/net/bpf_jit_comp64.c > >>>> +++ b/arch/riscv/net/bpf_jit_comp64.c > >>>> @@ -954,6 +954,33 @@ static int invoke_bpf_prog(struct bpf_tramp_link > >>>> *l, int args_off, int retval_of > >>>> return ret; > >>>> } > >>>> +/* > >>>> + * Sign-extend the register if necessary > >>>> + */ >>>> +static int sign_extend(struct rv_jit_context *ctx, int r, u8 size) > > put `ctx` as last param would be more aligned with other function. > > >>>> +{ > >>>> + switch (size) { > >>>> + case 1: > >>>> + emit_slli(r, r, 56, ctx); > >>>> + emit_srai(r, r, 56, ctx); >>>> + break; > >>>> + case 2: > >>>> + emit_slli(r, r, 48, ctx); > >>>> + emit_srai(r, r, 48, ctx) >>>> + break; > >>>> + case 4: > >>>> + emit_addiw(r, r, 0, ctx); > > pls use emit_sextb/h/w() helper > > >>>> + break; > >>>> + case 8: > >>>> + break; > >>>> + default: > >>>> + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); > >>>> + return -EINVAL; > >>>> + } > >>>> + > >>>> + return 0; > >>>> +} > >> > >> We don't need to sign-ext when return value is 1 or 2 bytes. As for 4 > > > > Could you please elaborate more on this ? > > Indeed, you pointed out my misunderstanding. 
According to riscv calling > convention [0], for signed char and short, we need to do sign extension, > but no need to do the same for unsigned. So for 1 or 2 bytes, we only > need to do that for the signed. > > Link: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf [0] > Thanks, will do. > > IIUC, addiw on 1 byte / 2 byte values is equivalent to zext them. > > > >> bytes, we have already do that in __build_epilogue. So we only need to > >> take care of 8 bytes return value. And the real fix would be: > >> > >> diff --git a/arch/riscv/net/bpf_jit_comp64.c > >> b/arch/riscv/net/bpf_jit_comp64.c > >> index 2f7188e0340a..08cc641f8b7c 100644 > >> --- a/arch/riscv/net/bpf_jit_comp64.c > >> +++ b/arch/riscv/net/bpf_jit_comp64.c > >> @@ -1177,6 +1177,9 @@ static int __arch_prepare_bpf_trampoline(struct > >> bpf_tramp_image *im, > >> if (save_ret) { > >> emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > >> emit_ld(regmap[BPF_REG_0], -(retval_off - 8), > >> RV_REG_FP, ctx); > >> + /* Do not truncate return value when it's 8 bytes */ > >> + if (is_struct_ops && m->ret_size == 8) > >> + emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx); > >> } > >> > >> emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); > >> > >>>> + > >>>> static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > >>>> const struct btf_func_model *m, > >>>> struct bpf_tramp_links *tlinks, > >>>> @@ -1177,6 +1204,12 @@ static int __arch_prepare_bpf_trampoline(struct > >>>> bpf_tramp_image *im, > >>>> if (save_ret) { > >>>> emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > >>>> emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > >>>> + if (is_struct_ops) { > >>>> + emit_mv(RV_REG_A0, regmap[BPF_REG_0], ctx); > > This could be omit by combining with the sign_extend insn like > `sextb(rd, rs, ctx)`. 
>
> >>>> +			ret = sign_extend(ctx, RV_REG_A0, m->ret_size);
> >>>> +			if (ret)
> >>>> +				goto out;
> >>>> +		}
> >>>>  	}
> >>>>  	emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);
>

From unicornxw at gmail.com Wed Sep 3 20:00:06 2025
From: unicornxw at gmail.com (Chen Wang)
Date: Thu, 4 Sep 2025 11:00:06 +0800
Subject: [PATCH v3 0/3] irqchip/sg2042-msi: Set irq type according to DT configuration
Message-ID:

From: Chen Wang

Read the device tree configuration and then use it to set the interrupt type.

This patchset is based on irq/drivers branch of tip.

---

Changes in v3:
  There is no major change in this version. Just adjust the order of the
  patches to change the DTs first. Thanks to Thomas for the suggestion.

Changes in v2:
  The patch series is based on irq/drivers branch of tip.
  You can simply review or test the patches at the link [2].

  Reverted the change to obtain params of "msi-ranges"; it's better not to
  assume the value of "#interrupt-cells" is 2, even though it's known to be
  the case. Thanks to Inochi for the comments.

Changes in v1:
  The patch series is based on irq/drivers branch of tip.
  You can simply review or test the patches at the link [1].
Link: https://lore.kernel.org/linux-riscv/cover.1756103516.git.unicorn_wang at outlook.com/ [1] Link: https://lore.kernel.org/linux-riscv/cover.1756169460.git.unicorn_wang at outlook.com/ [2] --- Chen Wang (3): riscv: sophgo: dts: sg2042: change msi irq type to IRQ_TYPE_EDGE_RISING riscv: sophgo: dts: sg2044: change msi irq type to IRQ_TYPE_EDGE_RISING irqchip/sg2042-msi: Set irq type according to DT configuration arch/riscv/boot/dts/sophgo/sg2042.dtsi | 2 +- arch/riscv/boot/dts/sophgo/sg2044.dtsi | 2 +- drivers/irqchip/irq-sg2042-msi.c | 7 +++++-- 3 files changed, 7 insertions(+), 4 deletions(-) base-commit: d36bf356068cdb5499b9bc458db9149c0fd938a2 -- 2.34.1 From unicornxw at gmail.com Wed Sep 3 20:00:37 2025 From: unicornxw at gmail.com (Chen Wang) Date: Thu, 4 Sep 2025 11:00:37 +0800 Subject: [PATCH v3 1/3] riscv: sophgo: dts: sg2042: change msi irq type to IRQ_TYPE_EDGE_RISING In-Reply-To: References: Message-ID: <831c1b650c575380d56ef3e2faed9bee278c9006.1756953919.git.unicorn_wang@outlook.com> From: Chen Wang Fixed msi irq type to be the correct type, although this field is not used. 
Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/boot/dts/sophgo/sg2042.dtsi b/arch/riscv/boot/dts/sophgo/sg2042.dtsi index b3e4d3c18fdc..6430c6e25c00 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2042.dtsi @@ -190,7 +190,7 @@ msi: msi-controller at 7030010304 { reg-names = "clr", "doorbell"; msi-controller; #msi-cells = <0>; - msi-ranges = <&intc 64 IRQ_TYPE_LEVEL_HIGH 32>; + msi-ranges = <&intc 64 IRQ_TYPE_EDGE_RISING 32>; }; rpgate: clock-controller at 7030010368 { -- 2.34.1 From unicornxw at gmail.com Wed Sep 3 20:00:59 2025 From: unicornxw at gmail.com (Chen Wang) Date: Thu, 4 Sep 2025 11:00:59 +0800 Subject: [PATCH v3 2/3] riscv: sophgo: dts: sg2044: change msi irq type to IRQ_TYPE_EDGE_RISING In-Reply-To: References: Message-ID: From: Chen Wang Fixed msi irq type to be the correct type, although this field is not used. Tested-by: Inochi Amaoto # Sophgo SRD3-10 Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2044.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/boot/dts/sophgo/sg2044.dtsi b/arch/riscv/boot/dts/sophgo/sg2044.dtsi index 6ec955744b0c..320c4d1d08e6 100644 --- a/arch/riscv/boot/dts/sophgo/sg2044.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2044.dtsi @@ -214,7 +214,7 @@ msi: msi-controller at 6d50000000 { reg-names = "clr", "doorbell"; #msi-cells = <0>; msi-controller; - msi-ranges = <&intc 352 IRQ_TYPE_LEVEL_HIGH 512>; + msi-ranges = <&intc 352 IRQ_TYPE_EDGE_RISING 512>; status = "disabled"; }; -- 2.34.1 From unicornxw at gmail.com Wed Sep 3 20:01:19 2025 From: unicornxw at gmail.com (Chen Wang) Date: Thu, 4 Sep 2025 11:01:19 +0800 Subject: [PATCH v3 3/3] irqchip/sg2042-msi: Set irq type according to DT configuration In-Reply-To: References: Message-ID: From: Chen Wang Read the device tree configuration and use it to set the interrupt type. 
Tested-by: Inochi Amaoto # Sophgo SRD3-10 Signed-off-by: Chen Wang --- drivers/irqchip/irq-sg2042-msi.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-sg2042-msi.c b/drivers/irqchip/irq-sg2042-msi.c index 3b13dbbfdb51..f7cf0dc72eab 100644 --- a/drivers/irqchip/irq-sg2042-msi.c +++ b/drivers/irqchip/irq-sg2042-msi.c @@ -30,6 +30,7 @@ struct sg204x_msi_chip_info { * @doorbell_addr: see TRM, 10.1.32, GP_INTR0_SET * @irq_first: First vectors number that MSIs starts * @num_irqs: Number of vectors for MSIs + * @irq_type: IRQ type for MSIs * @msi_map: mapping for allocated MSI vectors. * @msi_map_lock: Lock for msi_map * @chip_info: chip specific infomations @@ -41,6 +42,7 @@ struct sg204x_msi_chipdata { u32 irq_first; u32 num_irqs; + unsigned int irq_type; unsigned long *msi_map; struct mutex msi_map_lock; @@ -137,14 +139,14 @@ static int sg204x_msi_parent_domain_alloc(struct irq_domain *domain, unsigned in fwspec.fwnode = domain->parent->fwnode; fwspec.param_count = 2; fwspec.param[0] = data->irq_first + hwirq; - fwspec.param[1] = IRQ_TYPE_EDGE_RISING; + fwspec.param[1] = data->irq_type; ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &fwspec); if (ret) return ret; d = irq_domain_get_irq_data(domain->parent, virq); - return d->chip->irq_set_type(d, IRQ_TYPE_EDGE_RISING); + return d->chip->irq_set_type(d, data->irq_type); } static int sg204x_msi_middle_domain_alloc(struct irq_domain *domain, unsigned int virq, @@ -298,6 +300,7 @@ static int sg2042_msi_probe(struct platform_device *pdev) } data->irq_first = (u32)args.args[0]; + data->irq_type = (unsigned int)args.args[1]; data->num_irqs = (u32)args.args[args.nargs - 1]; mutex_init(&data->msi_map_lock); -- 2.34.1 From guoren at kernel.org Wed Sep 3 20:08:58 2025 From: guoren at kernel.org (Guo Ren) Date: Thu, 4 Sep 2025 11:08:58 +0800 Subject: [PATCH V4 RESEND 0/3] Fixup & optimize hgatp mode & vmid detect functions In-Reply-To: <20250821142542.2472079-1-guoren@kernel.org> 
References: <20250821142542.2472079-1-guoren@kernel.org>
Message-ID:

Hi Anup,

Ping..., hope for feedback.

On Thu, Aug 21, 2025 at 10:26 PM wrote:
>
> From: "Guo Ren (Alibaba DAMO Academy)"
>
> Here are several fixups & optimizations for hgatp detect according
> to the RISC-V Privileged Architecture Spec.
>
> ---
> Changes in v4:
>  - Involve ("RISC-V: KVM: Prevent HGATP_MODE_BARE passed"), which
>    explain why gstage_mode_detect needs reset HGATP to zero.
>  - RESEND for wrong mailing thread.
>
> Changes in v3:
>  - Add "Fixes" tag.
>  - Involve("RISC-V: KVM: Remove unnecessary HGATP csr_read"), which
>    depends on patch 1.
>
> Changes in v2:
>  - Fixed build error since kvm_riscv_gstage_mode() has been modified.
> ---
>
> Fangyu Yu (1):
>   RISC-V: KVM: Write hgatp register with valid mode bits
>
> Guo Ren (Alibaba DAMO Academy) (2):
>   RISC-V: KVM: Remove unnecessary HGATP csr_read
>   RISC-V: KVM: Prevent HGATP_MODE_BARE passed
>
>  arch/riscv/kvm/gstage.c | 27 ++++++++++++++++++++++++---
>  arch/riscv/kvm/main.c   | 35 +++++++++++++++++------------------
>  arch/riscv/kvm/vmid.c   |  8 +++-----
>  3 files changed, 44 insertions(+), 26 deletions(-)
>
> --
> 2.40.1
>

--
Best Regards
 Guo Ren

From kees at kernel.org Wed Sep 3 20:52:02 2025
From: kees at kernel.org (Kees Cook)
Date: Wed, 3 Sep 2025 20:52:02 -0700
Subject: [RFC RESEND v3] binfmt_elf: preserve original ELF e_flags for core dumps
In-Reply-To: <20250901135350.619485-1-svetlana.parfenova@syntacore.com>
References: <20250806161814.607668-1-svetlana.parfenova@syntacore.com> <20250901135350.619485-1-svetlana.parfenova@syntacore.com>
Message-ID: <202509032051.BF7FC654F@keescook>

On Mon, Sep 01, 2025 at 08:53:50PM +0700, Svetlana Parfenova wrote:
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index a4b233a0659e..1bef00208bdd 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -224,6 +224,7 @@ config RISCV
>  	select VDSO_GETRANDOM if HAVE_GENERIC_VDSO
>  	select USER_STACKTRACE_SUPPORT
>  	select ZONE_DMA32 if
64BIT > + select ARCH_HAS_ELF_CORE_EFLAGS > > config RUSTC_SUPPORTS_RISCV I'll take this patch and alphabetize the above select into the right place. Everything else looks great. Thank you! -Kees -- Kees Cook From kees at kernel.org Wed Sep 3 20:52:40 2025 From: kees at kernel.org (Kees Cook) Date: Wed, 3 Sep 2025 20:52:40 -0700 Subject: [RFC RESEND v3] binfmt_elf: preserve original ELF e_flags for core dumps In-Reply-To: <20250901135350.619485-1-svetlana.parfenova@syntacore.com> References: <20250806161814.607668-1-svetlana.parfenova@syntacore.com> <20250901135350.619485-1-svetlana.parfenova@syntacore.com> Message-ID: <175695795639.3712216.15743949549231818751.b4-ty@kernel.org> On Mon, 01 Sep 2025 20:53:50 +0700, Svetlana Parfenova wrote: > Some architectures, such as RISC-V, use the ELF e_flags field to encode > ABI-specific information (e.g., ISA extensions, fpu support). Debuggers > like GDB rely on these flags in core dumps to correctly interpret > optional register sets. If the flags are missing or incorrect, GDB may > warn and ignore valid data, for example: > > warning: Unexpected size of section '.reg2/213' in core file. > > [...] Applied to for-next/execve, thanks! 
[1/1] binfmt_elf: preserve original ELF e_flags for core dumps https://git.kernel.org/kees/c/8c94db0ae97c Take care, -- Kees Cook From apatel at ventanamicro.com Wed Sep 3 22:31:02 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Thu, 4 Sep 2025 11:01:02 +0530 Subject: [PATCH 4/8] riscv: Introduce support for hardware break/watchpoints In-Reply-To: <20250822174715.1269138-5-jesse@rivosinc.com> References: <20250822174715.1269138-1-jesse@rivosinc.com> <20250822174715.1269138-5-jesse@rivosinc.com> Message-ID: On Fri, Aug 22, 2025 at 11:17 PM Jesse Taube wrote: > > From: Himanshu Chauhan > > RISC-V hardware breakpoint framework is built on top of perf subsystem and uses > SBI debug trigger extension to install/uninstall/update/enable/disable hardware > triggers as specified in Sdtrig ISA extension. > > Signed-off-by: Himanshu Chauhan > Signed-off-by: Jesse Taube > --- > RFC -> V1: > - Add dbtr_mode to rv_init_mcontrol(6)_trigger > - Add select HAVE_MIXED_BREAKPOINTS_REGS > - Add TDATA1_MCTRL_SZ and TDATA1_MCTRL6_SZ > - Capitalize F in Fallback comment > - Fix in_callback code to allow multiple breakpoints > - Move perf_bp_event above setup_singlestep to save the correct state > - Use sbi_err_map_linux_errno for arch_smp_teardown/setup_sbi_shmem > V1 -> V2: > - No change > --- > arch/riscv/Kconfig | 2 + > arch/riscv/include/asm/hw_breakpoint.h | 59 +++ > arch/riscv/include/asm/kdebug.h | 3 +- > arch/riscv/include/asm/sbi.h | 4 +- > arch/riscv/kernel/Makefile | 1 + > arch/riscv/kernel/hw_breakpoint.c | 614 +++++++++++++++++++++++++ > arch/riscv/kernel/traps.c | 6 + > 7 files changed, 687 insertions(+), 2 deletions(-) > create mode 100644 arch/riscv/include/asm/hw_breakpoint.h > create mode 100644 arch/riscv/kernel/hw_breakpoint.c > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index bbec87b79309..fd8b62cdc6f5 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -163,6 +163,7 @@ config RISCV > select HAVE_FUNCTION_ERROR_INJECTION > select 
HAVE_GCC_PLUGINS > select HAVE_GENERIC_VDSO if MMU && 64BIT > + select HAVE_HW_BREAKPOINT if PERF_EVENTS && RISCV_SBI > select HAVE_IRQ_TIME_ACCOUNTING > select HAVE_KERNEL_BZIP2 if !XIP_KERNEL && !EFI_ZBOOT > select HAVE_KERNEL_GZIP if !XIP_KERNEL && !EFI_ZBOOT > @@ -176,6 +177,7 @@ config RISCV > select HAVE_KRETPROBES if !XIP_KERNEL > # https://github.com/ClangBuiltLinux/linux/issues/1881 > select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD > + select HAVE_MIXED_BREAKPOINTS_REGS > select HAVE_MOVE_PMD > select HAVE_MOVE_PUD > select HAVE_PAGE_SIZE_4KB > diff --git a/arch/riscv/include/asm/hw_breakpoint.h b/arch/riscv/include/asm/hw_breakpoint.h > new file mode 100644 > index 000000000000..cde6688b91d2 > --- /dev/null > +++ b/arch/riscv/include/asm/hw_breakpoint.h > @@ -0,0 +1,59 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright (C) 2024 Ventana Micro Systems Inc. > + */ > + > +#ifndef __RISCV_HW_BREAKPOINT_H > +#define __RISCV_HW_BREAKPOINT_H > + > +struct task_struct; > + > +#ifdef CONFIG_HAVE_HW_BREAKPOINT > + > +#include > + > +#if __riscv_xlen == 64 > +#define cpu_to_le cpu_to_le64 > +#define le_to_cpu le64_to_cpu > +#elif __riscv_xlen == 32 > +#define cpu_to_le cpu_to_le32 > +#define le_to_cpu le32_to_cpu > +#else > +#error "Unexpected __riscv_xlen" > +#endif > + > +struct arch_hw_breakpoint { > + unsigned long address; > + unsigned long len; > + > + /* Callback info */ > + unsigned long next_addr; > + bool in_callback; > + > + /* Trigger configuration data */ > + unsigned long tdata1; > + unsigned long tdata2; > + unsigned long tdata3; > +}; > + > +/* Maximum number of hardware breakpoints supported */ > +#define RV_MAX_TRIGGERS 32 > + > +struct perf_event_attr; > +struct notifier_block; > +struct perf_event; > +struct pt_regs; > + > +int hw_breakpoint_slots(int type); > +int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw); > +int hw_breakpoint_arch_parse(struct perf_event *bp, > + const struct perf_event_attr *attr, 
> + struct arch_hw_breakpoint *hw); > +int hw_breakpoint_exceptions_notify(struct notifier_block *unused, > + unsigned long val, void *data); > +int arch_install_hw_breakpoint(struct perf_event *bp); > +void arch_uninstall_hw_breakpoint(struct perf_event *bp); > +void hw_breakpoint_pmu_read(struct perf_event *bp); > + > +#endif /* CONFIG_HAVE_HW_BREAKPOINT */ > +#endif /* __RISCV_HW_BREAKPOINT_H */ > diff --git a/arch/riscv/include/asm/kdebug.h b/arch/riscv/include/asm/kdebug.h > index 85ac00411f6e..53e989781aa1 100644 > --- a/arch/riscv/include/asm/kdebug.h > +++ b/arch/riscv/include/asm/kdebug.h > @@ -6,7 +6,8 @@ > enum die_val { > DIE_UNUSED, > DIE_TRAP, > - DIE_OOPS > + DIE_OOPS, > + DIE_DEBUG > }; > > #endif > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index be2ca8e8a49e..64fa7a82aa45 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -282,7 +282,9 @@ struct sbi_sta_struct { > u8 pad[47]; > } __packed; > > -#define SBI_SHMEM_DISABLE -1 > +#define SBI_SHMEM_DISABLE (-1UL) > +#define SBI_SHMEM_LO(pa) ((unsigned long)lower_32_bits(pa)) > +#define SBI_SHMEM_HI(pa) ((unsigned long)upper_32_bits(pa)) These definitions of SBI_SHMEM_LO() and SBI_SHMEM_HI() are broken for RV64 platforms where a good amount of RAM is beyond first 4GB. 
This should be: #ifdef CONFIG_32BIT #define SBI_SHMEM_LO(pa) ((unsigned long)lower_32_bits(pa)) #define SBI_SHMEM_HI(pa) ((unsigned long)upper_32_bits(pa)) #else #define SBI_SHMEM_LO(pa) ((unsigned long)pa) #define SBI_SHMEM_HI(pa) 0UL #endif Regards, Anup From atishp at rivosinc.com Wed Sep 3 23:24:24 2025 From: atishp at rivosinc.com (Atish Kumar Patra) Date: Wed, 3 Sep 2025 23:24:24 -0700 Subject: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memselot In-Reply-To: References: <20250829-pmu_event_info-v5-0-9dca26139a33@rivosinc.com> <20250829-pmu_event_info-v5-6-9dca26139a33@rivosinc.com> Message-ID: On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson wrote: > > On Fri, Aug 29, 2025, Atish Patra wrote: > > The arch specific code may need to know if a particular gpa is valid and > > writable for the shared memory between the host and the guest. Currently, > > there are few places where it is used in RISC-V implementation. Given the > > nature of the function it may be used for other architectures. > > Hence, a common helper function is added. > > > > Suggested-by: Sean Christopherson > > Signed-off-by: Atish Patra > > --- > > include/linux/kvm_host.h | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > > index 15656b7fba6c..eec5cbbcb4b3 100644 > > --- a/include/linux/kvm_host.h > > +++ b/include/linux/kvm_host.h > > @@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa) > > return !kvm_is_error_hva(hva); > > } > > > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa) > > +{ > > + bool writable; > > + unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable); > > + > > + return !kvm_is_error_hva(hva) && writable; > > I don't hate this API, but I don't love it either. Because knowing that the > _memslot_ is writable doesn't mean all that much. E.g. 
in this usage: > > hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable); > if (kvm_is_error_hva(hva) || !writable) > return SBI_ERR_INVALID_ADDRESS; > > ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); > if (ret) > return SBI_ERR_FAILURE; > > the error code returned to the guest will be different if the memslot is read-only > versus if the VMA is read-only (or not even mapped!). Unless every read-only > memslot is explicitly communicated as such to the guest, I don't see how the guest > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case > but not when the underlying VMA isn't writable seems odd. > > It's also entirely possible the memslot could be replaced with a read-only memslot > after the check, or vice versa, i.e. become writable after being rejected. Is it > *really* a problem to return FAILURE if the guest attempts to setup steal-time in > a read-only memslot? I.e. why not do this and call it good? > Reposting the response as gmail converted my previous response as html. Sorry for the spam. From a functionality pov, that should be fine. However, we have explicit error conditions for read only memory defined in the SBI STA specification[1]. Technically, we will violate the spec if we return FAILURE instead of INVALID_ADDRESS for read only memslot. TBH, I don't save much duplicate code with the new generic API now. If you don't see if the generic API will be useful in other cases, I can drop that patch and changes in the steal time code. 
[1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-steal-time.adoc#table_sta_steal_time_set_shmem_errors > if (!kvm_is_gpa_in_memslot(vcpu->kvm, shmem >> PAGE_SHIFT)) > return SBI_ERR_INVALID_ADDRESS; > > ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); > if (ret) > return SBI_ERR_FAILURE; From christophe.leroy at csgroup.eu Thu Sep 4 00:10:01 2025 From: christophe.leroy at csgroup.eu (Christophe Leroy) Date: Thu, 4 Sep 2025 09:10:01 +0200 Subject: [PATCH v3 5/7] powerpc: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-6-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-6-vishal.moola@gmail.com> Message-ID: <3b333f4e-9817-4a5b-bf0a-f8a9d33575e9@csgroup.eu> On 03/09/2025 at 20:59, Vishal Moola (Oracle) wrote: > free_pages() should be used when we only have a virtual address. We > should call __free_pages() directly on our page instead. > > Signed-off-by: Vishal Moola (Oracle) > Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Christophe Leroy > --- > arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c > index be523e5fe9c5..73977dbabcf2 100644 > --- a/arch/powerpc/mm/book3s64/radix_pgtable.c > +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c > @@ -780,7 +780,7 @@ static void __meminit free_vmemmap_pages(struct page *page, > while (nr_pages--) > free_reserved_page(page++); > } else > - free_pages((unsigned long)page_address(page), order); > + __free_pages(page, order); > } > > static void __meminit remove_pte_table(pte_t *pte_start, unsigned long addr, From uwu at icenowy.me Thu Sep 4 00:31:44 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:44 +0800 Subject: [PATCH v2 0/7] drm/etnaviv: add support for GC620 on T-Head TH1520 Message-ID: 
<20250904073151.686227-1-uwu@icenowy.me> This patchset tries to add support for the GC620 2D accelerator, which is a quirky thing -- it has quirks on both MMU and DEC. The DEC quirk is bound to the model number and revision number currently, and only involves writing to some DEC registers at specific situation. The MMU quirk is more weird -- it contains a broken implementation of PTA, which blocks directly writing MTLB address to switch MMU context, but loading page table IDs different to the initial one does not work either. A shared context practice, like what's done for IOMMUv1, has to be used instead. The DT patch isn't ready because the VP (video processing) subsystem on TH1520 does not have proper clock and reset driver yet, and the DT patch included in this patchset uses fake clocks and ignore resets. Tested by both the etnaviv_2d_test program in libdrm tests and xf86-video-thead 2D-accelerated DDX. Icenowy Zheng (7): drm/etnaviv: add HWDB entry for GC620 r5552 c20b drm/etnaviv: add handle for GPUs with only SECURITY_AHB flag drm/etnaviv: setup DEC400EX on GC620 r5552 drm/etnaviv: protect whole iommuv2 ctx alloc func under global mutex drm/etnaviv: prepare for shared_context support for iommuv2 drm/etnaviv: add shared context support for iommuv2 [NOT FOR UPSTREAM] riscv: dts: thead: enable GC620 G2D on TH1520 arch/riscv/boot/dts/thead/th1520.dtsi | 19 +++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 19 ++++++++++--- drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 31 ++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_iommu.c | 8 +++--- drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c | 26 +++++++++++++----- drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 1 + drivers/gpu/drm/etnaviv/etnaviv_mmu.h | 24 +++++++---------- 7 files changed, 99 insertions(+), 29 deletions(-) -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:45 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:45 +0800 Subject: [PATCH v2 1/7] drm/etnaviv: add HWDB entry for GC620 r5552 c20b 
In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: <20250904073151.686227-2-uwu@icenowy.me> This is the 2D GPU found on the T-Head TH1520 SoC. Feature bits taken from the downstream kernel driver 6.4.6.9.354872. Signed-off-by: Icenowy Zheng --- No changes in v2. drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 31 ++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c index 8665f2658d51b..6a56f1ab44449 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c @@ -69,6 +69,37 @@ static const struct etnaviv_chip_identity etnaviv_chip_identities[] = { .minor_features10 = 0x00000000, .minor_features11 = 0x00000000, }, + { + .model = 0x620, + .revision = 0x5552, + .product_id = 0x6200, + .customer_id = 0x20b, + .eco_id = 0, + .stream_count = 1, + .register_max = 64, + .thread_count = 256, + .shader_core_count = 1, + .vertex_cache_size = 8, + .vertex_output_buffer_size = 512, + .pixel_pipes = 1, + .instruction_count = 256, + .num_constants = 168, + .buffer_size = 0, + .varyings_count = 8, + .features = 0x001b4a40, + .minor_features0 = 0xa0600080, + .minor_features1 = 0x18050000, + .minor_features2 = 0x04f30000, + .minor_features3 = 0x00060005, + .minor_features4 = 0x20629000, + .minor_features5 = 0x0003380c, + .minor_features6 = 0x00000000, + .minor_features7 = 0x00001000, + .minor_features8 = 0x00000000, + .minor_features9 = 0x00000180, + .minor_features10 = 0x00004000, + .minor_features11 = 0x00000000, + }, { .model = 0x7000, .revision = 0x6202, -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:46 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:46 +0800 Subject: [PATCH v2 2/7] drm/etnaviv: add handle for GPUs with only SECURITY_AHB flag In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: 
<20250904073151.686227-3-uwu@icenowy.me> In the GC620 on T-Head TH1520 SoC, the SECURITY feature flag isn't set but the SECURITY_AHB feature flag is set. In this situation, the VIVS_MMUv2_AHB_CONTROL register isn't available, but the GPU otherwise behaves like secure ones and requires commands to load PTA. The 6.4.6.9.354872 driver from T-Head asserts SECURITY_AHB feature flag is set when SECURITY one is set, so it could be assumed that the situation that only SECURITY is set does not exist. Signed-off-by: Icenowy Zheng --- No changes in v2. drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index cf0d9049bcf1e..7431e180b3ae4 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -559,7 +559,7 @@ static int etnaviv_hw_reset(struct etnaviv_gpu *gpu) control |= VIVS_HI_CLOCK_CONTROL_ISOLATE_GPU; gpu_write(gpu, VIVS_HI_CLOCK_CONTROL, control); - if (gpu->sec_mode == ETNA_SEC_KERNEL) { + if (gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) { gpu_write(gpu, VIVS_MMUv2_AHB_CONTROL, VIVS_MMUv2_AHB_CONTROL_RESET); } else { @@ -797,7 +797,7 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu) gpu_write(gpu, VIVS_MC_BUS_CONFIG, bus_config); } - if (gpu->sec_mode == ETNA_SEC_KERNEL) { + if (gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) { u32 val = gpu_read(gpu, VIVS_MMUv2_AHB_CONTROL); val |= VIVS_MMUv2_AHB_CONTROL_NONSEC_ACCESS; gpu_write(gpu, VIVS_MMUv2_AHB_CONTROL, val); @@ -853,7 +853,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu) * On cores with security features supported, we claim control over the * security states. 
*/ - if ((gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) && + if ((gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) || (gpu->identity.minor_features10 & chipMinorFeatures10_SECURITY_AHB)) gpu->sec_mode = ETNA_SEC_KERNEL; -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:47 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:47 +0800 Subject: [PATCH v2 3/7] drm/etnaviv: setup DEC400EX on GC620 r5552 In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: <20250904073151.686227-4-uwu@icenowy.me> The GC620 r5552 GPU found on T-Head TH1520 features (and requires) a DEC400EX buffer compressor that needs to be set up. In addition, some quirks exist for the DEC400 part that need to be handled during GPU reset, otherwise the reset will not happen. Set the DEC400EX up and add the quirk code to the GPU reset codepath. Currently the DEC400EX setup is gated by this specific GPU identity, however in future we should add a feature flag for it. Signed-off-by: Icenowy Zheng --- No changes in v2. 
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 7431e180b3ae4..a8d4394c8f637 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -559,6 +559,10 @@ static int etnaviv_hw_reset(struct etnaviv_gpu *gpu) control |= VIVS_HI_CLOCK_CONTROL_ISOLATE_GPU; gpu_write(gpu, VIVS_HI_CLOCK_CONTROL, control); + if (etnaviv_is_model_rev(gpu, 0x620, 0x5552)) { + gpu_write(gpu, VIVS_DEC400EX_UNK00800, 0x10); + } + if (gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) { gpu_write(gpu, VIVS_MMUv2_AHB_CONTROL, VIVS_MMUv2_AHB_CONTROL_RESET); @@ -797,6 +801,15 @@ static void etnaviv_gpu_hw_init(struct etnaviv_gpu *gpu) gpu_write(gpu, VIVS_MC_BUS_CONFIG, bus_config); } + /* + * FIXME: Required by GC620 r5552 as a bug workaround, but might be + * useful on other GPUs with G2D_DEC400EX feature too. + */ + if (etnaviv_is_model_rev(gpu, 0x620, 0x5552)) { + gpu_write(gpu, VIVS_DEC400EX_UNK00800, 0x2010188); + gpu_write(gpu, VIVS_DEC400EX_UNK00808, 0x3fc104); + } + if (gpu->identity.minor_features7 & chipMinorFeatures7_BIT_SECURITY) { u32 val = gpu_read(gpu, VIVS_MMUv2_AHB_CONTROL); val |= VIVS_MMUv2_AHB_CONTROL_NONSEC_ACCESS; -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:48 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:48 +0800 Subject: [PATCH v2 4/7] drm/etnaviv: protect whole iommuv2 ctx alloc func under global mutex In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: <20250904073151.686227-5-uwu@icenowy.me> As we are forced to use a global shared context on some PTA-equipped-but-broken GPUs, the fine-grained mutex locking in the current implementation of etnaviv_iommuv2_context_alloc() won't be meaningful any more. 
Make the whole function to be protected by the global lock, in order to prevent reentrance when allocating global shared context. Signed-off-by: Icenowy Zheng --- No changes in v2. drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c index d664ae29ae209..5654a604c70cf 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c @@ -272,20 +272,18 @@ etnaviv_iommuv2_context_alloc(struct etnaviv_iommu_global *global) struct etnaviv_iommuv2_context *v2_context; struct etnaviv_iommu_context *context; + mutex_lock(&global->lock); + v2_context = vzalloc(sizeof(*v2_context)); if (!v2_context) - return NULL; + goto out_mutex_unlock; - mutex_lock(&global->lock); v2_context->id = find_first_zero_bit(global->v2.pta_alloc, ETNAVIV_PTA_ENTRIES); - if (v2_context->id < ETNAVIV_PTA_ENTRIES) { + if (v2_context->id < ETNAVIV_PTA_ENTRIES) set_bit(v2_context->id, global->v2.pta_alloc); - } else { - mutex_unlock(&global->lock); + else goto out_free; - } - mutex_unlock(&global->lock); v2_context->mtlb_cpu = dma_alloc_wc(global->dev, SZ_4K, &v2_context->mtlb_dma, GFP_KERNEL); @@ -304,11 +302,14 @@ etnaviv_iommuv2_context_alloc(struct etnaviv_iommu_global *global) INIT_LIST_HEAD(&context->mappings); drm_mm_init(&context->mm, SZ_4K, (u64)SZ_1G * 4 - SZ_4K); + mutex_unlock(&global->lock); return context; out_free_id: clear_bit(v2_context->id, global->v2.pta_alloc); out_free: vfree(v2_context); +out_mutex_unlock: + mutex_unlock(&global->lock); return NULL; } -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:49 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:49 +0800 Subject: [PATCH v2 5/7] drm/etnaviv: prepare for shared_context support for iommuv2 In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: 
<20250904073151.686227-6-uwu@icenowy.me> As we have some unfortunate GPUs with IOMMUv2 but broken PTA (reloading a different page table at runtime always fails), shared_context is now not a v1-only thing. Move it out of the v1 struct in the union. Signed-off-by: Icenowy Zheng --- No changes in v2. drivers/gpu/drm/etnaviv/etnaviv_iommu.c | 8 ++++---- drivers/gpu/drm/etnaviv/etnaviv_mmu.h | 22 +++++++--------------- 2 files changed, 11 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_iommu.c b/drivers/gpu/drm/etnaviv/etnaviv_iommu.c index afe5dd6a9925b..6fdce63b9971a 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_iommu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_iommu.c @@ -39,7 +39,7 @@ static void etnaviv_iommuv1_free(struct etnaviv_iommu_context *context) dma_free_wc(context->global->dev, PT_SIZE, v1_context->pgtable_cpu, v1_context->pgtable_dma); - context->global->v1.shared_context = NULL; + context->global->shared_context = NULL; kfree(v1_context); } @@ -136,8 +136,8 @@ etnaviv_iommuv1_context_alloc(struct etnaviv_iommu_global *global) * a stop the world operation, so we only support a single shared * context with this version. 
*/ - if (global->v1.shared_context) { - context = global->v1.shared_context; + if (global->shared_context) { + context = global->shared_context; etnaviv_iommu_context_get(context); mutex_unlock(&global->lock); return context; @@ -163,7 +163,7 @@ etnaviv_iommuv1_context_alloc(struct etnaviv_iommu_global *global) mutex_init(&context->lock); INIT_LIST_HEAD(&context->mappings); drm_mm_init(&context->mm, GPU_MEM_START, PT_ENTRIES * SZ_4K); - context->global->v1.shared_context = context; + context->global->shared_context = context; mutex_unlock(&global->lock); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h index 7f8ac01785474..2ec4acda02bc6 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h @@ -49,21 +49,13 @@ struct etnaviv_iommu_global { u32 memory_base; - /* - * This union holds members needed by either MMUv1 or MMUv2, which - * can not exist at the same time. - */ - union { - struct { - struct etnaviv_iommu_context *shared_context; - } v1; - struct { - /* P(age) T(able) A(rray) */ - u64 *pta_cpu; - dma_addr_t pta_dma; - DECLARE_BITMAP(pta_alloc, ETNAVIV_PTA_ENTRIES); - } v2; - }; + struct etnaviv_iommu_context *shared_context; + struct { + /* P(age) T(able) A(rray) */ + u64 *pta_cpu; + dma_addr_t pta_dma; + DECLARE_BITMAP(pta_alloc, ETNAVIV_PTA_ENTRIES); + } v2; }; struct etnaviv_iommu_context { -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:50 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:50 +0800 Subject: [PATCH v2 6/7] drm/etnaviv: add shared context support for iommuv2 In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: <20250904073151.686227-7-uwu@icenowy.me> Unfortunately the GC620 GPU seems to have broken PTA capability, and switching page table ID in command stream after it's running won't work. 
As directly switching mtlb isn't working either, there will be no reliable way to switch page table in the command stream, and a shared context, like iommuv1, is needed. Add support for this shared context situation. Shared context is set when the broken PTA is known, and the context allocation code will be made short circuit when a shared context is set. Signed-off-by: Icenowy Zheng --- Changes in v2: - Add the shared_context cleanup code in etnaviv_iommuv2_free() to fix issues when the GPU is closed and opened again. drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c | 11 +++++++++++ drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 1 + drivers/gpu/drm/etnaviv/etnaviv_mmu.h | 2 ++ 3 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c index 5654a604c70cf..16b89e72602a3 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_iommu_v2.c @@ -63,6 +63,9 @@ static void etnaviv_iommuv2_free(struct etnaviv_iommu_context *context) clear_bit(v2_context->id, context->global->v2.pta_alloc); + if (context->global->shared_context == context) + context->global->shared_context = NULL; + vfree(v2_context); } static int @@ -273,6 +276,12 @@ etnaviv_iommuv2_context_alloc(struct etnaviv_iommu_global *global) struct etnaviv_iommu_context *context; mutex_lock(&global->lock); + if (global->shared_context) { + context = global->shared_context; + etnaviv_iommu_context_get(context); + mutex_unlock(&global->lock); + return context; + } v2_context = vzalloc(sizeof(*v2_context)); if (!v2_context) @@ -301,6 +310,8 @@ etnaviv_iommuv2_context_alloc(struct etnaviv_iommu_global *global) mutex_init(&context->lock); INIT_LIST_HEAD(&context->mappings); drm_mm_init(&context->mm, SZ_4K, (u64)SZ_1G * 4 - SZ_4K); + if (global->v2.broken_pta) + global->shared_context = context; mutex_unlock(&global->lock); return context; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c 
b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c index df5192083b201..a0f9c950504e0 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c @@ -504,6 +504,7 @@ int etnaviv_iommu_global_init(struct etnaviv_gpu *gpu) memset32(global->bad_page_cpu, 0xdead55aa, SZ_4K / sizeof(u32)); if (version == ETNAVIV_IOMMU_V2) { + global->v2.broken_pta = gpu->identity.model == chipModel_GC620; global->v2.pta_cpu = dma_alloc_wc(dev, ETNAVIV_PTA_SIZE, &global->v2.pta_dma, GFP_KERNEL); if (!global->v2.pta_cpu) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h index 2ec4acda02bc6..5627d2a0d0237 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.h @@ -55,6 +55,8 @@ struct etnaviv_iommu_global { u64 *pta_cpu; dma_addr_t pta_dma; DECLARE_BITMAP(pta_alloc, ETNAVIV_PTA_ENTRIES); + /* Whether runtime switching page table ID will fail */ + bool broken_pta; } v2; }; -- 2.51.0 From uwu at icenowy.me Thu Sep 4 00:31:51 2025 From: uwu at icenowy.me (Icenowy Zheng) Date: Thu, 4 Sep 2025 15:31:51 +0800 Subject: [PATCH v2 7/7] [NOT FOR UPSTREAM] riscv: dts: thead: enable GC620 G2D on TH1520 In-Reply-To: <20250904073151.686227-1-uwu@icenowy.me> References: <20250904073151.686227-1-uwu@icenowy.me> Message-ID: <20250904073151.686227-8-uwu@icenowy.me> The T-Head TH1520 SoC contains a GC620 2D graphics accelerator. Enable it in the devicetree to allow using etnaviv driver with it. This patch is currently very dirty because it relies on the bootloader leaving the clocks enabled, and the core clock is a fake one. Signed-off-by: Icenowy Zheng --- No changes in v2. 
arch/riscv/boot/dts/thead/th1520.dtsi | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi index 03f1d73190499..bc7dd7ee59dd5 100644 --- a/arch/riscv/boot/dts/thead/th1520.dtsi +++ b/arch/riscv/boot/dts/thead/th1520.dtsi @@ -225,6 +225,13 @@ aonsys_clk: clock-73728000 { #clock-cells = <0>; }; + gc620_cclk: clk-gc620-fake { + compatible = "fixed-clock"; + clock-frequency = <264000000>; + clock-output-names = "gc620_cclk"; + #clock-cells = <0>; + }; + stmmac_axi_config: stmmac-axi-config { snps,wr_osr_lmt = <15>; snps,rd_osr_lmt = <15>; @@ -495,6 +502,18 @@ uart2: serial at ffec010000 { status = "disabled"; }; + /* Vivante GC620, 2D only */ + g2d: gpu at ffecc80000 { + compatible = "vivante,gc"; + reg = <0xff 0xecc80000 0x0 0x40000>; + interrupt-parent = <&plic>; + interrupts = <101 IRQ_TYPE_LEVEL_HIGH>; + + clocks = <&gc620_cclk>; + clock-names = "core"; + status = "okay"; + }; + clk: clock-controller at ffef010000 { compatible = "thead,th1520-clk-ap"; reg = <0xff 0xef010000 0x0 0x1000>; -- 2.51.0 From alex at ghiti.fr Thu Sep 4 00:59:46 2025 From: alex at ghiti.fr (Alexandre Ghiti) Date: Thu, 4 Sep 2025 09:59:46 +0200 Subject: [PATCH RFC] riscv: Do not handle break traps from kernel as nmi In-Reply-To: <20250903202803.GQ4067720@noisy.programming.kicks-ass.net> References: <20250903-dev-alex-break_nmi_v1-v1-1-4a3d81c29598@rivosinc.com> <20250903202803.GQ4067720@noisy.programming.kicks-ass.net> Message-ID: Hi Peter, On 9/3/25 22:28, Peter Zijlstra wrote: > On Wed, Sep 03, 2025 at 07:54:29PM +0000, Alexandre Ghiti wrote: >> kprobe has been broken on riscv for quite some time. There is an attempt >> [1] to fix that which actually works. This patch works because it enables >> ARCH_HAVE_NMI_SAFE_CMPXCHG and that makes the ring buffer allocation >> succeed when handling a kprobe because we handle *all* kprobes in nmi >> context. 
We do so because Peter advised us to treat all kernel traps as >> nmi [2]. >> >> But that does not seem right for kprobe handling, so instead, treat >> break traps from kernel as non-nmi. > You can put a kprobe inside: local_irq_disable(), no? Inside any random > spinlock region in fact. How is the probe then not NMI like? Yes yes, in that case that will be NMI-like, sorry this patch is coarse grain. The ideal solution would be to re-enable the interrupts if they were enabled at the moment of the trap. In that case, would that make sense to you? Thanks, Alex > > _______________________________________________ > linux-riscv mailing list > linux-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv From emil.renner.berthing at canonical.com Thu Sep 4 02:02:22 2025 From: emil.renner.berthing at canonical.com (Emil Renner Berthing) Date: Thu, 4 Sep 2025 02:02:22 -0700 Subject: [PATCH v3 RESEND 0/3] riscv: dts: starfive: jh7110: More U-Boot downstream changes for JH7110 In-Reply-To: <20250823100159.203925-1-e@freeshell.de> References: <20250823100159.203925-1-e@freeshell.de> Message-ID: E Shattow wrote: > Bring in additional downstream U-Boot boot loader changes for StarFive > VisionFive2 board target (and related JH7110 common boards). Create a > basic dt-binding (and not any Linux driver) in support of the > memory-controller dts node used in mainline U-Boot. Also add > bootph-pre-ram hinting to jh7110.dtsi needed at SPL boot phase. > > Changes since v2: > > - patch 1/3 "add StarFive JH7110 SoC DMC": wrap at 80 col, clock-names > const is 'pll'. > > - patch 2/3 "add memory controller node": memory-controller node follows > sorting style by reg address, between watchdog and crypto nodes. Update > clock-names to 'pll'. > > - patch 3/3 "bootph-pre-ram hinting needed by boot loader": add missing > hints for syscrg dependencies 'gmac1_rgmii_rxin', 'gmac1_rmii_refin', > and 'pllclk'. 
> > E Shattow (3): > dt-bindings: memory-controllers: add StarFive JH7110 SoC DMC > riscv: dts: starfive: jh7110: add DMC memory controller > riscv: dts: starfive: jh7110: bootph-pre-ram hinting needed by boot > loader > > .../starfive,jh7110-dmc.yaml | 74 +++++++++++++++++++ > arch/riscv/boot/dts/starfive/jh7110.dtsi | 24 ++++++ > 2 files changed, 98 insertions(+) > create mode 100644 Documentation/devicetree/bindings/memory-controllers/starfive,jh7110-dmc.yaml Thank you! For the whole series: Reviewed-by: Emil Renner Berthing From dayss1224 at gmail.com Thu Sep 4 02:48:21 2025 From: dayss1224 at gmail.com (Dong Yang) Date: Thu, 4 Sep 2025 17:48:21 +0800 Subject: [PATCH v3 0/3] KVM: riscv: selftests: Enable supported test cases In-Reply-To: <20250902-9cc0d0dad59ba680062dbbf8@orel> References: <20250902-9cc0d0dad59ba680062dbbf8@orel> Message-ID: On Tue, Sep 02, 2025 at 10:36:10AM -0500, Andrew Jones wrote: > On Mon, Sep 01, 2025 at 03:35:48PM +0800, dayss1224 at gmail.com wrote: > > From: Dong Yang > > > > Add supported KVM test cases and fix the compilation dependencies. 
> > --- > > Changes in v3: > > - Reorder patches to fix build dependencies > > - Sort common supported test cases alphabetically > > - Move ucall_common.h include from common header to specific source files > > > > Changes in v2: > > - Delete some repeat KVM test cases on riscv > > - Add missing headers to fix the build for new RISC-V KVM selftests > > > > Dong Yang (1): > > KVM: riscv: selftests: Add missing headers for new testcases > > > > Quan Zhou (2): > > KVM: riscv: selftests: Use the existing RISCV_FENCE macro in > > `rseq-riscv.h` > > KVM: riscv: selftests: Add common supported test cases > > > > tools/testing/selftests/kvm/Makefile.kvm | 6 ++++++ > > tools/testing/selftests/kvm/access_tracking_perf_test.c | 1 + > > tools/testing/selftests/kvm/include/riscv/processor.h | 1 + > > .../selftests/kvm/memslot_modification_stress_test.c | 1 + > > tools/testing/selftests/kvm/memslot_perf_test.c | 1 + > > tools/testing/selftests/rseq/rseq-riscv.h | 3 +-- > > 6 files changed, 11 insertions(+), 2 deletions(-) > > > > -- > > 2.34.1 > > In the future please CC previous reviewers on the entire series > (particularly when they have reviewed the entire previous series). Okay, I will pay attention to this in the future. Thanks. > > For the series, > > Reviewed-by: Andrew Jones From ulf.hansson at linaro.org Thu Sep 4 03:14:31 2025 From: ulf.hansson at linaro.org (Ulf Hansson) Date: Thu, 4 Sep 2025 12:14:31 +0200 Subject: [PATCH 2/2] pmdomain: thead: create auxiliary device for rebooting In-Reply-To: <20250818074906.2907277-3-uwu@icenowy.me> References: <20250818074906.2907277-1-uwu@icenowy.me> <20250818074906.2907277-3-uwu@icenowy.me> Message-ID: On Mon, 18 Aug 2025 at 09:49, Icenowy Zheng wrote: > > The reboot / power off operations require communication with the AON > firmware too. > > As the driver is already present, create an auxiliary device with name > "reboot" to match that driver, and pass the AON channel by using > platform_data. 
> > Signed-off-by: Icenowy Zheng > --- > drivers/pmdomain/thead/th1520-pm-domains.c | 35 ++++++++++++++++++++-- > 1 file changed, 33 insertions(+), 2 deletions(-) > > diff --git a/drivers/pmdomain/thead/th1520-pm-domains.c b/drivers/pmdomain/thead/th1520-pm-domains.c > index 9040b698e7f7f..8285f552897b0 100644 > --- a/drivers/pmdomain/thead/th1520-pm-domains.c > +++ b/drivers/pmdomain/thead/th1520-pm-domains.c > @@ -129,12 +129,39 @@ static void th1520_pd_init_all_off(struct generic_pm_domain **domains, > } > } > > -static void th1520_pd_pwrseq_unregister_adev(void *adev) > +static void th1520_pd_unregister_adev(void *adev) > { > auxiliary_device_delete(adev); > auxiliary_device_uninit(adev); > } > > +static int th1520_pd_reboot_init(struct device *dev, struct th1520_aon_chan *aon_chan) > +{ > + struct auxiliary_device *adev; > + int ret; > + > + adev = devm_kzalloc(dev, sizeof(*adev), GFP_KERNEL); > + if (!adev) > + return -ENOMEM; > + > + adev->name = "reboot"; > + adev->dev.parent = dev; > + adev->dev.platform_data = aon_chan; > + > + ret = auxiliary_device_init(adev); > + if (ret) > + return ret; > + > + ret = auxiliary_device_add(adev); > + if (ret) { > + auxiliary_device_uninit(adev); > + return ret; > + } > + > + return devm_add_action_or_reset(dev, th1520_pd_unregister_adev, > + adev); We have devm_auxiliary_device_create() now, I suggest we use that instead. That said, I think it would make sense to convert the pwrseq-gpu auxiliary device to be registered with devm_auxiliary_device_create() too, but that's a separate change, of course. 
> +} > + > static int th1520_pd_pwrseq_gpu_init(struct device *dev) > { > struct auxiliary_device *adev; > @@ -169,7 +196,7 @@ static int th1520_pd_pwrseq_gpu_init(struct device *dev) > return ret; > } > > - return devm_add_action_or_reset(dev, th1520_pd_pwrseq_unregister_adev, > + return devm_add_action_or_reset(dev, th1520_pd_unregister_adev, > adev); > } > > @@ -235,6 +262,10 @@ static int th1520_pd_probe(struct platform_device *pdev) > if (ret) > goto err_clean_provider; > > + ret = th1520_pd_reboot_init(dev, aon_chan); > + if (ret) > + goto err_clean_provider; > + > return 0; > > err_clean_provider: > -- > 2.50.1 > Otherwise this looks good to me! Kind regards Uffe From hengqi.chen at gmail.com Thu Sep 4 03:38:06 2025 From: hengqi.chen at gmail.com (Hengqi Chen) Date: Thu, 4 Sep 2025 10:38:06 +0000 Subject: [PATCH v2 bpf-next] riscv, bpf: Sign extend struct ops return values properly Message-ID: <20250904103806.18937-1-hengqi.chen@gmail.com> The ns_bpf_qdisc selftest triggers a kernel panic: Unable to handle kernel paging request at virtual address ffffffffa38dbf58 Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 Oops [#1] Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 epc : __qdisc_run+0x82/0x6f0 ra : __qdisc_run+0x6e/0x6f0 epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : 
ff2000000eecb5d0 s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 t5 : 0000000000000000 t6 : ff60000093a6a8b6 status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d [] __qdisc_run+0x82/0x6f0 [] __dev_queue_xmit+0x4c0/0x1128 [] neigh_resolve_output+0xd0/0x170 [] ip6_finish_output2+0x226/0x6c8 [] ip6_finish_output+0x10c/0x2a0 [] ip6_output+0x5e/0x178 [] ip6_xmit+0x29a/0x608 [] inet6_csk_xmit+0xe6/0x140 [] __tcp_transmit_skb+0x45c/0xaa8 [] tcp_connect+0x9ce/0xd10 [] tcp_v6_connect+0x4ac/0x5e8 [] __inet_stream_connect+0xd8/0x318 [] inet_stream_connect+0x3e/0x68 [] __sys_connect_file+0x50/0x88 [] __sys_connect+0x96/0xc8 [] __riscv_sys_connect+0x20/0x30 [] do_trap_ecall_u+0x256/0x378 [] handle_exception+0x14a/0x156 Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns an skb, which is a pointer. The pointer is treated as a 32-bit value and sign-extended to 64 bits in the epilogue. This behavior is right for most bpf prog types but wrong for struct ops, which require the RISC-V ABI. So let's sign-extend struct ops return values according to the function model and the RISC-V ABI ([0]).
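For illustration only (this is not part of the patch): a minimal user-space C sketch of why treating a returned pointer as a 32-bit value breaks it. The helper names `sext`, `ret_as_32bit` and `ret_by_size` are hypothetical, the input pointer value is an assumption chosen for the example, and only the 4-byte and 8-byte cases are modeled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend the low `bits` bits of a 64-bit register value, using
 * unsigned arithmetic only so the result is well defined in ISO C.
 * `bits` must be in the range 1..63.
 */
static inline uint64_t sext(uint64_t reg, unsigned int bits)
{
	uint64_t sign = 1ULL << (bits - 1);
	uint64_t low = reg & ((sign << 1) - 1);

	return (low ^ sign) - sign;
}

/* Behavior described above: the return value is always handled as 32-bit. */
static inline uint64_t ret_as_32bit(uint64_t r0)
{
	return sext(r0, 32);
}

/*
 * Simplified, hypothetical model of size-aware handling: 4-byte values
 * are sign-extended, 8-byte values (such as pointers) pass through
 * unchanged. Narrower sizes are left out of this sketch.
 */
static inline uint64_t ret_by_size(uint64_t r0, unsigned int size)
{
	return size == 4 ? sext(r0, 32) : r0;
}
```

With an assumed pointer value of 0xff600000a38dbee8, `ret_as_32bit()` yields 0xffffffffa38dbee8, which has the same shape as the a0 value in the register dump above, while `ret_by_size(ptr, 8)` keeps the pointer intact.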
[0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") Signed-off-by: Hengqi Chen --- arch/riscv/net/bpf_jit_comp64.c | 38 ++++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index 549c3063c7f1..c7ae4d0a8361 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -954,6 +954,35 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of return ret; } +/* + * Sign-extend the register if necessary + */ +static int sign_extend(int rd, int rs, u8 size, u8 flags, struct rv_jit_context *ctx) +{ + if (!(flags & BTF_FMODEL_SIGNED_ARG) && (size == 1 || size == 2)) + return 0; + + switch (size) { + case 1: + emit_sextb(rd, rs, ctx); + break; + case 2: + emit_sexth(rd, rs, ctx); + break; + case 4: + emit_sextw(rd, rs, ctx); + break; + case 8: + emit_mv(rd, rs, ctx); + break; + default: + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); + return -EINVAL; + } + + return 0; +} + static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, const struct btf_func_model *m, struct bpf_tramp_links *tlinks, @@ -1175,8 +1204,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); if (save_ret) { - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); + if (is_struct_ops) { + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], + m->ret_size, m->ret_flags, ctx); + if (ret) + goto out; + } else { + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); + } } emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); -- 2.45.2 From hengqi.chen at gmail.com Thu Sep 4 03:51:18 2025 From: hengqi.chen at gmail.com (Hengqi Chen) Date: Thu, 4 Sep 2025 10:51:18 +0000 Subject: [PATCH 
bpf-next] riscv, bpf: Remove duplicated bpf_flush_icache() Message-ID: <20250904105119.21861-1-hengqi.chen@gmail.com> The bpf_flush_icache() is done by bpf_arch_text_copy() already. Remove the duplicated one in arch_prepare_bpf_trampoline(). Signed-off-by: Hengqi Chen --- arch/riscv/net/bpf_jit_comp64.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index c7ae4d0a8361..3fcc011c6be4 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -1305,7 +1305,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, goto out; } - bpf_flush_icache(ro_image, ro_image_end); out: kvfree(image); return ret < 0 ? ret : size; -- 2.45.2 From conor at kernel.org Thu Sep 4 04:28:19 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 4 Sep 2025 12:28:19 +0100 Subject: RISC-V: Re-enable GCC+Rust builds In-Reply-To: References: <68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com> <20250830-cheesy-prone-ee5fae406c22@spud> <20250901-lasso-kabob-de32b8fcede8@spud> <20250901-unseemly-blimp-a74e3c77e780@spud> Message-ID: <20250904-little-fester-546dac576c72@spud> On Wed, Sep 03, 2025 at 08:59:29AM +0800, Asuna wrote: > > That particular one might be a problem not because of -mstack-protector-guard itself, but rather three options get added at once: > > $(eval KBUILD_CFLAGS += -mstack-protector-guard=tls \ > > -mstack-protector-guard-reg=tp \ > > -mstack-protector-guard-offset=$(shell \ > > awk '{if ($$2 == "TSK_STACK_CANARY") print $$3;}' \ > > $(objtree)/include/generated/asm-offsets.h)) > > and the other ones might be responsible for the error. > > > I still don't understand the problem here. `bindgen_skip_c_flags` in > `rust/Makefile` contains a pattern `-mstack-protector-guard%`, the % at the > end enables it to match all those 3 options at the same time, and > `filter-out` function removes them before passing to Rust bindgen's > libclang. Am I missing something here? 
If they don't ever appear with gcc + llvm builds, that's fine. > > Similarly, something like -Wno-unterminated-string-initialization could cause a problem if gcc supports it but not libclang. > > > Yes. However, this option is only about warnings, not architecture related > and does not affect the generated results, so simply adding it into > `bindgen_skip_c_flags` patterns should be enough, I think. > > > I think you're mostly better off catching that sort of thing in Kconfig, where possible and just make incompatible mixes invalid. What's actually incompatible is likely going to depend heavily on what options are enabled. > > Sounds better, I'll go down that path. From rppt at kernel.org Thu Sep 4 04:51:06 2025 From: rppt at kernel.org (Mike Rapoport) Date: Thu, 4 Sep 2025 14:51:06 +0300 Subject: [PATCH v3 3/7] x86: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-4-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-4-vishal.moola@gmail.com> Message-ID: On Wed, Sep 03, 2025 at 11:59:17AM -0700, Vishal Moola (Oracle) wrote: > free_pages() should be used when we only have a virtual address. We > should call __free_pages() directly on our page instead.
> > Signed-off-by: Vishal Moola (Oracle) > Acked-by: Dave Hansen > --- > arch/x86/mm/init_64.c | 2 +- > arch/x86/platform/efi/memmap.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index b9426fce5f3e..0e4270e20fad 100644 > --- a/arch/x86/mm/init_64.c > +++ b/arch/x86/mm/init_64.c > @@ -1031,7 +1031,7 @@ static void __meminit free_pagetable(struct page *page, int order) > free_reserved_pages(page, nr_pages); > #endif > } else { > - free_pages((unsigned long)page_address(page), order); > + __free_pages(page, order); > } > } > > diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c > index 061b8ecc71a1..023697c88910 100644 > --- a/arch/x86/platform/efi/memmap.c > +++ b/arch/x86/platform/efi/memmap.c > @@ -42,7 +42,7 @@ void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags) > struct page *p = pfn_to_page(PHYS_PFN(phys)); > unsigned int order = get_order(size); > > - free_pages((unsigned long) page_address(p), order); Could be just free_pages((unsigned long)phys_to_virt(phys), order), then the page is not needed at all. > + __free_pages(p, order); > } > } > > -- > 2.51.0 > > -- Sincerely yours, Mike. From rppt at kernel.org Thu Sep 4 04:54:24 2025 From: rppt at kernel.org (Mike Rapoport) Date: Thu, 4 Sep 2025 14:54:24 +0300 Subject: [PATCH v3 3/7] x86: Stop calling page_address() in free_pages() In-Reply-To: References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-4-vishal.moola@gmail.com> Message-ID: On Thu, Sep 04, 2025 at 02:51:14PM +0300, Mike Rapoport wrote: > On Wed, Sep 03, 2025 at 11:59:17AM -0700, Vishal Moola (Oracle) wrote: > > free_pages() should be used when we only have a virtual address. We > > should call __free_pages() directly on our page instead. 
> > > > Signed-off-by: Vishal Moola (Oracle) > > Acked-by: Dave Hansen > > --- > > arch/x86/mm/init_64.c | 2 +- > > arch/x86/platform/efi/memmap.c | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > > index b9426fce5f3e..0e4270e20fad 100644 > > --- a/arch/x86/mm/init_64.c > > +++ b/arch/x86/mm/init_64.c > > @@ -1031,7 +1031,7 @@ static void __meminit free_pagetable(struct page *page, int order) > > free_reserved_pages(page, nr_pages); > > #endif > > } else { > > - free_pages((unsigned long)page_address(page), order); > > + __free_pages(page, order); > > } > > } > > > > diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c > > index 061b8ecc71a1..023697c88910 100644 > > --- a/arch/x86/platform/efi/memmap.c > > +++ b/arch/x86/platform/efi/memmap.c > > @@ -42,7 +42,7 @@ void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags) > > struct page *p = pfn_to_page(PHYS_PFN(phys)); > > unsigned int order = get_order(size); > > > > - free_pages((unsigned long) page_address(p), order); > > Could be just free_pages((unsigned long)phys_to_virt(phys), order), then > the page is not needed at all. Or even __free_pages(phys_to_page(phys), order); > > + __free_pages(p, order); > > } > > } > > > > -- > > 2.51.0 > > > > > > -- > Sincerely yours, > Mike. -- Sincerely yours, Mike. From rppt at kernel.org Thu Sep 4 04:55:21 2025 From: rppt at kernel.org (Mike Rapoport) Date: Thu, 4 Sep 2025 14:55:21 +0300 Subject: [PATCH v3 0/7] Cleanup free_pages() misuse In-Reply-To: <20250903185921.1785167-1-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> Message-ID: On Wed, Sep 03, 2025 at 11:59:14AM -0700, Vishal Moola (Oracle) wrote: > free_pages() is supposed to be called when we only have a virtual address. > __free_pages() is supposed to be called when we have a page. 
> > There are a number of callers that use page_address() to get a page's > virtual address then call free_pages() on it when they should just call > __free_pages() directly. > > Add kernel-docs for free_pages() to help callers better understand which > function they should be calling, and replace the obvious cases of > misuse. > > Vishal Moola (Oracle) (7): > mm/page_alloc: Add kernel-docs for free_pages() > aoe: Stop calling page_address() in free_page() > x86: Stop calling page_address() in free_pages() > riscv: Stop calling page_address() in free_pages() > powerpc: Stop calling page_address() in free_pages() > arm64: Stop calling page_address() in free_pages() > virtio_balloon: Stop calling page_address() in free_pages() Acked-by: Mike Rapoport (Microsoft) > arch/arm64/mm/mmu.c | 2 +- > arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- > arch/riscv/mm/init.c | 4 ++-- > arch/x86/mm/init_64.c | 2 +- > arch/x86/platform/efi/memmap.c | 2 +- > drivers/block/aoe/aoecmd.c | 2 +- > drivers/virtio/virtio_balloon.c | 8 +++----- > mm/page_alloc.c | 9 +++++++++ > 8 files changed, 19 insertions(+), 12 deletions(-) > > -- > 2.51.0 > > -- Sincerely yours, Mike. From alex at ghiti.fr Thu Sep 4 05:27:02 2025 From: alex at ghiti.fr (Alexandre Ghiti) Date: Thu, 4 Sep 2025 14:27:02 +0200 Subject: [PATCH v3 4/7] riscv: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-5-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-5-vishal.moola@gmail.com> Message-ID: <113262ca-214a-4cd4-86f2-c0e3e4bb1a06@ghiti.fr> Hi Vishal, On 9/3/25 20:59, Vishal Moola (Oracle) wrote: > free_pages() should be used when we only have a virtual address. We > should call __free_pages() directly on our page instead. 
> > Signed-off-by: Vishal Moola (Oracle) > --- > arch/riscv/mm/init.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 15683ae13fa5..1056c11d3251 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -1624,7 +1624,7 @@ static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d) > if (PageReserved(page)) > free_reserved_page(page); > else > - free_pages((unsigned long)page_address(page), 0); > + __free_pages(page, 0); > p4d_clear(p4d); > } > > @@ -1646,7 +1646,7 @@ static void __meminit free_vmemmap_storage(struct page *page, size_t size, > return; > } > > - free_pages((unsigned long)page_address(page), order); > + __free_pages(page, order); > } > > static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long addr, unsigned long end, Acked-by: Alexandre Ghiti Thanks, Alex From conor at kernel.org Thu Sep 4 05:27:53 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 4 Sep 2025 13:27:53 +0100 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250903190806.2604757-2-SpriteOvO@gmail.com> References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> Message-ID: <20250904-sterilize-swagger-c7999b124e83@spud> On Wed, Sep 03, 2025 at 09:07:57PM +0200, Asuna Yang wrote: > Commit 33549fcf37ec ("RISC-V: disallow gcc + rust builds") disabled GCC > + Rust builds for RISC-V due to differences in extension handling > compared to LLVM. > > Add a Kconfig non-visible symbol to ensure that all important RISC-V > specific flags that will be used by GCC can be correctly recognized by > Rust bindgen's libclang, otherwise config HAVE_RUST will not be > selected. > > Signed-off-by: Asuna Yang Thanks for working on this. One thing - please don't send new versions of patchsets in response to earlier versions or other threads. 
It doesn't do you any favours with mailbox visibility. > --- > Documentation/rust/arch-support.rst | 2 +- > arch/riscv/Kconfig | 62 ++++++++++++++++++++++++++++- > rust/Makefile | 7 +++- > 3 files changed, 68 insertions(+), 3 deletions(-) > > diff --git a/Documentation/rust/arch-support.rst b/Documentation/rust/arch-support.rst > index 6e6a515d0899..5282e0e174e8 100644 > --- a/Documentation/rust/arch-support.rst > +++ b/Documentation/rust/arch-support.rst > @@ -18,7 +18,7 @@ Architecture Level of support Constraints > ``arm`` Maintained ARMv7 Little Endian only. > ``arm64`` Maintained Little Endian only. > ``loongarch`` Maintained \- > -``riscv`` Maintained ``riscv64`` and LLVM/Clang only. > +``riscv`` Maintained ``riscv64`` only. > ``um`` Maintained \- > ``x86`` Maintained ``x86_64`` only. > ============= ================ ============================================== > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 1c5544401530..d7f421e0f429 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -195,7 +195,7 @@ config RISCV > select HAVE_REGS_AND_STACK_ACCESS_API > select HAVE_RETHOOK if !XIP_KERNEL > select HAVE_RSEQ > - select HAVE_RUST if RUSTC_SUPPORTS_RISCV && CC_IS_CLANG > + select HAVE_RUST if RUSTC_SUPPORTS_RISCV && RUST_BINDGEN_LIBCLANG_RECOGNIZES_FLAGS > select HAVE_SAMPLE_FTRACE_DIRECT > select HAVE_SAMPLE_FTRACE_DIRECT_MULTI > select HAVE_STACKPROTECTOR > @@ -236,6 +236,27 @@ config RUSTC_SUPPORTS_RISCV > # -Zsanitizer=shadow-call-stack flag. 
> depends on !SHADOW_CALL_STACK || RUSTC_VERSION >= 108200 > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_FLAGS > + def_bool y > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_V > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZABHA > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZACAS > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBA > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBB > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBC > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBKB > + depends on RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI Other than Zicsr/Zifencei that may need explicit handling in a dedicated option, the approach here seems kinda backwards. Individually these symbols don't actually mean what they say they do, which is confusing: "recognises" here is true even when it may not be true at all because TOOLCHAIN_HAS_FOO is not set. Why can these options not be removed, and instead the TOOLCHAIN_HAS_FOO options grow a "depends on !RUST || "? > + help > + Rust bindgen currently relies on libclang as backend. When a mixed build is > + performed (building C code with GCC), GCC flags will be passed to libclang. > + However, not all GCC flags are recognized by Clang, so most of the > + incompatible flags have been filtered out in rust/Makefile. > + > + For RISC-V, GCC and Clang are not at the same pace of implementing extensions. > + This config ensures that all important RISC-V specific flags that will be > + used by GCC can be correctly recognized by Rust bindgen's libclang, otherwise > + config HAVE_RUST will not be selected. 
> + > config CLANG_SUPPORTS_DYNAMIC_FTRACE > def_bool CC_IS_CLANG > # https://github.com/ClangBuiltLinux/linux/issues/1817 > @@ -634,6 +655,11 @@ config TOOLCHAIN_HAS_V > depends on LLD_VERSION >= 140000 || LD_VERSION >= 23800 > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_V > + def_bool y > + # https://github.com/llvm/llvm-project/commit/e6de53b4de4aecca4ac892500a0907805896ed27 > + depends on !TOOLCHAIN_HAS_V || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 > + > config RISCV_ISA_V > bool "Vector extension support" > depends on TOOLCHAIN_HAS_V > @@ -698,6 +724,11 @@ config TOOLCHAIN_HAS_ZABHA > depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zabha) > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZABHA > + def_bool y > + # https://github.com/llvm/llvm-project/commit/6b7444964a8d028989beee554a1f5c61d16a1cac > + depends on !TOOLCHAIN_HAS_ZABHA || RUST_BINDGEN_LIBCLANG_VERSION >= 190100 > + > config RISCV_ISA_ZABHA > bool "Zabha extension support for atomic byte/halfword operations" > depends on TOOLCHAIN_HAS_ZABHA > @@ -716,6 +747,11 @@ config TOOLCHAIN_HAS_ZACAS > depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zacas) > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZACAS > + def_bool y > + # https://github.com/llvm/llvm-project/commit/614aeda93b2225c6eb42b00ba189ba7ca2585c60 > + depends on !TOOLCHAIN_HAS_ZACAS || RUST_BINDGEN_LIBCLANG_VERSION >= 200100 > + > config RISCV_ISA_ZACAS > bool "Zacas extension support for atomic CAS" > depends on TOOLCHAIN_HAS_ZACAS > @@ -735,6 +771,11 @@ config TOOLCHAIN_HAS_ZBB > depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBB > + def_bool y > + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 > + depends on !TOOLCHAIN_HAS_ZBB || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 > + > # This symbol indicates that the 
toolchain supports all v1.0 vector crypto > # extensions, including Zvk*, Zvbb, and Zvbc. LLVM added all of these at once. > # binutils added all except Zvkb, then added Zvkb. So we just check for Zvkb. > @@ -750,6 +791,11 @@ config TOOLCHAIN_HAS_ZBA > depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBA > + def_bool y > + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 > + depends on !TOOLCHAIN_HAS_ZBA || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 > + > config RISCV_ISA_ZBA > bool "Zba extension support for bit manipulation instructions" > default y > @@ -785,6 +831,11 @@ config TOOLCHAIN_HAS_ZBC > depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBC > + def_bool y > + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 > + depends on !TOOLCHAIN_HAS_ZBC || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 > + > config RISCV_ISA_ZBC > bool "Zbc extension support for carry-less multiplication instructions" > depends on TOOLCHAIN_HAS_ZBC > @@ -808,6 +859,11 @@ config TOOLCHAIN_HAS_ZBKB > depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 > depends on AS_HAS_OPTION_ARCH > > +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZBKB > + def_bool y > + # https://github.com/llvm/llvm-project/commit/7ee1c162cc53d37f717f9a138276ad64fa6863bc > + depends on !TOOLCHAIN_HAS_ZBKB || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 > + > config RISCV_ISA_ZBKB > bool "Zbkb extension support for bit manipulation instructions" > depends on TOOLCHAIN_HAS_ZBKB > @@ -894,6 +950,10 @@ config TOOLCHAIN_NEEDS_OLD_ISA_SPEC > versions of clang and GCC to be passed to GAS, which has the same result > as passing zicsr and zifencei to -march.
> +config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI > + def_bool y > + depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || (TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI && RUST_BINDGEN_LIBCLANG_VERSION >= 170000) What does the libclang >= 17 requirement actually do here? Is that the version where llvm starts to require that Zicsr/Zifencei is set in order to use them? I think a comment to that effect is required if so. This doesn't actually need to be blocking either, should just be able to filter it out of march when passing to bindgen, no? What about the case where TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI is not set at all? Currently your patch is going to block rust in that case, when actually nothing needs to be done at all - no part of the toolchain requires understanding Zicsr/Zifencei as standalone extensions in this case. The TOOLCHAIN_NEEDS_OLD_ISA_SPEC handling I don't remember 100% how it works, but if bindgen requires them to be set to use the extension this will return true but do nothing to add the extensions to march? That seems wrong to me. I'd be fairly amenable to disabling rust though when used in combination with gcc < 11.3 and gas >=2.36 since it's such a niche condition, rather doing work to support it. That'd be effectively an inversion of your first condition. 
You could probably do something like blocking rust if TOOLCHAIN_NEEDS_OLD_ISA_SPEC and where TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI is set in combination with an older libclang - so like: select HAVE_RUST if FOO && !ZICSR_ZIFENCEI_MISMATCH config ZICSR_ZIFENCEI_MISMATCH def_bool y depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || (TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI && RUST_BINDGEN_LIBCLANG_VERSION < 170000) or alternatively, make a Kconfig option for the later half of that condition along the lines of: config BINDGEN_FILTER_OUT_ZICSR_ZIFENCEI def_bool y depends on TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI && RUST_BINDGEN_LIBCLANG_VERSION < 170000 and use it to filter out _zicsr_zifencei and make the select select HAVE_RUST if FOO && !TOOLCHAIN_NEEDS_OLD_ISA_SPEC FWIW the reason that these odd mixes have dedicated work done for them in Kconfig is that the ?Linaro? CI infrastructure was running clang + binutils builds with a version of LLVM that predated us having full LLVM=1 support and it was done to stop that CI infrastructure falling over constantly. Cheers, Conor. > + > config FPU > bool "FPU support" > default y > diff --git a/rust/Makefile b/rust/Makefile > index 34d0429d50fd..7b1055c98146 100644 > --- a/rust/Makefile > +++ b/rust/Makefile > @@ -277,20 +277,25 @@ bindgen_skip_c_flags := -mno-fp-ret-in-387 -mpreferred-stack-boundary=% \ > -fno-inline-functions-called-once -fsanitize=bounds-strict \ > -fstrict-flex-arrays=% -fmin-function-alignment=% \ > -fzero-init-padding-bits=% -mno-fdpic \ > - --param=% --param asan-% > + --param=% --param asan-% -mno-riscv-attribute > > # Derived from `scripts/Makefile.clang`. 
> BINDGEN_TARGET_x86 := x86_64-linux-gnu > BINDGEN_TARGET_arm64 := aarch64-linux-gnu > BINDGEN_TARGET_arm := arm-linux-gnueabi > BINDGEN_TARGET_loongarch := loongarch64-linux-gnusf > +BINDGEN_TARGET_riscv := riscv64-linux-gnu > BINDGEN_TARGET_um := $(BINDGEN_TARGET_$(SUBARCH)) > BINDGEN_TARGET := $(BINDGEN_TARGET_$(SRCARCH)) > > +ifeq ($(BINDGEN_TARGET),) > +$(error add '--target=' option to rust/Makefile) > +else > # All warnings are inhibited since GCC builds are very experimental, > # many GCC warnings are not supported by Clang, they may only appear in > # some configurations, with new GCC versions, etc. > bindgen_extra_c_flags = -w --target=$(BINDGEN_TARGET) > +endif > > # Auto variable zero-initialization requires an additional special option with > # clang that is going to be removed sometime in the future (likely in > -- > 2.51.0 > From cleger at rivosinc.com Thu Sep 4 06:50:24 2025 From: cleger at rivosinc.com (Clément Léger) Date: Thu, 4 Sep 2025 15:50:24 +0200 Subject: [PATCH 1/2] riscv: Fix sparse warning in __get_user_error() In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> Message-ID: <49bdcd0c-18ef-42ec-a71d-497bc6d6414d@rivosinc.com> On 03/09/2025 20:53, Alexandre Ghiti wrote: > We used to assign 0 to x without an appropriate cast which results in > sparse complaining when x is a pointer: > >>> block/ioctl.c:72:39: sparse: sparse: Using plain integer as NULL pointer > > So fix this by casting 0 to the correct type of x.
> > Reported-by: kernel test robot > Closes: https://lore.kernel.org/oe-kbuild-all/202508062321.gHv4kvuY-lkp at intel.com/ > Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") > Cc: stable at vger.kernel.org > Signed-off-by: Alexandre Ghiti > --- > arch/riscv/include/asm/uaccess.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index 22e3f52a763d1c0350e8185225e4c99aac3fc549..551e7490737effb2c238e6a4db50293ece7c9df9 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -209,7 +209,7 @@ do { \ > err = 0; \ > break; \ > __gu_failed: \ > - x = 0; \ > + x = (__typeof__(x))0; \ > err = -EFAULT; \ > } while (0) > > Hi Alex, I applied that and checked that the sparse warnings were fixed as well, looks good to me. Reviewed-by: Clément Léger Thanks, Clément From conor at kernel.org Thu Sep 4 10:58:08 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 4 Sep 2025 18:58:08 +0100 Subject: [PATCH v3 RESEND 0/3] riscv: dts: starfive: jh7110: More U-Boot downstream changes for JH7110 In-Reply-To: <20250823100159.203925-1-e@freeshell.de> References: <20250823100159.203925-1-e@freeshell.de> Message-ID: <20250904-grape-convent-8c36463138e2@spud> From: Conor Dooley On Sat, 23 Aug 2025 03:01:40 -0700, E Shattow wrote: > Bring in additional downstream U-Boot boot loader changes for StarFive > VisionFive2 board target (and related JH7110 common boards). Create a > basic dt-binding (and not any Linux driver) in support of the > memory-controller dts node used in mainline U-Boot. Also add > bootph-pre-ram hinting to jh7110.dtsi needed at SPL boot phase. > > Changes since v2: > > [...] Applied to riscv-dt-for-next, thanks!
[1/3] dt-bindings: memory-controllers: add StarFive JH7110 SoC DMC https://git.kernel.org/conor/c/f5e36ecc9e4a [2/3] riscv: dts: starfive: jh7110: add DMC memory controller https://git.kernel.org/conor/c/7114969021ec [3/3] riscv: dts: starfive: jh7110: bootph-pre-ram hinting needed by boot loader https://git.kernel.org/conor/c/8181cc2f3f21 Thanks, Conor. From linus.walleij at linaro.org Thu Sep 4 12:35:28 2025 From: linus.walleij at linaro.org (Linus Walleij) Date: Thu, 4 Sep 2025 21:35:28 +0200 Subject: [PATCH] pinctrl: spacemit: fix typo in PRI_TDI pin name In-Reply-To: <20250903100104.360637-1-hendrik.hamerlinck@hammernet.be> References: <20250903100104.360637-1-hendrik.hamerlinck@hammernet.be> Message-ID: On Wed, Sep 3, 2025 at 12:01 PM Hendrik Hamerlinck wrote: > The datasheet lists this signal as PRI_TDI, not PRI_DTI. > Fix the pin name to match the documentation and JTAG naming > convention (TDI = Test Data In). > > Signed-off-by: Hendrik Hamerlinck Patch applied! Yours, Linus Walleij From fustini at kernel.org Thu Sep 4 12:42:48 2025 From: fustini at kernel.org (Drew Fustini) Date: Thu, 4 Sep 2025 12:42:48 -0700 Subject: [PATCH] MAINTAINERS: Add RISC-V T-HEAD SoC patchwork Message-ID: <20250904194247.82655-1-fustini@kernel.org> Add patchwork entry for RISC-V T-HEAD SoC support. Signed-off-by: Drew Fustini --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 6dcfbd11efef..9e0c149682f4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21745,6 +21745,7 @@ M: Guo Ren M: Fu Wei L: linux-riscv at lists.infradead.org S: Maintained +Q: https://patchwork.kernel.org/project/riscv-thead/list/ T: git https://github.com/pdp7/linux.git F: Documentation/devicetree/bindings/clock/thead,th1520-clk-ap.yaml F: Documentation/devicetree/bindings/firmware/thead,th1520-aon.yaml -- 2.34.1 From mst at redhat.com Thu Sep 4 14:38:30 2025 From: mst at redhat.com (Michael S.
Tsirkin) Date: Thu, 4 Sep 2025 17:38:30 -0400 Subject: [PATCH v3 7/7] virtio_balloon: Stop calling page_address() in free_pages() In-Reply-To: <20250903185921.1785167-8-vishal.moola@gmail.com> References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-8-vishal.moola@gmail.com> Message-ID: <20250904173824-mutt-send-email-mst@kernel.org> On Wed, Sep 03, 2025 at 11:59:21AM -0700, Vishal Moola (Oracle) wrote: > free_pages() should be used when we only have a virtual address. We > should call __free_pages() directly on our page instead. > > Signed-off-by: Vishal Moola (Oracle) Acked-by: Michael S. Tsirkin > --- > drivers/virtio/virtio_balloon.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c > index eae65136cdfb..7f3fd72678eb 100644 > --- a/drivers/virtio/virtio_balloon.c > +++ b/drivers/virtio/virtio_balloon.c > @@ -488,8 +488,7 @@ static unsigned long return_free_pages_to_mm(struct virtio_balloon *vb, > page = balloon_page_pop(&vb->free_page_list); > if (!page) > break; > - free_pages((unsigned long)page_address(page), > - VIRTIO_BALLOON_HINT_BLOCK_ORDER); > + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); > } > vb->num_free_page_blocks -= num_returned; > spin_unlock_irq(&vb->free_page_list_lock); > @@ -719,8 +718,7 @@ static int get_free_page_and_send(struct virtio_balloon *vb) > if (vq->num_free > 1) { > err = virtqueue_add_inbuf(vq, &sg, 1, p, GFP_KERNEL); > if (unlikely(err)) { > - free_pages((unsigned long)p, > - VIRTIO_BALLOON_HINT_BLOCK_ORDER); > + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); > return err; > } > virtqueue_kick(vq); > @@ -733,7 +731,7 @@ static int get_free_page_and_send(struct virtio_balloon *vb) > * The vq has no available entry to add this page block, so > * just free it. 
> */ > - free_pages((unsigned long)p, VIRTIO_BALLOON_HINT_BLOCK_ORDER); > + __free_pages(page, VIRTIO_BALLOON_HINT_BLOCK_ORDER); > } > > return 0; > -- > 2.51.0 From ameryhung at gmail.com Thu Sep 4 15:42:39 2025 From: ameryhung at gmail.com (Amery Hung) Date: Thu, 4 Sep 2025 15:42:39 -0700 Subject: [PATCH v2 bpf-next] riscv, bpf: Sign extend struct ops return values properly In-Reply-To: <20250904103806.18937-1-hengqi.chen@gmail.com> References: <20250904103806.18937-1-hengqi.chen@gmail.com> Message-ID: <5829abcf-f1b9-4fb0-8811-b6098fdd8a29@gmail.com> On 9/4/25 3:38 AM, Hengqi Chen wrote: > The ns_bpf_qdisc selftest triggers a kernel panic: > > Unable to handle kernel paging request at virtual address ffffffffa38dbf58 > Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 > [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 > Oops [#1] > Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] > CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE > Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 > epc : __qdisc_run+0x82/0x6f0 > ra : __qdisc_run+0x6e/0x6f0 > epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 > gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 > t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 > s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 > a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : 
ff60000093a6a6f0 > s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 > s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > t5 : 0000000000000000 t6 : ff60000093a6a8b6 > status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d > [] __qdisc_run+0x82/0x6f0 > [] __dev_queue_xmit+0x4c0/0x1128 > [] neigh_resolve_output+0xd0/0x170 > [] ip6_finish_output2+0x226/0x6c8 > [] ip6_finish_output+0x10c/0x2a0 > [] ip6_output+0x5e/0x178 > [] ip6_xmit+0x29a/0x608 > [] inet6_csk_xmit+0xe6/0x140 > [] __tcp_transmit_skb+0x45c/0xaa8 > [] tcp_connect+0x9ce/0xd10 > [] tcp_v6_connect+0x4ac/0x5e8 > [] __inet_stream_connect+0xd8/0x318 > [] inet_stream_connect+0x3e/0x68 > [] __sys_connect_file+0x50/0x88 > [] __sys_connect+0x96/0xc8 > [] __riscv_sys_connect+0x20/0x30 > [] do_trap_ecall_u+0x256/0x378 > [] handle_exception+0x14a/0x156 > Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > ---[ end trace 0000000000000000 ]--- > > The bpf_fifo_dequeue prog returns a skb which is a pointer. > The pointer is treated as a 32bit value and sign extend to > 64bit in epilogue. This behavior is right for most bpf prog > types but wrong for struct ops which requires RISC-V ABI. > > So let's sign extend struct ops return values according to > the function model and RISC-V ABI([0]). 
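The effect described above can be reproduced in userspace with a short sketch. `sext_w` and `demo_truncated_pointer` are hypothetical names, the pointer value is made up (chosen so that the corrupted result resembles the a0 value in the oops), and the narrowing cast relies on the two's-complement wrap that GCC and Clang implement. It only models the arithmetic, not the JIT epilogue itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit sign extension: keep the low word, replicate bit 31 upwards. */
uint64_t sext_w(uint64_t reg)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}

/* Shows why the treatment is fine for a 32-bit int but not for a pointer. */
void demo_truncated_pointer(void)
{
	uint64_t skb = 0xff600000a38dbee8ULL;	/* made-up 64-bit kernel pointer */

	/* The upper 32 bits are replaced by copies of bit 31. */
	assert(sext_w(skb) == 0xffffffffa38dbee8ULL);
	assert(sext_w(skb) != skb);

	/* A 32-bit return value such as -1 survives the same treatment. */
	assert(sext_w(0x00000000ffffffffULL) == UINT64_MAX);
}
```

An 8-byte return value therefore needs a plain register move, which is what the `case 8` branch of the proposed helper does.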
> > [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf > > Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") > Signed-off-by: Hengqi Chen > --- > arch/riscv/net/bpf_jit_comp64.c | 38 ++++++++++++++++++++++++++++++++- > 1 file changed, 37 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > index 549c3063c7f1..c7ae4d0a8361 100644 > --- a/arch/riscv/net/bpf_jit_comp64.c > +++ b/arch/riscv/net/bpf_jit_comp64.c > @@ -954,6 +954,35 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of > return ret; > } > > +/* > + * Sign-extend the register if necessary > + */ > +static int sign_extend(int rd, int rs, u8 size, u8 flags, struct rv_jit_context *ctx) > +{ > + if (!(flags & BTF_FMODEL_SIGNED_ARG) && (size == 1 || size == 2)) > + return 0; > + > + switch (size) { > + case 1: > + emit_sextb(rd, rs, ctx); > + break; > + case 2: > + emit_sexth(rd, rs, ctx); > + break; > + case 4: > + emit_sextw(rd, rs, ctx); > + break; > + case 8: > + emit_mv(rd, rs, ctx); > + break; > + default: > + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); > + return -EINVAL; Will this accidentally rejects struct_ops functions that return void? 
> + } > + > + return 0; > +} > + > static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > const struct btf_func_model *m, > struct bpf_tramp_links *tlinks, > @@ -1175,8 +1204,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); > > if (save_ret) { > - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > + if (is_struct_ops) { > + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], > + m->ret_size, m->ret_flags, ctx); > + if (ret) > + goto out; > + } else { > + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > + } > } > > emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); From spriteovo at gmail.com Thu Sep 4 15:56:35 2025 From: spriteovo at gmail.com (Asuna) Date: Fri, 5 Sep 2025 06:56:35 +0800 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250904-sterilize-swagger-c7999b124e83@spud> References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> <20250904-sterilize-swagger-c7999b124e83@spud> Message-ID: > One thing - please don't send new versions > of patchsets in response to earlier versions or other threads. It > doesn't do you any favours with mailbox visibility. I apologize for this, I'm pretty much new to mailing lists, so I had followed the step "Explicit In-Reply-To headers" [1] in doc. For future patches I'll send them alone instead of replying to existing threads. [1]: https://www.kernel.org/doc/html/v6.9/process/submitting-patches.html#explicit-in-reply-to-headers > Other than Zicsr/Zifencei that may need explicit handling in a dedicated > option, the approach here seems kinda backwards. > Individually these symbols don't actually mean what they say they do, > which is confusing: "recognises" here is true even when it may not be > true at all because TOOLCHAIN_HAS_FOO is not set. 
Why can these options > not be removed, and instead the TOOLCHAIN_HAS_FOO options grow a > "depends on !RUST || "? Yes, it's kinda "backwards", which is intentional, based on the following considerations: 1) As mentioned in rust/Makefile, filtering flags for libclang is a hack, because currently bindgen only has libclang as backend, and ideally bindgen should support GCC so that the passed CC flags are supposed to be fully compatible. On the RISC-V side, I tend to think that version checking for extensions for libclang is also a hack, which could have been accomplished with just the cc-option function, ideally. 2) Rust bindgen only "generates" FFI stuff, it is not involved in the final assembly stage. In other words, it doesn't matter so much what RISC-V extensions to turn on for bindgen (although it does have a little impact, like some macro switches), it's more matter to CC. Therefore, I chose not to modify the original extension config conditions so that if libclang doesn't support the CC flag for an extension, then the Rust build is not supported, rather than treating the extension as not supported. Nonetheless, it occurred to me as I was writing this reply that if GCC implements a new extension in the future that LLVM/Clang doesn't yet have, this could once again lead to a break in GCC+Rust build support if the kernel decides to use the new extension. So it's a trade-off, you guys decide, I'm fine with both. Regarding the name, initially I named it "compatible", and ended up changed it to "recognize" before sending the patch. If we continue on this path, I'm not sure what name is appropriate to use here, do you guys have any ideas? > What does the libclang >= 17 requirement actually do here? Is that the > version where llvm starts to require that Zicsr/Zifencei is set in order > to use them? I think a comment to that effect is required if so. 
This > doesn't actually need to be blocking either, should just be able to > filter it out of march when passing to bindgen, no? libclang >= 17 starts recognizing Zicsr/Zifencei in -march: passing them to -march doesn't generate an error, and passing them or not makes no real difference (it still follows the ISA before version 20190608, in which Zicsr/Zifencei are included in the base ISA). I should have written a comment there to avoid confusion. Reference commit in LLVM/Clang: 22e199e6af ("[RISCV] Accept zicsr and zifencei command line options") https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e15bf8a16 > What about the case where TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI is not > set at all? Currently your patch is going to block rust in that case, > when actually nothing needs to be done at all - no part of the toolchain > requires understanding Zicsr/Zifencei as standalone extensions in this > case. This is a bug, I missed this case. So it should be corrected to: config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI def_bool y depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || !TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI || RUST_BINDGEN_LIBCLANG_VERSION >= 170000 > The TOOLCHAIN_NEEDS_OLD_ISA_SPEC handling I don't remember 100% how it > works, but if bindgen requires them to be set to use the extension > this will return true but do nothing to add the extensions to march? > That seems wrong to me. > I'd be fairly amenable to disabling rust though when used in combination > with gcc < 11.3 and gas >=2.36 since it's such a niche condition, rather > doing work to support it. That'd be effectively an inversion of your > first condition. The current latest version of LLVM/Clang still does not require explicit Zicsr/Zifencei to enable these two extensions; Clang just accepts them in -march and then silently ignores them.
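The corrected `depends on` expression above can be checked case by case with a small C model. This is only a sketch of the boolean logic, not Kconfig itself; `bindgen_libclang_ok`, `demo_bindgen_libclang_ok` and the parameter names are hypothetical, and 160000 is just an example version below the 170000 threshold.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors: OLD_ISA_SPEC || !EXPLICIT_ZICSR_ZIFENCEI || LIBCLANG_VERSION >= 170000 */
bool bindgen_libclang_ok(bool needs_old_isa_spec,
			 bool needs_explicit_zicsr_zifencei,
			 int libclang_version)
{
	return needs_old_isa_spec ||
	       !needs_explicit_zicsr_zifencei ||
	       libclang_version >= 170000;
}

/* Walks through the cases raised in the review. */
void demo_bindgen_libclang_ok(void)
{
	/* Nothing explicit is needed: any libclang is acceptable. */
	assert(bindgen_libclang_ok(false, false, 160000));
	/* _zicsr_zifencei is appended to -march but libclang is too old. */
	assert(!bindgen_libclang_ok(false, true, 160000));
	/* Same toolchain with a libclang that accepts the two extensions. */
	assert(bindgen_libclang_ok(false, true, 170000));
	/* The old ISA spec is forced via -Wa, so -march is left untouched. */
	assert(bindgen_libclang_ok(true, true, 160000));
}
```

Only the second case blocks the Rust build, which matches the case the original patch was meant to catch.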
Checking the usage of CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC: ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC KBUILD_CFLAGS += -Wa,-misa-spec=2.2 KBUILD_AFLAGS += -Wa,-misa-spec=2.2 else riscv-march-$(CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI) := $(riscv-march-y)_zicsr_zifencei endif It just uses -Wa to force an older ISA version on GAS. So the RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI I corrected above should be fine now I guess? Or would you still prefer your idea of blocking Rust if TOOLCHAIN_NEEDS_OLD_ISA_SPEC is true? (To be clear, the breaking changes regarding Zicsr/Zifencei date from ISA version 20190608, and versions 2.0, 2.1, 2.2 are older than 20190608) The only thing I'm confused about is that according to the comment of TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI, GCC-12.1.0 bumped the default ISA to 20191213, but why doesn't the depends-on have condition || (CC_IS_GCC && GCC_VERSION >= 120100)? Thanks for your detailed review. From spriteovo at gmail.com Thu Sep 4 16:07:20 2025 From: spriteovo at gmail.com (Asuna) Date: Fri, 5 Sep 2025 07:07:20 +0800 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> <20250904-sterilize-swagger-c7999b124e83@spud> Message-ID: <1b95b2f0-e916-4a86-a274-da2ff7f9d516@gmail.com> CC rust-for-linux list, I missed it in copying from get_maintainer.pl, the thread is a bit of a mess now :( From spriteovo at gmail.com Thu Sep 4 16:15:15 2025 From: spriteovo at gmail.com (Asuna) Date: Fri, 5 Sep 2025 07:15:15 +0800 Subject: [PATCH 1/2] rust: get the version of libclang used by bindgen in a separate script In-Reply-To: References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> Message-ID: On 9/4/25 7:24 AM, Miguel Ojeda wrote: > Hmm... I am not sure it is a good idea to move that into another > script.
Do we really need to intertwine these two scripts? The rename > isn't great either. > Because of adding a new Kconfig symbol for the Rust bindgen libclang version, then we have three places manually calling bindgen for rust_is_available_bindgen_libclang.h to get the version. I'd like to merge them into one script so that it's easy to maintain in the future. But if you prefer not to, I'd also be willing to revert it. For this approach and naming, I referred to script/cc-version.sh rustc-version.sh and rustc-llvm-version.sh. From spriteovo at gmail.com Thu Sep 4 16:17:04 2025 From: spriteovo at gmail.com (Asuna) Date: Fri, 5 Sep 2025 07:17:04 +0800 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> Message-ID: On 9/4/25 7:27 AM, Miguel Ojeda wrote: > I think the commit message should try to explain each the changes here > (or to split them). > > e.g. it doesn't mention the other config symbols added, nor the extra > flag skipped, nor the `error` call. Yes, the commit message is worth being more detailed, I'll improve it in the v2 patch. From wei.liu at kernel.org Thu Sep 4 16:41:27 2025 From: wei.liu at kernel.org (Wei Liu) Date: Thu, 4 Sep 2025 23:41:27 +0000 Subject: [PATCH v2 0/7] Drivers: hv: Fix NEED_RESCHED_LAZY and use common APIs In-Reply-To: <20250828000156.23389-1-seanjc@google.com> References: <20250828000156.23389-1-seanjc@google.com> Message-ID: On Wed, Aug 27, 2025 at 05:01:49PM -0700, Sean Christopherson wrote: > Fix a bug where MSHV root partitions (and upper-level VTL code) don't honor > NEED_RESCHED_LAZY, and then deduplicate the TIF related MSHV code by turning > the "kvm" entry APIs into more generic "virt" APIs. > > This version is based on > > git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git hyperv-next > > in order to pickup the VTL changes that are queued for 6.18. 
I also > squashed the NEED_RESCHED_LAZY fixes for root and VTL modes into a single > patch, as it should be easy/straightforward to drop the VTL change as needed > if we want this in 6.17 or earlier. > > That effectively means the full series is dependent on the VTL changes being > fully merged for 6.18. But I think that's ok as it's really only the MSHV > changes that have any urgency whatsoever, and I assume that Microsoft is > the only user that truly cares about the MSHV root fix. I.e. if the whole > thing gets delayed, I think it's only the Hyper-V folks that are impacted. > > I have no preference what tree this goes through, or when, and can respin > and/or split as needed. > > As with v1, the Hyper-V stuff and non-x86 architectures are compile-tested > only. > > v2: > - Rebase on hyperv-next. > - Fix and converge the VTL code as well. [Peter, Nuno] > > v1: https://lore.kernel.org/all/20250825200622.3759571-1-seanjc at google.com > I dropped the mshv_vtl changes in this series and applied the rest (including the KVM changes) to hyperv-next. 
Thanks, Wei From hengqi.chen at gmail.com Thu Sep 4 18:24:46 2025 From: hengqi.chen at gmail.com (Hengqi Chen) Date: Fri, 5 Sep 2025 09:24:46 +0800 Subject: [PATCH v2 bpf-next] riscv, bpf: Sign extend struct ops return values properly In-Reply-To: <5829abcf-f1b9-4fb0-8811-b6098fdd8a29@gmail.com> References: <20250904103806.18937-1-hengqi.chen@gmail.com> <5829abcf-f1b9-4fb0-8811-b6098fdd8a29@gmail.com> Message-ID: On Fri, Sep 5, 2025 at 6:42 AM Amery Hung wrote: > > > > On 9/4/25 3:38 AM, Hengqi Chen wrote: > > The ns_bpf_qdisc selftest triggers a kernel panic: > > > > Unable to handle kernel paging request at virtual address ffffffffa38dbf58 > > Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 > > [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 > > Oops [#1] > > Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] > > CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE > > Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > > Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 > > epc : __qdisc_run+0x82/0x6f0 > > ra : __qdisc_run+0x6e/0x6f0 > > epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 > > gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 > > t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 > > s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > > a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 > > a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > > s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > > s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 > > s8 : ff60000093a6a6ac s9 :
ff60000093140000 s10: 0000000000000000 > > s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > > t5 : 0000000000000000 t6 : ff60000093a6a8b6 > > status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d > > [] __qdisc_run+0x82/0x6f0 > > [] __dev_queue_xmit+0x4c0/0x1128 > > [] neigh_resolve_output+0xd0/0x170 > > [] ip6_finish_output2+0x226/0x6c8 > > [] ip6_finish_output+0x10c/0x2a0 > > [] ip6_output+0x5e/0x178 > > [] ip6_xmit+0x29a/0x608 > > [] inet6_csk_xmit+0xe6/0x140 > > [] __tcp_transmit_skb+0x45c/0xaa8 > > [] tcp_connect+0x9ce/0xd10 > > [] tcp_v6_connect+0x4ac/0x5e8 > > [] __inet_stream_connect+0xd8/0x318 > > [] inet_stream_connect+0x3e/0x68 > > [] __sys_connect_file+0x50/0x88 > > [] __sys_connect+0x96/0xc8 > > [] __riscv_sys_connect+0x20/0x30 > > [] do_trap_ecall_u+0x256/0x378 > > [] handle_exception+0x14a/0x156 > > Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > > ---[ end trace 0000000000000000 ]--- > > > > The bpf_fifo_dequeue prog returns a skb which is a pointer. > > The pointer is treated as a 32bit value and sign extend to > > 64bit in epilogue. This behavior is right for most bpf prog > > types but wrong for struct ops which requires RISC-V ABI. > > > > So let's sign extend struct ops return values according to > > the function model and RISC-V ABI([0]). 
> > > > [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf > > > > Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") > > Signed-off-by: Hengqi Chen > > --- > > arch/riscv/net/bpf_jit_comp64.c | 38 ++++++++++++++++++++++++++++++++- > > 1 file changed, 37 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > > index 549c3063c7f1..c7ae4d0a8361 100644 > > --- a/arch/riscv/net/bpf_jit_comp64.c > > +++ b/arch/riscv/net/bpf_jit_comp64.c > > @@ -954,6 +954,35 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of > > return ret; > > } > > > > +/* > > + * Sign-extend the register if necessary > > + */ > > +static int sign_extend(int rd, int rs, u8 size, u8 flags, struct rv_jit_context *ctx) > > +{ > > + if (!(flags & BTF_FMODEL_SIGNED_ARG) && (size == 1 || size == 2)) > > + return 0; > > + > > + switch (size) { > > + case 1: > > + emit_sextb(rd, rs, ctx); > > + break; > > + case 2: > > + emit_sexth(rd, rs, ctx); > > + break; > > + case 4: > > + emit_sextw(rd, rs, ctx); > > + break; > > + case 8: > > + emit_mv(rd, rs, ctx); > > + break; > > + default: > > + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); > > + return -EINVAL; > > Will this accidentally rejects struct_ops functions that return void? 
> No, see https://elixir.bootlin.com/linux/v6.16.4/source/kernel/bpf/bpf_struct_ops.c#L601-L602 > > + } > > + > > + return 0; > > +} > > + > > static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > > const struct btf_func_model *m, > > struct bpf_tramp_links *tlinks, > > @@ -1175,8 +1204,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > > restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); > > > > if (save_ret) { > > - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > > emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > > + if (is_struct_ops) { > > + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], > > + m->ret_size, m->ret_flags, ctx); > > + if (ret) > > + goto out; > > + } else { > > + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > > + } > > } > > > > emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); > From pulehui at huawei.com Thu Sep 4 18:55:49 2025 From: pulehui at huawei.com (Pu Lehui) Date: Fri, 5 Sep 2025 09:55:49 +0800 Subject: [PATCH bpf-next] riscv, bpf: Remove duplicated bpf_flush_icache() In-Reply-To: <20250904105119.21861-1-hengqi.chen@gmail.com> References: <20250904105119.21861-1-hengqi.chen@gmail.com> Message-ID: <95324d30-2e75-47dd-8ef7-0eb1bc80ab90@huawei.com> On 2025/9/4 18:51, Hengqi Chen wrote: > The bpf_flush_icache() is done by bpf_arch_text_copy() already. > Remove the duplicated one in arch_prepare_bpf_trampoline(). > > Signed-off-by: Hengqi Chen > --- > arch/riscv/net/bpf_jit_comp64.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > index c7ae4d0a8361..3fcc011c6be4 100644 > --- a/arch/riscv/net/bpf_jit_comp64.c > +++ b/arch/riscv/net/bpf_jit_comp64.c > @@ -1305,7 +1305,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, > goto out; > } > > - bpf_flush_icache(ro_image, ro_image_end); > out: > kvfree(image); > return ret < 0 ? 
ret : size; Reviewed-by: Pu Lehui From pulehui at huawei.com Thu Sep 4 19:15:52 2025 From: pulehui at huawei.com (Pu Lehui) Date: Fri, 5 Sep 2025 10:15:52 +0800 Subject: [PATCH v2 bpf-next] riscv, bpf: Sign extend struct ops return values properly In-Reply-To: <20250904103806.18937-1-hengqi.chen@gmail.com> References: <20250904103806.18937-1-hengqi.chen@gmail.com> Message-ID: On 2025/9/4 18:38, Hengqi Chen wrote: > The ns_bpf_qdisc selftest triggers a kernel panic: > > Unable to handle kernel paging request at virtual address ffffffffa38dbf58 > Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 > [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 > Oops [#1] > Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] > CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE > Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 > epc : __qdisc_run+0x82/0x6f0 > ra : __qdisc_run+0x6e/0x6f0 > epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 > gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 > t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 > s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 > a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 > s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 > s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > t5 : 0000000000000000 t6 : 
ff60000093a6a8b6 > status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d > [] __qdisc_run+0x82/0x6f0 > [] __dev_queue_xmit+0x4c0/0x1128 > [] neigh_resolve_output+0xd0/0x170 > [] ip6_finish_output2+0x226/0x6c8 > [] ip6_finish_output+0x10c/0x2a0 > [] ip6_output+0x5e/0x178 > [] ip6_xmit+0x29a/0x608 > [] inet6_csk_xmit+0xe6/0x140 > [] __tcp_transmit_skb+0x45c/0xaa8 > [] tcp_connect+0x9ce/0xd10 > [] tcp_v6_connect+0x4ac/0x5e8 > [] __inet_stream_connect+0xd8/0x318 > [] inet_stream_connect+0x3e/0x68 > [] __sys_connect_file+0x50/0x88 > [] __sys_connect+0x96/0xc8 > [] __riscv_sys_connect+0x20/0x30 > [] do_trap_ecall_u+0x256/0x378 > [] handle_exception+0x14a/0x156 > Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > ---[ end trace 0000000000000000 ]--- > > The bpf_fifo_dequeue prog returns a skb which is a pointer. > The pointer is treated as a 32bit value and sign extend to > 64bit in epilogue. This behavior is right for most bpf prog > types but wrong for struct ops which requires RISC-V ABI. > > So let's sign extend struct ops return values according to > the function model and RISC-V ABI([0]). > > [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf > > Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") > Signed-off-by: Hengqi Chen > --- > arch/riscv/net/bpf_jit_comp64.c | 38 ++++++++++++++++++++++++++++++++- > 1 file changed, 37 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > index 549c3063c7f1..c7ae4d0a8361 100644 > --- a/arch/riscv/net/bpf_jit_comp64.c > +++ b/arch/riscv/net/bpf_jit_comp64.c > @@ -954,6 +954,35 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of > return ret; > } > > +/* > + * Sign-extend the register if necessary > + */ This helper may be used later, so let's put it higher. 
> +static int sign_extend(int rd, int rs, u8 size, u8 flags, struct rv_jit_context *ctx) > +{ > + if (!(flags & BTF_FMODEL_SIGNED_ARG) && (size == 1 || size == 2)) emm, this will miss unsigned 1 and 2 byte return values, we should also move them to RV_REG_A0. And also, let we use `sign` but not `flags`, as we may use this helper in other place. That will be: static int sign_extend(u8 rd, u8 rs, u8 sz, bool sign, struct rv_jit_context *ctx) { if (!sign && (sz == 1 || sz == 2)) { if (rd != rs) emit_mv(rd, rs, ctx); return 0; } ... } > + return 0; > + > + switch (size) { > + case 1: > + emit_sextb(rd, rs, ctx); > + break; > + case 2: > + emit_sexth(rd, rs, ctx); > + break; > + case 4: > + emit_sextw(rd, rs, ctx); > + break; > + case 8: let's only move when rd != rs > + emit_mv(rd, rs, ctx); > + break; > + default: > + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); > + return -EINVAL; > + } > + > + return 0; > +} > + > static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > const struct btf_func_model *m, > struct bpf_tramp_links *tlinks, > @@ -1175,8 +1204,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); > > if (save_ret) { > - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > + if (is_struct_ops) { > + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], > + m->ret_size, m->ret_flags, ctx); > + if (ret) > + goto out; > + } else { > + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > + } > } > > emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); From kingxukai at zohomail.com Thu Sep 4 20:10:21 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Fri, 05 Sep 2025 11:10:21 +0800 Subject: [PATCH v8 0/3] riscv: canaan: Add support for K230 clock Message-ID: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> This patch series adds clock controller support for the Canaan Kendryte K230 
SoC. The K230 SoC includes an external 24MHz OSC, 4 internal PLLs and an external pulse input, with the controller managing these sources and their derived clocks. The clock tree and hardware-specific definition can be found in the vendor's DTS [1], and this series is based on the K230 initial series [2]. Link: https://github.com/ruyisdk/linux-xuantie-kernel/blob/linux-6.6.36/arch/riscv/boot/dts/canaan/k230_clock_provider.dtsi [1] Link: https://lore.kernel.org/linux-clk/tencent_F76EB8D731C521C18D5D7C4F8229DAA58E08@qq.com/ [2] Co-developed-by: Troy Mitchell Signed-off-by: Troy Mitchell Signed-off-by: Xukai Wang --- Changes in v8: - Rename dts node name "timer_pulse_in" to "clock-50m" - Drop redundant comment and 'minItems' of hardware in dt-binding. - Link to v7: https://lore.kernel.org/r/20250730-b4-k230-clk-v7-0-c57d3bb593d3@zohomail.com Changes in v7: - Rename K230_PLL_STATUS_MASK to K230_PLL_LOCK_STATUS_MASK - Add clkdev for PLLs to register lookup - Add macros to generate repeat variables definition - Refine the definitions of k230 clocks - Split composite clks into rate, gate, mux, fixed_factor clk - Replace k230_clk_hw_onecell_get with of_clk_hw_onecell_get for clock provider - Drop k230_sysclk and use clk_mux, clk_gate and clk_fixed_factor as the data structures. - Replace one loop registration with individual registration for each type. - Link to v6: https://lore.kernel.org/r/20250415-b4-k230-clk-v6-0-7fd89f427250@zohomail.com Changes in v6: - Remove some redundant comments in struct declaration. - Replace the Vendor's code source link with a new one. - Link to v5: https://lore.kernel.org/r/20250320-b4-k230-clk-v5-0-0e9d089c5488@zohomail.com Changes in v5: - Fix incorrect base-commit and add prerequisite-patch-id. - Replace dummy apb_clk with real ones for UARTs. - Add IDs of UARTs clock and DMA clocks in the binding header. - Replace k230_clk_cfgs[] array with corresponding named variables. - Remove some redundant checks in clk_ops.
- Drop the unnecessary parentheses and type casts. - Modify return value handling in probe path to avoid redundant print. - Link to v4: https://lore.kernel.org/r/20250217-b4-k230-clk-v4-0-5a95a3458691@zohomail.com Changes in v4: - Remove redundant onecell_get callback and add_provider function for pll_divs. - Modify the base-commit in cover letter. - Link to v3: https://lore.kernel.org/r/20250203-b4-k230-clk-v3-0-362c79124572@zohomail.com Changes in v3: - Reorder the definition and declaration in driver code. - Reorder the properties in dts node. - Replace global variable `k230_sysclk` with dynamic memory allocation. - Rename the macro K230_NUM_CLKS to K230_CLK_NUM. - Use dev_err_probe for error handling. - Remove unused includes. - Link to v2: https://lore.kernel.org/r/20250108-b4-k230-clk-v2-0-27b30a2ca52d@zohomail.com Changes in v2: - Add items and description. - Rename k230-clk.h to canaan,k230-clk.h - Link to v1: https://lore.kernel.org/r/20241229-b4-k230-clk-v1-0-221a917e80ed@zohomail.com --- Xukai Wang (3): dt-bindings: clock: Add bindings for Canaan K230 clock controller clk: canaan: Add clock driver for Canaan K230 riscv: dts: canaan: Add clock definition for K230 .../devicetree/bindings/clock/canaan,k230-clk.yaml | 59 + arch/riscv/boot/dts/canaan/k230-canmv.dts | 11 + arch/riscv/boot/dts/canaan/k230-evb.dts | 11 + arch/riscv/boot/dts/canaan/k230.dtsi | 26 +- drivers/clk/Kconfig | 6 + drivers/clk/Makefile | 1 + drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++ include/dt-bindings/clock/canaan,k230-clk.h | 223 ++ 8 files changed, 2785 insertions(+), 8 deletions(-) --- base-commit: 0eea987088a22d73d81e968de7347cdc7e594f72 change-id: 20241206-b4-k230-clk-925f33fed6c2 prerequisite-patch-id: deda3c472f0000ffd40cddd7cf6d3b5e2d7da7dc Best regards, -- Xukai Wang From kingxukai at zohomail.com Thu Sep 4 20:10:22 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Fri, 05 Sep 2025 11:10:22 +0800 Subject: [PATCH v8 1/3] dt-bindings: clock: Add
bindings for Canaan K230 clock controller In-Reply-To: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> Message-ID: <20250905-b4-k230-clk-v8-1-96caa02d5428@zohomail.com> This patch adds the Device Tree binding for the clock controller on the Canaan K230. The binding defines the clocks and the required properties to configure them correctly. Reviewed-by: Krzysztof Kozlowski Signed-off-by: Xukai Wang --- .../devicetree/bindings/clock/canaan,k230-clk.yaml | 59 ++++++ include/dt-bindings/clock/canaan,k230-clk.h | 223 +++++++++++++++++++++ 2 files changed, 282 insertions(+) diff --git a/Documentation/devicetree/bindings/clock/canaan,k230-clk.yaml b/Documentation/devicetree/bindings/clock/canaan,k230-clk.yaml new file mode 100644 index 0000000000000000000000000000000000000000..34c93cb5db400c7db0a7ede2ef79d340354f150c --- /dev/null +++ b/Documentation/devicetree/bindings/clock/canaan,k230-clk.yaml @@ -0,0 +1,59 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/clock/canaan,k230-clk.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Canaan Kendryte K230 Clock + +maintainers: + - Xukai Wang + +description: + The Canaan K230 clock controller generates various clocks for SoC + peripherals. See include/dt-bindings/clock/canaan,k230-clk.h for + valid clock IDs. + +properties: + compatible: + const: canaan,k230-clk + + reg: + items: + - description: PLL control registers + - description: Sysclk control registers + + clocks: + items: + - description: Main external reference clock + - description: + External clock which is used as the pulse input + for the timer to provide timing signals.
+ + clock-names: + items: + - const: osc24m + - const: timer-pulse-in + + '#clock-cells': + const: 1 + +required: + - compatible + - reg + - clocks + - clock-names + - '#clock-cells' + +additionalProperties: false + +examples: + - | + clock-controller@91102000 { + compatible = "canaan,k230-clk"; + reg = <0x91102000 0x40>, + <0x91100000 0x108>; + clocks = <&osc24m>, <&timerx_pulse_in>; + clock-names = "osc24m", "timer-pulse-in"; + #clock-cells = <1>; + }; diff --git a/include/dt-bindings/clock/canaan,k230-clk.h b/include/dt-bindings/clock/canaan,k230-clk.h new file mode 100644 index 0000000000000000000000000000000000000000..9eee9440a4f14583eac845b649e5685d623132e1 --- /dev/null +++ b/include/dt-bindings/clock/canaan,k230-clk.h @@ -0,0 +1,223 @@ +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */ +/* + * Kendryte Canaan K230 Clock Drivers + * + * Author: Xukai Wang + */ + +#ifndef __DT_BINDINGS_CANAAN_K230_CLOCK_H__ +#define __DT_BINDINGS_CANAAN_K230_CLOCK_H__ + +/* Kendryte K230 SoC clock identifiers (arbitrary values) */ +#define K230_CPU0_SRC_GATE 0 +#define K230_CPU0_PLIC_GATE 1 +#define K230_CPU0_NOC_DDRCP4_GATE 2 +#define K230_CPU0_APB_GATE 3 +#define K230_CPU0_SRC_RATE 4 +#define K230_CPU0_AXI_RATE 5 +#define K230_CPU0_PLIC_RATE 6 +#define K230_CPU0_APB_RATE 7 +#define K230_HS_OSPI_SRC_MUX 8 +#define K230_HS_USB_REF_MUX 9 +#define K230_HS_HCLK_HIGH_GATE 10 +#define K230_HS_HCLK_SRC_GATE 11 +#define K230_HS_SD0_AHB_GATE 12 +#define K230_HS_SD1_AHB_GATE 13 +#define K230_HS_SSI1_AHB_GATE 14 +#define K230_HS_SSI2_AHB_GATE 15 +#define K230_HS_USB0_AHB_GATE 16 +#define K230_HS_USB1_AHB_GATE 17 +#define K230_HS_SSI0_AXI_GATE 18 +#define K230_HS_SSI1_GATE 19 +#define K230_HS_SSI2_GATE 20 +#define K230_HS_QSPI_AXI_SRC_GATE 21 +#define K230_HS_SSI1_AXI_GATE 22 +#define K230_HS_SSI2_AXI_GATE 23 +#define K230_HS_SD_CARD_SRC_GATE 24 +#define K230_HS_SD0_CARD_GATE 25 +#define K230_HS_SD1_CARD_GATE 26 +#define K230_HS_SD_AXI_SRC_GATE 27 +#define
K230_HS_SD0_AXI_GATE 28 +#define K230_HS_SD1_AXI_GATE 29 +#define K230_HS_SD0_BASE_GATE 30 +#define K230_HS_SD1_BASE_GATE 31 +#define K230_HS_OSPI_SRC_GATE 32 +#define K230_HS_SD_TIMER_SRC_GATE 33 +#define K230_HS_SD0_TIMER_GATE 34 +#define K230_HS_SD1_TIMER_GATE 35 +#define K230_HS_USB0_REF_GATE 36 +#define K230_HS_USB1_REF_GATE 37 +#define K230_HS_HCLK_HIGH_SRC_RATE 38 +#define K230_HS_HCLK_SRC_RATE 39 +#define K230_HS_SSI0_AXI_RATE 40 +#define K230_HS_SSI1_RATE 41 +#define K230_HS_SSI2_RATE 42 +#define K230_HS_QSPI_AXI_SRC_RATE 43 +#define K230_HS_SD_CARD_SRC_RATE 44 +#define K230_HS_SD_AXI_SRC_RATE 45 +#define K230_HS_USB_REF_50M_RATE 46 +#define K230_HS_SD_TIMER_SRC_RATE 47 +#define K230_TIMER0_MUX 48 +#define K230_TIMER1_MUX 49 +#define K230_TIMER2_MUX 50 +#define K230_TIMER3_MUX 51 +#define K230_TIMER4_MUX 52 +#define K230_TIMER5_MUX 53 +#define K230_SHRM_SRAM_MUX 54 +#define K230_DDRC_SRC_MUX 55 +#define K230_AI_SRC_MUX 56 +#define K230_CAMERA0_MUX 57 +#define K230_CAMERA1_MUX 58 +#define K230_CAMERA2_MUX 59 +#define K230_CPU1_SRC_MUX 60 +#define K230_CPU1_SRC_GATE 61 +#define K230_CPU1_PLIC_GATE 62 +#define K230_CPU1_APB_GATE 63 +#define K230_CPU1_SRC_RATE 64 +#define K230_CPU1_AXI_RATE 65 +#define K230_CPU1_PLIC_RATE 66 +#define K230_CPU1_APB_RATE 67 +#define K230_PMU_APB_GATE 68 +#define K230_LS_APB_SRC_GATE 69 +#define K230_LS_UART0_APB_GATE 70 +#define K230_LS_UART1_APB_GATE 71 +#define K230_LS_UART2_APB_GATE 72 +#define K230_LS_UART3_APB_GATE 73 +#define K230_LS_UART4_APB_GATE 74 +#define K230_LS_I2C0_APB_GATE 75 +#define K230_LS_I2C1_APB_GATE 76 +#define K230_LS_I2C2_APB_GATE 77 +#define K230_LS_I2C3_APB_GATE 78 +#define K230_LS_I2C4_APB_GATE 79 +#define K230_LS_GPIO_APB_GATE 80 +#define K230_LS_PWM_APB_GATE 81 +#define K230_LS_JAMLINK0_APB_GATE 82 +#define K230_LS_JAMLINK1_APB_GATE 83 +#define K230_LS_JAMLINK2_APB_GATE 84 +#define K230_LS_JAMLINK3_APB_GATE 85 +#define K230_LS_AUDIO_APB_GATE 86 +#define K230_LS_ADC_APB_GATE 87 +#define 
K230_LS_CODEC_APB_GATE 88 +#define K230_LS_I2C0_GATE 89 +#define K230_LS_I2C1_GATE 90 +#define K230_LS_I2C2_GATE 91 +#define K230_LS_I2C3_GATE 92 +#define K230_LS_I2C4_GATE 93 +#define K230_LS_CODEC_ADC_GATE 94 +#define K230_LS_CODEC_DAC_GATE 95 +#define K230_LS_AUDIO_DEV_GATE 96 +#define K230_LS_PDM_GATE 97 +#define K230_LS_ADC_GATE 98 +#define K230_LS_UART0_GATE 99 +#define K230_LS_UART1_GATE 100 +#define K230_LS_UART2_GATE 101 +#define K230_LS_UART3_GATE 102 +#define K230_LS_UART4_GATE 103 +#define K230_LS_JAMLINK0CO_GATE 104 +#define K230_LS_JAMLINK1CO_GATE 105 +#define K230_LS_JAMLINK2CO_GATE 106 +#define K230_LS_JAMLINK3CO_GATE 107 +#define K230_LS_GPIO_DEBOUNCE_GATE 108 +#define K230_SYSCTL_WDT0_APB_GATE 109 +#define K230_SYSCTL_WDT1_APB_GATE 110 +#define K230_SYSCTL_TIMER_APB_GATE 111 +#define K230_SYSCTL_IOMUX_APB_GATE 112 +#define K230_SYSCTL_MAILBOX_APB_GATE 113 +#define K230_SYSCTL_HDI_GATE 114 +#define K230_SYSCTL_TIME_STAMP_GATE 115 +#define K230_SYSCTL_WDT0_GATE 116 +#define K230_SYSCTL_WDT1_GATE 117 +#define K230_TIMER0_GATE 118 +#define K230_TIMER1_GATE 119 +#define K230_TIMER2_GATE 120 +#define K230_TIMER3_GATE 121 +#define K230_TIMER4_GATE 122 +#define K230_TIMER5_GATE 123 +#define K230_SHRM_APB_GATE 124 +#define K230_SHRM_AXI_GATE 125 +#define K230_SHRM_AXI_SLAVE_GATE 126 +#define K230_SHRM_NONAI2D_AXI_GATE 127 +#define K230_SHRM_SRAM_GATE 128 +#define K230_SHRM_DECOMPRESS_AXI_GATE 129 +#define K230_SHRM_SDMA_AXI_GATE 130 +#define K230_SHRM_PDMA_AXI_GATE 131 +#define K230_DDRC_SRC_GATE 132 +#define K230_DDRC_BYPASS_GATE 133 +#define K230_DDRC_APB_GATE 134 +#define K230_DISPLAY_AHB_GATE 135 +#define K230_DISPLAY_AXI_GATE 136 +#define K230_DISPLAY_GPU_GATE 137 +#define K230_DISPLAY_DPIP_GATE 138 +#define K230_DISPLAY_CFG_GATE 139 +#define K230_DISPLAY_REF_GATE 140 +#define K230_USB_480M_GATE 141 +#define K230_USB_100M_GATE 142 +#define K230_DPHY_DFT_GATE 143 +#define K230_SPI2AXI_GATE 144 +#define K230_AI_SRC_GATE 145 +#define K230_AI_AXI_GATE 146 
+#define K230_AI_SRC_RATE 147 +#define K230_CAMERA0_GATE 148 +#define K230_CAMERA1_GATE 149 +#define K230_CAMERA2_GATE 150 +#define K230_LS_APB_SRC_RATE 151 +#define K230_LS_I2C0_RATE 152 +#define K230_LS_I2C1_RATE 153 +#define K230_LS_I2C2_RATE 154 +#define K230_LS_I2C3_RATE 155 +#define K230_LS_I2C4_RATE 156 +#define K230_LS_CODEC_ADC_RATE 157 +#define K230_LS_CODEC_DAC_RATE 158 +#define K230_LS_AUDIO_DEV_RATE 159 +#define K230_LS_PDM_RATE 160 +#define K230_LS_ADC_RATE 161 +#define K230_LS_UART0_RATE 162 +#define K230_LS_UART1_RATE 163 +#define K230_LS_UART2_RATE 164 +#define K230_LS_UART3_RATE 165 +#define K230_LS_UART4_RATE 166 +#define K230_LS_JAMLINKCO_SRC_RATE 167 +#define K230_LS_GPIO_DEBOUNCE_RATE 168 +#define K230_SYSCTL_HDI_RATE 169 +#define K230_SYSCTL_TIME_STAMP_RATE 170 +#define K230_SYSCTL_TEMP_SENSOR_RATE 171 +#define K230_SYSCTL_WDT0_RATE 172 +#define K230_SYSCTL_WDT1_RATE 173 +#define K230_TIMER0_SRC_RATE 174 +#define K230_TIMER1_SRC_RATE 175 +#define K230_TIMER2_SRC_RATE 176 +#define K230_TIMER3_SRC_RATE 177 +#define K230_TIMER4_SRC_RATE 178 +#define K230_TIMER5_SRC_RATE 179 +#define K230_SHRM_APB_RATE 180 +#define K230_DDRC_SRC_RATE 181 +#define K230_DDRC_APB_RATE 182 +#define K230_DISPLAY_AHB_RATE 183 +#define K230_DISPLAY_CLKEXT_RATE 184 +#define K230_DISPLAY_GPU_RATE 185 +#define K230_DISPLAY_DPIP_RATE 186 +#define K230_DISPLAY_CFG_RATE 187 +#define K230_VPU_SRC_GATE 188 +#define K230_VPU_AXI_GATE 189 +#define K230_VPU_DDRCP2_GATE 190 +#define K230_VPU_CFG_GATE 191 +#define K230_VPU_SRC_RATE 192 +#define K230_VPU_AXI_SRC_RATE 193 +#define K230_VPU_CFG_RATE 194 +#define K230_SEC_APB_GATE 195 +#define K230_SEC_FIX_GATE 196 +#define K230_SEC_AXI_GATE 197 +#define K230_SEC_APB_RATE 198 +#define K230_SEC_FIX_RATE 199 +#define K230_SEC_AXI_RATE 200 +#define K230_USB_480M_RATE 201 +#define K230_USB_100M_RATE 202 +#define K230_DPHY_DFT_RATE 203 +#define K230_SPI2AXI_RATE 204 +#define K230_CAMERA0_RATE 205 +#define K230_CAMERA1_RATE 206 +#define 
K230_CAMERA2_RATE 207 +#define K230_SHRM_SRAM_DIV2 208 + +#endif /* __DT_BINDINGS_CANAAN_K230_CLOCK_H__ */ + -- 2.34.1 From kingxukai at zohomail.com Thu Sep 4 20:10:23 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Fri, 05 Sep 2025 11:10:23 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> Message-ID: <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> This patch provides basic support for the K230 clock controller, covering all clocks in the K230 SoC. The clock tree of the K230 SoC consists of a 24MHz external crystal oscillator, PLLs and an external pulse input for timerX, and their derived clocks. Co-developed-by: Troy Mitchell Signed-off-by: Troy Mitchell Signed-off-by: Xukai Wang --- drivers/clk/Kconfig | 6 + drivers/clk/Makefile | 1 + drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 2463 insertions(+) diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig index 299bc678ed1b9fcd9110bb8c5937a1bd1ea60e23..b597912607a6cc8eabff459a890a1e7353ef9c1d 100644 --- a/drivers/clk/Kconfig +++ b/drivers/clk/Kconfig @@ -464,6 +464,12 @@ config COMMON_CLK_K210 help Support for the Canaan Kendryte K210 RISC-V SoC clocks. +config COMMON_CLK_K230 + bool "Clock driver for the Canaan Kendryte K230 SoC" + depends on ARCH_CANAAN || COMPILE_TEST + help + Support for the Canaan Kendryte K230 RISC-V SoC clocks.
+ config COMMON_CLK_SP7021 tristate "Clock driver for Sunplus SP7021 SoC" depends on SOC_SP7021 || COMPILE_TEST diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile index fb8878a5d7d93da6bec487460cdf63f1f764a431..5df50b1e14c701ed38397bfb257db26e8dd278b8 100644 --- a/drivers/clk/Makefile +++ b/drivers/clk/Makefile @@ -51,6 +51,7 @@ obj-$(CONFIG_MACH_ASPEED_G6) += clk-ast2600.o obj-$(CONFIG_ARCH_HIGHBANK) += clk-highbank.o obj-$(CONFIG_CLK_HSDK) += clk-hsdk-pll.o obj-$(CONFIG_COMMON_CLK_K210) += clk-k210.o +obj-$(CONFIG_COMMON_CLK_K230) += clk-k230.o obj-$(CONFIG_LMK04832) += clk-lmk04832.o obj-$(CONFIG_COMMON_CLK_LAN966X) += clk-lan966x.o obj-$(CONFIG_COMMON_CLK_LOCHNAGAR) += clk-lochnagar.o diff --git a/drivers/clk/clk-k230.c b/drivers/clk/clk-k230.c new file mode 100644 index 0000000000000000000000000000000000000000..2ba74c008b30ae3400acbd8c08550e8315dfe205 --- /dev/null +++ b/drivers/clk/clk-k230.c @@ -0,0 +1,2456 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Kendryte Canaan K230 Clock Drivers + * + * Author: Xukai Wang + * Author: Troy Mitchell + */ + +#include +#include +#include +#include +#include +#include +#include + +#include + +/* PLL control register bits. 
*/ +#define K230_PLL_BYPASS_ENABLE BIT(19) +#define K230_PLL_GATE_ENABLE BIT(2) +#define K230_PLL_GATE_WRITE_ENABLE BIT(18) +#define K230_PLL_OD_SHIFT 24 +#define K230_PLL_OD_MASK 0xF +#define K230_PLL_R_SHIFT 16 +#define K230_PLL_R_MASK 0x3F +#define K230_PLL_F_SHIFT 0 +#define K230_PLL_F_MASK 0x1FFF +#define K230_PLL_DIV_REG_OFFSET 0x00 +#define K230_PLL_BYPASS_REG_OFFSET 0x04 +#define K230_PLL_GATE_REG_OFFSET 0x08 +#define K230_PLL_LOCK_REG_OFFSET 0x0C + +/* PLL lock register */ +#define K230_PLL_LOCK_STATUS_MASK BIT(0) +#define K230_PLL_LOCK_TIME_DELAY 400 +#define K230_PLL_LOCK_TIMEOUT 0 + +/* K230 CLK registers offset */ +#define K230_CLK_AUDIO_CLKDIV_OFFSET 0x34 +#define K230_CLK_PDM_CLKDIV_OFFSET 0x40 +#define K230_CLK_CODEC_ADC_MCLKDIV_OFFSET 0x38 +#define K230_CLK_CODEC_DAC_MCLKDIV_OFFSET 0x3c + +#define K230_PLLX_DIV_ADDR(base, idx) \ + (K230_PLL_DIV_REG_OFFSET + (base) + (idx) * 0x10) + +#define K230_PLLX_BYPASS_ADDR(base, idx) \ + (K230_PLL_BYPASS_REG_OFFSET + (base) + (idx) * 0x10) + +#define K230_PLLX_GATE_ADDR(base, idx) \ + (K230_PLL_GATE_REG_OFFSET + (base) + (idx) * 0x10) + +#define K230_PLLX_LOCK_ADDR(base, idx) \ + (K230_PLL_LOCK_REG_OFFSET + (base) + (idx) * 0x10) + +#define K230_CLK_RATE_FORMAT_PNAME(_var, _id, \ + _mul_min, _mul_max, _mul_shift, _mul_mask, \ + _div_min, _div_max, _div_shift, _div_mask, \ + _reg, _bit, _method, _reg2, \ + _read_only, _flags, \ + _pname) \ + static struct k230_clk_rate _var = { \ + .div_reg_off = _reg, \ + .mul_reg_off = _reg2, \ + .id = _id, \ + .clk = { \ + .write_enable_bit = _bit, \ + .mul_min = _mul_min, \ + .mul_max = _mul_max, \ + .mul_shift = _mul_shift, \ + .mul_mask = _mul_mask, \ + .div_min = _div_min, \ + .div_max = _div_max, \ + .div_shift = _div_shift, \ + .div_mask = _div_mask, \ + .read_only = _read_only, \ + .hw.init = CLK_HW_INIT_FW_NAME(#_var, \ + _pname, &k230_clk_ops_##_method, \ + _flags), \ + }, \ + } + +#define K230_CLK_RATE_FORMAT(_var, _id, \ + _mul_min, _mul_max, _mul_shift, 
_mul_mask, \ + _div_min, _div_max, _div_shift, _div_mask, \ + _reg, _bit, _method, _reg2, \ + _read_only, _flags, \ + _phw) \ + static struct k230_clk_rate _var = { \ + .div_reg_off = _reg, \ + .mul_reg_off = _reg2, \ + .id = _id, \ + .clk = { \ + .write_enable_bit = _bit, \ + .mul_min = _mul_min, \ + .mul_max = _mul_max, \ + .mul_shift = _mul_shift, \ + .mul_mask = _mul_mask, \ + .div_min = _div_min, \ + .div_max = _div_max, \ + .div_shift = _div_shift, \ + .div_mask = _div_mask, \ + .read_only = _read_only, \ + .hw.init = CLK_HW_INIT_HW(#_var, \ + _phw, &k230_clk_ops_##_method, \ + _flags), \ + }, \ + } + +#define K230_CLK_GATE_FORMAT_PNAME(_var, _id, \ + _reg, _bit, _flags, _gate_flags, \ + _pname) \ + static struct k230_clk_gate _var = { \ + .reg_off = _reg, \ + .id = _id, \ + .clk = { \ + .bit_idx = _bit, \ + .flags = _gate_flags, \ + .hw.init = CLK_HW_INIT_FW_NAME(#_var, \ + _pname, &clk_gate_ops, _flags), \ + }, \ + } + +#define K230_CLK_GATE_FORMAT(_var, _id, \ + _reg, _bit, _flags, _gate_flags, \ + _phw) \ + static struct k230_clk_gate _var = { \ + .reg_off = _reg, \ + .id = _id, \ + .clk = { \ + .bit_idx = _bit, \ + .flags = _gate_flags, \ + .hw.init = CLK_HW_INIT_HW(#_var, \ + _phw, &clk_gate_ops, _flags), \ + }, \ + } + +#define K230_CLK_MUX_FORMAT(_var, _id, \ + _reg, _shift, _mask, _flags, _mux_flags, _pdata) \ + static struct k230_clk_mux _var = { \ + .reg_off = _reg, \ + .id = _id, \ + .clk = { \ + .flags = _mux_flags, \ + .shift = _shift, \ + .mask = _mask, \ + .hw.init = CLK_HW_INIT_PARENTS_DATA(#_var, \ + _pdata, &clk_mux_ops, _flags), \ + }, \ + } + +#define K230_CLK_FIXED_FACTOR_FORMAT(_var, \ + _mul, _div, _flags, \ + _phw) \ + static struct clk_fixed_factor _var = { \ + .mult = _mul, \ + .div = _div, \ + .hw.init = CLK_HW_INIT_HW(#_var, \ + _phw, &clk_fixed_factor_ops, _flags), \ + } + +#define K230_CLK_PLL_FORMAT(_var, _id, _flags, _pname) \ + static struct k230_pll _var = { \ + .hw.init = CLK_HW_INIT_FW_NAME(#_var, \ + _pname, 
&k230_pll_ops, _flags), \ + .id = _id, \ + } + +struct k230_pll { + struct clk_hw hw; + void __iomem *reg; + /* ensures mutual exclusion for concurrent register access. */ + spinlock_t *lock; + int id; +}; + +#define hw_to_k230_pll(_hw) container_of(_hw, struct k230_pll, hw) + +struct k230_clk_rate_self { + struct clk_hw hw; + void __iomem *reg; + bool read_only; + u32 write_enable_bit; + u32 mul_min; + u32 mul_max; + u32 mul_shift; + u32 mul_mask; + u32 div_min; + u32 div_max; + u32 div_shift; + u32 div_mask; + /* ensures mutual exclusion for concurrent register access. */ + spinlock_t *lock; +}; + +#define hw_to_k230_clk_rate_self(_hw) container_of(_hw, \ + struct k230_clk_rate_self, hw) + +struct k230_clk_rate { + u32 mul_reg_off; + u32 div_reg_off; + struct k230_clk_rate_self clk; + int id; +}; + +static inline struct k230_clk_rate *hw_to_k230_clk_rate(struct clk_hw *hw) +{ + return container_of(hw_to_k230_clk_rate_self(hw), struct k230_clk_rate, + clk); +} + +struct k230_clk_gate { + u32 reg_off; + struct clk_gate clk; + int id; +}; + +struct k230_clk_mux { + u32 reg_off; + struct clk_mux clk; + int id; +}; + +static int k230_pll_prepare(struct clk_hw *hw); +static int k230_pll_enable(struct clk_hw *hw); +static void k230_pll_disable(struct clk_hw *hw); +static int k230_pll_is_enabled(struct clk_hw *hw); +static unsigned long k230_pll_get_rate(struct clk_hw *hw, unsigned long parent_rate); + +static const struct clk_ops k230_pll_ops = { + .prepare = k230_pll_prepare, + .enable = k230_pll_enable, + .disable = k230_pll_disable, + .is_enabled = k230_pll_is_enabled, + .recalc_rate = k230_pll_get_rate, +}; + +K230_CLK_PLL_FORMAT(pll0, 0, CLK_IS_CRITICAL, 0); +K230_CLK_PLL_FORMAT(pll1, 1, CLK_IS_CRITICAL, 0); +K230_CLK_PLL_FORMAT(pll2, 2, CLK_IS_CRITICAL, 0); +K230_CLK_PLL_FORMAT(pll3, 3, CLK_IS_CRITICAL, 0); + +struct k230_pll *k230_plls[] = { + &pll0, + &pll1, + &pll2, + &pll3, +}; + +K230_CLK_FIXED_FACTOR_FORMAT(pll0_div2, 1, 2, 0, &pll0.hw); 
+K230_CLK_FIXED_FACTOR_FORMAT(pll0_div3, 1, 3, 0, &pll0.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll0_div4, 1, 4, 0, &pll0.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll0_div16, 1, 16, 0, &pll0.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll1_div2, 1, 2, 0, &pll1.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll1_div3, 1, 3, 0, &pll1.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll1_div4, 1, 4, 0, &pll1.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll2_div2, 1, 2, 0, &pll2.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll2_div3, 1, 3, 0, &pll2.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll2_div4, 1, 4, 0, &pll2.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll3_div2, 1, 2, 0, &pll3.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll3_div3, 1, 3, 0, &pll3.hw); +K230_CLK_FIXED_FACTOR_FORMAT(pll3_div4, 1, 4, 0, &pll3.hw); + +struct clk_fixed_factor *k230_pll_divs[] = { + &pll0_div2, + &pll0_div3, + &pll0_div4, + &pll0_div16, + &pll1_div2, + &pll1_div3, + &pll1_div4, + &pll2_div2, + &pll2_div3, + &pll2_div4, + &pll3_div2, + &pll3_div3, + &pll3_div4, +}; + +static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate); +static long k230_clk_round_rate_mul(struct clk_hw *hw, unsigned long rate, + unsigned long *parent_rate); +static unsigned long k230_clk_get_rate_mul(struct clk_hw *hw, + unsigned long parent_rate); +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate); +static long k230_clk_round_rate_div(struct clk_hw *hw, unsigned long rate, + unsigned long *parent_rate); +static unsigned long k230_clk_get_rate_div(struct clk_hw *hw, + unsigned long parent_rate); +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate); +static long k230_clk_round_rate_mul_div(struct clk_hw *hw, unsigned long rate, + unsigned long *parent_rate); +static unsigned long k230_clk_get_rate_mul_div(struct clk_hw *hw, + unsigned long parent_rate); + +/* clk_ops for clocks whose rate is determined by a configurable multiplier */ +static const struct 
clk_ops k230_clk_ops_mul = { + .set_rate = k230_clk_set_rate_mul, + .round_rate = k230_clk_round_rate_mul, + .recalc_rate = k230_clk_get_rate_mul, +}; + +/* clk_ops for clocks whose rate is determined by a configurable divider */ +static const struct clk_ops k230_clk_ops_div = { + .set_rate = k230_clk_set_rate_div, + .round_rate = k230_clk_round_rate_div, + .recalc_rate = k230_clk_get_rate_div, +}; + +/* clk_ops for clocks whose rate is determined by both a multiplier and a divider */ +static const struct clk_ops k230_clk_ops_mul_div = { + .set_rate = k230_clk_set_rate_mul_div, + .round_rate = k230_clk_round_rate_mul_div, + .recalc_rate = k230_clk_get_rate_mul_div, +}; + +K230_CLK_GATE_FORMAT(cpu0_src_gate, + K230_CPU0_SRC_GATE, + 0, 0, 0, 0, + &pll0_div2.hw); + +K230_CLK_RATE_FORMAT(cpu0_src_rate, + K230_CPU0_SRC_RATE, + 1, 16, 1, 0xF, + 16, 16, 0, 0x0, + 0x0, 31, mul, 0x0, + false, 0, + &cpu0_src_gate.clk.hw); + +K230_CLK_RATE_FORMAT(cpu0_axi_rate, + K230_CPU0_AXI_RATE, + 1, 1, 0, 0, + 1, 8, 6, 0x7, + 0x0, 31, div, 0x0, + 0, 0, + &cpu0_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(cpu0_plic_gate, + K230_CPU0_PLIC_GATE, + 0x0, 9, 0, 0, + &cpu0_src_rate.clk.hw); + +K230_CLK_RATE_FORMAT(cpu0_plic_rate, + K230_CPU0_PLIC_RATE, + 1, 1, 0, 0, + 1, 8, 10, 0x7, + 0x0, 31, div, 0x0, + false, 0, + &cpu0_plic_gate.clk.hw); + +K230_CLK_GATE_FORMAT(cpu0_noc_ddrcp4_gate, + K230_CPU0_NOC_DDRCP4_GATE, + 0x60, 7, 0, 0, + &cpu0_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(cpu0_apb_gate, + K230_CPU0_APB_GATE, + 0x0, 13, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(cpu0_apb_rate, + K230_CPU0_APB_RATE, + 1, 1, 0, 0, + 1, 8, 15, 0x7, + 0x0, 31, div, 0x0, + false, 0, + &cpu0_apb_gate.clk.hw); + +static const struct clk_parent_data k230_cpu1_src_mux_pdata[] = { + { .hw = &pll0_div2.hw, }, + { .hw = &pll3.hw, }, + { .hw = &pll0.hw, }, +}; + +K230_CLK_MUX_FORMAT(cpu1_src_mux, + K230_CPU1_SRC_MUX, + 0x4, 1, 0x3, + 0, 0, + k230_cpu1_src_mux_pdata); + +K230_CLK_GATE_FORMAT(cpu1_src_gate, + 
K230_CPU1_SRC_GATE, + 0x4, 0, CLK_IGNORE_UNUSED, 0, + &cpu1_src_mux.clk.hw); + +K230_CLK_RATE_FORMAT(cpu1_src_rate, + K230_CPU1_SRC_RATE, + 1, 1, 0, 0, + 1, 8, 3, 0x7, + 0x4, 31, div, 0x0, + false, 0, + &cpu1_src_gate.clk.hw); + +K230_CLK_RATE_FORMAT(cpu1_axi_rate, + K230_CPU1_AXI_RATE, + 1, 1, 0, 0, + 1, 8, 12, 0x7, + 0x4, 31, div, 0x0, + false, 0, + &cpu1_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(cpu1_plic_gate, + K230_CPU1_PLIC_GATE, + 0x4, 15, CLK_IGNORE_UNUSED, 0, + &cpu1_src_rate.clk.hw); + +K230_CLK_RATE_FORMAT(cpu1_plic_rate, + K230_CPU1_PLIC_RATE, + 1, 1, 0, 0, + 1, 8, 16, 0x7, + 0x4, 31, div, 0x0, + false, 0, + &cpu1_plic_gate.clk.hw); + +K230_CLK_GATE_FORMAT(cpu1_apb_gate, + K230_CPU1_APB_GATE, + 0x4, 19, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(cpu1_apb_rate, + K230_CPU1_APB_RATE, + 1, 1, 0, 0, + 1, 8, 15, 0x7, + 0x0, 31, div, 0x0, + false, 0, + &cpu1_apb_gate.clk.hw); + +K230_CLK_GATE_FORMAT_PNAME(pmu_apb_gate, + K230_PMU_APB_GATE, + 0x10, 0, 0, 0, + "osc24m"); + +K230_CLK_RATE_FORMAT(hs_hclk_high_src_rate, + K230_HS_HCLK_HIGH_SRC_RATE, + 1, 1, 0, 0, + 1, 8, 0, 0x7, + 0x1C, 31, div, 0x0, + false, 0, + &pll0_div4.hw); + +K230_CLK_GATE_FORMAT(hs_hclk_high_gate, + K230_HS_HCLK_HIGH_GATE, + 0x18, 1, 0, 0, + &hs_hclk_high_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_hclk_src_gate, + K230_HS_HCLK_SRC_GATE, + 0x18, 1, 0, 0, + &hs_hclk_high_src_rate.clk.hw); + +K230_CLK_RATE_FORMAT(hs_hclk_src_rate, + K230_HS_HCLK_SRC_RATE, + 1, 1, 0, 0, + 1, 8, 3, 0x7, + 0x1C, 31, div, 0x0, + false, 0, + &hs_hclk_src_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd0_ahb_gate, + K230_HS_SD0_AHB_GATE, + 0x18, 2, 0, 0, + &hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd1_ahb_gate, + K230_HS_SD1_AHB_GATE, + 0x18, 3, 0, 0, + &hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi1_ahb_gate, + K230_HS_SSI1_AHB_GATE, + 0x18, 7, 0, 0, + &hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi2_ahb_gate, + K230_HS_SSI2_AHB_GATE, + 0x18, 8, 0, 0, +
&hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_usb0_ahb_gate, + K230_HS_USB0_AHB_GATE, + 0x18, 4, 0, 0, + &hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_usb1_ahb_gate, + K230_HS_USB1_AHB_GATE, + 0x18, 5, 0, 0, + &hs_hclk_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi0_axi_gate, + K230_HS_SSI0_AXI_GATE, + 0x18, 27, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(hs_ssi0_axi_rate, + K230_HS_SSI0_AXI_RATE, + 1, 1, 0, 0, + 1, 8, 9, 0x7, + 0x20, 31, div, 0x0, + false, 0, + &hs_ssi0_axi_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi1_gate, + K230_HS_SSI1_GATE, + 0x18, 25, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(hs_ssi1_rate, + K230_HS_SSI1_RATE, + 1, 1, 0, 0, + 1, 8, 3, 0x7, + 0x20, 31, div, 0x0, + false, 0, + &hs_ssi1_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi2_gate, + K230_HS_SSI2_GATE, + 0x18, 26, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(hs_ssi2_rate, + K230_HS_SSI2_RATE, + 1, 1, 0, 0, + 1, 8, 6, 0x7, + 0x20, 31, div, 0x0, + false, 0, + &hs_ssi2_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_qspi_axi_src_gate, + K230_HS_QSPI_AXI_SRC_GATE, + 0x18, 28, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(hs_qspi_axi_src_rate, + K230_HS_QSPI_AXI_SRC_RATE, + 1, 1, 0, 0, + 1, 8, 12, 0x7, + 0x20, 31, div, 0x0, + false, 0, + &hs_qspi_axi_src_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi1_axi_gate, + K230_HS_SSI1_AXI_GATE, + 0x18, 29, 0, 0, + &hs_qspi_axi_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_ssi2_axi_gate, + K230_HS_SSI2_AXI_GATE, + 0x18, 30, 0, 0, + &hs_qspi_axi_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd_card_src_gate, + K230_HS_SD_CARD_SRC_GATE, + 0x18, 11, 0, 0, + &pll0_div4.hw); + +K230_CLK_RATE_FORMAT(hs_sd_card_src_rate, + K230_HS_SD_CARD_SRC_RATE, + 1, 1, 0, 0, + 2, 8, 12, 0x7, + 0x1C, 31, div, 0x0, + false, 0, + &pll0_div4.hw); + +K230_CLK_GATE_FORMAT(hs_sd0_card_gate, + K230_HS_SD0_CARD_GATE, + 0x18, 15, 0, 0, + &hs_sd_card_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd1_card_gate, + K230_HS_SD1_CARD_GATE, + 0x18, 19, 0, 0, + 
&hs_sd_card_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd_axi_src_gate, + K230_HS_SD_AXI_SRC_GATE, + 0x18, 9, 0, 0, + &pll2_div4.hw); + +K230_CLK_RATE_FORMAT(hs_sd_axi_src_rate, + K230_HS_SD_AXI_SRC_RATE, + 1, 1, 0, 0, + 1, 8, 6, 0x7, + 0x1C, 31, div, 0x0, + false, 0, + &hs_sd_axi_src_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd0_axi_gate, + K230_HS_SD0_AXI_GATE, + 0x18, 13, 0, 0, + &hs_sd_axi_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd1_axi_gate, + K230_HS_SD1_AXI_GATE, + 0x18, 17, 0, 0, + &hs_sd_axi_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd0_base_gate, + K230_HS_SD0_BASE_GATE, + 0x18, 14, 0, 0, + &hs_sd_axi_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd1_base_gate, + K230_HS_SD1_BASE_GATE, + 0x18, 18, 0, 0, + &hs_sd_axi_src_rate.clk.hw); + +static const struct clk_parent_data k230_hs_ospi_src_mux_pdata[] = { + { .hw = &pll0_div2.hw, }, + { .hw = &pll2_div4.hw, }, +}; + +K230_CLK_MUX_FORMAT(hs_ospi_src_mux, + K230_HS_OSPI_SRC_MUX, + 0x20, 18, 0x1, + 0, 0, + k230_hs_ospi_src_mux_pdata); + +K230_CLK_GATE_FORMAT(hs_ospi_src_gate, + K230_HS_OSPI_SRC_GATE, + 0x18, 24, CLK_IGNORE_UNUSED, 0, + &hs_ospi_src_mux.clk.hw); + +K230_CLK_RATE_FORMAT(hs_usb_ref_50m_rate, + K230_HS_USB_REF_50M_RATE, + 1, 1, 0, 0, + 1, 8, 15, 0x7, + 0x20, 31, div, 0x0, + false, 0, + &pll0_div16.hw); + +K230_CLK_GATE_FORMAT_PNAME(hs_sd_timer_src_gate, + K230_HS_SD_TIMER_SRC_GATE, + 0x18, 12, 0, 0, + "osc24m"); + +K230_CLK_RATE_FORMAT(hs_sd_timer_src_rate, + K230_HS_SD_TIMER_SRC_RATE, + 1, 1, 0, 0, + 24, 32, 15, 0x1F, + 0x1C, 31, div, 0x0, + false, 0, + &hs_sd_timer_src_gate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd0_timer_gate, + K230_HS_SD0_TIMER_GATE, + 0x18, 16, 0, 0, + &hs_sd_timer_src_rate.clk.hw); + +K230_CLK_GATE_FORMAT(hs_sd1_timer_gate, + K230_HS_SD1_TIMER_GATE, + 0x18, 20, 0, 0, + &hs_sd_timer_src_rate.clk.hw); + +static const struct clk_parent_data k230_hs_usb_ref_mux_pdata[] = { + { .fw_name = "osc24m", }, + { .hw = &hs_usb_ref_50m_rate.clk.hw, }, +}; + 
+K230_CLK_MUX_FORMAT(hs_usb_ref_mux,
+		    K230_HS_USB_REF_MUX,
+		    0x18, 23, 0x1,
+		    0, 0,
+		    k230_hs_usb_ref_mux_pdata);
+
+K230_CLK_GATE_FORMAT(hs_usb0_ref_gate,
+		     K230_HS_USB0_REF_GATE,
+		     0x18, 21, CLK_IGNORE_UNUSED, 0,
+		     &hs_usb_ref_mux.clk.hw);
+
+K230_CLK_GATE_FORMAT(hs_usb1_ref_gate,
+		     K230_HS_USB1_REF_GATE,
+		     0x18, 22, CLK_IGNORE_UNUSED, 0,
+		     &hs_usb_ref_mux.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_apb_src_gate,
+		     K230_LS_APB_SRC_GATE,
+		     0x24, 0, CLK_IS_CRITICAL, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_apb_src_rate,
+		     K230_LS_APB_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 0, 0x7,
+		     0x30, 31, div, 0x0,
+		     false, 0,
+		     &ls_apb_src_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart0_apb_gate,
+		     K230_LS_UART0_APB_GATE,
+		     0x24, 1, CLK_IS_CRITICAL, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart1_apb_gate,
+		     K230_LS_UART1_APB_GATE,
+		     0x24, 2, CLK_IS_CRITICAL, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart2_apb_gate,
+		     K230_LS_UART2_APB_GATE,
+		     0x24, 3, CLK_IS_CRITICAL, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart3_apb_gate,
+		     K230_LS_UART3_APB_GATE,
+		     0x24, 4, CLK_IS_CRITICAL, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart4_apb_gate,
+		     K230_LS_UART4_APB_GATE,
+		     0x24, 5, CLK_IS_CRITICAL, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c0_apb_gate,
+		     K230_LS_I2C0_APB_GATE,
+		     0x24, 6, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c1_apb_gate,
+		     K230_LS_I2C1_APB_GATE,
+		     0x24, 7, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c2_apb_gate,
+		     K230_LS_I2C2_APB_GATE,
+		     0x24, 8, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c3_apb_gate,
+		     K230_LS_I2C3_APB_GATE,
+		     0x24, 9, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c4_apb_gate,
+		     K230_LS_I2C4_APB_GATE,
+		     0x24, 10, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_gpio_apb_gate,
+		     K230_LS_GPIO_APB_GATE,
+		     0x24, 11, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_pwm_apb_gate,
+		     K230_LS_PWM_APB_GATE,
+		     0x24, 12, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink0_apb_gate,
+		     K230_LS_JAMLINK0_APB_GATE,
+		     0x28, 4, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink1_apb_gate,
+		     K230_LS_JAMLINK1_APB_GATE,
+		     0x28, 5, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink2_apb_gate,
+		     K230_LS_JAMLINK2_APB_GATE,
+		     0x28, 6, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink3_apb_gate,
+		     K230_LS_JAMLINK3_APB_GATE,
+		     0x28, 7, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_audio_apb_gate,
+		     K230_LS_AUDIO_APB_GATE,
+		     0x24, 13, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_adc_apb_gate,
+		     K230_LS_ADC_APB_GATE,
+		     0x24, 15, 0, 0,
+		     &ls_apb_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_codec_apb_gate,
+		     K230_LS_CODEC_APB_GATE,
+		     0x24, 14, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c0_gate,
+		     K230_LS_I2C0_GATE,
+		     0x24, 21, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_i2c0_rate,
+		     K230_LS_I2C0_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 15, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_i2c0_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c1_gate,
+		     K230_LS_I2C1_GATE,
+		     0x24, 22, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_i2c1_rate,
+		     K230_LS_I2C1_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 18, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_i2c1_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c2_gate,
+		     K230_LS_I2C2_GATE,
+		     0x24, 23, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_i2c2_rate,
+		     K230_LS_I2C2_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 21, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_i2c2_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c3_gate,
+		     K230_LS_I2C3_GATE,
+		     0x24, 24, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_i2c3_rate,
+		     K230_LS_I2C3_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 24, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_i2c3_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_i2c4_gate,
+		     K230_LS_I2C4_GATE,
+		     0x24, 25, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_i2c4_rate,
+		     K230_LS_I2C4_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 27, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_i2c4_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_codec_adc_gate,
+		     K230_LS_CODEC_ADC_GATE,
+		     0x24, 29, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_codec_adc_rate,
+		     K230_LS_CODEC_ADC_RATE,
+		     0x10, 0x1B9, 14, 0x1FFF,
+		     0xC35, 0x3D09, 0, 0x3FFF,
+		     0x38, 31, mul_div, 0x38,
+		     false, 0,
+		     &ls_codec_adc_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_codec_dac_gate,
+		     K230_LS_CODEC_DAC_GATE,
+		     0x24, 30, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_codec_dac_rate,
+		     K230_LS_CODEC_DAC_RATE,
+		     0x10, 0x1B9, 14, 0x1FFF,
+		     0xC35, 0x3D09, 0, 0x3FFF,
+		     0x3C, 31, mul_div, 0x3C,
+		     false, 0,
+		     &ls_codec_dac_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_audio_dev_gate,
+		     K230_LS_AUDIO_DEV_GATE,
+		     0x24, 28, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_audio_dev_rate,
+		     K230_LS_AUDIO_DEV_RATE,
+		     0x4, 0x1B9, 16, 0x7FFF,
+		     0xC35, 0xF424, 0, 0xFFFF,
+		     0x34, 31, mul_div, 0x34,
+		     false, 0,
+		     &ls_audio_dev_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_pdm_gate,
+		     K230_LS_PDM_GATE,
+		     0x24, 31, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_pdm_rate,
+		     K230_LS_PDM_RATE,
+		     0x2, 0x1B9, 0, 0xFFFF,
+		     0xC35, 0x1E848, 0, 0x1FFFF,
+		     0x40, 0, mul_div, 0x44,
+		     false, 0,
+		     &ls_pdm_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_adc_gate,
+		     K230_LS_ADC_GATE,
+		     0x24, 26, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ls_adc_rate,
+		     K230_LS_ADC_RATE,
+		     1, 1, 0, 0,
+		     1, 1024, 3, 0x3FF,
+		     0x30, 31, div, 0x0,
+		     false, 0,
+		     &ls_adc_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart0_gate,
+		     K230_LS_UART0_GATE,
+		     0x24, 16, CLK_IS_CRITICAL, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(ls_uart0_rate,
+		     K230_LS_UART0_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 0, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_uart0_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart1_gate,
+		     K230_LS_UART1_GATE,
+		     0x24, 17, CLK_IS_CRITICAL, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(ls_uart1_rate,
+		     K230_LS_UART1_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 3, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_uart1_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart2_gate,
+		     K230_LS_UART2_GATE,
+		     0x24, 18, CLK_IS_CRITICAL, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(ls_uart2_rate,
+		     K230_LS_UART2_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 6, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_uart2_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart3_gate,
+		     K230_LS_UART3_GATE,
+		     0x24, 19, CLK_IS_CRITICAL, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(ls_uart3_rate,
+		     K230_LS_UART3_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 9, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_uart3_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_uart4_gate,
+		     K230_LS_UART4_GATE,
+		     0x24, 20, CLK_IS_CRITICAL, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(ls_uart4_rate,
+		     K230_LS_UART4_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 12, 0x7,
+		     0x2C, 31, div, 0x0,
+		     false, 0,
+		     &ls_uart4_gate.clk.hw);
+
+K230_CLK_RATE_FORMAT(ls_jamlinkco_src_rate,
+		     K230_LS_JAMLINKCO_SRC_RATE,
+		     1, 1, 0, 0,
+		     2, 512, 23, 0xFF,
+		     0x30, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink0co_gate,
+		     K230_LS_JAMLINK0CO_GATE,
+		     0x28, 0, 0, 0,
+		     &ls_jamlinkco_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink1co_gate,
+		     K230_LS_JAMLINK1CO_GATE,
+		     0x28, 1, 0, 0,
+		     &ls_jamlinkco_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink2co_gate,
+		     K230_LS_JAMLINK2CO_GATE,
+		     0x28, 2, 0, 0,
+		     &ls_jamlinkco_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ls_jamlink3co_gate,
+		     K230_LS_JAMLINK3CO_GATE,
+		     0x28, 3, 0, 0,
+		     &ls_jamlinkco_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT_PNAME(ls_gpio_debounce_gate,
+			   K230_LS_GPIO_DEBOUNCE_GATE,
+			   0x24, 27, 0, 0,
+			   "osc24m");
+
+K230_CLK_RATE_FORMAT(ls_gpio_debounce_rate,
+		     K230_LS_GPIO_DEBOUNCE_RATE,
+		     1, 1, 0, 0,
+		     1, 1024, 13, 0x3FF,
+		     0x30, 31, div, 0x0,
+		     false, 0,
+		     &ls_gpio_debounce_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_wdt0_apb_gate,
+		     K230_SYSCTL_WDT0_APB_GATE,
+		     0x50, 1, 0, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_wdt1_apb_gate,
+		     K230_SYSCTL_WDT1_APB_GATE,
+		     0x50, 2, 0, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_timer_apb_gate,
+		     K230_SYSCTL_TIMER_APB_GATE,
+		     0x50, 3, 0, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_iomux_apb_gate,
+		     K230_SYSCTL_IOMUX_APB_GATE,
+		     0x50, 20, 0, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_mailbox_apb_gate,
+		     K230_SYSCTL_MAILBOX_APB_GATE,
+		     0x50, 4, 0, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_hdi_gate,
+		     K230_SYSCTL_HDI_GATE,
+		     0x50, 21, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(sysctl_hdi_rate,
+		     K230_SYSCTL_HDI_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 28, 0x7,
+		     0x58, 31, div, 0x0,
+		     false, 0,
+		     &sysctl_hdi_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(sysctl_time_stamp_gate,
+		     K230_SYSCTL_TIME_STAMP_GATE,
+		     0x50, 19, CLK_IS_CRITICAL, 0,
+		     &pll1_div4.hw);
+
+K230_CLK_RATE_FORMAT(sysctl_time_stamp_rate,
+		     K230_SYSCTL_TIME_STAMP_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 15, 0x1F,
+		     0x58, 31, div, 0x0,
+		     false, 0,
+		     &sysctl_time_stamp_gate.clk.hw);
+
+K230_CLK_RATE_FORMAT_PNAME(sysctl_temp_sensor_rate,
+			   K230_SYSCTL_TEMP_SENSOR_RATE,
+			   1, 1, 0, 0,
+			   1, 256, 20, 0xFF,
+			   0x58, 31, div, 0x0,
+			   false, 0,
+			   "osc24m");
+
+K230_CLK_GATE_FORMAT_PNAME(sysctl_wdt0_gate,
+			   K230_SYSCTL_WDT0_GATE,
+			   0x50, 5, 0, 0,
+			   "osc24m");
+
+K230_CLK_RATE_FORMAT(sysctl_wdt0_rate,
+		     K230_SYSCTL_WDT0_RATE,
+		     1, 1, 0, 0,
+		     1, 64, 3, 0x3F,
+		     0x58, 31, div, 0x0,
+		     false, 0,
+		     &sysctl_wdt0_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT_PNAME(sysctl_wdt1_gate,
+			   K230_SYSCTL_WDT1_GATE,
+			   0x50, 6, 0, 0,
+			   "osc24m");
+
+K230_CLK_RATE_FORMAT(sysctl_wdt1_rate,
+		     K230_SYSCTL_WDT1_RATE,
+		     1, 1, 0, 0,
+		     1, 64, 9, 0x3F,
+		     0x58, 31, div, 0x0,
+		     false, 0,
+		     &sysctl_wdt1_gate.clk.hw);
+
+K230_CLK_RATE_FORMAT(timer0_src_rate,
+		     K230_TIMER0_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 0, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(timer1_src_rate,
+		     K230_TIMER1_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 3, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(timer2_src_rate,
+		     K230_TIMER2_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 6, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(timer3_src_rate,
+		     K230_TIMER3_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 9, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(timer4_src_rate,
+		     K230_TIMER4_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 12, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+K230_CLK_RATE_FORMAT(timer5_src_rate,
+		     K230_TIMER5_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 15, 0x7,
+		     0x54, 31, div, 0x0,
+		     false, 0,
+		     &pll0_div16.hw);
+
+static const struct clk_parent_data k230_timer0_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer0_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer0_mux,
+		    K230_TIMER0_MUX,
+		    0x50, 7, 0x1,
+		    0, 0,
+		    k230_timer0_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer0_gate,
+		     K230_TIMER0_GATE,
+		     0x50, 13, CLK_IGNORE_UNUSED, 0,
+		     &timer0_mux.clk.hw);
+
+static const struct clk_parent_data k230_timer1_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer1_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer1_mux,
+		    K230_TIMER1_MUX,
+		    0x50, 8, 0x1,
+		    0, 0,
+		    k230_timer1_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer1_gate,
+		     K230_TIMER1_GATE,
+		     0x50, 14, CLK_IGNORE_UNUSED, 0,
+		     &timer1_mux.clk.hw);
+
+static const struct clk_parent_data k230_timer2_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer2_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer2_mux,
+		    K230_TIMER2_MUX,
+		    0x50, 9, 0x1,
+		    0, 0,
+		    k230_timer2_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer2_gate,
+		     K230_TIMER2_GATE,
+		     0x50, 15, CLK_IGNORE_UNUSED, 0,
+		     &timer2_mux.clk.hw);
+
+static const struct clk_parent_data k230_timer3_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer3_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer3_mux,
+		    K230_TIMER3_MUX,
+		    0x50, 10, 0x1,
+		    0, 0,
+		    k230_timer3_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer3_gate,
+		     K230_TIMER3_GATE,
+		     0x50, 16, CLK_IGNORE_UNUSED, 0,
+		     &timer3_mux.clk.hw);
+
+static const struct clk_parent_data
k230_timer4_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer4_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer4_mux,
+		    K230_TIMER4_MUX,
+		    0x50, 11, 0x1,
+		    0, 0,
+		    k230_timer4_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer4_gate,
+		     K230_TIMER4_GATE,
+		     0x50, 17, CLK_IGNORE_UNUSED, 0,
+		     &timer4_mux.clk.hw);
+
+static const struct clk_parent_data k230_timer5_mux_pdata[] = {
+	{ .fw_name = "timer-pulse-in", },
+	{ .hw = &timer5_src_rate.clk.hw, },
+};
+
+K230_CLK_MUX_FORMAT(timer5_mux,
+		    K230_TIMER5_MUX,
+		    0x50, 12, 0x1,
+		    0, 0,
+		    k230_timer5_mux_pdata);
+
+K230_CLK_GATE_FORMAT(timer5_gate,
+		     K230_TIMER5_GATE,
+		     0x50, 18, CLK_IGNORE_UNUSED, 0,
+		     &timer5_mux.clk.hw);
+
+K230_CLK_GATE_FORMAT(shrm_apb_gate,
+		     K230_SHRM_APB_GATE,
+		     0x5C, 0, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(shrm_apb_rate,
+		     K230_SHRM_APB_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 18, 0x7,
+		     0x5C, 31, div, 0x0,
+		     false, 0,
+		     &shrm_apb_gate.clk.hw);
+
+static const struct clk_parent_data k230_shrm_sram_mux_pdata[] = {
+	{ .hw = &pll3_div2.hw, },
+	{ .hw = &pll0_div2.hw, },
+};
+
+K230_CLK_MUX_FORMAT(shrm_sram_mux,
+		    K230_SHRM_SRAM_MUX,
+		    0x50, 14, 0x1,
+		    0, 0,
+		    k230_shrm_sram_mux_pdata);
+
+K230_CLK_GATE_FORMAT(shrm_sram_gate,
+		     K230_SHRM_SRAM_GATE,
+		     0x5c, 10, CLK_IGNORE_UNUSED, 0,
+		     &shrm_sram_mux.clk.hw);
+
+K230_CLK_FIXED_FACTOR_FORMAT(shrm_sram_div2,
+			     1, 2, 0,
+			     &shrm_sram_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(shrm_axi_slave_gate,
+		     K230_SHRM_AXI_SLAVE_GATE,
+		     0x5C, 11, CLK_IGNORE_UNUSED, 0,
+		     &shrm_sram_div2.hw);
+
+K230_CLK_GATE_FORMAT(shrm_axi_gate,
+		     K230_SHRM_AXI_GATE,
+		     0x5C, 12, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_GATE_FORMAT(shrm_nonai2d_axi_gate,
+		     K230_SHRM_NONAI2D_AXI_GATE,
+		     0x5C, 9, 0, 0,
+		     &shrm_axi_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(shrm_decompress_axi_gate,
+		     K230_SHRM_DECOMPRESS_AXI_GATE,
+		     0x5C, 7, CLK_IGNORE_UNUSED, 0,
+		     &shrm_sram_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(shrm_sdma_axi_gate,
+		     K230_SHRM_SDMA_AXI_GATE,
+		     0x5C, 5, 0, 0,
+		     &shrm_axi_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(shrm_pdma_axi_gate,
+		     K230_SHRM_PDMA_AXI_GATE,
+		     0x5C, 3, 0, 0,
+		     &shrm_axi_gate.clk.hw);
+
+static const struct clk_parent_data k230_ddrc_src_mux_pdata[] = {
+	{ .hw = &pll0_div2.hw, },
+	{ .hw = &pll0_div3.hw, },
+	{ .hw = &pll2_div4.hw, },
+};
+
+K230_CLK_MUX_FORMAT(ddrc_src_mux,
+		    K230_DDRC_SRC_MUX,
+		    0x60, 0, 0x3,
+		    0, 0,
+		    k230_ddrc_src_mux_pdata);
+
+K230_CLK_GATE_FORMAT(ddrc_src_gate,
+		     K230_DDRC_SRC_GATE,
+		     0x60, 2, CLK_IGNORE_UNUSED, 0,
+		     &ddrc_src_mux.clk.hw);
+
+K230_CLK_RATE_FORMAT(ddrc_src_rate,
+		     K230_DDRC_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 10, 0xF,
+		     0x60, 31, div, 0x0,
+		     false, 0,
+		     &ddrc_src_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ddrc_bypass_gate,
+		     K230_DDRC_BYPASS_GATE,
+		     0x60, 8, 0, 0,
+		     &pll2_div4.hw);
+
+K230_CLK_GATE_FORMAT(ddrc_apb_gate,
+		     K230_DDRC_APB_GATE,
+		     0x60, 9, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(ddrc_apb_rate,
+		     K230_DDRC_APB_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 14, 0xF,
+		     0x60, 31, div, 0x0,
+		     false, 0,
+		     &ddrc_apb_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(display_ahb_gate,
+		     K230_DISPLAY_AHB_GATE,
+		     0x74, 0, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(display_ahb_rate,
+		     K230_DISPLAY_AHB_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 0, 0x7,
+		     0x78, 31, div, 0x0,
+		     false, 0,
+		     &display_ahb_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(display_axi_gate,
+		     K230_DISPLAY_AXI_GATE,
+		     0x74, 1, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(display_clkext_rate,
+		     K230_DISPLAY_CLKEXT_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 16, 0xF,
+		     0x78, 31, div, 0x0,
+		     false, 0,
+		     &display_axi_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(display_gpu_gate,
+		     K230_DISPLAY_GPU_GATE,
+		     0x74, 6, 0, 0,
+		     &pll0_div3.hw);
+
+K230_CLK_RATE_FORMAT(display_gpu_rate,
+		     K230_DISPLAY_GPU_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 20, 0xF,
+		     0x78, 31, div, 0x0,
+		     false, 0,
+		     &display_gpu_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(display_dpip_gate,
+		     K230_DISPLAY_DPIP_GATE,
+		     0x74, 2, 0, 0,
+		     &pll1_div4.hw);
+
+K230_CLK_RATE_FORMAT(display_dpip_rate,
+		     K230_DISPLAY_DPIP_RATE,
+		     1, 1, 0, 0,
+		     1, 256, 3,
0xFF,
+		     0x78, 31, div, 0x0,
+		     false, 0,
+		     &display_dpip_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(display_cfg_gate,
+		     K230_DISPLAY_CFG_GATE,
+		     0x74, 4, 0, 0,
+		     &pll1_div4.hw);
+
+K230_CLK_RATE_FORMAT(display_cfg_rate,
+		     K230_DISPLAY_CFG_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 11, 0x1F,
+		     0x78, 31, div, 0x0,
+		     false, 0,
+		     &display_cfg_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT_PNAME(display_ref_gate,
+			   K230_DISPLAY_REF_GATE,
+			   0x74, 3, 0, 0,
+			   "osc24m");
+
+K230_CLK_GATE_FORMAT(vpu_src_gate,
+		     K230_VPU_SRC_GATE,
+		     0xC, 0, 0, 0,
+		     &pll0_div2.hw);
+
+K230_CLK_RATE_FORMAT(vpu_src_rate,
+		     K230_VPU_SRC_RATE,
+		     1, 16, 1, 0xF,
+		     16, 16, 0, 0,
+		     0x0, 31, mul, 0xC,
+		     false, 0,
+		     &vpu_src_gate.clk.hw);
+
+K230_CLK_RATE_FORMAT(vpu_axi_src_rate,
+		     K230_VPU_AXI_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 6, 0xF,
+		     0xC, 31, div, 0x0,
+		     false, 0,
+		     &vpu_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(vpu_axi_gate,
+		     K230_VPU_AXI_GATE,
+		     0xC, 5, 0, 0,
+		     &vpu_axi_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(vpu_ddrcp2_gate,
+		     K230_VPU_DDRCP2_GATE,
+		     0x60, 5, 0, 0,
+		     &vpu_axi_src_rate.clk.hw);
+
+K230_CLK_GATE_FORMAT(vpu_cfg_gate,
+		     K230_VPU_CFG_GATE,
+		     0xC, 10, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(vpu_cfg_rate,
+		     K230_VPU_CFG_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 11, 0xF,
+		     0xC, 31, div, 0x0,
+		     false, 0,
+		     &vpu_cfg_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(sec_apb_gate,
+		     K230_SEC_APB_GATE,
+		     0x80, 0, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(sec_apb_rate,
+		     K230_SEC_APB_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 1, 0x7,
+		     0x80, 31, div, 0x0,
+		     false, 0,
+		     &sec_apb_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(sec_fix_gate,
+		     K230_SEC_FIX_GATE,
+		     0x80, 5, 0, 0,
+		     &pll1_div4.hw);
+
+K230_CLK_RATE_FORMAT(sec_fix_rate,
+		     K230_SEC_FIX_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 6, 0x1F,
+		     0x80, 31, div, 0x0,
+		     false, 0,
+		     &sec_fix_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(sec_axi_gate,
+		     K230_SEC_AXI_GATE,
+		     0x80, 4, 0, 0,
+		     &pll1_div4.hw);
+
+K230_CLK_RATE_FORMAT(sec_axi_rate,
+		     K230_SEC_AXI_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 11, 0x3,
+		     0x80, 31, div, 0,
+		     false, 0,
+		     &sec_axi_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(usb_480m_gate,
+		     K230_USB_480M_GATE,
+		     0x100, 0, 0, 0,
+		     &pll1.hw);
+
+K230_CLK_RATE_FORMAT(usb_480m_rate,
+		     K230_USB_480M_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 1, 0x7,
+		     0x100, 31, div, 0,
+		     false, 0,
+		     &usb_480m_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(usb_100m_gate,
+		     K230_USB_100M_GATE,
+		     0x100, 0, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(usb_100m_rate,
+		     K230_USB_100M_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 4, 0x7,
+		     0x100, 31, div, 0,
+		     false, 0,
+		     &usb_100m_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(dphy_dft_gate,
+		     K230_DPHY_DFT_GATE,
+		     0x100, 0, 0, 0,
+		     &pll0.hw);
+
+K230_CLK_RATE_FORMAT(dphy_dft_rate,
+		     K230_DPHY_DFT_RATE,
+		     1, 1, 0, 0,
+		     1, 16, 1, 0xF,
+		     0x104, 31, div, 0,
+		     false, 0,
+		     &dphy_dft_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(spi2axi_gate,
+		     K230_SPI2AXI_GATE,
+		     0x108, 0, 0, 0,
+		     &pll0_div4.hw);
+
+K230_CLK_RATE_FORMAT(spi2axi_rate,
+		     K230_SPI2AXI_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 1, 0x7,
+		     0x108, 31, div, 0x0,
+		     false, 0,
+		     &spi2axi_gate.clk.hw);
+
+static const struct clk_parent_data k230_ai_src_mux_pdata[] = {
+	{ .hw = &pll0_div2.hw, },
+	{ .hw = &pll3_div2.hw, },
+};
+
+K230_CLK_MUX_FORMAT(ai_src_mux,
+		    K230_AI_SRC_MUX,
+		    0x8, 2, 0x1,
+		    0, 0,
+		    k230_ai_src_mux_pdata);
+
+K230_CLK_GATE_FORMAT(ai_src_gate,
+		     K230_AI_SRC_GATE,
+		     0x8, 0, CLK_IGNORE_UNUSED, 0,
+		     &ai_src_mux.clk.hw);
+
+K230_CLK_RATE_FORMAT(ai_src_rate,
+		     K230_AI_SRC_RATE,
+		     1, 1, 0, 0,
+		     1, 8, 3, 0x7,
+		     0x8, 31, div, 0x0,
+		     false, 0,
+		     &ai_src_gate.clk.hw);
+
+K230_CLK_GATE_FORMAT(ai_axi_gate,
+		     K230_AI_AXI_GATE,
+		     0x8, 10, 0, 0,
+		     &ai_src_rate.clk.hw);
+
+static const struct clk_parent_data k230_camera0_mux_pdata[] = {
+	{ .hw = &pll1_div3.hw, },
+	{ .hw = &pll1_div4.hw, },
+	{ .hw = &pll0_div4.hw, },
+};
+
+K230_CLK_MUX_FORMAT(camera0_mux,
+		    K230_CAMERA0_MUX,
+		    0x6C, 3, 0x3,
+		    0, 0,
+		    k230_camera0_mux_pdata);
+
+K230_CLK_GATE_FORMAT(camera0_gate,
+		     K230_CAMERA0_GATE,
+		     0x6C, 0, CLK_IGNORE_UNUSED, 0,
+		     &camera0_mux.clk.hw);
+
+K230_CLK_RATE_FORMAT(camera0_rate,
+		     K230_CAMERA0_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 5, 0x1f,
+		     0x6C, 31, div, 0x0,
+		     false, 0,
+		     &camera0_gate.clk.hw);
+
+static const struct clk_parent_data k230_camera1_mux_pdata[] = {
+	{ .hw = &pll1_div3.hw, },
+	{ .hw = &pll1_div4.hw, },
+	{ .hw = &pll0_div4.hw, },
+};
+
+K230_CLK_MUX_FORMAT(camera1_mux,
+		    K230_CAMERA1_MUX,
+		    0x6C, 10, 0x3,
+		    0, 0,
+		    k230_camera1_mux_pdata);
+
+K230_CLK_GATE_FORMAT(camera1_gate,
+		     K230_CAMERA1_GATE,
+		     0x6C, 1, CLK_IGNORE_UNUSED, 0,
+		     &camera1_mux.clk.hw);
+
+K230_CLK_RATE_FORMAT(camera1_rate,
+		     K230_CAMERA1_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 12, 0x1f,
+		     0x6C, 31, div, 0x0,
+		     false, 0,
+		     &camera1_gate.clk.hw);
+
+static const struct clk_parent_data k230_camera2_mux_pdata[] = {
+	{ .hw = &pll1_div3.hw, },
+	{ .hw = &pll1_div4.hw, },
+	{ .hw = &pll0_div4.hw, },
+};
+
+K230_CLK_MUX_FORMAT(camera2_mux,
+		    K230_CAMERA2_MUX,
+		    0x6C, 17, 0x3,
+		    0, 0,
+		    k230_camera2_mux_pdata);
+
+K230_CLK_GATE_FORMAT(camera2_gate,
+		     K230_CAMERA2_GATE,
+		     0x6C, 2, CLK_IGNORE_UNUSED, 0,
+		     &camera2_mux.clk.hw);
+
+K230_CLK_RATE_FORMAT(camera2_rate,
+		     K230_CAMERA2_RATE,
+		     1, 1, 0, 0,
+		     1, 32, 19, 0x1f,
+		     0x6C, 31, div, 0x0,
+		     false, 0,
+		     &camera2_gate.clk.hw);
+
+static struct k230_clk_mux *k230_clk_muxs[] = {
+	&hs_ospi_src_mux,
+	&hs_usb_ref_mux,
+	&cpu1_src_mux,
+	&timer0_mux,
+	&timer1_mux,
+	&timer2_mux,
+	&timer3_mux,
+	&timer4_mux,
+	&timer5_mux,
+	&shrm_sram_mux,
+	&ddrc_src_mux,
+	&ai_src_mux,
+	&camera0_mux,
+	&camera1_mux,
+	&camera2_mux,
+};
+
+#define K230_CLK_MUX_NUM ARRAY_SIZE(k230_clk_muxs)
+
+static struct k230_clk_gate *k230_clk_gates[] = {
+	&cpu0_src_gate,
+	&cpu0_plic_gate,
+	&cpu0_noc_ddrcp4_gate,
+	&cpu0_apb_gate,
+	&cpu1_src_gate,
+	&cpu1_plic_gate,
+	&cpu1_apb_gate,
+	&pmu_apb_gate,
+	&hs_hclk_high_gate,
+	&hs_hclk_src_gate,
+	&hs_sd0_ahb_gate,
+	&hs_sd1_ahb_gate,
+	&hs_ssi1_ahb_gate,
+	&hs_ssi2_ahb_gate,
+	&hs_usb0_ahb_gate,
+	&hs_usb1_ahb_gate,
+	&hs_ssi0_axi_gate,
+	&hs_ssi1_gate,
+	&hs_ssi2_gate,
+	&hs_qspi_axi_src_gate,
+	&hs_ssi1_axi_gate,
+	&hs_ssi2_axi_gate,
+	&hs_sd_card_src_gate,
+	&hs_sd0_card_gate,
+	&hs_sd1_card_gate,
+	&hs_sd_axi_src_gate,
+	&hs_sd0_axi_gate,
+	&hs_sd1_axi_gate,
+	&hs_sd0_base_gate,
+	&hs_sd1_base_gate,
+	&hs_ospi_src_gate,
+	&hs_sd_timer_src_gate,
+	&hs_sd0_timer_gate,
+	&hs_sd1_timer_gate,
+	&hs_usb0_ref_gate,
+	&hs_usb1_ref_gate,
+	&ls_apb_src_gate,
+	&ls_uart0_apb_gate,
+	&ls_uart1_apb_gate,
+	&ls_uart2_apb_gate,
+	&ls_uart3_apb_gate,
+	&ls_uart4_apb_gate,
+	&ls_i2c0_apb_gate,
+	&ls_i2c1_apb_gate,
+	&ls_i2c2_apb_gate,
+	&ls_i2c3_apb_gate,
+	&ls_i2c4_apb_gate,
+	&ls_gpio_apb_gate,
+	&ls_pwm_apb_gate,
+	&ls_jamlink0_apb_gate,
+	&ls_jamlink1_apb_gate,
+	&ls_jamlink2_apb_gate,
+	&ls_jamlink3_apb_gate,
+	&ls_audio_apb_gate,
+	&ls_adc_apb_gate,
+	&ls_codec_apb_gate,
+	&ls_i2c0_gate,
+	&ls_i2c1_gate,
+	&ls_i2c2_gate,
+	&ls_i2c3_gate,
+	&ls_i2c4_gate,
+	&ls_codec_adc_gate,
+	&ls_codec_dac_gate,
+	&ls_audio_dev_gate,
+	&ls_pdm_gate,
+	&ls_adc_gate,
+	&ls_uart0_gate,
+	&ls_uart1_gate,
+	&ls_uart2_gate,
+	&ls_uart3_gate,
+	&ls_uart4_gate,
+	&ls_jamlink0co_gate,
+	&ls_jamlink1co_gate,
+	&ls_jamlink2co_gate,
+	&ls_jamlink3co_gate,
+	&ls_gpio_debounce_gate,
+	&sysctl_wdt0_apb_gate,
+	&sysctl_wdt1_apb_gate,
+	&sysctl_timer_apb_gate,
+	&sysctl_iomux_apb_gate,
+	&sysctl_mailbox_apb_gate,
+	&sysctl_hdi_gate,
+	&sysctl_time_stamp_gate,
+	&sysctl_wdt0_gate,
+	&sysctl_wdt1_gate,
+	&timer0_gate,
+	&timer1_gate,
+	&timer2_gate,
+	&timer3_gate,
+	&timer4_gate,
+	&timer5_gate,
+	&shrm_apb_gate,
+	&shrm_sram_gate,
+	&shrm_axi_gate,
+	&shrm_axi_slave_gate,
+	&shrm_nonai2d_axi_gate,
+	&shrm_decompress_axi_gate,
+	&shrm_sdma_axi_gate,
+	&shrm_pdma_axi_gate,
+	&ddrc_src_gate,
+	&ddrc_bypass_gate,
+	&ddrc_apb_gate,
+	&display_ahb_gate,
+	&display_axi_gate,
+	&display_gpu_gate,
+	&display_dpip_gate,
+	&display_cfg_gate,
+	&display_ref_gate,
+	&vpu_src_gate,
+	&vpu_axi_gate,
+	&vpu_ddrcp2_gate,
+	&vpu_cfg_gate,
+	&sec_apb_gate,
+	&sec_fix_gate,
+	&sec_axi_gate,
+	&usb_480m_gate,
+	&usb_100m_gate,
+	&dphy_dft_gate,
+	&spi2axi_gate,
+	&ai_src_gate,
+	&ai_axi_gate,
+	&camera0_gate,
+	&camera1_gate,
+	&camera2_gate,
+};
+
+#define K230_CLK_GATE_NUM ARRAY_SIZE(k230_clk_gates)
+
+static struct k230_clk_rate *k230_clk_rates[] = {
+	&cpu0_src_rate,
+	&cpu0_axi_rate,
+	&cpu0_plic_rate,
+	&cpu0_apb_rate,
+	&cpu1_src_rate,
+	&cpu1_axi_rate,
+	&cpu1_plic_rate,
+	&cpu1_apb_rate,
+	&hs_hclk_high_src_rate,
+	&hs_hclk_src_rate,
+	&hs_ssi0_axi_rate,
+	&hs_ssi1_rate,
+	&hs_ssi2_rate,
+	&hs_qspi_axi_src_rate,
+	&hs_sd_card_src_rate,
+	&hs_sd_axi_src_rate,
+	&hs_usb_ref_50m_rate,
+	&hs_sd_timer_src_rate,
+	&ls_apb_src_rate,
+	&ls_gpio_debounce_rate,
+	&ls_i2c0_rate,
+	&ls_i2c1_rate,
+	&ls_i2c2_rate,
+	&ls_i2c3_rate,
+	&ls_i2c4_rate,
+	&ls_codec_adc_rate,
+	&ls_codec_dac_rate,
+	&ls_audio_dev_rate,
+	&ls_pdm_rate,
+	&ls_adc_rate,
+	&ls_uart0_rate,
+	&ls_uart1_rate,
+	&ls_uart2_rate,
+	&ls_uart3_rate,
+	&ls_uart4_rate,
+	&ls_jamlinkco_src_rate,
+	&sysctl_hdi_rate,
+	&sysctl_time_stamp_rate,
+	&sysctl_temp_sensor_rate,
+	&sysctl_wdt0_rate,
+	&sysctl_wdt1_rate,
+	&timer0_src_rate,
+	&timer1_src_rate,
+	&timer2_src_rate,
+	&timer3_src_rate,
+	&timer4_src_rate,
+	&timer5_src_rate,
+	&shrm_apb_rate,
+	&ddrc_src_rate,
+	&ddrc_apb_rate,
+	&display_ahb_rate,
+	&display_clkext_rate,
+	&display_gpu_rate,
+	&display_dpip_rate,
+	&display_cfg_rate,
+	&vpu_src_rate,
+	&vpu_axi_src_rate,
+	&vpu_cfg_rate,
+	&sec_apb_rate,
+	&sec_fix_rate,
+	&sec_axi_rate,
+	&usb_480m_rate,
+	&usb_100m_rate,
+	&dphy_dft_rate,
+	&spi2axi_rate,
+	&ai_src_rate,
+	&camera0_rate,
+	&camera1_rate,
+	&camera2_rate,
+};
+
+#define K230_CLK_RATE_NUM ARRAY_SIZE(k230_clk_rates)
+
+#define K230_CLK_NUM (K230_CLK_MUX_NUM + K230_CLK_GATE_NUM + K230_CLK_RATE_NUM + 1)
+
+static int k230_pll_prepare(struct clk_hw *hw)
+{
+	struct k230_pll *pll = hw_to_k230_pll(hw);
+	u32 reg;
+
+	/* wait for PLL lock until it reaches lock status */
+	return readl_poll_timeout(K230_PLLX_LOCK_ADDR(pll->reg, pll->id), reg,
+				  reg &
K230_PLL_LOCK_STATUS_MASK,
+				  K230_PLL_LOCK_TIME_DELAY, K230_PLL_LOCK_TIMEOUT);
+}
+
+static inline bool k230_pll_hw_is_enabled(struct k230_pll *pll)
+{
+	return readl(K230_PLLX_GATE_ADDR(pll->reg, pll->id)) & K230_PLL_GATE_ENABLE;
+}
+
+static void k230_pll_enable_hw(struct k230_pll *pll)
+{
+	u32 reg;
+
+	if (k230_pll_hw_is_enabled(pll))
+		return;
+
+	reg = readl(K230_PLLX_GATE_ADDR(pll->reg, pll->id));
+	reg |= K230_PLL_GATE_ENABLE | K230_PLL_GATE_WRITE_ENABLE;
+	writel(reg, K230_PLLX_GATE_ADDR(pll->reg, pll->id));
+}
+
+static int k230_pll_enable(struct clk_hw *hw)
+{
+	struct k230_pll *pll = hw_to_k230_pll(hw);
+
+	guard(spinlock)(pll->lock);
+
+	k230_pll_enable_hw(pll);
+
+	return 0;
+}
+
+static void k230_pll_disable(struct clk_hw *hw)
+{
+	struct k230_pll *pll = hw_to_k230_pll(hw);
+	u32 reg;
+
+	guard(spinlock)(pll->lock);
+
+	reg = readl(K230_PLLX_GATE_ADDR(pll->reg, pll->id));
+	reg &= ~(K230_PLL_GATE_ENABLE);
+	reg |= (K230_PLL_GATE_WRITE_ENABLE);
+	writel(reg, K230_PLLX_GATE_ADDR(pll->reg, pll->id));
+}
+
+static int k230_pll_is_enabled(struct clk_hw *hw)
+{
+	return k230_pll_hw_is_enabled(hw_to_k230_pll(hw));
+}
+
+static unsigned long k230_pll_get_rate(struct clk_hw *hw, unsigned long parent_rate)
+{
+	struct k230_pll *pll = hw_to_k230_pll(hw);
+	u32 reg;
+	u32 r, f, od;
+
+	guard(spinlock)(pll->lock);
+
+	reg = readl(K230_PLLX_BYPASS_ADDR(pll->reg, pll->id));
+	if (reg & K230_PLL_BYPASS_ENABLE)
+		return parent_rate;
+
+	reg = readl(K230_PLLX_LOCK_ADDR(pll->reg, pll->id));
+	if (!(reg & (K230_PLL_LOCK_STATUS_MASK)))
+		return 0;
+
+	reg = readl(K230_PLLX_DIV_ADDR(pll->reg, pll->id));
+	r = ((reg >> K230_PLL_R_SHIFT) & K230_PLL_R_MASK) + 1;
+	f = ((reg >> K230_PLL_F_SHIFT) & K230_PLL_F_MASK) + 1;
+	od = ((reg >> K230_PLL_OD_SHIFT) & K230_PLL_OD_MASK) + 1;
+
+	return mul_u64_u32_div(parent_rate, f, r * od);
+}
+
+static int k230_register_plls(struct platform_device *pdev, spinlock_t *lock,
+			      void __iomem *reg)
+{
+	int i, ret;
+	struct k230_pll *pll;
+
+	for (i = 0; i < ARRAY_SIZE(k230_plls); i++) {
+		const char *name;
+
+		pll = k230_plls[i];
+
+		name = pll->hw.init->name;
+		pll->lock = lock;
+		pll->reg = reg;
+
+		ret = devm_clk_hw_register(&pdev->dev, &pll->hw);
+		if (ret)
+			return ret;
+
+		ret = devm_clk_hw_register_clkdev(&pdev->dev, &pll->hw, name, NULL);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int k230_register_pll_divs(struct platform_device *pdev)
+{
+	struct clk_fixed_factor *pll_div;
+	int ret;
+
+	for (int i = 0; i < ARRAY_SIZE(k230_pll_divs); i++) {
+		const char *name;
+
+		pll_div = k230_pll_divs[i];
+
+		name = pll_div->hw.init->name;
+
+		ret = devm_clk_hw_register(&pdev->dev, &pll_div->hw);
+		if (ret)
+			return ret;
+
+		ret = devm_clk_hw_register_clkdev(&pdev->dev, &pll_div->hw,
+						  name, NULL);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static unsigned long k230_clk_get_rate_mul(struct clk_hw *hw,
+					   unsigned long parent_rate)
+{
+	struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw);
+	struct k230_clk_rate_self *rate_self = &clk->clk;
+	u32 mul, div;
+
+	guard(spinlock)(rate_self->lock);
+
+	div = rate_self->div_max;
+	mul = (readl(rate_self->reg + clk->mul_reg_off) >> rate_self->mul_shift)
+	      & rate_self->mul_mask;
+
+	return mul_u64_u32_div(parent_rate, mul + 1, div);
+}
+
+static unsigned long k230_clk_get_rate_div(struct clk_hw *hw,
+					   unsigned long parent_rate)
+{
+	struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw);
+	struct k230_clk_rate_self *rate_self = &clk->clk;
+	u32 mul, div;
+
+	guard(spinlock)(rate_self->lock);
+
+	mul = rate_self->mul_max;
+	div = (readl(rate_self->reg + clk->div_reg_off) >> rate_self->div_shift)
+	      & rate_self->div_mask;
+
+	return mul_u64_u32_div(parent_rate, mul, div + 1);
+}
+
+static unsigned long k230_clk_get_rate_mul_div(struct clk_hw *hw,
+					       unsigned long parent_rate)
+{
+	struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw);
+	struct k230_clk_rate_self *rate_self = &clk->clk;
+	u32 mul, div;
+
+	guard(spinlock)(rate_self->lock);
+
+	div =
(readl(rate_self->reg + clk->div_reg_off) >> rate_self->div_shift)
+	      & rate_self->div_mask;
+	mul = (readl(rate_self->reg + clk->mul_reg_off) >> rate_self->mul_shift)
+	      & rate_self->mul_mask;
+
+	return mul_u64_u32_div(parent_rate, mul, div);
+}
+
+static int k230_clk_find_approximate_mul(u32 mul_min, u32 mul_max,
+					 u32 div_min, u32 div_max,
+					 unsigned long rate, unsigned long parent_rate,
+					 u32 *div, u32 *mul)
+{
+	long abs_min;
+	long abs_current;
+	long perfect_divide;
+
+	if (!rate || !parent_rate || !mul_min)
+		return -EINVAL;
+
+	perfect_divide = (long)((parent_rate * 1000) / rate);
+	abs_min = abs(perfect_divide -
+		      (long)(((long)div_max * 1000) / (long)mul_min));
+	*mul = mul_min;
+
+	for (u32 i = mul_min + 1; i <= mul_max; i++) {
+		abs_current = abs(perfect_divide -
+				  (long)(((long)div_max * 1000) / (long)i));
+
+		if (abs_min > abs_current) {
+			abs_min = abs_current;
+			*mul = i;
+		}
+	}
+
+	*div = div_max;
+
+	return 0;
+}
+
+static int k230_clk_find_approximate_div(u32 mul_min, u32 mul_max,
+					 u32 div_min, u32 div_max,
+					 unsigned long rate, unsigned long parent_rate,
+					 u32 *div, u32 *mul)
+{
+	long abs_min;
+	long abs_current;
+	long perfect_divide;
+
+	if (!rate || !parent_rate || !mul_max)
+		return -EINVAL;
+
+	perfect_divide = (long)((parent_rate * 1000) / rate);
+	abs_min = abs(perfect_divide -
+		      (long)(((long)div_min * 1000) / (long)mul_max));
+	*div = div_min;
+
+	for (u32 i = div_min + 1; i <= div_max; i++) {
+		abs_current = abs(perfect_divide -
+				  (long)(((long)i * 1000) / (long)mul_max));
+
+		if (abs_min > abs_current) {
+			abs_min = abs_current;
+			*div = i;
+		}
+	}
+
+	*mul = mul_max;
+
+	return 0;
+}
+
+static int k230_clk_find_approximate_mul_div(u32 mul_min, u32 mul_max,
+					     u32 div_min, u32 div_max,
+					     unsigned long rate,
+					     unsigned long parent_rate,
+					     u32 *div, u32 *mul)
+{
+	long abs_min;
+	long abs_current;
+	long perfect_divide;
+
+	if (!rate || !parent_rate || !mul_min)
+		return -EINVAL;
+
+	perfect_divide = (long)((parent_rate * 1000) /
rate);
+	abs_min = abs(perfect_divide -
+		      (long)(((long)div_max * 1000) / (long)mul_min));
+
+	*div = div_max;
+	*mul = mul_min;
+
+	for (u32 i = div_max - 1; i >= div_min; i--) {
+		for (u32 j = mul_min + 1; j <= mul_max; j++) {
+			abs_current = abs(perfect_divide -
+					  (long)(((long)i * 1000) / (long)j));
+
+			if (abs_min > abs_current) {
+				abs_min = abs_current;
+				*div = i;
+				*mul = j;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static long k230_clk_round_rate_mul(struct clk_hw *hw, unsigned long rate,
+				    unsigned long *parent_rate)
+{
+	struct k230_clk_rate_self *rate_self = hw_to_k230_clk_rate_self(hw);
+	u32 div, mul;
+
+	if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max,
+					  rate_self->div_min, rate_self->div_max,
+					  rate, *parent_rate, &div, &mul))
+		return 0;
+
+	return mul_u64_u32_div(*parent_rate, mul, div);
+}
+
+static long k230_clk_round_rate_div(struct clk_hw *hw, unsigned long rate,
+				    unsigned long *parent_rate)
+{
+	struct k230_clk_rate_self *rate_self = hw_to_k230_clk_rate_self(hw);
+	u32 div, mul;
+
+	if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max,
+					  rate_self->div_min, rate_self->div_max,
+					  rate, *parent_rate, &div, &mul))
+		return 0;
+
+	return mul_u64_u32_div(*parent_rate, mul, div);
+}
+
+static long k230_clk_round_rate_mul_div(struct clk_hw *hw, unsigned long rate,
+					unsigned long *parent_rate)
+{
+	struct k230_clk_rate_self *rate_self = hw_to_k230_clk_rate_self(hw);
+	u32 div, mul;
+
+	if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max,
+					      rate_self->div_min, rate_self->div_max,
+					      rate, *parent_rate, &div, &mul))
+		return 0;
+
+	return mul_u64_u32_div(*parent_rate, mul, div);
+}
+
+static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate,
+				 unsigned long parent_rate)
+{
+	struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw);
+	struct k230_clk_rate_self *rate_self = &clk->clk;
+	u32 div, mul, mul_reg;
+
+	if (rate > parent_rate)
+		return -EINVAL;
+
+	if (rate_self->read_only)
+ return 0; + + if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max, + rate_self->div_min, rate_self->div_max, + rate, parent_rate, &div, &mul)) + return -EINVAL; + + guard(spinlock)(rate_self->lock); + + mul_reg = readl(rate_self->reg + clk->mul_reg_off); + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); + mul_reg |= BIT(rate_self->write_enable_bit); + writel(mul_reg, rate_self->reg + clk->mul_reg_off); + + return 0; +} + +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate) +{ + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); + struct k230_clk_rate_self *rate_self = &clk->clk; + u32 div, mul, div_reg; + + if (rate > parent_rate) + return -EINVAL; + + if (rate_self->read_only) + return 0; + + if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max, + rate_self->div_min, rate_self->div_max, + rate, parent_rate, &div, &mul)) + return -EINVAL; + + guard(spinlock)(rate_self->lock); + + div_reg = readl(rate_self->reg + clk->div_reg_off); + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); + div_reg |= BIT(rate_self->write_enable_bit); + writel(div_reg, rate_self->reg + clk->div_reg_off); + + return 0; +} + +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, + unsigned long parent_rate) +{ + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); + struct k230_clk_rate_self *rate_self = &clk->clk; + u32 div, mul, div_reg, mul_reg; + + if (rate > parent_rate) + return -EINVAL; + + if (rate_self->read_only) + return 0; + + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, + rate_self->div_min, rate_self->div_max, + rate, parent_rate, &div, &mul)) + return -EINVAL; + + guard(spinlock)(rate_self->lock); + + div_reg = readl(rate_self->reg + clk->div_reg_off); + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); + div_reg |= BIT(rate_self->write_enable_bit); + 
writel(div_reg, rate_self->reg + clk->div_reg_off); + + mul_reg = readl(rate_self->reg + clk->mul_reg_off); + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); + mul_reg |= BIT(rate_self->write_enable_bit); + writel(mul_reg, rate_self->reg + clk->mul_reg_off); + + return 0; +} + +static int k230_register_clk(int id, struct clk_hw *hw, struct device *dev, + struct clk_hw_onecell_data *hw_data) +{ + int ret; + + ret = devm_clk_hw_register(dev, hw); + if (ret) + return ret; + + hw_data->hws[id] = hw; + + return 0; +} + +static int k230_register_clks(struct platform_device *pdev, + struct clk_hw_onecell_data *hw_data, + spinlock_t *lock, void __iomem *reg) +{ + int i, ret; + struct device *dev = &pdev->dev; + struct clk_fixed_factor *fixed_factor = &shrm_sram_div2; + struct k230_clk_mux *mux; + struct k230_clk_gate *gate; + struct k230_clk_rate *rate; + + for (i = 0; i < K230_CLK_MUX_NUM; i++) { + mux = k230_clk_muxs[i]; + mux->clk.lock = lock; + mux->clk.reg = reg + mux->reg_off; + + ret = k230_register_clk(mux->id, &mux->clk.hw, dev, hw_data); + if (ret) + return ret; + } + + for (i = 0; i < K230_CLK_GATE_NUM; i++) { + gate = k230_clk_gates[i]; + gate->clk.lock = lock; + gate->clk.reg = reg + gate->reg_off; + + ret = k230_register_clk(gate->id, &gate->clk.hw, dev, hw_data); + if (ret) + return ret; + } + + for (i = 0; i < K230_CLK_RATE_NUM; i++) { + rate = k230_clk_rates[i]; + rate->clk.lock = lock; + rate->clk.reg = reg; + + ret = k230_register_clk(rate->id, &rate->clk.hw, dev, hw_data); + if (ret) + return ret; + } + + ret = k230_register_clk(K230_SHRM_SRAM_DIV2, &fixed_factor->hw, dev, hw_data); + if (ret) + return ret; + + return devm_of_clk_add_hw_provider(&pdev->dev, of_clk_hw_onecell_get, hw_data); +} + +static int k230_clk_init_plls(struct platform_device *pdev) +{ + int ret; + void __iomem *reg; + /* used for all the plls */ + spinlock_t *lock; + + lock = devm_kzalloc(&pdev->dev, sizeof(*lock), GFP_KERNEL); + if (!lock) + return 
-ENOMEM; + + spin_lock_init(lock); + + reg = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(reg)) + return PTR_ERR(reg); + + ret = k230_register_plls(pdev, lock, reg); + if (ret) + return ret; + + ret = k230_register_pll_divs(pdev); + if (ret) + return ret; + + return 0; +} + +static int k230_clk_init_clks(struct platform_device *pdev, + struct clk_hw_onecell_data *hw_data) +{ + int ret; + void __iomem *reg; + /* used for all the clocks */ + spinlock_t *lock; + + lock = devm_kzalloc(&pdev->dev, sizeof(*lock), GFP_KERNEL); + if (!lock) + return -ENOMEM; + + spin_lock_init(lock); + + hw_data->num = K230_CLK_NUM; + + reg = devm_platform_ioremap_resource(pdev, 1); + if (IS_ERR(reg)) + return PTR_ERR(reg); + + ret = k230_register_clks(pdev, hw_data, lock, reg); + if (ret) + return ret; + + return 0; +} + +static int k230_clk_probe(struct platform_device *pdev) +{ + int ret; + struct clk_hw_onecell_data *hw_data; + + hw_data = devm_kzalloc(&pdev->dev, struct_size(hw_data, hws, K230_CLK_NUM), + GFP_KERNEL); + if (!hw_data) + return -ENOMEM; + + ret = k230_clk_init_plls(pdev); + if (ret) + return dev_err_probe(&pdev->dev, ret, "init plls failed\n"); + + ret = k230_clk_init_clks(pdev, hw_data); + if (ret) + return dev_err_probe(&pdev->dev, ret, "init clks failed\n"); + + return 0; +} + +static const struct of_device_id k230_clk_ids[] = { + { .compatible = "canaan,k230-clk" }, + { /* Sentinel */ } +}; +MODULE_DEVICE_TABLE(of, k230_clk_ids); + +static struct platform_driver k230_clk_driver = { + .driver = { + .name = "k230_clock_controller", + .of_match_table = k230_clk_ids, + }, + .probe = k230_clk_probe, +}; +builtin_platform_driver(k230_clk_driver); -- 2.34.1 From kingxukai at zohomail.com Thu Sep 4 20:10:24 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Fri, 05 Sep 2025 11:10:24 +0800 Subject: [PATCH v8 3/3] riscv: dts: canaan: Add clock definition for K230 In-Reply-To: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> References: 
<20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> Message-ID: <20250905-b4-k230-clk-v8-3-96caa02d5428@zohomail.com> This patch describes the clock controller integrated in K230 SoC and replace dummy clocks with the real ones for UARTs. For k230-canmv and k230-evb, they provide an additional external pulse input through a pin to serve as clock source. Co-developed-by: Troy Mitchell Signed-off-by: Troy Mitchell Signed-off-by: Xukai Wang --- arch/riscv/boot/dts/canaan/k230-canmv.dts | 11 +++++++++++ arch/riscv/boot/dts/canaan/k230-evb.dts | 11 +++++++++++ arch/riscv/boot/dts/canaan/k230.dtsi | 26 ++++++++++++++++++-------- 3 files changed, 40 insertions(+), 8 deletions(-) diff --git a/arch/riscv/boot/dts/canaan/k230-canmv.dts b/arch/riscv/boot/dts/canaan/k230-canmv.dts index 9565915cead6ad2381ea8249b616e79575feb896..cf33a5df9ff520a0dbb408864e615f61a115b673 100644 --- a/arch/riscv/boot/dts/canaan/k230-canmv.dts +++ b/arch/riscv/boot/dts/canaan/k230-canmv.dts @@ -17,8 +17,19 @@ ddr: memory at 0 { device_type = "memory"; reg = <0x0 0x0 0x0 0x20000000>; }; + + timerx_pulse_in: clock-50m { + compatible = "fixed-clock"; + #clock-cells = <0>; + clock-frequency = <50000000>; + }; }; &uart0 { status = "okay"; }; + +&sysclk { + clocks = <&osc24m>, <&timerx_pulse_in>; + clock-names = "osc24m", "timer-pulse-in"; +}; diff --git a/arch/riscv/boot/dts/canaan/k230-evb.dts b/arch/riscv/boot/dts/canaan/k230-evb.dts index f898b8e62368c3740d6795fd1e3cb0b261a460ac..24dba44955690e01e53f11d6720e60a81a9f435d 100644 --- a/arch/riscv/boot/dts/canaan/k230-evb.dts +++ b/arch/riscv/boot/dts/canaan/k230-evb.dts @@ -17,8 +17,19 @@ ddr: memory at 0 { device_type = "memory"; reg = <0x0 0x0 0x0 0x20000000>; }; + + timerx_pulse_in: clock-50m { + compatible = "fixed-clock"; + #clock-cells = <0>; + clock-frequency = <50000000>; + }; }; &uart0 { status = "okay"; }; + +&sysclk { + clocks = <&osc24m>, <&timerx_pulse_in>; + clock-names = "osc24m", "timer-pulse-in"; +}; diff --git 
a/arch/riscv/boot/dts/canaan/k230.dtsi b/arch/riscv/boot/dts/canaan/k230.dtsi index 95c1a3d8fb1192e30113d96d3e96329545bc6ae7..7868cd4c6c9e9d82c9271f8585a71b67738d1ca7 100644 --- a/arch/riscv/boot/dts/canaan/k230.dtsi +++ b/arch/riscv/boot/dts/canaan/k230.dtsi @@ -3,6 +3,7 @@ * Copyright (C) 2024 Yangyu Chen */ +#include #include /dts-v1/; @@ -58,11 +59,11 @@ l2_cache: l2-cache { }; }; - apb_clk: apb-clk-clock { + osc24m: clock-24m { compatible = "fixed-clock"; - clock-frequency = <50000000>; - clock-output-names = "apb_clk"; #clock-cells = <0>; + clock-frequency = <24000000>; + clock-output-names = "osc24m"; }; soc { @@ -89,10 +90,19 @@ clint: timer at f04000000 { interrupts-extended = <&cpu0_intc 3>, <&cpu0_intc 7>; }; + sysclk: clock-controller at 91102000 { + compatible = "canaan,k230-clk"; + reg = <0x0 0x91102000 0x0 0x40>, + <0x0 0x91100000 0x0 0x108>; + clocks = <&osc24m>; + clock-names = "osc24m"; + #clock-cells = <1>; + }; + uart0: serial at 91400000 { compatible = "snps,dw-apb-uart"; reg = <0x0 0x91400000 0x0 0x1000>; - clocks = <&apb_clk>; + clocks = <&sysclk K230_LS_UART0_RATE>; interrupts = <16 IRQ_TYPE_LEVEL_HIGH>; reg-io-width = <4>; reg-shift = <2>; @@ -102,7 +112,7 @@ uart0: serial at 91400000 { uart1: serial at 91401000 { compatible = "snps,dw-apb-uart"; reg = <0x0 0x91401000 0x0 0x1000>; - clocks = <&apb_clk>; + clocks = <&sysclk K230_LS_UART1_RATE>; interrupts = <17 IRQ_TYPE_LEVEL_HIGH>; reg-io-width = <4>; reg-shift = <2>; @@ -112,7 +122,7 @@ uart1: serial at 91401000 { uart2: serial at 91402000 { compatible = "snps,dw-apb-uart"; reg = <0x0 0x91402000 0x0 0x1000>; - clocks = <&apb_clk>; + clocks = <&sysclk K230_LS_UART2_RATE>; interrupts = <18 IRQ_TYPE_LEVEL_HIGH>; reg-io-width = <4>; reg-shift = <2>; @@ -122,7 +132,7 @@ uart2: serial at 91402000 { uart3: serial at 91403000 { compatible = "snps,dw-apb-uart"; reg = <0x0 0x91403000 0x0 0x1000>; - clocks = <&apb_clk>; + clocks = <&sysclk K230_LS_UART3_RATE>; interrupts = <19 IRQ_TYPE_LEVEL_HIGH>; 
reg-io-width = <4>; reg-shift = <2>; @@ -132,7 +142,7 @@ uart3: serial at 91403000 { uart4: serial at 91404000 { compatible = "snps,dw-apb-uart"; reg = <0x0 0x91404000 0x0 0x1000>; - clocks = <&apb_clk>; + clocks = <&sysclk K230_LS_UART4_RATE>; interrupts = <20 IRQ_TYPE_LEVEL_HIGH>; reg-io-width = <4>; reg-shift = <2>; -- 2.34.1 From zong.li at sifive.com Thu Sep 4 20:27:10 2025 From: zong.li at sifive.com (Zong Li) Date: Fri, 5 Sep 2025 11:27:10 +0800 Subject: [RFC PATCH v2 00/10] RISC-V IOMMU HPM and nested IOMMU support In-Reply-To: References: <20240614142156.29420-3-zong.li@sifive.com> <20250901133629.87310-1-ni_liqiang@126.com> Message-ID: On Tue, Sep 2, 2025 at 12:01?PM Zong Li wrote: > > On Mon, Sep 1, 2025 at 9:37?PM niliqiang wrote: > > > > Hi Zong > > > > Fri, 14 Jun 2024 22:21:48 +0800, Zong Li wrote: > > > > > This patch initialize the pmu stuff and uninitialize it when driver > > > removing. The interrupt handling is also provided, this handler need to > > > be primary handler instead of thread function, because pt_regs is empty > > > when threading the IRQ, but pt_regs is necessary by perf_event_overflow. 
> > > > > > Signed-off-by: Zong Li > > > --- > > > drivers/iommu/riscv/iommu.c | 65 +++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 65 insertions(+) > > > > > > diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c > > > index 8b6a64c1ad8d..1716b2251f38 100644 > > > --- a/drivers/iommu/riscv/iommu.c > > > +++ b/drivers/iommu/riscv/iommu.c > > > @@ -540,6 +540,62 @@ static irqreturn_t riscv_iommu_fltq_process(int irq, void *data) > > > return IRQ_HANDLED; > > > } > > > > > > +/* > > > + * IOMMU Hardware performance monitor > > > + */ > > > + > > > +/* HPM interrupt primary handler */ > > > +static irqreturn_t riscv_iommu_hpm_irq_handler(int irq, void *dev_id) > > > +{ > > > + struct riscv_iommu_device *iommu = (struct riscv_iommu_device *)dev_id; > > > + > > > + /* Process pmu irq */ > > > + riscv_iommu_pmu_handle_irq(&iommu->pmu); > > > + > > > + /* Clear performance monitoring interrupt pending */ > > > + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_IPSR, RISCV_IOMMU_IPSR_PMIP); > > > + > > > + return IRQ_HANDLED; > > > +} > > > + > > > +/* HPM initialization */ > > > +static int riscv_iommu_hpm_enable(struct riscv_iommu_device *iommu) > > > +{ > > > + int rc; > > > + > > > + if (!(iommu->caps & RISCV_IOMMU_CAPABILITIES_HPM)) > > > + return 0; > > > + > > > + /* > > > + * pt_regs is empty when threading the IRQ, but pt_regs is necessary > > > + * by perf_event_overflow. Use primary handler instead of thread > > > + * function for PM IRQ. > > > + * > > > + * Set the IRQF_ONESHOT flag because this IRQ might be shared with > > > + * other threaded IRQs by other queues. 
> > > + */ > > > + rc = devm_request_irq(iommu->dev, > > > + iommu->irqs[riscv_iommu_queue_vec(iommu, RISCV_IOMMU_IPSR_PMIP)], > > > + riscv_iommu_hpm_irq_handler, IRQF_ONESHOT | IRQF_SHARED, NULL, iommu); > > > + if (rc) > > > + return rc; > > > + > > > + return riscv_iommu_pmu_init(&iommu->pmu, iommu->reg, dev_name(iommu->dev)); > > > +} > > > + > > > > What are the benefits of initializing the iommu-pmu driver in the iommu driver? > > > > It might be better for the RISC-V IOMMU PMU driver to be loaded as a separate module, as this would allow greater flexibility since different vendors may need to add custom events. > > > > Also, I'm not quite clear on how custom events should be added if the RISC-V iommu-pmu is placed within the iommu driver. > > Hi Liqiang, > My original idea is that, since the IOMMU HPM is not always present, > it depends on the capability.HPM bit, if we separate HPM into an > individual module, I assume that the PMU driver may not have access to > the IOMMU's complete MMIO region. I'm not sure how we would check the > capability register in the PMU driver and avoid the following > situation: capability.HPM is zero, but the IOMMU-PMU driver is still > loaded because the PMU node is present in the DTS. It will be helpful > if you have any suggestions on this. > > Regarding custom events, since we don't have the driver data, my > current rough idea is to add a vendor event map table to list the > vendor events and use Kconfig to define them respectively. This is > just an initial thought and may not be the good solution, so feel free > to share any recommendations. Of course, if we eventually decide to > move it to drivers/perf as an individual module, then we could use the > driver data for custom events, similar to what ARM does. Maybe let's try auxiliary driver framework to resolve this topic in the next version.
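The concern raised in this reply (a PMU driver binding even though capability.HPM is zero) can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not kernel code and not what the next version of the series will do: `struct iommu_model`, `CAP_HPM`, `register_pmu_child()` and `iommu_probe_model()` are hypothetical names made up for illustration. The idea it shows is that the parent driver, which owns the MMIO region and can read the capabilities register, decides whether a PMU child is created at all, so a separate PMU module never has to read that register itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the capability.HPM bit; illustration only. */
#define CAP_HPM (1ULL << 30)

/* Hypothetical model of the parent IOMMU device state. */
struct iommu_model {
	uint64_t caps;        /* snapshot of the capabilities register */
	bool pmu_registered;  /* set once a PMU child has been created */
};

/* Stand-in for creating the child device a separate PMU driver binds to. */
static int register_pmu_child(struct iommu_model *iommu)
{
	iommu->pmu_registered = true;
	return 0;
}

/*
 * Parent-side probe: the child is created only when the hardware
 * advertises HPM, so the PMU driver cannot load on hardware without it.
 */
static int iommu_probe_model(struct iommu_model *iommu)
{
	assert(iommu != NULL);

	if (!(iommu->caps & CAP_HPM))
		return 0; /* no HPM: not an error, just nothing to register */

	return register_pmu_child(iommu);
}
```

With this split, vendor-specific event tables could live entirely in the child driver, which is the flexibility asked for earlier in the thread.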
> > Thanks > > > > > > > > Best regards, > > Liqiang > > From kingxukai at zohomail.com Thu Sep 4 21:12:21 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Fri, 5 Sep 2025 12:12:21 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> Message-ID: Hi Stephen Boyd, Is the driver in this series satisfactory to you? If you have any concerns or suggestions, I would appreciate your feedback. Otherwise, I would like to know if it is ready for merging. Thank you for your time and consideration. On 2025/9/5 11:10, Xukai Wang wrote: > This patch provides basic support for the K230 clock, which covers > all clocks in K230 SoC. > > The clock tree of the K230 SoC consists of a 24MHZ external crystal > oscillator, PLLs and an external pulse input for timerX, and their > derived clocks. > > Co-developed-by: Troy Mitchell > Signed-off-by: Troy Mitchell > Signed-off-by: Xukai Wang > --- > drivers/clk/Kconfig | 6 + > drivers/clk/Makefile | 1 + > drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 2463 insertions(+) From seanjc at google.com Thu Sep 4 22:39:37 2025 From: seanjc at google.com (Sean Christopherson) Date: Thu, 4 Sep 2025 22:39:37 -0700 Subject: [PATCH v2 0/7] Drivers: hv: Fix NEED_RESCHED_LAZY and use common APIs In-Reply-To: References: <20250828000156.23389-1-seanjc@google.com> Message-ID: On Thu, Sep 04, 2025, Wei Liu wrote: > On Wed, Aug 27, 2025 at 05:01:49PM -0700, Sean Christopherson wrote: > > Fix a bug where MSHV root partitions (and upper-level VTL code) don't honor > > NEED_RESCHED_LAZY, and then deduplicate the TIF related MSHV code by turning > > the "kvm" entry APIs into more generic "virt" APIs.
> > > > This version is based on > > > > git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git hyperv-next > > > > in order to pickup the VTL changes that are queued for 6.18. I also > > squashed the NEED_RESCHED_LAZY fixes for root and VTL modes into a single > > patch, as it should be easy/straightforward to drop the VTL change as needed > > if we want this in 6.17 or earlier. > > > > That effectively means the full series is dependent on the VTL changes being > > fully merged for 6.18. But I think that's ok as it's really only the MSHV > > changes that have any urgency whatsoever, and I assume that Microsoft is > > the only user that truly cares about the MSHV root fix. I.e. if the whole > > thing gets delayed, I think it's only the Hyper-V folks that are impacted. > > > > I have no preference what tree this goes through, or when, and can respin > > and/or split as needed. > > > > As with v1, the Hyper-V stuff and non-x86 architectures are compile-tested > > only. > > > > v2: > > - Rebase on hyperv-next. > > - Fix and converge the VTL code as well. [Peter, Nuno] > > > > v1: https://lore.kernel.org/all/20250825200622.3759571-1-seanjc at google.com > > > > I dropped the mshv_vtl changes in this series and applied the rest > (including the KVM changes) to hyperv-next. mshv_do_pre_guest_mode_work() ended up getting left behind since its removal was in the last mshv_vtl patch. $ git grep mshv_do_pre_guest_mode_work drivers/hv/mshv.h:int mshv_do_pre_guest_mode_work(ulong th_flags); drivers/hv/mshv_common.c:int mshv_do_pre_guest_mode_work(ulong th_flags) drivers/hv/mshv_common.c:EXPORT_SYMBOL_GPL(mshv_do_pre_guest_mode_work); Want to squash this into 3786d7d6b3c0 ("mshv: Use common "entry virt" APIs to do work in root before running guest")? 
--- drivers/hv/mshv.h | 2 -- drivers/hv/mshv_common.c | 22 ---------------------- 2 files changed, 24 deletions(-) diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h index 0340a67acd0a..d4813df92b9c 100644 --- a/drivers/hv/mshv.h +++ b/drivers/hv/mshv.h @@ -25,6 +25,4 @@ int hv_call_set_vp_registers(u32 vp_index, u64 partition_id, u16 count, int hv_call_get_partition_property(u64 partition_id, u64 property_code, u64 *property_value); -int mshv_do_pre_guest_mode_work(ulong th_flags); - #endif /* _MSHV_H */ diff --git a/drivers/hv/mshv_common.c b/drivers/hv/mshv_common.c index eb3df3e296bb..aa2be51979fd 100644 --- a/drivers/hv/mshv_common.c +++ b/drivers/hv/mshv_common.c @@ -138,25 +138,3 @@ int hv_call_get_partition_property(u64 partition_id, return 0; } EXPORT_SYMBOL_GPL(hv_call_get_partition_property); - -/* - * Handle any pre-processing before going into the guest mode on this cpu, most - * notably call schedule(). Must be invoked with both preemption and - * interrupts enabled. - * - * Returns: 0 on success, -errno on error. 
- */ -int mshv_do_pre_guest_mode_work(ulong th_flags) -{ - if (th_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) - return -EINTR; - - if (th_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) - schedule(); - - if (th_flags & _TIF_NOTIFY_RESUME) - resume_user_mode_work(NULL); - - return 0; -} -EXPORT_SYMBOL_GPL(mshv_do_pre_guest_mode_work); base-commit: 3786d7d6b3c0a412ebe4439ba4a7d4b0e27d9a12 -- From cleger at rivosinc.com Thu Sep 4 23:29:14 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 5 Sep 2025 08:29:14 +0200 Subject: [PATCH 2/2] riscv: Fix sparse warning about different address spaces In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> Message-ID: <97fb026b-54d7-42ec-a57c-51c8c8c44a76@rivosinc.com> On 03/09/2025 20:53, Alexandre Ghiti wrote: > We did not propagate the __user attribute of the pointers in > __get_kernel_nofault() and __put_kernel_nofault(), which results in > sparse complaining: > >>> mm/maccess.c:41:17: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const [noderef] __user *from @@ got unsigned long long [usertype] * @@ > mm/maccess.c:41:17: sparse: expected void const [noderef] __user *from > mm/maccess.c:41:17: sparse: got unsigned long long [usertype] * > > So fix this by correctly casting those pointers. 
> > Reported-by: kernel test robot > Closes: https://lore.kernel.org/oe-kbuild-all/202508161713.RWu30Lv1-lkp at intel.com/ > Suggested-by: Al Viro > Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") > Cc: stable at vger.kernel.org > Signed-off-by: Alexandre Ghiti > --- > arch/riscv/include/asm/uaccess.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index 551e7490737effb2c238e6a4db50293ece7c9df9..f5f4f7f85543f2a635b18e4bd1c6202b20e3b239 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -438,10 +438,10 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) > } > > #define __get_kernel_nofault(dst, src, type, err_label) \ > - __get_user_nocheck(*((type *)(dst)), (type *)(src), err_label) > + __get_user_nocheck(*((type *)(dst)), (__force __user type *)(src), err_label) > > #define __put_kernel_nofault(dst, src, type, err_label) \ > - __put_user_nocheck(*((type *)(src)), (type *)(dst), err_label) > + __put_user_nocheck(*((type *)(src)), (__force __user type *)(dst), err_label) > > static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len) > { > Hi Alex, LGTM, Reviewed-by: Clément Léger Thanks, Clément From david at redhat.com Thu Sep 4 23:41:23 2025 From: david at redhat.com (David Hildenbrand) Date: Fri, 5 Sep 2025 08:41:23 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <20250901150359.867252-20-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> Message-ID: <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> On 01.09.25 17:03, David Hildenbrand wrote: > We can just cleanup the code by calculating the #refs earlier, > so we can just inline what remains of record_subpages().
> > Calculate the number of references/pages ahead of times, and record them > only once all our tests passed. > > Signed-off-by: David Hildenbrand > --- > mm/gup.c | 25 ++++++++----------------- > 1 file changed, 8 insertions(+), 17 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index c10cd969c1a3b..f0f4d1a68e094 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) > #ifdef CONFIG_MMU > > #ifdef CONFIG_HAVE_GUP_FAST > -static int record_subpages(struct page *page, unsigned long sz, > - unsigned long addr, unsigned long end, > - struct page **pages) > -{ > - int nr; > - > - page += (addr & (sz - 1)) >> PAGE_SHIFT; > - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) > - pages[nr] = page++; > - > - return nr; > -} > - > /** > * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. > * @page: pointer to page to be grabbed > @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > if (pmd_special(orig)) > return 0; > > - page = pmd_page(orig); > - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); > + refs = (end - addr) >> PAGE_SHIFT; > + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > if (!folio) > @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > } > > *nr += refs; > + for (; refs; refs--) > + *(pages++) = page++; > folio_set_referenced(folio); > return 1; > } > @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > if (pud_special(orig)) > return 0; > > - page = pud_page(orig); > - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); > + refs = (end - addr) >> PAGE_SHIFT; > + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > if (!folio) > @@ -3030,6 +3019,8 @@ static int 
gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > } > > *nr += refs; > + for (; refs; refs--) > + *(pages++) = page++; > folio_set_referenced(folio); > return 1; > } Okay, this code is nasty. We should rework this code to just return the nr and receive the proper pages pointer, getting rid of the "*nr" parameter. For the time being, the following should do the trick: commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) Author: David Hildenbrand Date: Fri Sep 5 08:38:43 2025 +0200 fixup: mm/gup: remove record_subpages() pages is not adjusted by the caller, but indexed by existing *nr. Signed-off-by: David Hildenbrand diff --git a/mm/gup.c b/mm/gup.c index 010fe56f6e132..22420f2069ee1 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, return 0; } + pages += *nr; *nr += refs; for (; refs; refs--) *(pages++) = page++; @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, return 0; } + pages += *nr; *nr += refs; for (; refs; refs--) *(pages++) = page++; -- Cheers David / dhildenb From anup at brainfault.org Thu Sep 4 23:51:20 2025 From: anup at brainfault.org (Anup Patel) Date: Fri, 5 Sep 2025 12:21:20 +0530 Subject: [PATCH V4 RESEND 3/3] RISC-V: KVM: Prevent HGATP_MODE_BARE passed In-Reply-To: <20250821142542.2472079-4-guoren@kernel.org> References: <20250821142542.2472079-1-guoren@kernel.org> <20250821142542.2472079-4-guoren@kernel.org> Message-ID: On Thu, Aug 21, 2025 at 7:56 PM wrote: > > From: "Guo Ren (Alibaba DAMO Academy)" > > urrent kvm_riscv_gstage_mode_detect() assumes H-extension must s/urrent/Current/ > have HGATP_MODE_SV39X4/SV32X4 at least, but the spec allows > H-extension with HGATP_MODE_BARE alone. The KVM depends on > !HGATP_MODE_BARE at least, so enhance the gstage-mode-detect > to block HGATP_MODE_BARE. > > Move gstage-mode-check closer to gstage-mode-detect to prevent > unnecessary init.
> > Reviewed-by: Troy Mitchell > Reviewed-by: Nutty Liu > Signed-off-by: Guo Ren (Alibaba DAMO Academy) > --- > arch/riscv/kvm/gstage.c | 27 ++++++++++++++++++++++++--- > arch/riscv/kvm/main.c | 35 +++++++++++++++++------------------ > 2 files changed, 41 insertions(+), 21 deletions(-) > > diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c > index 24c270d6d0e2..b67d60d722c2 100644 > --- a/arch/riscv/kvm/gstage.c > +++ b/arch/riscv/kvm/gstage.c > @@ -321,7 +321,7 @@ void __init kvm_riscv_gstage_mode_detect(void) > if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV57X4) { > kvm_riscv_gstage_mode = HGATP_MODE_SV57X4; > kvm_riscv_gstage_pgd_levels = 5; > - goto skip_sv48x4_test; > + goto done; > } > > /* Try Sv48x4 G-stage mode */ > @@ -329,10 +329,31 @@ void __init kvm_riscv_gstage_mode_detect(void) > if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV48X4) { > kvm_riscv_gstage_mode = HGATP_MODE_SV48X4; > kvm_riscv_gstage_pgd_levels = 4; > + goto done; > } > -skip_sv48x4_test: > > + /* Try Sv39x4 G-stage mode */ > + csr_write(CSR_HGATP, HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT); > + if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV39X4) { > + kvm_riscv_gstage_mode = HGATP_MODE_SV39X4; > + kvm_riscv_gstage_pgd_levels = 3; > + goto done; > + } > +#else /* CONFIG_32BIT */ > + /* Try Sv32x4 G-stage mode */ > + csr_write(CSR_HGATP, HGATP_MODE_SV32X4 << HGATP_MODE_SHIFT); > + if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV32X4) { > + kvm_riscv_gstage_mode = HGATP_MODE_SV32X4; > + kvm_riscv_gstage_pgd_levels = 2; > + goto done; > + } > +#endif > + > + /* KVM depends on !HGATP_MODE_OFF */ > + kvm_riscv_gstage_mode = HGATP_MODE_OFF; > + kvm_riscv_gstage_pgd_levels = 0; > + > +done: > csr_write(CSR_HGATP, 0); > kvm_riscv_local_hfence_gvma_all(); > -#endif > } > diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c > index 67c876de74ef..8ee7aaa74ddc 100644 > --- a/arch/riscv/kvm/main.c > +++ b/arch/riscv/kvm/main.c 
> @@ -93,6 +93,23 @@ static int __init riscv_kvm_init(void) > return rc; > > kvm_riscv_gstage_mode_detect(); > + switch (kvm_riscv_gstage_mode) { > + case HGATP_MODE_SV32X4: > + str = "Sv32x4"; > + break; > + case HGATP_MODE_SV39X4: > + str = "Sv39x4"; > + break; > + case HGATP_MODE_SV48X4: > + str = "Sv48x4"; > + break; > + case HGATP_MODE_SV57X4: > + str = "Sv57x4"; > + break; > + default: Need kvm_riscv_nacl_exit() here. > + return -ENODEV; > + } > + kvm_info("using %s G-stage page table format\n", str); Moving the kvm_info() over here now prints G-stage mode before announcing availablity of h-extension which looks odd. It's better to keep kvm_info() in the same location and only move the switch-case. > > kvm_riscv_gstage_vmid_detect(); > > @@ -135,24 +152,6 @@ static int __init riscv_kvm_init(void) > (rc) ? slist : "no features"); > } > > - switch (kvm_riscv_gstage_mode) { > - case HGATP_MODE_SV32X4: > - str = "Sv32x4"; > - break; > - case HGATP_MODE_SV39X4: > - str = "Sv39x4"; > - break; > - case HGATP_MODE_SV48X4: > - str = "Sv48x4"; > - break; > - case HGATP_MODE_SV57X4: > - str = "Sv57x4"; > - break; > - default: > - return -ENODEV; > - } > - kvm_info("using %s G-stage page table format\n", str); > - > kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits()); > > if (kvm_riscv_aia_available()) > -- > 2.40.1 > Otherwise, this looks good to me. I will take care of minor comments mentioned above at the time of merging this series. 
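For readers following the review, the detection logic in the quoted patch is a write-then-read-back probe of hgatp.MODE, trying the widest mode first and falling back to "no usable mode". The following is a minimal userspace sketch of that technique, not the kernel implementation: `struct hart_model`, `hgatp_write_mode()`, `probe()`, `gstage_mode_detect_model()` and `kvm_init_model()` are hypothetical names, the CSR is simulated, and the model assumes one particular legal WARL behaviour (an unsupported write leaves MODE unchanged). `MODE_BARE` here plays the role of the value the patch calls HGATP_MODE_OFF.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* hgatp.MODE encodings for RV64 as defined by the privileged spec. */
enum { MODE_BARE = 0, MODE_SV39X4 = 8, MODE_SV48X4 = 9, MODE_SV57X4 = 10 };

/* Simulated hart: which MODE values its hgatp field accepts. */
struct hart_model {
	uint32_t supported;  /* bit n set => MODE value n is implemented */
	unsigned int mode;   /* current hgatp.MODE */
};

/* Assumed WARL behaviour: an unsupported write leaves MODE unchanged. */
static void hgatp_write_mode(struct hart_model *h, unsigned int mode)
{
	if (mode == MODE_BARE || (h->supported & (1u << mode)))
		h->mode = mode;
}

/* Write a candidate mode and check whether it reads back. */
static int probe(struct hart_model *h, unsigned int mode)
{
	hgatp_write_mode(h, mode);
	return h->mode == mode;
}

/* Return the widest supported translated mode, or MODE_BARE if none. */
static unsigned int gstage_mode_detect_model(struct hart_model *h)
{
	static const unsigned int candidates[] = {
		MODE_SV57X4, MODE_SV48X4, MODE_SV39X4,
	};
	unsigned int found = MODE_BARE;
	size_t i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (probe(h, candidates[i])) {
			found = candidates[i];
			break;
		}
	}

	/* Leave the CSR cleared after probing, like csr_write(CSR_HGATP, 0). */
	hgatp_write_mode(h, MODE_BARE);
	return found;
}

/* Init-time check: refuse to continue when only Bare is available. */
static int kvm_init_model(struct hart_model *h)
{
	return gstage_mode_detect_model(h) == MODE_BARE ? -ENODEV : 0;
}
```

In the real init path, the early return on the Bare-only case is where the review above asks for the kvm_riscv_nacl_exit() cleanup; the model has no earlier setup to undo, so it omits that step.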
Regards, Anup From anup at brainfault.org Thu Sep 4 23:51:46 2025 From: anup at brainfault.org (Anup Patel) Date: Fri, 5 Sep 2025 12:21:46 +0530 Subject: [PATCH V4 RESEND 0/3] Fixup & optimize hgatp mode & vmid detect functions In-Reply-To: <20250821142542.2472079-1-guoren@kernel.org> References: <20250821142542.2472079-1-guoren@kernel.org> Message-ID: On Thu, Aug 21, 2025 at 7:56?PM wrote: > > From: "Guo Ren (Alibaba DAMO Academy)" > > Here are serval fixup & optmizitions for hgatp detect according > to the RISC-V Privileged Architecture Spec. > > --- > Changes in v4: > - Involve ("RISC-V: KVM: Prevent HGATP_MODE_BARE passed"), which > explain why gstage_mode_detect needs reset HGATP to zero. > - RESEND for wrong mailing thread. > > Changes in v3: > - Add "Fixes" tag. > - Involve("RISC-V: KVM: Remove unnecessary HGATP csr_read"), which > depends on patch 1. > > Changes in v2: > - Fixed build error since kvm_riscv_gstage_mode() has been modified. > --- > > Fangyu Yu (1): > RISC-V: KVM: Write hgatp register with valid mode bits > > Guo Ren (Alibaba DAMO Academy) (2): > RISC-V: KVM: Remove unnecessary HGATP csr_read > RISC-V: KVM: Prevent HGATP_MODE_BARE passed > > arch/riscv/kvm/gstage.c | 27 ++++++++++++++++++++++++--- > arch/riscv/kvm/main.c | 35 +++++++++++++++++------------------ > arch/riscv/kvm/vmid.c | 8 +++----- > 3 files changed, 44 insertions(+), 26 deletions(-) > > -- > 2.40.1 > Queued this series for Linux-6.18 Regards, Anup From seanjc at google.com Fri Sep 5 01:23:17 2025 From: seanjc at google.com (Sean Christopherson) Date: Fri, 5 Sep 2025 01:23:17 -0700 Subject: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memselot In-Reply-To: References: <20250829-pmu_event_info-v5-0-9dca26139a33@rivosinc.com> <20250829-pmu_event_info-v5-6-9dca26139a33@rivosinc.com> Message-ID: On Wed, Sep 03, 2025, Atish Kumar Patra wrote: > On Fri, Aug 29, 2025 at 1:47?PM Sean Christopherson wrote: > > > > On Fri, Aug 29, 2025, Atish Patra wrote: > 
> > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa) > > > +{ > > > + bool writable; > > > + unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable); > > > + > > > + return !kvm_is_error_hva(hva) && writable; > > > > I don't hate this API, but I don't love it either. Because knowing that the > > _memslot_ is writable doesn't mean all that much. E.g. in this usage: > > > > hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable); > > if (kvm_is_error_hva(hva) || !writable) > > return SBI_ERR_INVALID_ADDRESS; > > > > ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); > > if (ret) > > return SBI_ERR_FAILURE; > > > > the error code returned to the guest will be different if the memslot is read-only > > versus if the VMA is read-only (or not even mapped!). Unless every read-only > > memslot is explicitly communicated as such to the guest, I don't see how the guest > > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case > > but not when the underlying VMA isn't writable seems odd. > > > > It's also entirely possible the memslot could be replaced with a read-only memslot > > after the check, or vice versa, i.e. become writable after being rejected. Is it > > *really* a problem to return FAILURE if the guest attempts to setup steal-time in > > a read-only memslot? I.e. why not do this and call it good? > > > > Reposting the response as gmail converted my previous response as > html. Sorry for the spam. > > From a functionality pov, that should be fine. However, we have > explicit error conditions for read only memory defined in the SBI STA > specification[1]. > Technically, we will violate the spec if we return FAILURE instead of > INVALID_ADDRESS for read only memslot. 
But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the memslot lookup and so could encounter a read-only memslot (if it races with a memslot update), and because the underlying memory could be read-only even if the memslot is writable. Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure? The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the userspace mapping is completely missing, but AFAICT that doesn't seem to be an outright spec violation. From guoren at kernel.org Fri Sep 5 02:24:32 2025 From: guoren at kernel.org (Guo Ren) Date: Fri, 5 Sep 2025 17:24:32 +0800 Subject: [PATCH V4 RESEND 3/3] RISC-V: KVM: Prevent HGATP_MODE_BARE passed In-Reply-To: References: <20250821142542.2472079-1-guoren@kernel.org> <20250821142542.2472079-4-guoren@kernel.org> Message-ID: On Fri, Sep 5, 2025 at 2:51 PM Anup Patel wrote: > > On Thu, Aug 21, 2025 at 7:56 PM wrote: > > > > From: "Guo Ren (Alibaba DAMO Academy)" > > > > urrent kvm_riscv_gstage_mode_detect() assumes H-extension must > > s/urrent/Current/ Oh, my fault about copy & paste. > > > have HGATP_MODE_SV39X4/SV32X4 at least, but the spec allows > > H-extension with HGATP_MODE_BARE alone. The KVM depends on > > !HGATP_MODE_BARE at least, so enhance the gstage-mode-detect > > to block HGATP_MODE_BARE. > > > > Move gstage-mode-check closer to gstage-mode-detect to prevent > > unnecessary init. 
> > > > Reviewed-by: Troy Mitchell > > Reviewed-by: Nutty Liu > > Signed-off-by: Guo Ren (Alibaba DAMO Academy) > > --- > > arch/riscv/kvm/gstage.c | 27 ++++++++++++++++++++++++--- > > arch/riscv/kvm/main.c | 35 +++++++++++++++++------------------ > > 2 files changed, 41 insertions(+), 21 deletions(-) > > > > diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c > > index 24c270d6d0e2..b67d60d722c2 100644 > > --- a/arch/riscv/kvm/gstage.c > > +++ b/arch/riscv/kvm/gstage.c > > @@ -321,7 +321,7 @@ void __init kvm_riscv_gstage_mode_detect(void) > > if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV57X4) { > > kvm_riscv_gstage_mode = HGATP_MODE_SV57X4; > > kvm_riscv_gstage_pgd_levels = 5; > > - goto skip_sv48x4_test; > > + goto done; > > } > > > > /* Try Sv48x4 G-stage mode */ > > @@ -329,10 +329,31 @@ void __init kvm_riscv_gstage_mode_detect(void) > > if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV48X4) { > > kvm_riscv_gstage_mode = HGATP_MODE_SV48X4; > > kvm_riscv_gstage_pgd_levels = 4; > > + goto done; > > } > > -skip_sv48x4_test: > > > > + /* Try Sv39x4 G-stage mode */ > > + csr_write(CSR_HGATP, HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT); > > + if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV39X4) { > > + kvm_riscv_gstage_mode = HGATP_MODE_SV39X4; > > + kvm_riscv_gstage_pgd_levels = 3; > > + goto done; > > + } > > +#else /* CONFIG_32BIT */ > > + /* Try Sv32x4 G-stage mode */ > > + csr_write(CSR_HGATP, HGATP_MODE_SV32X4 << HGATP_MODE_SHIFT); > > + if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV32X4) { > > + kvm_riscv_gstage_mode = HGATP_MODE_SV32X4; > > + kvm_riscv_gstage_pgd_levels = 2; > > + goto done; > > + } > > +#endif > > + > > + /* KVM depends on !HGATP_MODE_OFF */ > > + kvm_riscv_gstage_mode = HGATP_MODE_OFF; > > + kvm_riscv_gstage_pgd_levels = 0; > > + > > +done: > > csr_write(CSR_HGATP, 0); > > kvm_riscv_local_hfence_gvma_all(); > > -#endif > > } > > diff --git a/arch/riscv/kvm/main.c 
b/arch/riscv/kvm/main.c > > index 67c876de74ef..8ee7aaa74ddc 100644 > > --- a/arch/riscv/kvm/main.c > > +++ b/arch/riscv/kvm/main.c > > @@ -93,6 +93,23 @@ static int __init riscv_kvm_init(void) > > return rc; > > > > kvm_riscv_gstage_mode_detect(); > > + switch (kvm_riscv_gstage_mode) { > > + case HGATP_MODE_SV32X4: > > + str = "Sv32x4"; > > + break; > > + case HGATP_MODE_SV39X4: > > + str = "Sv39x4"; > > + break; > > + case HGATP_MODE_SV48X4: > > + str = "Sv48x4"; > > + break; > > + case HGATP_MODE_SV57X4: > > + str = "Sv57x4"; > > + break; > > + default: > > Need kvm_riscv_nacl_exit() here. Yes, it's another legacy problem, which lacks: kvm_riscv_aia_exit(); kvm_riscv_nacl_exit(); After we move it up, it still needs: kvm_riscv_nacl_exit(); I'm okay with it being fixed in this patch. > > > + return -ENODEV; > > + } > > + kvm_info("using %s G-stage page table format\n", str); > > Moving the kvm_info() over here now prints G-stage mode > before announcing availability of h-extension which looks odd. > It's better to keep kvm_info() in the same location and only > move the switch-case. okay. > > > > > kvm_riscv_gstage_vmid_detect(); > > > > @@ -135,24 +152,6 @@ static int __init riscv_kvm_init(void) > > (rc) ? slist : "no features"); > > } > > > > - switch (kvm_riscv_gstage_mode) { > > - case HGATP_MODE_SV32X4: > > - str = "Sv32x4"; > > - break; > > - case HGATP_MODE_SV39X4: > > - str = "Sv39x4"; > > - break; > > - case HGATP_MODE_SV48X4: > > - str = "Sv48x4"; > > - break; > > - case HGATP_MODE_SV57X4: > > - str = "Sv57x4"; > > - break; > > - default: > > - return -ENODEV; > > - } > > - kvm_info("using %s G-stage page table format\n", str); > > - > > kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits()); > > > > if (kvm_riscv_aia_available()) > > -- > > 2.40.1 > > > > Otherwise, this looks good to me. > > I will take care of minor comments mentioned above at the > time of merging this series. Thx for taking care. Nice!
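For readers following the thread, the probe order discussed above (Sv57x4, then Sv48x4, then Sv39x4, otherwise no usable mode) can be sketched outside the kernel as below. This is only an illustration under stated assumptions: struct fake_hart, fake_csr_write() and detect_gstage_mode() are hypothetical names, the CSR is simulated in memory, and an unsupported MODE write is modelled by keeping the old value, which is just one legal WARL behaviour. The MODE encodings and the shift of 60 are the RV64 values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RV64 hgatp.MODE encodings; HGATP_MODE_OFF is the "Bare" mode. */
#define HGATP_MODE_OFF     0ULL
#define HGATP_MODE_SV39X4  8ULL
#define HGATP_MODE_SV48X4  9ULL
#define HGATP_MODE_SV57X4  10ULL
#define HGATP_MODE_SHIFT   60

/* Hypothetical in-memory stand-in for a hart's hgatp CSR. */
struct fake_hart {
	uint64_t hgatp;
	uint64_t supported_modes;	/* bitmask indexed by MODE encoding */
};

/*
 * Simplified WARL model (an assumption of this sketch): a write with an
 * unsupported MODE leaves the register as it was.
 */
static void fake_csr_write(struct fake_hart *h, uint64_t val)
{
	uint64_t mode = val >> HGATP_MODE_SHIFT;

	if (mode == HGATP_MODE_OFF || (h->supported_modes & (1ULL << mode)))
		h->hgatp = val;
}

/* Write each candidate MODE, read it back, and stop at the first that sticks. */
static uint64_t detect_gstage_mode(struct fake_hart *h, int *pgd_levels)
{
	static const struct { uint64_t mode; int levels; } probes[] = {
		{ HGATP_MODE_SV57X4, 5 },
		{ HGATP_MODE_SV48X4, 4 },
		{ HGATP_MODE_SV39X4, 3 },
	};
	uint64_t found = HGATP_MODE_OFF;
	size_t i;

	assert(pgd_levels != NULL);
	*pgd_levels = 0;

	for (i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
		fake_csr_write(h, probes[i].mode << HGATP_MODE_SHIFT);
		if ((h->hgatp >> HGATP_MODE_SHIFT) == probes[i].mode) {
			found = probes[i].mode;
			*pgd_levels = probes[i].levels;
			break;
		}
	}

	/* Leave hgatp cleared whatever the outcome, as the "done:" label does. */
	fake_csr_write(h, 0);
	return found;
}
```

A caller would treat a return value of HGATP_MODE_OFF as "no usable G-stage mode" and bail out, which corresponds to the -ENODEV path in the switch statement quoted above.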
-- Best Regards Guo Ren From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:51 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:51 +0800 Subject: [PATCH v9 5/5] riscv: mm: Add uffd write-protect support In-Reply-To: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250905103651.489197-6-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software, this patch uses bit 60 for uffd-wp tracking Additionally for tracking the uffd-wp state as a PTE swap bit, we borrow bit 4 which is not involved into swap entry computation. Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 18 +++++++ arch/riscv/include/asm/pgtable.h | 67 +++++++++++++++++++++++++++ 3 files changed, 86 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 53b73e4bdf3f..f928768bb14a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -147,6 +147,7 @@ config RISCV select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD + select HAVE_ARCH_USERFAULTFD_WP if 64BIT && MMU && USERFAULTFD && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_VMAP_STACK if MMU && 64BIT select HAVE_ASM_MODVERSIONS select HAVE_CONTEXT_TRACKING_USER diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 8ffe81bf66d2..894b2a24fc49 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -38,6 +38,24 @@ #define _PAGE_SWP_SOFT_DIRTY 0 #endif /* CONFIG_MEM_SOFT_DIRTY */ +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */ +#define _PAGE_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + (1UL << 60) : 0) +/* + * Bit 4 is not involved into swap entry computation, so we + * can borrow it for swap page uffd-wp tracking. + */ +#define _PAGE_SWP_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + _PAGE_USER : 0) +#else +#define _PAGE_UFFD_WP 0 +#define _PAGE_SWP_UFFD_WP 0 +#endif + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index b2d00d129d81..94cc97d3dbff 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -416,6 +416,40 @@ static inline pte_t pte_wrprotect(pte_t pte) return __pte(pte_val(pte) & ~(_PAGE_WRITE)); } +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +#define pte_uffd_wp_available() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_UFFD_WP); +} + +static inline pte_t pte_mkuffd_wp(pte_t pte) +{ + return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP)); +} + +static inline pte_t pte_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP)); +} + +static inline bool pte_swp_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_mkuffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP)); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + /* static inline pte_t pte_mkread(pte_t pte) */ static inline pte_t pte_mkwrite_novma(pte_t pte) @@ -836,6 +870,38 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +static inline bool pmd_uffd_wp(pmd_t pmd) +{ + return pte_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd))); +} + 
+static inline bool pmd_swp_uffd_wp(pmd_t pmd) +{ + return pte_swp_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd))); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline bool pmd_soft_dirty(pmd_t pmd) { @@ -1053,6 +1119,7 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * bit 0: _PAGE_PRESENT (zero) * bit 1 to 2: (zero) * bit 3: _PAGE_SWP_SOFT_DIRTY + * bit 4: _PAGE_SWP_UFFD_WP * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:49 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:49 +0800 Subject: [PATCH v9 3/5] riscv: Add RISC-V Svrsw60t59b extension support In-Reply-To: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250905103651.489197-4-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software to use. Reviewed-by: Alexandre Ghiti Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 14 ++++++++++++++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 3 files changed, 16 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..d99df67cc7a4 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -862,6 +862,20 @@ config RISCV_ISA_ZICBOP If you don't know what to do here, say Y. +config RISCV_ISA_SVRSW60T59B + bool "Svrsw60t59b extension support for using PTE bits 60 and 59" + depends on MMU && 64BIT + depends on RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the Svrsw60t59b + extension and enable its usage. 
+ + The Svrsw60t59b extension allows to free the PTE reserved bits 60 + and 59 for software to use. + + If you don't know what to do here, say Y. + config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI def_bool y # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a..f98fcb5c17d5 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_SVRSW60T59B 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572..de29562096ff 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -540,6 +540,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), __RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC), + __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B), }; const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext); -- 2.34.1 From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:46 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:46 +0800 Subject: [PATCH v9 0/5] riscv: mm: Add soft-dirty and uffd-wp support Message-ID: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> This patchset adds support for Svrsw60t59b [1] extension which is ratified now, also soft dirty and userfaultfd write protect tracking for RISC-V. This patchset has been tested with kselftest mm suite in which soft-dirty, madv_populate, test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions are observed in any of the other tests. This patchset applies on top of v6.17-rc4. 
V9: - Add pte_soft_dirty/uffd_wp_available() API to allow dynamically checking if the PTE bit is available for the platform on which the kernel is running. V8: (https://lore.kernel.org/all/20250619065232.1786470-1-zhangchunyan@iscas.ac.cn/) - Rebase on v6.16-rc1; - Add dependencies to MMU && 64BIT for RISCV_ISA_SVRSW60T59B; - Use 'Svrsw60t59b' instead of 'SVRSW60T59B' in Kconfig help paragraph; - Add Alex's Reviewed-by tag in patch 1. V7: (https://lore.kernel.org/all/20250409095320.224100-1-zhangchunyan@iscas.ac.cn/) - Add Svrsw60t59b [1] extension support; - Have soft-dirty and uffd-wp depending on the Svrsw60t59b extension to avoid crashes on hardware which doesn't have this extension. V6: - Changes to use bits 59-60 which are supported by extension Svrsw60t59b for soft dirty and userfaultfd write protect tracking. V5: - Fixed typos and corrected some words in Kconfig and commit message; - Removed pte_wrprotect() from pte_swp_mkuffd_wp(), this is a copy-paste error; - Added Alex's Reviewed-by tag in patch 2. V4: - Added bit(4) descriptions into "Format of swap PTE". V3: - Fixed the issue reported by kernel test robot. V1 -> V2: - Add uffd-wp support; - Make soft-dirty, uffd-wp and devmap mutually exclusive which all use the same PTE bit; - Add test results of CRIU in the cover-letter. 
[1] https://github.com/riscv-non-isa/riscv-iommu/pull/543 Chunyan Zhang (5): mm: softdirty: Add pte_soft_dirty_available() mm: uffd_wp: Add pte_uffd_wp_available() riscv: Add RISC-V Svrsw60t59b extension support riscv: mm: Add soft-dirty page tracking support riscv: mm: Add uffd write-protect support arch/riscv/Kconfig | 16 +++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/pgtable-bits.h | 37 +++++++ arch/riscv/include/asm/pgtable.h | 140 +++++++++++++++++++++++++- arch/riscv/kernel/cpufeature.c | 1 + fs/proc/task_mmu.c | 12 ++- fs/userfaultfd.c | 25 +++-- include/asm-generic/pgtable_uffd.h | 12 +++ include/linux/mm_inline.h | 6 +- include/linux/pgtable.h | 10 ++ include/linux/userfaultfd_k.h | 44 +++++--- mm/debug_vm_pgtable.c | 9 +- mm/huge_memory.c | 13 +-- mm/internal.h | 2 +- mm/memory.c | 6 +- mm/mremap.c | 13 +-- mm/userfaultfd.c | 12 +-- 17 files changed, 302 insertions(+), 57 deletions(-) -- 2.34.1 From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:50 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:50 +0800 Subject: [PATCH v9 4/5] riscv: mm: Add soft-dirty page tracking support In-Reply-To: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250905103651.489197-5-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software, this patch uses bit 59 for soft-dirty. To add swap PTE soft-dirty tracking, we borrow bit 3 which is available for swap PTEs on RISC-V systems. 
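As a rough illustration of the bit usage described in this commit message (not the patch itself), the sketch below models PTEs as plain 64-bit integers. The has_svrsw60t59b flag is a hypothetical stand-in for the kernel's riscv_has_extension_unlikely() check, and all helper names are local to the example. Bit 7 is the hardware D bit and bit 3 is the X bit, which is free for reuse in a swap PTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_EXEC       (1ULL << 3)	/* X bit; reused as swap soft-dirty */
#define PAGE_DIRTY      (1ULL << 7)	/* hardware D bit */
#define SOFT_DIRTY_BIT  (1ULL << 59)	/* freed for software by Svrsw60t59b */

/* Hypothetical stand-in for the runtime ISA-extension probe. */
static bool has_svrsw60t59b;

/* Both masks collapse to 0 when the extension is absent, so every helper below becomes a no-op. */
static uint64_t page_soft_dirty(void)
{
	return has_svrsw60t59b ? SOFT_DIRTY_BIT : 0;
}

static uint64_t page_swp_soft_dirty(void)
{
	return has_svrsw60t59b ? PAGE_EXEC : 0;
}

/* Marking a PTE dirty also marks it soft-dirty, as in the patched pte_mkdirty(). */
static uint64_t pte_mkdirty(uint64_t pte)
{
	return pte | PAGE_DIRTY | page_soft_dirty();
}

static bool pte_soft_dirty(uint64_t pte)
{
	return (pte & page_soft_dirty()) != 0;
}

static uint64_t pte_clear_soft_dirty(uint64_t pte)
{
	return pte & ~page_soft_dirty();
}

static uint64_t pte_swp_mksoft_dirty(uint64_t pte)
{
	return pte | page_swp_soft_dirty();
}

static bool pte_swp_soft_dirty(uint64_t pte)
{
	return (pte & page_swp_soft_dirty()) != 0;
}
```

Because the masks evaluate to zero without the extension, hardware lacking Svrsw60t59b never has a reserved PTE bit set, which is the crash-avoidance point made in the V7 changelog entry.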
Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 19 +++++++ arch/riscv/include/asm/pgtable.h | 73 ++++++++++++++++++++++++++- 3 files changed, 91 insertions(+), 2 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index d99df67cc7a4..53b73e4bdf3f 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -141,6 +141,7 @@ config RISCV select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET select HAVE_ARCH_SECCOMP_FILTER + select HAVE_ARCH_SOFT_DIRTY if 64BIT && MMU && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 179bd4afece4..8ffe81bf66d2 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -19,6 +19,25 @@ #define _PAGE_SOFT (3 << 8) /* Reserved for software */ #define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */ + +#ifdef CONFIG_MEM_SOFT_DIRTY + +/* ext_svrsw60t59b: bit 59 for software dirty tracking */ +#define _PAGE_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 59) : 0) +/* + * Bit 3 is always zero for swap entry computation, so we + * can borrow it for swap page soft-dirty tracking. + */ +#define _PAGE_SWP_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + _PAGE_EXEC : 0) +#else +#define _PAGE_SOFT_DIRTY 0 +#define _PAGE_SWP_SOFT_DIRTY 0 +#endif /* CONFIG_MEM_SOFT_DIRTY */ + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 91697fbf1f90..b2d00d129d81 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -427,7 +427,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return __pte(pte_val(pte) | _PAGE_DIRTY); + return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkclean(pte_t pte) @@ -455,6 +455,40 @@ static inline pte_t pte_mkhuge(pte_t pte) return pte; } +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +#define pte_soft_dirty_available() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY)); +} + +static inline bool pte_swp_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY)); +} +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + #ifdef CONFIG_RISCV_ISA_SVNAPOT #define pte_leaf_size(pte) (pte_napot(pte) ? 
\ napot_cont_size(napot_cont_order(pte)) :\ @@ -802,6 +836,40 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +static inline bool pmd_soft_dirty(pmd_t pmd) +{ + return pte_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd))); +} + +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION +static inline bool pmd_swp_soft_dirty(pmd_t pmd) +{ + return pte_swp_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd))); +} +#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { @@ -983,7 +1051,8 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * * Format of swap PTE: * bit 0: _PAGE_PRESENT (zero) - * bit 1 to 3: _PAGE_LEAF (zero) + * bit 1 to 2: (zero) + * bit 3: _PAGE_SWP_SOFT_DIRTY * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:47 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:47 +0800 Subject: [PATCH v9 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250905103651.489197-2-zhangchunyan@iscas.ac.cn> Some platforms can customize the PTE soft dirty bit and make it unavailable even if the architecture allows providing the PTE resource. 
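The override pattern this patch relies on can be sketched in a few lines of standalone C. This is a simplified illustration, not kernel code: arch_has_soft_dirty_bit and move_soft_dirty_pte() are hypothetical names, CONFIG_MEM_SOFT_DIRTY is a plain constant standing in for IS_ENABLED(), and the bit position is borrowed from the RISC-V patches in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for IS_ENABLED(CONFIG_MEM_SOFT_DIRTY): a compile-time constant. */
#define CONFIG_MEM_SOFT_DIRTY 1

/* Hypothetical runtime probe result an architecture might expose. */
static bool arch_has_soft_dirty_bit;

/* "Arch header": overrides the generic helper with a runtime check. */
#define pte_soft_dirty_available() (arch_has_soft_dirty_bit)

/* "Generic header": supplies a default only if the arch did not define one. */
#ifndef pte_soft_dirty_available
#define pte_soft_dirty_available() (true)
#endif

/* Callers combine the build-time switch with the runtime availability check. */
static uint64_t move_soft_dirty_pte(uint64_t pte)
{
	if (CONFIG_MEM_SOFT_DIRTY && pte_soft_dirty_available())
		pte |= 1ULL << 59;
	return pte;
}
```

Architectures that never define the macro pick up the constant default, so the extra condition folds away at compile time for them.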
Signed-off-by: Chunyan Zhang --- fs/proc/task_mmu.c | 12 +++++++++++- include/linux/pgtable.h | 9 +++++++++ mm/debug_vm_pgtable.c | 9 +++++---- mm/huge_memory.c | 13 +++++++------ mm/internal.h | 2 +- mm/mremap.c | 13 +++++++------ mm/userfaultfd.c | 12 ++++++------ 7 files changed, 46 insertions(+), 24 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 29cca0e6d0ff..32ba2fb92975 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) * -Werror=unterminated-string-initialization warning * with GCC 15 */ - static const char mnemonics[BITS_PER_LONG][3] = { + static char mnemonics[BITS_PER_LONG][3] = { /* * In case if we meet a flag we don't know about. */ @@ -1129,6 +1129,11 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_SEALED)] = "sl", #endif }; +#ifdef CONFIG_MEM_SOFT_DIRTY + if (!pte_soft_dirty_available()) + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; +#endif + size_t i; seq_puts(m, "VmFlags: "); @@ -1531,6 +1536,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, static inline void clear_soft_dirty(struct vm_area_struct *vma, unsigned long addr, pte_t *pte) { + if (!pte_soft_dirty_available()) + return; /* * The soft-dirty tracker uses #PF-s to catch writes * to pages, so write-protect the pte as well. 
See the @@ -1566,6 +1573,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, { pmd_t old, pmd = *pmdp; + if (!pte_soft_dirty_available()) + return; + if (pmd_present(pmd)) { /* See comment in change_huge_pmd() */ old = pmdp_invalidate(vma, addr, pmdp); diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 4c035637eeb7..2a489647ac96 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) #endif #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY + +/* + * Some platforms can customize the PTE soft dirty bit and make it unavailable + * even if the architecture allows providing the PTE resource. + */ +#ifndef pte_soft_dirty_available +#define pte_soft_dirty_available() (true) +#endif + #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) { diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 830107b6dd08..98ed7e22ccec 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; pr_debug("Validating PTE soft dirty\n"); @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte; - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; pr_debug("Validating PTE swap soft dirty\n"); @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) { pmd_t pmd; - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; if (!has_transparent_hugepage()) @@ -735,7 +735,8 @@ static void __init pmd_swap_soft_dirty_tests(struct 
pgtable_debug_args *args) pmd_t pmd; if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) + !pte_soft_dirty_available() || + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) return; if (!has_transparent_hugepage()) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9c38a95e9f09..2cf001b2e950 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2271,12 +2271,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, static pmd_t move_soft_dirty_pmd(pmd_t pmd) { -#ifdef CONFIG_MEM_SOFT_DIRTY - if (unlikely(is_pmd_migration_entry(pmd))) - pmd = pmd_swp_mksoft_dirty(pmd); - else if (pmd_present(pmd)) - pmd = pmd_mksoft_dirty(pmd); -#endif + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && pte_soft_dirty_available()) { + if (unlikely(is_pmd_migration_entry(pmd))) + pmd = pmd_swp_mksoft_dirty(pmd); + else if (pmd_present(pmd)) + pmd = pmd_mksoft_dirty(pmd); + } + return pmd; } diff --git a/mm/internal.h b/mm/internal.h index 45b725c3dc03..8a5b20fac892 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) * will be constantly true. */ - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return false; /* diff --git a/mm/mremap.c b/mm/mremap.c index e618a706aff5..7c01320aea33 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -162,12 +162,13 @@ static pte_t move_soft_dirty_pte(pte_t pte) * Set soft dirty bit so we can notice * in userspace the ptes were moved. 
*/ -#ifdef CONFIG_MEM_SOFT_DIRTY - if (pte_present(pte)) - pte = pte_mksoft_dirty(pte); - else if (is_swap_pte(pte)) - pte = pte_swp_mksoft_dirty(pte); -#endif + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && pte_soft_dirty_available()) { + if (pte_present(pte)) + pte = pte_mksoft_dirty(pte); + else if (is_swap_pte(pte)) + pte = pte_swp_mksoft_dirty(pte); + } + return pte; } diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 45e6290e2e8b..0e07a983c513 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -1065,9 +1065,9 @@ static int move_present_pte(struct mm_struct *mm, orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot); /* Set soft dirty bit so userspace can notice the pte was moved */ -#ifdef CONFIG_MEM_SOFT_DIRTY - orig_dst_pte = pte_mksoft_dirty(orig_dst_pte); -#endif + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && pte_soft_dirty_available()) + orig_dst_pte = pte_mksoft_dirty(orig_dst_pte); + if (pte_dirty(orig_src_pte)) orig_dst_pte = pte_mkdirty(orig_dst_pte); orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma); @@ -1134,9 +1134,9 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma, } orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte); -#ifdef CONFIG_MEM_SOFT_DIRTY - orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte); -#endif + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && pte_soft_dirty_available()) + orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte); + set_pte_at(mm, dst_addr, dst_pte, orig_src_pte); double_pt_unlock(dst_ptl, src_ptl); -- 2.34.1 From zhangchunyan at iscas.ac.cn Fri Sep 5 03:36:48 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Fri, 5 Sep 2025 18:36:48 +0800 Subject: [PATCH v9 2/5] mm: uffd_wp: Add pte_uffd_wp_available() In-Reply-To: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250905103651.489197-3-zhangchunyan@iscas.ac.cn> Some platforms can customize the PTE uffd_wp bit and make it unavailable even if the 
architecture allows providing the PTE resource. This patch adds a macro API which allows architectures to define their specific one for checking if the PTE uffd_wp bit is available. Signed-off-by: Chunyan Zhang --- fs/userfaultfd.c | 25 +++++++++-------- include/asm-generic/pgtable_uffd.h | 12 ++++++++ include/linux/mm_inline.h | 6 ++-- include/linux/pgtable.h | 1 + include/linux/userfaultfd_k.h | 44 +++++++++++++++++++----------- mm/memory.c | 6 ++-- 6 files changed, 63 insertions(+), 31 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 54c6cc7fe9c6..68e5006e5158 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1270,9 +1270,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING) vm_flags |= VM_UFFD_MISSING; if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) { -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - goto out; -#endif + if (!IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) || + !pte_uffd_wp_available()) + goto out; + vm_flags |= VM_UFFD_WP; } if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR) { @@ -1980,14 +1981,16 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, uffdio_api.features &= ~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM); #endif -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; -#endif -#ifndef CONFIG_PTE_MARKER_UFFD_WP - uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; - uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; - uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; -#endif + if (!IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) || + !pte_uffd_wp_available()) + uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; + + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pte_uffd_wp_available()) { + uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; + uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; + uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; + } ret = -EINVAL; if 
(features & ~uffdio_api.features) diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h index 828966d4c281..b86a5ff447da 100644 --- a/include/asm-generic/pgtable_uffd.h +++ b/include/asm-generic/pgtable_uffd.h @@ -61,6 +61,18 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd) { return pmd; } +#define pte_uffd_wp_available() (false) +#else +/* + * Some platforms can customize the PTE uffd_wp bit and make it unavailable + * even if the architecture allows providing the PTE resource. + * This allows architectures to define their own API for checking if + * the PTE uffd_wp bit is available. + */ +#ifndef pte_uffd_wp_available +#define pte_uffd_wp_available() (true) +#endif + #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */ diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 89b518ff097e..a81055bb3f87 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -570,7 +570,9 @@ static inline bool pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, pte_t *pte, pte_t pteval) { -#ifdef CONFIG_PTE_MARKER_UFFD_WP + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) + return false; + bool arm_uffd_pte = false; /* The current status of the pte should be "cleared" before calling */ @@ -601,7 +603,7 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, make_pte_marker(PTE_MARKER_UFFD_WP)); return true; } -#endif + return false; } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 2a489647ac96..51f5b610c5ec 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1564,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) } #endif #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ +#define pte_soft_dirty_available() (false) static inline int pte_soft_dirty(pte_t pte) { return 0; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 
c0e716aec26a..ec4a815286c8 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -228,15 +228,15 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma, if (wp_async && (vm_flags == VM_UFFD_WP)) return true; -#ifndef CONFIG_PTE_MARKER_UFFD_WP /* * If user requested uffd-wp but not enabled pte markers for * uffd-wp, then shmem & hugetlbfs are not supported but only * anonymous. */ - if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) + if ((!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pte_uffd_wp_available()) && + (vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) return false; -#endif /* By default, allow any of anon|shmem|hugetlb */ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || @@ -437,8 +437,11 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma) static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - return is_pte_marker_entry(entry) && - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + if (pte_uffd_wp_available()) + return is_pte_marker_entry(entry) && + (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + else + return false; #else return false; #endif @@ -447,14 +450,19 @@ static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) static inline bool pte_marker_uffd_wp(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - swp_entry_t entry; + if (pte_uffd_wp_available()) { + swp_entry_t entry; - if (!is_swap_pte(pte)) - return false; + if (!is_swap_pte(pte)) + return false; - entry = pte_to_swp_entry(pte); + entry = pte_to_swp_entry(pte); + + return pte_marker_entry_uffd_wp(entry); + } else { + return false; + } - return pte_marker_entry_uffd_wp(entry); #else return false; #endif @@ -467,14 +475,18 @@ static inline bool pte_marker_uffd_wp(pte_t pte) static inline bool pte_swp_uffd_wp_any(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - if (!is_swap_pte(pte)) - return false; + if (pte_uffd_wp_available()) { + if (!is_swap_pte(pte)) + return false; 
- if (pte_swp_uffd_wp(pte)) - return true; + if (pte_swp_uffd_wp(pte)) + return true; - if (pte_marker_uffd_wp(pte)) - return true; + if (pte_marker_uffd_wp(pte)) + return true; + } else { + return false; + } #endif return false; } diff --git a/mm/memory.c b/mm/memory.c index 0ba4f6b71847..1c61b2d7bd4d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1465,7 +1465,9 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, { bool was_installed = false; -#ifdef CONFIG_PTE_MARKER_UFFD_WP + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) + return was_installed; + /* Zap on anonymous always means dropping everything */ if (vma_is_anonymous(vma)) return false; @@ -1482,7 +1484,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte++; addr += PAGE_SIZE; } -#endif + return was_installed; } -- 2.34.1 From wangruikang at iscas.ac.cn Fri Sep 5 04:09:34 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:34 +0800 Subject: [PATCH net-next v9 5/5] riscv: dts: spacemit: Add Ethernet support for Jupiter In-Reply-To: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905-net-k1-emac-v9-5-f1649b98a19c@iscas.ac.cn> Milk-V Jupiter uses an RGMII PHY for each port and uses GPIO for PHY reset. 
Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts | 46 +++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts index 4483192141049caa201c093fb206b6134a064f42..c5933555c06b66f40e61fe2b9c159ba0770c2fa1 100644 --- a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts +++ b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts @@ -20,6 +20,52 @@ chosen { }; }; +&eth0 { + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; +}; + +&eth1 { + phy-handle = <&rgmii1>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac1_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <250>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii1: phy@1 { + reg = <0x1>; + }; + }; +}; + &uart0 { pinctrl-names = "default"; pinctrl-0 = <&uart0_2_cfg>; -- 2.50.1 From wangruikang at iscas.ac.cn Fri Sep 5 04:09:32 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:32 +0800 Subject: [PATCH net-next v9 3/5] riscv: dts: spacemit: Add Ethernet support for K1 In-Reply-To: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905-net-k1-emac-v9-3-f1649b98a19c@iscas.ac.cn> Add nodes for each of the two Ethernet MACs on K1 with generic properties. Also add "gmac" pins to pinctrl config.
Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 48 ++++++++++++++++++++++++++++ arch/riscv/boot/dts/spacemit/k1.dtsi | 22 +++++++++++++ 2 files changed, 70 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi index 3810557374228100be7adab58cd785c72e6d4aed..aff19c86d5ff381881016eaa87fc4809da65b50e 100644 --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi @@ -11,6 +11,54 @@ #define K1_GPIO(x) (x / 32) (x % 32) &pinctrl { + gmac0_cfg: gmac0-cfg { + gmac0-pins { + pinmux = , /* gmac0_rxdv */ + , /* gmac0_rx_d0 */ + , /* gmac0_rx_d1 */ + , /* gmac0_rx_clk */ + , /* gmac0_rx_d2 */ + , /* gmac0_rx_d3 */ + , /* gmac0_tx_d0 */ + , /* gmac0_tx_d1 */ + , /* gmac0_tx */ + , /* gmac0_tx_d2 */ + , /* gmac0_tx_d3 */ + , /* gmac0_tx_en */ + , /* gmac0_mdc */ + , /* gmac0_mdio */ + , /* gmac0_int_n */ + ; /* gmac0_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + + gmac1_cfg: gmac1-cfg { + gmac1-pins { + pinmux = , /* gmac1_rxdv */ + , /* gmac1_rx_d0 */ + , /* gmac1_rx_d1 */ + , /* gmac1_rx_clk */ + , /* gmac1_rx_d2 */ + , /* gmac1_rx_d3 */ + , /* gmac1_tx_d0 */ + , /* gmac1_tx_d1 */ + , /* gmac1_tx */ + , /* gmac1_tx_d2 */ + , /* gmac1_tx_d3 */ + , /* gmac1_tx_en */ + , /* gmac1_mdc */ + , /* gmac1_mdio */ + , /* gmac1_int_n */ + ; /* gmac1_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + uart0_2_cfg: uart0-2-cfg { uart0-2-pins { pinmux = , diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..7b2ac3637d6d9fa1929418cc68aa25c57850ac7f 100644 --- a/arch/riscv/boot/dts/spacemit/k1.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1.dtsi @@ -805,6 +805,28 @@ network-bus { #size-cells = <2>; dma-ranges = <0x0 0x00000000 0x0 0x00000000 0x0 0x80000000>, <0x0 0x80000000 0x1 0x00000000 0x0 0x80000000>; + + 
eth0: ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac80000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + status = "disabled"; + }; + + eth1: ethernet@cac81000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac81000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC1_BUS>; + interrupts = <133>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC1>; + spacemit,apmu = <&syscon_apmu 0x3ec>; + status = "disabled"; + }; }; pcie-bus { -- 2.50.1 From wangruikang at iscas.ac.cn Fri Sep 5 04:09:30 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:30 +0800 Subject: [PATCH net-next v9 1/5] dt-bindings: net: Add support for SpacemiT K1 In-Reply-To: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905-net-k1-emac-v9-1-f1649b98a19c@iscas.ac.cn> The Ethernet MACs on SpacemiT K1 appear to be a custom design. SpacemiT refers to them as "EMAC", so let's just call them "spacemit,k1-emac".
Signed-off-by: Vivian Wang Reviewed-by: Conor Dooley --- .../devicetree/bindings/net/spacemit,k1-emac.yaml | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml new file mode 100644 index 0000000000000000000000000000000000000000..500a3e1daa230ea3a1fad30d8ea56a7822fccb3d --- /dev/null +++ b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml @@ -0,0 +1,81 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/spacemit,k1-emac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: SpacemiT K1 Ethernet MAC + +allOf: + - $ref: ethernet-controller.yaml# + +maintainers: + - Vivian Wang + +properties: + compatible: + const: spacemit,k1-emac + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + interrupts: + maxItems: 1 + + mdio-bus: + $ref: mdio.yaml# + unevaluatedProperties: false + + resets: + maxItems: 1 + + spacemit,apmu: + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: phandle to syscon that controls this MAC + - description: offset of control registers + description: + A phandle to syscon with byte offset to control registers for this MAC + +required: + - compatible + - reg + - clocks + - interrupts + - resets + - spacemit,apmu + +unevaluatedProperties: false + +examples: + - | + #include + + ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0xcac80000 0x00000420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + resets = <&syscon_apmu RESET_EMAC0>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + rgmii0:
phy@1 { + reg = <0x1>; + }; + }; + }; -- 2.50.1 From wangruikang at iscas.ac.cn Fri Sep 5 04:09:33 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:33 +0800 Subject: [PATCH net-next v9 4/5] riscv: dts: spacemit: Add Ethernet support for BPI-F3 In-Reply-To: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905-net-k1-emac-v9-4-f1649b98a19c@iscas.ac.cn> Banana Pi BPI-F3 uses an RGMII PHY for each port and uses GPIO for PHY reset. Tested-by: Hendrik Hamerlinck Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts | 46 +++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts index fe22c747c5012fe56d42ac8a7efdbbdb694f31b6..15fa4a5ebd043f3fbb115d37e5a980c9b773a228 100644 --- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts +++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts @@ -40,6 +40,52 @@ &emmc { status = "okay"; }; +&eth0 { + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; +}; + +&eth1 { + phy-handle = <&rgmii1>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac1_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <250>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii1: phy@1 { + reg = <0x1>; + }; + }; +}; + &uart0 { pinctrl-names =
"default"; pinctrl-0 = <&uart0_2_cfg>; -- 2.50.1 From wangruikang at iscas.ac.cn Fri Sep 5 04:09:29 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:29 +0800 Subject: [PATCH net-next v9 0/5] Add Ethernet MAC support for SpacemiT K1 Message-ID: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> SpacemiT K1 has two gigabit Ethernet MACs with RGMII and RMII support. Add devicetree bindings, driver, and DTS for it. Tested primarily on BananaPi BPI-F3. Basic TX/RX functionality also tested on Milk-V Jupiter. I would like to note that even though some bit field names superficially resemble that of DesignWare MAC, all other differences point to it in fact being a custom design. Based on SpacemiT drivers [1]. These patches are also available at: https://github.com/dramforever/linux/tree/k1/ethernet/v9 [1]: https://github.com/spacemit-com/linux-k1x --- Changes in v9: - Refactor to use phy_interface_mode_is_rgmii - Minor changes - Use netdev_err in more places - Print phy-mode by name on unsupported phy-mode - Link to v8: https://lore.kernel.org/r/20250828-net-k1-emac-v8-0-e9075dd2ca90 at iscas.ac.cn Changes in v8: - Use devres to do of_phy_deregister_fixed_link on probe failure or remove - Simplified control flow in a few places with early return or continue - Minor changes - Removed some unneeded parens in emac_configure_{tx,rx} - Link to v7: https://lore.kernel.org/r/20250826-net-k1-emac-v7-0-5bc158d086ae at iscas.ac.cn Changes in v7: - Removed scoped_guard usage - Renamed error handling path labels after destinations - Fix skb free error handling path in emac_start_xmit and emac_tx_mem_map - Cancel tx_timeout_task to prevent schedule_work lifetime problems - Minor changes: - Remove unnecessary timer_delete_sync in emac_down - Use dev_err_ratelimited in a few more places - Cosmetic fixes in error messages - Link to v6: https://lore.kernel.org/r/20250820-net-k1-emac-v6-0-c1e28f2b8be5 at iscas.ac.cn Changes in v6: - Implement pause frame 
support - Minor changes: - Convert comment for emac_stats_update() into assert_spin_locked() - Cosmetic fixes for some comments and whitespace - emac_set_mac_addr() is now refactored - Link to v5: https://lore.kernel.org/r/20250812-net-k1-emac-v5-0-dd17c4905f49@iscas.ac.cn Changes in v5: - Rebased on v6.17-rc1, add back DTS now that they apply cleanly - Use standard statistics interface, handle 32-bit statistics overflow - Minor changes: - Fix clock resource handling in emac_resume - Ratelimit the message in emac_rx_frame_status - Add ndo_validate_addr = eth_validate_addr - Remove unnecessary parens in emac_set_mac_addr - Change some functions that never fail to return void instead of int - Minor rewording - Link to v4: https://lore.kernel.org/r/20250703-net-k1-emac-v4-0-686d09c4cfa8@iscas.ac.cn Changes in v4: - Resource handling on probe and remove: timer_delete_sync and of_phy_deregister_fixed_link - Drop DTS changes and dependencies (will send through SpacemiT tree) - Minor changes: - Remove redundant phy_stop() and setting of ndev->phydev - Fix error checking for emac_open in emac_resume - Fix one missed dev_err -> dev_err_probe - Fix type of emac_start_xmit - Fix one missed reverse xmas tree formatting - Rename some functions for consistency between emac_* and ndo_* - Link to v3: https://lore.kernel.org/r/20250702-net-k1-emac-v3-0-882dc55404f3@iscas.ac.cn Changes in v3: - Refactored and simplified emac_tx_mem_map - Addressed other minor v2 review comments - Removed what was patch 3 in v2, depend on DMA buses instead - DT nodes in alphabetical order where appropriate - Link to v2: https://lore.kernel.org/r/20250618-net-k1-emac-v2-0-94f5f07227a8@iscas.ac.cn Changes in v2: - dts: Put eth0 and eth1 nodes under a bus with dma-ranges - dts: Added Milk-V Jupiter - Fix typo in emac_init_hw() that broke the driver (Oops!)
- Reformatted line lengths to under 80 - Addressed other v1 review comments - Link to v1: https://lore.kernel.org/r/20250613-net-k1-emac-v1-0-cc6f9e510667@iscas.ac.cn --- Vivian Wang (5): dt-bindings: net: Add support for SpacemiT K1 net: spacemit: Add K1 Ethernet MAC riscv: dts: spacemit: Add Ethernet support for K1 riscv: dts: spacemit: Add Ethernet support for BPI-F3 riscv: dts: spacemit: Add Ethernet support for Jupiter .../devicetree/bindings/net/spacemit,k1-emac.yaml | 81 + arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts | 46 + arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts | 46 + arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 48 + arch/riscv/boot/dts/spacemit/k1.dtsi | 22 + drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/spacemit/Kconfig | 29 + drivers/net/ethernet/spacemit/Makefile | 6 + drivers/net/ethernet/spacemit/k1_emac.c | 2183 ++++++++++++++++++++ drivers/net/ethernet/spacemit/k1_emac.h | 426 ++++ 11 files changed, 2889 insertions(+) --- base-commit: 062b3e4a1f880f104a8d4b90b767788786aa7b78 change-id: 20250606-net-k1-emac-3e181508ea64 Best regards, -- Vivian "dramforever" Wang From wangruikang at iscas.ac.cn Fri Sep 5 04:09:31 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 05 Sep 2025 19:09:31 +0800 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> The Ethernet MACs found on SpacemiT K1 appear to be a custom design that only superficially resembles some other embedded MACs. SpacemiT refers to them as "EMAC", so let's just call the driver "k1_emac". Supports RGMII and RMII interfaces. Includes support for MAC hardware statistics counters. PTP support is not implemented.
Signed-off-by: Vivian Wang Reviewed-by: Maxime Chevallier Reviewed-by: Vadim Fedorenko Reviewed-by: Troy Mitchell Tested-by: Junhui Liu Tested-by: Troy Mitchell --- drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/spacemit/Kconfig | 29 + drivers/net/ethernet/spacemit/Makefile | 6 + drivers/net/ethernet/spacemit/k1_emac.c | 2183 +++++++++++++++++++++++++++++++ drivers/net/ethernet/spacemit/k1_emac.h | 426 ++++++ 6 files changed, 2646 insertions(+) diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index f86d4557d8d7756a5e27bc17578353b5c19ca108..aead145dd91d129b7bb410f2d4d754c744dddbf4 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -188,6 +188,7 @@ source "drivers/net/ethernet/sis/Kconfig" source "drivers/net/ethernet/sfc/Kconfig" source "drivers/net/ethernet/smsc/Kconfig" source "drivers/net/ethernet/socionext/Kconfig" +source "drivers/net/ethernet/spacemit/Kconfig" source "drivers/net/ethernet/stmicro/Kconfig" source "drivers/net/ethernet/sun/Kconfig" source "drivers/net/ethernet/sunplus/Kconfig" diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile index 67182339469a0d8337cc4e92aa51e498c615156d..998dd628b202ced212748450753fe180f0440c74 100644 --- a/drivers/net/ethernet/Makefile +++ b/drivers/net/ethernet/Makefile @@ -91,6 +91,7 @@ obj-$(CONFIG_NET_VENDOR_SOLARFLARE) += sfc/ obj-$(CONFIG_NET_VENDOR_SGI) += sgi/ obj-$(CONFIG_NET_VENDOR_SMSC) += smsc/ obj-$(CONFIG_NET_VENDOR_SOCIONEXT) += socionext/ +obj-$(CONFIG_NET_VENDOR_SPACEMIT) += spacemit/ obj-$(CONFIG_NET_VENDOR_STMICRO) += stmicro/ obj-$(CONFIG_NET_VENDOR_SUN) += sun/ obj-$(CONFIG_NET_VENDOR_SUNPLUS) += sunplus/ diff --git a/drivers/net/ethernet/spacemit/Kconfig b/drivers/net/ethernet/spacemit/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..85ef61a9b4eff4249ad2d32a6e7dbf283b0c180f --- /dev/null +++ b/drivers/net/ethernet/spacemit/Kconfig @@ -0,0 +1,29 @@ +config 
NET_VENDOR_SPACEMIT + bool "SpacemiT devices" + default y + depends on ARCH_SPACEMIT || COMPILE_TEST + help + If you have a network (Ethernet) device belonging to this class, + say Y. + + Note that the answer to this question does not directly affect + the kernel: saying N will just cause the configurator to skip all + the questions regarding SpacemiT devices. If you say Y, you will + be asked for your specific chipset/driver in the following questions. + +if NET_VENDOR_SPACEMIT + +config SPACEMIT_K1_EMAC + tristate "SpacemiT K1 Ethernet MAC driver" + depends on ARCH_SPACEMIT || COMPILE_TEST + depends on MFD_SYSCON + depends on OF + default m if ARCH_SPACEMIT + select PHYLIB + help + This driver supports the Ethernet MAC in the SpacemiT K1 SoC. + + To compile this driver as a module, choose M here: the module + will be called k1_emac. + +endif # NET_VENDOR_SPACEMIT diff --git a/drivers/net/ethernet/spacemit/Makefile b/drivers/net/ethernet/spacemit/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..d29efd997a4ff5dcb50986e439997df7e3650570 --- /dev/null +++ b/drivers/net/ethernet/spacemit/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the SpacemiT network device drivers. +# + +obj-$(CONFIG_SPACEMIT_K1_EMAC) += k1_emac.o diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c new file mode 100644 index 0000000000000000000000000000000000000000..f626aa346dde93054c5fbf483d976bd37d01a609 --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.c @@ -0,0 +1,2183 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SpacemiT K1 Ethernet driver + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. 
Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "k1_emac.h" + +#define DRIVER_NAME "k1_emac" + +#define EMAC_DEFAULT_BUFSIZE 1536 +#define EMAC_RX_BUF_2K 2048 +#define EMAC_RX_BUF_4K 4096 + +/* Tuning parameters from SpacemiT */ +#define EMAC_TX_FRAMES 64 +#define EMAC_TX_COAL_TIMEOUT 40000 +#define EMAC_RX_FRAMES 64 +#define EMAC_RX_COAL_TIMEOUT (600 * 312) + +#define DEFAULT_FC_PAUSE_TIME 0xffff +#define DEFAULT_FC_FIFO_HIGH 1600 +#define DEFAULT_TX_ALMOST_FULL 0x1f8 +#define DEFAULT_TX_THRESHOLD 1518 +#define DEFAULT_RX_THRESHOLD 12 +#define DEFAULT_TX_RING_NUM 1024 +#define DEFAULT_RX_RING_NUM 1024 +#define DEFAULT_DMA_BURST MREGBIT_BURST_16WORD +#define HASH_TABLE_SIZE 64 + +enum rx_frame_status { + RX_FRAME_OK, + RX_FRAME_DISCARD, +}; + +struct desc_buf { + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +struct emac_tx_desc_buffer { + struct sk_buff *skb; + struct desc_buf buf[2]; +}; + +struct emac_rx_desc_buffer { + struct sk_buff *skb; + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +/** + * struct emac_desc_ring - Software-side information for one descriptor ring + * Same structure used for both RX and TX + * @desc_addr: Virtual address to the descriptor ring memory + * @desc_dma_addr: DMA address of the descriptor ring + * @total_size: Size of ring in bytes + * @total_cnt: Number of descriptors + * @head: Next descriptor to associate a buffer with + * @tail: Next descriptor to check status bit + * @rx_desc_buf: Array of descriptors for RX + * @tx_desc_buf: Array of descriptors for TX, with max of two buffers each + */ +struct emac_desc_ring { + void *desc_addr; + dma_addr_t desc_dma_addr; + u32 total_size; + u32 total_cnt; + u32 head; + u32 tail; + union { + 
struct emac_rx_desc_buffer *rx_desc_buf; + struct emac_tx_desc_buffer *tx_desc_buf; + }; +}; + +struct emac_priv { + void __iomem *iobase; + u32 dma_buf_sz; + struct emac_desc_ring tx_ring; + struct emac_desc_ring rx_ring; + + struct net_device *ndev; + struct napi_struct napi; + struct platform_device *pdev; + struct clk *bus_clk; + struct clk *ref_clk; + struct regmap *regmap_apmu; + u32 regmap_apmu_offset; + int irq; + + phy_interface_t phy_interface; + + struct emac_hw_tx_stats tx_stats, tx_stats_off; + struct emac_hw_rx_stats rx_stats, rx_stats_off; + + u32 tx_count_frames; + u32 tx_coal_frames; + u32 tx_coal_timeout; + struct work_struct tx_timeout_task; + + struct timer_list txtimer; + struct timer_list stats_timer; + + u32 tx_delay; + u32 rx_delay; + + bool flow_control_autoneg; + u8 flow_control; + + /* Hold for any statistics operation */ + spinlock_t stats_lock; +}; + +static void emac_wr(struct emac_priv *priv, u32 reg, u32 val) +{ + writel(val, priv->iobase + reg); +} + +static int emac_rd(struct emac_priv *priv, u32 reg) +{ + return readl(priv->iobase + reg); +} + +static int emac_phy_interface_config(struct emac_priv *priv) +{ + u32 val = 0, mask = REF_CLK_SEL | RGMII_TX_CLK_SEL | PHY_INTF_RGMII; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) + val |= PHY_INTF_RGMII; + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + mask, val); + + return 0; +} + +/* + * Where the hardware expects a MAC address, it is laid out in this high, med, + * low order in three consecutive registers and in this format. 
+ */ + +static void emac_set_mac_addr_reg(struct emac_priv *priv, + const unsigned char *addr, + u32 reg) +{ + emac_wr(priv, reg + sizeof(u32) * 0, addr[1] << 8 | addr[0]); + emac_wr(priv, reg + sizeof(u32) * 1, addr[3] << 8 | addr[2]); + emac_wr(priv, reg + sizeof(u32) * 2, addr[5] << 8 | addr[4]); +} + +static void emac_set_mac_addr(struct emac_priv *priv, const unsigned char *addr) +{ + /* We use only one address, so set the same for flow control as well */ + emac_set_mac_addr_reg(priv, addr, MAC_ADDRESS1_HIGH); + emac_set_mac_addr_reg(priv, addr, MAC_FC_SOURCE_ADDRESS_HIGH); +} + +static void emac_reset_hw(struct emac_priv *priv) +{ + /* Disable all interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Disable DMA */ + emac_wr(priv, DMA_CONTROL, 0x0); +} + +static void emac_init_hw(struct emac_priv *priv) +{ + /* Destination address for 802.3x Ethernet flow control */ + u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; + + u32 rxirq = 0, dma = 0; + + regmap_set_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + AXI_SINGLE_ID); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Enable MAC address 1 filtering */ + emac_wr(priv, MAC_ADDRESS_CONTROL, MREGBIT_MAC_ADDRESS1_ENABLE); + + /* Zero initialize the multicast hash table */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); + + /* Configure thresholds */ + emac_wr(priv, MAC_TRANSMIT_FIFO_ALMOST_FULL, DEFAULT_TX_ALMOST_FULL); + emac_wr(priv, MAC_TRANSMIT_PACKET_START_THRESHOLD, + DEFAULT_TX_THRESHOLD); + emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, 
DEFAULT_RX_THRESHOLD); + + /* Configure flow control (enabled in emac_adjust_link() later) */ + emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); + emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); + emac_wr(priv, MAC_FC_HIGH_PAUSE_TIME, DEFAULT_FC_PAUSE_TIME); + emac_wr(priv, MAC_FC_PAUSE_LOW_THRESHOLD, 0); + + /* RX IRQ mitigation */ + rxirq = EMAC_RX_FRAMES & MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK; + rxirq |= (EMAC_RX_COAL_TIMEOUT + << MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT) & + MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK; + + rxirq |= MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE; + emac_wr(priv, DMA_RECEIVE_IRQ_MITIGATION_CTRL, rxirq); + + /* Disable and set DMA config */ + emac_wr(priv, DMA_CONTROL, 0x0); + + emac_wr(priv, DMA_CONFIGURATION, MREGBIT_SOFTWARE_RESET); + usleep_range(9000, 10000); + emac_wr(priv, DMA_CONFIGURATION, 0x0); + usleep_range(9000, 10000); + + dma |= MREGBIT_STRICT_BURST; + dma |= MREGBIT_DMA_64BIT_MODE; + dma |= DEFAULT_DMA_BURST; + + emac_wr(priv, DMA_CONFIGURATION, dma); +} + +static void emac_dma_start_transmit(struct emac_priv *priv) +{ + /* The actual value written does not matter */ + emac_wr(priv, DMA_TRANSMIT_POLL_DEMAND, 1); +} + +static void emac_enable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val |= MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val |= MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static void emac_disable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val &= ~MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val &= ~MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static u32 emac_tx_avail(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 avail; + + if (tx_ring->tail > tx_ring->head) + avail = tx_ring->tail - tx_ring->head - 1; + else + avail = 
tx_ring->total_cnt - tx_ring->head + tx_ring->tail - 1; + + return avail; +} + +static void emac_tx_coal_timer_resched(struct emac_priv *priv) +{ + mod_timer(&priv->txtimer, + jiffies + usecs_to_jiffies(priv->tx_coal_timeout)); +} + +static void emac_tx_coal_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, txtimer); + + napi_schedule(&priv->napi); +} + +static bool emac_tx_should_interrupt(struct emac_priv *priv, u32 pkt_num) +{ + priv->tx_count_frames += pkt_num; + if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { + emac_tx_coal_timer_resched(priv); + return false; + } + + priv->tx_count_frames = 0; + return true; +} + +static void emac_free_tx_buf(struct emac_priv *priv, int i) +{ + struct emac_tx_desc_buffer *tx_buf; + struct emac_desc_ring *tx_ring; + struct desc_buf *buf; + int j; + + tx_ring = &priv->tx_ring; + tx_buf = &tx_ring->tx_desc_buf[i]; + + for (j = 0; j < 2; j++) { + buf = &tx_buf->buf[j]; + if (!buf->dma_addr) + continue; + + if (buf->map_as_page) + dma_unmap_page(&priv->pdev->dev, buf->dma_addr, + buf->dma_len, DMA_TO_DEVICE); + else + dma_unmap_single(&priv->pdev->dev, + buf->dma_addr, buf->dma_len, + DMA_TO_DEVICE); + + buf->dma_addr = 0; + buf->map_as_page = false; + buf->buff_addr = NULL; + } + + if (tx_buf->skb) { + dev_kfree_skb_any(tx_buf->skb); + tx_buf->skb = NULL; + } +} + +static void emac_clean_tx_desc_ring(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 i; + + /* Free all the TX ring skbs */ + for (i = 0; i < tx_ring->total_cnt; i++) + emac_free_tx_buf(priv, i); + + tx_ring->head = 0; + tx_ring->tail = 0; +} + +static void emac_clean_rx_desc_ring(struct emac_priv *priv) +{ + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + u32 i; + + rx_ring = &priv->rx_ring; + + /* Free all the RX ring skbs */ + for (i = 0; i < rx_ring->total_cnt; i++) { + rx_buf = &rx_ring->rx_desc_buf[i]; + + if (!rx_buf->skb) + continue; + + 
dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + rx_buf->dma_len, DMA_FROM_DEVICE); + + dev_kfree_skb(rx_buf->skb); + rx_buf->skb = NULL; + } + + rx_ring->tail = 0; + rx_ring->head = 0; +} + +static int emac_alloc_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct platform_device *pdev = priv->pdev; + u32 size; + + size = sizeof(struct emac_tx_desc_buffer) * tx_ring->total_cnt; + + tx_ring->tx_desc_buf = kzalloc(size, GFP_KERNEL); + if (!tx_ring->tx_desc_buf) + return -ENOMEM; + + tx_ring->total_size = tx_ring->total_cnt * sizeof(struct emac_desc); + tx_ring->total_size = ALIGN(tx_ring->total_size, PAGE_SIZE); + + tx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, tx_ring->total_size, + &tx_ring->desc_dma_addr, + GFP_KERNEL); + if (!tx_ring->desc_addr) { + kfree(tx_ring->tx_desc_buf); + return -ENOMEM; + } + + tx_ring->head = 0; + tx_ring->tail = 0; + + return 0; +} + +static int emac_alloc_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = &priv->rx_ring; + struct platform_device *pdev = priv->pdev; + u32 buf_len; + + buf_len = sizeof(struct emac_rx_desc_buffer) * rx_ring->total_cnt; + + rx_ring->rx_desc_buf = kzalloc(buf_len, GFP_KERNEL); + if (!rx_ring->rx_desc_buf) + return -ENOMEM; + + rx_ring->total_size = rx_ring->total_cnt * sizeof(struct emac_desc); + + rx_ring->total_size = ALIGN(rx_ring->total_size, PAGE_SIZE); + + rx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, rx_ring->total_size, + &rx_ring->desc_dma_addr, + GFP_KERNEL); + if (!rx_ring->desc_addr) { + kfree(rx_ring->rx_desc_buf); + return -ENOMEM; + } + + rx_ring->head = 0; + rx_ring->tail = 0; + + return 0; +} + +static void emac_free_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tr = &priv->tx_ring; + struct device *dev = &priv->pdev->dev; + + emac_clean_tx_desc_ring(priv); + + kfree(tr->tx_desc_buf); + tr->tx_desc_buf = NULL; + + dma_free_coherent(dev, tr->total_size, tr->desc_addr, + 
tr->desc_dma_addr); + tr->desc_addr = NULL; +} + +static void emac_free_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rr = &priv->rx_ring; + struct device *dev = &priv->pdev->dev; + + emac_clean_rx_desc_ring(priv); + + kfree(rr->rx_desc_buf); + rr->rx_desc_buf = NULL; + + dma_free_coherent(dev, rr->total_size, rr->desc_addr, + rr->desc_dma_addr); + rr->desc_addr = NULL; +} + +static int emac_tx_clean_desc(struct emac_priv *priv) +{ + struct net_device *ndev = priv->ndev; + struct emac_desc_ring *tx_ring; + struct emac_desc *tx_desc; + u32 i; + + netif_tx_lock(ndev); + + tx_ring = &priv->tx_ring; + + i = tx_ring->tail; + + while (i != tx_ring->head) { + tx_desc = &((struct emac_desc *)tx_ring->desc_addr)[i]; + + /* Stop checking if desc still owned by DMA */ + if (READ_ONCE(tx_desc->desc0) & TX_DESC_0_OWN) + break; + + emac_free_tx_buf(priv, i); + memset(tx_desc, 0, sizeof(struct emac_desc)); + + if (++i == tx_ring->total_cnt) + i = 0; + } + + tx_ring->tail = i; + + if (unlikely(netif_queue_stopped(ndev) && + emac_tx_avail(priv) > tx_ring->total_cnt / 4)) + netif_wake_queue(ndev); + + netif_tx_unlock(ndev); + + return 0; +} + +static u32 rx_frame_len(struct emac_desc *desc) +{ + return (desc->desc0 & RX_DESC_0_FRAME_PACKET_LENGTH_MASK) >> + RX_DESC_0_FRAME_PACKET_LENGTH_SHIFT; +} + +static int emac_rx_frame_status(struct emac_priv *priv, struct emac_desc *desc) +{ + const char *msg = NULL; + int ret = RX_FRAME_OK; + + /* Drop if not last descriptor, should not normally happen */ + if (!(desc->desc0 & RX_DESC_0_LAST_DESCRIPTOR)) { + msg = "Not last descriptor"; + ret = RX_FRAME_DISCARD; + } + + if (desc->desc0 & RX_DESC_0_FRAME_RUNT) { + msg = "Runt frame"; + ret = RX_FRAME_DISCARD; + } + + if (desc->desc0 & RX_DESC_0_FRAME_CRC_ERR) { + msg = "Frame CRC error"; + ret = RX_FRAME_DISCARD; + } + + if (desc->desc0 & RX_DESC_0_FRAME_MAX_LEN_ERR) { + msg = "Frame exceeds max length"; + ret = RX_FRAME_DISCARD; + } + + if (desc->desc0 & 
RX_DESC_0_FRAME_JABBER_ERR) { + msg = "Frame jabber error"; + ret = RX_FRAME_DISCARD; + } + + if (desc->desc0 & RX_DESC_0_FRAME_LENGTH_ERR) { + msg = "Frame length error"; + ret = RX_FRAME_DISCARD; + } + + if (rx_frame_len(desc) <= ETH_FCS_LEN || + rx_frame_len(desc) > priv->dma_buf_sz) { + msg = "Frame length unacceptable"; + ret = RX_FRAME_DISCARD; + } + + if (ret != RX_FRAME_OK) + dev_dbg_ratelimited(&priv->ndev->dev, "RX dropped: %s\n", msg); + + return ret; +} + +/* RX and TX use the same layout for {RX,TX}_DESC_1_BUFFER_SIZE_{1,2} */ + +static u32 make_buf_size_1(u32 size) +{ + return (size << TX_DESC_1_BUFFER_SIZE_1_SHIFT) & + TX_DESC_1_BUFFER_SIZE_1_MASK; +} + +static u32 make_buf_size_2(u32 size) +{ + return (size << TX_DESC_1_BUFFER_SIZE_2_SHIFT) & + TX_DESC_1_BUFFER_SIZE_2_MASK; +} + +static void emac_alloc_rx_desc_buffers(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = &priv->rx_ring; + struct emac_desc rx_desc, *rx_desc_addr; + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct sk_buff *skb; + u32 i; + + i = rx_ring->head; + rx_buf = &rx_ring->rx_desc_buf[i]; + + while (!rx_buf->skb) { + skb = netdev_alloc_skb_ip_align(ndev, priv->dma_buf_sz); + if (!skb) + break; + + skb->dev = ndev; + + rx_buf->skb = skb; + rx_buf->dma_len = priv->dma_buf_sz; + rx_buf->dma_addr = dma_map_single(&priv->pdev->dev, skb->data, + priv->dma_buf_sz, + DMA_FROM_DEVICE); + if (dma_mapping_error(&priv->pdev->dev, rx_buf->dma_addr)) { + dev_err_ratelimited(&ndev->dev, "Mapping skb failed\n"); + goto err_free_skb; + } + + rx_desc_addr = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + memset(&rx_desc, 0, sizeof(rx_desc)); + + rx_desc.buffer_addr_1 = rx_buf->dma_addr; + rx_desc.desc1 = make_buf_size_1(rx_buf->dma_len); + + if (++i == rx_ring->total_cnt) { + rx_desc.desc1 |= RX_DESC_1_END_RING; + i = 0; + } + + *rx_desc_addr = rx_desc; + dma_wmb(); + WRITE_ONCE(rx_desc_addr->desc0, rx_desc.desc0 | RX_DESC_0_OWN); + + rx_buf = 
&rx_ring->rx_desc_buf[i]; + } + + rx_ring->head = i; + return; + +err_free_skb: + dev_kfree_skb_any(skb); + rx_buf->skb = NULL; +} + +/* Returns number of packets received */ +static int emac_rx_clean_desc(struct emac_priv *priv, int budget) +{ + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + struct sk_buff *skb = NULL; + struct emac_desc *rx_desc; + u32 got = 0, skb_len, i; + int status; + + rx_ring = &priv->rx_ring; + + i = rx_ring->tail; + + while (budget--) { + rx_desc = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + /* Stop checking if rx_desc still owned by DMA */ + if (READ_ONCE(rx_desc->desc0) & RX_DESC_0_OWN) + break; + + dma_rmb(); + + rx_buf = &rx_ring->rx_desc_buf[i]; + + if (!rx_buf->skb) + break; + + got++; + + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + rx_buf->dma_len, DMA_FROM_DEVICE); + + status = emac_rx_frame_status(priv, rx_desc); + if (unlikely(status == RX_FRAME_DISCARD)) { + ndev->stats.rx_dropped++; + dev_kfree_skb_irq(rx_buf->skb); + rx_buf->skb = NULL; + } else { + skb = rx_buf->skb; + skb_len = rx_frame_len(rx_desc) - ETH_FCS_LEN; + skb_put(skb, skb_len); + skb->dev = ndev; + ndev->hard_header_len = ETH_HLEN; + + skb->protocol = eth_type_trans(skb, ndev); + + skb->ip_summed = CHECKSUM_NONE; + + napi_gro_receive(&priv->napi, skb); + + ndev->stats.rx_packets++; + ndev->stats.rx_bytes += skb_len; + + memset(rx_desc, 0, sizeof(struct emac_desc)); + rx_buf->skb = NULL; + } + + if (++i == rx_ring->total_cnt) + i = 0; + } + + rx_ring->tail = i; + + emac_alloc_rx_desc_buffers(priv); + + return got; +} + +static int emac_rx_poll(struct napi_struct *napi, int budget) +{ + struct emac_priv *priv = container_of(napi, struct emac_priv, napi); + int work_done; + + emac_tx_clean_desc(priv); + + work_done = emac_rx_clean_desc(priv, budget); + if (work_done < budget && napi_complete_done(napi, work_done)) + emac_enable_interrupt(priv); + + return work_done; +} + +/* + * For 
convenience, skb->data is fragment 0, frags[0] is fragment 1, etc. + * + * Each descriptor can hold up to two fragments, called buffer 1 and 2. For each + * fragment f, if f % 2 == 0, it uses buffer 1, otherwise it uses buffer 2. + */ + +static int emac_tx_map_frag(struct device *dev, struct emac_desc *tx_desc, + struct emac_tx_desc_buffer *tx_buf, + struct sk_buff *skb, u32 frag_idx) +{ + bool map_as_page, buf_idx; + const skb_frag_t *frag; + phys_addr_t addr; + u32 len; + int ret; + + buf_idx = frag_idx % 2; + + if (frag_idx == 0) { + /* Non-fragmented part */ + len = skb_headlen(skb); + addr = dma_map_single(dev, skb->data, len, DMA_TO_DEVICE); + map_as_page = false; + } else { + /* Fragment */ + frag = &skb_shinfo(skb)->frags[frag_idx - 1]; + len = skb_frag_size(frag); + addr = skb_frag_dma_map(dev, frag, 0, len, DMA_TO_DEVICE); + map_as_page = true; + } + + ret = dma_mapping_error(dev, addr); + if (ret) + return ret; + + tx_buf->buf[buf_idx].dma_addr = addr; + tx_buf->buf[buf_idx].dma_len = len; + tx_buf->buf[buf_idx].map_as_page = map_as_page; + + if (buf_idx == 0) { + tx_desc->buffer_addr_1 = addr; + tx_desc->desc1 |= make_buf_size_1(len); + } else { + tx_desc->buffer_addr_2 = addr; + tx_desc->desc1 |= make_buf_size_2(len); + } + + return 0; +} + +static void emac_tx_mem_map(struct emac_priv *priv, struct sk_buff *skb) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct emac_desc tx_desc, *tx_desc_addr; + struct device *dev = &priv->pdev->dev; + struct emac_tx_desc_buffer *tx_buf; + u32 head, old_head, frag_num, f; + bool buf_idx; + + frag_num = skb_shinfo(skb)->nr_frags; + head = tx_ring->head; + old_head = head; + + for (f = 0; f < frag_num + 1; f++) { + buf_idx = f % 2; + + /* + * If using buffer 1, initialize a new desc. Otherwise, use + * buffer 2 of previous fragment's desc. 
+ */ + if (!buf_idx) { + tx_buf = &tx_ring->tx_desc_buf[head]; + tx_desc_addr = + &((struct emac_desc *)tx_ring->desc_addr)[head]; + memset(&tx_desc, 0, sizeof(tx_desc)); + + /* + * Give ownership for all but first desc initially. For + * first desc, give at the end so DMA cannot start + * reading uninitialized descs. + */ + if (head != old_head) + tx_desc.desc0 |= TX_DESC_0_OWN; + + if (++head == tx_ring->total_cnt) { + /* Just used last desc in ring */ + tx_desc.desc1 |= TX_DESC_1_END_RING; + head = 0; + } + } + + if (emac_tx_map_frag(dev, &tx_desc, tx_buf, skb, f)) { + dev_err_ratelimited(&priv->ndev->dev, + "Map TX frag %d failed\n", f); + goto err_free_skb; + } + + if (f == 0) + tx_desc.desc1 |= TX_DESC_1_FIRST_SEGMENT; + + if (f == frag_num) { + tx_desc.desc1 |= TX_DESC_1_LAST_SEGMENT; + tx_buf->skb = skb; + if (emac_tx_should_interrupt(priv, frag_num + 1)) + tx_desc.desc1 |= + TX_DESC_1_INTERRUPT_ON_COMPLETION; + } + + *tx_desc_addr = tx_desc; + } + + /* All descriptors are ready, give ownership for first desc */ + tx_desc_addr = &((struct emac_desc *)tx_ring->desc_addr)[old_head]; + dma_wmb(); + WRITE_ONCE(tx_desc_addr->desc0, tx_desc_addr->desc0 | TX_DESC_0_OWN); + + emac_dma_start_transmit(priv); + + tx_ring->head = head; + + priv->ndev->stats.tx_packets++; + priv->ndev->stats.tx_bytes += skb->len; + + return; + +err_free_skb: + dev_kfree_skb_any(skb); + priv->ndev->stats.tx_dropped++; +} + +static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + int nfrags = skb_shinfo(skb)->nr_frags; + struct device *dev = &priv->pdev->dev; + + if (unlikely(emac_tx_avail(priv) < nfrags + 1)) { + if (!netif_queue_stopped(ndev)) { + netif_stop_queue(ndev); + dev_err_ratelimited(dev, "TX ring full, stop TX queue\n"); + } + return NETDEV_TX_BUSY; + } + + emac_tx_mem_map(priv, skb); + + /* Make sure there is space in the ring for the next TX. 
*/ + if (unlikely(emac_tx_avail(priv) <= MAX_SKB_FRAGS + 2)) + netif_stop_queue(ndev); + + return NETDEV_TX_OK; +} + +static int emac_set_mac_address(struct net_device *ndev, void *addr) +{ + struct emac_priv *priv = netdev_priv(ndev); + int ret = eth_mac_addr(ndev, addr); + + if (ret) + return ret; + + /* If running, set now; if not running it will be set in emac_up. */ + if (netif_running(ndev)) + emac_set_mac_addr(priv, ndev->dev_addr); + + return 0; +} + +static void emac_mac_multicast_filter_clear(struct emac_priv *priv) +{ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); +} + +/* Configure Multicast and Promiscuous modes */ +static void emac_set_rx_mode(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + u32 crc32, bit, reg, hash, val; + struct netdev_hw_addr *ha; + u32 mc_filter[4] = { 0 }; + + val = emac_rd(priv, MAC_ADDRESS_CONTROL); + + val &= ~MREGBIT_PROMISCUOUS_MODE; + + if (ndev->flags & IFF_PROMISC) { + /* Enable promisc mode */ + val |= MREGBIT_PROMISCUOUS_MODE; + } else if ((ndev->flags & IFF_ALLMULTI) || + (netdev_mc_count(ndev) > HASH_TABLE_SIZE)) { + /* Accept all multicast frames by setting every bit */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0xffff); + } else if (!netdev_mc_empty(ndev)) { + emac_mac_multicast_filter_clear(priv); + netdev_for_each_mc_addr(ha, ndev) { + /* Calculate the CRC of the MAC address */ + crc32 = ether_crc(ETH_ALEN, ha->addr); + + /* + * The hash table is an array of 4 16-bit registers. It + * is treated like an array of 64 bits (bits[hash]). Use + * the upper 6 bits of the above CRC as the hash value. 
+ */ + hash = (crc32 >> 26) & 0x3F; + reg = hash / 16; + bit = hash % 16; + mc_filter[reg] |= BIT(bit); + } + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, mc_filter[0]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, mc_filter[1]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, mc_filter[2]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, mc_filter[3]); + } + + emac_wr(priv, MAC_ADDRESS_CONTROL, val); +} + +static int emac_change_mtu(struct net_device *ndev, int mtu) +{ + struct emac_priv *priv = netdev_priv(ndev); + u32 frame_len; + + if (netif_running(ndev)) { + netdev_err(ndev, "must be stopped to change MTU\n"); + return -EBUSY; + } + + frame_len = mtu + ETH_HLEN + ETH_FCS_LEN; + + if (frame_len <= EMAC_DEFAULT_BUFSIZE) + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + else if (frame_len <= EMAC_RX_BUF_2K) + priv->dma_buf_sz = EMAC_RX_BUF_2K; + else + priv->dma_buf_sz = EMAC_RX_BUF_4K; + + ndev->mtu = mtu; + + return 0; +} + +static void emac_tx_timeout(struct net_device *ndev, unsigned int txqueue) +{ + struct emac_priv *priv = netdev_priv(ndev); + + schedule_work(&priv->tx_timeout_task); +} + +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + cmd |= phy_addr & 0x1F; + cmd |= (regnum & 0x1F) << 5; + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; + + emac_wr(priv, MAC_MDIO_DATA, 0x0); + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !((val >> 15) & 0x1), 100, 10000); + + if (ret) + return ret; + + val = emac_rd(priv, MAC_MDIO_DATA); + return val; +} + +static int emac_mii_write(struct mii_bus *bus, int phy_addr, int regnum, + u16 value) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + emac_wr(priv, MAC_MDIO_DATA, value); + + cmd |= phy_addr & 0x1F; + cmd |= (regnum & 0x1F) << 5; + cmd |= MREGBIT_START_MDIO_TRANS; + + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = 
readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !((val >> 15) & 0x1), 100, 10000); + + return ret; +} + +static int emac_mdio_init(struct emac_priv *priv) +{ + struct device *dev = &priv->pdev->dev; + struct device_node *mii_np; + struct mii_bus *mii; + int ret; + + mii = devm_mdiobus_alloc(dev); + if (!mii) + return -ENOMEM; + + mii->priv = priv; + mii->name = "k1_emac_mii"; + mii->read = emac_mii_read; + mii->write = emac_mii_write; + mii->parent = dev; + mii->phy_mask = 0xffffffff; + snprintf(mii->id, MII_BUS_ID_SIZE, "%s", priv->pdev->name); + + mii_np = of_get_available_child_by_name(dev->of_node, "mdio-bus"); + + ret = devm_of_mdiobus_register(dev, mii, mii_np); + if (ret) + dev_err_probe(dev, ret, "Failed to register mdio bus\n"); + + of_node_put(mii_np); + return ret; +} + +static void emac_set_tx_fc(struct emac_priv *priv, bool enable) +{ + u32 val; + + val = emac_rd(priv, MAC_FC_CONTROL); + + if (enable) { + val |= MREGBIT_FC_GENERATION_ENABLE; + val |= MREGBIT_AUTO_FC_GENERATION_ENABLE; + } else { + val &= ~MREGBIT_FC_GENERATION_ENABLE; + val &= ~MREGBIT_AUTO_FC_GENERATION_ENABLE; + } + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_rx_fc(struct emac_priv *priv, bool enable) +{ + u32 val = emac_rd(priv, MAC_FC_CONTROL); + + if (enable) + val |= MREGBIT_FC_DECODE_ENABLE; + else + val &= ~MREGBIT_FC_DECODE_ENABLE; + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_fc(struct emac_priv *priv, u8 fc) +{ + emac_set_tx_fc(priv, fc & FLOW_CTRL_TX); + emac_set_rx_fc(priv, fc & FLOW_CTRL_RX); + priv->flow_control = fc; +} + +static void emac_set_fc_autoneg(struct emac_priv *priv) +{ + struct phy_device *phydev = priv->ndev->phydev; + u32 local_adv, remote_adv; + u8 fc; + + local_adv = linkmode_adv_to_lcl_adv_t(phydev->advertising); + + remote_adv = 0; + + if (phydev->pause) + remote_adv |= LPA_PAUSE_CAP; + + if (phydev->asym_pause) + remote_adv |= LPA_PAUSE_ASYM; + + fc = mii_resolve_flowctrl_fdx(local_adv, remote_adv); 
+ + priv->flow_control_autoneg = true; + + emac_set_fc(priv, fc); +} + +/* + * Even though this MAC supports gigabit operation, it only provides 32-bit + * statistics counters. The most overflow-prone counters are the "bytes" ones, + * which at gigabit overflow about twice a minute. + * + * Therefore, we maintain the high 32 bits of counters ourselves, incrementing + * every time statistics seem to go backwards. Also, update periodically to + * catch overflows when we are not otherwise checking the statistics often + * enough. + */ + +#define EMAC_STATS_TIMER_PERIOD 20 + +static int emac_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res, + u32 control_reg, u32 high_reg, u32 low_reg) +{ + u32 val; + int ret; + + /* The "read" bit is the same for TX and RX */ + + val = MREGBIT_START_TX_COUNTER_READ | cnt; + emac_wr(priv, control_reg, val); + val = emac_rd(priv, control_reg); + + ret = readl_poll_timeout_atomic(priv->iobase + control_reg, val, + !(val & MREGBIT_START_TX_COUNTER_READ), + 100, 10000); + + if (ret) { + netdev_err(priv->ndev, "Read stat timeout\n"); + return ret; + } + + *res = emac_rd(priv, high_reg) << 16; + *res |= (u16)emac_rd(priv, low_reg); + + return 0; +} + +static int emac_tx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_TX_STATCTR_CONTROL, + MAC_TX_STATCTR_DATA_HIGH, + MAC_TX_STATCTR_DATA_LOW); +} + +static int emac_rx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_RX_STATCTR_CONTROL, + MAC_RX_STATCTR_DATA_HIGH, + MAC_RX_STATCTR_DATA_LOW); +} + +static void emac_update_counter(u64 *counter, u32 new_low) +{ + u32 old_low = (u32)*counter; + u64 high = *counter >> 32; + + if (old_low > new_low) { + /* Overflowed, increment high 32 bits */ + high++; + } + + *counter = (high << 32) | new_low; +} + +static void emac_stats_update(struct emac_priv *priv) +{ + u64 *tx_stats_off = (u64 *)&priv->tx_stats_off; + u64 *rx_stats_off = 
(u64 *)&priv->rx_stats_off; + u64 *tx_stats = (u64 *)&priv->tx_stats; + u64 *rx_stats = (u64 *)&priv->rx_stats; + u32 i, res; + + assert_spin_locked(&priv->stats_lock); + + if (!netif_running(priv->ndev) || !netif_device_present(priv->ndev)) { + /* Not up, don't try to update */ + return; + } + + for (i = 0; i < sizeof(priv->tx_stats) / sizeof(*tx_stats); i++) { + /* + * If reading stats times out, everything is broken and there's + * nothing we can do. Reading statistics also can't return an + * error, so just return without updating and without + * rescheduling. + */ + if (emac_tx_read_stat_cnt(priv, i, &res)) + return; + + /* + * Re-initializing while bringing interface up resets counters + * to zero, so to provide continuity, we add the values saved + * last time we did emac_down() to the new hardware-provided + * value. + */ + emac_update_counter(&tx_stats[i], res + (u32)tx_stats_off[i]); + } + + /* Similar remarks as TX stats */ + for (i = 0; i < sizeof(priv->rx_stats) / sizeof(*rx_stats); i++) { + if (emac_rx_read_stat_cnt(priv, i, &res)) + return; + emac_update_counter(&rx_stats[i], res + (u32)rx_stats_off[i]); + } + + mod_timer(&priv->stats_timer, jiffies + EMAC_STATS_TIMER_PERIOD * HZ); +} + +static void emac_stats_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, stats_timer); + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + spin_unlock(&priv->stats_lock); +} + +static const struct ethtool_rmon_hist_range emac_rmon_hist_ranges[] = { + { 64, 64 }, + { 65, 127 }, + { 128, 255 }, + { 256, 511 }, + { 512, 1023 }, + { 1024, 1518 }, + { 1519, 4096 }, + { /* sentinel */ }, +}; + +static void emac_get_stats64(struct net_device *dev, + struct rtnl_link_stats64 *storage) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + 
emac_stats_update(priv); + + storage->tx_packets = tx_stats->tx_ok_pkts; + storage->tx_bytes = tx_stats->tx_ok_bytes; + storage->tx_errors = tx_stats->tx_err_pkts; + + storage->rx_packets = rx_stats->rx_ok_pkts; + storage->rx_bytes = rx_stats->rx_ok_bytes; + storage->rx_errors = rx_stats->rx_err_total_pkts; + storage->rx_crc_errors = rx_stats->rx_crc_err_pkts; + storage->rx_frame_errors = rx_stats->rx_align_err_pkts; + storage->rx_length_errors = rx_stats->rx_len_err_pkts; + + storage->collisions = tx_stats->tx_singleclsn_pkts; + storage->collisions += tx_stats->tx_multiclsn_pkts; + storage->collisions += tx_stats->tx_excessclsn_pkts; + + storage->rx_missed_errors = rx_stats->rx_drp_fifo_full_pkts; + storage->rx_missed_errors += rx_stats->rx_truncate_fifo_full_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_rmon_stats(struct net_device *dev, + struct ethtool_rmon_stats *rmon_stats, + const struct ethtool_rmon_hist_range **ranges) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_rx_stats *rx_stats; + + rx_stats = &priv->rx_stats; + + *ranges = emac_rmon_hist_ranges; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + rmon_stats->undersize_pkts = rx_stats->rx_len_undersize_pkts; + rmon_stats->oversize_pkts = rx_stats->rx_len_oversize_pkts; + rmon_stats->fragments = rx_stats->rx_len_fragment_pkts; + rmon_stats->jabbers = rx_stats->rx_len_jabber_pkts; + + /* Only RX has histogram stats */ + + rmon_stats->hist[0] = rx_stats->rx_64_pkts; + rmon_stats->hist[1] = rx_stats->rx_65_127_pkts; + rmon_stats->hist[2] = rx_stats->rx_128_255_pkts; + rmon_stats->hist[3] = rx_stats->rx_256_511_pkts; + rmon_stats->hist[4] = rx_stats->rx_512_1023_pkts; + rmon_stats->hist[5] = rx_stats->rx_1024_1518_pkts; + rmon_stats->hist[6] = rx_stats->rx_1519_plus_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_eth_mac_stats(struct net_device *dev, + struct ethtool_eth_mac_stats *mac_stats) +{ + struct emac_priv *priv = 
netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + mac_stats->MulticastFramesXmittedOK = tx_stats->tx_multicast_pkts; + mac_stats->BroadcastFramesXmittedOK = tx_stats->tx_broadcast_pkts; + + mac_stats->MulticastFramesReceivedOK = rx_stats->rx_multicast_pkts; + mac_stats->BroadcastFramesReceivedOK = rx_stats->rx_broadcast_pkts; + + mac_stats->SingleCollisionFrames = tx_stats->tx_singleclsn_pkts; + mac_stats->MultipleCollisionFrames = tx_stats->tx_multiclsn_pkts; + mac_stats->LateCollisions = tx_stats->tx_lateclsn_pkts; + mac_stats->FramesAbortedDueToXSColls = tx_stats->tx_excessclsn_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_pause_stats(struct net_device *dev, + struct ethtool_pause_stats *pause_stats) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + pause_stats->tx_pause_frames = tx_stats->tx_pause_pkts; + pause_stats->rx_pause_frames = rx_stats->rx_pause_pkts; + + spin_unlock(&priv->stats_lock); +} + +/* Other statistics that are not derivable from standard statistics */ + +#define EMAC_ETHTOOL_STAT(type, name) \ + { offsetof(type, name) / sizeof(u64), #name } + +static const struct emac_ethtool_stats { + size_t offset; + char str[ETH_GSTRING_LEN]; +} emac_ethtool_rx_stats[] = { + EMAC_ETHTOOL_STAT(struct emac_hw_rx_stats, rx_drp_fifo_full_pkts), + EMAC_ETHTOOL_STAT(struct emac_hw_rx_stats, rx_truncate_fifo_full_pkts), +}; + +static int emac_get_sset_count(struct net_device *dev, int sset) +{ + switch (sset) { + case ETH_SS_STATS: + return ARRAY_SIZE(emac_ethtool_rx_stats); + default: + return -EOPNOTSUPP; + } +} + +static void emac_get_strings(struct net_device *dev, u32 
stringset, u8 *data) +{ + int i; + + switch (stringset) { + case ETH_SS_STATS: + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) { + memcpy(data, emac_ethtool_rx_stats[i].str, + ETH_GSTRING_LEN); + data += ETH_GSTRING_LEN; + } + break; + } +} + +static void emac_get_ethtool_stats(struct net_device *dev, + struct ethtool_stats *stats, u64 *data) +{ + struct emac_priv *priv = netdev_priv(dev); + u64 *rx_stats = (u64 *)&priv->rx_stats; + int i; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) + data[i] = rx_stats[emac_ethtool_rx_stats[i].offset]; + + spin_unlock(&priv->stats_lock); +} + +static int emac_ethtool_get_regs_len(struct net_device *dev) +{ + return (EMAC_DMA_REG_CNT + EMAC_MAC_REG_CNT) * sizeof(u32); +} + +static void emac_ethtool_get_regs(struct net_device *dev, + struct ethtool_regs *regs, void *space) +{ + struct emac_priv *priv = netdev_priv(dev); + u32 *reg_space = space; + int i; + + regs->version = 1; + + for (i = 0; i < EMAC_DMA_REG_CNT; i++) + reg_space[i] = emac_rd(priv, DMA_CONFIGURATION + i * 4); + + for (i = 0; i < EMAC_MAC_REG_CNT; i++) + reg_space[i + EMAC_DMA_REG_CNT] = + emac_rd(priv, MAC_GLOBAL_CONTROL + i * 4); +} + +static void emac_get_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + + pause->autoneg = priv->flow_control_autoneg; + pause->tx_pause = !!(priv->flow_control & FLOW_CTRL_TX); + pause->rx_pause = !!(priv->flow_control & FLOW_CTRL_RX); +} + +static int emac_set_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + u8 fc = 0; + + priv->flow_control_autoneg = pause->autoneg; + + if (pause->autoneg) { + emac_set_fc_autoneg(priv); + } else { + if (pause->tx_pause) + fc |= FLOW_CTRL_TX; + + if (pause->rx_pause) + fc |= FLOW_CTRL_RX; + + emac_set_fc(priv, fc); + } + + return 0; +} + +static void 
emac_get_drvinfo(struct net_device *dev, + struct ethtool_drvinfo *info) +{ + strscpy(info->driver, DRIVER_NAME, sizeof(info->driver)); + info->n_stats = ARRAY_SIZE(emac_ethtool_rx_stats); +} + +static void emac_tx_timeout_task(struct work_struct *work) +{ + struct net_device *ndev; + struct emac_priv *priv; + + priv = container_of(work, struct emac_priv, tx_timeout_task); + ndev = priv->ndev; + + rtnl_lock(); + + /* No need to reset if already down */ + if (!netif_running(ndev)) { + rtnl_unlock(); + return; + } + + netdev_err(ndev, "MAC reset due to TX timeout\n"); + + netif_trans_update(ndev); /* prevent tx timeout */ + dev_close(ndev); + dev_open(ndev, NULL); + + rtnl_unlock(); +} + +static void emac_sw_init(struct emac_priv *priv) +{ + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + + priv->tx_ring.total_cnt = DEFAULT_TX_RING_NUM; + priv->rx_ring.total_cnt = DEFAULT_RX_RING_NUM; + + spin_lock_init(&priv->stats_lock); + + INIT_WORK(&priv->tx_timeout_task, emac_tx_timeout_task); + + priv->tx_coal_frames = EMAC_TX_FRAMES; + priv->tx_coal_timeout = EMAC_TX_COAL_TIMEOUT; + + timer_setup(&priv->txtimer, emac_tx_coal_timer, 0); + timer_setup(&priv->stats_timer, emac_stats_timer, 0); +} + +static irqreturn_t emac_interrupt_handler(int irq, void *dev_id) +{ + struct net_device *ndev = (struct net_device *)dev_id; + struct emac_priv *priv = netdev_priv(ndev); + bool should_schedule = false; + u32 clr = 0; + u32 status; + + status = emac_rd(priv, DMA_STATUS_IRQ); + + if (status & MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ) + clr |= MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_TRANSMIT_DMA_STOPPED_IRQ) + clr |= MREGBIT_TRANSMIT_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_RECEIVE_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ) + clr |= 
MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_RECEIVE_DMA_STOPPED_IRQ) + clr |= MREGBIT_RECEIVE_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_MISSED_FRAME_IRQ) + clr |= MREGBIT_RECEIVE_MISSED_FRAME_IRQ; + + if (should_schedule) { + if (napi_schedule_prep(&priv->napi)) { + emac_disable_interrupt(priv); + __napi_schedule_irqoff(&priv->napi); + } + } + + emac_wr(priv, DMA_STATUS_IRQ, clr); + + return IRQ_HANDLED; +} + +static void emac_configure_tx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->tx_ring.desc_dma_addr; + emac_wr(priv, DMA_TRANSMIT_BASE_ADDRESS, val); + + /* Set TX inter-frame gap value, enable transmit */ + val = emac_rd(priv, MAC_TRANSMIT_CONTROL); + val &= ~MREGBIT_IFG_LEN; + val |= MREGBIT_TRANSMIT_ENABLE; + val |= MREGBIT_TRANSMIT_AUTO_RETRY; + emac_wr(priv, MAC_TRANSMIT_CONTROL, val); + + emac_wr(priv, DMA_TRANSMIT_AUTO_POLL_COUNTER, 0x0); + + /* Start TX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_TRANSMIT_DMA; + emac_wr(priv, DMA_CONTROL, val); +} + +static void emac_configure_rx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->rx_ring.desc_dma_addr; + emac_wr(priv, DMA_RECEIVE_BASE_ADDRESS, val); + + /* Enable receive */ + val = emac_rd(priv, MAC_RECEIVE_CONTROL); + val |= MREGBIT_RECEIVE_ENABLE; + val |= MREGBIT_STORE_FORWARD; + emac_wr(priv, MAC_RECEIVE_CONTROL, val); + + /* Start RX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_RECEIVE_DMA; + emac_wr(priv, DMA_CONTROL, val); +} + +static void emac_adjust_link(struct net_device *dev) +{ + struct emac_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; + u32 ctrl; + + if (phydev->link) { + ctrl = emac_rd(priv, MAC_GLOBAL_CONTROL); + + /* Update duplex and speed from PHY */ + + if (!phydev->duplex) + ctrl &= ~MREGBIT_FULL_DUPLEX_MODE; + else + ctrl |= MREGBIT_FULL_DUPLEX_MODE; + + ctrl &= ~MREGBIT_SPEED; + + switch (phydev->speed) { 
+ case SPEED_1000: + ctrl |= MREGBIT_SPEED_1000M; + break; + case SPEED_100: + ctrl |= MREGBIT_SPEED_100M; + break; + case SPEED_10: + ctrl |= MREGBIT_SPEED_10M; + break; + default: + netdev_err(dev, "Unknown speed: %d\n", phydev->speed); + phydev->speed = SPEED_UNKNOWN; + break; + } + + emac_wr(priv, MAC_GLOBAL_CONTROL, ctrl); + + emac_set_fc_autoneg(priv); + } + + phy_print_status(phydev); +} + +static void emac_update_delay_line(struct emac_priv *priv) +{ + u32 mask = 0, val = 0; + + mask |= EMAC_RX_DLINE_EN; + mask |= EMAC_RX_DLINE_STEP_MASK | EMAC_RX_DLINE_CODE_MASK; + mask |= EMAC_TX_DLINE_EN; + mask |= EMAC_TX_DLINE_STEP_MASK | EMAC_TX_DLINE_CODE_MASK; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) { + val |= EMAC_RX_DLINE_EN; + val |= EMAC_DLINE_STEP_15P6 << EMAC_RX_DLINE_STEP_SHIFT; + val |= (priv->rx_delay << EMAC_RX_DLINE_CODE_SHIFT) & + EMAC_RX_DLINE_CODE_MASK; + + val |= EMAC_TX_DLINE_EN; + val |= EMAC_DLINE_STEP_15P6 << EMAC_TX_DLINE_STEP_SHIFT; + val |= (priv->tx_delay << EMAC_TX_DLINE_CODE_SHIFT) & + EMAC_TX_DLINE_CODE_MASK; + } + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_DLINE_REG, + mask, val); +} + +static int emac_phy_connect(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + struct phy_device *phydev; + struct device_node *np; + int ret; + + ret = of_get_phy_mode(dev->of_node, &priv->phy_interface); + if (ret) { + netdev_err(ndev, "No phy-mode found\n"); + return ret; + } + + switch (priv->phy_interface) { + case PHY_INTERFACE_MODE_RMII: + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_RGMII_ID: + case PHY_INTERFACE_MODE_RGMII_RXID: + case PHY_INTERFACE_MODE_RGMII_TXID: + break; + default: + netdev_err(ndev, "Unsupported PHY interface %s\n", + phy_modes(priv->phy_interface)); + return -EINVAL; + } + + np = of_parse_phandle(dev->of_node, "phy-handle", 0); + if (!np && of_phy_is_fixed_link(dev->of_node)) + np = 
of_node_get(dev->of_node); + + if (!np) { + netdev_err(ndev, "No PHY specified\n"); + return -ENODEV; + } + + ret = emac_phy_interface_config(priv); + if (ret) + goto err_node_put; + + phydev = of_phy_connect(ndev, np, &emac_adjust_link, 0, + priv->phy_interface); + if (!phydev) { + netdev_err(ndev, "Could not attach to PHY\n"); + ret = -ENODEV; + goto err_node_put; + } + + phy_support_asym_pause(phydev); + + phydev->mac_managed_pm = true; + + emac_update_delay_line(priv); + +err_node_put: + of_node_put(np); + return ret; +} + +static int emac_up(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + int ret; + + pm_runtime_get_sync(&pdev->dev); + + ret = emac_phy_connect(ndev); + if (ret) { + dev_err(&pdev->dev, "emac_phy_connect failed\n"); + goto err_pm_put; + } + + emac_init_hw(priv); + + emac_set_mac_addr(priv, ndev->dev_addr); + emac_configure_tx(priv); + emac_configure_rx(priv); + + emac_alloc_rx_desc_buffers(priv); + + phy_start(ndev->phydev); + + ret = request_irq(priv->irq, emac_interrupt_handler, IRQF_SHARED, + ndev->name, ndev); + if (ret) { + dev_err(&pdev->dev, "request_irq failed\n"); + goto err_reset_disconnect_phy; + } + + /* Don't enable MAC interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + + /* Enable DMA interrupts */ + emac_wr(priv, DMA_INTERRUPT_ENABLE, + MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE); + + napi_enable(&priv->napi); + + netif_start_queue(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; + +err_reset_disconnect_phy: + emac_reset_hw(priv); + phy_disconnect(ndev->phydev); + +err_pm_put: + pm_runtime_put_sync(&pdev->dev); + return ret; +} + +static int emac_down(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + + 
netif_stop_queue(ndev); + + phy_disconnect(ndev->phydev); + + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + free_irq(priv->irq, ndev); + + napi_disable(&priv->napi); + + timer_delete_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + timer_delete_sync(&priv->stats_timer); + + emac_reset_hw(priv); + + /* Update and save current stats, see emac_stats_update() for usage */ + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + priv->tx_stats_off = priv->tx_stats; + priv->rx_stats_off = priv->rx_stats; + + spin_unlock(&priv->stats_lock); + + pm_runtime_put_sync(&pdev->dev); + return 0; +} + +/* Called when net interface is brought up. */ +static int emac_open(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + int ret; + + ret = emac_alloc_tx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate TX resources\n"); + return ret; + } + + ret = emac_alloc_rx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate RX resources\n"); + goto err_free_tx; + } + + ret = emac_up(priv); + if (ret) { + dev_err(dev, "Error when bringing interface up\n"); + goto err_free_rx; + } + return 0; + +err_free_rx: + emac_free_rx_resources(priv); +err_free_tx: + emac_free_tx_resources(priv); + + return ret; +} + +/* Called when interface is brought down. 
*/ +static int emac_stop(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + + emac_down(priv); + emac_free_tx_resources(priv); + emac_free_rx_resources(priv); + + return 0; +} + +static const struct ethtool_ops emac_ethtool_ops = { + .get_link_ksettings = phy_ethtool_get_link_ksettings, + .set_link_ksettings = phy_ethtool_set_link_ksettings, + .nway_reset = phy_ethtool_nway_reset, + .get_drvinfo = emac_get_drvinfo, + .get_link = ethtool_op_get_link, + + .get_regs = emac_ethtool_get_regs, + .get_regs_len = emac_ethtool_get_regs_len, + + .get_rmon_stats = emac_get_rmon_stats, + .get_pause_stats = emac_get_pause_stats, + .get_eth_mac_stats = emac_get_eth_mac_stats, + + .get_sset_count = emac_get_sset_count, + .get_strings = emac_get_strings, + .get_ethtool_stats = emac_get_ethtool_stats, + + .get_pauseparam = emac_get_pauseparam, + .set_pauseparam = emac_set_pauseparam, +}; + +static const struct net_device_ops emac_netdev_ops = { + .ndo_open = emac_open, + .ndo_stop = emac_stop, + .ndo_start_xmit = emac_start_xmit, + .ndo_validate_addr = eth_validate_addr, + .ndo_set_mac_address = emac_set_mac_address, + .ndo_eth_ioctl = phy_do_ioctl_running, + .ndo_change_mtu = emac_change_mtu, + .ndo_tx_timeout = emac_tx_timeout, + .ndo_set_rx_mode = emac_set_rx_mode, + .ndo_get_stats64 = emac_get_stats64, +}; + +/* Currently we always use 15.6 ps/step for the delay line */ + +static u32 delay_ps_to_unit(u32 ps) +{ + return DIV_ROUND_CLOSEST(ps * 10, 156); +} + +static u32 delay_unit_to_ps(u32 unit) +{ + return DIV_ROUND_CLOSEST(unit * 156, 10); +} + +#define EMAC_MAX_DELAY_UNIT \ + (EMAC_TX_DLINE_CODE_MASK >> EMAC_TX_DLINE_CODE_SHIFT) + +/* Minus one just to be safe from rounding errors */ +#define EMAC_MAX_DELAY_PS (delay_unit_to_ps(EMAC_MAX_DELAY_UNIT - 1)) + +static int emac_config_dt(struct platform_device *pdev, struct emac_priv *priv) +{ + struct device_node *np = pdev->dev.of_node; + struct device *dev = &pdev->dev; + u8 mac_addr[ETH_ALEN] = { 0 }; 
+ int ret; + + priv->iobase = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(priv->iobase)) + return dev_err_probe(dev, PTR_ERR(priv->iobase), + "ioremap failed\n"); + + priv->regmap_apmu = + syscon_regmap_lookup_by_phandle_args(np, "spacemit,apmu", 1, + &priv->regmap_apmu_offset); + + if (IS_ERR(priv->regmap_apmu)) + return dev_err_probe(dev, PTR_ERR(priv->regmap_apmu), + "failed to get syscon\n"); + + priv->irq = platform_get_irq(pdev, 0); + if (priv->irq < 0) + return priv->irq; + + ret = of_get_mac_address(np, mac_addr); + if (ret) { + if (ret == -EPROBE_DEFER) + return dev_err_probe(dev, ret, + "Can't get MAC address\n"); + + dev_info(&pdev->dev, "Using random MAC address\n"); + eth_hw_addr_random(priv->ndev); + } else { + eth_hw_addr_set(priv->ndev, mac_addr); + } + + priv->tx_delay = 0; + priv->rx_delay = 0; + + of_property_read_u32(np, "tx-internal-delay-ps", &priv->tx_delay); + of_property_read_u32(np, "rx-internal-delay-ps", &priv->rx_delay); + + if (priv->tx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "tx-internal-delay-ps too large: max %d, got %d", + EMAC_MAX_DELAY_PS, priv->tx_delay); + return -EINVAL; + } + + if (priv->rx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "rx-internal-delay-ps too large: max %d, got %d", + EMAC_MAX_DELAY_PS, priv->rx_delay); + return -EINVAL; + } + + priv->tx_delay = delay_ps_to_unit(priv->tx_delay); + priv->rx_delay = delay_ps_to_unit(priv->rx_delay); + + return 0; +} + +static void emac_phy_deregister_fixed_link(void *data) +{ + struct device_node *of_node = data; + + of_phy_deregister_fixed_link(of_node); +} + +static int emac_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct reset_control *reset; + struct net_device *ndev; + struct emac_priv *priv; + int ret; + + ndev = devm_alloc_etherdev(dev, sizeof(struct emac_priv)); + if (!ndev) + return -ENOMEM; + + ndev->hw_features = NETIF_F_SG; + ndev->features |= ndev->hw_features; + + ndev->max_mtu = EMAC_RX_BUF_4K - 
(ETH_HLEN + ETH_FCS_LEN); + + priv = netdev_priv(ndev); + priv->ndev = ndev; + priv->pdev = pdev; + platform_set_drvdata(pdev, priv); + + ret = emac_config_dt(pdev, priv); + if (ret < 0) + return dev_err_probe(dev, ret, "Configuration failed\n"); + + ndev->watchdog_timeo = 5 * HZ; + ndev->base_addr = (unsigned long)priv->iobase; + ndev->irq = priv->irq; + + ndev->ethtool_ops = &emac_ethtool_ops; + ndev->netdev_ops = &emac_netdev_ops; + + devm_pm_runtime_enable(&pdev->dev); + + priv->bus_clk = devm_clk_get_enabled(&pdev->dev, NULL); + if (IS_ERR(priv->bus_clk)) + return dev_err_probe(dev, PTR_ERR(priv->bus_clk), + "Failed to get clock\n"); + + reset = devm_reset_control_get_optional_exclusive_deasserted(&pdev->dev, + NULL); + if (IS_ERR(reset)) + return dev_err_probe(dev, PTR_ERR(reset), + "Failed to get reset\n"); + + if (of_phy_is_fixed_link(dev->of_node)) { + ret = of_phy_register_fixed_link(dev->of_node); + if (ret) + return dev_err_probe(dev, ret, + "Failed to register fixed-link\n"); + + ret = devm_add_action_or_reset(dev, + emac_phy_deregister_fixed_link, + dev->of_node); + + if (ret) { + dev_err(dev, "devm_add_action_or_reset failed\n"); + return ret; + } + } + + emac_sw_init(priv); + + ret = emac_mdio_init(priv); + if (ret) + goto err_timer_delete; + + SET_NETDEV_DEV(ndev, &pdev->dev); + + ret = devm_register_netdev(dev, ndev); + if (ret) { + dev_err(dev, "devm_register_netdev failed\n"); + goto err_timer_delete; + } + + netif_napi_add(ndev, &priv->napi, emac_rx_poll); + netif_carrier_off(ndev); + + return 0; + +err_timer_delete: + timer_delete_sync(&priv->txtimer); + timer_delete_sync(&priv->stats_timer); + + return ret; +} + +static void emac_remove(struct platform_device *pdev) +{ + struct emac_priv *priv = platform_get_drvdata(pdev); + + timer_shutdown_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + timer_shutdown_sync(&priv->stats_timer); + + emac_reset_hw(priv); +} + +static int emac_resume(struct device *dev) +{ + struct 
emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + int ret; + + ret = clk_prepare_enable(priv->bus_clk); + if (ret < 0) { + dev_err(dev, "Failed to enable bus clock: %d\n", ret); + return ret; + } + + if (!netif_running(ndev)) + return 0; + + ret = emac_open(ndev); + if (ret) { + clk_disable_unprepare(priv->bus_clk); + return ret; + } + + netif_device_attach(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; +} + +static int emac_suspend(struct device *dev) +{ + struct emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + + if (!ndev || !netif_running(ndev)) { + clk_disable_unprepare(priv->bus_clk); + return 0; + } + + emac_stop(ndev); + + clk_disable_unprepare(priv->bus_clk); + netif_device_detach(ndev); + return 0; +} + +static const struct dev_pm_ops emac_pm_ops = { + SYSTEM_SLEEP_PM_OPS(emac_suspend, emac_resume) +}; + +static const struct of_device_id emac_of_match[] = { + { .compatible = "spacemit,k1-emac" }, + { /* sentinel */ }, +}; +MODULE_DEVICE_TABLE(of, emac_of_match); + +static struct platform_driver emac_driver = { + .probe = emac_probe, + .remove = emac_remove, + .driver = { + .name = DRIVER_NAME, + .of_match_table = of_match_ptr(emac_of_match), + .pm = &emac_pm_ops, + }, +}; +module_platform_driver(emac_driver); + +MODULE_DESCRIPTION("SpacemiT K1 Ethernet driver"); +MODULE_AUTHOR("Vivian Wang "); +MODULE_LICENSE("GPL"); diff --git a/drivers/net/ethernet/spacemit/k1_emac.h b/drivers/net/ethernet/spacemit/k1_emac.h new file mode 100644 index 0000000000000000000000000000000000000000..ef681f06a50a63cca34561353a7b7ccf3f996853 --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.h @@ -0,0 +1,426 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SpacemiT K1 Ethernet hardware definitions + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. 
Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#ifndef _K1_EMAC_H_ +#define _K1_EMAC_H_ + +/* APMU syscon registers */ + +#define APMU_EMAC_CTRL_REG 0x0 + +#define PHY_INTF_RGMII BIT(2) + +/* + * Only valid for RMII mode + * 0: Ref clock from External PHY + * 1: Ref clock from SoC + */ +#define REF_CLK_SEL BIT(3) + +/* + * Function clock select + * 0: 208 MHz + * 1: 312 MHz + */ +#define FUNC_CLK_SEL BIT(4) + +/* Only valid for RMII, invert TX clk */ +#define RMII_TX_CLK_SEL BIT(6) + +/* Only valid for RMII, invert RX clk */ +#define RMII_RX_CLK_SEL BIT(7) + +/* + * Only valid for RGMII + * 0: TX clk from RX clk + * 1: TX clk from SoC + */ +#define RGMII_TX_CLK_SEL BIT(8) + +#define PHY_IRQ_EN BIT(12) +#define AXI_SINGLE_ID BIT(13) + +#define RMII_TX_PHASE_SHIFT 16 +#define RMII_TX_PHASE_MASK GENMASK(18, 16) +#define RMII_RX_PHASE_SHIFT 20 +#define RMII_RX_PHASE_MASK GENMASK(22, 20) + +#define RGMII_TX_PHASE_SHIFT 24 +#define RGMII_TX_PHASE_MASK GENMASK(26, 24) +#define RGMII_RX_PHASE_SHIFT 28 +#define RGMII_RX_PHASE_MASK GENMASK(30, 28) + +#define APMU_EMAC_DLINE_REG 0x4 + +#define EMAC_RX_DLINE_EN BIT(0) +#define EMAC_RX_DLINE_STEP_SHIFT 4 +#define EMAC_RX_DLINE_STEP_MASK GENMASK(5, 4) +#define EMAC_RX_DLINE_CODE_SHIFT 8 +#define EMAC_RX_DLINE_CODE_MASK GENMASK(15, 8) + +#define EMAC_TX_DLINE_EN BIT(16) +#define EMAC_TX_DLINE_STEP_SHIFT 20 +#define EMAC_TX_DLINE_STEP_MASK GENMASK(21, 20) +#define EMAC_TX_DLINE_CODE_SHIFT 24 +#define EMAC_TX_DLINE_CODE_MASK GENMASK(31, 24) + +#define EMAC_DLINE_STEP_15P6 0 /* 15.6 ps/step */ +#define EMAC_DLINE_STEP_24P4 1 /* 24.4 ps/step */ +#define EMAC_DLINE_STEP_29P7 2 /* 29.7 ps/step */ +#define EMAC_DLINE_STEP_35P1 3 /* 35.1 ps/step */ + +/* DMA register set */ +#define DMA_CONFIGURATION 0x0000 +#define DMA_CONTROL 0x0004 +#define DMA_STATUS_IRQ 0x0008 +#define DMA_INTERRUPT_ENABLE 0x000c + +#define DMA_TRANSMIT_AUTO_POLL_COUNTER 0x0010 +#define DMA_TRANSMIT_POLL_DEMAND 0x0014 +#define DMA_RECEIVE_POLL_DEMAND 0x0018 + 
+#define DMA_TRANSMIT_BASE_ADDRESS 0x001c +#define DMA_RECEIVE_BASE_ADDRESS 0x0020 +#define DMA_MISSED_FRAME_COUNTER 0x0024 +#define DMA_STOP_FLUSH_COUNTER 0x0028 + +#define DMA_RECEIVE_IRQ_MITIGATION_CTRL 0x002c + +#define DMA_CURRENT_TRANSMIT_DESCRIPTOR_POINTER 0x0030 +#define DMA_CURRENT_TRANSMIT_BUFFER_POINTER 0x0034 +#define DMA_CURRENT_RECEIVE_DESCRIPTOR_POINTER 0x0038 +#define DMA_CURRENT_RECEIVE_BUFFER_POINTER 0x003c + +/* MAC Register set */ +#define MAC_GLOBAL_CONTROL 0x0100 +#define MAC_TRANSMIT_CONTROL 0x0104 +#define MAC_RECEIVE_CONTROL 0x0108 +#define MAC_MAXIMUM_FRAME_SIZE 0x010c +#define MAC_TRANSMIT_JABBER_SIZE 0x0110 +#define MAC_RECEIVE_JABBER_SIZE 0x0114 +#define MAC_ADDRESS_CONTROL 0x0118 +#define MAC_MDIO_CLK_DIV 0x011c +#define MAC_ADDRESS1_HIGH 0x0120 +#define MAC_ADDRESS1_MED 0x0124 +#define MAC_ADDRESS1_LOW 0x0128 +#define MAC_ADDRESS2_HIGH 0x012c +#define MAC_ADDRESS2_MED 0x0130 +#define MAC_ADDRESS2_LOW 0x0134 +#define MAC_ADDRESS3_HIGH 0x0138 +#define MAC_ADDRESS3_MED 0x013c +#define MAC_ADDRESS3_LOW 0x0140 +#define MAC_ADDRESS4_HIGH 0x0144 +#define MAC_ADDRESS4_MED 0x0148 +#define MAC_ADDRESS4_LOW 0x014c +#define MAC_MULTICAST_HASH_TABLE1 0x0150 +#define MAC_MULTICAST_HASH_TABLE2 0x0154 +#define MAC_MULTICAST_HASH_TABLE3 0x0158 +#define MAC_MULTICAST_HASH_TABLE4 0x015c +#define MAC_FC_CONTROL 0x0160 +#define MAC_FC_PAUSE_FRAME_GENERATE 0x0164 +#define MAC_FC_SOURCE_ADDRESS_HIGH 0x0168 +#define MAC_FC_SOURCE_ADDRESS_MED 0x016c +#define MAC_FC_SOURCE_ADDRESS_LOW 0x0170 +#define MAC_FC_DESTINATION_ADDRESS_HIGH 0x0174 +#define MAC_FC_DESTINATION_ADDRESS_MED 0x0178 +#define MAC_FC_DESTINATION_ADDRESS_LOW 0x017c +#define MAC_FC_PAUSE_TIME_VALUE 0x0180 +#define MAC_FC_HIGH_PAUSE_TIME 0x0184 +#define MAC_FC_LOW_PAUSE_TIME 0x0188 +#define MAC_FC_PAUSE_HIGH_THRESHOLD 0x018c +#define MAC_FC_PAUSE_LOW_THRESHOLD 0x0190 +#define MAC_MDIO_CONTROL 0x01a0 +#define MAC_MDIO_DATA 0x01a4 +#define MAC_RX_STATCTR_CONTROL 0x01a8 +#define 
MAC_RX_STATCTR_DATA_HIGH 0x01ac +#define MAC_RX_STATCTR_DATA_LOW 0x01b0 +#define MAC_TX_STATCTR_CONTROL 0x01b4 +#define MAC_TX_STATCTR_DATA_HIGH 0x01b8 +#define MAC_TX_STATCTR_DATA_LOW 0x01bc +#define MAC_TRANSMIT_FIFO_ALMOST_FULL 0x01c0 +#define MAC_TRANSMIT_PACKET_START_THRESHOLD 0x01c4 +#define MAC_RECEIVE_PACKET_START_THRESHOLD 0x01c8 +#define MAC_STATUS_IRQ 0x01e0 +#define MAC_INTERRUPT_ENABLE 0x01e4 + +/* Used for register dump */ +#define EMAC_DMA_REG_CNT 16 +#define EMAC_MAC_REG_CNT 124 + +/* DMA_CONFIGURATION (0x0000) */ + +/* + * 0-DMA controller in normal operation mode, + * 1-DMA controller reset to default state, + * clearing all internal state information + */ +#define MREGBIT_SOFTWARE_RESET BIT(0) + +#define MREGBIT_BURST_1WORD BIT(1) +#define MREGBIT_BURST_2WORD BIT(2) +#define MREGBIT_BURST_4WORD BIT(3) +#define MREGBIT_BURST_8WORD BIT(4) +#define MREGBIT_BURST_16WORD BIT(5) +#define MREGBIT_BURST_32WORD BIT(6) +#define MREGBIT_BURST_64WORD BIT(7) +#define MREGBIT_BURST_LENGTH GENMASK(7, 1) +#define MREGBIT_DESCRIPTOR_SKIP_LENGTH GENMASK(12, 8) + +/* For Receive and Transmit DMA operate in Big-Endian mode for Descriptors. 
*/ +#define MREGBIT_DESCRIPTOR_BYTE_ORDERING BIT(13) + +#define MREGBIT_BIG_LITLE_ENDIAN BIT(14) +#define MREGBIT_TX_RX_ARBITRATION BIT(15) +#define MREGBIT_WAIT_FOR_DONE BIT(16) +#define MREGBIT_STRICT_BURST BIT(17) +#define MREGBIT_DMA_64BIT_MODE BIT(18) + +/* DMA_CONTROL (0x0004) */ +#define MREGBIT_START_STOP_TRANSMIT_DMA BIT(0) +#define MREGBIT_START_STOP_RECEIVE_DMA BIT(1) + +/* DMA_STATUS_IRQ (0x0008) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_IRQ BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_IRQ BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_IRQ BIT(6) +#define MREGBIT_RECEIVE_MISSED_FRAME_IRQ BIT(7) +#define MREGBIT_MAC_IRQ BIT(8) +#define MREGBIT_TRANSMIT_DMA_STATE GENMASK(18, 16) +#define MREGBIT_RECEIVE_DMA_STATE GENMASK(23, 20) + +/* DMA_INTERRUPT_ENABLE (0x000c) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_INTR_ENABLE BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_INTR_ENABLE BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE BIT(6) +#define MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE BIT(7) +#define MREGBIT_MAC_INTR_ENABLE BIT(8) + +/* DMA_RECEIVE_IRQ_MITIGATION_CTRL (0x002c) */ +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK GENMASK(7, 0) +#define MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT 8 +#define MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK GENMASK(27, 8) +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MODE BIT(30) +#define MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE BIT(31) + +/* MAC_GLOBAL_CONTROL (0x0100) */ +#define MREGBIT_SPEED GENMASK(1, 0) +#define MREGBIT_SPEED_10M 0x0 +#define MREGBIT_SPEED_100M BIT(0) +#define MREGBIT_SPEED_1000M BIT(1) +#define MREGBIT_FULL_DUPLEX_MODE BIT(2) +#define MREGBIT_RESET_RX_STAT_COUNTERS BIT(3) 
+#define MREGBIT_RESET_TX_STAT_COUNTERS BIT(4) +#define MREGBIT_UNICAST_WAKEUP_MODE BIT(8) +#define MREGBIT_MAGIC_PACKET_WAKEUP_MODE BIT(9) + +/* MAC_TRANSMIT_CONTROL (0x0104) */ +#define MREGBIT_TRANSMIT_ENABLE BIT(0) +#define MREGBIT_INVERT_FCS BIT(1) +#define MREGBIT_DISABLE_FCS_INSERT BIT(2) +#define MREGBIT_TRANSMIT_AUTO_RETRY BIT(3) +#define MREGBIT_IFG_LEN GENMASK(6, 4) +#define MREGBIT_PREAMBLE_LENGTH GENMASK(9, 7) + +/* MAC_RECEIVE_CONTROL (0x0108) */ +#define MREGBIT_RECEIVE_ENABLE BIT(0) +#define MREGBIT_DISABLE_FCS_CHECK BIT(1) +#define MREGBIT_STRIP_FCS BIT(2) +#define MREGBIT_STORE_FORWARD BIT(3) +#define MREGBIT_STATUS_FIRST BIT(4) +#define MREGBIT_PASS_BAD_FRAMES BIT(5) +#define MREGBIT_ACOOUNT_VLAN BIT(6) + +/* MAC_MAXIMUM_FRAME_SIZE (0x010c) */ +#define MREGBIT_MAX_FRAME_SIZE GENMASK(13, 0) + +/* MAC_TRANSMIT_JABBER_SIZE (0x0110) */ +#define MREGBIT_TRANSMIT_JABBER_SIZE GENMASK(15, 0) + +/* MAC_RECEIVE_JABBER_SIZE (0x0114) */ +#define MREGBIT_RECEIVE_JABBER_SIZE GENMASK(15, 0) + +/* MAC_ADDRESS_CONTROL (0x0118) */ +#define MREGBIT_MAC_ADDRESS1_ENABLE BIT(0) +#define MREGBIT_MAC_ADDRESS2_ENABLE BIT(1) +#define MREGBIT_MAC_ADDRESS3_ENABLE BIT(2) +#define MREGBIT_MAC_ADDRESS4_ENABLE BIT(3) +#define MREGBIT_INVERSE_MAC_ADDRESS1_ENABLE BIT(4) +#define MREGBIT_INVERSE_MAC_ADDRESS2_ENABLE BIT(5) +#define MREGBIT_INVERSE_MAC_ADDRESS3_ENABLE BIT(6) +#define MREGBIT_INVERSE_MAC_ADDRESS4_ENABLE BIT(7) +#define MREGBIT_PROMISCUOUS_MODE BIT(8) + +/* MAC_FC_CONTROL (0x0160) */ +#define MREGBIT_FC_DECODE_ENABLE BIT(0) +#define MREGBIT_FC_GENERATION_ENABLE BIT(1) +#define MREGBIT_AUTO_FC_GENERATION_ENABLE BIT(2) +#define MREGBIT_MULTICAST_MODE BIT(3) +#define MREGBIT_BLOCK_PAUSE_FRAMES BIT(4) + +/* MAC_FC_PAUSE_FRAME_GENERATE (0x0164) */ +#define MREGBIT_GENERATE_PAUSE_FRAME BIT(0) + +/* MAC_FC_PAUSE_TIME_VALUE (0x0180) */ +#define MREGBIT_MAC_FC_PAUSE_TIME GENMASK(15, 0) + +/* MAC_MDIO_CONTROL (0x01a0) */ +#define MREGBIT_PHY_ADDRESS GENMASK(4, 0) +#define 
MREGBIT_REGISTER_ADDRESS GENMASK(9, 5) +#define MREGBIT_MDIO_READ_WRITE BIT(10) +#define MREGBIT_START_MDIO_TRANS BIT(15) + +/* MAC_MDIO_DATA (0x01a4) */ +#define MREGBIT_MDIO_DATA GENMASK(15, 0) + +/* MAC_RX_STATCTR_CONTROL (0x01a8) */ +#define MREGBIT_RX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_RX_COUNTER_READ BIT(15) + +/* MAC_RX_STATCTR_DATA_HIGH (0x01ac) */ +#define MREGBIT_RX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_RX_STATCTR_DATA_LOW (0x01b0) */ +#define MREGBIT_RX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TX_STATCTR_CONTROL (0x01b4) */ +#define MREGBIT_TX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_TX_COUNTER_READ BIT(15) + +/* MAC_TX_STATCTR_DATA_HIGH (0x01b8) */ +#define MREGBIT_TX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_TX_STATCTR_DATA_LOW (0x01bc) */ +#define MREGBIT_TX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TRANSMIT_FIFO_ALMOST_FULL (0x01c0) */ +#define MREGBIT_TX_FIFO_AF GENMASK(13, 0) + +/* MAC_TRANSMIT_PACKET_START_THRESHOLD (0x01c4) */ +#define MREGBIT_TX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_RECEIVE_PACKET_START_THRESHOLD (0x01c8) */ +#define MREGBIT_RX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_STATUS_IRQ (0x01e0) */ +#define MREGBIT_MAC_UNDERRUN_IRQ BIT(0) +#define MREGBIT_MAC_JABBER_IRQ BIT(1) + +/* MAC_INTERRUPT_ENABLE (0x01e4) */ +#define MREGBIT_MAC_UNDERRUN_INTERRUPT_ENABLE BIT(0) +#define MREGBIT_JABBER_INTERRUPT_ENABLE BIT(1) + +/* RX DMA descriptor */ + +#define RX_DESC_0_FRAME_PACKET_LENGTH_SHIFT 0 +#define RX_DESC_0_FRAME_PACKET_LENGTH_MASK GENMASK(13, 0) +#define RX_DESC_0_FRAME_ALIGN_ERR BIT(14) +#define RX_DESC_0_FRAME_RUNT BIT(15) +#define RX_DESC_0_FRAME_ETHERNET_TYPE BIT(16) +#define RX_DESC_0_FRAME_VLAN BIT(17) +#define RX_DESC_0_FRAME_MULTICAST BIT(18) +#define RX_DESC_0_FRAME_BROADCAST BIT(19) +#define RX_DESC_0_FRAME_CRC_ERR BIT(20) +#define RX_DESC_0_FRAME_MAX_LEN_ERR BIT(21) +#define RX_DESC_0_FRAME_JABBER_ERR BIT(22) +#define RX_DESC_0_FRAME_LENGTH_ERR BIT(23) +#define 
RX_DESC_0_FRAME_MAC_ADDR1_MATCH BIT(24) +#define RX_DESC_0_FRAME_MAC_ADDR2_MATCH BIT(25) +#define RX_DESC_0_FRAME_MAC_ADDR3_MATCH BIT(26) +#define RX_DESC_0_FRAME_MAC_ADDR4_MATCH BIT(27) +#define RX_DESC_0_FRAME_PAUSE_CTRL BIT(28) +#define RX_DESC_0_LAST_DESCRIPTOR BIT(29) +#define RX_DESC_0_FIRST_DESCRIPTOR BIT(30) +#define RX_DESC_0_OWN BIT(31) + +#define RX_DESC_1_BUFFER_SIZE_1_SHIFT 0 +#define RX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define RX_DESC_1_BUFFER_SIZE_2_SHIFT 12 +#define RX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) + /* [24] reserved */ +#define RX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define RX_DESC_1_END_RING BIT(26) + /* [29:27] reserved */ +#define RX_DESC_1_RX_TIMESTAMP BIT(30) +#define RX_DESC_1_PTP_PKT BIT(31) + +/* TX DMA descriptor */ + + /* [29:0] unused */ +#define TX_DESC_0_TX_TIMESTAMP BIT(30) +#define TX_DESC_0_OWN BIT(31) + +#define TX_DESC_1_BUFFER_SIZE_1_SHIFT 0 +#define TX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define TX_DESC_1_BUFFER_SIZE_2_SHIFT 12 +#define TX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) +#define TX_DESC_1_FORCE_EOP_ERROR BIT(24) +#define TX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define TX_DESC_1_END_RING BIT(26) +#define TX_DESC_1_DISABLE_PADDING BIT(27) +#define TX_DESC_1_ADD_CRC_DISABLE BIT(28) +#define TX_DESC_1_FIRST_SEGMENT BIT(29) +#define TX_DESC_1_LAST_SEGMENT BIT(30) +#define TX_DESC_1_INTERRUPT_ON_COMPLETION BIT(31) + +struct emac_desc { + u32 desc0; + u32 desc1; + u32 buffer_addr_1; + u32 buffer_addr_2; +}; + +/* Keep stats in this order, index used for accessing hardware */ + +struct emac_hw_tx_stats { + u64 tx_ok_pkts; + u64 tx_total_pkts; + u64 tx_ok_bytes; + u64 tx_err_pkts; + u64 tx_singleclsn_pkts; + u64 tx_multiclsn_pkts; + u64 tx_lateclsn_pkts; + u64 tx_excessclsn_pkts; + u64 tx_unicast_pkts; + u64 tx_multicast_pkts; + u64 tx_broadcast_pkts; + u64 tx_pause_pkts; +}; + +struct emac_hw_rx_stats { + u64 rx_ok_pkts; + u64 rx_total_pkts; + u64 rx_crc_err_pkts; + u64 rx_align_err_pkts; + u64 
rx_err_total_pkts; + u64 rx_ok_bytes; + u64 rx_total_bytes; + u64 rx_unicast_pkts; + u64 rx_multicast_pkts; + u64 rx_broadcast_pkts; + u64 rx_pause_pkts; + u64 rx_len_err_pkts; + u64 rx_len_undersize_pkts; + u64 rx_len_oversize_pkts; + u64 rx_len_fragment_pkts; + u64 rx_len_jabber_pkts; + u64 rx_64_pkts; + u64 rx_65_127_pkts; + u64 rx_128_255_pkts; + u64 rx_256_511_pkts; + u64 rx_512_1023_pkts; + u64 rx_1024_1518_pkts; + u64 rx_1519_plus_pkts; + u64 rx_drp_fifo_full_pkts; + u64 rx_truncate_fifo_full_pkts; +}; + +#endif /* _K1_EMAC_H_ */ -- 2.50.1 From axboe at kernel.dk Fri Sep 5 04:26:53 2025 From: axboe at kernel.dk (Jens Axboe) Date: Fri, 5 Sep 2025 05:26:53 -0600 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> Message-ID: <1513d5fd-14ef-4cd0-a9a5-1016e9be6540@kernel.dk> On 9/5/25 12:41 AM, David Hildenbrand wrote: > On 01.09.25 17:03, David Hildenbrand wrote: >> We can just clean up the code by calculating the #refs earlier, >> so we can just inline what remains of record_subpages(). >> >> Calculate the number of references/pages ahead of time, and record them >> only once all our tests passed. 
>> >> Signed-off-by: David Hildenbrand >> --- >> mm/gup.c | 25 ++++++++----------------- >> 1 file changed, 8 insertions(+), 17 deletions(-) >> >> diff --git a/mm/gup.c b/mm/gup.c >> index c10cd969c1a3b..f0f4d1a68e094 100644 >> --- a/mm/gup.c >> +++ b/mm/gup.c >> @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) >> #ifdef CONFIG_MMU >> #ifdef CONFIG_HAVE_GUP_FAST >> -static int record_subpages(struct page *page, unsigned long sz, >> - unsigned long addr, unsigned long end, >> - struct page **pages) >> -{ >> - int nr; >> - >> - page += (addr & (sz - 1)) >> PAGE_SHIFT; >> - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) >> - pages[nr] = page++; >> - >> - return nr; >> -} >> - >> /** >> * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. >> * @page: pointer to page to be grabbed >> @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> if (pmd_special(orig)) >> return 0; >> - page = pmd_page(orig); >> - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); >> + refs = (end - addr) >> PAGE_SHIFT; >> + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); >> folio = try_grab_folio_fast(page, refs, flags); >> if (!folio) >> @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> } >> *nr += refs; >> + for (; refs; refs--) >> + *(pages++) = page++; >> folio_set_referenced(folio); >> return 1; >> } >> @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> if (pud_special(orig)) >> return 0; >> - page = pud_page(orig); >> - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); >> + refs = (end - addr) >> PAGE_SHIFT; >> + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); >> folio = try_grab_folio_fast(page, refs, flags); >> if (!folio) >> @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> } >> *nr += refs; 
>> + for (; refs; refs--) >> + *(pages++) = page++; >> folio_set_referenced(folio); >> return 1; >> } > > Okay, this code is nasty. We should rework this code to just return the nr and receive a the proper > pages pointer, getting rid of the "*nr" parameter. > > For the time being, the following should do the trick: > > commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) > Author: David Hildenbrand > Date: Fri Sep 5 08:38:43 2025 +0200 > > fixup: mm/gup: remove record_subpages() > pages is not adjusted by the caller, but idnexed by existing *nr. > Signed-off-by: David Hildenbrand > > diff --git a/mm/gup.c b/mm/gup.c > index 010fe56f6e132..22420f2069ee1 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > return 0; > } > > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; > @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > return 0; > } > > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; > Tested as fixing the issue for me, thanks. -- Jens Axboe From lorenzo.stoakes at oracle.com Fri Sep 5 04:34:39 2025 From: lorenzo.stoakes at oracle.com (Lorenzo Stoakes) Date: Fri, 5 Sep 2025 12:34:39 +0100 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> Message-ID: On Fri, Sep 05, 2025 at 08:41:23AM +0200, David Hildenbrand wrote: > On 01.09.25 17:03, David Hildenbrand wrote: > > We can just cleanup the code by calculating the #refs earlier, > > so we can just inline what remains of record_subpages(). > > > > Calculate the number of references/pages ahead of times, and record them > > only once all our tests passed. 
> > > > Signed-off-by: David Hildenbrand So strange I thought I looked at this...! > > --- > > mm/gup.c | 25 ++++++++----------------- > > 1 file changed, 8 insertions(+), 17 deletions(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index c10cd969c1a3b..f0f4d1a68e094 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) > > #ifdef CONFIG_MMU > > #ifdef CONFIG_HAVE_GUP_FAST > > -static int record_subpages(struct page *page, unsigned long sz, > > - unsigned long addr, unsigned long end, > > - struct page **pages) > > -{ > > - int nr; > > - > > - page += (addr & (sz - 1)) >> PAGE_SHIFT; > > - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) > > - pages[nr] = page++; > > - > > - return nr; > > -} > > - > > /** > > * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. > > * @page: pointer to page to be grabbed > > @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > > if (pmd_special(orig)) > > return 0; > > - page = pmd_page(orig); > > - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); > > + refs = (end - addr) >> PAGE_SHIFT; > > + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > > if (!folio) > > @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > > } > > *nr += refs; > > + for (; refs; refs--) > > + *(pages++) = page++; > > folio_set_referenced(folio); > > return 1; > > } > > @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > > if (pud_special(orig)) > > return 0; > > - page = pud_page(orig); > > - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); > > + refs = (end - addr) >> PAGE_SHIFT; > > + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > > if (!folio) > > @@ -3030,6 +3019,8 
@@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > > } > > *nr += refs; > > + for (; refs; refs--) > > + *(pages++) = page++; > > folio_set_referenced(folio); > > return 1; > > } > > Okay, this code is nasty. We should rework this code to just return the nr and receive the proper > pages pointer, getting rid of the "*nr" parameter. > > For the time being, the following should do the trick: > > commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) > Author: David Hildenbrand > Date: Fri Sep 5 08:38:43 2025 +0200 > > fixup: mm/gup: remove record_subpages() > pages is not adjusted by the caller, but indexed by existing *nr. > Signed-off-by: David Hildenbrand > > diff --git a/mm/gup.c b/mm/gup.c > index 010fe56f6e132..22420f2069ee1 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > return 0; > } > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; > @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > return 0; > } > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; This looks correct. But. This is VERY nasty. Before we'd call record_subpages() with pages + *nr, where it was clear we were offsetting by this, now we're making things imo way more confusing. This makes me less in love with this approach to be honest. But perhaps it's the least worst thing for now until we can do a bigger refactor... So since this seems correct to me, and for the sake of moving things forward (was this one patch dropped from mm-new or does mm-new just have an old version? Confused): Reviewed-by: Lorenzo Stoakes For this patch obviously with the fix applied. 
But can we PLEASE revisit this :) > > > -- > > Cheers > > David / dhildenb > Cheers, Lorenzo From david at redhat.com Fri Sep 5 04:38:32 2025 From: david at redhat.com (David Hildenbrand) Date: Fri, 5 Sep 2025 13:38:32 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> Message-ID: <9fe9f8c7-f59d-4a4b-9668-d3cd2c5a5fc9@redhat.com> On 05.09.25 13:34, Lorenzo Stoakes wrote: > On Fri, Sep 05, 2025 at 08:41:23AM +0200, David Hildenbrand wrote: >> On 01.09.25 17:03, David Hildenbrand wrote: >>> We can just cleanup the code by calculating the #refs earlier, >>> so we can just inline what remains of record_subpages(). >>> >>> Calculate the number of references/pages ahead of times, and record them >>> only once all our tests passed. >>> >>> Signed-off-by: David Hildenbrand > > So strange I thought I looked at this...! > >>> --- >>> mm/gup.c | 25 ++++++++----------------- >>> 1 file changed, 8 insertions(+), 17 deletions(-) >>> >>> diff --git a/mm/gup.c b/mm/gup.c >>> index c10cd969c1a3b..f0f4d1a68e094 100644 >>> --- a/mm/gup.c >>> +++ b/mm/gup.c >>> @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) >>> #ifdef CONFIG_MMU >>> #ifdef CONFIG_HAVE_GUP_FAST >>> -static int record_subpages(struct page *page, unsigned long sz, >>> - unsigned long addr, unsigned long end, >>> - struct page **pages) >>> -{ >>> - int nr; >>> - >>> - page += (addr & (sz - 1)) >> PAGE_SHIFT; >>> - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) >>> - pages[nr] = page++; >>> - >>> - return nr; >>> -} >>> - >>> /** >>> * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. 
>>> * @page: pointer to page to be grabbed >>> @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >>> if (pmd_special(orig)) >>> return 0; >>> - page = pmd_page(orig); >>> - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); >>> + refs = (end - addr) >> PAGE_SHIFT; >>> + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); >>> folio = try_grab_folio_fast(page, refs, flags); >>> if (!folio) >>> @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >>> } >>> *nr += refs; >>> + for (; refs; refs--) >>> + *(pages++) = page++; >>> folio_set_referenced(folio); >>> return 1; >>> } >>> @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >>> if (pud_special(orig)) >>> return 0; >>> - page = pud_page(orig); >>> - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); >>> + refs = (end - addr) >> PAGE_SHIFT; >>> + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); >>> folio = try_grab_folio_fast(page, refs, flags); >>> if (!folio) >>> @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >>> } >>> *nr += refs; >>> + for (; refs; refs--) >>> + *(pages++) = page++; >>> folio_set_referenced(folio); >>> return 1; >>> } >> >> Okay, this code is nasty. We should rework this code to just return the nr and receive the proper >> pages pointer, getting rid of the "*nr" parameter. >> >> For the time being, the following should do the trick: >> >> commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) >> Author: David Hildenbrand >> Date: Fri Sep 5 08:38:43 2025 +0200 >> >> fixup: mm/gup: remove record_subpages() >> pages is not adjusted by the caller, but indexed by existing *nr.
>> Signed-off-by: David Hildenbrand >> >> diff --git a/mm/gup.c b/mm/gup.c >> index 010fe56f6e132..22420f2069ee1 100644 >> --- a/mm/gup.c >> +++ b/mm/gup.c >> @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> return 0; >> } >> + pages += *nr; >> *nr += refs; >> for (; refs; refs--) >> *(pages++) = page++; >> @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> return 0; >> } >> + pages += *nr; >> *nr += refs; >> for (; refs; refs--) >> *(pages++) = page++; > > This looks correct. > > But. > > This is VERY nasty. Before we'd call record_subpages() with pages + *nr, where > it was clear we were offsetting by this, now we're making things imo way more > confusing. > > This makes me less in love with this approach to be honest. > > But perhaps it's the least worst thing for now until we can do a bigger > refactor... > > So since this seems correct to me, and for the sake of moving things forward > (was this one patch dropped from mm-new or does mm-new just have an old version? > Confused): > > Reviewed-by: Lorenzo Stoakes > > For this patch obviously with the fix applied. > > But can we PLEASE revisit this :) Yeah, I already asked someone internally if he would have time to do some refactorings in mm/gup.c. If that won't work out I shall do it at some point (and at the same time rework follow_page_mask() to just consume the array as well, like gup does) -- Cheers David / dhildenb From apatel at ventanamicro.com Fri Sep 5 05:25:12 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Fri, 5 Sep 2025 17:55:12 +0530 Subject: [PATCH] RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs Message-ID: <20250905122512.71684-1-apatel@ventanamicro.com> The core kernel already supports parallel bringup of secondary CPUs (aka HOTPLUG_PARALLEL). The x86 and MIPS architectures already use HOTPLUG_PARALLEL and ARM is also moving toward it.
On RISC-V, there is no arch specific global data accessed in the RISC-V secondary CPU bringup path so enabling HOTPLUG_PARALLEL for RISC-V would only require: 1) Providing RISC-V specific arch_cpuhp_kick_ap_alive() 2) Calling cpuhp_ap_sync_alive() from smp_callin() This patch is tested natively with OpenSBI on QEMU RV64 virt machine with 64 cores and also tested with KVM RISC-V guest with 32 VCPUs. Signed-off-by: Anup Patel --- arch/riscv/Kconfig | 2 +- arch/riscv/kernel/smpboot.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..d5800d6f9a15 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -196,7 +196,7 @@ config RISCV select HAVE_SAMPLE_FTRACE_DIRECT_MULTI select HAVE_STACKPROTECTOR select HAVE_SYSCALL_TRACEPOINTS - select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU + select HOTPLUG_PARALLEL if HOTPLUG_CPU select IRQ_DOMAIN select IRQ_FORCED_THREADING select KASAN_VMALLOC if KASAN diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c index 601a321e0f17..d85916a3660c 100644 --- a/arch/riscv/kernel/smpboot.c +++ b/arch/riscv/kernel/smpboot.c @@ -39,7 +39,9 @@ #include "head.h" +#ifndef CONFIG_HOTPLUG_PARALLEL static DECLARE_COMPLETION(cpu_running); +#endif void __init smp_prepare_cpus(unsigned int max_cpus) { @@ -179,6 +181,12 @@ static int start_secondary_cpu(int cpu, struct task_struct *tidle) return -EOPNOTSUPP; } +#ifdef CONFIG_HOTPLUG_PARALLEL +int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle) +{ + return start_secondary_cpu(cpu, tidle); +} +#else int __cpu_up(unsigned int cpu, struct task_struct *tidle) { int ret = 0; @@ -199,6 +207,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle) return ret; } +#endif void __init smp_cpus_done(unsigned int max_cpus) { @@ -225,6 +234,10 @@ asmlinkage __visible void smp_callin(void) mmgrab(mm); current->active_mm = mm; +#ifdef CONFIG_HOTPLUG_PARALLEL + cpuhp_ap_sync_alive();
+#endif + store_cpu_topology(curr_cpuid); notify_cpu_starting(curr_cpuid); @@ -243,7 +256,9 @@ asmlinkage __visible void smp_callin(void) */ local_flush_icache_all(); local_flush_tlb_all(); +#ifndef CONFIG_HOTPLUG_PARALLEL complete(&cpu_running); +#endif /* * Disable preemption before enabling interrupts, so we don't try to * schedule a CPU that hasn't actually started yet. -- 2.43.0 From joro at 8bytes.org Fri Sep 5 06:07:55 2025 From: joro at 8bytes.org (Joerg Roedel) Date: Fri, 5 Sep 2025 15:07:55 +0200 Subject: [PATCH v6 0/3] RISC-V: Add ACPI support for IOMMU In-Reply-To: <20250818045807.763922-1-sunilvl@ventanamicro.com> References: <20250818045807.763922-1-sunilvl@ventanamicro.com> Message-ID: On Mon, Aug 18, 2025 at 10:28:04AM +0530, Sunil V L wrote: > Sunil V L (3): > ACPI: RISC-V: Add support for RIMT > ACPI: scan: Add support for RISC-V in acpi_iommu_configure_id() > iommu/riscv: Add ACPI support > > MAINTAINERS | 1 + > arch/riscv/Kconfig | 1 + > drivers/acpi/Kconfig | 4 + > drivers/acpi/riscv/Kconfig | 7 + > drivers/acpi/riscv/Makefile | 1 + > drivers/acpi/riscv/init.c | 2 + > drivers/acpi/riscv/init.h | 1 + > drivers/acpi/riscv/rimt.c | 520 +++++++++++++++++++++++++++ > drivers/acpi/scan.c | 4 + > drivers/iommu/riscv/iommu-platform.c | 17 +- > drivers/iommu/riscv/iommu.c | 10 + > include/linux/acpi_rimt.h | 28 ++ > 12 files changed, 595 insertions(+), 1 deletion(-) > create mode 100644 drivers/acpi/riscv/Kconfig > create mode 100644 drivers/acpi/riscv/rimt.c > create mode 100644 include/linux/acpi_rimt.h Applied, thanks. 
From ajones at ventanamicro.com Fri Sep 5 07:21:51 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Fri, 5 Sep 2025 09:21:51 -0500 Subject: [PATCH v3 5/6] RISC-V: KVM: Implement ONE_REG interface for SBI FWFT state In-Reply-To: <20250823155947.1354229-6-apatel@ventanamicro.com> References: <20250823155947.1354229-1-apatel@ventanamicro.com> <20250823155947.1354229-6-apatel@ventanamicro.com> Message-ID: <20250905-005c0bc3e16e909c5d91eef4@orel> On Sat, Aug 23, 2025 at 09:29:46PM +0530, Anup Patel wrote: > The KVM user-space needs a way to save/restore the state of > SBI FWFT features so implement SBI extension ONE_REG callbacks. > > Signed-off-by: Anup Patel > --- > arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 1 + > arch/riscv/include/uapi/asm/kvm.h | 15 ++ > arch/riscv/kvm/vcpu_sbi_fwft.c | 197 +++++++++++++++++++-- > 3 files changed, 200 insertions(+), 13 deletions(-) > Reviewed-by: Andrew Jones From e at freeshell.de Fri Sep 5 07:39:38 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:38 -0700 Subject: [PATCH v3 0/5] riscv: dts: starfive: Add Milk-V Mars CM (Lite) SoM Message-ID: <20250905144011.928332-1-e@freeshell.de> Milk-V Mars CM and Mars CM Lite System-on-Module both are based on the StarFive JH7110 SoC and compatible with the Raspberry Pi CM4IO Classic IO Board carrier. Mars CM Lite is Mars CM without the eMMC storage component on mmc0 and the mmc0 interface configured instead for SD Card use. The optional WiFi+BT chipset is connected via SDIO on the mmc1 interface that would otherwise be connected to an SD Card slot on the StarFive VisionFive2 reference design. Add the related devicetree files for both Milk-V Mars CM and Milk-V Mars CM Lite describing the currently supported features, namely PMIC, UART, I2C, GPIO, eMMC or SD Card, WiFi+BT, QSPI Flash, and Ethernet. Caveat: with the vendor AP6256 firmware files present, firmware loading succeeds but IRQ wake initialization subsequently fails.
Common GPIO conflicts for "WiFi" optioned boards having this module: pwmdac_pins: - AP6256: WL_REG_ON>>WIFI_REG_ON_H_GPIO33 - AP6256: WL_HOST_WAKE>>WIFI_WAKE_HOST_H_GPIO34 i2c2_pins: - AP6256: UART_CTS_N<>UART1_CTSn_GPIO3 i2c6_pins: - AP6256: UART_RXD<>UART_RX_GPIO17 Tested successfully for basic mmc0 storage, USB, and network functionality on: - Milk-V Mars CM 8GB - Milk-V Mars CM Lite 4GB - Milk-V Mars CM Lite WiFi 8GB Changes since v2: - PATCH 3/5 delete newline at end of file - PATCH 5/5 delete newline at end of file Link to v2: https://lore.kernel.org/lkml/20250831225959.531393-1-e at freeshell.de/ E Shattow (5): riscv: dts: starfive: add common board dtsi for Milk-V Mars CM variants dt-bindings: riscv: starfive: add milkv,marscm-emmc riscv: dts: starfive: add Milk-V Mars CM system-on-module dt-bindings: riscv: starfive: add milkv,marscm-lite riscv: dts: starfive: add Milk-V Mars CM Lite system-on-module .../devicetree/bindings/riscv/starfive.yaml | 2 + arch/riscv/boot/dts/starfive/Makefile | 2 + .../dts/starfive/jh7110-milkv-marscm-emmc.dts | 12 ++ .../dts/starfive/jh7110-milkv-marscm-lite.dts | 25 +++ .../dts/starfive/jh7110-milkv-marscm.dtsi | 159 ++++++++++++++++++ 5 files changed, 200 insertions(+) create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi base-commit: 8181cc2f3f21657392da912eb20ee17514c87828 -- 2.50.0 From e at freeshell.de Fri Sep 5 07:39:39 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:39 -0700 Subject: [PATCH v3 1/5] riscv: dts: starfive: add common board dtsi for Milk-V Mars CM variants In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250905144011.928332-2-e@freeshell.de> Add a common board dtsi for use by Milk-V Mars CM and Milk-V Mars CM Lite. 
Signed-off-by: E Shattow --- .../dts/starfive/jh7110-milkv-marscm.dtsi | 159 ++++++++++++++++++ 1 file changed, 159 insertions(+) create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi diff --git a/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi new file mode 100644 index 000000000000..25b70af564ee --- /dev/null +++ b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi @@ -0,0 +1,159 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2025 E Shattow + */ + +/dts-v1/; +#include +#include "jh7110-common.dtsi" + +/ { + aliases { + i2c1 = &i2c1; + i2c3 = &i2c3; + i2c4 = &i2c4; + serial3 = &uart3; + }; + + sdio_pwrseq: sdio-pwrseq { + compatible = "mmc-pwrseq-simple"; + reset-gpios = <&sysgpio 33 GPIO_ACTIVE_LOW>; + }; +}; + +&gmac0 { + assigned-clocks = <&aoncrg JH7110_AONCLK_GMAC0_TX>; + assigned-clock-parents = <&aoncrg JH7110_AONCLK_GMAC0_RMII_RTX>; + starfive,tx-use-rgmii-clk; + status = "okay"; +}; + +&i2c0 { + status = "okay"; +}; + +&i2c2 { + status = "disabled"; +}; + +&i2c6 { + status = "disabled"; +}; + +&mmc1 { + #address-cells = <1>; + #size-cells = <0>; + + mmc-pwrseq = <&sdio_pwrseq>; + non-removable; + status = "okay"; + + ap6256: wifi at 1 { + compatible = "brcm,bcm43456-fmac", "brcm,bcm4329-fmac"; + reg = <1>; + interrupt-parent = <&sysgpio>; + interrupts = <34 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "host-wake"; + pinctrl-0 = <&wifi_host_wake_irq>; + pinctrl-names = "default"; + }; +}; + +&pcie0 { + status = "okay"; +}; + +&phy0 { + rx-internal-delay-ps = <1500>; + tx-internal-delay-ps = <1500>; + motorcomm,rx-clk-drv-microamp = <3970>; + motorcomm,rx-data-drv-microamp = <2910>; + motorcomm,tx-clk-10-inverted; + motorcomm,tx-clk-100-inverted; + motorcomm,tx-clk-1000-inverted; + motorcomm,tx-clk-adj-enabled; +}; + +&pwm { + status = "okay"; +}; + +&spi0 { + status = "okay"; +}; + +&sysgpio { + uart1_pins: uart1-0 { + tx-pins { + pinmux = ; + 
bias-disable; + drive-strength = <12>; + input-disable; + input-schmitt-disable; + }; + + rx-pins { + pinmux = ; + bias-pull-up; + input-enable; + input-schmitt-enable; + }; + + cts-pins { + pinmux = ; + bias-disable; + input-enable; + input-schmitt-enable; + }; + + rts-pins { + pinmux = ; + bias-disable; + input-disable; + input-schmitt-disable; + }; + }; + + usb0_pins: usb0-0 { + vbus-pins { + pinmux = ; + bias-disable; + input-disable; + input-schmitt-disable; + slew-rate = <0>; + }; + }; + + wifi_host_wake_irq: wifi-host-wake-irq-0 { + wake-pins { + pinmux = ; + input-enable; + }; + }; +}; + +&uart1 { + uart-has-rtscts; + pinctrl-0 = <&uart1_pins>; + pinctrl-names = "default"; + status = "okay"; +}; + +&usb0 { + dr_mode = "host"; + pinctrl-names = "default"; + pinctrl-0 = <&usb0_pins>; + status = "okay"; +}; -- 2.50.0 From e at freeshell.de Fri Sep 5 07:39:40 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:40 -0700 Subject: [PATCH v3 2/5] dt-bindings: riscv: starfive: add milkv,marscm-emmc In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250905144011.928332-3-e@freeshell.de> Add "milkv,marscm-emmc" as a StarFive JH7110 SoC-based system-on-module. 
Signed-off-by: E Shattow Acked-by: Rob Herring (Arm) --- Documentation/devicetree/bindings/riscv/starfive.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/riscv/starfive.yaml b/Documentation/devicetree/bindings/riscv/starfive.yaml index 7ef85174353d..0713edb687fe 100644 --- a/Documentation/devicetree/bindings/riscv/starfive.yaml +++ b/Documentation/devicetree/bindings/riscv/starfive.yaml @@ -28,6 +28,7 @@ properties: - enum: - deepcomputing,fml13v01 - milkv,mars + - milkv,marscm-emmc - pine64,star64 - starfive,visionfive-2-v1.2a - starfive,visionfive-2-v1.3b -- 2.50.0 From e at freeshell.de Fri Sep 5 07:39:41 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:41 -0700 Subject: [PATCH v3 3/5] riscv: dts: starfive: add Milk-V Mars CM system-on-module In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250905144011.928332-4-e@freeshell.de> Milk-V Mars CM is a System-on-Module based on the StarFive VisionFive 2 board and Radxa CM3 System-on-Module compatible with the Raspberry Pi CM4IO Classic IO Board. 
Mars CM SoM features: - StarFive JH7110 System on Chip with RV64GC up to 1.5GHz - AXP15060 Power Management Unit - LPDDR4 2GB / 4GB / 8GB DRAM memory - BL24C04F 4K bits (512 x 8) EEPROM - GigaDevice 25LQ128EWIG QSPI NOR Flash 16M or SoC ROM UART loader for boot (selectable by GPIO) - eMMC5.0 8GB / 16GB / 32GB flash storage onboard - AP6256 via SDIO 2.0 onboard wireless connectivity WiFi 5 + Bluetooth 5.2 (optional, present in models with WiFi feature) - 1x Motorcomm YT8531C Gigabit Ethernet PHY - IMG BXE-4-32 Integrated GPU with 3D Acceleration: - H.264 & H.265 4K at 60fps Decoding - H.265 1080p at 30fps Encoding - JPEG encoder / decoder Additional features available via 2x 100-pin connectors for CM4IO Board: - 1x HDMI 2.0 - 1x MIPI DSI (4-lanes) - 1x 2CH Audio out (via GPIO) - 1x MIPI CSI (2x2-lanes or 1x4-lanes) - 1x USB 2.0 - 1x PCIe 1-lane Host, Gen 2 (5Gbps) - Up to 28x GPIO, supporting 3.3V - UART x6 - PWM x8 - I2C x7 - SPI - I2S Link to Milk-V Mars CM schematics: https://github.com/milkv-mars/mars-files/tree/main/Mars-CM_Hardware_Schematices Link to StarFive JH7110 Technical Reference Manual: https://doc-en.rvspace.org/JH7110/TRM/index.html Link to Raspberry Pi CM4IO datasheet: https://datasheets.raspberrypi.com/cm4io/cm4io-datasheet.pdf Add the devicetree file to make use of StarFive JH7110 common supported features PMIC, EEPROM, UART, I2C, GPIO, eMMC, PCIe, QSPI Flash, PWM, and Ethernet. Also configure the common SD Card interface mmc1 for onboard SDIO BT+WiFi. 
Signed-off-by: E Shattow --- arch/riscv/boot/dts/starfive/Makefile | 1 + .../boot/dts/starfive/jh7110-milkv-marscm-emmc.dts | 12 ++++++++++++ 2 files changed, 13 insertions(+) create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts diff --git a/arch/riscv/boot/dts/starfive/Makefile b/arch/riscv/boot/dts/starfive/Makefile index b3bb12f78e7d..79742617ddab 100644 --- a/arch/riscv/boot/dts/starfive/Makefile +++ b/arch/riscv/boot/dts/starfive/Makefile @@ -10,6 +10,7 @@ dtb-$(CONFIG_ARCH_STARFIVE) += jh7100-starfive-visionfive-v1.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-deepcomputing-fml13v01.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-milkv-mars.dtb +dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-milkv-marscm-emmc.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-pine64-star64.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-starfive-visionfive-2-v1.2a.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-starfive-visionfive-2-v1.3b.dtb diff --git a/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts new file mode 100644 index 000000000000..e568537af2c4 --- /dev/null +++ b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts @@ -0,0 +1,12 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2025 E Shattow + */ + +/dts-v1/; +#include "jh7110-milkv-marscm.dtsi" + +/ { + model = "Milk-V Mars CM"; + compatible = "milkv,marscm-emmc", "starfive,jh7110"; +}; -- 2.50.0 From e at freeshell.de Fri Sep 5 07:39:42 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:42 -0700 Subject: [PATCH v3 4/5] dt-bindings: riscv: starfive: add milkv,marscm-lite In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250905144011.928332-5-e@freeshell.de> Add "milkv,marscm-lite" as a StarFive JH7110 SoC-based system-on-module. 
Signed-off-by: E Shattow Acked-by: Rob Herring (Arm) --- Documentation/devicetree/bindings/riscv/starfive.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/riscv/starfive.yaml b/Documentation/devicetree/bindings/riscv/starfive.yaml index 0713edb687fe..04510341a71e 100644 --- a/Documentation/devicetree/bindings/riscv/starfive.yaml +++ b/Documentation/devicetree/bindings/riscv/starfive.yaml @@ -29,6 +29,7 @@ properties: - deepcomputing,fml13v01 - milkv,mars - milkv,marscm-emmc + - milkv,marscm-lite - pine64,star64 - starfive,visionfive-2-v1.2a - starfive,visionfive-2-v1.3b -- 2.50.0 From e at freeshell.de Fri Sep 5 07:39:43 2025 From: e at freeshell.de (E Shattow) Date: Fri, 5 Sep 2025 07:39:43 -0700 Subject: [PATCH v3 5/5] riscv: dts: starfive: add Milk-V Mars CM Lite system-on-module In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250905144011.928332-6-e@freeshell.de> Milk-V Mars CM Lite is a System-on-Module based on the Milk-V Mars CM without the onboard eMMC storage component populated and configured instead for SD3.0 Card Slot on that interface via 100-pin connector. Link to Milk-V Mars CM Lite schematics: https://github.com/milkv-mars/mars-files/tree/main/Mars-CM_Hardware_Schematices Link to StarFive JH7110 Technical Reference Manual: https://doc-en.rvspace.org/JH7110/TRM/index.html Link to Raspberry Pi CM4IO datasheet: https://datasheets.raspberrypi.com/cm4io/cm4io-datasheet.pdf Add the devicetree file to make use of StarFive JH7110 common supported features PMIC, EEPROM, UART, I2C, GPIO, PCIe, QSPI Flash, PWM, and Ethernet. Also configure the eMMC interface mmc0 for SD Card use and configure the common SD Card interface mmc1 for onboard SDIO BT+WiFi. 
Signed-off-by: E Shattow --- arch/riscv/boot/dts/starfive/Makefile | 1 + .../dts/starfive/jh7110-milkv-marscm-lite.dts | 25 +++++++++++++++++++ 2 files changed, 26 insertions(+) create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts diff --git a/arch/riscv/boot/dts/starfive/Makefile b/arch/riscv/boot/dts/starfive/Makefile index 79742617ddab..62b659f89ba7 100644 --- a/arch/riscv/boot/dts/starfive/Makefile +++ b/arch/riscv/boot/dts/starfive/Makefile @@ -11,6 +11,7 @@ dtb-$(CONFIG_ARCH_STARFIVE) += jh7100-starfive-visionfive-v1.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-deepcomputing-fml13v01.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-milkv-mars.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-milkv-marscm-emmc.dtb +dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-milkv-marscm-lite.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-pine64-star64.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-starfive-visionfive-2-v1.2a.dtb dtb-$(CONFIG_ARCH_STARFIVE) += jh7110-starfive-visionfive-2-v1.3b.dtb diff --git a/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts new file mode 100644 index 000000000000..6c40d0ec4011 --- /dev/null +++ b/arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts @@ -0,0 +1,25 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2025 E Shattow + */ + +/dts-v1/; +#include "jh7110-milkv-marscm.dtsi" + +/ { + model = "Milk-V Mars CM Lite"; + compatible = "milkv,marscm-lite", "starfive,jh7110"; +}; + +&mmc0 { + bus-width = <4>; + cd-gpios = <&sysgpio 41 GPIO_ACTIVE_LOW>; +}; + +&mmc0_pins { + pwren-pins { + pinmux = ; + }; +}; -- 2.50.0 From conor at kernel.org Fri Sep 5 08:24:39 2025 From: conor at kernel.org (Conor Dooley) Date: Fri, 5 Sep 2025 16:24:39 +0100 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> 
<20250904-sterilize-swagger-c7999b124e83@spud> Message-ID: <20250905-prolonged-chip-73be9d74ddb5@spud> Yo, On Fri, Sep 05, 2025 at 06:56:35AM +0800, Asuna wrote: > > One thing - please don't send new versions > > of patchsets in response to earlier versions or other threads. It > > doesn't do you any favours with mailbox visibility. > > I apologize for this, I'm pretty much new to mailing lists, so I had > followed the step "Explicit In-Reply-To headers" [1] in doc. For future > patches I'll send them alone instead of replying to existing threads. > > [1]: https://www.kernel.org/doc/html/v6.9/process/submitting-patches.html#explicit-in-reply-to-headers > > > Other than Zicsr/Zifencei that may need explicit handling in a dedicated > > option, the approach here seems kinda backwards. > > Individually these symbols don't actually mean what they say they do, > > which is confusing: "recognises" here is true even when it may not be > > true at all because TOOLCHAIN_HAS_FOO is not set. Why can these options > > not be removed, and instead the TOOLCHAIN_HAS_FOO options grow a > > "depends on !RUST || "? > > Yes, it's kinda "backwards", which is intentional, based on the following > considerations: > > 1) As mentioned in rust/Makefile, filtering flags for libclang is a hack, > because currently bindgen only has libclang as backend, and ideally bindgen > should support GCC so that the passed CC flags are supposed to be fully > compatible. On the RISC-V side, I tend to think that version checking for > extensions for libclang is also a hack, which could have been accomplished > with just the cc-option function, ideally. > > 2) Rust bindgen only "generates" FFI stuff, it is not involved in the final > assembly stage. In other words, it doesn't matter so much what RISC-V > extensions to turn on for bindgen (although it does have a little impact, > like some macro switches), it's more matter to CC. 
> Therefore, I chose not to modify the original extension config conditions so > that if libclang doesn't support the CC flag for an extension, then the Rust > build is not supported, rather than treating the extension as not supported. I don't agree with this take, I don't think that any extension should "blindly" take priority over rust like this. Got two or three main gripes with how it is being done here. Firstly, you're lumping every extension into one option even though many of them will not be even implemented on the target. There's no need to disable rust if the user has no intention of even making use of the extension that would block its use. That runs into the second point, in that you're using TOOLCHAIN_HAS_FOO here, which is only an indicator of whether the toolchain supports the extension not whether the kernel is even going to use it. The third problem I have is that the symbol you're interacting with is not user selectable, and therefore doesn't allow the user to decide whether or not a particular extension or rust support with the toolchain they have is the higher priority. If the check moves to the individual TOOLCHAIN_HAS_FOO options, they could be a depends on !RUST || which would allow the user to make a decision about which has a greater priority while also handling the extensions individually. > Nonetheless, it occurred to me as I was writing this reply that if GCC > implements a new extension in the future that LLVM/Clang doesn't yet have, > this could once again lead to a break in GCC+Rust build support if the > kernel decides to use the new extension. So it's a trade-off, you guys > decide, I'm fine with both. > > Regarding the name, initially I named it "compatible", and ended up changed > it to "recognize" before sending the patch. If we continue on this path, I'm > not sure what name is appropriate to use here, do you guys have any ideas? > > > What does the libclang >= 17 requirement actually do here? 
Is that the > > version where llvm starts to require that Zicsr/Zifencei is set in order > > to use them? I think a comment to that effect is required if so. This > > doesn't actually need to be blocking either, should just be able to > > filter it out of march when passing to bindgen, no? > > libclang >= 17 starts recognizing Zicsr/Zifencei in -march, passing them to > -march doesn't generate an error, and passing them or not doesn't have any > real difference. (still follows ISA before version 20190608 -- > Zicsr/Zifencei are included in base ISA). I should have written a comment > there to avoid confusion. > > Reference commit in LLVM/Clang 22e199e6af ("[RISCV] Accept zicsr and > zifencei command line options") > https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e15bf8a16 > > > What about the case where TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI is not > > set at all? Currently your patch is going to block rust in that case, > > when actually nothing needs to be done at all - no part of the toolchain > > requires understanding Zicsr/Zifencei as standalone extensions in this > > case. > > This is a bug, I missed this case. So it should be corrected to: > > config RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI > def_bool y > depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || > !TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI || > RUST_BINDGEN_LIBCLANG_VERSION >= 170000 > > > > The TOOLCHAIN_NEEDS_OLD_ISA_SPEC handling I don't remember 100% how it > > works, but if bindgen requires them to be set to use the extension > > this will return true but do nothing to add the extensions to march? > > That seems wrong to me. > > I'd be fairly amenable to disabling rust though when used in combination > > with gcc < 11.3 and gas >=2.36 since it's such a niche condition, rather > > doing work to support it. That'd be effectively an inversion of your > > first condition.
> > The current latest version of LLVM/Clang still does not require explicit > Zicsr/Zifencei to enable these two extensions, Clang just accepts them in > -march and then silently ignores them. > > Checking the usage of CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC: > > ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC > KBUILD_CFLAGS += -Wa,-misa-spec=2.2 > KBUILD_AFLAGS += -Wa,-misa-spec=2.2 > else > riscv-march-$(CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI) := > $(riscv-march-y)_zicsr_zifencei > endif > > It just uses -Wa to force an older ISA version to GAS. So the > RUST_BINDGEN_LIBCLANG_RECOGNIZES_ZICSR_ZIFENCEI I corrected above should be > fine now I guess? Or would you still prefer your idea of blocking Rust if > TOOLCHAIN_NEEDS_OLD_ISA_SPEC is true? Nah, if the explicit setting isn't required then it should be fine to not block on it being used. To be honest, I'm not concerned about Zicsr/Zifencei being communicated across to bindgen as much as I would be about other extensions, my motivation here is regarding build breakages - in particular when things like TOOLCHAIN_NEEDS_OLD_ISA_SPEC is set, since it's a very niche configuration that if someone told me they were using I would tell them to stop. As I said, the original reason for this existing was to support w/e old version of debian linaro were using that could not do LLVM=1 builds and I think the person who added to this to handle gcc with older binutils was trying to do a gradual move from an old toolchain in steps to a modern one, so neither were instances of someone actually wanting to use such a strange mix. > (To be clear, the breaking changes regarding Zicsr/Zifencei are since ISA > version 20190608, and versions 2.0, 2.1, 2.2 are older than 20190608) > > The only thing I'm confused about is that according to the comment of > TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI, GCC-12.1.0 bumped the default ISA > to 20191213, but why doesn't the depends-on have condition || (CC_IS_GCC && > GCC_VERSION >= 120100)?
It's probably something along the lines of there being no _C_ code that produces the Zicsr and Zifencei instructions, and therefore no build errors produced if they're missing. That's part of why I said my motivation in this particular case is build breakage, more than anything else. From conor at kernel.org Fri Sep 5 08:25:56 2025 From: conor at kernel.org (Conor Dooley) Date: Fri, 5 Sep 2025 16:25:56 +0100 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: <1b95b2f0-e916-4a86-a274-da2ff7f9d516@gmail.com> References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> <20250904-sterilize-swagger-c7999b124e83@spud> <1b95b2f0-e916-4a86-a274-da2ff7f9d516@gmail.com> Message-ID: <20250905-domain-theater-214254632b87@spud> On Fri, Sep 05, 2025 at 07:07:20AM +0800, Asuna wrote: > CC rust-for-linux list, I missed it in copying from get_maintainer.pl, the > thread is a bit of a mess now :( If you're doing that, keep the whole message in the mail. Think I just perpetuated the problem by replying to the mail a body rather than the one with the amended CC list.
From conor at kernel.org Fri Sep 5 08:28:43 2025 From: conor at kernel.org (Conor Dooley) Date: Fri, 5 Sep 2025 16:28:43 +0100 Subject: [PATCH 2/2] RISC-V: re-enable gcc + rust builds In-Reply-To: References: <20250830-cheesy-prone-ee5fae406c22@spud> <20250903190806.2604757-1-SpriteOvO@gmail.com> <20250903190806.2604757-2-SpriteOvO@gmail.com> <20250904-sterilize-swagger-c7999b124e83@spud> Message-ID: <20250905-swipe-unstuck-dd7ad6e5466a@spud> On Fri, Sep 05, 2025 at 06:56:35AM +0800, Asuna wrote: > > One thing - please don't send new versions > > of patchsets in response to earlier versions or other threads. It > > doesn't do you any favours with mailbox visibility. > > I apologize for this, I'm pretty much new to mailing lists, so I had > followed the step "Explicit In-Reply-To headers" [1] in doc. For future > patches I'll send them alone instead of replying to existing threads. > > [1]: https://www.kernel.org/doc/html/v6.9/process/submitting-patches.html#explicit-in-reply-to-headers Ye I think this is mostly just misleading. You're better off providing a lore link in the body of the mail than replying to some old thread. I find that explicit in-reply-to stuff only really helpful to send a single patch as part of a conversation where it's effectively an RFC.
From horms at kernel.org Fri Sep 5 08:35:00 2025 From: horms at kernel.org (Simon Horman) Date: Fri, 5 Sep 2025 16:35:00 +0100 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> Message-ID: <20250905153500.GH553991@horms.kernel.org> On Fri, Sep 05, 2025 at 07:09:31PM +0800, Vivian Wang wrote: > The Ethernet MACs found on SpacemiT K1 appears to be a custom design > that only superficially resembles some other embedded MACs. SpacemiT > refers to them as "EMAC", so let's just call the driver "k1_emac". > > Supports RGMII and RMII interfaces. Includes support for MAC hardware > statistics counters. PTP support is not implemented. > > Signed-off-by: Vivian Wang > Reviewed-by: Maxime Chevallier > Reviewed-by: Vadim Fedorenko > Reviewed-by: Troy Mitchell > Tested-by: Junhui Liu > Tested-by: Troy Mitchell ... > diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c ...
> +static void emac_init_hw(struct emac_priv *priv) > +{ > + /* Destination address for 802.3x Ethernet flow control */ > + u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; > + > + u32 rxirq = 0, dma = 0; > + > + regmap_set_bits(priv->regmap_apmu, > + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, > + AXI_SINGLE_ID); > + > + /* Disable transmit and receive units */ > + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); > + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); > + > + /* Enable MAC address 1 filtering */ > + emac_wr(priv, MAC_ADDRESS_CONTROL, MREGBIT_MAC_ADDRESS1_ENABLE); > + > + /* Zero initialize the multicast hash table */ > + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); > + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); > + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); > + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); > + > + /* Configure thresholds */ > + emac_wr(priv, MAC_TRANSMIT_FIFO_ALMOST_FULL, DEFAULT_TX_ALMOST_FULL); > + emac_wr(priv, MAC_TRANSMIT_PACKET_START_THRESHOLD, > + DEFAULT_TX_THRESHOLD); > + emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, DEFAULT_RX_THRESHOLD); > + > + /* Configure flow control (enabled in emac_adjust_link() later) */ > + emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); > + emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); > + emac_wr(priv, MAC_FC_HIGH_PAUSE_TIME, DEFAULT_FC_PAUSE_TIME); > + emac_wr(priv, MAC_FC_PAUSE_LOW_THRESHOLD, 0); > + > + /* RX IRQ mitigation */ > + rxirq = EMAC_RX_FRAMES & MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK; > + rxirq |= (EMAC_RX_COAL_TIMEOUT > + << MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT) & > + MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK; Probably this driver can benefit from using FIELD_PREP and FIELD_GET in a number of places. In this case I think it would mean that MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT can be removed entirely. 
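A rough userspace sketch of what the FIELD_PREP suggestion amounts to, for readers unfamiliar with the helper. Everything below is an assumption made for illustration: the mask positions are invented (the real ones live in the driver's header), and GENMASK_U32() and field_prep_u32() are simplified stand-ins for the kernel's GENMASK() and FIELD_PREP(), which additionally perform build-time checks on constant arguments. The point is only that the shift is derived from the mask, so a separate *_SHIFT constant becomes unnecessary.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(): bits h..l set, 0 <= l <= h <= 31. */
#define GENMASK_U32(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

/* Hypothetical field layout, NOT the real K1 EMAC register layout. */
#define RX_IRQ_FRAME_COUNTER_MASK	GENMASK_U32(7, 0)
#define RX_IRQ_TIMEOUT_COUNTER_MASK	GENMASK_U32(27, 8)
#define RX_IRQ_MITIGATION_ENABLE	(UINT32_C(1) << 31)

/*
 * Simplified stand-in for FIELD_PREP(): the shift is derived from the mask,
 * so no *_SHIFT constant is needed. mask must be non-zero and contiguous.
 */
static uint32_t field_prep_u32(uint32_t mask, uint32_t val)
{
	uint32_t lsb = mask & (~mask + 1);	/* lowest set bit of the mask */

	assert(val <= mask / lsb);		/* value must fit in the field */
	return (val * lsb) & mask;
}

/* Builds the RX IRQ mitigation register value from its two counter fields. */
static uint32_t rx_irq_mitigation_value(uint32_t frames, uint32_t timeout)
{
	uint32_t rxirq;

	rxirq = field_prep_u32(RX_IRQ_FRAME_COUNTER_MASK, frames);
	rxirq |= field_prep_u32(RX_IRQ_TIMEOUT_COUNTER_MASK, timeout);
	rxirq |= RX_IRQ_MITIGATION_ENABLE;
	return rxirq;
}
```

In the driver itself the equivalent would presumably use FIELD_PREP() from <linux/bitfield.h> directly with the existing *_MASK definitions.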
> + > + rxirq |= MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE; > + emac_wr(priv, DMA_RECEIVE_IRQ_MITIGATION_CTRL, rxirq); ... > +/* Returns number of packets received */ > +static int emac_rx_clean_desc(struct emac_priv *priv, int budget) > +{ > + struct net_device *ndev = priv->ndev; > + struct emac_rx_desc_buffer *rx_buf; > + struct emac_desc_ring *rx_ring; > + struct sk_buff *skb = NULL; > + struct emac_desc *rx_desc; > + u32 got = 0, skb_len, i; > + int status; > + > + rx_ring = &priv->rx_ring; > + > + i = rx_ring->tail; > + > + while (budget--) { > + rx_desc = &((struct emac_desc *)rx_ring->desc_addr)[i]; > + > + /* Stop checking if rx_desc still owned by DMA */ > + if (READ_ONCE(rx_desc->desc0) & RX_DESC_0_OWN) > + break; > + > + dma_rmb(); > + > + rx_buf = &rx_ring->rx_desc_buf[i]; > + > + if (!rx_buf->skb) > + break; > + > + got++; > + > + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, > + rx_buf->dma_len, DMA_FROM_DEVICE); > + > + status = emac_rx_frame_status(priv, rx_desc); > + if (unlikely(status == RX_FRAME_DISCARD)) { > + ndev->stats.rx_dropped++; As per the comment in struct net-device, ndev->stats should not be used in modern drivers. Probably you want to implement NETDEV_PCPU_STAT_TSTATS. Sorry for not mentioning this in an earlier review of stats in this driver. > + dev_kfree_skb_irq(rx_buf->skb); > + rx_buf->skb = NULL; > + } else { > + skb = rx_buf->skb; > + skb_len = rx_frame_len(rx_desc) - ETH_FCS_LEN; > + skb_put(skb, skb_len); > + skb->dev = ndev; > + ndev->hard_header_len = ETH_HLEN; > + > + skb->protocol = eth_type_trans(skb, ndev); > + > + skb->ip_summed = CHECKSUM_NONE; > + > + napi_gro_receive(&priv->napi, skb); > + > + ndev->stats.rx_packets++; > + ndev->stats.rx_bytes += skb_len; > + > + memset(rx_desc, 0, sizeof(struct emac_desc)); > + rx_buf->skb = NULL; > + } > + > + if (++i == rx_ring->total_cnt) > + i = 0; > + } > + > + rx_ring->tail = i; > + > + emac_alloc_rx_desc_buffers(priv); > + > + return got; > +} ... 
From wangruikang at iscas.ac.cn Fri Sep 5 08:45:29 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 5 Sep 2025 23:45:29 +0800 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905153500.GH553991@horms.kernel.org> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> Message-ID: <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> Hi Simon, Thanks for the review. (I have a question about the use of ndev->stats - see below.) On 9/5/25 23:35, Simon Horman wrote: > On Fri, Sep 05, 2025 at 07:09:31PM +0800, Vivian Wang wrote: >> The Ethernet MACs found on SpacemiT K1 appears to be a custom design >> that only superficially resembles some other embedded MACs. SpacemiT >> refers to them as "EMAC", so let's just call the driver "k1_emac". >> >> Supports RGMII and RMII interfaces. Includes support for MAC hardware >> statistics counters. PTP support is not implemented. >> >> Signed-off-by: Vivian Wang >> Reviewed-by: Maxime Chevallier >> Reviewed-by: Vadim Fedorenko >> Reviewed-by: Troy Mitchell >> Tested-by: Junhui Liu >> Tested-by: Troy Mitchell > ... > >> diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c > ... 
> >> +static void emac_init_hw(struct emac_priv *priv) >> +{ >> + /* Destination address for 802.3x Ethernet flow control */ >> + u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; >> + >> + u32 rxirq = 0, dma = 0; >> + >> + regmap_set_bits(priv->regmap_apmu, >> + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, >> + AXI_SINGLE_ID); >> + >> + /* Disable transmit and receive units */ >> + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); >> + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); >> + >> + /* Enable MAC address 1 filtering */ >> + emac_wr(priv, MAC_ADDRESS_CONTROL, MREGBIT_MAC_ADDRESS1_ENABLE); >> + >> + /* Zero initialize the multicast hash table */ >> + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); >> + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); >> + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); >> + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); >> + >> + /* Configure thresholds */ >> + emac_wr(priv, MAC_TRANSMIT_FIFO_ALMOST_FULL, DEFAULT_TX_ALMOST_FULL); >> + emac_wr(priv, MAC_TRANSMIT_PACKET_START_THRESHOLD, >> + DEFAULT_TX_THRESHOLD); >> + emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, DEFAULT_RX_THRESHOLD); >> + >> + /* Configure flow control (enabled in emac_adjust_link() later) */ >> + emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); >> + emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); >> + emac_wr(priv, MAC_FC_HIGH_PAUSE_TIME, DEFAULT_FC_PAUSE_TIME); >> + emac_wr(priv, MAC_FC_PAUSE_LOW_THRESHOLD, 0); >> + >> + /* RX IRQ mitigation */ >> + rxirq = EMAC_RX_FRAMES & MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK; >> + rxirq |= (EMAC_RX_COAL_TIMEOUT >> + << MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT) & >> + MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK; > Probably this driver can benefit from using FIELD_PREP and FIELD_GET > in a number of places. In this case I think it would mean that > MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_SHIFT can be removed entirely. That looks useful. 
There's a few more uses of *_SHIFT in this driver, and I think I can get them all to use FIELD_PREP. I'll change those in the next version. >> + >> + rxirq |= MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE; >> + emac_wr(priv, DMA_RECEIVE_IRQ_MITIGATION_CTRL, rxirq); > ... > >> +/* Returns number of packets received */ >> +static int emac_rx_clean_desc(struct emac_priv *priv, int budget) >> +{ >> + struct net_device *ndev = priv->ndev; >> + struct emac_rx_desc_buffer *rx_buf; >> + struct emac_desc_ring *rx_ring; >> + struct sk_buff *skb = NULL; >> + struct emac_desc *rx_desc; >> + u32 got = 0, skb_len, i; >> + int status; >> + >> + rx_ring = &priv->rx_ring; >> + >> + i = rx_ring->tail; >> + >> + while (budget--) { >> + rx_desc = &((struct emac_desc *)rx_ring->desc_addr)[i]; >> + >> + /* Stop checking if rx_desc still owned by DMA */ >> + if (READ_ONCE(rx_desc->desc0) & RX_DESC_0_OWN) >> + break; >> + >> + dma_rmb(); >> + >> + rx_buf = &rx_ring->rx_desc_buf[i]; >> + >> + if (!rx_buf->skb) >> + break; >> + >> + got++; >> + >> + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, >> + rx_buf->dma_len, DMA_FROM_DEVICE); >> + >> + status = emac_rx_frame_status(priv, rx_desc); >> + if (unlikely(status == RX_FRAME_DISCARD)) { >> + ndev->stats.rx_dropped++; > As per the comment in struct net-device, > ndev->stats should not be used in modern drivers. > > Probably you want to implement NETDEV_PCPU_STAT_TSTATS. > > Sorry for not mentioning this in an earlier review of > stats in this driver. > On a closer look, these counters in ndev->stats seems to be redundant with the hardware-tracked statistics, so maybe I should just not bother with updating ndev->stats. Does that make sense? 
Thanks, Vivian "dramforever" Wang From horms at kernel.org Fri Sep 5 09:01:58 2025 From: horms at kernel.org (Simon Horman) Date: Fri, 5 Sep 2025 17:01:58 +0100 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> Message-ID: <20250905160158.GI553991@horms.kernel.org> On Fri, Sep 05, 2025 at 11:45:29PM +0800, Vivian Wang wrote: ... Hi Vivian, > >> + status = emac_rx_frame_status(priv, rx_desc); > >> + if (unlikely(status == RX_FRAME_DISCARD)) { > >> + ndev->stats.rx_dropped++; > > As per the comment in struct net-device, > > ndev->stats should not be used in modern drivers. > > > > Probably you want to implement NETDEV_PCPU_STAT_TSTATS. > > > > Sorry for not mentioning this in an earlier review of > > stats in this driver. > > > On a closer look, these counters in ndev->stats seems to be redundant > with the hardware-tracked statistics, so maybe I should just not bother > with updating ndev->stats. Does that make sense? For rx/tx packets/bytes I think that makes sense. But what about rx/tx drops? ... From wangruikang at iscas.ac.cn Fri Sep 5 09:35:37 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Sat, 6 Sep 2025 00:35:37 +0800 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905160158.GI553991@horms.kernel.org> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> Message-ID: <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> On 9/6/25 00:01, Simon Horman wrote: > On Fri, Sep 05, 2025 at 11:45:29PM +0800, Vivian Wang wrote: > > ... 
> > Hi Vivian, > >>>> + status = emac_rx_frame_status(priv, rx_desc); >>>> + if (unlikely(status == RX_FRAME_DISCARD)) { >>>> + ndev->stats.rx_dropped++; >>> As per the comment in struct net-device, >>> ndev->stats should not be used in modern drivers. >>> >>> Probably you want to implement NETDEV_PCPU_STAT_TSTATS. >>> >>> Sorry for not mentioning this in an earlier review of >>> stats in this driver. >>> >> On a closer look, these counters in ndev->stats seems to be redundant >> with the hardware-tracked statistics, so maybe I should just not bother >> with updating ndev->stats. Does that make sense? > For rx/tx packets/bytes I think that makes sense. > But what about rx/tx drops? Right... but tstats doesn't have *_dropped. It seems that tx_dropped and rx_dropped are considered "slow path" for real devices. It makes sense to me that those should be very rare. So it seems that what I should do is to just track tx_dropped and rx_dropped myself in a member in emac_priv and report in the ndo_get_stats64 callback, and use the hardware stuff for the rest, as implemented now. Vivian "dramforever" Wang From vishal.moola at gmail.com Fri Sep 5 11:02:35 2025 From: vishal.moola at gmail.com (Vishal Moola (Oracle)) Date: Fri, 5 Sep 2025 11:02:35 -0700 Subject: [PATCH v3 3/7] x86: Stop calling page_address() in free_pages() In-Reply-To: References: <20250903185921.1785167-1-vishal.moola@gmail.com> <20250903185921.1785167-4-vishal.moola@gmail.com> Message-ID: On Thu, Sep 04, 2025 at 02:54:24PM +0300, Mike Rapoport wrote: > On Thu, Sep 04, 2025 at 02:51:14PM +0300, Mike Rapoport wrote: > > On Wed, Sep 03, 2025 at 11:59:17AM -0700, Vishal Moola (Oracle) wrote: > > > free_pages() should be used when we only have a virtual address. We > > > should call __free_pages() directly on our page instead. 
> > > > > > Signed-off-by: Vishal Moola (Oracle) > > > Acked-by: Dave Hansen > > > --- > > > arch/x86/mm/init_64.c | 2 +- > > > arch/x86/platform/efi/memmap.c | 2 +- > > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > > > index b9426fce5f3e..0e4270e20fad 100644 > > > --- a/arch/x86/mm/init_64.c > > > +++ b/arch/x86/mm/init_64.c > > > @@ -1031,7 +1031,7 @@ static void __meminit free_pagetable(struct page *page, int order) > > > free_reserved_pages(page, nr_pages); > > > #endif > > > } else { > > > - free_pages((unsigned long)page_address(page), order); > > > + __free_pages(page, order); > > > } > > > } > > > > > > diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c > > > index 061b8ecc71a1..023697c88910 100644 > > > --- a/arch/x86/platform/efi/memmap.c > > > +++ b/arch/x86/platform/efi/memmap.c > > > @@ -42,7 +42,7 @@ void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags) > > > struct page *p = pfn_to_page(PHYS_PFN(phys)); > > > unsigned int order = get_order(size); > > > > > > - free_pages((unsigned long) page_address(p), order); > > > > Could be just free_pages((unsigned long)phys_to_virt(phys), order), then > > the page is not needed at all. > > Or even __free_pages(phys_to_page(phys), order); Right. It actually looks like we could inline this whole block if we really wanted to... __free_pages(phys_to_page(phys), get_order(size)); Should I send a fixup (or v4) with this change? 
From atish.patra at linux.dev Fri Sep 5 11:04:55 2025 From: atish.patra at linux.dev (Atish Patra) Date: Fri, 5 Sep 2025 11:04:55 -0700 Subject: [PATCH] drivers/perf: riscv: Remove redundant ternary operators In-Reply-To: <20250828122510.30843-1-liaoyuanhong@vivo.com> References: <20250828122510.30843-1-liaoyuanhong@vivo.com> Message-ID: <3e113599-2d99-4585-af14-a93cafe11d33@linux.dev> On 8/28/25 5:25 AM, Liao Yuanhong wrote: > For ternary operators in the form of "a ? true : false", if 'a' itself > returns a boolean result, the ternary operator can be omitted. Remove > redundant ternary operators to clean up the code. > > Signed-off-by: Liao Yuanhong > --- > drivers/perf/riscv_pmu_sbi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c > index 698de8ddf895..c18dbffa9834 100644 > --- a/drivers/perf/riscv_pmu_sbi.c > +++ b/drivers/perf/riscv_pmu_sbi.c > @@ -339,7 +339,7 @@ static bool pmu_sbi_ctr_is_fw(int cidx) > if (!info) > return false; > > - return (info->type == SBI_PMU_CTR_TYPE_FW) ? 
true : false; > + return info->type == SBI_PMU_CTR_TYPE_FW; > } > > /* Reviewed-by: Atish Patra From atishp at rivosinc.com Fri Sep 5 12:34:48 2025 From: atishp at rivosinc.com (Atish Kumar Patra) Date: Fri, 5 Sep 2025 12:34:48 -0700 Subject: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memselot In-Reply-To: References: <20250829-pmu_event_info-v5-0-9dca26139a33@rivosinc.com> <20250829-pmu_event_info-v5-6-9dca26139a33@rivosinc.com> Message-ID: On Fri, Sep 5, 2025 at 1:23 AM Sean Christopherson wrote: > > On Wed, Sep 03, 2025, Atish Kumar Patra wrote: > > On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson wrote: > > > > > > On Fri, Aug 29, 2025, Atish Patra wrote: > > > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa) > > > > +{ > > > > + bool writable; > > > > + unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable); > > > > + > > > > + return !kvm_is_error_hva(hva) && writable; > > > > > > I don't hate this API, but I don't love it either. Because knowing that the > > > _memslot_ is writable doesn't mean all that much. E.g. in this usage: > > > > > > hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable); > > > if (kvm_is_error_hva(hva) || !writable) > > > return SBI_ERR_INVALID_ADDRESS; > > > > > > ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); > > > if (ret) > > > return SBI_ERR_FAILURE; > > > > > > the error code returned to the guest will be different if the memslot is read-only > > > versus if the VMA is read-only (or not even mapped!). Unless every read-only > > > memslot is explicitly communicated as such to the guest, I don't see how the guest > > > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case > > > but not when the underlying VMA isn't writable seems odd. > > > > > > It's also entirely possible the memslot could be replaced with a read-only memslot > > > after the check, or vice versa, i.e.
become writable after being rejected. Is it > > > *really* a problem to return FAILURE if the guest attempts to setup steal-time in > > > a read-only memslot? I.e. why not do this and call it good? > > > > > > > Reposting the response as gmail converted my previous response as > > html. Sorry for the spam. > > > > From a functionality pov, that should be fine. However, we have > > explicit error conditions for read only memory defined in the SBI STA > > specification[1]. > > Technically, we will violate the spec if we return FAILURE instead of > > INVALID_ADDRESS for read only memslot. > > But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the > memslot lookup and so could encounter a read-only memslot (if it races with > a memslot update), and because the underlying memory could be read-only even if > the memslot is writable. > Ahh. Thanks for clarifying that. > Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure? > The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the > userspace mapping is completely missing, but AFAICT that doesn't seem to be an > outright spec violation. Yes. That's correct. That can still be considered as invalid address. I will revise the patch according to this. Thanks for the suggestions. 
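The approach settled on above can be sketched as a small userspace model. This is an illustration under stated assumptions, not the revised patch: write_guest() is a hypothetical stand-in for kvm_vcpu_write_guest(), struct guest_region and sta_set_shmem() are invented names, and the 64-byte buffer size is chosen for the sketch. Only the SBI error value is taken from the SBI specification. The idea is that no separate writability pre-check is made, so a read-only memslot, a read-only VMA and a missing mapping all produce the same error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* SBI return codes (values per the SBI specification). */
#define SBI_SUCCESS			0
#define SBI_ERR_INVALID_ADDRESS		(-5)

/* Toy guest memory region standing in for a memslot plus its host mapping. */
struct guest_region {
	unsigned char *host;	/* NULL models a missing userspace mapping */
	size_t len;
	bool writable;		/* false models a read-only memslot or VMA */
};

/* Hypothetical stand-in for kvm_vcpu_write_guest(): 0 on success, -EFAULT otherwise. */
static int write_guest(struct guest_region *r, size_t off, const void *src, size_t n)
{
	if (!r->host || !r->writable || off > r->len || n > r->len - off)
		return -EFAULT;
	memcpy(r->host + off, src, n);
	return 0;
}

/* No writability pre-check: the result of the write alone decides the SBI error. */
static long sta_set_shmem(struct guest_region *r, size_t shmem_off)
{
	static const unsigned char zero_sta[64];	/* size chosen for the sketch */

	if (write_guest(r, shmem_off, zero_sta, sizeof(zero_sta)))
		return SBI_ERR_INVALID_ADDRESS;
	return SBI_SUCCESS;
}
```

Because the write is the only check, there is also no window between a lookup and the write in which a memslot update could change the outcome.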
From pjw at kernel.org Fri Sep 5 15:14:43 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 5 Sep 2025 16:14:43 -0600 (MDT) Subject: [PATCH 1/2] riscv: Fix sparse warning in __get_user_error() In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com> Message-ID: <6bc6fe97-b9e6-03ca-91e9-e61fcd51f3b9@kernel.org> On Wed, 3 Sep 2025, Alexandre Ghiti wrote: > We used to assign 0 to x without an appropriate cast which results in > sparse complaining when x is a pointer: > > >> block/ioctl.c:72:39: sparse: sparse: Using plain integer as NULL pointer > > So fix this by casting 0 to the correct type of x. Thanks, queued for v6.17-rc fixes. - Paul From pjw at kernel.org Fri Sep 5 15:15:06 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 5 Sep 2025 16:15:06 -0600 (MDT) Subject: [PATCH 2/2] riscv: Fix sparse warning about different address spaces In-Reply-To: <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> References: <20250903-dev-alex-sparse_warnings_v1-v1-0-7e6350beb700@rivosinc.com> <20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com> Message-ID: <4381e8fc-67eb-c7f8-d4ad-17a1fe0a5bfa@kernel.org> On Wed, 3 Sep 2025, Alexandre Ghiti wrote: > We did not propagate the __user attribute of the pointers in > __get_kernel_nofault() and __put_kernel_nofault(), which results in > sparse complaining: > > >> mm/maccess.c:41:17: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const [noderef] __user *from @@ got unsigned long long [usertype] * @@ > mm/maccess.c:41:17: sparse: expected void const [noderef] __user *from > mm/maccess.c:41:17: sparse: got unsigned long long [usertype] * > > So fix this by correctly casting those pointers. 
> > Reported-by: kernel test robot > Closes: https://lore.kernel.org/oe-kbuild-all/202508161713.RWu30Lv1-lkp at intel.com/ > Suggested-by: Al Viro > Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") > Cc: stable at vger.kernel.org > Signed-off-by: Alexandre Ghiti Thanks, queued for v6.17-rc fixes. - Paul From pjw at kernel.org Fri Sep 5 15:16:30 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 5 Sep 2025 16:16:30 -0600 (MDT) Subject: [GIT PULL] RISC-V updates for v6.17-rc5 Message-ID: <053b276c-b22b-f3e7-6c11-abe61b8ee36b@kernel.org> Linus, The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux tags/riscv-for-linus-6.17-rc5 for you to fetch changes up to a03ee11b8f850bd008226c6d392da24163dfb56e: riscv: Fix sparse warning about different address spaces (2025-09-05 15:33:52 -0600) ---------------------------------------------------------------- Several RISC-V fixes for v6.17-rc5: - An LTO fix for clang when building with CONFIG_CMODEL_MEDLOW - A fix for ACPI CPPC CSR read/write return values - Several fixes for incorrect access widths in thread_info.cpu reads - A fix for an issue in __put_user_nocheck() that was causing the glibc tst-socket-timestamp test to fail - A fix to initialize struct kexec_buf records in several kexec-related functions, which were generating UBSAN warnings - Two fixes for sparse warnings ---------------------------------------------------------------- Alexandre Ghiti (2): riscv: Fix sparse warning in __get_user_error() riscv: Fix sparse warning about different address spaces Anup Patel (1): ACPI: RISC-V: Fix FFH_CPPC_CSR error handling Aurelien Jarno (1): riscv: uaccess: fix __put_user_nocheck for unaligned accesses Breno Leitao (1): riscv: kexec: Initialize kexec_buf struct Nathan Chancellor (1): riscv: Only allow LTO with CMODEL_MEDANY Radim 
Krčmář (4): riscv: use lw when reading int cpu in new_vmalloc_check riscv, bpf: use lw when reading int cpu in BPF_MOV64_PERCPU_REG riscv, bpf: use lw when reading int cpu in bpf_get_smp_processor_id riscv: use lw when reading int cpu in asm_per_cpu arch/riscv/Kconfig | 2 +- arch/riscv/include/asm/asm.h | 2 +- arch/riscv/include/asm/uaccess.h | 8 ++++---- arch/riscv/kernel/entry.S | 2 +- arch/riscv/kernel/kexec_elf.c | 4 ++-- arch/riscv/kernel/kexec_image.c | 2 +- arch/riscv/kernel/machine_kexec_file.c | 2 +- arch/riscv/net/bpf_jit_comp64.c | 4 ++-- drivers/acpi/riscv/cppc.c | 4 ++-- 9 files changed, 15 insertions(+), 15 deletions(-) From pjw at kernel.org Fri Sep 5 15:32:01 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 5 Sep 2025 16:32:01 -0600 (MDT) Subject: Helping out with arch/riscv maintenance Message-ID: <4620eefc-8304-85a5-5a89-dcc6610edbfb@kernel.org> Hi folks, At Palmer's request, I'll be helping out with arch/riscv maintenance for a little while. Palmer's been doing a great job, and I'm sure I won't be able to do as good of a job as he's been doing. Let's see if we can keep things moving smartly in the interim. As you may have seen, a fixes PR was just sent to Linus. Many of the patches were initially collected by Alex Ghiti, and I borrowed some of his curation work for that branch - thanks very much, Alex. Will plan to send at least one more set of fixes to Linus before the merge window. Some of my local kernel maintenance infrastructure is a bit out of date. I'll be cleaning that up over the next few weeks. Thanks for your patience while that gets updated.
- Paul From ebiggers at kernel.org Fri Sep 5 16:00:06 2025 From: ebiggers at kernel.org (Eric Biggers) Date: Fri, 5 Sep 2025 16:00:06 -0700 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> Message-ID: <20250905230006.GA1776@sol> On Fri, Sep 05, 2025 at 08:41:23AM +0200, David Hildenbrand wrote: > On 01.09.25 17:03, David Hildenbrand wrote: > > We can just cleanup the code by calculating the #refs earlier, > > so we can just inline what remains of record_subpages(). > > > > Calculate the number of references/pages ahead of times, and record them > > only once all our tests passed. > > > > Signed-off-by: David Hildenbrand > > --- > > mm/gup.c | 25 ++++++++----------------- > > 1 file changed, 8 insertions(+), 17 deletions(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index c10cd969c1a3b..f0f4d1a68e094 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) > > #ifdef CONFIG_MMU > > #ifdef CONFIG_HAVE_GUP_FAST > > -static int record_subpages(struct page *page, unsigned long sz, > > - unsigned long addr, unsigned long end, > > - struct page **pages) > > -{ > > - int nr; > > - > > - page += (addr & (sz - 1)) >> PAGE_SHIFT; > > - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) > > - pages[nr] = page++; > > - > > - return nr; > > -} > > - > > /** > > * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. 
> > * @page: pointer to page to be grabbed > > @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > > if (pmd_special(orig)) > > return 0; > > - page = pmd_page(orig); > > - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); > > + refs = (end - addr) >> PAGE_SHIFT; > > + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > > if (!folio) > > @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > > } > > *nr += refs; > > + for (; refs; refs--) > > + *(pages++) = page++; > > folio_set_referenced(folio); > > return 1; > > } > > @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > > if (pud_special(orig)) > > return 0; > > - page = pud_page(orig); > > - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); > > + refs = (end - addr) >> PAGE_SHIFT; > > + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > > if (!folio) > > @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > > } > > *nr += refs; > > + for (; refs; refs--) > > + *(pages++) = page++; > > folio_set_referenced(folio); > > return 1; > > } > > Okay, this code is nasty. We should rework this code to just return the nr and receive a the proper > pages pointer, getting rid of the "*nr" parameter. > > For the time being, the following should do the trick: > > commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) > Author: David Hildenbrand > Date: Fri Sep 5 08:38:43 2025 +0200 > > fixup: mm/gup: remove record_subpages() > pages is not adjusted by the caller, but idnexed by existing *nr. 
> Signed-off-by: David Hildenbrand > > diff --git a/mm/gup.c b/mm/gup.c > index 010fe56f6e132..22420f2069ee1 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > return 0; > } > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; > @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > return 0; > } > + pages += *nr; > *nr += refs; > for (; refs; refs--) > *(pages++) = page++; Can this get folded in soon? This bug is causing crashes in AF_ALG too. Thanks, - Eric From dianders at chromium.org Fri Sep 5 16:57:37 2025 From: dianders at chromium.org (Doug Anderson) Date: Fri, 5 Sep 2025 16:57:37 -0700 Subject: [External] Re: [PATCH 1/2] watchdog: refactor watchdog_hld functionality In-Reply-To: References: <20250827100959.83023-1-cuiyunhui@bytedance.com> <20250827100959.83023-2-cuiyunhui@bytedance.com> Message-ID: Hi, On Wed, Sep 3, 2025 at 4:56 AM yunhui cui wrote: > > Hi Doug, > > On Wed, Sep 3, 2025 at 1:04 AM Doug Anderson wrote: > > > > Hi, > > > > On Sun, Aug 31, 2025 at 10:57 PM yunhui cui wrote: > > > > > > Hi Doug, > > > > > > On Sat, Aug 30, 2025 at 5:34 AM Doug Anderson wrote: > > > > > > > > Hi, > > > > > > > > On Wed, Aug 27, 2025 at 3:10 AM Yunhui Cui wrote: > > > > > > > > > > Move watchdog_hld.c to kernel/, and rename arm_pmu_irq_is_nmi() > > > > > to arch_pmu_irq_is_nmi() for cross-arch reusability.
> > > > > > > > > > Signed-off-by: Yunhui Cui > > > > > --- > > > > > arch/arm64/kernel/Makefile | 1 - > > > > > drivers/perf/arm_pmu.c | 2 +- > > > > > include/linux/nmi.h | 1 + > > > > > include/linux/perf/arm_pmu.h | 2 -- > > > > > kernel/Makefile | 2 +- > > > > > {arch/arm64/kernel => kernel}/watchdog_hld.c | 8 ++++++-- > > > > > 6 files changed, 9 insertions(+), 7 deletions(-) > > > > > rename {arch/arm64/kernel => kernel}/watchdog_hld.c (97%) > > > > > > > > I'm not a huge fan of the perf hardlockup detector and IMO we should > > > > maybe just delete it. Thus spreading it to support a new architecture > > > > isn't my favorite thing to do. Can't you use the buddy hardlockup > > > > detector? > > > > > > Why is there a plan to remove CONFIG_HARDLOCKUP_DETECTOR_PERF? Could > > > you explain the specific reasons? Is the community's future plan to > > > favor CONFIG_HARDLOCKUP_DETECTOR_BUDDY? > > > > I don't think there are any concrete plans, but there was some discussion here: > > > > https://lore.kernel.org/all/CAD=FV=WWUiCi6bZCs_gseFpDDWNkuJMoL6XCftEo6W7q6jRCkg at mail.gmail.com/ > > > > -Doug > > > > I've read your linked content, which details the pros and cons of perf > watchdog and buddy watchdog. > I think everyone will agree on choosing one as the default. > It seems there's no kernel/watchdog entry in MAINTAINERS - what's next > for these two approaches? I guess to start, someone (you?) should send some patches to the list. Maybe one patch to make buddy the default and one to change the description of the "perf" lockup detector say that its usage is discouraged, that it might be removed, that people should use the "buddy" detector instead, and that if there's a reason someone needs the "perf" detector instead of the buddy one then they should make some loud noises. You'd want to CC folks who were involved in recent watchdog changes and make sure to CC Andrew (akpm).
If folks react positively and Andrew agrees then he'll likely land the patches and we'll have made forward progress. :-) -Doug From kuba at kernel.org Fri Sep 5 16:59:08 2025 From: kuba at kernel.org (Jakub Kicinski) Date: Fri, 5 Sep 2025 16:59:08 -0700 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> Message-ID: <20250905165908.69548ce0@kernel.org> On Sat, 6 Sep 2025 00:35:37 +0800 Vivian Wang wrote: > >> On a closer look, these counters in ndev->stats seems to be redundant > >> with the hardware-tracked statistics, so maybe I should just not bother > >> with updating ndev->stats. Does that make sense? > > For rx/tx packets/bytes I think that makes sense. > > But what about rx/tx drops? > > Right... but tstats doesn't have *_dropped. It seems that tx_dropped and > rx_dropped are considered "slow path" for real devices. It makes sense > to me that those should be very rare. Pretty sure Simon meant the per-cpu netdev stats in general. There are three types of them, if you need drops I think you probably want dstats. Take a look.
From jhubbard at nvidia.com Fri Sep 5 18:05:21 2025 From: jhubbard at nvidia.com (John Hubbard) Date: Fri, 5 Sep 2025 18:05:21 -0700 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <20250901150359.867252-20-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> Message-ID: <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> On 9/1/25 8:03 AM, David Hildenbrand wrote: > We can just cleanup the code by calculating the #refs earlier, > so we can just inline what remains of record_subpages(). > > Calculate the number of references/pages ahead of times, and record them > only once all our tests passed. > > Signed-off-by: David Hildenbrand > --- > mm/gup.c | 25 ++++++++----------------- > 1 file changed, 8 insertions(+), 17 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index c10cd969c1a3b..f0f4d1a68e094 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) > #ifdef CONFIG_MMU > > #ifdef CONFIG_HAVE_GUP_FAST > -static int record_subpages(struct page *page, unsigned long sz, > - unsigned long addr, unsigned long end, > - struct page **pages) > -{ > - int nr; > - > - page += (addr & (sz - 1)) >> PAGE_SHIFT; > - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) > - pages[nr] = page++; > - > - return nr; > -} > - > /** > * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. 
> * @page: pointer to page to be grabbed > @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > if (pmd_special(orig)) > return 0; > > - page = pmd_page(orig); > - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); > + refs = (end - addr) >> PAGE_SHIFT; > + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > if (!folio) > @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > } > > *nr += refs; > + for (; refs; refs--) > + *(pages++) = page++; > folio_set_referenced(folio); > return 1; > } > @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > if (pud_special(orig)) > return 0; > > - page = pud_page(orig); > - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); > + refs = (end - addr) >> PAGE_SHIFT; > + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); > > folio = try_grab_folio_fast(page, refs, flags); > if (!folio) > @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > } > > *nr += refs; > + for (; refs; refs--) > + *(pages++) = page++; Hi David, Probably a similar sentiment as Lorenzo here...the above diffs make the code *worse* to read. In fact, I recall adding record_subpages() here long ago, specifically to help clarify what was going on. Now it's been returned to its original, cryptic form. Just my take on it, for whatever that's worth.
:) thanks, -- John Hubbard > folio_set_referenced(folio); > return 1; > } From wangruikang at iscas.ac.cn Fri Sep 5 18:46:31 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Sat, 6 Sep 2025 09:46:31 +0800 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905165908.69548ce0@kernel.org> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> <20250905165908.69548ce0@kernel.org> Message-ID: <19279021-e89e-458a-8bf1-62ad2f76a0ba@iscas.ac.cn> On 9/6/25 07:59, Jakub Kicinski wrote: > On Sat, 6 Sep 2025 00:35:37 +0800 Vivian Wang wrote: >>>> On a closer look, these counters in ndev->stats seems to be redundant >>>> with the hardware-tracked statistics, so maybe I should just not bother >>>> with updating ndev->stats. Does that make sense? >>> For rx/tx packets/bytes I think that makes sense. >>> But what about rx/tx drops? >> Right... but tstats doesn't have *_dropped. It seems that tx_dropped and >> rx_dropped are considered "slow path" for real devices. It makes sense >> to me that those should be very rare. > Pretty sure Simon meant the per-cpu netdev stats in general. > There are three types of them, if you need drops I think you > probably want dstats. Take a look. Thank you, I will look into this. 
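A minimal user-space sketch of the indexing issue discussed in the "mm/gup: remove record_subpages()" thread, in case it helps readers follow the diffs. It assumes a simplified stand-in for struct page, and the helper names record_buggy(), record_fixed() and two_leaves_ok() are hypothetical, not kernel functions. The removed record_subpages() call was handed `pages + *nr`, whereas the inlined loop writes through `pages` directly, so without the `pages += *nr` fixup every leaf appears to start filling the array at slot 0 again.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page (assumption). */
struct page { int id; };

/* Models the inlined loop without the fixup: ignores the *nr entries
 * that earlier leaves already recorded. */
static void record_buggy(struct page **pages, int *nr, struct page *page, int refs)
{
	*nr += refs;
	for (; refs; refs--)
		*(pages++) = page++;
}

/* Models the loop with the fixup: skip past the already-filled entries. */
static void record_fixed(struct page **pages, int *nr, struct page *page, int refs)
{
	pages += *nr;
	*nr += refs;
	for (; refs; refs--)
		*(pages++) = page++;
}

/* Returns 1 when two consecutive 2-page leaves land in slots 0..3. */
static int two_leaves_ok(void (*record)(struct page **, int *, struct page *, int))
{
	struct page mem[4] = { {0}, {1}, {2}, {3} };
	struct page *out[4] = { NULL, NULL, NULL, NULL };
	int nr = 0;

	record(out, &nr, &mem[0], 2);
	record(out, &nr, &mem[2], 2);
	return nr == 4 && out[0] == &mem[0] && out[1] == &mem[1] &&
	       out[2] == &mem[2] && out[3] == &mem[3];
}
```

In this model two_leaves_ok(record_fixed) holds, while two_leaves_ok(record_buggy) does not: the count still reaches 4, but the second leaf overwrites slots 0 and 1 and leaves slots 2 and 3 unset.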
From david at redhat.com Fri Sep 5 23:56:48 2025 From: david at redhat.com (David Hildenbrand) Date: Sat, 6 Sep 2025 08:56:48 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> Message-ID: <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> On 06.09.25 03:05, John Hubbard wrote: > On 9/1/25 8:03 AM, David Hildenbrand wrote: >> We can just cleanup the code by calculating the #refs earlier, >> so we can just inline what remains of record_subpages(). >> >> Calculate the number of references/pages ahead of times, and record them >> only once all our tests passed. >> >> Signed-off-by: David Hildenbrand >> --- >> mm/gup.c | 25 ++++++++----------------- >> 1 file changed, 8 insertions(+), 17 deletions(-) >> >> diff --git a/mm/gup.c b/mm/gup.c >> index c10cd969c1a3b..f0f4d1a68e094 100644 >> --- a/mm/gup.c >> +++ b/mm/gup.c >> @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) >> #ifdef CONFIG_MMU >> >> #ifdef CONFIG_HAVE_GUP_FAST >> -static int record_subpages(struct page *page, unsigned long sz, >> - unsigned long addr, unsigned long end, >> - struct page **pages) >> -{ >> - int nr; >> - >> - page += (addr & (sz - 1)) >> PAGE_SHIFT; >> - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) >> - pages[nr] = page++; >> - >> - return nr; >> -} >> - >> /** >> * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. 
>> * @page: pointer to page to be grabbed >> @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> if (pmd_special(orig)) >> return 0; >> >> - page = pmd_page(orig); >> - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); >> + refs = (end - addr) >> PAGE_SHIFT; >> + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); >> >> folio = try_grab_folio_fast(page, refs, flags); >> if (!folio) >> @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> } >> >> *nr += refs; >> + for (; refs; refs--) >> + *(pages++) = page++; >> folio_set_referenced(folio); >> return 1; >> } >> @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> if (pud_special(orig)) >> return 0; >> >> - page = pud_page(orig); >> - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); >> + refs = (end - addr) >> PAGE_SHIFT; >> + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); >> >> folio = try_grab_folio_fast(page, refs, flags); >> if (!folio) >> @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> } >> >> *nr += refs; >> + for (; refs; refs--) >> + *(pages++) = page++; > > Hi David, Hi! > > Probably a similar sentiment as Lorenzo here...the above diffs make the code > *worse* to read. In fact, I recall adding record_subpages() here long ago, > specifically to help clarify what was going on. Well, there is a lot I dislike about record_subpages() to go back there. Starting with "as Willy keeps explaining, the concept of subpages does not exist" and ending with "why do we fill out the array even on failure". :) > > Now it's been returned to its original, cryptic form. > The code in the caller was so uncryptic that both me and Lorenzo missed that magical addition. :P > Just my take on it, for whatever that's worth. :) As always, appreciated.
I could of course keep the simple loop in some "record_folio_pages" function and clean up what I dislike about record_subpages(). But I much rather want the call chain to be cleaned up instead, if possible. Roughly, what I am thinking (limiting it to pte+pmd case) about is the following: From d6d6d21dbf435d8030782a627175e36e6c7b2dfb Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Sat, 6 Sep 2025 08:33:42 +0200 Subject: [PATCH] tmp Signed-off-by: David Hildenbrand --- mm/gup.c | 79 ++++++++++++++++++++++++++------------------------------ 1 file changed, 36 insertions(+), 43 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 22420f2069ee1..98907ead749c0 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2845,12 +2845,11 @@ static void __maybe_unused gup_fast_undo_dev_pagemap(int *nr, int nr_start, * also check pmd here to make sure pmd doesn't change (corresponds to * pmdp_collapse_flush() in the THP collapse code path). */ -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned int flags, struct page **pages) { struct dev_pagemap *pgmap = NULL; - int ret = 0; + unsigned long nr_pages = 0; pte_t *ptep, *ptem; ptem = ptep = pte_offset_map(&pmd, addr); @@ -2908,24 +2907,20 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, * details. 
*/ if (flags & FOLL_PIN) { - ret = arch_make_folio_accessible(folio); - if (ret) { + if (arch_make_folio_accessible(folio)) { gup_put_folio(folio, 1, flags); goto pte_unmap; } } folio_set_referenced(folio); - pages[*nr] = page; - (*nr)++; + pages[nr_pages++] = page; } while (ptep++, addr += PAGE_SIZE, addr != end); - ret = 1; - pte_unmap: if (pgmap) put_dev_pagemap(pgmap); pte_unmap(ptem); - return ret; + return nr_pages; } #else @@ -2938,21 +2933,24 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, * get_user_pages_fast_only implementation that can pin pages. Thus it's still * useful to have gup_fast_pmd_leaf even if we can't operate on ptes. */ -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned int flags, struct page **pages) { return 0; } #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */ -static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) +static unsigned long gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned int flags, struct page **pages) { + const unsigned long nr_pages = (end - addr) >> PAGE_SHIFT; struct page *page; struct folio *folio; - int refs; + unsigned long i; + + /* See gup_fast_pte_range() */ + if (pmd_protnone(orig)) + return 0; if (!pmd_access_permitted(orig, flags & FOLL_WRITE)) return 0; @@ -2960,33 +2958,30 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, if (pmd_special(orig)) return 0; - refs = (end - addr) >> PAGE_SHIFT; page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); - folio = try_grab_folio_fast(page, refs, flags); + folio = try_grab_folio_fast(page, nr_pages, flags); if (!folio) return 0; if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { 
- gup_put_folio(folio, refs, flags); + gup_put_folio(folio, nr_pages, flags); return 0; } if (!gup_fast_folio_allowed(folio, flags)) { - gup_put_folio(folio, refs, flags); + gup_put_folio(folio, nr_pages, flags); return 0; } if (!pmd_write(orig) && gup_must_unshare(NULL, flags, &folio->page)) { - gup_put_folio(folio, refs, flags); + gup_put_folio(folio, nr_pages, flags); return 0; } - pages += *nr; - *nr += refs; - for (; refs; refs--) + for (i = 0; i < nr_pages; i++) *(pages++) = page++; folio_set_referenced(folio); - return 1; + return nr_pages; } static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, @@ -3033,11 +3028,11 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, return 1; } -static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) +static unsigned long gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, + unsigned long end, unsigned int flags, struct page **pages) { - unsigned long next; + unsigned long cur_nr_pages, next; + unsigned long nr_pages = 0; pmd_t *pmdp; pmdp = pmd_offset_lockless(pudp, pud, addr); @@ -3046,23 +3041,21 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, next = pmd_addr_end(addr, end); if (!pmd_present(pmd)) - return 0; + break; - if (unlikely(pmd_leaf(pmd))) { - /* See gup_fast_pte_range() */ - if (pmd_protnone(pmd)) - return 0; + if (unlikely(pmd_leaf(pmd))) + cur_nr_pages = gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, pages); + else + cur_nr_pages = gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages); - if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, - pages, nr)) - return 0; + nr_pages += cur_nr_pages; + pages += cur_nr_pages; - } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags, - pages, nr)) - return 0; + if (nr_pages != (next - addr) >> PAGE_SIZE) + break; } while (pmdp++, addr = next, addr != end); - return 1; + return nr_pages; } 
static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, -- 2.50.1 Oh, I might even have found a bug moving away from that questionable "ret==1 means success" handling in gup_fast_pte_range()? Will have to double-check, but likely the following is the right thing to do. From 8f48b25ef93e7ef98611fd58ec89384ad5171782 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Sat, 6 Sep 2025 08:46:45 +0200 Subject: [PATCH] mm/gup: fix handling of errors from arch_make_folio_accessible() in follow_page_pte() In case we call arch_make_folio_accessible() and it fails, we would incorrectly return a value that is "!= 0" to the caller, indicating that we pinned all requested pages and that the caller can keep going. follow_page_pte() is not supposed to return error values, but instead 0 on failure and 1 on success. That is of course wrong, because the caller will just keep going pinning more pages. If we happen to pin a page afterwards, we're in trouble, because we essentially skipped some pages. Fixes: f28d43636d6f ("mm/gup/writeback: add callbacks for inaccessible pages") Signed-off-by: David Hildenbrand --- mm/gup.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 22420f2069ee1..cff226ec0ee7d 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2908,8 +2908,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, * details. 
*/ if (flags & FOLL_PIN) { - ret = arch_make_folio_accessible(folio); - if (ret) { + if (arch_make_folio_accessible(folio)) { gup_put_folio(folio, 1, flags); goto pte_unmap; } -- 2.50.1 -- Cheers David / dhildenb From david at redhat.com Fri Sep 5 23:57:37 2025 From: david at redhat.com (David Hildenbrand) Date: Sat, 6 Sep 2025 08:57:37 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <20250905230006.GA1776@sol> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> <20250905230006.GA1776@sol> Message-ID: <64fe4c61-f9cc-4a5a-9c33-07bd0f089e94@redhat.com> On 06.09.25 01:00, Eric Biggers wrote: > On Fri, Sep 05, 2025 at 08:41:23AM +0200, David Hildenbrand wrote: >> On 01.09.25 17:03, David Hildenbrand wrote: >>> We can just cleanup the code by calculating the #refs earlier, >>> so we can just inline what remains of record_subpages(). >>> >>> Calculate the number of references/pages ahead of times, and record them >>> only once all our tests passed. >>> >>> Signed-off-by: David Hildenbrand >>> --- >>> mm/gup.c | 25 ++++++++----------------- >>> 1 file changed, 8 insertions(+), 17 deletions(-) >>> >>> diff --git a/mm/gup.c b/mm/gup.c >>> index c10cd969c1a3b..f0f4d1a68e094 100644 >>> --- a/mm/gup.c >>> +++ b/mm/gup.c >>> @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm) >>> #ifdef CONFIG_MMU >>> #ifdef CONFIG_HAVE_GUP_FAST >>> -static int record_subpages(struct page *page, unsigned long sz, >>> - unsigned long addr, unsigned long end, >>> - struct page **pages) >>> -{ >>> - int nr; >>> - >>> - page += (addr & (sz - 1)) >> PAGE_SHIFT; >>> - for (nr = 0; addr != end; nr++, addr += PAGE_SIZE) >>> - pages[nr] = page++; >>> - >>> - return nr; >>> -} >>> - >>> /** >>> * try_grab_folio_fast() - Attempt to get or pin a folio in fast path. 
>>> * @page: pointer to page to be grabbed >>> @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >>> if (pmd_special(orig)) >>> return 0; >>> - page = pmd_page(orig); >>> - refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); >>> + refs = (end - addr) >> PAGE_SHIFT; >>> + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); >>> folio = try_grab_folio_fast(page, refs, flags); >>> if (!folio) >>> @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >>> } >>> *nr += refs; >>> + for (; refs; refs--) >>> + *(pages++) = page++; >>> folio_set_referenced(folio); >>> return 1; >>> } >>> @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >>> if (pud_special(orig)) >>> return 0; >>> - page = pud_page(orig); >>> - refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); >>> + refs = (end - addr) >> PAGE_SHIFT; >>> + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); >>> folio = try_grab_folio_fast(page, refs, flags); >>> if (!folio) >>> @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >>> } >>> *nr += refs; >>> + for (; refs; refs--) >>> + *(pages++) = page++; >>> folio_set_referenced(folio); >>> return 1; >>> } >> >> Okay, this code is nasty. We should rework this code to just return the nr and receive the proper >> pages pointer, getting rid of the "*nr" parameter. >> >> For the time being, the following should do the trick: >> >> commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page) >> Author: David Hildenbrand >> Date: Fri Sep 5 08:38:43 2025 +0200 >> >> fixup: mm/gup: remove record_subpages() >> pages is not adjusted by the caller, but indexed by existing *nr.
>> Signed-off-by: David Hildenbrand >> >> diff --git a/mm/gup.c b/mm/gup.c >> index 010fe56f6e132..22420f2069ee1 100644 >> --- a/mm/gup.c >> +++ b/mm/gup.c >> @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, >> return 0; >> } >> + pages += *nr; >> *nr += refs; >> for (; refs; refs--) >> *(pages++) = page++; >> @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, >> return 0; >> } >> + pages += *nr; >> *nr += refs; >> for (; refs; refs--) >> *(pages++) = page++; > > Can this get folded in soon? This bug is causing crashes in AF_ALG too. Andrew immediately dropped the original patch, so it's gone from mm-unstable and should be gone from next soon (today?). -- Cheers David / dhildenb From david at redhat.com Sat Sep 6 00:00:54 2025 From: david at redhat.com (David Hildenbrand) Date: Sat, 6 Sep 2025 09:00:54 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> Message-ID: <815cbde4-a56d-446d-b517-c63e12e473de@redhat.com> > pmdp = pmd_offset_lockless(pudp, pud, addr); > @@ -3046,23 +3041,21 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, > > next = pmd_addr_end(addr, end); > if (!pmd_present(pmd)) > - return 0; > + break; > > - if (unlikely(pmd_leaf(pmd))) { > - /* See gup_fast_pte_range() */ > - if (pmd_protnone(pmd)) > - return 0; > + if (unlikely(pmd_leaf(pmd))) > + cur_nr_pages = gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, pages); > + else > + cur_nr_pages = gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages); > > - if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + nr_pages += cur_nr_pages; > + pages += cur_nr_pages; > > - } else if 
(!gup_fast_pte_range(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + if (nr_pages != (next - addr) >> PAGE_SIZE) > + break; ^ cur_nr_pages. Open for suggestions on how to make that thing here even better. -- Cheers David / dhildenb From lkp at intel.com Sat Sep 6 06:31:09 2025 From: lkp at intel.com (kernel test robot) Date: Sat, 6 Sep 2025 21:31:09 +0800 Subject: [PATCH v9 2/5] mm: uffd_wp: Add pte_uffd_wp_available() In-Reply-To: <20250905103651.489197-3-zhangchunyan@iscas.ac.cn> References: <20250905103651.489197-3-zhangchunyan@iscas.ac.cn> Message-ID: <202509062145.1ipU0q7y-lkp@intel.com> Hi Chunyan, kernel test robot noticed the following build errors: [auto build test ERROR on brauner-vfs/vfs.all] [also build test ERROR on linus/master v6.17-rc4] [cannot apply to akpm-mm/mm-everything arnd-asm-generic/master next-20250905] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Chunyan-Zhang/mm-softdirty-Add-pte_soft_dirty_available/20250905-184138 base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all patch link: https://lore.kernel.org/r/20250905103651.489197-3-zhangchunyan%40iscas.ac.cn patch subject: [PATCH v9 2/5] mm: uffd_wp: Add pte_uffd_wp_available() config: m68k-allnoconfig (https://download.01.org/0day-ci/archive/20250906/202509062145.1ipU0q7y-lkp at intel.com/config) compiler: m68k-linux-gcc (GCC) 15.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250906/202509062145.1ipU0q7y-lkp at intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202509062145.1ipU0q7y-lkp at intel.com/ All errors (new ones prefixed by >>): In file included from fs/../mm/internal.h:13, from fs/exec.c:82: include/linux/mm_inline.h: In function 'pte_install_uffd_wp_if_needed': >> include/linux/mm_inline.h:573:56: error: implicit declaration of function 'pte_uffd_wp_available'; did you mean 'pte_soft_dirty_available'? [-Wimplicit-function-declaration] 573 | if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) | ^~~~~~~~~~~~~~~~~~~~~ | pte_soft_dirty_available In file included from include/asm-generic/bug.h:7, from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/thread_info.h:13, from include/asm-generic/preempt.h:5, from ./arch/m68k/include/generated/asm/preempt.h:1, from include/linux/preempt.h:79, from include/linux/spinlock.h:56, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/slab.h:16, from fs/exec.c:27: >> include/linux/mm_inline.h:579:23: error: implicit declaration of function 'pte_none'; did you mean 'p4d_none'? 
[-Wimplicit-function-declaration] 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~ include/linux/once_lite.h:28:41: note: in definition of macro 'DO_ONCE_LITE_IF' 28 | bool __ret_do_once = !!(condition); \ | ^~~~~~~~~ include/linux/mm_inline.h:579:9: note: in expansion of macro 'WARN_ON_ONCE' 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~~~~~ >> include/linux/mm_inline.h:579:32: error: implicit declaration of function 'ptep_get' [-Wimplicit-function-declaration] 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~ include/linux/once_lite.h:28:41: note: in definition of macro 'DO_ONCE_LITE_IF' 28 | bool __ret_do_once = !!(condition); \ | ^~~~~~~~~ include/linux/mm_inline.h:579:9: note: in expansion of macro 'WARN_ON_ONCE' 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~~~~~ In file included from include/linux/file.h:9, from include/linux/kernel_read_file.h:5, from fs/exec.c:26: >> include/linux/mm_inline.h:591:22: error: implicit declaration of function 'pte_present'; did you mean 'p4d_present'? [-Wimplicit-function-declaration] 591 | if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) | ^~~~~~~~~~~ include/linux/compiler.h:77:45: note: in definition of macro 'unlikely' 77 | # define unlikely(x) __builtin_expect(!!(x), 0) | ^ >> include/linux/mm_inline.h:591:45: error: implicit declaration of function 'pte_uffd_wp' [-Wimplicit-function-declaration] 591 | if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) | ^~~~~~~~~~~ include/linux/compiler.h:77:45: note: in definition of macro 'unlikely' 77 | # define unlikely(x) __builtin_expect(!!(x), 0) | ^ >> include/linux/mm_inline.h:602:17: error: implicit declaration of function 'set_pte_at'; did you mean 'user_path_at'? 
[-Wimplicit-function-declaration] 602 | set_pte_at(vma->vm_mm, addr, pte, | ^~~~~~~~~~ | user_path_at >> include/linux/mm_inline.h:603:28: error: implicit declaration of function 'make_pte_marker' [-Wimplicit-function-declaration] 603 | make_pte_marker(PTE_MARKER_UFFD_WP)); | ^~~~~~~~~~~~~~~ >> include/linux/mm_inline.h:603:44: error: 'PTE_MARKER_UFFD_WP' undeclared (first use in this function) 603 | make_pte_marker(PTE_MARKER_UFFD_WP)); | ^~~~~~~~~~~~~~~~~~ include/linux/mm_inline.h:603:44: note: each undeclared identifier is reported only once for each function it appears in -- In file included from fs/splice.c:27: include/linux/mm_inline.h: In function 'pte_install_uffd_wp_if_needed': >> include/linux/mm_inline.h:573:56: error: implicit declaration of function 'pte_uffd_wp_available'; did you mean 'pte_soft_dirty_available'? [-Wimplicit-function-declaration] 573 | if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) | ^~~~~~~~~~~~~~~~~~~~~ | pte_soft_dirty_available In file included from include/asm-generic/bug.h:7, from arch/m68k/include/asm/bug.h:32, from include/linux/bug.h:5, from include/linux/vfsdebug.h:5, from include/linux/fs.h:5, from include/linux/highmem.h:5, from include/linux/bvec.h:10, from fs/splice.c:21: >> include/linux/mm_inline.h:579:23: error: implicit declaration of function 'pte_none'; did you mean 'p4d_none'? 
[-Wimplicit-function-declaration] 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~ include/linux/once_lite.h:28:41: note: in definition of macro 'DO_ONCE_LITE_IF' 28 | bool __ret_do_once = !!(condition); \ | ^~~~~~~~~ include/linux/mm_inline.h:579:9: note: in expansion of macro 'WARN_ON_ONCE' 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~~~~~ >> include/linux/mm_inline.h:579:32: error: implicit declaration of function 'ptep_get' [-Wimplicit-function-declaration] 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~ include/linux/once_lite.h:28:41: note: in definition of macro 'DO_ONCE_LITE_IF' 28 | bool __ret_do_once = !!(condition); \ | ^~~~~~~~~ include/linux/mm_inline.h:579:9: note: in expansion of macro 'WARN_ON_ONCE' 579 | WARN_ON_ONCE(!pte_none(ptep_get(pte))); | ^~~~~~~~~~~~ In file included from include/asm-generic/bug.h:5: >> include/linux/mm_inline.h:591:22: error: implicit declaration of function 'pte_present'; did you mean 'p4d_present'? [-Wimplicit-function-declaration] 591 | if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) | ^~~~~~~~~~~ include/linux/compiler.h:77:45: note: in definition of macro 'unlikely' 77 | # define unlikely(x) __builtin_expect(!!(x), 0) | ^ >> include/linux/mm_inline.h:591:45: error: implicit declaration of function 'pte_uffd_wp' [-Wimplicit-function-declaration] 591 | if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) | ^~~~~~~~~~~ include/linux/compiler.h:77:45: note: in definition of macro 'unlikely' 77 | # define unlikely(x) __builtin_expect(!!(x), 0) | ^ include/linux/mm_inline.h:602:17: error: implicit declaration of function 'set_pte_at' [-Wimplicit-function-declaration] 602 | set_pte_at(vma->vm_mm, addr, pte, | ^~~~~~~~~~ >> include/linux/mm_inline.h:603:28: error: implicit declaration of function 'make_pte_marker' [-Wimplicit-function-declaration] 603 | make_pte_marker(PTE_MARKER_UFFD_WP)); | ^~~~~~~~~~~~~~~ >> include/linux/mm_inline.h:603:44: error: 'PTE_MARKER_UFFD_WP' undeclared 
(first use in this function) 603 | make_pte_marker(PTE_MARKER_UFFD_WP)); | ^~~~~~~~~~~~~~~~~~ include/linux/mm_inline.h:603:44: note: each undeclared identifier is reported only once for each function it appears in vim +573 include/linux/mm_inline.h 556 557 /* 558 * If this pte is wr-protected by uffd-wp in any form, arm the special pte to 559 * replace a none pte. NOTE! This should only be called when *pte is already 560 * cleared so we will never accidentally replace something valuable. Meanwhile 561 * none pte also means we are not demoting the pte so tlb flushed is not needed. 562 * E.g., when pte cleared the caller should have taken care of the tlb flush. 563 * 564 * Must be called with pgtable lock held so that no thread will see the none 565 * pte, and if they see it, they'll fault and serialize at the pgtable lock. 566 * 567 * Returns true if an uffd-wp pte was installed, false otherwise. 568 */ 569 static inline bool 570 pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, 571 pte_t *pte, pte_t pteval) 572 { > 573 if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) 574 return false; 575 576 bool arm_uffd_pte = false; 577 578 /* The current status of the pte should be "cleared" before calling */ > 579 WARN_ON_ONCE(!pte_none(ptep_get(pte))); 580 581 /* 582 * NOTE: userfaultfd_wp_unpopulated() doesn't need this whole 583 * thing, because when zapping either it means it's dropping the 584 * page, or in TTU where the present pte will be quickly replaced 585 * with a swap pte. There's no way of leaking the bit. 586 */ 587 if (vma_is_anonymous(vma) || !userfaultfd_wp(vma)) 588 return false; 589 590 /* A uffd-wp wr-protected normal pte */ > 591 if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) 592 arm_uffd_pte = true; 593 594 /* 595 * A uffd-wp wr-protected swap pte. Note: this should even cover an 596 * existing pte marker with uffd-wp bit set. 
597 */ 598 if (unlikely(pte_swp_uffd_wp_any(pteval))) 599 arm_uffd_pte = true; 600 601 if (unlikely(arm_uffd_pte)) { > 602 set_pte_at(vma->vm_mm, addr, pte, > 603 make_pte_marker(PTE_MARKER_UFFD_WP)); 604 return true; 605 } 606 607 return false; 608 } 609 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki From fustini at kernel.org Sat Sep 6 12:12:55 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 6 Sep 2025 12:12:55 -0700 Subject: [GIT PULL] clk: thead: Updates for v6.18 Message-ID: The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git tags/thead-clk-for-v6.18 for you to fetch changes up to c567bc5fc68c4388c00e11fc65fd14fe86b52070: clk: thead: th1520-ap: set all AXI clocks to CLK_IS_CRITICAL (2025-08-18 14:58:23 -0700) ---------------------------------------------------------------- T-HEAD clock changes for v6.18 Updates for the T-HEAD TH1520 clock controller: - Describe gate clocks with clk_gate so that clock gates can be clock parents. This is similar to the mux clock refactor in 54edba916e29 ("clk: thead: th1520-ap: Describe mux clocks with clk_mux"). - Add support for enabling/disabling PLLs. Some PLLs are put into a disabled state by the bootloader, and clock driver now has the ability to enable them. - Set all AXI clocks to CLK_IS_CRITICAL. The AXI crossbar of TH1520 has no proper timeout handling, which means gating AXI clocks can easily lead to bus timeout and hang the system. All these clock gates are ungated by default on system reset. - Convert all current CLK_IGNORE_UNUSED usage to CLK_IS_CRITICAL to prevent unwanted clock gating. - Fix parent of padctrl0 clock, fix parent of DPU pixel clocks and support changing DPU pixel clock rate. All changes have been tested in linux-next. 
Signed-off-by: Drew Fustini ---------------------------------------------------------------- Icenowy Zheng (5): clk: thead: th1520-ap: describe gate clocks with clk_gate clk: thead: th1520-ap: fix parent of padctrl0 clock clk: thead: add support for enabling/disabling PLLs clk: thead: support changing DPU pixel clock rate clk: thead: th1520-ap: set all AXI clocks to CLK_IS_CRITICAL Michal Wilczynski (1): clk: thead: Correct parent for DPU pixel clocks drivers/clk/thead/clk-th1520-ap.c | 503 ++++++++++++++++++++++---------------- 1 file changed, 292 insertions(+), 211 deletions(-) From fustini at kernel.org Sat Sep 6 12:15:41 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 6 Sep 2025 12:15:41 -0700 Subject: [GIT PULL] RISC-V T-HEAD Devicetrees for v6.18 Message-ID: The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git tags/thead-dt-for-v6.18 for you to fetch changes up to dfa743da83ab7ba51ec5692d5939ba1bab4b78c1: MAINTAINERS: Add RISC-V T-HEAD SoC patchwork (2025-09-06 11:05:03 -0700) ---------------------------------------------------------------- T-HEAD Devicetrees for v6.18 Add a device tree node for the IMG BXM-4-64 GPU present in the T-HEAD TH1520 SoC used by the Lichee Pi 4A board. This node enables support for the GPU using the drm/imagination driver. By adding this node, the kernel can recognize and initialize the GPU, providing graphics acceleration capabilities on the Lichee Pi 4A and other boards based on the TH1520 SoC. The display controller and HDMI output are still a work in progress. Also included is a MAINTAINERS patch that adds an entry for the T-Head SoC patchwork. 
Signed-off-by: Drew Fustini ---------------------------------------------------------------- Drew Fustini (1): MAINTAINERS: Add RISC-V T-HEAD SoC patchwork Michal Wilczynski (1): riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node MAINTAINERS | 1 + arch/riscv/boot/dts/thead/th1520.dtsi | 21 +++++++++++++++++++++ 2 files changed, 22 insertions(+) From ziyao at disroot.org Sat Sep 6 20:13:07 2025 From: ziyao at disroot.org (Yao Zi) Date: Sun, 7 Sep 2025 03:13:07 +0000 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> Message-ID: > On Fri, Sep 05, 2025 at 11:10:23AM +0800, Xukai Wang wrote: > This patch provides basic support for the K230 clock, which covers > all clocks in K230 SoC. > > The clock tree of the K230 SoC consists of a 24MHZ external crystal > oscillator, PLLs and an external pulse input for timerX, and their > derived clocks. > > Co-developed-by: Troy Mitchell > Signed-off-by: Troy Mitchell > Signed-off-by: Xukai Wang > --- > drivers/clk/Kconfig | 6 + > drivers/clk/Makefile | 1 + > drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 2463 insertions(+) > > diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig > index 299bc678ed1b9fcd9110bb8c5937a1bd1ea60e23..b597912607a6cc8eabff459a890a1e7353ef9c1d 100644 > --- a/drivers/clk/Kconfig > +++ b/drivers/clk/Kconfig > @@ -464,6 +464,12 @@ config COMMON_CLK_K210 > help > Support for the Canaan Kendryte K210 RISC-V SoC clocks. > > +config COMMON_CLK_K230 > + bool "Clock driver for the Canaan Kendryte K230 SoC" > + depends on ARCH_CANAAN || COMPILE_TEST > + help > + Support for the Canaan Kendryte K230 RISC-V SoC clocks. 
> + > config COMMON_CLK_SP7021 > tristate "Clock driver for Sunplus SP7021 SoC" > depends on SOC_SP7021 || COMPILE_TEST > diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile > index fb8878a5d7d93da6bec487460cdf63f1f764a431..5df50b1e14c701ed38397bfb257db26e8dd278b8 100644 > --- a/drivers/clk/Makefile > +++ b/drivers/clk/Makefile > @@ -51,6 +51,7 @@ obj-$(CONFIG_MACH_ASPEED_G6) += clk-ast2600.o > obj-$(CONFIG_ARCH_HIGHBANK) += clk-highbank.o > obj-$(CONFIG_CLK_HSDK) += clk-hsdk-pll.o > obj-$(CONFIG_COMMON_CLK_K210) += clk-k210.o > +obj-$(CONFIG_COMMON_CLK_K230) += clk-k230.o > obj-$(CONFIG_LMK04832) += clk-lmk04832.o > obj-$(CONFIG_COMMON_CLK_LAN966X) += clk-lan966x.o > obj-$(CONFIG_COMMON_CLK_LOCHNAGAR) += clk-lochnagar.o > diff --git a/drivers/clk/clk-k230.c b/drivers/clk/clk-k230.c > new file mode 100644 > index 0000000000000000000000000000000000000000..2ba74c008b30ae3400acbd8c08550e8315dfe205 > --- /dev/null > +++ b/drivers/clk/clk-k230.c > @@ -0,0 +1,2456 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Kendryte Canaan K230 Clock Drivers > + * > + * Author: Xukai Wang > + * Author: Troy Mitchell > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +/* PLL control register bits. */ > +#define K230_PLL_BYPASS_ENABLE BIT(19) > +#define K230_PLL_GATE_ENABLE BIT(2) > +#define K230_PLL_GATE_WRITE_ENABLE BIT(18) > +#define K230_PLL_OD_SHIFT 24 > +#define K230_PLL_OD_MASK 0xF > +#define K230_PLL_R_SHIFT 16 > +#define K230_PLL_R_MASK 0x3F > +#define K230_PLL_F_SHIFT 0 > +#define K230_PLL_F_MASK 0x1FFF > +#define K230_PLL_DIV_REG_OFFSET 0x00 > +#define K230_PLL_BYPASS_REG_OFFSET 0x04 > +#define K230_PLL_GATE_REG_OFFSET 0x08 > +#define K230_PLL_LOCK_REG_OFFSET 0x0C Maybe FIELD_PREP() and FIELD_GET() would help for the PLL-related routines, and you could avoid writing shifts and masks by hand. ...
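To illustrate the FIELD_PREP()/FIELD_GET() suggestion, here is a minimal userspace sketch. The SK_* macros are stand-ins written only for this example (the kernel's real helpers come from <linux/bits.h> and <linux/bitfield.h>), the field positions are derived from the K230_PLL_*_SHIFT/MASK defines quoted above, and pll_div_pack() and the pll_div_get_*() functions are hypothetical names, not part of the patch.

```c
#include <stdint.h>

/* Userspace stand-ins for GENMASK(), FIELD_PREP() and FIELD_GET(). */
#define SK_GENMASK(h, l) \
	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))
/* Lowest set bit of a mask, e.g. 0x003F0000 -> 0x00010000. */
#define SK_LSB(mask)		((mask) & (~(mask) + 1u))
#define SK_FIELD_PREP(mask, val) \
	((((uint32_t)(val)) * SK_LSB(mask)) & (mask))
#define SK_FIELD_GET(mask, reg) \
	((((uint32_t)(reg)) & (mask)) / SK_LSB(mask))

/* Each field is described once, as a mask already in register position. */
#define SK_PLL_OD	SK_GENMASK(27, 24)	/* shift 24, mask 0xF    */
#define SK_PLL_R	SK_GENMASK(21, 16)	/* shift 16, mask 0x3F   */
#define SK_PLL_F	SK_GENMASK(12, 0)	/* shift 0,  mask 0x1FFF */

/* Build a divider register value from its three fields. */
uint32_t pll_div_pack(uint32_t f, uint32_t r, uint32_t od)
{
	return SK_FIELD_PREP(SK_PLL_F, f) |
	       SK_FIELD_PREP(SK_PLL_R, r) |
	       SK_FIELD_PREP(SK_PLL_OD, od);
}

/* Extract individual fields from a register value. */
uint32_t pll_div_get_f(uint32_t reg)  { return SK_FIELD_GET(SK_PLL_F, reg); }
uint32_t pll_div_get_r(uint32_t reg)  { return SK_FIELD_GET(SK_PLL_R, reg); }
uint32_t pll_div_get_od(uint32_t reg) { return SK_FIELD_GET(SK_PLL_OD, reg); }
```

The point of the pattern is that a field is named by a single mask, so the shift can never drift out of sync with it.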
> +struct k230_clk_rate_self { > + struct clk_hw hw; > + void __iomem *reg; > + bool read_only; Isn't a read-only multiplier, divider or something capable of both a simple fixed-factor hardware? If so please switch to the existing clock hardware, instead of introducing a field in description of rate clocks. It's worth noting that you've already had at least one fixed-rate clock (shrm_sram_div2). > + u32 write_enable_bit; > + u32 mul_min; > + u32 mul_max; > + u32 mul_shift; > + u32 mul_mask; > + u32 div_min; > + u32 div_max; > + u32 div_shift; > + u32 div_mask; > + /* ensures mutual exclusion for concurrent register access. */ > + spinlock_t *lock; > +}; ... > +static int k230_clk_find_approximate_mul_div(u32 mul_min, u32 mul_max, > + u32 div_min, u32 div_max, > + unsigned long rate, > + unsigned long parent_rate, > + u32 *div, u32 *mul) > +{ > + long abs_min; > + long abs_current; > + long perfect_divide; > + > + if (!rate || !parent_rate || !mul_min) > + return -EINVAL; > + > + perfect_divide = (long)((parent_rate * 1000) / rate); > + abs_min = abs(perfect_divide - > + (long)(((long)div_max * 1000) / (long)mul_min)); > + > + *div = div_max; > + *mul = mul_min; > + > + for (u32 i = div_max - 1; i >= div_min; i--) { > + for (u32 j = mul_min + 1; j <= mul_max; j++) { > + abs_current = abs(perfect_divide - > + (long)(((long)i * 1000) / (long)j)); > + > + if (abs_min > abs_current) { > + abs_min = abs_current; > + *div = i; > + *mul = j; > + } > + } > + } > + > + return 0; > +} This looks like a poor version of rational_best_approximation(). Could you please consider switching to it? 
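As a rough illustration of what such a helper computes, the sketch below searches for the mul/div pair closest to rate/parent_rate within the given bounds. It is a brute-force userspace reference, not the kernel's algorithm (rational_best_approximation() uses continued fractions); best_ratio_bruteforce() is a hypothetical name whose parameter order is chosen to mirror the kernel helper as I understand it. It assumes given_den is non-zero, both maxima are at least 1, and the inputs are small enough that the 64-bit products do not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Find num/den with 1 <= num <= max_num and 1 <= den <= max_den that is
 * closest to given_num/given_den. Ties keep the smaller denominator.
 */
void best_ratio_bruteforce(uint64_t given_num, uint64_t given_den,
			   uint64_t max_num, uint64_t max_den,
			   uint64_t *best_num, uint64_t *best_den)
{
	/* |num * given_den - given_num * den| for the best pair so far. */
	uint64_t best_err = 0;
	bool have_best = false;

	*best_num = 1;
	*best_den = 1;

	for (uint64_t den = 1; den <= max_den; den++) {
		/* Nearest numerator for this denominator, clamped to range. */
		uint64_t num = (given_num * den + given_den / 2) / given_den;

		if (num < 1)
			num = 1;
		if (num > max_num)
			num = max_num;

		uint64_t lhs = num * given_den;
		uint64_t rhs = given_num * den;
		uint64_t err = lhs > rhs ? lhs - rhs : rhs - lhs;

		/*
		 * Compare err/den against best_err/best_den by
		 * cross-multiplying, which avoids floating point.
		 */
		if (!have_best || err * *best_den < best_err * den) {
			have_best = true;
			best_err = err;
			*best_num = num;
			*best_den = den;
		}
	}
}
```

For a 24 MHz parent and a 16 MHz target with both limits at 255, this settles on 2/3; unlike the fixed "* 1000" scaling in the quoted loop, the comparison is exact.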
> +static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate, > + unsigned long parent_rate) > +{ > + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > + struct k230_clk_rate_self *rate_self = &clk->clk; > + u32 div, mul, mul_reg; > + > + if (rate > parent_rate) > + return -EINVAL; > + > + if (rate_self->read_only) > + return 0; > + > + if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max, > + rate_self->div_min, rate_self->div_max, > + rate, parent_rate, &div, &mul)) > + return -EINVAL; > + > + guard(spinlock)(rate_self->lock); > + > + mul_reg = readl(rate_self->reg + clk->mul_reg_off); > + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); > + mul_reg |= BIT(rate_self->write_enable_bit); > + writel(mul_reg, rate_self->reg + clk->mul_reg_off); > + > + return 0; > +} > + > +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, > + unsigned long parent_rate) > +{ > + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > + struct k230_clk_rate_self *rate_self = &clk->clk; > + u32 div, mul, div_reg; > + > + if (rate > parent_rate) > + return -EINVAL; > + > + if (rate_self->read_only) > + return 0; > + > + if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max, > + rate_self->div_min, rate_self->div_max, > + rate, parent_rate, &div, &mul)) > + return -EINVAL; > + > + guard(spinlock)(rate_self->lock); > + > + div_reg = readl(rate_self->reg + clk->div_reg_off); > + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); > + div_reg |= BIT(rate_self->write_enable_bit); > + writel(div_reg, rate_self->reg + clk->div_reg_off); > + > + return 0; > +} > + > +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, > + unsigned long parent_rate) > +{ > + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > + struct k230_clk_rate_self *rate_self = &clk->clk; > + u32 div, mul, div_reg, mul_reg; > + > + if (rate > parent_rate) > + return -EINVAL; > + 
> + if (rate_self->read_only) > + return 0; > + > + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, > + rate_self->div_min, rate_self->div_max, > + rate, parent_rate, &div, &mul)) > + return -EINVAL; > + > + guard(spinlock)(rate_self->lock); > + > + div_reg = readl(rate_self->reg + clk->div_reg_off); > + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); > + div_reg |= BIT(rate_self->write_enable_bit); > + writel(div_reg, rate_self->reg + clk->div_reg_off); > + > + mul_reg = readl(rate_self->reg + clk->mul_reg_off); > + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); > + mul_reg |= BIT(rate_self->write_enable_bit); > + writel(mul_reg, rate_self->reg + clk->mul_reg_off); > + > + return 0; > +} There are three variants of rate clocks, mul-only, div-only and mul-div ones, which are similar to clk-multiplier, clk-divider, clk-fractional-divider. The only difference is that, to set up new parameters for K230's rate clocks, a register bit, described as k230_clk_rate_self.write_enable_bit, must be set first. What do you think of introducing support for such a "write enable bit" to the generic implementation of multiplier/divider/fractional? Then you could reuse the generic implementation in K230's driver, avoiding code duplication. ... > +static const struct of_device_id k230_clk_ids[] = { > + { .compatible = "canaan,k230-clk" }, > + { /* Sentinel */ } > +}; > +MODULE_DEVICE_TABLE(of, k230_clk_ids); MODULE_DEVICE_TABLE is unnecessary if your driver can't be built as a module.
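Returning to the "write enable bit" idea raised earlier in this reply: the sketch below models, in userspace, the register value a generic multiplier or divider would have to compute if it grew such an option. field_update_with_we() is a hypothetical helper, not an existing clk framework function. Note one deliberate assumption: unlike the quoted hunks, it clears the old field contents before OR-ing in the new value, which is the usual read-modify-write pattern but is not something the quoted patch does.

```c
#include <stdint.h>

/*
 * Compute the value to write back to a rate register.
 *   reg    - current register contents (what readl() returned)
 *   mask   - unshifted field mask, as in div_mask/mul_mask
 *   shift  - field position, as in div_shift/mul_shift
 *   val    - new raw field value (e.g. div - 1)
 *   we_bit - bit number of the write-enable flag
 */
uint32_t field_update_with_we(uint32_t reg, uint32_t mask, unsigned int shift,
			      uint32_t val, unsigned int we_bit)
{
	reg &= ~(mask << shift);	/* drop the previous field value */
	reg |= (val & mask) << shift;	/* insert the new one */
	reg |= UINT32_C(1) << we_bit;	/* ask the hardware to latch it */
	return reg;
}
```

With a helper of this shape, the mul-only, div-only and mul-div paths differ only in which mask/shift pair they pass.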
> +static struct platform_driver k230_clk_driver = { > + .driver = { > + .name = "k230_clock_controller", > + .of_match_table = k230_clk_ids, > + }, > + .probe = k230_clk_probe, > +}; > +builtin_platform_driver(k230_clk_driver); Best regards, Yao Zi From ziyao at disroot.org Sat Sep 6 20:17:26 2025 From: ziyao at disroot.org (Yao Zi) Date: Sun, 7 Sep 2025 03:17:26 +0000 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> Message-ID: On Sun, Sep 07, 2025 at 03:13:07AM +0000, Yao Zi wrote: > > +struct k230_clk_rate_self { > > + struct clk_hw hw; > > + void __iomem *reg; > > + bool read_only; > > Isn't a read-only multiplier, divider or something capable of both a > simple fixed-factor hardware? If so please switch to the existing clock > hardware, instead of introducing a field in description of rate clocks. > > It's worth noting that you've already had at least one fixed-rate clock > (shrm_sram_div2). It should be "fixed-factor" clock instead of "fixed-rate" clock, sorry for the typo. Regards, Yao Zi From jhubbard at nvidia.com Sat Sep 6 22:14:19 2025 From: jhubbard at nvidia.com (John Hubbard) Date: Sat, 6 Sep 2025 22:14:19 -0700 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> Message-ID: <0a28adde-acaf-4d55-96ba-c32d6113285f@nvidia.com> On 9/5/25 11:56 PM, David Hildenbrand wrote: > On 06.09.25 03:05, John Hubbard wrote: >> On 9/1/25 8:03 AM, David Hildenbrand wrote: ...> Well, there is a lot I dislike about record_subpages() to go back there. 
> Starting with "as Willy keeps explaining, the concept of subpages do > not exist" and ending with "why do we fill out the array even on failure". > > :) I am also very glad to see the entire concept of subpages disappear. >> >> Now it's been returned to its original, cryptic form. >> > > The code in the caller was so uncryptic that both me and Lorenzo missed > that magical addition. :P > >> Just my take on it, for whatever that's worth. :) > > As always, appreciated. > > I could of course keep the simple loop in some "record_folio_pages" > function and clean up what I dislike about record_subpages(). > > But I much rather want the call chain to be cleaned up instead, if > possible. > Right! The primary way that record_subpages() helped was in showing what was going on: a function call helps a lot to self-document, sometimes. > > Roughly, what I am thinking (limiting it to pte+pmd case) about is the > following: The code below looks much cleaner, that's great! thanks, -- John Hubbard > > > From d6d6d21dbf435d8030782a627175e36e6c7b2dfb Mon Sep 17 00:00:00 2001 > From: David Hildenbrand > Date: Sat, 6 Sep 2025 08:33:42 +0200 > Subject: [PATCH] tmp > > Signed-off-by: David Hildenbrand > --- >  mm/gup.c | 79 ++++++++++++++++++++++++++------------------------------ >  1 file changed, 36 insertions(+), 43 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index 22420f2069ee1..98907ead749c0 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2845,12 +2845,11 @@ static void __maybe_unused > gup_fast_undo_dev_pagemap(int *nr, int nr_start, >   * also check pmd here to make sure pmd doesn't change (corresponds to >   * pmdp_collapse_flush() in the THP collapse code path). >   */ > -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, > unsigned long addr, > +
unsigned long end, unsigned int flags, struct page **pages) >  { >  struct dev_pagemap *pgmap = NULL; > - int ret = 0; > + unsigned long nr_pages = 0; >  pte_t *ptep, *ptem; > >  ptem = ptep = pte_offset_map(&pmd, addr); > @@ -2908,24 +2907,20 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t > *pmdp, unsigned long addr, >   * details. >   */ >  if (flags & FOLL_PIN) { > - ret = arch_make_folio_accessible(folio); > - if (ret) { > + if (arch_make_folio_accessible(folio)) { >  gup_put_folio(folio, 1, flags); >  goto pte_unmap; >  } >  } >  folio_set_referenced(folio); > - pages[*nr] = page; > - (*nr)++; > + pages[nr_pages++] = page; >  } while (ptep++, addr += PAGE_SIZE, addr != end); > > - ret = 1; > - >  pte_unmap: >  if (pgmap) >  put_dev_pagemap(pgmap); >  pte_unmap(ptem); > - return ret; > + return nr_pages; >  } >  #else > > @@ -2938,21 +2933,24 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t > *pmdp, unsigned long addr, >   * get_user_pages_fast_only implementation that can pin pages. Thus > it's still >   * useful to have gup_fast_pmd_leaf even if we can't operate on ptes. >   */ > -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, > unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) >  { >  return 0; >  } >  #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */ > > -static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, > unsigned long addr, > +
unsigned long end, unsigned int flags, struct page **pages) >  { > + const unsigned long nr_pages = (end - addr) >> PAGE_SHIFT; >  struct page *page; >  struct folio *folio; > - int refs; > + unsigned long i; > + > + /* See gup_fast_pte_range() */ > + if (pmd_protnone(orig)) > + return 0; > >  if (!pmd_access_permitted(orig, flags & FOLL_WRITE)) >  return 0; > @@ -2960,33 +2958,30 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t > *pmdp, unsigned long addr, >  if (pmd_special(orig)) >  return 0; > > - refs = (end - addr) >> PAGE_SHIFT; >  page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > > - folio = try_grab_folio_fast(page, refs, flags); > + folio = try_grab_folio_fast(page, nr_pages, flags); >  if (!folio) >  return 0; > >  if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); >  return 0; >  } > >  if (!gup_fast_folio_allowed(folio, flags)) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); >  return 0; >  } >  if (!pmd_write(orig) && gup_must_unshare(NULL, flags, &folio- > >page)) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); >  return 0; >  } > > - pages += *nr; > - *nr += refs; > - for (; refs; refs--) > + for (i = 0; i < nr_pages; i++) >  *(pages++) = page++; >  folio_set_referenced(folio); > - return 1; > + return nr_pages; >  } > >  static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > @@ -3033,11 +3028,11 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t > *pudp, unsigned long addr, >  return 1; >  } > > -static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > -
int *nr) > +static unsigned long gup_fast_pmd_range(pud_t *pudp, pud_t pud, > unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) >  { > - unsigned long next; > + unsigned long cur_nr_pages, next; > + unsigned long nr_pages = 0; >  pmd_t *pmdp; > >  pmdp = pmd_offset_lockless(pudp, pud, addr); > @@ -3046,23 +3041,21 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t > pud, unsigned long addr, > >  next = pmd_addr_end(addr, end); >  if (!pmd_present(pmd)) > - return 0; > + break; > > - if (unlikely(pmd_leaf(pmd))) { > - /* See gup_fast_pte_range() */ > - if (pmd_protnone(pmd)) > - return 0; > + if (unlikely(pmd_leaf(pmd))) > + cur_nr_pages = gup_fast_pmd_leaf(pmd, pmdp, addr, next, > flags, pages); > + else > + cur_nr_pages = gup_fast_pte_range(pmd, pmdp, addr, next, > flags, pages); > > - if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + nr_pages += cur_nr_pages; > + pages += cur_nr_pages; > > - } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + if (nr_pages != (next - addr) >> PAGE_SIZE) > + break; >  } while (pmdp++, addr = next, addr != end); > > - return 1; > +
return nr_pages; >  } > >  static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, From wangruikang at iscas.ac.cn Sun Sep 7 01:22:44 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Sun, 7 Sep 2025 16:22:44 +0800 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250905165908.69548ce0@kernel.org> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> <20250905165908.69548ce0@kernel.org> Message-ID: On 9/6/25 07:59, Jakub Kicinski wrote: > On Sat, 6 Sep 2025 00:35:37 +0800 Vivian Wang wrote: >>>> On a closer look, these counters in ndev->stats seems to be redundant >>>> with the hardware-tracked statistics, so maybe I should just not bother >>>> with updating ndev->stats. Does that make sense? >>> For rx/tx packets/bytes I think that makes sense. >>> But what about rx/tx drops? >> Right... but tstats doesn't have *_dropped. It seems that tx_dropped and >> rx_dropped are considered "slow path" for real devices. It makes sense >> to me that those should be very rare. > Pretty sure Simon meant the per-cpu netdev stats in general. > There are three types of them, if you need drops I think you > probably want dstats. Take a look. According to this comment in net/core/dev.c dev_get_stats(): /* * IPv{4,6} and udp tunnels share common stat helpers and use * different stat type (NETDEV_PCPU_STAT_TSTATS vs * NETDEV_PCPU_STAT_DSTATS). Ensure the accounting is consistent. */ "dstats" is meant for tunnels. This doesn't look like the right thing to use, and no other pcpu_stat_type gives me tx_dropped. Do you think I should use dstats anyway? (And yes the only software-tracked one should be tx_dropped.
Since we pre-allocate the RX buffers, there is no opportunity to drop on RX in software.) From pr-tracker-bot at kernel.org Sun Sep 7 08:32:15 2025 From: pr-tracker-bot at kernel.org (pr-tracker-bot at kernel.org) Date: Sun, 07 Sep 2025 15:32:15 +0000 Subject: [GIT PULL] RISC-V updates for v6.17-rc5 In-Reply-To: <053b276c-b22b-f3e7-6c11-abe61b8ee36b@kernel.org> References: <053b276c-b22b-f3e7-6c11-abe61b8ee36b@kernel.org> Message-ID: <175725913509.3081192.6395265560508974365.pr-tracker-bot@kernel.org> The pull request you sent on Fri, 5 Sep 2025 16:16:30 -0600 (MDT): > git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux tags/riscv-for-linus-6.17-rc5 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/00e69828220782cae5df67d1546d4969770c9753 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html From irogers at google.com Sun Sep 7 16:14:39 2025 From: irogers at google.com (Ian Rogers) Date: Sun, 7 Sep 2025 16:14:39 -0700 Subject: [External] Re: [PATCH 1/2] watchdog: refactor watchdog_hld functionality In-Reply-To: References: <20250827100959.83023-1-cuiyunhui@bytedance.com> <20250827100959.83023-2-cuiyunhui@bytedance.com> Message-ID: On Fri, Sep 5, 2025 at 4:57 PM Doug Anderson wrote: > > Hi, > > On Wed, Sep 3, 2025 at 4:56 AM yunhui cui wrote: > > I've read your linked content, which details the pros and cons of perf > > watchdog and buddy watchdog. > > I think everyone will agree on choosing one as the default. > > It seems there's no kernel/watchdog entry in MAINTAINERS - what's next > > for these two approaches? > > I guess to start, someone (you?) should send some patches to the list.
> Maybe one patch to make buddy the default and one to change the > description of the "perf" lockup detector to say that its usage is > discouraged, that it might be removed, that people should use the > "buddy" detector instead, and that if there's a reason someone needs > the "perf" detector instead of the buddy one then they should make > some loud noises. > > You'd want to CC folks who were involved in recent watchdog changes > and make sure to CC Andrew (akpm). If folks react positively and Andrew > agrees then he'll likely land the patches and we'll have made > forward progress. :-) +1 There are also things like /proc/sys/kernel/nmi_watchdog being used to enable/disable the hard lockup detector. It would help if we could move that to a unique file so that perf is less confused in places like: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/util.c#n70 i.e. perf shouldn't warn about the NMI watchdog being enabled and taking a perf event when it doesn't. Thanks, Ian From hengqi.chen at gmail.com Sun Sep 7 18:24:48 2025 From: hengqi.chen at gmail.com (Hengqi Chen) Date: Mon, 8 Sep 2025 01:24:48 +0000 Subject: [PATCH bpf-next v3] riscv, bpf: Sign extend struct ops return values properly Message-ID: <20250908012448.1695-1-hengqi.chen@gmail.com> The ns_bpf_qdisc selftest triggers a kernel panic: Unable to handle kernel paging request at virtual address ffffffffa38dbf58 Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 Oops [#1] Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name:
Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 epc : __qdisc_run+0x82/0x6f0 ra : __qdisc_run+0x6e/0x6f0 epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 t5 : 0000000000000000 t6 : ff60000093a6a8b6 status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d [] __qdisc_run+0x82/0x6f0 [] __dev_queue_xmit+0x4c0/0x1128 [] neigh_resolve_output+0xd0/0x170 [] ip6_finish_output2+0x226/0x6c8 [] ip6_finish_output+0x10c/0x2a0 [] ip6_output+0x5e/0x178 [] ip6_xmit+0x29a/0x608 [] inet6_csk_xmit+0xe6/0x140 [] __tcp_transmit_skb+0x45c/0xaa8 [] tcp_connect+0x9ce/0xd10 [] tcp_v6_connect+0x4ac/0x5e8 [] __inet_stream_connect+0xd8/0x318 [] inet_stream_connect+0x3e/0x68 [] __sys_connect_file+0x50/0x88 [] __sys_connect+0x96/0xc8 [] __riscv_sys_connect+0x20/0x30 [] do_trap_ecall_u+0x256/0x378 [] handle_exception+0x14a/0x156 Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns a skb which is a pointer. The pointer is treated as a 32bit value and sign extend to 64bit in epilogue. This behavior is right for most bpf prog types but wrong for struct ops which requires RISC-V ABI. So let's sign extend struct ops return values according to the function model and RISC-V ABI([0]). 
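To make the failure mode concrete, the following value-level sketch models the size-based extension described above. It is not JIT code: sign_extend_model() is a hypothetical helper that operates on a plain 64-bit value and follows the same size cases as the sign_extend() helper added by this patch. The upper half of the example pointer in the usage note is invented for illustration; only its low 32 bits are taken from the a0 value in the oops.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of how a return value of `size` bytes ends up in a 64-bit
 * register. Unsigned 1- and 2-byte values are passed through untouched
 * (the patch emits a plain move there); 4-byte values are always
 * sign-extended; 8-byte values, such as pointers, are left alone.
 * Other sizes are returned unchanged here, whereas the patch rejects
 * them with -EINVAL.
 */
uint64_t sign_extend_model(uint64_t val, unsigned int size, bool is_signed)
{
	if (!is_signed && (size == 1 || size == 2))
		return val;

	switch (size) {
	case 1:
		/* Narrowing casts wrap on the usual two's-complement targets. */
		return (uint64_t)(int64_t)(int8_t)val;
	case 2:
		return (uint64_t)(int64_t)(int16_t)val;
	case 4:
		return (uint64_t)(int64_t)(int32_t)val;
	default:
		return val;
	}
}
```

Treating a pointer such as 0xff600000a38dbee8 as a 4-byte value yields 0xffffffffa38dbee8, the bogus address seen in a0, while treating it as an 8-byte value returns it intact.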
[0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") Signed-off-by: Hengqi Chen --- arch/riscv/net/bpf_jit_comp64.c | 42 ++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index 397968d6ee09..a860be52dc49 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -711,6 +711,39 @@ static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn, return 0; } +/* + * Sign-extend the register if necessary + */ +static int sign_extend(u8 rd, u8 rs, u8 sz, bool sign, struct rv_jit_context *ctx) +{ + if (!sign && (sz == 1 || sz == 2)) { + if (rd != rs) + emit_mv(rd, rs, ctx); + return 0; + } + + switch (sz) { + case 1: + emit_sextb(rd, rs, ctx); + break; + case 2: + emit_sexth(rd, rs, ctx); + break; + case 4: + emit_sextw(rd, rs, ctx); + break; + case 8: + if (rd != rs) + emit_mv(rd, rs, ctx); + break; + default: + pr_err("bpf-jit: invalid size %d for sign_extend\n", sz); + return -EINVAL; + } + + return 0; +} + #define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0) #define BPF_FIXUP_REG_MASK GENMASK(31, 27) #define REG_DONT_CLEAR_MARKER 0 /* RV_REG_ZERO unused in pt_regmap */ @@ -1175,8 +1208,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); if (save_ret) { - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); + if (is_struct_ops) { + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], m->ret_size, + m->ret_flags & BTF_FMODEL_SIGNED_ARG, ctx); + if (ret) + goto out; + } else { + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); + } } emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); -- 2.45.2 From anup at brainfault.org Sun Sep 7 20:51:00 2025 From: anup at brainfault.org (Anup Patel) 
Date: Mon, 8 Sep 2025 09:21:00 +0530 Subject: [PATCH v3 0/6] ONE_REG interface for SBI FWFT extension In-Reply-To: <20250823155947.1354229-1-apatel@ventanamicro.com> References: <20250823155947.1354229-1-apatel@ventanamicro.com> Message-ID: On Sat, Aug 23, 2025 at 9:30 PM Anup Patel wrote: > > This series adds ONE_REG interface for SBI FWFT extension implemented > by KVM RISC-V. This was missed out in accepted SBI FWFT patches for > KVM RISC-V. > > These patches can also be found in the riscv_kvm_fwft_one_reg_v3 branch > at: https://github.com/avpatel/linux.git > > Changes since v2: > - Re-based on latest KVM RISC-V queue > - Improved FWFT ONE_REG interface to allow enabling/disabling each > FWFT feature from KVM userspace > > Changes since v1: > - Dropped have_state in PATCH4 as suggested by Drew > - Added Drew's Reviewed-by in appropriate patches > > Anup Patel (6): > RISC-V: KVM: Set initial value of hedeleg in kvm_arch_vcpu_create() > RISC-V: KVM: Introduce feature specific reset for SBI FWFT > RISC-V: KVM: Introduce optional ONE_REG callbacks for SBI extensions > RISC-V: KVM: Move copy_sbi_ext_reg_indices() to SBI implementation > RISC-V: KVM: Implement ONE_REG interface for SBI FWFT state > KVM: riscv: selftests: Add SBI FWFT to get-reg-list test Queued this series for Linux-6.18 Regards, Anup > > arch/riscv/include/asm/kvm_vcpu_sbi.h | 22 +- > arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 1 + > arch/riscv/include/uapi/asm/kvm.h | 15 ++ > arch/riscv/kvm/vcpu.c | 3 +- > arch/riscv/kvm/vcpu_onereg.c | 60 +---- > arch/riscv/kvm/vcpu_sbi.c | 172 +++++++++++-- > arch/riscv/kvm/vcpu_sbi_fwft.c | 227 ++++++++++++++++-- > arch/riscv/kvm/vcpu_sbi_sta.c | 63 +++-- > .../selftests/kvm/riscv/get-reg-list.c | 32 +++ > 9 files changed, 467 insertions(+), 128 deletions(-) > > -- > 2.43.0 > From pulehui at huawei.com Sun Sep 7 23:31:19 2025 From: pulehui at huawei.com (Pu Lehui) Date: Mon, 8 Sep 2025 14:31:19 +0800 Subject: [PATCH bpf-next v3] riscv, bpf: Sign extend struct
ops return values properly In-Reply-To: <20250908012448.1695-1-hengqi.chen@gmail.com> References: <20250908012448.1695-1-hengqi.chen@gmail.com> Message-ID: On 2025/9/8 9:24, Hengqi Chen wrote: > The ns_bpf_qdisc selftest triggers a kernel panic: > > Unable to handle kernel paging request at virtual address ffffffffa38dbf58 > Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 > [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 > Oops [#1] > Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] > CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE > Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 > epc : __qdisc_run+0x82/0x6f0 > ra : __qdisc_run+0x6e/0x6f0 > epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 > gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 > t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 > s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 > a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 > s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 > s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > t5 : 0000000000000000 t6 : ff60000093a6a8b6 > status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d > [] __qdisc_run+0x82/0x6f0 > [] __dev_queue_xmit+0x4c0/0x1128 > [] neigh_resolve_output+0xd0/0x170 > [] ip6_finish_output2+0x226/0x6c8 
> [] ip6_finish_output+0x10c/0x2a0 > [] ip6_output+0x5e/0x178 > [] ip6_xmit+0x29a/0x608 > [] inet6_csk_xmit+0xe6/0x140 > [] __tcp_transmit_skb+0x45c/0xaa8 > [] tcp_connect+0x9ce/0xd10 > [] tcp_v6_connect+0x4ac/0x5e8 > [] __inet_stream_connect+0xd8/0x318 > [] inet_stream_connect+0x3e/0x68 > [] __sys_connect_file+0x50/0x88 > [] __sys_connect+0x96/0xc8 > [] __riscv_sys_connect+0x20/0x30 > [] do_trap_ecall_u+0x256/0x378 > [] handle_exception+0x14a/0x156 > Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > ---[ end trace 0000000000000000 ]--- > > The bpf_fifo_dequeue prog returns a skb which is a pointer. > The pointer is treated as a 32bit value and sign extend to > 64bit in epilogue. This behavior is right for most bpf prog > types but wrong for struct ops which requires RISC-V ABI. > > So let's sign extend struct ops return values according to > the function model and RISC-V ABI([0]). > > [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf > > Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") > Signed-off-by: Hengqi Chen > --- > arch/riscv/net/bpf_jit_comp64.c | 42 ++++++++++++++++++++++++++++++++- > 1 file changed, 41 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > index 397968d6ee09..a860be52dc49 100644 > --- a/arch/riscv/net/bpf_jit_comp64.c > +++ b/arch/riscv/net/bpf_jit_comp64.c > @@ -711,6 +711,39 @@ static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn, > return 0; > } > > +/* > + * Sign-extend the register if necessary > + */ > +static int sign_extend(u8 rd, u8 rs, u8 sz, bool sign, struct rv_jit_context *ctx) > +{ > + if (!sign && (sz == 1 || sz == 2)) { > + if (rd != rs) > + emit_mv(rd, rs, ctx); > + return 0; > + } > + > + switch (sz) { > + case 1: > + emit_sextb(rd, rs, ctx); > + break; > + case 2: > + emit_sexth(rd, rs, ctx); > + break; > + case 4: > + emit_sextw(rd, rs, ctx); > + break; > + case 
8: > + if (rd != rs) > + emit_mv(rd, rs, ctx); > + break; > + default: > + pr_err("bpf-jit: invalid size %d for sign_extend\n", sz); > + return -EINVAL; > + } > + > + return 0; > +} > + > #define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0) > #define BPF_FIXUP_REG_MASK GENMASK(31, 27) > #define REG_DONT_CLEAR_MARKER 0 /* RV_REG_ZERO unused in pt_regmap */ > @@ -1175,8 +1208,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); > > if (save_ret) { > - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > + if (is_struct_ops) { > + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], m->ret_size, > + m->ret_flags & BTF_FMODEL_SIGNED_ARG, ctx); > + if (ret) > + goto out; > + } else { > + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > + } > } > > emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); Thanks Hengqi, feel free to add: Reviewed-by: Pu Lehui Tested-by: Pu Lehui From david at redhat.com Mon Sep 8 01:00:05 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 8 Sep 2025 10:00:05 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <0a28adde-acaf-4d55-96ba-c32d6113285f@nvidia.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> <0a28adde-acaf-4d55-96ba-c32d6113285f@nvidia.com> Message-ID: <28fc8fb3-f16b-4efb-b8e3-24081f035c73@redhat.com> >> Roughly, what I am thinking (limiting it to pte+pmd case) about is the >> following: > > The code below looks much cleaner, that's great! Great, I (or Aristeu if he has capacity) will clean this all up soon. 
--
Cheers

David / dhildenb

From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:26 2025
From: valentina.fernandezalanis at microchip.com (Valentina Fernandez)
Date: Mon, 8 Sep 2025 12:57:26 +0100
Subject: [PATCH v3 0/6] Icicle Kit with prod device and Discovery Kit support
Message-ID: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com>

Hi all,

With the introduction of the Icicle Kit with the production device
(MPFS250T) to the market, it's necessary to distinguish it from the
engineering sample (-es) variant. This is because engineering samples
cannot write to flash from the MSS, as noted in the PolarFire SoC
FPGA ES errata.

This series adds a common board DTSI for the Icicle Kit, containing
hardware shared by both the engineering sample and production
versions, as well as a DTS for each Icicle Kit variant.

The last two patches add support for the PolarFire SoC Discovery Kit
board.

Changes since v2:
- rename ccc clock to clock-cccref to match fixed clock binding

Changes since v1:
- fix order of properties in mailbox nodes
- drop redundant status property from ddrc_cache nodes
- fix lowercase hex in reserved memory regions

Thanks,
Valentina

Valentina Fernandez (6):
  riscv: dts: microchip: add common board dtsi for icicle kit variants
  dt-bindings: riscv: microchip: document icicle kit with production device
  riscv: dts: microchip: add icicle kit with production device
  riscv: dts: microchip: rename icicle kit ccc clock and other minor fixes
  dt-bindings: riscv: microchip: document Discovery Kit
  riscv: dts: microchip: add a device tree for Discovery Kit

 .../devicetree/bindings/riscv/microchip.yaml  |  13 +
 arch/riscv/boot/dts/microchip/Makefile        |   2 +
 .../dts/microchip/mpfs-disco-kit-fabric.dtsi  |  58 ++++
 .../boot/dts/microchip/mpfs-disco-kit.dts     | 190 +++++++++++++
 .../dts/microchip/mpfs-icicle-kit-common.dtsi | 249 ++++++++++++++++++
 .../dts/microchip/mpfs-icicle-kit-fabric.dtsi |  25 +-
 .../dts/microchip/mpfs-icicle-kit-prod.dts    |  23 ++
 .../boot/dts/microchip/mpfs-icicle-kit.dts    | 244 +----------------
 8 files changed, 559 insertions(+), 245 deletions(-)
 create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi
 create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts
 create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi
 create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts

--
2.34.1

From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:27 2025
From: valentina.fernandezalanis at microchip.com (Valentina Fernandez)
Date: Mon, 8 Sep 2025 12:57:27 +0100
Subject: [PATCH v3 1/6] riscv: dts: microchip: add common board dtsi for icicle kit variants
In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com>
References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com>
Message-ID: <20250908115732.31092-2-valentina.fernandezalanis@microchip.com>

In preparation for supporting the Icicle Kit with production silicon,
add a common board dtsi for the icicle kit with hardware shared by
both the engineering sample and production versions.

Signed-off-by: Valentina Fernandez --- .../dts/microchip/mpfs-icicle-kit-common.dtsi | 247 ++++++++++++++++++ .../boot/dts/microchip/mpfs-icicle-kit.dts | 241 +---------------- 2 files changed, 248 insertions(+), 240 deletions(-) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi new file mode 100644 index 000000000000..eafea3b69cd7 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi @@ -0,0 +1,247 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs.dtsi" +#include "mpfs-icicle-kit-fabric.dtsi" +#include +#include + +/ { + aliases { + ethernet0 = &mac1; + serial0 = &mmuart0; + serial1 = &mmuart1; + serial2 = &mmuart2; + serial3 = &mmuart3; + serial4 = &mmuart4; + }; + + chosen { + stdout-path = "serial1:115200n8"; + }; + + leds { + compatible = "gpio-leds"; + + led-1 { + gpios = <&gpio2 16 GPIO_ACTIVE_HIGH>; + color = ; + label = "led1"; + }; + + led-2 { + gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; + color = ; + label = "led2"; + }; + + led-3 { + gpios = <&gpio2 18 GPIO_ACTIVE_HIGH>; + color = ; + label = "led3"; + }; + + led-4 { + gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; + color = ; + label = "led4"; + }; + }; + + ddrc_cache_lo: memory at 80000000 { + device_type = "memory"; + reg = <0x0 0x80000000 0x0 0x40000000>; + status = "okay"; + }; + + ddrc_cache_hi: memory at 1040000000 { + device_type = "memory"; + reg = <0x10 0x40000000 0x0 0x40000000>; + status = "okay"; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + hss_payload: region at BFC00000 { + reg = <0x0 0xBFC00000 0x0 0x400000>; + no-map; + }; + }; +}; + +&core_pwm0 { + status = "okay"; +}; + +&gpio2 { + interrupts = <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + 
<53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>; + status = "okay"; +}; + +&i2c0 { + status = "okay"; +}; + +&i2c1 { + status = "okay"; + + power-monitor at 10 { + compatible = "microchip,pac1934"; + reg = <0x10>; + + #address-cells = <1>; + #size-cells = <0>; + + channel at 1 { + reg = <0x1>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDREG"; + }; + + channel at 2 { + reg = <0x2>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDA25"; + }; + + channel at 3 { + reg = <0x3>; + shunt-resistor-micro-ohms = <10000>; + label = "VDD25"; + }; + + channel at 4 { + reg = <0x4>; + shunt-resistor-micro-ohms = <10000>; + label = "VDDA_REG"; + }; + }; +}; + +&i2c2 { + status = "okay"; +}; + +&mac0 { + phy-mode = "sgmii"; + phy-handle = <&phy0>; + status = "okay"; +}; + +&mac1 { + phy-mode = "sgmii"; + phy-handle = <&phy1>; + status = "okay"; + + phy1: ethernet-phy at 9 { + reg = <9>; + }; + + phy0: ethernet-phy at 8 { + reg = <8>; + }; +}; + +&mbox { + status = "okay"; +}; + +&mmc { + bus-width = <4>; + disable-wp; + cap-sd-highspeed; + cap-mmc-highspeed; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + sd-uhs-sdr12; + sd-uhs-sdr25; + sd-uhs-sdr50; + sd-uhs-sdr104; + status = "okay"; +}; + +&mmuart1 { + status = "okay"; +}; + +&mmuart2 { + status = "okay"; +}; + +&mmuart3 { + status = "okay"; +}; + +&mmuart4 { + status = "okay"; +}; + +&pcie { + status = "okay"; +}; + +&qspi { + status = "okay"; +}; + +&refclk { + clock-frequency = <125000000>; +}; + +&refclk_ccc { + clock-frequency = <50000000>; +}; + +&rtc { + status = "okay"; +}; + +&spi0 { + status = "okay"; +}; + +&spi1 { + status = "okay"; +}; + +&syscontroller { + status = "okay"; +}; + +&syscontroller_qspi { + /* + * The flash *is* there, but Icicle kits that have engineering sample + * silicon (write?) access to this flash to non-functional. The system + * controller itself can actually access it, but the MSS cannot write + * an image there. 
Instantiating a coreQSPI in the fabric & connecting + * it to the flash instead should work though. Pre-production or later + * silicon does not have this issue. + */ + status = "disabled"; + + sys_ctrl_flash: flash at 0 { // MT25QL01GBBB8ESF-0SIT + compatible = "jedec,spi-nor"; + #address-cells = <1>; + #size-cells = <1>; + spi-max-frequency = <20000000>; + spi-rx-bus-width = <1>; + reg = <0>; + }; +}; + +&usb { + status = "okay"; + dr_mode = "host"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts index f80df225f72b..2cb08ed0946d 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts @@ -3,249 +3,10 @@ /dts-v1/; -#include "mpfs.dtsi" -#include "mpfs-icicle-kit-fabric.dtsi" -#include -#include +#include "mpfs-icicle-kit-common.dtsi" / { model = "Microchip PolarFire-SoC Icicle Kit"; compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", "microchip,mpfs"; - - aliases { - ethernet0 = &mac1; - serial0 = &mmuart0; - serial1 = &mmuart1; - serial2 = &mmuart2; - serial3 = &mmuart3; - serial4 = &mmuart4; - }; - - chosen { - stdout-path = "serial1:115200n8"; - }; - - leds { - compatible = "gpio-leds"; - - led-1 { - gpios = <&gpio2 16 GPIO_ACTIVE_HIGH>; - color = ; - label = "led1"; - }; - - led-2 { - gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; - color = ; - label = "led2"; - }; - - led-3 { - gpios = <&gpio2 18 GPIO_ACTIVE_HIGH>; - color = ; - label = "led3"; - }; - - led-4 { - gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; - color = ; - label = "led4"; - }; - }; - - ddrc_cache_lo: memory at 80000000 { - device_type = "memory"; - reg = <0x0 0x80000000 0x0 0x40000000>; - status = "okay"; - }; - - ddrc_cache_hi: memory at 1040000000 { - device_type = "memory"; - reg = <0x10 0x40000000 0x0 0x40000000>; - status = "okay"; - }; - - reserved-memory { - #address-cells = <2>; - #size-cells = <2>; - ranges; - - hss_payload: region at 
BFC00000 { - reg = <0x0 0xBFC00000 0x0 0x400000>; - no-map; - }; - }; -}; - -&core_pwm0 { - status = "okay"; -}; - -&gpio2 { - interrupts = <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>, - <53>, <53>, <53>, <53>; - status = "okay"; -}; - -&i2c0 { - status = "okay"; -}; - -&i2c1 { - status = "okay"; - - power-monitor at 10 { - compatible = "microchip,pac1934"; - reg = <0x10>; - - #address-cells = <1>; - #size-cells = <0>; - - channel at 1 { - reg = <0x1>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDREG"; - }; - - channel at 2 { - reg = <0x2>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDA25"; - }; - - channel at 3 { - reg = <0x3>; - shunt-resistor-micro-ohms = <10000>; - label = "VDD25"; - }; - - channel at 4 { - reg = <0x4>; - shunt-resistor-micro-ohms = <10000>; - label = "VDDA_REG"; - }; - }; -}; - -&i2c2 { - status = "okay"; -}; - -&mac0 { - phy-mode = "sgmii"; - phy-handle = <&phy0>; - status = "okay"; -}; - -&mac1 { - phy-mode = "sgmii"; - phy-handle = <&phy1>; - status = "okay"; - - phy1: ethernet-phy at 9 { - reg = <9>; - }; - - phy0: ethernet-phy at 8 { - reg = <8>; - }; -}; - -&mbox { - status = "okay"; -}; - -&mmc { - bus-width = <4>; - disable-wp; - cap-sd-highspeed; - cap-mmc-highspeed; - mmc-ddr-1_8v; - mmc-hs200-1_8v; - sd-uhs-sdr12; - sd-uhs-sdr25; - sd-uhs-sdr50; - sd-uhs-sdr104; - status = "okay"; -}; - -&mmuart1 { - status = "okay"; -}; - -&mmuart2 { - status = "okay"; -}; - -&mmuart3 { - status = "okay"; -}; - -&mmuart4 { - status = "okay"; -}; - -&pcie { - status = "okay"; -}; - -&qspi { - status = "okay"; -}; - -&refclk { - clock-frequency = <125000000>; -}; - -&refclk_ccc { - clock-frequency = <50000000>; -}; - -&rtc { - status = "okay"; -}; - -&spi0 { - status = "okay"; -}; - -&spi1 { - status = "okay"; -}; - -&syscontroller { - status = "okay"; -}; - -&syscontroller_qspi { - /* - * The flash 
*is* there, but Icicle kits that have engineering sample - * silicon (write?) access to this flash to non-functional. The system - * controller itself can actually access it, but the MSS cannot write - * an image there. Instantiating a coreQSPI in the fabric & connecting - * it to the flash instead should work though. Pre-production or later - * silicon does not have this issue. - */ - status = "disabled"; - - sys_ctrl_flash: flash at 0 { // MT25QL01GBBB8ESF-0SIT - compatible = "jedec,spi-nor"; - #address-cells = <1>; - #size-cells = <1>; - spi-max-frequency = <20000000>; - spi-rx-bus-width = <1>; - reg = <0>; - }; -}; - -&usb { - status = "okay"; - dr_mode = "host"; }; -- 2.34.1 From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:28 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Mon, 8 Sep 2025 12:57:28 +0100 Subject: [PATCH v3 2/6] dt-bindings: riscv: microchip: document icicle kit with production device In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250908115732.31092-3-valentina.fernandezalanis@microchip.com> With the introduction of the Icicle Kit using the production MPFS250T device, it's necessary to distinguish it from the engineering sample (-es) variant. Engineering samples cannot write to flash from the MSS, as noted in the PolarFire SoC FPGA ES errata. Add specific compatibles for the Icicle Kit with Production device (MPFS250T) and Icicle Kit with Engineering Sample (MPFS250T_ES). The icicle kit reference designs in the v2025.07 release include the Mi-V IHC IP v2, used to send/receive data between clusters when using Asymmetric Multiprocessing (AMP) mode. In reference design releases prior to v2025.07, the MI-V IHC subsystem was included as a proof of concept in the design prior to becoming an IP available in the Libero catalog. 
Among other improvements, the new Mi-V IHC IP v2 includes some changes to the register map. For this reason, make use of a new reference design compatible to denote that v2025.07 reference design releases are not backwards compatible. Signed-off-by: Valentina Fernandez --- Documentation/devicetree/bindings/riscv/microchip.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/microchip.yaml b/Documentation/devicetree/bindings/riscv/microchip.yaml index 78ce76ae1b6d..8ddc5c02973e 100644 --- a/Documentation/devicetree/bindings/riscv/microchip.yaml +++ b/Documentation/devicetree/bindings/riscv/microchip.yaml @@ -18,10 +18,18 @@ properties: const: '/' compatible: oneOf: + - items: + - const: microchip,mpfs-icicle-prod-reference-rtl-v2507 + - const: microchip,mpfs-icicle-kit-prod + - const: microchip,mpfs-icicle-kit + - const: microchip,mpfs-prod + - const: microchip,mpfs + - items: - enum: - microchip,mpfs-icicle-reference-rtlv2203 - microchip,mpfs-icicle-reference-rtlv2210 + - microchip,mpfs-icicle-es-reference-rtl-v2507 - const: microchip,mpfs-icicle-kit - const: microchip,mpfs -- 2.34.1 From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:29 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Mon, 8 Sep 2025 12:57:29 +0100 Subject: [PATCH v3 3/6] riscv: dts: microchip: add icicle kit with production device In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250908115732.31092-4-valentina.fernandezalanis@microchip.com> With the introduction of the Icicle Kit using the production MPFS250T device, it's necessary to distinguish it from the engineering sample (-es) variant. Engineering samples cannot write to flash from the MSS, as noted in the PolarFire SoC FPGA ES errata. 
Add a new device tree (mpfs-icicle-kit-prod.dts) for the production board which includes the icicle kit common dtsi and enable the system controller SPI flash, which is only accessible on production silicon. Remove redundant board compatible from fabric dtsi and update board compatibles for v2025.07 release, which includes Mi-V IHC v2 for AMP cluster communication. Signed-off-by: Valentina Fernandez --- arch/riscv/boot/dts/microchip/Makefile | 1 + .../dts/microchip/mpfs-icicle-kit-common.dtsi | 4 ++++ .../dts/microchip/mpfs-icicle-kit-fabric.dtsi | 23 ++++++++++++++++--- .../dts/microchip/mpfs-icicle-kit-prod.dts | 23 +++++++++++++++++++ .../boot/dts/microchip/mpfs-icicle-kit.dts | 3 ++- 5 files changed, 50 insertions(+), 4 deletions(-) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts diff --git a/arch/riscv/boot/dts/microchip/Makefile b/arch/riscv/boot/dts/microchip/Makefile index f51aeeb9fd3b..1e2f4e41bf0d 100644 --- a/arch/riscv/boot/dts/microchip/Makefile +++ b/arch/riscv/boot/dts/microchip/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-beaglev-fire.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit.dtb +dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit-prod.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-m100pfsevp.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-polarberry.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-sev-kit.dtb diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi index eafea3b69cd7..5c7a8ffad85b 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi @@ -134,6 +134,10 @@ &i2c2 { status = "okay"; }; +&ihc { + status = "okay"; +}; + &mac0 { phy-mode = "sgmii"; phy-handle = <&phy0>; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi 
b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi index a6dda55a2d1d..e673b676fd1a 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi @@ -2,9 +2,6 @@ /* Copyright (c) 2020-2021 Microchip Technology Inc */ / { - compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", - "microchip,mpfs"; - core_pwm0: pwm at 40000000 { compatible = "microchip,corepwm-rtl-v4"; reg = <0x0 0x40000000 0x0 0xF0>; @@ -26,6 +23,26 @@ i2c2: i2c at 40000200 { status = "disabled"; }; + ihc: mailbox { + compatible = "microchip,sbi-ipc"; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + status = "disabled"; + }; + + mailbox at 50000000 { + compatible = "microchip,miv-ihc-rtl-v2"; + reg = <0x0 0x50000000 0x0 0x1c000>; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; + status = "disabled"; + }; + pcie: pcie at 3000000000 { compatible = "microchip,pcie-host-1.0"; #address-cells = <0x3>; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts new file mode 100644 index 000000000000..8afedece89d1 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-prod.dts @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs-icicle-kit-common.dtsi" + +/ { + model = "Microchip PolarFire-SoC Icicle Kit (Production Silicon)"; + compatible = "microchip,mpfs-icicle-prod-reference-rtl-v2507", + "microchip,mpfs-icicle-kit-prod", + "microchip,mpfs-icicle-kit", + "microchip,mpfs-prod", + "microchip,mpfs"; +}; + +&syscontroller { + microchip,bitstream-flash = 
<&sys_ctrl_flash>; +}; + +&syscontroller_qspi { + status = "okay"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts index 2cb08ed0946d..556aa9638282 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts @@ -7,6 +7,7 @@ / { model = "Microchip PolarFire-SoC Icicle Kit"; - compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit", + compatible = "microchip,mpfs-icicle-es-reference-rtl-v2507", + "microchip,mpfs-icicle-kit", "microchip,mpfs"; }; -- 2.34.1 From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:30 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Mon, 8 Sep 2025 12:57:30 +0100 Subject: [PATCH v3 4/6] riscv: dts: microchip: rename icicle kit ccc clock and other minor fixes In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250908115732.31092-5-valentina.fernandezalanis@microchip.com> Rename the Clock Conditioning Circuit (CCC) reference clock to match the fixed clock bindings naming recommendation. Update the reserved memory regions in the Icicle Kit common dtsi to use lowercase hex and drop the redundant status properties from the memory regions, as they are not required. 
Signed-off-by: Valentina Fernandez --- arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi | 6 ++---- arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi | 2 +- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi index 5c7a8ffad85b..e01a216e6c3a 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-common.dtsi @@ -53,13 +53,11 @@ led-4 { ddrc_cache_lo: memory at 80000000 { device_type = "memory"; reg = <0x0 0x80000000 0x0 0x40000000>; - status = "okay"; }; ddrc_cache_hi: memory at 1040000000 { device_type = "memory"; reg = <0x10 0x40000000 0x0 0x40000000>; - status = "okay"; }; reserved-memory { @@ -67,8 +65,8 @@ reserved-memory { #size-cells = <2>; ranges; - hss_payload: region at BFC00000 { - reg = <0x0 0xBFC00000 0x0 0x400000>; + hss_payload: region at bfc00000 { + reg = <0x0 0xbfc00000 0x0 0x400000>; no-map; }; }; diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi index e673b676fd1a..71f724325578 100644 --- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi +++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit-fabric.dtsi @@ -74,7 +74,7 @@ pcie_intc: interrupt-controller { }; }; - refclk_ccc: cccrefclk { + refclk_ccc: clock-cccref { compatible = "fixed-clock"; #clock-cells = <0>; }; -- 2.34.1 From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:31 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Mon, 8 Sep 2025 12:57:31 +0100 Subject: [PATCH v3 5/6] dt-bindings: riscv: microchip: document Discovery Kit In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: 
<20250908115732.31092-6-valentina.fernandezalanis@microchip.com> The Discovery Kit (MPFS-DISCO-KIT) is a development board featuring a Microchip PolarFire SoC MPFS095T. Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit Signed-off-by: Valentina Fernandez --- Documentation/devicetree/bindings/riscv/microchip.yaml | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/microchip.yaml b/Documentation/devicetree/bindings/riscv/microchip.yaml index 8ddc5c02973e..381d6eb6672e 100644 --- a/Documentation/devicetree/bindings/riscv/microchip.yaml +++ b/Documentation/devicetree/bindings/riscv/microchip.yaml @@ -33,6 +33,11 @@ properties: - const: microchip,mpfs-icicle-kit - const: microchip,mpfs + - items: + - const: microchip,mpfs-disco-kit-reference-rtl-v2507 + - const: microchip,mpfs-disco-kit + - const: microchip,mpfs + - items: - enum: - aldec,tysom-m-mpfs250t-rev2 -- 2.34.1 From valentina.fernandezalanis at microchip.com Mon Sep 8 04:57:32 2025 From: valentina.fernandezalanis at microchip.com (Valentina Fernandez) Date: Mon, 8 Sep 2025 12:57:32 +0100 Subject: [PATCH v3 6/6] riscv: dts: microchip: add a device tree for Discovery Kit In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250908115732.31092-7-valentina.fernandezalanis@microchip.com> Add a minimal device tree for the Microchip PolarFire SoC Discovery Kit. 
The Discovery Kit is a cost-optimized board based on PolarFire SoC MPFS095T and features: - 1 GB DDR4x16 - 1x Gigabit Ethernet - 3x UARTs - Raspberry Pi connector - mikroBus connector - microSD card connector Link: https://www.microchip.com/en-us/development-tool/mpfs-disco-kit Signed-off-by: Valentina Fernandez --- arch/riscv/boot/dts/microchip/Makefile | 1 + .../dts/microchip/mpfs-disco-kit-fabric.dtsi | 58 ++++++ .../boot/dts/microchip/mpfs-disco-kit.dts | 190 ++++++++++++++++++ 3 files changed, 249 insertions(+) create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi create mode 100644 arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts diff --git a/arch/riscv/boot/dts/microchip/Makefile b/arch/riscv/boot/dts/microchip/Makefile index 1e2f4e41bf0d..345ed7a48cc1 100644 --- a/arch/riscv/boot/dts/microchip/Makefile +++ b/arch/riscv/boot/dts/microchip/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-beaglev-fire.dtb +dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-disco-kit.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-icicle-kit-prod.dtb dtb-$(CONFIG_ARCH_MICROCHIP_POLARFIRE) += mpfs-m100pfsevp.dtb diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi new file mode 100644 index 000000000000..ae8be7d6f392 --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit-fabric.dtsi @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2020-2025 Microchip Technology Inc */ + +/ { + core_pwm0: pwm at 40000000 { + compatible = "microchip,corepwm-rtl-v4"; + reg = <0x0 0x40000000 0x0 0xF0>; + microchip,sync-update-mask = /bits/ 32 <0>; + #pwm-cells = <3>; + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; + status = "disabled"; + }; + + i2c2: i2c at 40000200 { + compatible = "microchip,corei2c-rtl-v7"; + reg = <0x0 0x40000200 0x0 0x100>; + 
#address-cells = <1>; + #size-cells = <0>; + clocks = <&ccc_sw CLK_CCC_PLL0_OUT3>; + interrupt-parent = <&plic>; + interrupts = <122>; + clock-frequency = <100000>; + status = "disabled"; + }; + + ihc: mailbox { + compatible = "microchip,sbi-ipc"; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + status = "disabled"; + }; + + mailbox at 50000000 { + compatible = "microchip,miv-ihc-rtl-v2"; + reg = <0x0 0x50000000 0x0 0x1c000>; + interrupt-parent = <&plic>; + interrupts = <180>, <179>, <178>, <177>; + interrupt-names = "hart-1", "hart-2", "hart-3", "hart-4"; + #mbox-cells = <1>; + microchip,ihc-chan-disabled-mask = /bits/ 16 <0>; + status = "disabled"; + }; + + refclk_ccc: clock-cccref { + compatible = "fixed-clock"; + #clock-cells = <0>; + }; +}; + +&ccc_sw { + clocks = <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, <&refclk_ccc>, + <&refclk_ccc>, <&refclk_ccc>; + clock-names = "pll0_ref0", "pll0_ref1", "pll1_ref0", "pll1_ref1", + "dll0_ref", "dll1_ref"; + status = "okay"; +}; diff --git a/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts new file mode 100644 index 000000000000..c068b9bb5bfd --- /dev/null +++ b/arch/riscv/boot/dts/microchip/mpfs-disco-kit.dts @@ -0,0 +1,190 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/* Copyright (c) 2020-2025 Microchip Technology Inc */ + +/dts-v1/; + +#include "mpfs.dtsi" +#include "mpfs-disco-kit-fabric.dtsi" +#include +#include + +/ { + model = "Microchip PolarFire-SoC Discovery Kit"; + compatible = "microchip,mpfs-disco-kit-reference-rtl-v2507", + "microchip,mpfs-disco-kit", + "microchip,mpfs"; + + aliases { + ethernet0 = &mac0; + serial4 = &mmuart4; + }; + + chosen { + stdout-path = "serial4:115200n8"; + }; + + leds { + compatible = "gpio-leds"; + + led-1 { + gpios = <&gpio2 17 GPIO_ACTIVE_HIGH>; + color = ; + label = "led1"; + }; + + led-2 { + gpios = <&gpio2 18 
GPIO_ACTIVE_HIGH>; + color = ; + label = "led2"; + }; + + led-3 { + gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>; + color = ; + label = "led3"; + }; + + led-4 { + gpios = <&gpio2 20 GPIO_ACTIVE_HIGH>; + color = ; + label = "led4"; + }; + + led-5 { + gpios = <&gpio2 21 GPIO_ACTIVE_HIGH>; + color = ; + label = "led5"; + }; + + led-6 { + gpios = <&gpio2 22 GPIO_ACTIVE_HIGH>; + color = ; + label = "led6"; + }; + + led-7 { + gpios = <&gpio2 23 GPIO_ACTIVE_HIGH>; + color = ; + label = "led7"; + }; + + led-8 { + gpios = <&gpio1 9 GPIO_ACTIVE_HIGH>; + color = ; + label = "led8"; + }; + }; + + ddrc_cache_lo: memory@80000000 { + device_type = "memory"; + reg = <0x0 0x80000000 0x0 0x40000000>; + }; + + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + hss_payload: region@bfc00000 { + reg = <0x0 0xbfc00000 0x0 0x400000>; + no-map; + }; + }; +}; + +&core_pwm0 { + status = "okay"; +}; + +&gpio1 { + interrupts = <27>, <28>, <29>, <30>, + <31>, <32>, <33>, <47>, + <35>, <36>, <37>, <38>, + <39>, <40>, <41>, <42>, + <43>, <44>, <45>, <46>, + <47>, <48>, <49>, <50>; + status = "okay"; +}; + +&gpio2 { + interrupts = <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>, + <53>, <53>, <53>, <53>; + status = "okay"; +}; + +&i2c0 { + status = "okay"; +}; + +&i2c2 { + status = "okay"; +}; + +&ihc { + status = "okay"; +}; + +&mac0 { + phy-mode = "sgmii"; + phy-handle = <&phy0>; + status = "okay"; + + phy0: ethernet-phy@b { + reg = <0xb>; + }; +}; + +&mbox { + status = "okay"; +}; + +&mmc { + bus-width = <4>; + disable-wp; + cap-sd-highspeed; + cap-mmc-highspeed; + sd-uhs-sdr12; + sd-uhs-sdr25; + sd-uhs-sdr50; + sd-uhs-sdr104; + no-1-8-v; + status = "okay"; +}; + +&mmuart1 { + status = "okay"; +}; + +&mmuart4 { + status = "okay"; +}; + +&refclk { + clock-frequency = <125000000>; +}; + +&refclk_ccc { + clock-frequency = <50000000>; +}; + 
+&rtc { + status = "okay"; +}; + +&spi0 { + status = "okay"; +}; + +&spi1 { + status = "okay"; +}; + +&syscontroller { + status = "okay"; +}; -- 2.34.1 From lorenzo.stoakes at oracle.com Mon Sep 8 05:25:29 2025 From: lorenzo.stoakes at oracle.com (Lorenzo Stoakes) Date: Mon, 8 Sep 2025 13:25:29 +0100 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> Message-ID: <727cabec-5ee8-4793-926b-8d78febcd623@lucifer.local> On Sat, Sep 06, 2025 at 08:56:48AM +0200, David Hildenbrand wrote: > On 06.09.25 03:05, John Hubbard wrote: > > > > Probably a similar sentiment as Lorenzo here...the above diffs make the code > > *worse* to read. In fact, I recall adding record_subpages() here long ago, > > specifically to help clarify what was going on. > > Well, there is a lot I dislike about record_subpages() to go back there. > Starting with "as Willy keeps explaining, the concept of subpages do > not exist and ending with "why do we fill out the array even on failure". Yes > > :) > > > > > Now it's been returned to it's original, cryptic form. > > > > The code in the caller was so uncryptic that both me and Lorenzo missed > that magical addition. :P :'( > > > Just my take on it, for whatever that's worth. :) > > As always, appreciated. > > I could of course keep the simple loop in some "record_folio_pages" > function and clean up what I dislike about record_subpages(). > > But I much rather want the call chain to be cleaned up instead, if possible. > > > Roughly, what I am thinking (limiting it to pte+pmd case) about is the following: I cannot get the below to apply even with the original patch here applied + fix. 
It looks like (in mm-new :) commit e73f43a66d5f ("mm/gup: remove dead pgmap refcounting code") by Alastair has conflicted here, but even then I can't make it apply, with/without your fix...! > > > From d6d6d21dbf435d8030782a627175e36e6c7b2dfb Mon Sep 17 00:00:00 2001 > From: David Hildenbrand > Date: Sat, 6 Sep 2025 08:33:42 +0200 > Subject: [PATCH] tmp > > Signed-off-by: David Hildenbrand > --- > mm/gup.c | 79 ++++++++++++++++++++++++++------------------------------ > 1 file changed, 36 insertions(+), 43 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index 22420f2069ee1..98907ead749c0 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2845,12 +2845,11 @@ static void __maybe_unused gup_fast_undo_dev_pagemap(int *nr, int nr_start, > * also check pmd here to make sure pmd doesn't change (corresponds to > * pmdp_collapse_flush() in the THP collapse code path). > */ > -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) > { > struct dev_pagemap *pgmap = NULL; > - int ret = 0; > + unsigned long nr_pages = 0; > pte_t *ptep, *ptem; > ptem = ptep = pte_offset_map(&pmd, addr); > @@ -2908,24 +2907,20 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > * details. 
> */ > if (flags & FOLL_PIN) { > - ret = arch_make_folio_accessible(folio); > - if (ret) { > + if (arch_make_folio_accessible(folio)) { > gup_put_folio(folio, 1, flags); > goto pte_unmap; > } > } > folio_set_referenced(folio); > - pages[*nr] = page; > - (*nr)++; > + pages[nr_pages++] = page; > } while (ptep++, addr += PAGE_SIZE, addr != end); > - ret = 1; > - > pte_unmap: > if (pgmap) > put_dev_pagemap(pgmap); > pte_unmap(ptem); > - return ret; > + return nr_pages; > } > #else > @@ -2938,21 +2933,24 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > * get_user_pages_fast_only implementation that can pin pages. Thus it's still > * useful to have gup_fast_pmd_leaf even if we can't operate on ptes. > */ > -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) > { > return 0; > } > #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */ > -static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) > { > + const unsigned long nr_pages = (end - addr) >> PAGE_SHIFT; > struct page *page; > struct folio *folio; > - int refs; > + unsigned long i; > + > + /* See gup_fast_pte_range() */ > + if (pmd_protnone(orig)) > + return 0; > if (!pmd_access_permitted(orig, flags & FOLL_WRITE)) > return 0; > @@ -2960,33 +2958,30 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, > if (pmd_special(orig)) > return 0; > - refs = (end - addr) >> PAGE_SHIFT; > page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); > - folio = try_grab_folio_fast(page, refs, flags); > + 
folio = try_grab_folio_fast(page, nr_pages, flags); > if (!folio) > return 0; > if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); > return 0; > } > if (!gup_fast_folio_allowed(folio, flags)) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); > return 0; > } > if (!pmd_write(orig) && gup_must_unshare(NULL, flags, &folio->page)) { > - gup_put_folio(folio, refs, flags); > + gup_put_folio(folio, nr_pages, flags); > return 0; > } > - pages += *nr; > - *nr += refs; > - for (; refs; refs--) > + for (i = 0; i < nr_pages; i++) > *(pages++) = page++; > folio_set_referenced(folio); > - return 1; > + return nr_pages; > } > static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > @@ -3033,11 +3028,11 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > return 1; > } > -static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, > - unsigned long end, unsigned int flags, struct page **pages, > - int *nr) > +static unsigned long gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, > + unsigned long end, unsigned int flags, struct page **pages) > { > - unsigned long next; > + unsigned long cur_nr_pages, next; > + unsigned long nr_pages = 0; > pmd_t *pmdp; > pmdp = pmd_offset_lockless(pudp, pud, addr); > @@ -3046,23 +3041,21 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, > next = pmd_addr_end(addr, end); > if (!pmd_present(pmd)) > - return 0; > + break; > - if (unlikely(pmd_leaf(pmd))) { > - /* See gup_fast_pte_range() */ > - if (pmd_protnone(pmd)) > - return 0; > + if (unlikely(pmd_leaf(pmd))) > + cur_nr_pages = gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, pages); > + else > + cur_nr_pages = gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages); > - if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + nr_pages += cur_nr_pages; > + 
pages += cur_nr_pages; > - } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags, > - pages, nr)) > - return 0; > + if (nr_pages != (next - addr) >> PAGE_SIZE) > + break; > } while (pmdp++, addr = next, addr != end); > - return 1; > + return nr_pages; > } > static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, OK I guess you intentionally left the rest as a TODO :) So I'll wait for you to post it before reviewing in-depth. This generally LGTM as an approach, getting rid of *nr is important that's really horrible. > -- > 2.50.1 > > > > Oh, I might even have found a bug moving away from that questionable > "ret==1 means success" handling in gup_fast_pte_range()? Will > have to double-check, but likely the following is the right thing to do. > > > > From 8f48b25ef93e7ef98611fd58ec89384ad5171782 Mon Sep 17 00:00:00 2001 > From: David Hildenbrand > Date: Sat, 6 Sep 2025 08:46:45 +0200 > Subject: [PATCH] mm/gup: fix handling of errors from > arch_make_folio_accessible() in follow_page_pte() > > In case we call arch_make_folio_accessible() and it fails, we would > incorrectly return a value that is "!= 0" to the caller, indicating that > we pinned all requested pages and that the caller can keep going. > > follow_page_pte() is not supposed to return error values, but instead > 0 on failure and 1 on success. > > That is of course wrong, because the caller will just keep going pinning > more pages. If we happen to pin a page afterwards, we're in trouble, > because we essentially skipped some pages. > > Fixes: f28d43636d6f ("mm/gup/writeback: add callbacks for inaccessible pages") > Signed-off-by: David Hildenbrand > --- > mm/gup.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index 22420f2069ee1..cff226ec0ee7d 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2908,8 +2908,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, > * details. 
> */ > if (flags & FOLL_PIN) { > - ret = arch_make_folio_accessible(folio); > - if (ret) { > + if (arch_make_folio_accessible(folio)) { Oh Lord above. Lol. Yikes. Yeah I think your fix is valid... > gup_put_folio(folio, 1, flags); > goto pte_unmap; > } > -- > 2.50.1 > > > -- > Cheers > > David / dhildenb > From wangruikang at iscas.ac.cn Mon Sep 8 05:34:25 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 08 Sep 2025 20:34:25 +0800 Subject: [PATCH net-next v10 1/5] dt-bindings: net: Add support for SpacemiT K1 In-Reply-To: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> Message-ID: <20250908-net-k1-emac-v10-1-90d807ccd469@iscas.ac.cn> The Ethernet MACs on SpacemiT K1 appears to be a custom design. SpacemiT refers to them as "EMAC", so let's just call them "spacemit,k1-emac". Signed-off-by: Vivian Wang Reviewed-by: Conor Dooley --- .../devicetree/bindings/net/spacemit,k1-emac.yaml | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml new file mode 100644 index 0000000000000000000000000000000000000000..500a3e1daa230ea3a1fad30d8ea56a7822fccb3d --- /dev/null +++ b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml @@ -0,0 +1,81 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/spacemit,k1-emac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: SpacemiT K1 Ethernet MAC + +allOf: + - $ref: ethernet-controller.yaml# + +maintainers: + - Vivian Wang + +properties: + compatible: + const: spacemit,k1-emac + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + interrupts: + maxItems: 1 + + mdio-bus: + $ref: mdio.yaml# + unevaluatedProperties: false + + resets: + maxItems: 1 + + spacemit,apmu: + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - 
items: + - description: phandle to syscon that controls this MAC + - description: offset of control registers + description: + A phandle to syscon with byte offset to control registers for this MAC + +required: + - compatible + - reg + - clocks + - interrupts + - resets + - spacemit,apmu + +unevaluatedProperties: false + +examples: + - | + #include + + ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0xcac80000 0x00000420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + resets = <&syscon_apmu RESET_EMAC0>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; + }; -- 2.50.1 From wangruikang at iscas.ac.cn Mon Sep 8 05:34:27 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 08 Sep 2025 20:34:27 +0800 Subject: [PATCH net-next v10 3/5] riscv: dts: spacemit: Add Ethernet support for K1 In-Reply-To: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> Message-ID: <20250908-net-k1-emac-v10-3-90d807ccd469@iscas.ac.cn> Add nodes for each of the two Ethernet MACs on K1 with generic properties. Also add "gmac" pins to pinctrl config. 
Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 48 ++++++++++++++++++++++++++++ arch/riscv/boot/dts/spacemit/k1.dtsi | 22 +++++++++++++ 2 files changed, 70 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi index 3810557374228100be7adab58cd785c72e6d4aed..aff19c86d5ff381881016eaa87fc4809da65b50e 100644 --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi @@ -11,6 +11,54 @@ #define K1_GPIO(x) (x / 32) (x % 32) &pinctrl { + gmac0_cfg: gmac0-cfg { + gmac0-pins { + pinmux = , /* gmac0_rxdv */ + , /* gmac0_rx_d0 */ + , /* gmac0_rx_d1 */ + , /* gmac0_rx_clk */ + , /* gmac0_rx_d2 */ + , /* gmac0_rx_d3 */ + , /* gmac0_tx_d0 */ + , /* gmac0_tx_d1 */ + , /* gmac0_tx */ + , /* gmac0_tx_d2 */ + , /* gmac0_tx_d3 */ + , /* gmac0_tx_en */ + , /* gmac0_mdc */ + , /* gmac0_mdio */ + , /* gmac0_int_n */ + ; /* gmac0_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + + gmac1_cfg: gmac1-cfg { + gmac1-pins { + pinmux = , /* gmac1_rxdv */ + , /* gmac1_rx_d0 */ + , /* gmac1_rx_d1 */ + , /* gmac1_rx_clk */ + , /* gmac1_rx_d2 */ + , /* gmac1_rx_d3 */ + , /* gmac1_tx_d0 */ + , /* gmac1_tx_d1 */ + , /* gmac1_tx */ + , /* gmac1_tx_d2 */ + , /* gmac1_tx_d3 */ + , /* gmac1_tx_en */ + , /* gmac1_mdc */ + , /* gmac1_mdio */ + , /* gmac1_int_n */ + ; /* gmac1_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + uart0_2_cfg: uart0-2-cfg { uart0-2-pins { pinmux = , diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..7b2ac3637d6d9fa1929418cc68aa25c57850ac7f 100644 --- a/arch/riscv/boot/dts/spacemit/k1.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1.dtsi @@ -805,6 +805,28 @@ network-bus { #size-cells = <2>; dma-ranges = <0x0 0x00000000 0x0 0x00000000 0x0 0x80000000>, <0x0 0x80000000 0x1 0x00000000 0x0 0x80000000>; + + 
eth0: ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac80000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + status = "disabled"; + }; + + eth1: ethernet@cac81000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac81000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC1_BUS>; + interrupts = <133>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC1>; + spacemit,apmu = <&syscon_apmu 0x3ec>; + status = "disabled"; + }; }; pcie-bus { -- 2.50.1 From wangruikang at iscas.ac.cn Mon Sep 8 05:34:29 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 08 Sep 2025 20:34:29 +0800 Subject: [PATCH net-next v10 5/5] riscv: dts: spacemit: Add Ethernet support for Jupiter In-Reply-To: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> Message-ID: <20250908-net-k1-emac-v10-5-90d807ccd469@iscas.ac.cn> Milk-V Jupiter uses an RGMII PHY for each port and uses GPIO for PHY reset. 
Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts | 46 +++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts index 4483192141049caa201c093fb206b6134a064f42..c5933555c06b66f40e61fe2b9c159ba0770c2fa1 100644 --- a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts +++ b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts @@ -20,6 +20,52 @@ chosen { }; }; +&eth0 { + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; +}; + +&eth1 { + phy-handle = <&rgmii1>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac1_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <250>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii1: phy@1 { + reg = <0x1>; + }; + }; +}; + &uart0 { pinctrl-names = "default"; pinctrl-0 = <&uart0_2_cfg>; -- 2.50.1 From wangruikang at iscas.ac.cn Mon Sep 8 05:34:26 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 08 Sep 2025 20:34:26 +0800 Subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> Message-ID: <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> The Ethernet MACs found on SpacemiT K1 appears to be a custom design that only superficially resembles some other embedded MACs. 
SpacemiT refers to them as "EMAC", so let's just call the driver "k1_emac". Supports RGMII and RMII interfaces. Includes support for MAC hardware statistics counters. PTP support is not implemented. Signed-off-by: Vivian Wang Reviewed-by: Maxime Chevallier Reviewed-by: Vadim Fedorenko Reviewed-by: Troy Mitchell Tested-by: Junhui Liu Tested-by: Troy Mitchell --- drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/spacemit/Kconfig | 29 + drivers/net/ethernet/spacemit/Makefile | 6 + drivers/net/ethernet/spacemit/k1_emac.c | 2156 +++++++++++++++++++++++++++++++ drivers/net/ethernet/spacemit/k1_emac.h | 406 ++++++ 6 files changed, 2599 insertions(+) diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index f86d4557d8d7756a5e27bc17578353b5c19ca108..aead145dd91d129b7bb410f2d4d754c744dddbf4 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -188,6 +188,7 @@ source "drivers/net/ethernet/sis/Kconfig" source "drivers/net/ethernet/sfc/Kconfig" source "drivers/net/ethernet/smsc/Kconfig" source "drivers/net/ethernet/socionext/Kconfig" +source "drivers/net/ethernet/spacemit/Kconfig" source "drivers/net/ethernet/stmicro/Kconfig" source "drivers/net/ethernet/sun/Kconfig" source "drivers/net/ethernet/sunplus/Kconfig" diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile index 67182339469a0d8337cc4e92aa51e498c615156d..998dd628b202ced212748450753fe180f0440c74 100644 --- a/drivers/net/ethernet/Makefile +++ b/drivers/net/ethernet/Makefile @@ -91,6 +91,7 @@ obj-$(CONFIG_NET_VENDOR_SOLARFLARE) += sfc/ obj-$(CONFIG_NET_VENDOR_SGI) += sgi/ obj-$(CONFIG_NET_VENDOR_SMSC) += smsc/ obj-$(CONFIG_NET_VENDOR_SOCIONEXT) += socionext/ +obj-$(CONFIG_NET_VENDOR_SPACEMIT) += spacemit/ obj-$(CONFIG_NET_VENDOR_STMICRO) += stmicro/ obj-$(CONFIG_NET_VENDOR_SUN) += sun/ obj-$(CONFIG_NET_VENDOR_SUNPLUS) += sunplus/ diff --git a/drivers/net/ethernet/spacemit/Kconfig 
b/drivers/net/ethernet/spacemit/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..85ef61a9b4eff4249ad2d32a6e7dbf283b0c180f --- /dev/null +++ b/drivers/net/ethernet/spacemit/Kconfig @@ -0,0 +1,29 @@ +config NET_VENDOR_SPACEMIT + bool "SpacemiT devices" + default y + depends on ARCH_SPACEMIT || COMPILE_TEST + help + If you have a network (Ethernet) device belonging to this class, + say Y. + + Note that the answer to this question does not directly affect + the kernel: saying N will just cause the configurator to skip all + the questions regarding SpacemiT devices. If you say Y, you will + be asked for your specific chipset/driver in the following questions. + +if NET_VENDOR_SPACEMIT + +config SPACEMIT_K1_EMAC + tristate "SpacemiT K1 Ethernet MAC driver" + depends on ARCH_SPACEMIT || COMPILE_TEST + depends on MFD_SYSCON + depends on OF + default m if ARCH_SPACEMIT + select PHYLIB + help + This driver supports the Ethernet MAC in the SpacemiT K1 SoC. + + To compile this driver as a module, choose M here: the module + will be called k1_emac. + +endif # NET_VENDOR_SPACEMIT diff --git a/drivers/net/ethernet/spacemit/Makefile b/drivers/net/ethernet/spacemit/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..d29efd997a4ff5dcb50986e439997df7e3650570 --- /dev/null +++ b/drivers/net/ethernet/spacemit/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the SpacemiT network device drivers. +# + +obj-$(CONFIG_SPACEMIT_K1_EMAC) += k1_emac.o diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c new file mode 100644 index 0000000000000000000000000000000000000000..38c6eeb77b17fb12d0ace55f9e39429a26b2badb --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.c @@ -0,0 +1,2156 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SpacemiT K1 Ethernet driver + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. 
Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "k1_emac.h" + +#define DRIVER_NAME "k1_emac" + +#define EMAC_DEFAULT_BUFSIZE 1536 +#define EMAC_RX_BUF_2K 2048 +#define EMAC_RX_BUF_4K 4096 + +/* Tuning parameters from SpacemiT */ +#define EMAC_TX_FRAMES 64 +#define EMAC_TX_COAL_TIMEOUT 40000 +#define EMAC_RX_FRAMES 64 +#define EMAC_RX_COAL_TIMEOUT (600 * 312) + +#define DEFAULT_FC_PAUSE_TIME 0xffff +#define DEFAULT_FC_FIFO_HIGH 1600 +#define DEFAULT_TX_ALMOST_FULL 0x1f8 +#define DEFAULT_TX_THRESHOLD 1518 +#define DEFAULT_RX_THRESHOLD 12 +#define DEFAULT_TX_RING_NUM 1024 +#define DEFAULT_RX_RING_NUM 1024 +#define DEFAULT_DMA_BURST MREGBIT_BURST_16WORD +#define HASH_TABLE_SIZE 64 + +struct desc_buf { + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +struct emac_tx_desc_buffer { + struct sk_buff *skb; + struct desc_buf buf[2]; +}; + +struct emac_rx_desc_buffer { + struct sk_buff *skb; + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +/** + * struct emac_desc_ring - Software-side information for one descriptor ring + * Same structure used for both RX and TX + * @desc_addr: Virtual address to the descriptor ring memory + * @desc_dma_addr: DMA address of the descriptor ring + * @total_size: Size of ring in bytes + * @total_cnt: Number of descriptors + * @head: Next descriptor to associate a buffer with + * @tail: Next descriptor to check status bit + * @rx_desc_buf: Array of descriptors for RX + * @tx_desc_buf: Array of descriptors for TX, with max of two buffers each + */ +struct emac_desc_ring { + void *desc_addr; + dma_addr_t desc_dma_addr; + u32 total_size; + u32 total_cnt; + u32 head; + u32 tail; + union { + struct emac_rx_desc_buffer *rx_desc_buf; + struct 
emac_tx_desc_buffer *tx_desc_buf; + }; +}; + +struct emac_priv { + void __iomem *iobase; + u32 dma_buf_sz; + struct emac_desc_ring tx_ring; + struct emac_desc_ring rx_ring; + + struct net_device *ndev; + struct napi_struct napi; + struct platform_device *pdev; + struct clk *bus_clk; + struct clk *ref_clk; + struct regmap *regmap_apmu; + u32 regmap_apmu_offset; + int irq; + + phy_interface_t phy_interface; + + struct emac_hw_tx_stats tx_stats, tx_stats_off; + struct emac_hw_rx_stats rx_stats, rx_stats_off; + + /* Just access with atomic operations, since K1 is 64-bit */ + u64 __percpu *stat_tx_dropped; + + u32 tx_count_frames; + u32 tx_coal_frames; + u32 tx_coal_timeout; + struct work_struct tx_timeout_task; + + struct timer_list txtimer; + struct timer_list stats_timer; + + u32 tx_delay; + u32 rx_delay; + + bool flow_control_autoneg; + u8 flow_control; + + /* Hold while touching hardware statistics */ + spinlock_t stats_lock; +}; + +static void emac_wr(struct emac_priv *priv, u32 reg, u32 val) +{ + writel(val, priv->iobase + reg); +} + +static int emac_rd(struct emac_priv *priv, u32 reg) +{ + return readl(priv->iobase + reg); +} + +static int emac_phy_interface_config(struct emac_priv *priv) +{ + u32 val = 0, mask = REF_CLK_SEL | RGMII_TX_CLK_SEL | PHY_INTF_RGMII; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) + val |= PHY_INTF_RGMII; + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + mask, val); + + return 0; +} + +/* + * Where the hardware expects a MAC address, it is laid out in this high, med, + * low order in three consecutive registers and in this format. 
+ */ + +static void emac_set_mac_addr_reg(struct emac_priv *priv, + const unsigned char *addr, + u32 reg) +{ + emac_wr(priv, reg + sizeof(u32) * 0, addr[1] << 8 | addr[0]); + emac_wr(priv, reg + sizeof(u32) * 1, addr[3] << 8 | addr[2]); + emac_wr(priv, reg + sizeof(u32) * 2, addr[5] << 8 | addr[4]); +} + +static void emac_set_mac_addr(struct emac_priv *priv, const unsigned char *addr) +{ + /* We use only one address, so set the same for flow control as well */ + emac_set_mac_addr_reg(priv, addr, MAC_ADDRESS1_HIGH); + emac_set_mac_addr_reg(priv, addr, MAC_FC_SOURCE_ADDRESS_HIGH); +} + +static void emac_reset_hw(struct emac_priv *priv) +{ + /* Disable all interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Disable DMA */ + emac_wr(priv, DMA_CONTROL, 0x0); +} + +static void emac_init_hw(struct emac_priv *priv) +{ + /* Destination address for 802.3x Ethernet flow control */ + u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; + + u32 rxirq = 0, dma = 0; + + regmap_set_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + AXI_SINGLE_ID); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Enable MAC address 1 filtering */ + emac_wr(priv, MAC_ADDRESS_CONTROL, MREGBIT_MAC_ADDRESS1_ENABLE); + + /* Zero initialize the multicast hash table */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); + + /* Configure thresholds */ + emac_wr(priv, MAC_TRANSMIT_FIFO_ALMOST_FULL, DEFAULT_TX_ALMOST_FULL); + emac_wr(priv, MAC_TRANSMIT_PACKET_START_THRESHOLD, + DEFAULT_TX_THRESHOLD); + emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, 
DEFAULT_RX_THRESHOLD); + + /* Configure flow control (enabled in emac_adjust_link() later) */ + emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); + emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); + emac_wr(priv, MAC_FC_HIGH_PAUSE_TIME, DEFAULT_FC_PAUSE_TIME); + emac_wr(priv, MAC_FC_PAUSE_LOW_THRESHOLD, 0); + + /* RX IRQ mitigation */ + rxirq = FIELD_PREP(MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK, + EMAC_RX_FRAMES); + rxirq |= FIELD_PREP(MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK, + EMAC_RX_COAL_TIMEOUT); + rxirq |= MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE; + emac_wr(priv, DMA_RECEIVE_IRQ_MITIGATION_CTRL, rxirq); + + /* Disable and set DMA config */ + emac_wr(priv, DMA_CONTROL, 0x0); + + emac_wr(priv, DMA_CONFIGURATION, MREGBIT_SOFTWARE_RESET); + usleep_range(9000, 10000); + emac_wr(priv, DMA_CONFIGURATION, 0x0); + usleep_range(9000, 10000); + + dma |= MREGBIT_STRICT_BURST; + dma |= MREGBIT_DMA_64BIT_MODE; + dma |= DEFAULT_DMA_BURST; + + emac_wr(priv, DMA_CONFIGURATION, dma); +} + +static void emac_dma_start_transmit(struct emac_priv *priv) +{ + /* The actual value written does not matter */ + emac_wr(priv, DMA_TRANSMIT_POLL_DEMAND, 1); +} + +static void emac_enable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val |= MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val |= MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static void emac_disable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val &= ~MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val &= ~MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static u32 emac_tx_avail(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 avail; + + if (tx_ring->tail > tx_ring->head) + avail = tx_ring->tail - tx_ring->head - 1; + else + avail = tx_ring->total_cnt - tx_ring->head + 
tx_ring->tail - 1; + + return avail; +} + +static void emac_tx_coal_timer_resched(struct emac_priv *priv) +{ + mod_timer(&priv->txtimer, + jiffies + usecs_to_jiffies(priv->tx_coal_timeout)); +} + +static void emac_tx_coal_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, txtimer); + + napi_schedule(&priv->napi); +} + +static bool emac_tx_should_interrupt(struct emac_priv *priv, u32 pkt_num) +{ + priv->tx_count_frames += pkt_num; + if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { + emac_tx_coal_timer_resched(priv); + return false; + } + + priv->tx_count_frames = 0; + return true; +} + +static void emac_free_tx_buf(struct emac_priv *priv, int i) +{ + struct emac_tx_desc_buffer *tx_buf; + struct emac_desc_ring *tx_ring; + struct desc_buf *buf; + int j; + + tx_ring = &priv->tx_ring; + tx_buf = &tx_ring->tx_desc_buf[i]; + + for (j = 0; j < 2; j++) { + buf = &tx_buf->buf[j]; + if (!buf->dma_addr) + continue; + + if (buf->map_as_page) + dma_unmap_page(&priv->pdev->dev, buf->dma_addr, + buf->dma_len, DMA_TO_DEVICE); + else + dma_unmap_single(&priv->pdev->dev, + buf->dma_addr, buf->dma_len, + DMA_TO_DEVICE); + + buf->dma_addr = 0; + buf->map_as_page = false; + buf->buff_addr = NULL; + } + + if (tx_buf->skb) { + dev_kfree_skb_any(tx_buf->skb); + tx_buf->skb = NULL; + } +} + +static void emac_clean_tx_desc_ring(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 i; + + /* Free all the TX ring skbs */ + for (i = 0; i < tx_ring->total_cnt; i++) + emac_free_tx_buf(priv, i); + + tx_ring->head = 0; + tx_ring->tail = 0; +} + +static void emac_clean_rx_desc_ring(struct emac_priv *priv) +{ + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + u32 i; + + rx_ring = &priv->rx_ring; + + /* Free all the RX ring skbs */ + for (i = 0; i < rx_ring->total_cnt; i++) { + rx_buf = &rx_ring->rx_desc_buf[i]; + + if (!rx_buf->skb) + continue; + + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + 
rx_buf->dma_len, DMA_FROM_DEVICE); + + dev_kfree_skb(rx_buf->skb); + rx_buf->skb = NULL; + } + + rx_ring->tail = 0; + rx_ring->head = 0; +} + +static int emac_alloc_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct platform_device *pdev = priv->pdev; + u32 size; + + size = sizeof(struct emac_tx_desc_buffer) * tx_ring->total_cnt; + + tx_ring->tx_desc_buf = kzalloc(size, GFP_KERNEL); + if (!tx_ring->tx_desc_buf) + return -ENOMEM; + + tx_ring->total_size = tx_ring->total_cnt * sizeof(struct emac_desc); + tx_ring->total_size = ALIGN(tx_ring->total_size, PAGE_SIZE); + + tx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, tx_ring->total_size, + &tx_ring->desc_dma_addr, + GFP_KERNEL); + if (!tx_ring->desc_addr) { + kfree(tx_ring->tx_desc_buf); + return -ENOMEM; + } + + tx_ring->head = 0; + tx_ring->tail = 0; + + return 0; +} + +static int emac_alloc_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = &priv->rx_ring; + struct platform_device *pdev = priv->pdev; + u32 buf_len; + + buf_len = sizeof(struct emac_rx_desc_buffer) * rx_ring->total_cnt; + + rx_ring->rx_desc_buf = kzalloc(buf_len, GFP_KERNEL); + if (!rx_ring->rx_desc_buf) + return -ENOMEM; + + rx_ring->total_size = rx_ring->total_cnt * sizeof(struct emac_desc); + + rx_ring->total_size = ALIGN(rx_ring->total_size, PAGE_SIZE); + + rx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, rx_ring->total_size, + &rx_ring->desc_dma_addr, + GFP_KERNEL); + if (!rx_ring->desc_addr) { + kfree(rx_ring->rx_desc_buf); + return -ENOMEM; + } + + rx_ring->head = 0; + rx_ring->tail = 0; + + return 0; +} + +static void emac_free_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tr = &priv->tx_ring; + struct device *dev = &priv->pdev->dev; + + emac_clean_tx_desc_ring(priv); + + kfree(tr->tx_desc_buf); + tr->tx_desc_buf = NULL; + + dma_free_coherent(dev, tr->total_size, tr->desc_addr, + tr->desc_dma_addr); + tr->desc_addr = NULL; +} + +static void 
emac_free_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rr = &priv->rx_ring; + struct device *dev = &priv->pdev->dev; + + emac_clean_rx_desc_ring(priv); + + kfree(rr->rx_desc_buf); + rr->rx_desc_buf = NULL; + + dma_free_coherent(dev, rr->total_size, rr->desc_addr, + rr->desc_dma_addr); + rr->desc_addr = NULL; +} + +static int emac_tx_clean_desc(struct emac_priv *priv) +{ + struct net_device *ndev = priv->ndev; + struct emac_desc_ring *tx_ring; + struct emac_desc *tx_desc; + u32 i; + + netif_tx_lock(ndev); + + tx_ring = &priv->tx_ring; + + i = tx_ring->tail; + + while (i != tx_ring->head) { + tx_desc = &((struct emac_desc *)tx_ring->desc_addr)[i]; + + /* Stop checking if desc still owned by DMA */ + if (READ_ONCE(tx_desc->desc0) & TX_DESC_0_OWN) + break; + + emac_free_tx_buf(priv, i); + memset(tx_desc, 0, sizeof(struct emac_desc)); + + if (++i == tx_ring->total_cnt) + i = 0; + } + + tx_ring->tail = i; + + if (unlikely(netif_queue_stopped(ndev) && + emac_tx_avail(priv) > tx_ring->total_cnt / 4)) + netif_wake_queue(ndev); + + netif_tx_unlock(ndev); + + return 0; +} + +static bool emac_rx_frame_good(struct emac_priv *priv, struct emac_desc *desc) +{ + const char *msg; + u32 len; + + len = FIELD_GET(RX_DESC_0_FRAME_PACKET_LENGTH_MASK, desc->desc0); + + if (WARN_ON_ONCE(!(desc->desc0 & RX_DESC_0_LAST_DESCRIPTOR))) + msg = "Not last descriptor"; /* This would be a bug */ + else if (desc->desc0 & RX_DESC_0_FRAME_RUNT) + msg = "Runt frame"; + else if (desc->desc0 & RX_DESC_0_FRAME_CRC_ERR) + msg = "Frame CRC error"; + else if (desc->desc0 & RX_DESC_0_FRAME_MAX_LEN_ERR) + msg = "Frame exceeds max length"; + else if (desc->desc0 & RX_DESC_0_FRAME_JABBER_ERR) + msg = "Frame jabber error"; + else if (desc->desc0 & RX_DESC_0_FRAME_LENGTH_ERR) + msg = "Frame length error"; + else if (len <= ETH_FCS_LEN || len > priv->dma_buf_sz) + msg = "Frame length unacceptable"; + else + return true; /* All good */ + + dev_dbg_ratelimited(&priv->ndev->dev, "RX error: %s", 
msg); + + return false; +} + +static void emac_alloc_rx_desc_buffers(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = &priv->rx_ring; + struct emac_desc rx_desc, *rx_desc_addr; + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct sk_buff *skb; + u32 i; + + i = rx_ring->head; + rx_buf = &rx_ring->rx_desc_buf[i]; + + while (!rx_buf->skb) { + skb = netdev_alloc_skb_ip_align(ndev, priv->dma_buf_sz); + if (!skb) + break; + + skb->dev = ndev; + + rx_buf->skb = skb; + rx_buf->dma_len = priv->dma_buf_sz; + rx_buf->dma_addr = dma_map_single(&priv->pdev->dev, skb->data, + priv->dma_buf_sz, + DMA_FROM_DEVICE); + if (dma_mapping_error(&priv->pdev->dev, rx_buf->dma_addr)) { + dev_err_ratelimited(&ndev->dev, "Mapping skb failed\n"); + goto err_free_skb; + } + + rx_desc_addr = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + memset(&rx_desc, 0, sizeof(rx_desc)); + + rx_desc.buffer_addr_1 = rx_buf->dma_addr; + rx_desc.desc1 = FIELD_PREP(RX_DESC_1_BUFFER_SIZE_1_MASK, + rx_buf->dma_len); + + if (++i == rx_ring->total_cnt) { + rx_desc.desc1 |= RX_DESC_1_END_RING; + i = 0; + } + + *rx_desc_addr = rx_desc; + dma_wmb(); + WRITE_ONCE(rx_desc_addr->desc0, rx_desc.desc0 | RX_DESC_0_OWN); + + rx_buf = &rx_ring->rx_desc_buf[i]; + } + + rx_ring->head = i; + return; + +err_free_skb: + dev_kfree_skb_any(skb); + rx_buf->skb = NULL; +} + +/* Returns number of packets received */ +static int emac_rx_clean_desc(struct emac_priv *priv, int budget) +{ + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + struct sk_buff *skb = NULL; + struct emac_desc *rx_desc; + u32 got = 0, skb_len, i; + + rx_ring = &priv->rx_ring; + + i = rx_ring->tail; + + while (budget--) { + rx_desc = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + /* Stop checking if rx_desc still owned by DMA */ + if (READ_ONCE(rx_desc->desc0) & RX_DESC_0_OWN) + break; + + dma_rmb(); + + rx_buf = &rx_ring->rx_desc_buf[i]; + + if 
(!rx_buf->skb) + break; + + got++; + + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + rx_buf->dma_len, DMA_FROM_DEVICE); + + if (likely(emac_rx_frame_good(priv, rx_desc))) { + skb = rx_buf->skb; + + skb_len = FIELD_GET(RX_DESC_0_FRAME_PACKET_LENGTH_MASK, + rx_desc->desc0); + skb_len -= ETH_FCS_LEN; + + skb_put(skb, skb_len); + skb->dev = ndev; + ndev->hard_header_len = ETH_HLEN; + + skb->protocol = eth_type_trans(skb, ndev); + + skb->ip_summed = CHECKSUM_NONE; + + napi_gro_receive(&priv->napi, skb); + + memset(rx_desc, 0, sizeof(struct emac_desc)); + rx_buf->skb = NULL; + } else { + dev_kfree_skb_irq(rx_buf->skb); + rx_buf->skb = NULL; + } + + if (++i == rx_ring->total_cnt) + i = 0; + } + + rx_ring->tail = i; + + emac_alloc_rx_desc_buffers(priv); + + return got; +} + +static int emac_rx_poll(struct napi_struct *napi, int budget) +{ + struct emac_priv *priv = container_of(napi, struct emac_priv, napi); + int work_done; + + emac_tx_clean_desc(priv); + + work_done = emac_rx_clean_desc(priv, budget); + if (work_done < budget && napi_complete_done(napi, work_done)) + emac_enable_interrupt(priv); + + return work_done; +} + +/* + * For convenience, skb->data is fragment 0, frags[0] is fragment 1, etc. + * + * Each descriptor can hold up to two fragments, called buffer 1 and 2. For each + * fragment f, if f % 2 == 0, it uses buffer 1, otherwise it uses buffer 2. 
+ */ + +static int emac_tx_map_frag(struct device *dev, struct emac_desc *tx_desc, + struct emac_tx_desc_buffer *tx_buf, + struct sk_buff *skb, u32 frag_idx) +{ + bool map_as_page, buf_idx; + const skb_frag_t *frag; + phys_addr_t addr; + u32 len; + int ret; + + buf_idx = frag_idx % 2; + + if (frag_idx == 0) { + /* Non-fragmented part */ + len = skb_headlen(skb); + addr = dma_map_single(dev, skb->data, len, DMA_TO_DEVICE); + map_as_page = false; + } else { + /* Fragment */ + frag = &skb_shinfo(skb)->frags[frag_idx - 1]; + len = skb_frag_size(frag); + addr = skb_frag_dma_map(dev, frag, 0, len, DMA_TO_DEVICE); + map_as_page = true; + } + + ret = dma_mapping_error(dev, addr); + if (ret) + return ret; + + tx_buf->buf[buf_idx].dma_addr = addr; + tx_buf->buf[buf_idx].dma_len = len; + tx_buf->buf[buf_idx].map_as_page = map_as_page; + + if (buf_idx == 0) { + tx_desc->buffer_addr_1 = addr; + tx_desc->desc1 |= FIELD_PREP(TX_DESC_1_BUFFER_SIZE_1_MASK, len); + } else { + tx_desc->buffer_addr_2 = addr; + tx_desc->desc1 |= FIELD_PREP(TX_DESC_1_BUFFER_SIZE_2_MASK, len); + } + + return 0; +} + +static void emac_tx_mem_map(struct emac_priv *priv, struct sk_buff *skb) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct emac_desc tx_desc, *tx_desc_addr; + struct device *dev = &priv->pdev->dev; + struct emac_tx_desc_buffer *tx_buf; + u32 head, old_head, frag_num, f; + bool buf_idx; + + frag_num = skb_shinfo(skb)->nr_frags; + head = tx_ring->head; + old_head = head; + + for (f = 0; f < frag_num + 1; f++) { + buf_idx = f % 2; + + /* + * If using buffer 1, initialize a new desc. Otherwise, use + * buffer 2 of previous fragment's desc. + */ + if (!buf_idx) { + tx_buf = &tx_ring->tx_desc_buf[head]; + tx_desc_addr = + &((struct emac_desc *)tx_ring->desc_addr)[head]; + memset(&tx_desc, 0, sizeof(tx_desc)); + + /* + * Give ownership for all but first desc initially. For + * first desc, give at the end so DMA cannot start + * reading uninitialized descs. 
+ */ + if (head != old_head) + tx_desc.desc0 |= TX_DESC_0_OWN; + + if (++head == tx_ring->total_cnt) { + /* Just used last desc in ring */ + tx_desc.desc1 |= TX_DESC_1_END_RING; + head = 0; + } + } + + if (emac_tx_map_frag(dev, &tx_desc, tx_buf, skb, f)) { + dev_err_ratelimited(&priv->ndev->dev, + "Map TX frag %d failed\n", f); + goto err_free_skb; + } + + if (f == 0) + tx_desc.desc1 |= TX_DESC_1_FIRST_SEGMENT; + + if (f == frag_num) { + tx_desc.desc1 |= TX_DESC_1_LAST_SEGMENT; + tx_buf->skb = skb; + if (emac_tx_should_interrupt(priv, frag_num + 1)) + tx_desc.desc1 |= + TX_DESC_1_INTERRUPT_ON_COMPLETION; + } + + *tx_desc_addr = tx_desc; + } + + /* All descriptors are ready, give ownership for first desc */ + tx_desc_addr = &((struct emac_desc *)tx_ring->desc_addr)[old_head]; + dma_wmb(); + WRITE_ONCE(tx_desc_addr->desc0, tx_desc_addr->desc0 | TX_DESC_0_OWN); + + emac_dma_start_transmit(priv); + + tx_ring->head = head; + + return; + +err_free_skb: + this_cpu_inc(*priv->stat_tx_dropped); + dev_kfree_skb_any(skb); +} + +static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + int nfrags = skb_shinfo(skb)->nr_frags; + struct device *dev = &priv->pdev->dev; + + if (unlikely(emac_tx_avail(priv) < nfrags + 1)) { + if (!netif_queue_stopped(ndev)) { + netif_stop_queue(ndev); + dev_err_ratelimited(dev, "TX ring full, stop TX queue\n"); + } + return NETDEV_TX_BUSY; + } + + emac_tx_mem_map(priv, skb); + + /* Make sure there is space in the ring for the next TX. */ + if (unlikely(emac_tx_avail(priv) <= MAX_SKB_FRAGS + 2)) + netif_stop_queue(ndev); + + return NETDEV_TX_OK; +} + +static int emac_set_mac_address(struct net_device *ndev, void *addr) +{ + struct emac_priv *priv = netdev_priv(ndev); + int ret = eth_mac_addr(ndev, addr); + + if (ret) + return ret; + + /* If running, set now; if not running it will be set in emac_up. 
*/ + if (netif_running(ndev)) + emac_set_mac_addr(priv, ndev->dev_addr); + + return 0; +} + +static void emac_mac_multicast_filter_clear(struct emac_priv *priv) +{ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); +} + +/* Configure Multicast and Promiscuous modes */ +static void emac_set_rx_mode(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + u32 crc32, bit, reg, hash, val; + struct netdev_hw_addr *ha; + u32 mc_filter[4] = { 0 }; + + val = emac_rd(priv, MAC_ADDRESS_CONTROL); + + val &= ~MREGBIT_PROMISCUOUS_MODE; + + if (ndev->flags & IFF_PROMISC) { + /* Enable promisc mode */ + val |= MREGBIT_PROMISCUOUS_MODE; + } else if ((ndev->flags & IFF_ALLMULTI) || + (netdev_mc_count(ndev) > HASH_TABLE_SIZE)) { + /* Accept all multicast frames by setting every bit */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0xffff); + } else if (!netdev_mc_empty(ndev)) { + emac_mac_multicast_filter_clear(priv); + netdev_for_each_mc_addr(ha, ndev) { + /* Calculate the CRC of the MAC address */ + crc32 = ether_crc(ETH_ALEN, ha->addr); + + /* + * The hash table is an array of 4 16-bit registers. It + * is treated like an array of 64 bits (bits[hash]). Use + * the upper 6 bits of the above CRC as the hash value. 
+ */ + hash = (crc32 >> 26) & 0x3F; + reg = hash / 16; + bit = hash % 16; + mc_filter[reg] |= BIT(bit); + } + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, mc_filter[0]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, mc_filter[1]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, mc_filter[2]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, mc_filter[3]); + } + + emac_wr(priv, MAC_ADDRESS_CONTROL, val); +} + +static int emac_change_mtu(struct net_device *ndev, int mtu) +{ + struct emac_priv *priv = netdev_priv(ndev); + u32 frame_len; + + if (netif_running(ndev)) { + netdev_err(ndev, "must be stopped to change MTU\n"); + return -EBUSY; + } + + frame_len = mtu + ETH_HLEN + ETH_FCS_LEN; + + if (frame_len <= EMAC_DEFAULT_BUFSIZE) + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + else if (frame_len <= EMAC_RX_BUF_2K) + priv->dma_buf_sz = EMAC_RX_BUF_2K; + else + priv->dma_buf_sz = EMAC_RX_BUF_4K; + + ndev->mtu = mtu; + + return 0; +} + +static void emac_tx_timeout(struct net_device *ndev, unsigned int txqueue) +{ + struct emac_priv *priv = netdev_priv(ndev); + + schedule_work(&priv->tx_timeout_task); +} + +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + cmd |= phy_addr & 0x1F; + cmd |= (regnum & 0x1F) << 5; + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; + + emac_wr(priv, MAC_MDIO_DATA, 0x0); + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !((val >> 15) & 0x1), 100, 10000); + + if (ret) + return ret; + + val = emac_rd(priv, MAC_MDIO_DATA); + return val; +} + +static int emac_mii_write(struct mii_bus *bus, int phy_addr, int regnum, + u16 value) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + emac_wr(priv, MAC_MDIO_DATA, value); + + cmd |= phy_addr & 0x1F; + cmd |= (regnum & 0x1F) << 5; + cmd |= MREGBIT_START_MDIO_TRANS; + + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = 
readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !((val >> 15) & 0x1), 100, 10000); + + return ret; +} + +static int emac_mdio_init(struct emac_priv *priv) +{ + struct device *dev = &priv->pdev->dev; + struct device_node *mii_np; + struct mii_bus *mii; + int ret; + + mii = devm_mdiobus_alloc(dev); + if (!mii) + return -ENOMEM; + + mii->priv = priv; + mii->name = "k1_emac_mii"; + mii->read = emac_mii_read; + mii->write = emac_mii_write; + mii->parent = dev; + mii->phy_mask = 0xffffffff; + snprintf(mii->id, MII_BUS_ID_SIZE, "%s", priv->pdev->name); + + mii_np = of_get_available_child_by_name(dev->of_node, "mdio-bus"); + + ret = devm_of_mdiobus_register(dev, mii, mii_np); + if (ret) + dev_err_probe(dev, ret, "Failed to register mdio bus\n"); + + of_node_put(mii_np); + return ret; +} + +static void emac_set_tx_fc(struct emac_priv *priv, bool enable) +{ + u32 val; + + val = emac_rd(priv, MAC_FC_CONTROL); + + if (enable) { + val |= MREGBIT_FC_GENERATION_ENABLE; + val |= MREGBIT_AUTO_FC_GENERATION_ENABLE; + } else { + val &= ~MREGBIT_FC_GENERATION_ENABLE; + val &= ~MREGBIT_AUTO_FC_GENERATION_ENABLE; + } + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_rx_fc(struct emac_priv *priv, bool enable) +{ + u32 val = emac_rd(priv, MAC_FC_CONTROL); + + if (enable) + val |= MREGBIT_FC_DECODE_ENABLE; + else + val &= ~MREGBIT_FC_DECODE_ENABLE; + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_fc(struct emac_priv *priv, u8 fc) +{ + emac_set_tx_fc(priv, fc & FLOW_CTRL_TX); + emac_set_rx_fc(priv, fc & FLOW_CTRL_RX); + priv->flow_control = fc; +} + +static void emac_set_fc_autoneg(struct emac_priv *priv) +{ + struct phy_device *phydev = priv->ndev->phydev; + u32 local_adv, remote_adv; + u8 fc; + + local_adv = linkmode_adv_to_lcl_adv_t(phydev->advertising); + + remote_adv = 0; + + if (phydev->pause) + remote_adv |= LPA_PAUSE_CAP; + + if (phydev->asym_pause) + remote_adv |= LPA_PAUSE_ASYM; + + fc = mii_resolve_flowctrl_fdx(local_adv, remote_adv); 
+ + priv->flow_control_autoneg = true; + + emac_set_fc(priv, fc); +} + +/* + * Even though this MAC supports gigabit operation, it only provides 32-bit + * statistics counters. The most overflow-prone counters are the "bytes" ones, + * which at gigabit overflow about twice a minute. + * + * Therefore, we maintain the high 32 bits of counters ourselves, incrementing + * every time statistics seem to go backwards. Also, update periodically to + * catch overflows when we are not otherwise checking the statistics often + * enough. + */ + +#define EMAC_STATS_TIMER_PERIOD 20 + +static int emac_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res, + u32 control_reg, u32 high_reg, u32 low_reg) +{ + u32 val; + int ret; + + /* The "read" bit is the same for TX and RX */ + + val = MREGBIT_START_TX_COUNTER_READ | cnt; + emac_wr(priv, control_reg, val); + val = emac_rd(priv, control_reg); + + ret = readl_poll_timeout_atomic(priv->iobase + control_reg, val, + !(val & MREGBIT_START_TX_COUNTER_READ), + 100, 10000); + + if (ret) { + netdev_err(priv->ndev, "Read stat timeout\n"); + return ret; + } + + *res = emac_rd(priv, high_reg) << 16; + *res |= (u16)emac_rd(priv, low_reg); + + return 0; +} + +static int emac_tx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_TX_STATCTR_CONTROL, + MAC_TX_STATCTR_DATA_HIGH, + MAC_TX_STATCTR_DATA_LOW); +} + +static int emac_rx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_RX_STATCTR_CONTROL, + MAC_RX_STATCTR_DATA_HIGH, + MAC_RX_STATCTR_DATA_LOW); +} + +static void emac_update_counter(u64 *counter, u32 new_low) +{ + u32 old_low = (u32)*counter; + u64 high = *counter >> 32; + + if (old_low > new_low) { + /* Overflowed, increment high 32 bits */ + high++; + } + + *counter = (high << 32) | new_low; +} + +static void emac_stats_update(struct emac_priv *priv) +{ + u64 *tx_stats_off = (u64 *)&priv->tx_stats_off; + u64 *rx_stats_off = 
(u64 *)&priv->rx_stats_off; + u64 *tx_stats = (u64 *)&priv->tx_stats; + u64 *rx_stats = (u64 *)&priv->rx_stats; + u32 i, res; + + assert_spin_locked(&priv->stats_lock); + + if (!netif_running(priv->ndev) || !netif_device_present(priv->ndev)) { + /* Not up, don't try to update */ + return; + } + + for (i = 0; i < sizeof(priv->tx_stats) / sizeof(*tx_stats); i++) { + /* + * If reading stats times out, everything is broken and there's + * nothing we can do. Reading statistics also can't return an + * error, so just return without updating and without + * rescheduling. + */ + if (emac_tx_read_stat_cnt(priv, i, &res)) + return; + + /* + * Re-initializing while bringing interface up resets counters + * to zero, so to provide continuity, we add the values saved + * last time we did emac_down() to the new hardware-provided + * value. + */ + emac_update_counter(&tx_stats[i], res + (u32)tx_stats_off[i]); + } + + /* Similar remarks as TX stats */ + for (i = 0; i < sizeof(priv->rx_stats) / sizeof(*rx_stats); i++) { + if (emac_rx_read_stat_cnt(priv, i, &res)) + return; + emac_update_counter(&rx_stats[i], res + (u32)rx_stats_off[i]); + } + + mod_timer(&priv->stats_timer, jiffies + EMAC_STATS_TIMER_PERIOD * HZ); +} + +static void emac_stats_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, stats_timer); + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + spin_unlock(&priv->stats_lock); +} + +static const struct ethtool_rmon_hist_range emac_rmon_hist_ranges[] = { + { 64, 64 }, + { 65, 127 }, + { 128, 255 }, + { 256, 511 }, + { 512, 1023 }, + { 1024, 1518 }, + { 1519, 4096 }, + { /* sentinel */ }, +}; + +static u64 emac_get_stat_tx_dropped(struct emac_priv *priv) +{ + u64 result = 0; + int cpu; + + for_each_possible_cpu(cpu) { + result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu)); + } + + return result; +} + +static void emac_get_stats64(struct net_device *dev, + struct rtnl_link_stats64 *storage) +{ + struct emac_priv *priv = 
netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + storage->tx_dropped = emac_get_stat_tx_dropped(priv); + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + storage->tx_packets = tx_stats->tx_ok_pkts; + storage->tx_bytes = tx_stats->tx_ok_bytes; + storage->tx_errors = tx_stats->tx_err_pkts; + + storage->rx_packets = rx_stats->rx_ok_pkts; + storage->rx_bytes = rx_stats->rx_ok_bytes; + storage->rx_errors = rx_stats->rx_err_total_pkts; + storage->rx_crc_errors = rx_stats->rx_crc_err_pkts; + storage->rx_frame_errors = rx_stats->rx_align_err_pkts; + storage->rx_length_errors = rx_stats->rx_len_err_pkts; + + storage->collisions = tx_stats->tx_singleclsn_pkts; + storage->collisions += tx_stats->tx_multiclsn_pkts; + storage->collisions += tx_stats->tx_excessclsn_pkts; + + storage->rx_missed_errors = rx_stats->rx_drp_fifo_full_pkts; + storage->rx_missed_errors += rx_stats->rx_truncate_fifo_full_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_rmon_stats(struct net_device *dev, + struct ethtool_rmon_stats *rmon_stats, + const struct ethtool_rmon_hist_range **ranges) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_rx_stats *rx_stats; + + rx_stats = &priv->rx_stats; + + *ranges = emac_rmon_hist_ranges; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + rmon_stats->undersize_pkts = rx_stats->rx_len_undersize_pkts; + rmon_stats->oversize_pkts = rx_stats->rx_len_oversize_pkts; + rmon_stats->fragments = rx_stats->rx_len_fragment_pkts; + rmon_stats->jabbers = rx_stats->rx_len_jabber_pkts; + + /* Only RX has histogram stats */ + + rmon_stats->hist[0] = rx_stats->rx_64_pkts; + rmon_stats->hist[1] = rx_stats->rx_65_127_pkts; + rmon_stats->hist[2] = rx_stats->rx_128_255_pkts; + rmon_stats->hist[3] = rx_stats->rx_256_511_pkts; + rmon_stats->hist[4] = rx_stats->rx_512_1023_pkts; + rmon_stats->hist[5] = 
rx_stats->rx_1024_1518_pkts; + rmon_stats->hist[6] = rx_stats->rx_1519_plus_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_eth_mac_stats(struct net_device *dev, + struct ethtool_eth_mac_stats *mac_stats) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + mac_stats->MulticastFramesXmittedOK = tx_stats->tx_multicast_pkts; + mac_stats->BroadcastFramesXmittedOK = tx_stats->tx_broadcast_pkts; + + mac_stats->MulticastFramesReceivedOK = rx_stats->rx_multicast_pkts; + mac_stats->BroadcastFramesReceivedOK = rx_stats->rx_broadcast_pkts; + + mac_stats->SingleCollisionFrames = tx_stats->tx_singleclsn_pkts; + mac_stats->MultipleCollisionFrames = tx_stats->tx_multiclsn_pkts; + mac_stats->LateCollisions = tx_stats->tx_lateclsn_pkts; + mac_stats->FramesAbortedDueToXSColls = tx_stats->tx_excessclsn_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_pause_stats(struct net_device *dev, + struct ethtool_pause_stats *pause_stats) +{ + struct emac_priv *priv = netdev_priv(dev); + struct emac_hw_tx_stats *tx_stats; + struct emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + pause_stats->tx_pause_frames = tx_stats->tx_pause_pkts; + pause_stats->rx_pause_frames = rx_stats->rx_pause_pkts; + + spin_unlock(&priv->stats_lock); +} + +/* Other statistics that are not derivable from standard statistics */ + +#define EMAC_ETHTOOL_STAT(type, name) \ + { offsetof(type, name) / sizeof(u64), #name } + +static const struct emac_ethtool_stats { + size_t offset; + char str[ETH_GSTRING_LEN]; +} emac_ethtool_rx_stats[] = { + EMAC_ETHTOOL_STAT(struct emac_hw_rx_stats, rx_drp_fifo_full_pkts), + EMAC_ETHTOOL_STAT(struct emac_hw_rx_stats, rx_truncate_fifo_full_pkts), +}; 
+ +static int emac_get_sset_count(struct net_device *dev, int sset) +{ + switch (sset) { + case ETH_SS_STATS: + return ARRAY_SIZE(emac_ethtool_rx_stats); + default: + return -EOPNOTSUPP; + } +} + +static void emac_get_strings(struct net_device *dev, u32 stringset, u8 *data) +{ + int i; + + switch (stringset) { + case ETH_SS_STATS: + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) { + memcpy(data, emac_ethtool_rx_stats[i].str, + ETH_GSTRING_LEN); + data += ETH_GSTRING_LEN; + } + break; + } +} + +static void emac_get_ethtool_stats(struct net_device *dev, + struct ethtool_stats *stats, u64 *data) +{ + struct emac_priv *priv = netdev_priv(dev); + u64 *rx_stats = (u64 *)&priv->rx_stats; + int i; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) + data[i] = rx_stats[emac_ethtool_rx_stats[i].offset]; + + spin_unlock(&priv->stats_lock); +} + +static int emac_ethtool_get_regs_len(struct net_device *dev) +{ + return (EMAC_DMA_REG_CNT + EMAC_MAC_REG_CNT) * sizeof(u32); +} + +static void emac_ethtool_get_regs(struct net_device *dev, + struct ethtool_regs *regs, void *space) +{ + struct emac_priv *priv = netdev_priv(dev); + u32 *reg_space = space; + int i; + + regs->version = 1; + + for (i = 0; i < EMAC_DMA_REG_CNT; i++) + reg_space[i] = emac_rd(priv, DMA_CONFIGURATION + i * 4); + + for (i = 0; i < EMAC_MAC_REG_CNT; i++) + reg_space[i + EMAC_DMA_REG_CNT] = + emac_rd(priv, MAC_GLOBAL_CONTROL + i * 4); +} + +static void emac_get_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + + pause->autoneg = priv->flow_control_autoneg; + pause->tx_pause = !!(priv->flow_control & FLOW_CTRL_TX); + pause->rx_pause = !!(priv->flow_control & FLOW_CTRL_RX); +} + +static int emac_set_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + u8 fc = 0; + + priv->flow_control_autoneg = 
pause->autoneg; + + if (pause->autoneg) { + emac_set_fc_autoneg(priv); + } else { + if (pause->tx_pause) + fc |= FLOW_CTRL_TX; + + if (pause->rx_pause) + fc |= FLOW_CTRL_RX; + + emac_set_fc(priv, fc); + } + + return 0; +} + +static void emac_get_drvinfo(struct net_device *dev, + struct ethtool_drvinfo *info) +{ + strscpy(info->driver, DRIVER_NAME, sizeof(info->driver)); + info->n_stats = ARRAY_SIZE(emac_ethtool_rx_stats); +} + +static void emac_tx_timeout_task(struct work_struct *work) +{ + struct net_device *ndev; + struct emac_priv *priv; + + priv = container_of(work, struct emac_priv, tx_timeout_task); + ndev = priv->ndev; + + rtnl_lock(); + + /* No need to reset if already down */ + if (!netif_running(ndev)) { + rtnl_unlock(); + return; + } + + netdev_err(ndev, "MAC reset due to TX timeout\n"); + + netif_trans_update(ndev); /* prevent tx timeout */ + dev_close(ndev); + dev_open(ndev, NULL); + + rtnl_unlock(); +} + +static void emac_sw_init(struct emac_priv *priv) +{ + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + + priv->tx_ring.total_cnt = DEFAULT_TX_RING_NUM; + priv->rx_ring.total_cnt = DEFAULT_RX_RING_NUM; + + spin_lock_init(&priv->stats_lock); + + INIT_WORK(&priv->tx_timeout_task, emac_tx_timeout_task); + + priv->tx_coal_frames = EMAC_TX_FRAMES; + priv->tx_coal_timeout = EMAC_TX_COAL_TIMEOUT; + + timer_setup(&priv->txtimer, emac_tx_coal_timer, 0); + timer_setup(&priv->stats_timer, emac_stats_timer, 0); +} + +static irqreturn_t emac_interrupt_handler(int irq, void *dev_id) +{ + struct net_device *ndev = (struct net_device *)dev_id; + struct emac_priv *priv = netdev_priv(ndev); + bool should_schedule = false; + u32 clr = 0; + u32 status; + + status = emac_rd(priv, DMA_STATUS_IRQ); + + if (status & MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ) + clr |= MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_TRANSMIT_DMA_STOPPED_IRQ) + 
clr |= MREGBIT_TRANSMIT_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_RECEIVE_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ) + clr |= MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_RECEIVE_DMA_STOPPED_IRQ) + clr |= MREGBIT_RECEIVE_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_MISSED_FRAME_IRQ) + clr |= MREGBIT_RECEIVE_MISSED_FRAME_IRQ; + + if (should_schedule) { + if (napi_schedule_prep(&priv->napi)) { + emac_disable_interrupt(priv); + __napi_schedule_irqoff(&priv->napi); + } + } + + emac_wr(priv, DMA_STATUS_IRQ, clr); + + return IRQ_HANDLED; +} + +static void emac_configure_tx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->tx_ring.desc_dma_addr; + emac_wr(priv, DMA_TRANSMIT_BASE_ADDRESS, val); + + /* Set TX inter-frame gap value, enable transmit */ + val = emac_rd(priv, MAC_TRANSMIT_CONTROL); + val &= ~MREGBIT_IFG_LEN; + val |= MREGBIT_TRANSMIT_ENABLE; + val |= MREGBIT_TRANSMIT_AUTO_RETRY; + emac_wr(priv, MAC_TRANSMIT_CONTROL, val); + + emac_wr(priv, DMA_TRANSMIT_AUTO_POLL_COUNTER, 0x0); + + /* Start TX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_TRANSMIT_DMA; + emac_wr(priv, DMA_CONTROL, val); +} + +static void emac_configure_rx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->rx_ring.desc_dma_addr; + emac_wr(priv, DMA_RECEIVE_BASE_ADDRESS, val); + + /* Enable receive */ + val = emac_rd(priv, MAC_RECEIVE_CONTROL); + val |= MREGBIT_RECEIVE_ENABLE; + val |= MREGBIT_STORE_FORWARD; + emac_wr(priv, MAC_RECEIVE_CONTROL, val); + + /* Start RX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_RECEIVE_DMA; + emac_wr(priv, DMA_CONTROL, val); +} + +static void emac_adjust_link(struct net_device *dev) +{ + struct emac_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; + u32 ctrl; + + if (phydev->link) { + ctrl = 
emac_rd(priv, MAC_GLOBAL_CONTROL); + + /* Update duplex and speed from PHY */ + + if (!phydev->duplex) + ctrl &= ~MREGBIT_FULL_DUPLEX_MODE; + else + ctrl |= MREGBIT_FULL_DUPLEX_MODE; + + ctrl &= ~MREGBIT_SPEED; + + switch (phydev->speed) { + case SPEED_1000: + ctrl |= MREGBIT_SPEED_1000M; + break; + case SPEED_100: + ctrl |= MREGBIT_SPEED_100M; + break; + case SPEED_10: + ctrl |= MREGBIT_SPEED_10M; + break; + default: + netdev_err(dev, "Unknown speed: %d\n", phydev->speed); + phydev->speed = SPEED_UNKNOWN; + break; + } + + emac_wr(priv, MAC_GLOBAL_CONTROL, ctrl); + + emac_set_fc_autoneg(priv); + } + + phy_print_status(phydev); +} + +static void emac_update_delay_line(struct emac_priv *priv) +{ + u32 mask = 0, val = 0; + + mask |= EMAC_RX_DLINE_EN; + mask |= EMAC_RX_DLINE_STEP_MASK | EMAC_RX_DLINE_CODE_MASK; + mask |= EMAC_TX_DLINE_EN; + mask |= EMAC_TX_DLINE_STEP_MASK | EMAC_TX_DLINE_CODE_MASK; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) { + val |= EMAC_RX_DLINE_EN; + val |= FIELD_PREP(EMAC_RX_DLINE_STEP_MASK, + EMAC_DLINE_STEP_15P6); + val |= FIELD_PREP(EMAC_RX_DLINE_CODE_MASK, priv->rx_delay); + + val |= EMAC_TX_DLINE_EN; + val |= FIELD_PREP(EMAC_TX_DLINE_STEP_MASK, + EMAC_DLINE_STEP_15P6); + val |= FIELD_PREP(EMAC_TX_DLINE_CODE_MASK, priv->tx_delay); + } + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_DLINE_REG, + mask, val); +} + +static int emac_phy_connect(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + struct phy_device *phydev; + struct device_node *np; + int ret; + + ret = of_get_phy_mode(dev->of_node, &priv->phy_interface); + if (ret) { + netdev_err(ndev, "No phy-mode found"); + return ret; + } + + switch (priv->phy_interface) { + case PHY_INTERFACE_MODE_RMII: + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_RGMII_ID: + case PHY_INTERFACE_MODE_RGMII_RXID: + case PHY_INTERFACE_MODE_RGMII_TXID: + break; + default: + netdev_err(ndev, 
"Unsupported PHY interface %s", + phy_modes(priv->phy_interface)); + return -EINVAL; + } + + np = of_parse_phandle(dev->of_node, "phy-handle", 0); + if (!np && of_phy_is_fixed_link(dev->of_node)) + np = of_node_get(dev->of_node); + + if (!np) { + netdev_err(ndev, "No PHY specified"); + return -ENODEV; + } + + ret = emac_phy_interface_config(priv); + if (ret) + goto err_node_put; + + phydev = of_phy_connect(ndev, np, &emac_adjust_link, 0, + priv->phy_interface); + if (!phydev) { + netdev_err(ndev, "Could not attach to PHY\n"); + ret = -ENODEV; + goto err_node_put; + } + + phy_support_asym_pause(phydev); + + phydev->mac_managed_pm = true; + + emac_update_delay_line(priv); + +err_node_put: + of_node_put(np); + return ret; +} + +static int emac_up(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + int ret; + + pm_runtime_get_sync(&pdev->dev); + + ret = emac_phy_connect(ndev); + if (ret) { + dev_err(&pdev->dev, "emac_phy_connect failed\n"); + goto err_pm_put; + } + + emac_init_hw(priv); + + emac_set_mac_addr(priv, ndev->dev_addr); + emac_configure_tx(priv); + emac_configure_rx(priv); + + emac_alloc_rx_desc_buffers(priv); + + phy_start(ndev->phydev); + + ret = request_irq(priv->irq, emac_interrupt_handler, IRQF_SHARED, + ndev->name, ndev); + if (ret) { + dev_err(&pdev->dev, "request_irq failed\n"); + goto err_reset_disconnect_phy; + } + + /* Don't enable MAC interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + + /* Enable DMA interrupts */ + emac_wr(priv, DMA_INTERRUPT_ENABLE, + MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE); + + napi_enable(&priv->napi); + + netif_start_queue(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; + +err_reset_disconnect_phy: + emac_reset_hw(priv); + phy_disconnect(ndev->phydev); + 
+err_pm_put: + pm_runtime_put_sync(&pdev->dev); + return ret; +} + +static int emac_down(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + + netif_stop_queue(ndev); + + phy_disconnect(ndev->phydev); + + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + free_irq(priv->irq, ndev); + + napi_disable(&priv->napi); + + timer_delete_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + timer_delete_sync(&priv->stats_timer); + + emac_reset_hw(priv); + + /* Update and save current stats, see emac_stats_update() for usage */ + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + priv->tx_stats_off = priv->tx_stats; + priv->rx_stats_off = priv->rx_stats; + + spin_unlock(&priv->stats_lock); + + pm_runtime_put_sync(&pdev->dev); + return 0; +} + +/* Called when net interface is brought up. */ +static int emac_open(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + int ret; + + ret = emac_alloc_tx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate TX resources\n"); + return ret; + } + + ret = emac_alloc_rx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate RX resources\n"); + goto err_free_tx; + } + + ret = emac_up(priv); + if (ret) { + dev_err(dev, "Error when bringing interface up\n"); + goto err_free_rx; + } + return 0; + +err_free_rx: + emac_free_rx_resources(priv); +err_free_tx: + emac_free_tx_resources(priv); + + return ret; +} + +/* Called when interface is brought down. 
*/ +static int emac_stop(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + + emac_down(priv); + emac_free_tx_resources(priv); + emac_free_rx_resources(priv); + + return 0; +} + +static const struct ethtool_ops emac_ethtool_ops = { + .get_link_ksettings = phy_ethtool_get_link_ksettings, + .set_link_ksettings = phy_ethtool_set_link_ksettings, + .nway_reset = phy_ethtool_nway_reset, + .get_drvinfo = emac_get_drvinfo, + .get_link = ethtool_op_get_link, + + .get_regs = emac_ethtool_get_regs, + .get_regs_len = emac_ethtool_get_regs_len, + + .get_rmon_stats = emac_get_rmon_stats, + .get_pause_stats = emac_get_pause_stats, + .get_eth_mac_stats = emac_get_eth_mac_stats, + + .get_sset_count = emac_get_sset_count, + .get_strings = emac_get_strings, + .get_ethtool_stats = emac_get_ethtool_stats, + + .get_pauseparam = emac_get_pauseparam, + .set_pauseparam = emac_set_pauseparam, +}; + +static const struct net_device_ops emac_netdev_ops = { + .ndo_open = emac_open, + .ndo_stop = emac_stop, + .ndo_start_xmit = emac_start_xmit, + .ndo_validate_addr = eth_validate_addr, + .ndo_set_mac_address = emac_set_mac_address, + .ndo_eth_ioctl = phy_do_ioctl_running, + .ndo_change_mtu = emac_change_mtu, + .ndo_tx_timeout = emac_tx_timeout, + .ndo_set_rx_mode = emac_set_rx_mode, + .ndo_get_stats64 = emac_get_stats64, +}; + +/* Currently we always use 15.6 ps/step for the delay line */ + +static u32 delay_ps_to_unit(u32 ps) +{ + return DIV_ROUND_CLOSEST(ps * 10, 156); +} + +static u32 delay_unit_to_ps(u32 unit) +{ + return DIV_ROUND_CLOSEST(unit * 156, 10); +} + +#define EMAC_MAX_DELAY_UNIT \ + FIELD_GET(EMAC_TX_DLINE_CODE_MASK, EMAC_TX_DLINE_CODE_MASK) + +/* Minus one just to be safe from rounding errors */ +#define EMAC_MAX_DELAY_PS (delay_unit_to_ps(EMAC_MAX_DELAY_UNIT - 1)) + +static int emac_config_dt(struct platform_device *pdev, struct emac_priv *priv) +{ + struct device_node *np = pdev->dev.of_node; + struct device *dev = &pdev->dev; + u8 mac_addr[ETH_ALEN] = { 
0 }; + int ret; + + priv->iobase = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(priv->iobase)) + return dev_err_probe(dev, PTR_ERR(priv->iobase), + "ioremap failed\n"); + + priv->regmap_apmu = + syscon_regmap_lookup_by_phandle_args(np, "spacemit,apmu", 1, + &priv->regmap_apmu_offset); + + if (IS_ERR(priv->regmap_apmu)) + return dev_err_probe(dev, PTR_ERR(priv->regmap_apmu), + "failed to get syscon\n"); + + priv->irq = platform_get_irq(pdev, 0); + if (priv->irq < 0) + return priv->irq; + + ret = of_get_mac_address(np, mac_addr); + if (ret) { + if (ret == -EPROBE_DEFER) + return dev_err_probe(dev, ret, + "Can't get MAC address\n"); + + dev_info(&pdev->dev, "Using random MAC address\n"); + eth_hw_addr_random(priv->ndev); + } else { + eth_hw_addr_set(priv->ndev, mac_addr); + } + + priv->tx_delay = 0; + priv->rx_delay = 0; + + of_property_read_u32(np, "tx-internal-delay-ps", &priv->tx_delay); + of_property_read_u32(np, "rx-internal-delay-ps", &priv->rx_delay); + + if (priv->tx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "tx-internal-delay-ps too large: max %d, got %d", + EMAC_MAX_DELAY_PS, priv->tx_delay); + return -EINVAL; + } + + if (priv->rx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "rx-internal-delay-ps too large: max %d, got %d", + EMAC_MAX_DELAY_PS, priv->rx_delay); + return -EINVAL; + } + + priv->tx_delay = delay_ps_to_unit(priv->tx_delay); + priv->rx_delay = delay_ps_to_unit(priv->rx_delay); + + return 0; +} + +static void emac_phy_deregister_fixed_link(void *data) +{ + struct device_node *of_node = data; + + of_phy_deregister_fixed_link(of_node); +} + +static int emac_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct reset_control *reset; + struct net_device *ndev; + struct emac_priv *priv; + int ret; + + ndev = devm_alloc_etherdev(dev, sizeof(struct emac_priv)); + if (!ndev) + return -ENOMEM; + + ndev->hw_features = NETIF_F_SG; + ndev->features |= ndev->hw_features; + + ndev->max_mtu = 
EMAC_RX_BUF_4K - (ETH_HLEN + ETH_FCS_LEN); + + priv = netdev_priv(ndev); + priv->ndev = ndev; + priv->pdev = pdev; + platform_set_drvdata(pdev, priv); + + ret = emac_config_dt(pdev, priv); + if (ret < 0) + return dev_err_probe(dev, ret, "Configuration failed\n"); + + ndev->watchdog_timeo = 5 * HZ; + ndev->base_addr = (unsigned long)priv->iobase; + ndev->irq = priv->irq; + + ndev->ethtool_ops = &emac_ethtool_ops; + ndev->netdev_ops = &emac_netdev_ops; + + devm_pm_runtime_enable(&pdev->dev); + + priv->bus_clk = devm_clk_get_enabled(&pdev->dev, NULL); + if (IS_ERR(priv->bus_clk)) + return dev_err_probe(dev, PTR_ERR(priv->bus_clk), + "Failed to get clock\n"); + + reset = devm_reset_control_get_optional_exclusive_deasserted(&pdev->dev, + NULL); + if (IS_ERR(reset)) + return dev_err_probe(dev, PTR_ERR(reset), + "Failed to get reset\n"); + + if (of_phy_is_fixed_link(dev->of_node)) { + ret = of_phy_register_fixed_link(dev->of_node); + if (ret) + return dev_err_probe(dev, ret, + "Failed to register fixed-link\n"); + + ret = devm_add_action_or_reset(dev, + emac_phy_deregister_fixed_link, + dev->of_node); + + if (ret) { + dev_err(dev, "devm_add_action_or_reset failed\n"); + return ret; + } + } + + priv->stat_tx_dropped = devm_alloc_percpu(dev, u64); + if (!priv->stat_tx_dropped) + return -ENOMEM; + + emac_sw_init(priv); + + ret = emac_mdio_init(priv); + if (ret) + goto err_timer_delete; + + SET_NETDEV_DEV(ndev, &pdev->dev); + + ret = devm_register_netdev(dev, ndev); + if (ret) { + dev_err(dev, "devm_register_netdev failed\n"); + goto err_timer_delete; + } + + netif_napi_add(ndev, &priv->napi, emac_rx_poll); + netif_carrier_off(ndev); + + return 0; + +err_timer_delete: + timer_delete_sync(&priv->txtimer); + timer_delete_sync(&priv->stats_timer); + + return ret; +} + +static void emac_remove(struct platform_device *pdev) +{ + struct emac_priv *priv = platform_get_drvdata(pdev); + + timer_shutdown_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + 
timer_shutdown_sync(&priv->stats_timer); + + emac_reset_hw(priv); +} + +static int emac_resume(struct device *dev) +{ + struct emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + int ret; + + ret = clk_prepare_enable(priv->bus_clk); + if (ret < 0) { + dev_err(dev, "Failed to enable bus clock: %d\n", ret); + return ret; + } + + if (!netif_running(ndev)) + return 0; + + ret = emac_open(ndev); + if (ret) { + clk_disable_unprepare(priv->bus_clk); + return ret; + } + + netif_device_attach(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; +} + +static int emac_suspend(struct device *dev) +{ + struct emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + + if (!ndev || !netif_running(ndev)) { + clk_disable_unprepare(priv->bus_clk); + return 0; + } + + emac_stop(ndev); + + clk_disable_unprepare(priv->bus_clk); + netif_device_detach(ndev); + return 0; +} + +static const struct dev_pm_ops emac_pm_ops = { + SYSTEM_SLEEP_PM_OPS(emac_suspend, emac_resume) +}; + +static const struct of_device_id emac_of_match[] = { + { .compatible = "spacemit,k1-emac" }, + { /* sentinel */ }, +}; +MODULE_DEVICE_TABLE(of, emac_of_match); + +static struct platform_driver emac_driver = { + .probe = emac_probe, + .remove = emac_remove, + .driver = { + .name = DRIVER_NAME, + .of_match_table = of_match_ptr(emac_of_match), + .pm = &emac_pm_ops, + }, +}; +module_platform_driver(emac_driver); + +MODULE_DESCRIPTION("SpacemiT K1 Ethernet driver"); +MODULE_AUTHOR("Vivian Wang "); +MODULE_LICENSE("GPL"); diff --git a/drivers/net/ethernet/spacemit/k1_emac.h b/drivers/net/ethernet/spacemit/k1_emac.h new file mode 100644 index 0000000000000000000000000000000000000000..d7b8e7aff69277c8fc311b6c70a8897db1fc8378 --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.h @@ -0,0 +1,406 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SpacemiT K1 Ethernet hardware definitions + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. 
Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#ifndef _K1_EMAC_H_ +#define _K1_EMAC_H_ + +/* APMU syscon registers */ + +#define APMU_EMAC_CTRL_REG 0x0 + +#define PHY_INTF_RGMII BIT(2) + +/* + * Only valid for RMII mode + * 0: Ref clock from External PHY + * 1: Ref clock from SoC + */ +#define REF_CLK_SEL BIT(3) + +/* + * Function clock select + * 0: 208 MHz + * 1: 312 MHz + */ +#define FUNC_CLK_SEL BIT(4) + +/* Only valid for RMII, invert TX clk */ +#define RMII_TX_CLK_SEL BIT(6) + +/* Only valid for RMII, invert RX clk */ +#define RMII_RX_CLK_SEL BIT(7) + +/* + * Only valid for RGMII + * 0: TX clk from RX clk + * 1: TX clk from SoC + */ +#define RGMII_TX_CLK_SEL BIT(8) + +#define PHY_IRQ_EN BIT(12) +#define AXI_SINGLE_ID BIT(13) + +#define APMU_EMAC_DLINE_REG 0x4 + +#define EMAC_RX_DLINE_EN BIT(0) +#define EMAC_RX_DLINE_STEP_MASK GENMASK(5, 4) +#define EMAC_RX_DLINE_CODE_MASK GENMASK(15, 8) + +#define EMAC_TX_DLINE_EN BIT(16) +#define EMAC_TX_DLINE_STEP_MASK GENMASK(21, 20) +#define EMAC_TX_DLINE_CODE_MASK GENMASK(31, 24) + +#define EMAC_DLINE_STEP_15P6 0 /* 15.6 ps/step */ +#define EMAC_DLINE_STEP_24P4 1 /* 24.4 ps/step */ +#define EMAC_DLINE_STEP_29P7 2 /* 29.7 ps/step */ +#define EMAC_DLINE_STEP_35P1 3 /* 35.1 ps/step */ + +/* DMA register set */ +#define DMA_CONFIGURATION 0x0000 +#define DMA_CONTROL 0x0004 +#define DMA_STATUS_IRQ 0x0008 +#define DMA_INTERRUPT_ENABLE 0x000c + +#define DMA_TRANSMIT_AUTO_POLL_COUNTER 0x0010 +#define DMA_TRANSMIT_POLL_DEMAND 0x0014 +#define DMA_RECEIVE_POLL_DEMAND 0x0018 + +#define DMA_TRANSMIT_BASE_ADDRESS 0x001c +#define DMA_RECEIVE_BASE_ADDRESS 0x0020 +#define DMA_MISSED_FRAME_COUNTER 0x0024 +#define DMA_STOP_FLUSH_COUNTER 0x0028 + +#define DMA_RECEIVE_IRQ_MITIGATION_CTRL 0x002c + +#define DMA_CURRENT_TRANSMIT_DESCRIPTOR_POINTER 0x0030 +#define DMA_CURRENT_TRANSMIT_BUFFER_POINTER 0x0034 +#define DMA_CURRENT_RECEIVE_DESCRIPTOR_POINTER 0x0038 +#define DMA_CURRENT_RECEIVE_BUFFER_POINTER 0x003c + +/* MAC Register set */ +#define 
MAC_GLOBAL_CONTROL 0x0100 +#define MAC_TRANSMIT_CONTROL 0x0104 +#define MAC_RECEIVE_CONTROL 0x0108 +#define MAC_MAXIMUM_FRAME_SIZE 0x010c +#define MAC_TRANSMIT_JABBER_SIZE 0x0110 +#define MAC_RECEIVE_JABBER_SIZE 0x0114 +#define MAC_ADDRESS_CONTROL 0x0118 +#define MAC_MDIO_CLK_DIV 0x011c +#define MAC_ADDRESS1_HIGH 0x0120 +#define MAC_ADDRESS1_MED 0x0124 +#define MAC_ADDRESS1_LOW 0x0128 +#define MAC_ADDRESS2_HIGH 0x012c +#define MAC_ADDRESS2_MED 0x0130 +#define MAC_ADDRESS2_LOW 0x0134 +#define MAC_ADDRESS3_HIGH 0x0138 +#define MAC_ADDRESS3_MED 0x013c +#define MAC_ADDRESS3_LOW 0x0140 +#define MAC_ADDRESS4_HIGH 0x0144 +#define MAC_ADDRESS4_MED 0x0148 +#define MAC_ADDRESS4_LOW 0x014c +#define MAC_MULTICAST_HASH_TABLE1 0x0150 +#define MAC_MULTICAST_HASH_TABLE2 0x0154 +#define MAC_MULTICAST_HASH_TABLE3 0x0158 +#define MAC_MULTICAST_HASH_TABLE4 0x015c +#define MAC_FC_CONTROL 0x0160 +#define MAC_FC_PAUSE_FRAME_GENERATE 0x0164 +#define MAC_FC_SOURCE_ADDRESS_HIGH 0x0168 +#define MAC_FC_SOURCE_ADDRESS_MED 0x016c +#define MAC_FC_SOURCE_ADDRESS_LOW 0x0170 +#define MAC_FC_DESTINATION_ADDRESS_HIGH 0x0174 +#define MAC_FC_DESTINATION_ADDRESS_MED 0x0178 +#define MAC_FC_DESTINATION_ADDRESS_LOW 0x017c +#define MAC_FC_PAUSE_TIME_VALUE 0x0180 +#define MAC_FC_HIGH_PAUSE_TIME 0x0184 +#define MAC_FC_LOW_PAUSE_TIME 0x0188 +#define MAC_FC_PAUSE_HIGH_THRESHOLD 0x018c +#define MAC_FC_PAUSE_LOW_THRESHOLD 0x0190 +#define MAC_MDIO_CONTROL 0x01a0 +#define MAC_MDIO_DATA 0x01a4 +#define MAC_RX_STATCTR_CONTROL 0x01a8 +#define MAC_RX_STATCTR_DATA_HIGH 0x01ac +#define MAC_RX_STATCTR_DATA_LOW 0x01b0 +#define MAC_TX_STATCTR_CONTROL 0x01b4 +#define MAC_TX_STATCTR_DATA_HIGH 0x01b8 +#define MAC_TX_STATCTR_DATA_LOW 0x01bc +#define MAC_TRANSMIT_FIFO_ALMOST_FULL 0x01c0 +#define MAC_TRANSMIT_PACKET_START_THRESHOLD 0x01c4 +#define MAC_RECEIVE_PACKET_START_THRESHOLD 0x01c8 +#define MAC_STATUS_IRQ 0x01e0 +#define MAC_INTERRUPT_ENABLE 0x01e4 + +/* Used for register dump */ +#define EMAC_DMA_REG_CNT 16 +#define 
EMAC_MAC_REG_CNT 124 + +/* DMA_CONFIGURATION (0x0000) */ + +/* + * 0-DMA controller in normal operation mode, + * 1-DMA controller reset to default state, + * clearing all internal state information + */ +#define MREGBIT_SOFTWARE_RESET BIT(0) + +#define MREGBIT_BURST_1WORD BIT(1) +#define MREGBIT_BURST_2WORD BIT(2) +#define MREGBIT_BURST_4WORD BIT(3) +#define MREGBIT_BURST_8WORD BIT(4) +#define MREGBIT_BURST_16WORD BIT(5) +#define MREGBIT_BURST_32WORD BIT(6) +#define MREGBIT_BURST_64WORD BIT(7) +#define MREGBIT_BURST_LENGTH GENMASK(7, 1) +#define MREGBIT_DESCRIPTOR_SKIP_LENGTH GENMASK(12, 8) + +/* For Receive and Transmit DMA operate in Big-Endian mode for Descriptors. */ +#define MREGBIT_DESCRIPTOR_BYTE_ORDERING BIT(13) + +#define MREGBIT_BIG_LITLE_ENDIAN BIT(14) +#define MREGBIT_TX_RX_ARBITRATION BIT(15) +#define MREGBIT_WAIT_FOR_DONE BIT(16) +#define MREGBIT_STRICT_BURST BIT(17) +#define MREGBIT_DMA_64BIT_MODE BIT(18) + +/* DMA_CONTROL (0x0004) */ +#define MREGBIT_START_STOP_TRANSMIT_DMA BIT(0) +#define MREGBIT_START_STOP_RECEIVE_DMA BIT(1) + +/* DMA_STATUS_IRQ (0x0008) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_IRQ BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_IRQ BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_IRQ BIT(6) +#define MREGBIT_RECEIVE_MISSED_FRAME_IRQ BIT(7) +#define MREGBIT_MAC_IRQ BIT(8) +#define MREGBIT_TRANSMIT_DMA_STATE GENMASK(18, 16) +#define MREGBIT_RECEIVE_DMA_STATE GENMASK(23, 20) + +/* DMA_INTERRUPT_ENABLE (0x000c) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_INTR_ENABLE BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_INTR_ENABLE BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE BIT(6) +#define 
MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE BIT(7) +#define MREGBIT_MAC_INTR_ENABLE BIT(8) + +/* DMA_RECEIVE_IRQ_MITIGATION_CTRL (0x002c) */ +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK GENMASK(7, 0) +#define MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK GENMASK(27, 8) +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MODE BIT(30) +#define MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE BIT(31) + +/* MAC_GLOBAL_CONTROL (0x0100) */ +#define MREGBIT_SPEED GENMASK(1, 0) +#define MREGBIT_SPEED_10M 0x0 +#define MREGBIT_SPEED_100M BIT(0) +#define MREGBIT_SPEED_1000M BIT(1) +#define MREGBIT_FULL_DUPLEX_MODE BIT(2) +#define MREGBIT_RESET_RX_STAT_COUNTERS BIT(3) +#define MREGBIT_RESET_TX_STAT_COUNTERS BIT(4) +#define MREGBIT_UNICAST_WAKEUP_MODE BIT(8) +#define MREGBIT_MAGIC_PACKET_WAKEUP_MODE BIT(9) + +/* MAC_TRANSMIT_CONTROL (0x0104) */ +#define MREGBIT_TRANSMIT_ENABLE BIT(0) +#define MREGBIT_INVERT_FCS BIT(1) +#define MREGBIT_DISABLE_FCS_INSERT BIT(2) +#define MREGBIT_TRANSMIT_AUTO_RETRY BIT(3) +#define MREGBIT_IFG_LEN GENMASK(6, 4) +#define MREGBIT_PREAMBLE_LENGTH GENMASK(9, 7) + +/* MAC_RECEIVE_CONTROL (0x0108) */ +#define MREGBIT_RECEIVE_ENABLE BIT(0) +#define MREGBIT_DISABLE_FCS_CHECK BIT(1) +#define MREGBIT_STRIP_FCS BIT(2) +#define MREGBIT_STORE_FORWARD BIT(3) +#define MREGBIT_STATUS_FIRST BIT(4) +#define MREGBIT_PASS_BAD_FRAMES BIT(5) +#define MREGBIT_ACOOUNT_VLAN BIT(6) + +/* MAC_MAXIMUM_FRAME_SIZE (0x010c) */ +#define MREGBIT_MAX_FRAME_SIZE GENMASK(13, 0) + +/* MAC_TRANSMIT_JABBER_SIZE (0x0110) */ +#define MREGBIT_TRANSMIT_JABBER_SIZE GENMASK(15, 0) + +/* MAC_RECEIVE_JABBER_SIZE (0x0114) */ +#define MREGBIT_RECEIVE_JABBER_SIZE GENMASK(15, 0) + +/* MAC_ADDRESS_CONTROL (0x0118) */ +#define MREGBIT_MAC_ADDRESS1_ENABLE BIT(0) +#define MREGBIT_MAC_ADDRESS2_ENABLE BIT(1) +#define MREGBIT_MAC_ADDRESS3_ENABLE BIT(2) +#define MREGBIT_MAC_ADDRESS4_ENABLE BIT(3) +#define MREGBIT_INVERSE_MAC_ADDRESS1_ENABLE BIT(4) +#define MREGBIT_INVERSE_MAC_ADDRESS2_ENABLE BIT(5) +#define 
MREGBIT_INVERSE_MAC_ADDRESS3_ENABLE BIT(6) +#define MREGBIT_INVERSE_MAC_ADDRESS4_ENABLE BIT(7) +#define MREGBIT_PROMISCUOUS_MODE BIT(8) + +/* MAC_FC_CONTROL (0x0160) */ +#define MREGBIT_FC_DECODE_ENABLE BIT(0) +#define MREGBIT_FC_GENERATION_ENABLE BIT(1) +#define MREGBIT_AUTO_FC_GENERATION_ENABLE BIT(2) +#define MREGBIT_MULTICAST_MODE BIT(3) +#define MREGBIT_BLOCK_PAUSE_FRAMES BIT(4) + +/* MAC_FC_PAUSE_FRAME_GENERATE (0x0164) */ +#define MREGBIT_GENERATE_PAUSE_FRAME BIT(0) + +/* MAC_FC_PAUSE_TIME_VALUE (0x0180) */ +#define MREGBIT_MAC_FC_PAUSE_TIME GENMASK(15, 0) + +/* MAC_MDIO_CONTROL (0x01a0) */ +#define MREGBIT_PHY_ADDRESS GENMASK(4, 0) +#define MREGBIT_REGISTER_ADDRESS GENMASK(9, 5) +#define MREGBIT_MDIO_READ_WRITE BIT(10) +#define MREGBIT_START_MDIO_TRANS BIT(15) + +/* MAC_MDIO_DATA (0x01a4) */ +#define MREGBIT_MDIO_DATA GENMASK(15, 0) + +/* MAC_RX_STATCTR_CONTROL (0x01a8) */ +#define MREGBIT_RX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_RX_COUNTER_READ BIT(15) + +/* MAC_RX_STATCTR_DATA_HIGH (0x01ac) */ +#define MREGBIT_RX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_RX_STATCTR_DATA_LOW (0x01b0) */ +#define MREGBIT_RX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TX_STATCTR_CONTROL (0x01b4) */ +#define MREGBIT_TX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_TX_COUNTER_READ BIT(15) + +/* MAC_TX_STATCTR_DATA_HIGH (0x01b8) */ +#define MREGBIT_TX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_TX_STATCTR_DATA_LOW (0x01bc) */ +#define MREGBIT_TX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TRANSMIT_FIFO_ALMOST_FULL (0x01c0) */ +#define MREGBIT_TX_FIFO_AF GENMASK(13, 0) + +/* MAC_TRANSMIT_PACKET_START_THRESHOLD (0x01c4) */ +#define MREGBIT_TX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_RECEIVE_PACKET_START_THRESHOLD (0x01c8) */ +#define MREGBIT_RX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_STATUS_IRQ (0x01e0) */ +#define MREGBIT_MAC_UNDERRUN_IRQ BIT(0) +#define MREGBIT_MAC_JABBER_IRQ BIT(1) + +/* MAC_INTERRUPT_ENABLE (0x01e4) */ +#define 
MREGBIT_MAC_UNDERRUN_INTERRUPT_ENABLE BIT(0) +#define MREGBIT_JABBER_INTERRUPT_ENABLE BIT(1) + +/* RX DMA descriptor */ + +#define RX_DESC_0_FRAME_PACKET_LENGTH_MASK GENMASK(13, 0) +#define RX_DESC_0_FRAME_ALIGN_ERR BIT(14) +#define RX_DESC_0_FRAME_RUNT BIT(15) +#define RX_DESC_0_FRAME_ETHERNET_TYPE BIT(16) +#define RX_DESC_0_FRAME_VLAN BIT(17) +#define RX_DESC_0_FRAME_MULTICAST BIT(18) +#define RX_DESC_0_FRAME_BROADCAST BIT(19) +#define RX_DESC_0_FRAME_CRC_ERR BIT(20) +#define RX_DESC_0_FRAME_MAX_LEN_ERR BIT(21) +#define RX_DESC_0_FRAME_JABBER_ERR BIT(22) +#define RX_DESC_0_FRAME_LENGTH_ERR BIT(23) +#define RX_DESC_0_FRAME_MAC_ADDR1_MATCH BIT(24) +#define RX_DESC_0_FRAME_MAC_ADDR2_MATCH BIT(25) +#define RX_DESC_0_FRAME_MAC_ADDR3_MATCH BIT(26) +#define RX_DESC_0_FRAME_MAC_ADDR4_MATCH BIT(27) +#define RX_DESC_0_FRAME_PAUSE_CTRL BIT(28) +#define RX_DESC_0_LAST_DESCRIPTOR BIT(29) +#define RX_DESC_0_FIRST_DESCRIPTOR BIT(30) +#define RX_DESC_0_OWN BIT(31) + +#define RX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define RX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) + /* [24] reserved */ +#define RX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define RX_DESC_1_END_RING BIT(26) + /* [29:27] reserved */ +#define RX_DESC_1_RX_TIMESTAMP BIT(30) +#define RX_DESC_1_PTP_PKT BIT(31) + +/* TX DMA descriptor */ + + /* [29:0] unused */ +#define TX_DESC_0_TX_TIMESTAMP BIT(30) +#define TX_DESC_0_OWN BIT(31) + +#define TX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define TX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) +#define TX_DESC_1_FORCE_EOP_ERROR BIT(24) +#define TX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define TX_DESC_1_END_RING BIT(26) +#define TX_DESC_1_DISABLE_PADDING BIT(27) +#define TX_DESC_1_ADD_CRC_DISABLE BIT(28) +#define TX_DESC_1_FIRST_SEGMENT BIT(29) +#define TX_DESC_1_LAST_SEGMENT BIT(30) +#define TX_DESC_1_INTERRUPT_ON_COMPLETION BIT(31) + +struct emac_desc { + u32 desc0; + u32 desc1; + u32 buffer_addr_1; + u32 buffer_addr_2; +}; + +/* Keep stats in this order, index used 
for accessing hardware */ + +struct emac_hw_tx_stats { + u64 tx_ok_pkts; + u64 tx_total_pkts; + u64 tx_ok_bytes; + u64 tx_err_pkts; + u64 tx_singleclsn_pkts; + u64 tx_multiclsn_pkts; + u64 tx_lateclsn_pkts; + u64 tx_excessclsn_pkts; + u64 tx_unicast_pkts; + u64 tx_multicast_pkts; + u64 tx_broadcast_pkts; + u64 tx_pause_pkts; +}; + +struct emac_hw_rx_stats { + u64 rx_ok_pkts; + u64 rx_total_pkts; + u64 rx_crc_err_pkts; + u64 rx_align_err_pkts; + u64 rx_err_total_pkts; + u64 rx_ok_bytes; + u64 rx_total_bytes; + u64 rx_unicast_pkts; + u64 rx_multicast_pkts; + u64 rx_broadcast_pkts; + u64 rx_pause_pkts; + u64 rx_len_err_pkts; + u64 rx_len_undersize_pkts; + u64 rx_len_oversize_pkts; + u64 rx_len_fragment_pkts; + u64 rx_len_jabber_pkts; + u64 rx_64_pkts; + u64 rx_65_127_pkts; + u64 rx_128_255_pkts; + u64 rx_256_511_pkts; + u64 rx_512_1023_pkts; + u64 rx_1024_1518_pkts; + u64 rx_1519_plus_pkts; + u64 rx_drp_fifo_full_pkts; + u64 rx_truncate_fifo_full_pkts; +}; + +#endif /* _K1_EMAC_H_ */ -- 2.50.1 From wangruikang at iscas.ac.cn Mon Sep 8 05:34:24 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 08 Sep 2025 20:34:24 +0800 Subject: [PATCH net-next v10 0/5] Add Ethernet MAC support for SpacemiT K1 Message-ID: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> SpacemiT K1 has two gigabit Ethernet MACs with RGMII and RMII support. Add devicetree bindings, driver, and DTS for it. Tested primarily on BananaPi BPI-F3. Basic TX/RX functionality also tested on Milk-V Jupiter. I would like to note that even though some bit field names superficially resemble that of DesignWare MAC, all other differences point to it in fact being a custom design. Based on SpacemiT drivers [1]. 
These patches are also available at:

https://github.com/dramforever/linux/tree/k1/ethernet/v10

[1]: https://github.com/spacemit-com/linux-k1x

---
Changes in v10:
- Use FIELD_GET and FIELD_PREP, remove some unused constants
- Remove redundant software statistics
  - In particular, rx_dropped should have been and is already tracked in
    rx_errors.
- Track tx_dropped with a percpu field
  - * Simon, Jakub: Using "dstats" gets hairy since dev_get_dstats64()
    isn't public, so I would have to recreate the logic in
    emac_get_stats64(). Instead I just DIY'd a percpu stat_tx_dropped.
- Minor changes
  - Simplified int emac_rx_frame_status() -> bool emac_rx_frame_good()
- Link to v9: https://lore.kernel.org/r/20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn

Changes in v9:
- Refactor to use phy_interface_mode_is_rgmii
- Minor changes
  - Use netdev_err in more places
  - Print phy-mode by name on unsupported phy-mode
- Link to v8: https://lore.kernel.org/r/20250828-net-k1-emac-v8-0-e9075dd2ca90@iscas.ac.cn

Changes in v8:
- Use devres to do of_phy_deregister_fixed_link on probe failure or
  remove
- Simplified control flow in a few places with early return or continue
- Minor changes
  - Removed some unneeded parens in emac_configure_{tx,rx}
- Link to v7: https://lore.kernel.org/r/20250826-net-k1-emac-v7-0-5bc158d086ae@iscas.ac.cn

Changes in v7:
- Removed scoped_guard usage
- Renamed error handling path labels after destinations
- Fix skb free error handling path in emac_start_xmit and
  emac_tx_mem_map
- Cancel tx_timeout_task to prevent schedule_work lifetime problems
- Minor changes:
  - Remove unnecessary timer_delete_sync in emac_down
  - Use dev_err_ratelimited in a few more places
  - Cosmetic fixes in error messages
- Link to v6: https://lore.kernel.org/r/20250820-net-k1-emac-v6-0-c1e28f2b8be5@iscas.ac.cn

Changes in v6:
- Implement pause frame support
- Minor changes:
  - Convert comment for emac_stats_update() into assert_spin_locked()
  - Cosmetic fixes for some comments and
    whitespace
  - emac_set_mac_addr() is now refactored
- Link to v5: https://lore.kernel.org/r/20250812-net-k1-emac-v5-0-dd17c4905f49@iscas.ac.cn

Changes in v5:
- Rebased on v6.17-rc1, add back DTS now that they apply cleanly
- Use standard statistics interface, handle 32-bit statistics overflow
- Minor changes:
  - Fix clock resource handling in emac_resume
  - Ratelimit the message in emac_rx_frame_status
  - Add ndo_validate_addr = eth_validate_addr
  - Remove unnecessary parens in emac_set_mac_addr
  - Change some functions that never fail to return void instead of int
  - Minor rewording
- Link to v4: https://lore.kernel.org/r/20250703-net-k1-emac-v4-0-686d09c4cfa8@iscas.ac.cn

Changes in v4:
- Resource handling on probe and remove: timer_delete_sync and
  of_phy_deregister_fixed_link
- Drop DTS changes and dependencies (will send through SpacemiT tree)
- Minor changes:
  - Remove redundant phy_stop() and setting of ndev->phydev
  - Fix error checking for emac_open in emac_resume
  - Fix one missed dev_err -> dev_err_probe
  - Fix type of emac_start_xmit
  - Fix one missed reverse xmas tree formatting
  - Rename some functions for consistency between emac_* and ndo_*
- Link to v3: https://lore.kernel.org/r/20250702-net-k1-emac-v3-0-882dc55404f3@iscas.ac.cn

Changes in v3:
- Refactored and simplified emac_tx_mem_map
- Addressed other minor v2 review comments
- Removed what was patch 3 in v2, depend on DMA buses instead
- DT nodes in alphabetical order where appropriate
- Link to v2: https://lore.kernel.org/r/20250618-net-k1-emac-v2-0-94f5f07227a8@iscas.ac.cn

Changes in v2:
- dts: Put eth0 and eth1 nodes under a bus with dma-ranges
- dts: Added Milk-V Jupiter
- Fix typo in emac_init_hw() that broke the driver (Oops!)
- Reformatted line lengths to under 80
- Addressed other v1 review comments
- Link to v1: https://lore.kernel.org/r/20250613-net-k1-emac-v1-0-cc6f9e510667@iscas.ac.cn

---
Vivian Wang (5):
      dt-bindings: net: Add support for SpacemiT K1
      net: spacemit: Add K1 Ethernet MAC
      riscv: dts: spacemit: Add Ethernet support for K1
      riscv: dts: spacemit: Add Ethernet support for BPI-F3
      riscv: dts: spacemit: Add Ethernet support for Jupiter

 .../devicetree/bindings/net/spacemit,k1-emac.yaml  |   81 +
 arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts    |   46 +
 arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts  |   46 +
 arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi       |   48 +
 arch/riscv/boot/dts/spacemit/k1.dtsi               |   22 +
 drivers/net/ethernet/Kconfig                       |    1 +
 drivers/net/ethernet/Makefile                      |    1 +
 drivers/net/ethernet/spacemit/Kconfig              |   29 +
 drivers/net/ethernet/spacemit/Makefile             |    6 +
 drivers/net/ethernet/spacemit/k1_emac.c            | 2156 ++++++++++++++++++++
 drivers/net/ethernet/spacemit/k1_emac.h            |  406 ++++
 11 files changed, 2842 insertions(+)
---
base-commit: 062b3e4a1f880f104a8d4b90b767788786aa7b78
change-id: 20250606-net-k1-emac-3e181508ea64

Best regards,
--
Vivian "dramforever" Wang

From wangruikang at iscas.ac.cn Mon Sep 8 05:34:28 2025
From: wangruikang at iscas.ac.cn (Vivian Wang)
Date: Mon, 08 Sep 2025 20:34:28 +0800
Subject: [PATCH net-next v10 4/5] riscv: dts: spacemit: Add Ethernet support for BPI-F3
In-Reply-To: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn>
References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn>
Message-ID: <20250908-net-k1-emac-v10-4-90d807ccd469@iscas.ac.cn>

Banana Pi BPI-F3 uses an RGMII PHY for each port and uses GPIO for PHY reset.
Tested-by: Hendrik Hamerlinck
Signed-off-by: Vivian Wang
Reviewed-by: Yixun Lan
---
 arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts | 46 +++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts
index fe22c747c5012fe56d42ac8a7efdbbdb694f31b6..15fa4a5ebd043f3fbb115d37e5a980c9b773a228 100644
--- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts
+++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts
@@ -40,6 +40,52 @@ &emmc {
 	status = "okay";
 };
 
+&eth0 {
+	phy-handle = <&rgmii0>;
+	phy-mode = "rgmii-id";
+	pinctrl-names = "default";
+	pinctrl-0 = <&gmac0_cfg>;
+	rx-internal-delay-ps = <0>;
+	tx-internal-delay-ps = <0>;
+	status = "okay";
+
+	mdio-bus {
+		#address-cells = <0x1>;
+		#size-cells = <0x0>;
+
+		reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>;
+		reset-delay-us = <10000>;
+		reset-post-delay-us = <100000>;
+
+		rgmii0: phy@1 {
+			reg = <0x1>;
+		};
+	};
+};
+
+&eth1 {
+	phy-handle = <&rgmii1>;
+	phy-mode = "rgmii-id";
+	pinctrl-names = "default";
+	pinctrl-0 = <&gmac1_cfg>;
+	rx-internal-delay-ps = <0>;
+	tx-internal-delay-ps = <250>;
+	status = "okay";
+
+	mdio-bus {
+		#address-cells = <0x1>;
+		#size-cells = <0x0>;
+
+		reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>;
+		reset-delay-us = <10000>;
+		reset-post-delay-us = <100000>;
+
+		rgmii1: phy@1 {
+			reg = <0x1>;
+		};
+	};
+};
+
 &uart0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&uart0_2_cfg>;

--
2.50.1

From david at redhat.com Mon Sep 8 05:53:00 2025
From: david at redhat.com (David Hildenbrand)
Date: Mon, 8 Sep 2025 14:53:00 +0200
Subject: [PATCH v2 19/37] mm/gup: remove record_subpages()
In-Reply-To: <727cabec-5ee8-4793-926b-8d78febcd623@lucifer.local>
References: <20250901150359.867252-1-david@redhat.com>
 <20250901150359.867252-20-david@redhat.com>
 <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com>
 <85e760cf-b994-40db-8d13-221feee55c60@redhat.com>
<727cabec-5ee8-4793-926b-8d78febcd623@lucifer.local> Message-ID: <7ee0b58a-8fe4-46fe-bfef-f04f900f3040@redhat.com> On 08.09.25 14:25, Lorenzo Stoakes wrote: > On Sat, Sep 06, 2025 at 08:56:48AM +0200, David Hildenbrand wrote: >> On 06.09.25 03:05, John Hubbard wrote: >>> >>> Probably a similar sentiment as Lorenzo here...the above diffs make the code >>> *worse* to read. In fact, I recall adding record_subpages() here long ago, >>> specifically to help clarify what was going on. >> >> Well, there is a lot I dislike about record_subpages() to go back there. >> Starting with "as Willy keeps explaining, the concept of subpages do >> not exist and ending with "why do we fill out the array even on failure". > > Yes > >> >> :) >> >>> >>> Now it's been returned to it's original, cryptic form. >>> >> >> The code in the caller was so uncryptic that both me and Lorenzo missed >> that magical addition. :P > > :'( > >> >>> Just my take on it, for whatever that's worth. :) >> >> As always, appreciated. >> >> I could of course keep the simple loop in some "record_folio_pages" >> function and clean up what I dislike about record_subpages(). >> >> But I much rather want the call chain to be cleaned up instead, if possible. >> >> >> Roughly, what I am thinking (limiting it to pte+pmd case) about is the following: > > I cannot get the below to apply even with the original patch here applied + fix. > > It looks like (in mm-new :) commit e73f43a66d5f ("mm/gup: remove dead pgmap > refcounting code") by Alastair has conflicted here, but even then I can't make > it apply, with/without your fix...! To be clear: it was never intended to be applied, because it wouldn't even compile in the current form. It was based on this nth_page submission + fix. [...] >> } >> static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, > > OK I guess you intentionally left the rest as a TODO :) > > So I'll wait for you to post it before reviewing in-depth. 
> > This generally LGTM as an approach, getting rid of *nr is important that's > really horrible. Yes. Expect a cleanup in that direction soonish (again, either from me or someone else I poke) > >> -- >> 2.50.1 >> >> >> >> Oh, I might even have found a bug moving away from that questionable >> "ret==1 means success" handling in gup_fast_pte_range()? Will >> have to double-check, but likely the following is the right thing to do. >> >> >> >> From 8f48b25ef93e7ef98611fd58ec89384ad5171782 Mon Sep 17 00:00:00 2001 >> From: David Hildenbrand >> Date: Sat, 6 Sep 2025 08:46:45 +0200 >> Subject: [PATCH] mm/gup: fix handling of errors from >> arch_make_folio_accessible() in follow_page_pte() >> >> In case we call arch_make_folio_accessible() and it fails, we would >> incorrectly return a value that is "!= 0" to the caller, indicating that >> we pinned all requested pages and that the caller can keep going. >> >> follow_page_pte() is not supposed to return error values, but instead >> 0 on failure and 1 on success. >> >> That is of course wrong, because the caller will just keep going pinning >> more pages. If we happen to pin a page afterwards, we're in trouble, >> because we essentially skipped some pages. >> >> Fixes: f28d43636d6f ("mm/gup/writeback: add callbacks for inaccessible pages") >> Signed-off-by: David Hildenbrand >> --- >> mm/gup.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/mm/gup.c b/mm/gup.c >> index 22420f2069ee1..cff226ec0ee7d 100644 >> --- a/mm/gup.c >> +++ b/mm/gup.c >> @@ -2908,8 +2908,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, >> * details. >> */ >> if (flags & FOLL_PIN) { >> - ret = arch_make_folio_accessible(folio); >> - if (ret) { >> + if (arch_make_folio_accessible(folio)) { > > Oh Lord above. Lol. Yikes. > > Yeah I think your fix is valid... I sent it out earlier today. Fortunately that function shouldn't usually really fail IIUC. 
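
To illustrate the failure mode described in the fix quoted above, here is a minimal userspace sketch, not kernel code. All names in it (mock_make_accessible(), pte_range_buggy(), pte_range_fixed(), caller_keeps_going()) are hypothetical stand-ins, and the only assumption taken from the commit message is that the caller treats any nonzero return value as success.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for arch_make_folio_accessible(): 0 or -errno. */
static int mock_make_accessible(int fail)
{
	return fail ? -EBUSY : 0;
}

/* Old shape: the errno lands in ret and is returned as a nonzero value. */
static int pte_range_buggy(int fail)
{
	int ret = 0;

	ret = mock_make_accessible(fail);
	if (ret)
		goto out;
	ret = 1;
out:
	return ret;
}

/* Fixed shape: ret is only ever 0 (failure) or 1 (success). */
static int pte_range_fixed(int fail)
{
	int ret = 0;

	if (mock_make_accessible(fail))
		goto out;
	ret = 1;
out:
	return ret;
}

/* A caller that, per the commit message, reads "!= 0" as success. */
static int caller_keeps_going(int (*walk)(int), int fail)
{
	return walk(fail) != 0;
}

/* Small self-check showing the difference between the two shapes. */
static void self_check(void)
{
	assert(caller_keeps_going(pte_range_buggy, 1) == 1); /* error missed */
	assert(caller_keeps_going(pte_range_fixed, 1) == 0); /* error seen */
}
```

In the buggy shape a failure is indistinguishable from success for such a caller, which matches the "we essentially skipped some pages" concern in the commit message.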
-- Cheers David / dhildenb From conor at kernel.org Mon Sep 8 06:12:35 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 8 Sep 2025 14:12:35 +0100 Subject: [PATCH v1] rust: cfi: only 64-bit arm and x86 support CFI_CLANG Message-ID: <20250908-distill-lint-1ae78bcf777c@spud> From: Conor Dooley The kernel uses the standard rustc targets for non-x86 targets, and out of those only 64-bit arm's target has kcfi support enabled. For x86, the custom 64-bit target enables kcfi. The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows CFI_CLANG to be used in combination with RUST does not check whether the rustc target supports kcfi. This breaks the build on riscv (and presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same time. Ordinarily, a rustc-option check would be used to detect target support but unfortunately rustc-option filters out the target for reasons given in commit 46e24a545cdb4 ("rust: kasan/kbuild: fix missing flags on first build"). As a result, if the host supports kcfi but the target does not, e.g. when building for riscv on x86_64, the build would remain broken. Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only two architectures where the target used supports it to fix the build. 
CC: stable at vger.kernel.org
Fixes: ca627e636551e ("rust: cfi: add support for CFI_CLANG with Rust")
Signed-off-by: Conor Dooley
---
CC: Paul Walmsley
CC: Palmer Dabbelt
CC: Alexandre Ghiti
CC: Miguel Ojeda
CC: Alex Gaynor
CC: Boqun Feng
CC: Gary Guo
CC: "Björn Roy Baron"
CC: Benno Lossin
CC: Andreas Hindborg
CC: Alice Ryhl
CC: Trevor Gross
CC: Danilo Krummrich
CC: Kees Cook
CC: Sami Tolvanen
CC: Matthew Maurer
CC: "Peter Zijlstra (Intel)"
CC: linux-kernel at vger.kernel.org
CC: linux-riscv at lists.infradead.org
CC: rust-for-linux at vger.kernel.org
---
 arch/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index d1b4ffd6e0856..880cddff5eda7 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -917,6 +917,7 @@ config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC
 	def_bool y
 	depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
 	depends on RUSTC_VERSION >= 107900
+	depends on ARM64 || X86_64
 	# With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373
 	depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \
 		(!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
--
2.47.2

From aliceryhl at google.com Mon Sep 8 06:19:02 2025
From: aliceryhl at google.com (Alice Ryhl)
Date: Mon, 8 Sep 2025 15:19:02 +0200
Subject: [PATCH v1] rust: cfi: only 64-bit arm and x86 support CFI_CLANG
In-Reply-To: <20250908-distill-lint-1ae78bcf777c@spud>
References: <20250908-distill-lint-1ae78bcf777c@spud>
Message-ID:

On Mon, Sep 8, 2025 at 3:13 PM Conor Dooley wrote:
>
> From: Conor Dooley
>
> The kernel uses the standard rustc targets for non-x86 targets, and out
> of those only 64-bit arm's target has kcfi support enabled. For x86, the
> custom 64-bit target enables kcfi.
>
> The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows
> CFI_CLANG to be used in combination with RUST does not check whether the
> rustc target supports kcfi.
This breaks the build on riscv (and > presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same > time. > > Ordinarily, a rustc-option check would be used to detect target support > but unfortunately rustc-option filters out the target for reasons given > in commit 46e24a545cdb4 ("rust: kasan/kbuild: fix missing flags on first > build"). As a result, if the host supports kcfi but the target does not, > e.g. when building for riscv on x86_64, the build would remain broken. > > Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only > two architectures where the target used supports it to fix the build. > > CC: stable at vger.kernel.org > Fixes: ca627e636551e ("rust: cfi: add support for CFI_CLANG with Rust") > Signed-off-by: Conor Dooley Reviewed-by: Alice Ryhl From kingxukai at zohomail.com Mon Sep 8 07:13:15 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Mon, 8 Sep 2025 22:13:15 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> Message-ID: <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> On 2025/9/7 11:13, Yao Zi wrote: >> On Fri, Sep 05, 2025 at 11:10:23AM +0800, Xukai Wang wrote: >> This patch provides basic support for the K230 clock, which covers >> all clocks in K230 SoC. >> >> The clock tree of the K230 SoC consists of a 24MHZ external crystal >> oscillator, PLLs and an external pulse input for timerX, and their >> derived clocks. 
>> >> Co-developed-by: Troy Mitchell >> Signed-off-by: Troy Mitchell >> Signed-off-by: Xukai Wang >> --- >> drivers/clk/Kconfig | 6 + >> drivers/clk/Makefile | 1 + >> drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ >> 3 files changed, 2463 insertions(+) >> >> diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig >> index 299bc678ed1b9fcd9110bb8c5937a1bd1ea60e23..b597912607a6cc8eabff459a890a1e7353ef9c1d 100644 >> --- a/drivers/clk/Kconfig >> +++ b/drivers/clk/Kconfig >> @@ -464,6 +464,12 @@ config COMMON_CLK_K210 >> help >> Support for the Canaan Kendryte K210 RISC-V SoC clocks. >> >> +config COMMON_CLK_K230 >> + bool "Clock driver for the Canaan Kendryte K230 SoC" >> + depends on ARCH_CANAAN || COMPILE_TEST >> + help >> + Support for the Canaan Kendryte K230 RISC-V SoC clocks. >> + >> config COMMON_CLK_SP7021 >> tristate "Clock driver for Sunplus SP7021 SoC" >> depends on SOC_SP7021 || COMPILE_TEST >> diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile >> index fb8878a5d7d93da6bec487460cdf63f1f764a431..5df50b1e14c701ed38397bfb257db26e8dd278b8 100644 >> --- a/drivers/clk/Makefile >> +++ b/drivers/clk/Makefile >> @@ -51,6 +51,7 @@ obj-$(CONFIG_MACH_ASPEED_G6) += clk-ast2600.o >> obj-$(CONFIG_ARCH_HIGHBANK) += clk-highbank.o >> obj-$(CONFIG_CLK_HSDK) += clk-hsdk-pll.o >> obj-$(CONFIG_COMMON_CLK_K210) += clk-k210.o >> +obj-$(CONFIG_COMMON_CLK_K230) += clk-k230.o >> obj-$(CONFIG_LMK04832) += clk-lmk04832.o >> obj-$(CONFIG_COMMON_CLK_LAN966X) += clk-lan966x.o >> obj-$(CONFIG_COMMON_CLK_LOCHNAGAR) += clk-lochnagar.o >> diff --git a/drivers/clk/clk-k230.c b/drivers/clk/clk-k230.c >> new file mode 100644 >> index 0000000000000000000000000000000000000000..2ba74c008b30ae3400acbd8c08550e8315dfe205 >> --- /dev/null >> +++ b/drivers/clk/clk-k230.c >> @@ -0,0 +1,2456 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * Kendryte Canaan K230 Clock Drivers >> + * >> + * Author: Xukai Wang >> + * Author: Troy Mitchell >> + */ >> + >> 
+#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include >> + >> +/* PLL control register bits. */ >> +#define K230_PLL_BYPASS_ENABLE BIT(19) >> +#define K230_PLL_GATE_ENABLE BIT(2) >> +#define K230_PLL_GATE_WRITE_ENABLE BIT(18) >> +#define K230_PLL_OD_SHIFT 24 >> +#define K230_PLL_OD_MASK 0xF >> +#define K230_PLL_R_SHIFT 16 >> +#define K230_PLL_R_MASK 0x3F >> +#define K230_PLL_F_SHIFT 0 >> +#define K230_PLL_F_MASK 0x1FFF >> +#define K230_PLL_DIV_REG_OFFSET 0x00 >> +#define K230_PLL_BYPASS_REG_OFFSET 0x04 >> +#define K230_PLL_GATE_REG_OFFSET 0x08 >> +#define K230_PLL_LOCK_REG_OFFSET 0x0C > Maybe FIELD_PREP() and FIELD_GET() would help for the PLL-related > rountines, and you could get avoid of writing shifts and masks by hand. OK, I've already replaced the manual shifts and masks with GENMASK() and FIELD_GET(). > > ... > >> +struct k230_clk_rate_self { >> + struct clk_hw hw; >> + void __iomem *reg; >> + bool read_only; > Isn't a read-only multiplier, divider or something capable of both a > simple fixed-factor hardware? You're right. None of the rate clocks are read-only, so this flag is unnecessary and should be removed. > If so please switch to the existing clock > hardware, instead of introducing a field in description of rate clocks. > > It's worth noting that you've already had at least one fixed-ra te clock > (shrm_sram_div2). > >> + u32 write_enable_bit; >> + u32 mul_min; >> + u32 mul_max; >> + u32 mul_shift; >> + u32 mul_mask; >> + u32 div_min; >> + u32 div_max; >> + u32 div_shift; >> + u32 div_mask; >> + /* ensures mutual exclusion for concurrent register access. */ >> + spinlock_t *lock; >> +}; > ... 
> >> +static int k230_clk_find_approximate_mul_div(u32 mul_min, u32 mul_max, >> + u32 div_min, u32 div_max, >> + unsigned long rate, >> + unsigned long parent_rate, >> + u32 *div, u32 *mul) >> +{ >> + long abs_min; >> + long abs_current; >> + long perfect_divide; >> + >> + if (!rate || !parent_rate || !mul_min) >> + return -EINVAL; >> + >> + perfect_divide = (long)((parent_rate * 1000) / rate); >> + abs_min = abs(perfect_divide - >> + (long)(((long)div_max * 1000) / (long)mul_min)); >> + >> + *div = div_max; >> + *mul = mul_min; >> + >> + for (u32 i = div_max - 1; i >= div_min; i--) { >> + for (u32 j = mul_min + 1; j <= mul_max; j++) { >> + abs_current = abs(perfect_divide - >> + (long)(((long)i * 1000) / (long)j)); >> + >> + if (abs_min > abs_current) { >> + abs_min = abs_current; >> + *div = i; >> + *mul = j; >> + } >> + } >> + } >> + >> + return 0; >> +} > This looks like a poor version of rational_best_approximation(). Could > you please consider switching to it? OK, I've switched k230_clk_find_approximate_mul_div() to use rational_best_approximation(). Additionally, since rational_best_approximation() only supports setting the maximum value for numerator and denominator, I added extra checks after the call to ensure that the resulting values are not lower than mul_min and div_min. 
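
As a rough illustration of the bounded search discussed above, the following is a self-contained userspace sketch. It is not the driver's code and does not call the kernel's rational_best_approximation(); the helper names approx_mul_div() and clamp_u64() are hypothetical. It assumes parent_rate * mul fits in 64 bits and picks, for each multiplier in range, the better of the two dividers next to the ideal value after clamping to [div_min, div_max].

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v into [lo, hi]. */
static uint64_t clamp_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Hypothetical helper: choose mul in [mul_min, mul_max] and div in
 * [div_min, div_max] so that parent_rate * mul / div is closest to rate
 * (up to integer truncation). Returns 0 on success, -1 on bad input.
 * Assumes parent_rate * mul_max does not overflow 64 bits.
 */
static int approx_mul_div(uint32_t mul_min, uint32_t mul_max,
			  uint32_t div_min, uint32_t div_max,
			  uint64_t rate, uint64_t parent_rate,
			  uint32_t *mul, uint32_t *div)
{
	uint64_t best_err = UINT64_MAX;

	if (!rate || !parent_rate || !mul_min || !div_min ||
	    mul_min > mul_max || div_min > div_max)
		return -1;

	for (uint64_t m = mul_min; m <= mul_max; m++) {
		/* Floor of the ideal (real-valued) divider for this m. */
		uint64_t ideal = parent_rate * m / rate;

		/* The output is monotonic in d, so floor and ceil suffice. */
		for (uint64_t k = 0; k < 2; k++) {
			uint64_t d = clamp_u64(ideal + k, div_min, div_max);
			uint64_t out = parent_rate * m / d;
			uint64_t err = out > rate ? out - rate : rate - out;

			if (err < best_err) {
				best_err = err;
				*mul = (uint32_t)m;
				*div = (uint32_t)d;
			}
		}
	}

	/* Valid ranges always yield at least one candidate. */
	assert(best_err != UINT64_MAX);
	return 0;
}
```

With div_min set to 2, as mentioned for hs_sd_card_src_rate, a request for the parent rate itself resolves to the smallest allowed divider instead of 1, which is the kind of lower-bound handling a max-only approximation cannot express by itself.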
> >> +static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate, >> + unsigned long parent_rate) >> +{ >> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >> + struct k230_clk_rate_self *rate_self = &clk->clk; >> + u32 div, mul, mul_reg; >> + >> + if (rate > parent_rate) >> + return -EINVAL; >> + >> + if (rate_self->read_only) >> + return 0; >> + >> + if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max, >> + rate_self->div_min, rate_self->div_max, >> + rate, parent_rate, &div, &mul)) >> + return -EINVAL; >> + >> + guard(spinlock)(rate_self->lock); >> + >> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >> + mul_reg |= BIT(rate_self->write_enable_bit); >> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >> + >> + return 0; >> +} >> + >> +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, >> + unsigned long parent_rate) >> +{ >> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >> + struct k230_clk_rate_self *rate_self = &clk->clk; >> + u32 div, mul, div_reg; >> + >> + if (rate > parent_rate) >> + return -EINVAL; >> + >> + if (rate_self->read_only) >> + return 0; >> + >> + if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max, >> + rate_self->div_min, rate_self->div_max, >> + rate, parent_rate, &div, &mul)) >> + return -EINVAL; >> + >> + guard(spinlock)(rate_self->lock); >> + >> + div_reg = readl(rate_self->reg + clk->div_reg_off); >> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >> + div_reg |= BIT(rate_self->write_enable_bit); >> + writel(div_reg, rate_self->reg + clk->div_reg_off); >> + >> + return 0; >> +} >> + >> +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, >> + unsigned long parent_rate) >> +{ >> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >> + struct k230_clk_rate_self *rate_self = &clk->clk; >> + u32 div, mul, div_reg, 
mul_reg; >> + >> + if (rate > parent_rate) >> + return -EINVAL; >> + >> + if (rate_self->read_only) >> + return 0; >> + >> + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, >> + rate_self->div_min, rate_self->div_max, >> + rate, parent_rate, &div, &mul)) >> + return -EINVAL; >> + >> + guard(spinlock)(rate_self->lock); >> + >> + div_reg = readl(rate_self->reg + clk->div_reg_off); >> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >> + div_reg |= BIT(rate_self->write_enable_bit); >> + writel(div_reg, rate_self->reg + clk->div_reg_off); >> + >> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >> + mul_reg |= BIT(rate_self->write_enable_bit); >> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >> + >> + return 0; >> +} > There are three variants of rate clocks, mul-only, div-only and mul-div > ones, which are similar to clk-multiplier, clk-divider, > clk-fractional-divider. > > The only difference is to setup new parameters for K230's rate clocks, > a register bit, described as k230_clk_rate_self.write_enable_bit, must > be set first. Actually, I think the differences are not limited to just the write_enable_bit. There are also distinct mul_min, mul_max, div_min, and div_max values, which are not typically just 1 and (1 << bit_width) as in standard clock divider or multiplier structures. For example, the div_min for hs_sd_card_src_rate is 2, not 1. This affects the calculation of the approximate divider, and cannot be fully represented if we only use the clk_divider structure. Another example is ls_codec_adc_rate, where mul_min is 0x10, mul_max is 0x1B9, div_min is 0xC35, and div_max is 0x3D09. These specific ranges cannot be described using the normal clk_fractional_divider structure. > > What do you think of introducing support for such "write enable bit" to > the generic implementation of multipler/divider/fractional? 
Then you > could reuse the generic implementation in K230's driver, avoiding code > duplication. Therefore, in addition to the requirement of setting the write_enable_bit, the customizable ranges for these parameters are also important differences that should be considered. > > ... > >> +static const struct of_device_id k230_clk_ids[] = { >> + { .compatible = "canaan,k230-clk" }, >> + { /* Sentinel */ } >> +}; >> +MODULE_DEVICE_TABLE(of, k230_clk_ids); > MODULE_DEVICE_TABLE is unnecessary if your driver couldn't be built as > a module. OK, thanks for pointing it out. > >> +static struct platform_driver k230_clk_driver = { >> + .driver = { >> + .name = "k230_clock_controller", >> + .of_match_table = k230_clk_ids, >> + }, >> + .probe = k230_clk_probe, >> +}; >> +builtin_platform_driver(k230_clk_driver); > Best regards, > Yao Zi From miguel.ojeda.sandonis at gmail.com Mon Sep 8 07:36:09 2025 From: miguel.ojeda.sandonis at gmail.com (Miguel Ojeda) Date: Mon, 8 Sep 2025 16:36:09 +0200 Subject: [PATCH v1] rust: cfi: only 64-bit arm and x86 support CFI_CLANG In-Reply-To: <20250908-distill-lint-1ae78bcf777c@spud> References: <20250908-distill-lint-1ae78bcf777c@spud> Message-ID: On Mon, Sep 8, 2025 at 3:13 PM Conor Dooley wrote: > > From: Conor Dooley > > The kernel uses the standard rustc targets for non-x86 targets, and out > of those only 64-bit arm's target has kcfi support enabled. For x86, the > custom 64-bit target enables kcfi. > > The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows > CFI_CLANG to be used in combination with RUST does not check whether the > rustc target supports kcfi. This breaks the build on riscv (and > presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same > time. > > Ordinarily, a rustc-option check would be used to detect target support > but unfortunately rustc-option filters out the target for reasons given > in commit 46e24a545cdb4 ("rust: kasan/kbuild: fix missing flags on first > build").
As a result, if the host supports kcfi but the target does not, > e.g. when building for riscv on x86_64, the build would remain broken. > > Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only > two architectures where the target used supports it to fix the build. > > CC: stable at vger.kernel.org > Fixes: ca627e636551e ("rust: cfi: add support for CFI_CLANG with Rust") > Signed-off-by: Conor Dooley If you are taking this through RISC-V: Acked-by: Miguel Ojeda Cheers, Miguel From ni_liqiang at 126.com Mon Sep 8 08:03:46 2025 From: ni_liqiang at 126.com (niliqiang) Date: Mon, 8 Sep 2025 23:03:46 +0800 Subject: [RFC PATCH v2 00/10] RISC-V IOMMU HPM and nested IOMMU support In-Reply-To: References: Message-ID: <20250908150346.4761-1-ni_liqiang@126.com> On Tue, 2 Sep 2025 12:01:19 +0800 Zong Li wrote: > > On Mon, Sep 1, 2025 at 9:37 PM niliqiang wrote: > > > > Hi Zong > > > > Fri, 14 Jun 2024 22:21:48 +0800, Zong Li wrote: > > > > > This patch initialize the pmu stuff and uninitialize it when driver > > > removing. The interrupt handling is also provided, this handler need to > > > be primary handler instead of thread function, because pt_regs is empty > > > when threading the IRQ, but pt_regs is necessary by perf_event_overflow.
> > > > > > Signed-off-by: Zong Li > > > --- > > > drivers/iommu/riscv/iommu.c | 65 +++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 65 insertions(+) > > > > > > diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c > > > index 8b6a64c1ad8d..1716b2251f38 100644 > > > --- a/drivers/iommu/riscv/iommu.c > > > +++ b/drivers/iommu/riscv/iommu.c > > > @@ -540,6 +540,62 @@ static irqreturn_t riscv_iommu_fltq_process(int irq, void *data) > > > return IRQ_HANDLED; > > > } > > > > > > +/* > > > + * IOMMU Hardware performance monitor > > > + */ > > > + > > > +/* HPM interrupt primary handler */ > > > +static irqreturn_t riscv_iommu_hpm_irq_handler(int irq, void *dev_id) > > > +{ > > > + struct riscv_iommu_device *iommu = (struct riscv_iommu_device *)dev_id; > > > + > > > + /* Process pmu irq */ > > > + riscv_iommu_pmu_handle_irq(&iommu->pmu); > > > + > > > + /* Clear performance monitoring interrupt pending */ > > > + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_IPSR, RISCV_IOMMU_IPSR_PMIP); > > > + > > > + return IRQ_HANDLED; > > > +} > > > + > > > +/* HPM initialization */ > > > +static int riscv_iommu_hpm_enable(struct riscv_iommu_device *iommu) > > > +{ > > > + int rc; > > > + > > > + if (!(iommu->caps & RISCV_IOMMU_CAPABILITIES_HPM)) > > > + return 0; > > > + > > > + /* > > > + * pt_regs is empty when threading the IRQ, but pt_regs is necessary > > > + * by perf_event_overflow. Use primary handler instead of thread > > > + * function for PM IRQ. > > > + * > > > + * Set the IRQF_ONESHOT flag because this IRQ might be shared with > > > + * other threaded IRQs by other queues. 
> > > + */ > > > + rc = devm_request_irq(iommu->dev, > > > + iommu->irqs[riscv_iommu_queue_vec(iommu, RISCV_IOMMU_IPSR_PMIP)], > > > + riscv_iommu_hpm_irq_handler, IRQF_ONESHOT | IRQF_SHARED, NULL, iommu); > > > + if (rc) > > > + return rc; > > > + > > > + return riscv_iommu_pmu_init(&iommu->pmu, iommu->reg, dev_name(iommu->dev)); > > > +} > > > + > > > > What are the benefits of initializing the iommu-pmu driver in the iommu driver? > > > > It might be better for the RISC-V IOMMU PMU driver to be loaded as a separate module, as this would allow greater flexibility since different vendors may need to add custom events. > > > > Also, I'm not quite clear on how custom events should be added if the RISC-V iommu-pmu is placed within the iommu driver. > > Hi Liqiang, > My original idea is that, since the IOMMU HPM is not always present, > it depends on the capability.HPM bit, if we separate HPM into an > individual module, I assume that the PMU driver may not have access to > the IOMMU's complete MMIO region. I'm not sure how we would check the > capability register in the PMU driver and avoid the following > situation: capability.HPM is zero, but the IOMMU-PMU driver is still > loaded because the PMU node is present in the DTS. It will be helpful > if you have any suggestions on this. > > Regarding custom events, since we don't have the driver data, my > current rough idea is to add a vendor event map table to list the > vendor events and use Kconfig to define them respectively. This is > just an initial thought and may not be the good solution, so feel free > to share any recommendations. Of course, if we eventually decide to > move it to drivers/perf as an individual module, then we could use the > driver data for custom events, similar to what ARM does. > > Thanks > Apologies for the late reply. I understand the reasoning behind this approach now, it is indeed necessary to check the capability.HPM bit to further determine whether to load the pmu-driver. 
Also, the HPM-related registers are stored together with other registers without distinction. Regarding custom events, I also previously thought that distinguishing them via Kconfig was the way, though it isn't the most elegant approach. Sorry, I don't have a better idea for now. I'll be sure to follow up with you if I come across any better ideas. Best regards, Liqiang From broonie at kernel.org Mon Sep 8 08:16:13 2025 From: broonie at kernel.org (Mark Brown) Date: Mon, 8 Sep 2025 16:16:13 +0100 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <20250901150359.867252-20-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> Message-ID: On Mon, Sep 01, 2025 at 05:03:40PM +0200, David Hildenbrand wrote: > We can just cleanup the code by calculating the #refs earlier, > so we can just inline what remains of record_subpages(). > > Calculate the number of references/pages ahead of times, and record them > only once all our tests passed. I'm seeing failures in kselftest-mm in -next on at least Raspberry Pi 4 and Orion O6 which bisect to this patch. I'm seeing a NULL pointer dereference during the GUP test (which isn't actually doing anything as I'm just using a standard defconfig rather than one with the mm fragment): # # # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping .[ 92.209804] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 ...
[ 92.443816] Call trace: [ 92.446284] io_check_coalesce_buffer+0xd4/0x160 (P) [ 92.451306] io_sqe_buffers_register+0x1b0/0x22c [ 92.455976] __arm64_sys_io_uring_register+0x330/0xe74 [ 92.461176] invoke_syscall+0x48/0x104 [ 92.464966] el0_svc_common.constprop.0+0x40/0xe0 Full log: https://lava.sirena.org.uk/scheduler/job/1778528#L1985 The bisect looks to converge reasonably clearly, I didn't make much more effort to diagnose: # bad: [be5d4872e528796df9d7425f2bd9b3893eb3a42c] Add linux-next specific files for 20250905 # good: [5fe42852269dc659c8d511864410bd5cf3393e91] Merge branch 'for-linux-next-fixes' of https://gitlab.freedesktop.org/drm/misc/kernel.git # good: [0ccc1eeda155c947d88ef053e0b54e434e218ee2] ASoC: dt-bindings: wlf,wm8960: Document routing strings (pin names) # good: [7748328c2fd82efed24257b2bfd796eb1fa1d09b] ASoC: dt-bindings: qcom,lpass-va-macro: Update bindings for clocks to support ADSP # good: [dd7ae5b8b3c291c0206f127a564ae1e316705ca0] ASoC: cs42l43: Shutdown jack detection on suspend # good: [ce1a46b2d6a8465a86f7a6f71beb4c6de83bce5c] ASoC: codecs: lpass-wsa-macro: add Codev version 2.9 # good: [ce57b718006a069226b5e5d3afe7969acd59154e] ASoC: Intel: avs: ssm4567: Adjust platform name # good: [94b39cb3ad6db935b585988b36378884199cd5fc] spi: mxs: fix "transfered"->"transferred" # good: [5cc49b5a36b32a2dba41441ea13b93fb5ea21cfd] spi: spi-fsl-dspi: Report FIFO overflows as errors # good: [3279052eab235bfb7130b1fabc74029c2260ed8d] ASoC: SOF: ipc4-topology: Fix a less than zero check on a u32 # good: [8f57dcf39fd0864f5f3e6701fe885e55f45d0d3a] ASoC: qcom: audioreach: convert to cpu endainess type before accessing # good: [9d35d068fb138160709e04e3ee97fe29a6f8615b] regulator: scmi: Use int type to store negative error codes # good: [8a9772ec08f87c9e45ab1ad2c8d2b8c1763836eb] ASoC: soc-dapm: rename snd_soc_kcontrol_component() to snd_soc_kcontrol_to_component() # good: [07752abfa5dbf7cb4d9ce69fa94dc3b12bc597d9] ASoC: SOF: sof-client: Introduce 
sof_client_dev_entry structure # good: [d57d27171c92e9049d5301785fb38de127b28fbf] ASoC: SOF: sof-client-probes: Add available points_info(), IPC4 only # good: [f7c41911ad744177d8289820f01009dc93d8f91c] ASoC: SOF: ipc4-topology: Add support for float sample type # good: [3d439e1ec3368fae17db379354bd7a9e568ca0ab] ASoC: sof: ipc4-topology: Add support to sched_domain attribute # good: [5c39bc498f5ff7ef016abf3f16698f3e8db79677] ASoC: SOF: Intel: only detect codecs when HDA DSP probe # good: [f522da9ab56c96db8703b2ea0f09be7cdc3bffeb] ASoC: doc: Internally link to Writing an ALSA Driver docs # good: [f4672dc6e9c07643c8c755856ba8e9eb9ca95d0c] regmap: use int type to store negative error codes # good: [b088b6189a4066b97cef459afd312fd168a76dea] ASoC: mediatek: common: Switch to for_each_available_child_of_node_scoped() # good: [c42e36a488c7e01f833fc9f4814f735b66b2d494] spi: Drop dev_pm_domain_detach() call # good: [a37280daa4d583c7212681c49b285de9464a5200] ASoC: Intel: avs: Allow i2s test and non-test boards to coexist # good: [ff9a7857b7848227788f113d6dc6a72e989084e0] spi: rb4xx: use devm for clk_prepare_enable # good: [edb5c1f885207d1d74e8a1528e6937e02829ee6e] ASoC: renesas: msiof: start DMAC first # good: [e2ab5f600bb01d3625d667d97b3eb7538e388336] rust: regulator: use `to_result` for error handling # good: [5b4dcaf851df8c414bfc2ac3bf9c65fc942f3be4] ASoC: amd: acp: Remove (explicitly) unused header # good: [899fb38dd76dd3ede425bbaf8a96d390180a5d1c] regulator: core: Remove redundant ternary operators # good: [11f5c5f9e43e9020bae452232983fe98e7abfce0] ASoC: qcom: use int type to store negative error codes # good: [a12b74d2bd4724ee1883bc97ec93eac8fafc8d3c] ASoC: tlv320aic32x4: use dev_err_probe() for regulators # good: [f840737d1746398c2993be34bfdc80bdc19ecae2] ASoC: SOF: imx: Remove the use of dev_err_probe() # good: [d78e48ebe04e9566f8ecbf51471e80da3adbceeb] ASoC: dt-bindings: Minor whitespace cleanup in example # good: [96bcb34df55f7fee99795127c796315950c94fed] ASoC: 
test-component: Use kcalloc() instead of kzalloc() # good: [c232495d28ca092d0c39b10e35d3d613bd2414ab] ASoC: dt-bindings: omap-twl4030: convert to DT schema # good: [87a877de367d835b527d1086f75727123ef85fc4] KVM: x86: Rename handle_fastpath_set_msr_irqoff() to handle_fastpath_wrmsr() # good: [c26675447faff8c4ddc1dc5d2cd28326b8181aaf] KVM: x86: Zero XSTATE components on INIT by iterating over supported features # good: [ec0be3cdf40b5302248f3fb27a911cc630e8b855] regulator: consumer.rst: document bulk operations # good: [27848c082ba0b22850fd9fb7b185c015423dcdc7] spi: s3c64xx: Remove the use of dev_err_probe() # good: [c1dd310f1d76b4b13f1854618087af2513140897] spi: SPISG: Use devm_kcalloc() in aml_spisg_clk_init() # good: [da9881d00153cc6d3917f6b74144b1d41b58338c] ASoC: qcom: audioreach: add support for SMECNS module # good: [cf65182247761f7993737b710afe8c781699356b] ASoC: codecs: wsa883x: Handle shared reset GPIO for WSA883x speakers # good: [2a55135201d5e24b80b7624880ff42eafd8e320c] ASoC: Intel: avs: Streamline register-component function names # good: [550bc517e59347b3b1af7d290eac4fb1411a3d4e] regulator: bd718x7: Use kcalloc() instead of kzalloc() # good: [0056b410355713556d8a10306f82e55b28d33ba8] spi: offload trigger: adi-util-sigma-delta: clean up imports # good: [daf855f76a1210ceed9541f71ac5dd9be02018a6] ASoC: es8323: enable DAPM power widgets for playback DAC # good: [90179609efa421b1ccc7d8eafbc078bafb25777c] spi: spl022: use min_t() to improve code # good: [258384d8ce365dddd6c5c15204de8ccd53a7ab0a] ASoC: es8323: enable DAPM power widgets for playback DAC and output # good: [6d068f1ae2a2f713d7f21a9a602e65b3d6b6fc6d] regulator: rt5133: Fix spelling mistake "regualtor" -> "regulator" # good: [a46e95c81e3a28926ab1904d9f754fef8318074d] ASoC: wl1273: Remove # good: [48124569bbc6bfda1df3e9ee17b19d559f4b1aa3] spi: remove unneeded 'fast_io' parameter in regmap_config # good: [37533933bfe92cd5a99ef4743f31dac62ccc8de0] regulator: remove unneeded 'fast_io' parameter in 
regmap_config # good: [0e62438e476494a1891a8822b9785bc6e73e9c3f] ASoC: Intel: sst: Remove redundant semicolons # good: [5c36b86d2bf68fbcad16169983ef7ee8c537db59] regmap: Remove superfluous check for !config in __regmap_init() # good: [714165e1c4b0d5b8c6d095fe07f65e6e7047aaeb] regulator: rt5133: Add RT5133 PMIC regulator Support # good: [9c45f95222beecd6a284fd1284d54dd7a772cf59] spi: spi-qpic-snand: handle 'use_ecc' parameter of qcom_spi_config_cw_read() # good: [bab4ab484a6ca170847da9bffe86f1fa90df4bbe] ASoC: dt-bindings: Convert brcm,bcm2835-i2s to DT schema # good: [b832b19318534bb4f1673b24d78037fee339c679] spi: loopback-test: Don't use %pK through printk # good: [8c02c8353460f8630313aef6810f34e134a3c1ee] ASoC: dt-bindings: realtek,alc5623: convert to DT schema # good: [6b7e2aa50bdaf88cd4c2a5e2059a7bf32d85a8b1] spi: spi-qpic-snand: remove 'clr*status' members of struct 'qpic_ecc' # good: [2291a2186305faaf8525d57849d8ba12ad63f5e7] MAINTAINERS: Add entry for FourSemi audio amplifiers # good: [a54ef14188519a0994d0264f701f5771815fa11e] regulator: dt-bindings: Clean-up active-semi,act8945a duplication # good: [a1d0b0ae65ae3f32597edfbb547f16c75601cd87] spi: spi-qpic-snand: avoid double assignment in qcom_spi_probe() # good: [cf25eb8eae91bcae9b2065d84b0c0ba0f6d9dd34] ASoC: soc-component: unpack snd_soc_component_init_bias_level() # good: [595b7f155b926460a00776cc581e4dcd01220006] ASoC: Intel: avs: Conditional-path support # good: [3059067fd3378a5454e7928c08d20bf3ef186760] ASoC: cs48l32: Use PTR_ERR_OR_ZERO() to simplify code # good: [2d86d2585ab929a143d1e6f8963da1499e33bf13] ASoC: pxa: add GPIOLIB_LEGACY dependency # good: [9a200cbdb54349909a42b45379e792e4b39dd223] rust: regulator: implement Send and Sync for Regulator # good: [162e23657e5379f07c6404dbfbf4367cb438ea7d] regulator: pf0900: Add PMIC PF0900 support # good: [886f42ce96e7ce80545704e7168a9c6b60cd6c03] regmap: mmio: Add missing MODULE_DESCRIPTION() # good: [6684aba0780da9f505c202f27e68ee6d18c0aa66] XArray: Add 
extra debugging check to xas_lock and friends git bisect start 'be5d4872e528796df9d7425f2bd9b3893eb3a42c' '5fe42852269dc659c8d511864410bd5cf3393e91' '0ccc1eeda155c947d88ef053e0b54e434e218ee2' '7748328c2fd82efed24257b2bfd796eb1fa1d09b' 'dd7ae5b8b3c291c0206f127a564ae1e316705ca0' 'ce1a46b2d6a8465a86f7a6f71beb4c6de83bce5c' 'ce57b718006a069226b5e5d3afe7969acd59154e' '94b39cb3ad6db935b585988b36378884199cd5fc' '5cc49b5a36b32a2dba41441ea13b93fb5ea21cfd' '3279052eab235bfb7130b1fabc74029c2260ed8d' '8f57dcf39fd0864f5f3e6701fe885e55f45d0d3a' '9d35d068fb138160709e04e3ee97fe29a6f8615b' '8a9772ec08f87c9e45ab1ad2c8d2b8c1763836eb' '07752abfa5dbf7cb4d9ce69fa94dc3b12bc597d9' 'd57d27171c92e9049d5301785fb38de127b28fbf' 'f7c41911ad744177d8289820f01009dc93d8f91c' '3d439e1ec3368fae17db379354bd7a9e568ca0ab' '5c39bc498f5ff7ef016abf3f16698f3e8db79677' 'f522da9ab56c96db8703b2ea0f09be7cdc3bffeb' 'f4672dc6e9c07643c8c755856ba8e9eb9ca95d0c' 'b088b6189a4066b97cef459afd312fd168a76dea' 'c42e36a488c7e01f833fc9f4814f735b66b2d494' 'a37280daa4d583c7212681c49b285de9464a5200' 'ff9a7857b7848227788f113d6dc6a72e989084e0' 'edb5c1f885207d1d74e8a1528e6937e02829ee6e' 'e2ab5f600bb01d3625d667d97b3eb7538e388336' '5b4dcaf851df8c414bfc2ac3bf9c65fc942f3be4' '899fb38dd76dd3ede425bbaf8a96d390180a5d1c' '11f5c5f9e43e9020bae452232983fe98e7abfce0' 'a12b74d2bd4724ee1883bc97ec93eac8fafc8d3c' 'f840737d1746398c2993be34bfdc80bdc19ecae2' 'd78e48ebe04e9566f8ecbf51471e80da3adbceeb' '96bcb34df55f7fee99795127c796315950c94fed' 'c232495d28ca092d0c39b10e35d3d613bd2414ab' '87a877de367d835b527d1086f75727123ef85fc4' 'c26675447faff8c4ddc1dc5d2cd28326b8181aaf' 'ec0be3cdf40b5302248f3fb27a911cc630e8b855' '27848c082ba0b22850fd9fb7b185c015423dcdc7' 'c1dd310f1d76b4b13f1854618087af2513140897' 'da9881d00153cc6d3917f6b74144b1d41b58338c' 'cf65182247761f7993737b710afe8c781699356b' '2a55135201d5e24b80b7624880ff42eafd8e320c' '550bc517e59347b3b1af7d290eac4fb1411a3d4e' '0056b410355713556d8a10306f82e55b28d33ba8' 'daf855f76a1210ceed9541f71ac5dd9be02018a6' 
'90179609efa421b1ccc7d8eafbc078bafb25777c' '258384d8ce365dddd6c5c15204de8ccd53a7ab0a' '6d068f1ae2a2f713d7f21a9a602e65b3d6b6fc6d' 'a46e95c81e3a28926ab1904d9f754fef8318074d' '48124569bbc6bfda1df3e9ee17b19d559f4b1aa3' '37533933bfe92cd5a99ef4743f31dac62ccc8de0' '0e62438e476494a1891a8822b9785bc6e73e9c3f' '5c36b86d2bf68fbcad16169983ef7ee8c537db59' '714165e1c4b0d5b8c6d095fe07f65e6e7047aaeb' '9c45f95222beecd6a284fd1284d54dd7a772cf59' 'bab4ab484a6ca170847da9bffe86f1fa90df4bbe' 'b832b19318534bb4f1673b24d78037fee339c679' '8c02c8353460f8630313aef6810f34e134a3c1ee' '6b7e2aa50bdaf88cd4c2a5e2059a7bf32d85a8b1' '2291a2186305faaf8525d57849d8ba12ad63f5e7' 'a54ef14188519a0994d0264f701f5771815fa11e' 'a1d0b0ae65ae3f32597edfbb547f16c75601cd87' 'cf25eb8eae91bcae9b2065d84b0c0ba0f6d9dd34' '595b7f155b926460a00776cc581e4dcd01220006' '3059067fd3378a5454e7928c08d20bf3ef186760' '2d86d2585ab929a143d1e6f8963da1499e33bf13' '9a200cbdb54349909a42b45379e792e4b39dd223' '162e23657e5379f07c6404dbfbf4367cb438ea7d' '886f42ce96e7ce80545704e7168a9c6b60cd6c03' '6684aba0780da9f505c202f27e68ee6d18c0aa66' # test job: [0ccc1eeda155c947d88ef053e0b54e434e218ee2] https://lava.sirena.org.uk/scheduler/job/1773040 # test job: [7748328c2fd82efed24257b2bfd796eb1fa1d09b] https://lava.sirena.org.uk/scheduler/job/1773378 # test job: [dd7ae5b8b3c291c0206f127a564ae1e316705ca0] https://lava.sirena.org.uk/scheduler/job/1773233 # test job: [ce1a46b2d6a8465a86f7a6f71beb4c6de83bce5c] https://lava.sirena.org.uk/scheduler/job/1768983 # test job: [ce57b718006a069226b5e5d3afe7969acd59154e] https://lava.sirena.org.uk/scheduler/job/1768713 # test job: [94b39cb3ad6db935b585988b36378884199cd5fc] https://lava.sirena.org.uk/scheduler/job/1768603 # test job: [5cc49b5a36b32a2dba41441ea13b93fb5ea21cfd] https://lava.sirena.org.uk/scheduler/job/1769293 # test job: [3279052eab235bfb7130b1fabc74029c2260ed8d] https://lava.sirena.org.uk/scheduler/job/1762427 # test job: [8f57dcf39fd0864f5f3e6701fe885e55f45d0d3a] 
https://lava.sirena.org.uk/scheduler/job/1760074 # test job: [9d35d068fb138160709e04e3ee97fe29a6f8615b] https://lava.sirena.org.uk/scheduler/job/1758673 # test job: [8a9772ec08f87c9e45ab1ad2c8d2b8c1763836eb] https://lava.sirena.org.uk/scheduler/job/1758556 # test job: [07752abfa5dbf7cb4d9ce69fa94dc3b12bc597d9] https://lava.sirena.org.uk/scheduler/job/1752251 # test job: [d57d27171c92e9049d5301785fb38de127b28fbf] https://lava.sirena.org.uk/scheduler/job/1752624 # test job: [f7c41911ad744177d8289820f01009dc93d8f91c] https://lava.sirena.org.uk/scheduler/job/1752345 # test job: [3d439e1ec3368fae17db379354bd7a9e568ca0ab] https://lava.sirena.org.uk/scheduler/job/1753454 # test job: [5c39bc498f5ff7ef016abf3f16698f3e8db79677] https://lava.sirena.org.uk/scheduler/job/1751954 # test job: [f522da9ab56c96db8703b2ea0f09be7cdc3bffeb] https://lava.sirena.org.uk/scheduler/job/1751875 # test job: [f4672dc6e9c07643c8c755856ba8e9eb9ca95d0c] https://lava.sirena.org.uk/scheduler/job/1747876 # test job: [b088b6189a4066b97cef459afd312fd168a76dea] https://lava.sirena.org.uk/scheduler/job/1746202 # test job: [c42e36a488c7e01f833fc9f4814f735b66b2d494] https://lava.sirena.org.uk/scheduler/job/1746271 # test job: [a37280daa4d583c7212681c49b285de9464a5200] https://lava.sirena.org.uk/scheduler/job/1746918 # test job: [ff9a7857b7848227788f113d6dc6a72e989084e0] https://lava.sirena.org.uk/scheduler/job/1746336 # test job: [edb5c1f885207d1d74e8a1528e6937e02829ee6e] https://lava.sirena.org.uk/scheduler/job/1746134 # test job: [e2ab5f600bb01d3625d667d97b3eb7538e388336] https://lava.sirena.org.uk/scheduler/job/1746607 # test job: [5b4dcaf851df8c414bfc2ac3bf9c65fc942f3be4] https://lava.sirena.org.uk/scheduler/job/1747672 # test job: [899fb38dd76dd3ede425bbaf8a96d390180a5d1c] https://lava.sirena.org.uk/scheduler/job/1747375 # test job: [11f5c5f9e43e9020bae452232983fe98e7abfce0] https://lava.sirena.org.uk/scheduler/job/1747503 # test job: [a12b74d2bd4724ee1883bc97ec93eac8fafc8d3c] 
https://lava.sirena.org.uk/scheduler/job/1734077 # test job: [f840737d1746398c2993be34bfdc80bdc19ecae2] https://lava.sirena.org.uk/scheduler/job/1727318 # test job: [d78e48ebe04e9566f8ecbf51471e80da3adbceeb] https://lava.sirena.org.uk/scheduler/job/1706175 # test job: [96bcb34df55f7fee99795127c796315950c94fed] https://lava.sirena.org.uk/scheduler/job/1699577 # test job: [c232495d28ca092d0c39b10e35d3d613bd2414ab] https://lava.sirena.org.uk/scheduler/job/1699507 # test job: [87a877de367d835b527d1086f75727123ef85fc4] https://lava.sirena.org.uk/scheduler/job/1697972 # test job: [c26675447faff8c4ddc1dc5d2cd28326b8181aaf] https://lava.sirena.org.uk/scheduler/job/1698132 # test job: [ec0be3cdf40b5302248f3fb27a911cc630e8b855] https://lava.sirena.org.uk/scheduler/job/1694308 # test job: [27848c082ba0b22850fd9fb7b185c015423dcdc7] https://lava.sirena.org.uk/scheduler/job/1693100 # test job: [c1dd310f1d76b4b13f1854618087af2513140897] https://lava.sirena.org.uk/scheduler/job/1693035 # test job: [da9881d00153cc6d3917f6b74144b1d41b58338c] https://lava.sirena.org.uk/scheduler/job/1693416 # test job: [cf65182247761f7993737b710afe8c781699356b] https://lava.sirena.org.uk/scheduler/job/1687562 # test job: [2a55135201d5e24b80b7624880ff42eafd8e320c] https://lava.sirena.org.uk/scheduler/job/1685772 # test job: [550bc517e59347b3b1af7d290eac4fb1411a3d4e] https://lava.sirena.org.uk/scheduler/job/1685910 # test job: [0056b410355713556d8a10306f82e55b28d33ba8] https://lava.sirena.org.uk/scheduler/job/1685649 # test job: [daf855f76a1210ceed9541f71ac5dd9be02018a6] https://lava.sirena.org.uk/scheduler/job/1685441 # test job: [90179609efa421b1ccc7d8eafbc078bafb25777c] https://lava.sirena.org.uk/scheduler/job/1686081 # test job: [258384d8ce365dddd6c5c15204de8ccd53a7ab0a] https://lava.sirena.org.uk/scheduler/job/1673411 # test job: [6d068f1ae2a2f713d7f21a9a602e65b3d6b6fc6d] https://lava.sirena.org.uk/scheduler/job/1673133 # test job: [a46e95c81e3a28926ab1904d9f754fef8318074d] 
https://lava.sirena.org.uk/scheduler/job/1673748 # test job: [48124569bbc6bfda1df3e9ee17b19d559f4b1aa3] https://lava.sirena.org.uk/scheduler/job/1670184 # test job: [37533933bfe92cd5a99ef4743f31dac62ccc8de0] https://lava.sirena.org.uk/scheduler/job/1668977 # test job: [0e62438e476494a1891a8822b9785bc6e73e9c3f] https://lava.sirena.org.uk/scheduler/job/1669534 # test job: [5c36b86d2bf68fbcad16169983ef7ee8c537db59] https://lava.sirena.org.uk/scheduler/job/1667971 # test job: [714165e1c4b0d5b8c6d095fe07f65e6e7047aaeb] https://lava.sirena.org.uk/scheduler/job/1667699 # test job: [9c45f95222beecd6a284fd1284d54dd7a772cf59] https://lava.sirena.org.uk/scheduler/job/1667598 # test job: [bab4ab484a6ca170847da9bffe86f1fa90df4bbe] https://lava.sirena.org.uk/scheduler/job/1664664 # test job: [b832b19318534bb4f1673b24d78037fee339c679] https://lava.sirena.org.uk/scheduler/job/1659213 # test job: [8c02c8353460f8630313aef6810f34e134a3c1ee] https://lava.sirena.org.uk/scheduler/job/1659264 # test job: [6b7e2aa50bdaf88cd4c2a5e2059a7bf32d85a8b1] https://lava.sirena.org.uk/scheduler/job/1656585 # test job: [2291a2186305faaf8525d57849d8ba12ad63f5e7] https://lava.sirena.org.uk/scheduler/job/1655709 # test job: [a54ef14188519a0994d0264f701f5771815fa11e] https://lava.sirena.org.uk/scheduler/job/1656024 # test job: [a1d0b0ae65ae3f32597edfbb547f16c75601cd87] https://lava.sirena.org.uk/scheduler/job/1654201 # test job: [cf25eb8eae91bcae9b2065d84b0c0ba0f6d9dd34] https://lava.sirena.org.uk/scheduler/job/1654790 # test job: [595b7f155b926460a00776cc581e4dcd01220006] https://lava.sirena.org.uk/scheduler/job/1653119 # test job: [3059067fd3378a5454e7928c08d20bf3ef186760] https://lava.sirena.org.uk/scheduler/job/1655440 # test job: [2d86d2585ab929a143d1e6f8963da1499e33bf13] https://lava.sirena.org.uk/scheduler/job/1655917 # test job: [9a200cbdb54349909a42b45379e792e4b39dd223] https://lava.sirena.org.uk/scheduler/job/1654762 # test job: [162e23657e5379f07c6404dbfbf4367cb438ea7d] 
https://lava.sirena.org.uk/scheduler/job/1652978 # test job: [886f42ce96e7ce80545704e7168a9c6b60cd6c03] https://lava.sirena.org.uk/scheduler/job/1654270 # test job: [6684aba0780da9f505c202f27e68ee6d18c0aa66] https://lava.sirena.org.uk/scheduler/job/1738722 # test job: [be5d4872e528796df9d7425f2bd9b3893eb3a42c] https://lava.sirena.org.uk/scheduler/job/1778528 # bad: [be5d4872e528796df9d7425f2bd9b3893eb3a42c] Add linux-next specific files for 20250905 git bisect bad be5d4872e528796df9d7425f2bd9b3893eb3a42c # test job: [c3ce85ecd0268df1e0ca692e8126bb181fc89a08] https://lava.sirena.org.uk/scheduler/job/1779086 # bad: [c3ce85ecd0268df1e0ca692e8126bb181fc89a08] Merge branch 'main' of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git git bisect bad c3ce85ecd0268df1e0ca692e8126bb181fc89a08 # test job: [973a887a5bb9a42878e276209592e0f75c287bb6] https://lava.sirena.org.uk/scheduler/job/1780104 # bad: [973a887a5bb9a42878e276209592e0f75c287bb6] Merge branch 'fs-next' of linux-next git bisect bad 973a887a5bb9a42878e276209592e0f75c287bb6 # test job: [fdabd8890022a9439b95d7395f7ae046544d96fd] https://lava.sirena.org.uk/scheduler/job/1780530 # bad: [fdabd8890022a9439b95d7395f7ae046544d96fd] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git git bisect bad fdabd8890022a9439b95d7395f7ae046544d96fd # test job: [94bd0249a4a06131c4a1c2097b6134217a658976] https://lava.sirena.org.uk/scheduler/job/1780904 # bad: [94bd0249a4a06131c4a1c2097b6134217a658976] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git git bisect bad 94bd0249a4a06131c4a1c2097b6134217a658976 # test job: [702b6c2f1008779e8fc8a4a4438410165309a4b4] https://lava.sirena.org.uk/scheduler/job/1781370 # bad: [702b6c2f1008779e8fc8a4a4438410165309a4b4] kasan-apply-write-only-mode-in-kasan-kunit-testcases-v7 git bisect bad 702b6c2f1008779e8fc8a4a4438410165309a4b4 # test job: [0ac48805721d5952a920356e454167bba8d27737] 
https://lava.sirena.org.uk/scheduler/job/1781448 # good: [0ac48805721d5952a920356e454167bba8d27737] mm: convert page_to_section() to memdesc_section() git bisect good 0ac48805721d5952a920356e454167bba8d27737 # test job: [dc731eba2e47fa81d50aa1cb167100889253cfe0] https://lava.sirena.org.uk/scheduler/job/1781608 # good: [dc731eba2e47fa81d50aa1cb167100889253cfe0] mm/damon/paddr: support addr_unit for MIGRATE_{HOT,COLD} git bisect good dc731eba2e47fa81d50aa1cb167100889253cfe0 # test job: [e24bb041cafabaa5fa3d76386c86af389cc324f5] https://lava.sirena.org.uk/scheduler/job/1781660 # good: [e24bb041cafabaa5fa3d76386c86af389cc324f5] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() git bisect good e24bb041cafabaa5fa3d76386c86af389cc324f5 # test job: [62fd63f4688f40f01a6df23225523ece10d4b69a] https://lava.sirena.org.uk/scheduler/job/1781975 # bad: [62fd63f4688f40f01a6df23225523ece10d4b69a] dma-remap: drop nth_page() in dma_common_contiguous_remap() git bisect bad 62fd63f4688f40f01a6df23225523ece10d4b69a # test job: [cb42f7f6d9e4eff4e5259cddf82fd913306b8fe7] https://lava.sirena.org.uk/scheduler/job/1782145 # good: [cb42f7f6d9e4eff4e5259cddf82fd913306b8fe7] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() git bisect good cb42f7f6d9e4eff4e5259cddf82fd913306b8fe7 # test job: [db076b5db550aa34169dceee81d0974c7b2a2482] https://lava.sirena.org.uk/scheduler/job/1782813 # bad: [db076b5db550aa34169dceee81d0974c7b2a2482] mm/gup: remove record_subpages() git bisect bad db076b5db550aa34169dceee81d0974c7b2a2482 # test job: [891d0b3189945a5c37ce92c4e5337ec2c17b6378] https://lava.sirena.org.uk/scheduler/job/1782916 # good: [891d0b3189945a5c37ce92c4e5337ec2c17b6378] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() git bisect good 891d0b3189945a5c37ce92c4e5337ec2c17b6378 # test job: [21999f6315d786cbd21d5b2d0ad56f3f6125279f] https://lava.sirena.org.uk/scheduler/job/1783020 # good: 
[21999f6315d786cbd21d5b2d0ad56f3f6125279f] mm/gup: drop nth_page() usage within folio when recording subpages git bisect good 21999f6315d786cbd21d5b2d0ad56f3f6125279f # first bad commit: [db076b5db550aa34169dceee81d0974c7b2a2482] mm/gup: remove record_subpages() -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From david at redhat.com Mon Sep 8 08:22:24 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 8 Sep 2025 17:22:24 +0200 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> Message-ID: <83d3ef61-abc7-458d-b6ea-20094eeff6cd@redhat.com> On 08.09.25 17:16, Mark Brown wrote: > On Mon, Sep 01, 2025 at 05:03:40PM +0200, David Hildenbrand wrote: >> We can just cleanup the code by calculating the #refs earlier, >> so we can just inline what remains of record_subpages(). >> >> Calculate the number of references/pages ahead of times, and record them >> only once all our tests passed. > > I'm seeing failures in kselftest-mm in -next on at least Raspberry Pi 4 > and Orion O6 which bisect to this patch. I'm seeing a NULL pointer > dereference during the GUP test (which isn't actually doing anything as > I'm just using a standard defconfig rather than one with the mm > fragment): On which -next label are you on? next-20250908 should no longer have that commit. 
-- Cheers David / dhildenb From wangruikang at iscas.ac.cn Mon Sep 8 08:25:53 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Mon, 8 Sep 2025 23:25:53 +0800 Subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> Message-ID: On 9/8/25 20:34, Vivian Wang wrote: > [...] > +static u64 emac_get_stat_tx_dropped(struct emac_priv *priv) > +{ > + u64 result; Well, this should be result = 0. That was careless on my part. Will fix in v11. I need to start using this clang thing... Vivian "dramforever" Wang > + int cpu; > + > + for_each_possible_cpu(cpu) { > + result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu)); > + } > + > + return result; > +} From broonie at kernel.org Mon Sep 8 08:28:28 2025 From: broonie at kernel.org (Mark Brown) Date: Mon, 8 Sep 2025 16:28:28 +0100 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <83d3ef61-abc7-458d-b6ea-20094eeff6cd@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <83d3ef61-abc7-458d-b6ea-20094eeff6cd@redhat.com> Message-ID: On Mon, Sep 08, 2025 at 05:22:24PM +0200, David Hildenbrand wrote: > On 08.09.25 17:16, Mark Brown wrote: > > I'm seeing failures in kselftest-mm in -next on at least Raspberry Pi 4 > > and Orion O6 which bisect to this patch. I'm seeing a NULL pointer > On which -next label are you on? next-20250908 should no longer have that > commit. Ah, sorry - it was Friday's -next but I only saw the report this morning. Sorry for the noise. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From Jason at zx2c4.com Mon Sep 8 09:47:08 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Mon, 8 Sep 2025 18:47:08 +0200 Subject: [PATCH RFC 05/35] wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config In-Reply-To: <20250821200701.1329277-6-david@redhat.com> References: <20250821200701.1329277-1-david@redhat.com> <20250821200701.1329277-6-david@redhat.com> Message-ID: Applied, thanks. From Jason at zx2c4.com Mon Sep 8 09:48:04 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Mon, 8 Sep 2025 18:48:04 +0200 Subject: [PATCH v2 05/37] wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config In-Reply-To: <20250901150359.867252-6-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-6-david@redhat.com> Message-ID: Applied this one, actually. Thank you. From bmasney at redhat.com Mon Sep 8 10:11:57 2025 From: bmasney at redhat.com (Brian Masney) Date: Mon, 8 Sep 2025 13:11:57 -0400 Subject: [PATCH 000/114] clk: convert drivers from deprecated round_rate() to determine_rate() In-Reply-To: <20250811-clk-for-stephen-round-rate-v1-0-b3bf97b038dc@redhat.com> References: <20250811-clk-for-stephen-round-rate-v1-0-b3bf97b038dc@redhat.com> Message-ID: On Mon, Aug 11, 2025 at 11:17:52AM -0400, Brian Masney wrote: > The round_rate() clk ops is deprecated in the clk framework in favor > of the determine_rate() clk ops, so let's go ahead and convert the > various clk drivers using the Coccinelle semantic patch posted below. > I did a few minor cosmetic cleanups of the code in a few cases. 
I included a subset of these patches in this pull request to Stephen: https://lore.kernel.org/linux-clk/aL8MXYrR5uoBa4cB at x1/T/#u Brian From jhubbard at nvidia.com Mon Sep 8 10:12:00 2025 From: jhubbard at nvidia.com (John Hubbard) Date: Mon, 8 Sep 2025 10:12:00 -0700 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <7ee0b58a-8fe4-46fe-bfef-f04f900f3040@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <016307ba-427d-4646-8e4d-1ffefd2c1968@nvidia.com> <85e760cf-b994-40db-8d13-221feee55c60@redhat.com> <727cabec-5ee8-4793-926b-8d78febcd623@lucifer.local> <7ee0b58a-8fe4-46fe-bfef-f04f900f3040@redhat.com> Message-ID: On 9/8/25 5:53 AM, David Hildenbrand wrote: > On 08.09.25 14:25, Lorenzo Stoakes wrote: >> On Sat, Sep 06, 2025 at 08:56:48AM +0200, David Hildenbrand wrote: >>> On 06.09.25 03:05, John Hubbard wrote: ... >>> Roughly, what I am thinking (limiting it to pte+pmd case) about is >>> the following: >> >> I cannot get the below to apply even with the original patch here >> applied + fix. >> >> It looks like (in mm-new :) commit e73f43a66d5f ("mm/gup: remove dead >> pgmap >> refcounting code") by Alastair has conflicted here, but even then I >> can't make >> it apply, with/without your fix...! I eventually resorted to telling the local AI to read the diffs and apply them on top of the nth_page series locally. :) Attaching the resulting patch, which worked well enough to at least see the proposal clearly. > > To be clear: it was never intended to be applied, because it wouldn't > even compile in the current form. > > It was based on this nth_page submission + fix. > > thanks, -- John Hubbard -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-David-Hildenbrand-s-fix-for-record_subpages.patch Type: text/x-patch Size: 5603 bytes Desc: not available URL:
From cleger at rivosinc.com Mon Sep 8 11:17:02 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 8 Sep 2025 18:17:02 +0000 Subject: [PATCH v7 0/5] riscv: add support for SBI Supervisor Software Events Message-ID: <20250908181717.1997461-1-cleger@rivosinc.com>

The SBI Supervisor Software Events (SSE) extension provides a mechanism to inject software events from an SBI implementation into supervisor software such that they preempt all other supervisor-level traps and interrupts. This extension is introduced by the SBI v3.0 specification [1].

Various events are defined and can be sent asynchronously to supervisor software (RAS, PMU, DEBUG, asynchronous page fault) from the SBI, as well as platform-specific events. Events can be either local (per-hart) or global. Events can be nested on top of each other based on priority and can interrupt the kernel at any time.

The first patch adds the SSE definitions. The second one adds support for SSE at arch level (entry code and stack allocations) and the third one at driver level. Finally, the last patch adds support for SSE events in the SBI PMU driver. Additional testing for that part is very welcome, since there are a lot of possible paths that need to be exercised.

Among the specific points that need to be handled is interruption at any point of kernel execution, and more specifically at the beginning of exception handling. Because the exception entry implementation uses the SCRATCH CSR both to hold the current task struct and as the temporary register to switch the stack and save registers, it is difficult to reliably get the current task struct if we get interrupted at this specific moment (i.e., it might contain 0, the task pointer or tp).
A fixup-like mechanism is not possible due to the nested nature of SSE, which makes it really hard to obtain the original interruption site. In order to retrieve the task in a reliable manner, add an additional __sse_entry_task per_cpu array which stores the current task. Ideally, we would need to modify the way we retrieve/store the current task in exception handling so that it does not depend on the place where it's interrupted.

Contrary to pseudo NMI [2], SSE does not modify the way interrupts are handled and does not add any overhead to existing code. Moreover, it provides "true" NMI-like interrupts which can interrupt the kernel at any time (even in exception handling). This is particularly crucial for RAS errors, which need to be handled as fast as possible to avoid any fault propagation.

A test suite is available as a separate kselftest module. In order to build it, you can use the following command:

$ KDIR= make O=build TARGETS="riscv/sse" -j $(($(nproc)-1)) -C tools/testing/selftests

Then load the module using:

$ sh run_sse_test.sh

A KVM SBI SSE extension implementation is available at [2].

Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v3.0-rc7/riscv-sbi.pdf [1]
Link: https://github.com/rivosinc/linux/tree/dev/cleger/sse_kvm [2]

---

Changes in v7:
- Check return values of sse_on_each_cpu()
- Fix typos in commit
- Rename SBI_SSE_EVENT_SIGNAL to SBI_SSE_EVENT_INJECT
- Rename SBI_SSE_EVENT_HART_UNMASK/MASK to SBI_SSE_HART_UNMASK/MASK
- Add tlb flush for vmap stack to avoid taking exception during sse handler upon stack access. (Alex)
- Move some assembly instructions to slow path
- Renamed sse.c to sbi_sse.c, ditto for other files
- Renamed RISCV_SSE to RISCV_SBI_SSE
- Renamed sse_event_handler to sse_event_handler_fn
- Put ifdef around sse_evt in PMU SBI driver

Changes in v6:
- Fix comment in assembly argument
- Check hart id to be the expected one in order to skip CPU id matching in sse assembly.

Changes in v5:
- Added a SSE test module in kselftests
- Removed an unused variable
- Applied checkpatch.pl --strict and fixed all errors
- Use scope_guard(cpus_read_lock) instead of manual cpus_read_lock()
- Fix wrong variable returned in sse_get_event
- Remove useless init of events list
- Remove useless empty for loop on cpus
- Set sse_available as __ro_after_init
- Changed a few pr_debug to pr_warn
- Fix event enabled state being updated in case of failure
- Change no_lock to nolock
- Rename attr_buf to attr
- Renamed sse_get_event_phys() to sse_event_get_attr_phys() and removed the second argument
- Simplify return value in sse_event_attr_set_nolock()
- Remove while loop(-EINVAL) for event cpu set call
- Renamed interrupted_state_phys to interrupted_phys
- Use scoped_guards/guard for sse_mutex
- Remove useless struct forward declaration in sse.h
- Add more explanations as to why we set SIE bit in IP
- Unconditionally set SIE in SIP
- Move SSE_STACK_SIZE adjustment in sse_stack_alloc/free()
- Replace move instructions with mv
- Rename NR_CPUS asm symbol to ASM_NR_CPUS
- Restore SSTATUS first in sse_entry return path so that it works for double trap without any modification later.
- Implement proper per cpu revert if enable/register fails

Changes in v4:
- Fix a bug when using per_cpu ptr for local event (Andrew)
- Add sse_event_disable/enable_local()
- Add pmu_disable/pmu_enable() to disable/enable SSE event
- Update event ID description according to the latest spec
- Fix comment about arguments in handle_sse()
- Added Himanchu as a SSE reviewer
- Used SYM_DATA_*() macros instead of hardcoded labels
- Invoke softirqs only if not returning to kernel with irqs disabled
- Remove invalid state check for write attribute function.
- Remove useless bneq statement in sse_entry.S

Changes in v3:
- Split arch/driver support
- Fix potential register failure reporting
- Set a few pr_err as pr_debug
- Allow CONFIG_RISCV_SSE to be disabled
- Fix build without CONFIG_RISCV_SSE
- Remove fixup-like mechanism and use a per-cpu array
- Fixed SSCRATCH being corrupted when interrupting the kernel in early exception path.
- Split SSE assembly from entry.S
- Add Himanchu SSE mask/unmask and runtime PM support.
- Disable user memory access/floating point/vector in SSE handler
- Rebased on master

v2: https://lore.kernel.org/linux-riscv/20240112111720.2975069-1-cleger at rivosinc.com/

Changes in v2:
- Implemented specification v2
- Fix various error handling cases
- Added shadow stack support

v1: https://lore.kernel.org/linux-riscv/20231026143122.279437-1-cleger at rivosinc.com/

Clément Léger (5):
  riscv: add SBI SSE extension definitions
  riscv: add support for SBI Supervisor Software Events extension
  drivers: firmware: add riscv SSE support
  perf: RISC-V: add support for SSE event
  selftests/riscv: add SSE test module

 MAINTAINERS | 15 +
 arch/riscv/include/asm/asm.h | 14 +-
 arch/riscv/include/asm/sbi.h | 61 ++
 arch/riscv/include/asm/scs.h | 7 +
 arch/riscv/include/asm/sse.h | 47 ++
 arch/riscv/include/asm/switch_to.h | 14 +
 arch/riscv/include/asm/thread_info.h | 1 +
 arch/riscv/kernel/Makefile | 1 +
 arch/riscv/kernel/asm-offsets.c | 14 +
 arch/riscv/kernel/sbi_sse.c | 174 +++++
 arch/riscv/kernel/sbi_sse_entry.S | 178 +++++
 drivers/firmware/Kconfig | 1 +
 drivers/firmware/Makefile | 1 +
 drivers/firmware/riscv/Kconfig | 15 +
 drivers/firmware/riscv/Makefile | 3 +
 drivers/firmware/riscv/riscv_sbi_sse.c | 701 ++++++++++++++++++
 drivers/perf/Kconfig | 10 +
 drivers/perf/riscv_pmu.c | 23 +
 drivers/perf/riscv_pmu_sbi.c | 71 +-
 include/linux/perf/riscv_pmu.h | 5 +
 include/linux/riscv_sbi_sse.h | 57 ++
 tools/testing/selftests/riscv/Makefile | 2 +-
 tools/testing/selftests/riscv/sse/Makefile | 5 +
 .../selftests/riscv/sse/module/Makefile | 16 +
 .../riscv/sse/module/riscv_sse_test.c | 513 +++++++++++++
 .../selftests/riscv/sse/run_sse_test.sh | 44 ++
 26 files changed, 1979 insertions(+), 14 deletions(-)
 create mode 100644 arch/riscv/include/asm/sse.h
 create mode 100644 arch/riscv/kernel/sbi_sse.c
 create mode 100644 arch/riscv/kernel/sbi_sse_entry.S
 create mode 100644 drivers/firmware/riscv/Kconfig
 create mode 100644 drivers/firmware/riscv/Makefile
 create mode 100644 drivers/firmware/riscv/riscv_sbi_sse.c
 create mode 100644 include/linux/riscv_sbi_sse.h
 create mode 100644 tools/testing/selftests/riscv/sse/Makefile
 create mode 100644 tools/testing/selftests/riscv/sse/module/Makefile
 create mode 100644 tools/testing/selftests/riscv/sse/module/riscv_sse_test.c
 create mode 100644 tools/testing/selftests/riscv/sse/run_sse_test.sh

-- 
2.43.0

From cleger at rivosinc.com Mon Sep 8 11:17:03 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 8 Sep 2025 18:17:03 +0000 Subject: [PATCH v7 1/5] riscv: add SBI SSE extension definitions In-Reply-To: <20250908181717.1997461-1-cleger@rivosinc.com> References: <20250908181717.1997461-1-cleger@rivosinc.com> Message-ID: <20250908181717.1997461-2-cleger@rivosinc.com>

Add needed definitions for SBI Supervisor Software Events extension [1]. This extension enables the SBI to inject events into supervisor software much like ARM SDEI.

[1] https://lists.riscv.org/g/tech-prs/message/515

Signed-off-by: Clément Léger
---
 arch/riscv/include/asm/sbi.h | 61 ++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 341e74238aa0..874cc1d7603a 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -36,6 +36,7 @@ enum sbi_ext_id {
 	SBI_EXT_STA = 0x535441,
 	SBI_EXT_NACL = 0x4E41434C,
 	SBI_EXT_FWFT = 0x46574654,
+	SBI_EXT_SSE = 0x535345,
 
 	/* Experimentals extensions must lie within this range */
 	SBI_EXT_EXPERIMENTAL_START = 0x08000000,
@@ -430,6 +431,66 @@ enum sbi_fwft_feature_t {
 #define SBI_FWFT_SET_FLAG_LOCK BIT(0)
 
+enum sbi_ext_sse_fid {
+	SBI_SSE_EVENT_ATTR_READ = 0,
+	SBI_SSE_EVENT_ATTR_WRITE,
+	SBI_SSE_EVENT_REGISTER,
+	SBI_SSE_EVENT_UNREGISTER,
+	SBI_SSE_EVENT_ENABLE,
+	SBI_SSE_EVENT_DISABLE,
+	SBI_SSE_EVENT_COMPLETE,
+	SBI_SSE_EVENT_INJECT,
+	SBI_SSE_HART_UNMASK,
+	SBI_SSE_HART_MASK,
+};
+
+enum sbi_sse_state {
+	SBI_SSE_STATE_UNUSED = 0,
+	SBI_SSE_STATE_REGISTERED = 1,
+	SBI_SSE_STATE_ENABLED = 2,
+	SBI_SSE_STATE_RUNNING = 3,
+};
+
+/* SBI SSE Event Attributes.
*/
+enum sbi_sse_attr_id {
+	SBI_SSE_ATTR_STATUS = 0x00000000,
+	SBI_SSE_ATTR_PRIO = 0x00000001,
+	SBI_SSE_ATTR_CONFIG = 0x00000002,
+	SBI_SSE_ATTR_PREFERRED_HART = 0x00000003,
+	SBI_SSE_ATTR_ENTRY_PC = 0x00000004,
+	SBI_SSE_ATTR_ENTRY_ARG = 0x00000005,
+	SBI_SSE_ATTR_INTERRUPTED_SEPC = 0x00000006,
+	SBI_SSE_ATTR_INTERRUPTED_FLAGS = 0x00000007,
+	SBI_SSE_ATTR_INTERRUPTED_A6 = 0x00000008,
+	SBI_SSE_ATTR_INTERRUPTED_A7 = 0x00000009,
+
+	SBI_SSE_ATTR_MAX = 0x0000000A
+};
+
+#define SBI_SSE_ATTR_STATUS_STATE_OFFSET 0
+#define SBI_SSE_ATTR_STATUS_STATE_MASK 0x3
+#define SBI_SSE_ATTR_STATUS_PENDING_OFFSET 2
+#define SBI_SSE_ATTR_STATUS_INJECT_OFFSET 3
+
+#define SBI_SSE_ATTR_CONFIG_ONESHOT BIT(0)
+
+#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP BIT(0)
+#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE BIT(1)
+#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV BIT(2)
+#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP BIT(3)
+
+#define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS 0x00000000
+#define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP 0x00000001
+#define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS 0x00008000
+#define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW 0x00010000
+#define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS 0x00100000
+#define SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS 0x00108000
+#define SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED 0xffff0000
+#define SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED 0xffff8000
+
+#define SBI_SSE_EVENT_PLATFORM BIT(14)
+#define SBI_SSE_EVENT_GLOBAL BIT(15)
+
 /* SBI spec version fields */
 #define SBI_SPEC_VERSION_DEFAULT 0x1
 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24
-- 
2.43.0

From cleger at rivosinc.com Mon Sep 8 11:17:04 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 8 Sep 2025 18:17:04 +0000 Subject: [PATCH v7 2/5] riscv: add support for SBI Supervisor Software Events extension In-Reply-To: <20250908181717.1997461-1-cleger@rivosinc.com> References: <20250908181717.1997461-1-cleger@rivosinc.com> Message-ID:
<20250908181717.1997461-3-cleger@rivosinc.com> The SBI SSE extension allows the supervisor software to be notified by the SBI of specific events that are not maskable. The context switch is handled partially by the firmware, which saves registers a6 and a7. When entering the kernel, we can rely on these two registers to set up the stack and save all the registers. Since SSE events can be delivered at any time to the kernel (including during exception handling), we need a way to locate the current_task for context tracking. On RISC-V, it is stored in SSCRATCH when in user space or in tp when in kernel space (in which case SSCRATCH is zero). But at the beginning of exception handling, SSCRATCH is used to swap tp and check the origin of the exception. If interrupted at that point, there is no way to reliably know where the current task_struct is located. Even checking the interruption location won't work, as SSE events can be nested on top of each other, so the original interruption site might be lost at some point. In order to retrieve it reliably, store the current task in an additional __sbi_sse_entry_task per_cpu array. This array is then used to retrieve the current task based on the hart ID that is passed to the SSE event handler in a6. That being said, the way the current task struct is stored should probably be reworked to find a more reliable alternative. Since each event (and each CPU for local events) has its own context and events can preempt each other, allocate a stack (and a shadow stack if needed) for each of them (and for each CPU for local events). When completing the event, if we were coming from the kernel with interrupts disabled, simply return there. If coming from userspace or from the kernel with interrupts enabled, simulate an interrupt exception by setting IE_SIE in CSR_IP to allow delivery of signals to the user task. For instance, this can happen when a RAS event has been generated by a user application and a SIGBUS has been sent to a task.
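The task lookup described above can be sketched in plain C. This is a user-space illustration only, not the code from this patch (the real entry path does the lookup in assembly in sbi_sse_entry.S). The names NR_CPUS_SKETCH, struct task, cpuid_to_hartid, entry_task, switch_entry_task() and sse_find_task() are hypothetical stand-ins for the hart ID map and the per-CPU __sbi_sse_entry_task variable.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

struct task { int pid; };	/* stand-in for struct task_struct */

/* Hypothetical stand-ins for the cpuid-to-hartid map and the per-CPU
 * entry task variable. */
static unsigned long cpuid_to_hartid[NR_CPUS_SKETCH] = { 10, 11, 12, 13 };
static struct task *entry_task[NR_CPUS_SKETCH];

/* Done on every context switch: record the task now running on this CPU. */
static void switch_entry_task(unsigned int cpu, struct task *next)
{
	entry_task[cpu] = next;
}

/*
 * Locate the current task from the hart ID handed to the SSE handler.
 * cached_cpu and cached_hart mirror the cpu_id/hart_id pair cached in the
 * event data: if the cached hart ID matches, use the cached CPU directly,
 * otherwise walk the map to find the CPU owning that hart ID.
 */
static struct task *sse_find_task(unsigned long hartid,
				  unsigned int cached_cpu,
				  unsigned long cached_hart)
{
	unsigned int cpu;

	if (cached_hart == hartid)
		return entry_task[cached_cpu];

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (cpuid_to_hartid[cpu] == hartid)
			return entry_task[cpu];
	}

	return NULL;	/* no CPU matches; the patch panics in this case */
}
```

The point of the sketch is that the lookup depends only on the hart ID and on data written at context-switch time, so it does not rely on tp or SSCRATCH being in a known state when the event is delivered.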
Signed-off-by: Clément Léger --- arch/riscv/include/asm/asm.h | 14 ++- arch/riscv/include/asm/scs.h | 7 ++ arch/riscv/include/asm/sse.h | 47 +++++++ arch/riscv/include/asm/switch_to.h | 14 +++ arch/riscv/include/asm/thread_info.h | 1 + arch/riscv/kernel/Makefile | 1 + arch/riscv/kernel/asm-offsets.c | 14 +++ arch/riscv/kernel/sbi_sse.c | 174 ++++++++++++++++++++++++++ arch/riscv/kernel/sbi_sse_entry.S | 178 +++++++++++++++++++++++++++ 9 files changed, 447 insertions(+), 3 deletions(-) create mode 100644 arch/riscv/include/asm/sse.h create mode 100644 arch/riscv/kernel/sbi_sse.c create mode 100644 arch/riscv/kernel/sbi_sse_entry.S diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h index 2a16e88e13de..416dddd37d67 100644 --- a/arch/riscv/include/asm/asm.h +++ b/arch/riscv/include/asm/asm.h @@ -90,16 +90,24 @@ #define PER_CPU_OFFSET_SHIFT 3 #endif -.macro asm_per_cpu dst sym tmp - lw \tmp, TASK_TI_CPU_NUM(tp) - slli \tmp, \tmp, PER_CPU_OFFSET_SHIFT +.macro asm_per_cpu_with_cpu dst sym tmp cpu + slli \tmp, \cpu, PER_CPU_OFFSET_SHIFT la \dst, __per_cpu_offset add \dst, \dst, \tmp REG_L \tmp, 0(\dst) la \dst, \sym add \dst, \dst, \tmp .endm + +.macro asm_per_cpu dst sym tmp + lw \tmp, TASK_TI_CPU_NUM(tp) + asm_per_cpu_with_cpu \dst \sym \tmp \tmp +.endm #else /* CONFIG_SMP */ +.macro asm_per_cpu_with_cpu dst sym tmp cpu + la \dst, \sym +.endm + .macro asm_per_cpu dst sym tmp la \dst, \sym .endm diff --git a/arch/riscv/include/asm/scs.h b/arch/riscv/include/asm/scs.h index 0e45db78b24b..62344daad73d 100644 --- a/arch/riscv/include/asm/scs.h +++ b/arch/riscv/include/asm/scs.h @@ -18,6 +18,11 @@ load_per_cpu gp, irq_shadow_call_stack_ptr, \tmp .endm +/* Load the per-CPU IRQ shadow call stack to gp. */ +.macro scs_load_sse_stack reg_evt + REG_L gp, SSE_REG_EVT_SHADOW_STACK(\reg_evt) +.endm + /* Load task_scs_sp(current) to gp.
*/ .macro scs_load_current REG_L gp, TASK_TI_SCS_SP(tp) @@ -41,6 +46,8 @@ .endm .macro scs_load_irq_stack tmp .endm +.macro scs_load_sse_stack reg_evt +.endm .macro scs_load_current .endm .macro scs_load_current_if_task_changed prev diff --git a/arch/riscv/include/asm/sse.h b/arch/riscv/include/asm/sse.h new file mode 100644 index 000000000000..d3ce8c2b5221 --- /dev/null +++ b/arch/riscv/include/asm/sse.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2024 Rivos Inc. + */ +#ifndef __ASM_SSE_H +#define __ASM_SSE_H + +#include + +#ifdef CONFIG_RISCV_SBI_SSE + +struct sse_event_interrupted_state { + unsigned long a6; + unsigned long a7; +}; + +struct sse_event_arch_data { + void *stack; + void *shadow_stack; + unsigned long tmp; + struct sse_event_interrupted_state interrupted; + unsigned long interrupted_phys; + u32 evt_id; + unsigned int hart_id; + unsigned int cpu_id; +}; + +static inline bool sse_event_is_global(u32 evt) +{ + return !!(evt & SBI_SSE_EVENT_GLOBAL); +} + +void arch_sse_event_update_cpu(struct sse_event_arch_data *arch_evt, int cpu); +int arch_sse_init_event(struct sse_event_arch_data *arch_evt, u32 evt_id, + int cpu); +void arch_sse_free_event(struct sse_event_arch_data *arch_evt); +int arch_sse_register_event(struct sse_event_arch_data *arch_evt); + +void sse_handle_event(struct sse_event_arch_data *arch_evt, + struct pt_regs *regs); +asmlinkage void handle_sse(void); +asmlinkage void do_sse(struct sse_event_arch_data *arch_evt, + struct pt_regs *reg); + +#endif + +#endif diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h index 0e71eb82f920..70e68e630216 100644 --- a/arch/riscv/include/asm/switch_to.h +++ b/arch/riscv/include/asm/switch_to.h @@ -88,6 +88,19 @@ static inline void __switch_to_envcfg(struct task_struct *next) :: "r" (next->thread.envcfg) : "memory"); } +#ifdef CONFIG_RISCV_SBI_SSE +DECLARE_PER_CPU(struct task_struct *, __sbi_sse_entry_task); + +static inline void 
__switch_sbi_sse_entry_task(struct task_struct *next) +{ + __this_cpu_write(__sbi_sse_entry_task, next); +} +#else +static inline void __switch_sbi_sse_entry_task(struct task_struct *next) +{ +} +#endif + extern struct task_struct *__switch_to(struct task_struct *, struct task_struct *); @@ -122,6 +135,7 @@ do { \ if (switch_to_should_flush_icache(__next)) \ local_flush_icache_all(); \ __switch_to_envcfg(__next); \ + __switch_sbi_sse_entry_task(__next); \ ((last) = __switch_to(__prev, __next)); \ } while (0) diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index f5916a70879a..28e9805e61fc 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -36,6 +36,7 @@ #define OVERFLOW_STACK_SIZE SZ_4K #define IRQ_STACK_SIZE THREAD_SIZE +#define SSE_STACK_SIZE THREAD_SIZE #ifndef __ASSEMBLY__ diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index c7b542573407..16637e01a6b3 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -99,6 +99,7 @@ obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o obj-$(CONFIG_RISCV_SBI) += sbi.o sbi_ecall.o +obj-$(CONFIG_RISCV_SBI_SSE) += sbi_sse.o sbi_sse_entry.o ifeq ($(CONFIG_RISCV_SBI), y) obj-$(CONFIG_SMP) += sbi-ipi.o obj-$(CONFIG_SMP) += cpu_ops_sbi.o diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index 6e8c0d6feae9..1b0d8624ef6e 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -14,6 +14,8 @@ #include #include #include +#include +#include #include void asm_offsets(void); @@ -528,4 +530,16 @@ void asm_offsets(void) DEFINE(FREGS_A6, offsetof(struct __arch_ftrace_regs, a6)); DEFINE(FREGS_A7, offsetof(struct __arch_ftrace_regs, a7)); #endif + +#ifdef CONFIG_RISCV_SBI_SSE + OFFSET(SSE_REG_EVT_STACK, sse_event_arch_data, stack); + OFFSET(SSE_REG_EVT_SHADOW_STACK, 
sse_event_arch_data, shadow_stack); + OFFSET(SSE_REG_EVT_TMP, sse_event_arch_data, tmp); + OFFSET(SSE_REG_HART_ID, sse_event_arch_data, hart_id); + OFFSET(SSE_REG_CPU_ID, sse_event_arch_data, cpu_id); + + DEFINE(SBI_EXT_SSE, SBI_EXT_SSE); + DEFINE(SBI_SSE_EVENT_COMPLETE, SBI_SSE_EVENT_COMPLETE); + DEFINE(ASM_NR_CPUS, NR_CPUS); +#endif } diff --git a/arch/riscv/kernel/sbi_sse.c b/arch/riscv/kernel/sbi_sse.c new file mode 100644 index 000000000000..626912a0927d --- /dev/null +++ b/arch/riscv/kernel/sbi_sse.c @@ -0,0 +1,174 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2024 Rivos Inc. + */ +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +DEFINE_PER_CPU(struct task_struct *, __sbi_sse_entry_task); + +void __weak sse_handle_event(struct sse_event_arch_data *arch_evt, struct pt_regs *regs) +{ +} + +void do_sse(struct sse_event_arch_data *arch_evt, struct pt_regs *regs) +{ + nmi_enter(); + + /* Retrieve missing GPRs from SBI */ + sbi_ecall(SBI_EXT_SSE, SBI_SSE_EVENT_ATTR_READ, arch_evt->evt_id, + SBI_SSE_ATTR_INTERRUPTED_A6, + (SBI_SSE_ATTR_INTERRUPTED_A7 - SBI_SSE_ATTR_INTERRUPTED_A6) + 1, + arch_evt->interrupted_phys, 0, 0); + + memcpy(&regs->a6, &arch_evt->interrupted, sizeof(arch_evt->interrupted)); + + sse_handle_event(arch_evt, regs); + + /* + * The SSE delivery path does not uses the "standard" exception path + * (see sse_entry.S) and does not process any pending signal/softirqs + * due to being similar to a NMI. + * Some drivers (PMU, RAS) enqueue pending work that needs to be handled + * as soon as possible by bottom halves. For that purpose, set the SIP + * software interrupt pending bit which will force a software interrupt + * to be serviced once interrupts are reenabled in the interrupted + * context if they were masked or directly if unmasked. + */ + csr_set(CSR_IP, IE_SIE); + + nmi_exit(); +} + +static void *alloc_to_stack_pointer(void *alloc) +{ + return alloc ? 
alloc + SSE_STACK_SIZE : NULL; +} + +static void *stack_pointer_to_alloc(void *stack) +{ + return stack - SSE_STACK_SIZE; +} + +#ifdef CONFIG_VMAP_STACK +static void *sse_stack_alloc(unsigned int cpu) +{ + void *stack = arch_alloc_vmap_stack(SSE_STACK_SIZE, cpu_to_node(cpu)); + + return alloc_to_stack_pointer(stack); +} + +static void sse_stack_free(void *stack) +{ + vfree(stack_pointer_to_alloc(stack)); +} + +static void arch_sse_stack_cpu_sync(struct sse_event_arch_data *arch_evt) +{ + void *p_stack = arch_evt->stack; + unsigned long stack = (unsigned long) stack_pointer_to_alloc(p_stack); + unsigned long stack_end = stack + SSE_STACK_SIZE; + + /* + * Flush the tlb to avoid taking any exception when accessing the + * vmapped stack inside the SSE handler + */ + if (sse_event_is_global(arch_evt->evt_id)) + flush_tlb_kernel_range(stack, stack_end); + else + local_flush_tlb_kernel_range(stack, (unsigned long) stack_end); +} +#else /* CONFIG_VMAP_STACK */ +static void *sse_stack_alloc(unsigned int cpu) +{ + void *stack = kmalloc(SSE_STACK_SIZE, GFP_KERNEL); + + return alloc_to_stack_pointer(stack); +} + +static void sse_stack_free(void *stack) +{ + kfree(stack_pointer_to_alloc(stack)); +} + +static void arch_sse_stack_cpu_sync(struct sse_event_arch_data *arch_evt) {} +#endif /* CONFIG_VMAP_STACK */ + +static int sse_init_scs(int cpu, struct sse_event_arch_data *arch_evt) +{ + void *stack; + + if (!scs_is_enabled()) + return 0; + + stack = scs_alloc(cpu_to_node(cpu)); + if (!stack) + return -ENOMEM; + + arch_evt->shadow_stack = stack; + + return 0; +} + +void arch_sse_event_update_cpu(struct sse_event_arch_data *arch_evt, int cpu) +{ + arch_evt->cpu_id = cpu; + arch_evt->hart_id = cpuid_to_hartid_map(cpu); +} + +int arch_sse_init_event(struct sse_event_arch_data *arch_evt, u32 evt_id, + int cpu) +{ + void *stack; + + arch_evt->evt_id = evt_id; + stack = sse_stack_alloc(cpu); + if (!stack) + return -ENOMEM; + + arch_evt->stack = stack; + + if (sse_init_scs(cpu, 
arch_evt)) { + sse_stack_free(arch_evt->stack); + return -ENOMEM; + } + + if (sse_event_is_global(evt_id)) { + arch_evt->interrupted_phys = + virt_to_phys(&arch_evt->interrupted); + } else { + arch_evt->interrupted_phys = + per_cpu_ptr_to_phys(&arch_evt->interrupted); + } + + arch_sse_event_update_cpu(arch_evt, cpu); + + return 0; +} + +void arch_sse_free_event(struct sse_event_arch_data *arch_evt) +{ + scs_free(arch_evt->shadow_stack); + sse_stack_free(arch_evt->stack); +} + +int arch_sse_register_event(struct sse_event_arch_data *arch_evt) +{ + struct sbiret sret; + + arch_sse_stack_cpu_sync(arch_evt); + + sret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_EVENT_REGISTER, arch_evt->evt_id, + (unsigned long)handle_sse, (unsigned long)arch_evt, 0, + 0, 0); + + return sbi_err_map_linux_errno(sret.error); +} diff --git a/arch/riscv/kernel/sbi_sse_entry.S b/arch/riscv/kernel/sbi_sse_entry.S new file mode 100644 index 000000000000..612510b98445 --- /dev/null +++ b/arch/riscv/kernel/sbi_sse_entry.S @@ -0,0 +1,178 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2024 Rivos Inc. 
+ */ + +#include +#include + +#include +#include +#include + +/* When entering handle_sse, the following registers are set: + * a6: contains the hartid + * a7: contains a sse_event_arch_data struct pointer + */ +SYM_CODE_START(handle_sse) + /* Save stack temporarily */ + REG_S sp, SSE_REG_EVT_TMP(a7) + /* Set entry stack */ + REG_L sp, SSE_REG_EVT_STACK(a7) + + addi sp, sp, -(PT_SIZE_ON_STACK) + REG_S ra, PT_RA(sp) + REG_S s0, PT_S0(sp) + REG_S s1, PT_S1(sp) + REG_S s2, PT_S2(sp) + REG_S s3, PT_S3(sp) + REG_S s4, PT_S4(sp) + REG_S s5, PT_S5(sp) + REG_S s6, PT_S6(sp) + REG_S s7, PT_S7(sp) + REG_S s8, PT_S8(sp) + REG_S s9, PT_S9(sp) + REG_S s10, PT_S10(sp) + REG_S s11, PT_S11(sp) + REG_S tp, PT_TP(sp) + REG_S t0, PT_T0(sp) + REG_S t1, PT_T1(sp) + REG_S t2, PT_T2(sp) + REG_S t3, PT_T3(sp) + REG_S t4, PT_T4(sp) + REG_S t5, PT_T5(sp) + REG_S t6, PT_T6(sp) + REG_S gp, PT_GP(sp) + REG_S a0, PT_A0(sp) + REG_S a1, PT_A1(sp) + REG_S a2, PT_A2(sp) + REG_S a3, PT_A3(sp) + REG_S a4, PT_A4(sp) + REG_S a5, PT_A5(sp) + + /* Retrieve entry sp */ + REG_L a4, SSE_REG_EVT_TMP(a7) + /* Save CSRs */ + csrr a0, CSR_EPC + csrr a1, CSR_SSTATUS + csrr a2, CSR_STVAL + csrr a3, CSR_SCAUSE + + REG_S a0, PT_EPC(sp) + REG_S a1, PT_STATUS(sp) + REG_S a2, PT_BADADDR(sp) + REG_S a3, PT_CAUSE(sp) + REG_S a4, PT_SP(sp) + + /* Disable user memory access and floating/vector computing */ + li t0, SR_SUM | SR_FS_VS + csrc CSR_STATUS, t0 + + load_global_pointer + scs_load_sse_stack a7 + +#ifdef CONFIG_SMP + lw t4, SSE_REG_HART_ID(a7) + lw t3, SSE_REG_CPU_ID(a7) + + bne t4, a6, .Lfind_hart_id_slowpath + +.Lcpu_id_found: +#else + mv t3, zero +#endif + + asm_per_cpu_with_cpu t2 __sbi_sse_entry_task t1 t3 + REG_L tp, 0(t2) + + mv a1, sp /* pt_regs on stack */ + + /* + * Save sscratch for restoration since we might have interrupted the + * kernel in early exception path and thus, we don't know the content of + * sscratch. 
+ */ + csrrw s4, CSR_SSCRATCH, x0 + + mv a0, a7 + + call do_sse + + csrw CSR_SSCRATCH, s4 + + REG_L a0, PT_STATUS(sp) + REG_L a1, PT_EPC(sp) + REG_L a2, PT_BADADDR(sp) + REG_L a3, PT_CAUSE(sp) + csrw CSR_SSTATUS, a0 + csrw CSR_EPC, a1 + csrw CSR_STVAL, a2 + csrw CSR_SCAUSE, a3 + + REG_L ra, PT_RA(sp) + REG_L s0, PT_S0(sp) + REG_L s1, PT_S1(sp) + REG_L s2, PT_S2(sp) + REG_L s3, PT_S3(sp) + REG_L s4, PT_S4(sp) + REG_L s5, PT_S5(sp) + REG_L s6, PT_S6(sp) + REG_L s7, PT_S7(sp) + REG_L s8, PT_S8(sp) + REG_L s9, PT_S9(sp) + REG_L s10, PT_S10(sp) + REG_L s11, PT_S11(sp) + REG_L tp, PT_TP(sp) + REG_L t0, PT_T0(sp) + REG_L t1, PT_T1(sp) + REG_L t2, PT_T2(sp) + REG_L t3, PT_T3(sp) + REG_L t4, PT_T4(sp) + REG_L t5, PT_T5(sp) + REG_L t6, PT_T6(sp) + REG_L gp, PT_GP(sp) + REG_L a0, PT_A0(sp) + REG_L a1, PT_A1(sp) + REG_L a2, PT_A2(sp) + REG_L a3, PT_A3(sp) + REG_L a4, PT_A4(sp) + REG_L a5, PT_A5(sp) + + REG_L sp, PT_SP(sp) + + li a7, SBI_EXT_SSE + li a6, SBI_SSE_EVENT_COMPLETE + ecall + +#ifdef CONFIG_SMP +.Lfind_hart_id_slowpath: + + /* Restore current task struct from __sbi_sse_entry_task */ + li t1, ASM_NR_CPUS + /* Slowpath to find the CPU id associated to the hart id */ + la t0, __cpuid_to_hartid_map + +.Lhart_id_loop: + REG_L t2, 0(t0) + beq t2, a6, .Lcpu_id_found + + /* Increment pointer and CPU number */ + addi t3, t3, 1 + addi t0, t0, RISCV_SZPTR + bltu t3, t1, .Lhart_id_loop + + /* + * This should never happen since we expect the hart_id to match one + * of our CPU, but better be safe than sorry + */ + la tp, init_task + la a0, sse_hart_id_panic_string + la t0, panic + jalr t0 +#endif + +SYM_CODE_END(handle_sse) + +SYM_DATA_START_LOCAL(sse_hart_id_panic_string) + .ascii "Unable to match hart_id with cpu\0" +SYM_DATA_END(sse_hart_id_panic_string) -- 2.43.0 From cleger at rivosinc.com Mon Sep 8 11:17:05 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 8 Sep 2025 18:17:05 +0000 Subject: [PATCH v7 3/5] drivers: firmware: add riscv SSE 
support In-Reply-To: <20250908181717.1997461-1-cleger@rivosinc.com> References: <20250908181717.1997461-1-cleger@rivosinc.com> Message-ID: <20250908181717.1997461-4-cleger@rivosinc.com> Add a driver-level interface to use the RISC-V SSE arch support. This interface allows registering SSE handlers and receiving the corresponding events. It will be used by the PMU and GHES drivers. Co-developed-by: Himanshu Chauhan Signed-off-by: Himanshu Chauhan Signed-off-by: Clément Léger Acked-by: Conor Dooley --- MAINTAINERS | 15 + drivers/firmware/Kconfig | 1 + drivers/firmware/Makefile | 1 + drivers/firmware/riscv/Kconfig | 15 + drivers/firmware/riscv/Makefile | 3 + drivers/firmware/riscv/riscv_sbi_sse.c | 701 +++++++++++++++++++++++++ include/linux/riscv_sbi_sse.h | 57 ++ 7 files changed, 793 insertions(+) create mode 100644 drivers/firmware/riscv/Kconfig create mode 100644 drivers/firmware/riscv/Makefile create mode 100644 drivers/firmware/riscv/riscv_sbi_sse.c create mode 100644 include/linux/riscv_sbi_sse.h diff --git a/MAINTAINERS b/MAINTAINERS index fe168477caa4..684d23f852c3 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21648,6 +21648,13 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git F: Documentation/devicetree/bindings/iommu/riscv,iommu.yaml F: drivers/iommu/riscv/ +RISC-V FIRMWARE DRIVERS +M: Conor Dooley +L: linux-riscv at lists.infradead.org +S: Maintained +T: git git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git +F: drivers/firmware/riscv/* + RISC-V MICROCHIP FPGA SUPPORT M: Conor Dooley M: Daire McNamara @@ -21712,6 +21719,14 @@ F: arch/riscv/boot/dts/spacemit/ N: spacemit K: spacemit +RISC-V SSE DRIVER +M: Clément Léger +R: Himanshu Chauhan +L: linux-riscv at lists.infradead.org +S: Maintained +F: drivers/firmware/riscv/riscv_sse.c +F: include/linux/riscv_sse.h + RISC-V THEAD SoC SUPPORT M: Drew Fustini M: Guo Ren diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig index bbd2155d8483..1894df87b08e 100644 --- a/drivers/firmware/Kconfig +++ 
b/drivers/firmware/Kconfig @@ -294,6 +294,7 @@ source "drivers/firmware/meson/Kconfig" source "drivers/firmware/microchip/Kconfig" source "drivers/firmware/psci/Kconfig" source "drivers/firmware/qcom/Kconfig" +source "drivers/firmware/riscv/Kconfig" source "drivers/firmware/samsung/Kconfig" source "drivers/firmware/smccc/Kconfig" source "drivers/firmware/tegra/Kconfig" diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile index 4ddec2820c96..6cdd84570ea7 100644 --- a/drivers/firmware/Makefile +++ b/drivers/firmware/Makefile @@ -34,6 +34,7 @@ obj-y += efi/ obj-y += imx/ obj-y += psci/ obj-y += qcom/ +obj-y += riscv/ obj-y += samsung/ obj-y += smccc/ obj-y += tegra/ diff --git a/drivers/firmware/riscv/Kconfig b/drivers/firmware/riscv/Kconfig new file mode 100644 index 000000000000..ed5b663ac5f9 --- /dev/null +++ b/drivers/firmware/riscv/Kconfig @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: GPL-2.0-only +menu "Risc-V Specific firmware drivers" +depends on RISCV + +config RISCV_SBI_SSE + bool "Enable SBI Supervisor Software Events support" + depends on RISCV_SBI + default y + help + The Supervisor Software Events support allows the SBI to deliver + NMI-like notifications to the supervisor mode software. When enabled, + this option provides support to register callbacks on specific SSE + events. + +endmenu diff --git a/drivers/firmware/riscv/Makefile b/drivers/firmware/riscv/Makefile new file mode 100644 index 000000000000..c8795d4bbb2e --- /dev/null +++ b/drivers/firmware/riscv/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_RISCV_SBI_SSE) += riscv_sbi_sse.o diff --git a/drivers/firmware/riscv/riscv_sbi_sse.c b/drivers/firmware/riscv/riscv_sbi_sse.c new file mode 100644 index 000000000000..57b6dad92482 --- /dev/null +++ b/drivers/firmware/riscv/riscv_sbi_sse.c @@ -0,0 +1,701 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2024 Rivos Inc. 
+ */ + +#define pr_fmt(fmt) "sse: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +struct sse_event { + struct list_head list; + u32 evt_id; + u32 priority; + sse_event_handler_fn *handler; + void *handler_arg; + /* Only valid for global events */ + unsigned int cpu; + + union { + struct sse_registered_event *global; + struct sse_registered_event __percpu *local; + }; +}; + +static int sse_hp_state; +static bool sse_available __ro_after_init; +static DEFINE_SPINLOCK(events_list_lock); +static LIST_HEAD(events); +static DEFINE_MUTEX(sse_mutex); + +struct sse_registered_event { + struct sse_event_arch_data arch; + struct sse_event *event; + unsigned long attr; + bool is_enabled; +}; + +void sse_handle_event(struct sse_event_arch_data *arch_event, + struct pt_regs *regs) +{ + int ret; + struct sse_registered_event *reg_evt = + container_of(arch_event, struct sse_registered_event, arch); + struct sse_event *evt = reg_evt->event; + + ret = evt->handler(evt->evt_id, evt->handler_arg, regs); + if (ret) + pr_warn("event %x handler failed with error %d\n", evt->evt_id, ret); +} + +static struct sse_event *sse_event_get(u32 evt) +{ + struct sse_event *event = NULL; + + scoped_guard(spinlock, &events_list_lock) { + list_for_each_entry(event, &events, list) { + if (event->evt_id == evt) + return event; + } + } + + return NULL; +} + +static phys_addr_t sse_event_get_attr_phys(struct sse_registered_event *reg_evt) +{ + phys_addr_t phys; + void *addr = &reg_evt->attr; + + if (sse_event_is_global(reg_evt->event->evt_id)) + phys = virt_to_phys(addr); + else + phys = per_cpu_ptr_to_phys(addr); + + return phys; +} + +static struct sse_registered_event *sse_get_reg_evt(struct sse_event *event) +{ + if (sse_event_is_global(event->evt_id)) + return event->global; + else + return per_cpu_ptr(event->local, smp_processor_id()); +} + +static int sse_sbi_event_func(struct sse_event *event, unsigned long func) +{ + struct 
sbiret ret; + u32 evt = event->evt_id; + struct sse_registered_event *reg_evt = sse_get_reg_evt(event); + + ret = sbi_ecall(SBI_EXT_SSE, func, evt, 0, 0, 0, 0, 0); + if (ret.error) { + pr_warn("Failed to execute func %lx, event %x, error %ld\n", + func, evt, ret.error); + return sbi_err_map_linux_errno(ret.error); + } + + if (func == SBI_SSE_EVENT_DISABLE) + reg_evt->is_enabled = false; + else if (func == SBI_SSE_EVENT_ENABLE) + reg_evt->is_enabled = true; + + return 0; +} + +int sse_event_disable_local(struct sse_event *event) +{ + return sse_sbi_event_func(event, SBI_SSE_EVENT_DISABLE); +} +EXPORT_SYMBOL_GPL(sse_event_disable_local); + +int sse_event_enable_local(struct sse_event *event) +{ + return sse_sbi_event_func(event, SBI_SSE_EVENT_ENABLE); +} +EXPORT_SYMBOL_GPL(sse_event_enable_local); + +static int sse_event_attr_get_no_lock(struct sse_registered_event *reg_evt, + unsigned long attr_id, unsigned long *val) +{ + struct sbiret sret; + u32 evt = reg_evt->event->evt_id; + unsigned long phys; + + phys = sse_event_get_attr_phys(reg_evt); + + sret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_EVENT_ATTR_READ, evt, attr_id, 1, + phys, 0, 0); + if (sret.error) { + pr_debug("Failed to get event %x attr %lx, error %ld\n", evt, + attr_id, sret.error); + return sbi_err_map_linux_errno(sret.error); + } + + *val = reg_evt->attr; + + return 0; +} + +static int sse_event_attr_set_nolock(struct sse_registered_event *reg_evt, + unsigned long attr_id, unsigned long val) +{ + struct sbiret sret; + u32 evt = reg_evt->event->evt_id; + unsigned long phys; + + reg_evt->attr = val; + phys = sse_event_get_attr_phys(reg_evt); + + sret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_EVENT_ATTR_WRITE, evt, attr_id, 1, + phys, 0, 0); + if (sret.error) + pr_debug("Failed to set event %x attr %lx, error %ld\n", evt, + attr_id, sret.error); + + return sbi_err_map_linux_errno(sret.error); +} + +static void sse_global_event_update_cpu(struct sse_event *event, + unsigned int cpu) +{ + struct sse_registered_event 
*reg_evt = event->global; + + event->cpu = cpu; + arch_sse_event_update_cpu(&reg_evt->arch, cpu); +} + +static int sse_event_set_target_cpu_nolock(struct sse_event *event, + unsigned int cpu) +{ + unsigned long hart_id = cpuid_to_hartid_map(cpu); + struct sse_registered_event *reg_evt = event->global; + u32 evt = event->evt_id; + bool was_enabled; + int ret; + + if (!sse_event_is_global(evt)) + return -EINVAL; + + was_enabled = reg_evt->is_enabled; + if (was_enabled) + sse_event_disable_local(event); + + ret = sse_event_attr_set_nolock(reg_evt, SBI_SSE_ATTR_PREFERRED_HART, + hart_id); + if (ret == 0) + sse_global_event_update_cpu(event, cpu); + + if (was_enabled) + sse_event_enable_local(event); + + return 0; +} + +int sse_event_set_target_cpu(struct sse_event *event, unsigned int cpu) +{ + int ret; + + scoped_guard(mutex, &sse_mutex) { + scoped_guard(cpus_read_lock) { + if (!cpu_online(cpu)) + return -EINVAL; + + ret = sse_event_set_target_cpu_nolock(event, cpu); + } + } + + return ret; +} +EXPORT_SYMBOL_GPL(sse_event_set_target_cpu); + +static int sse_event_init_registered(unsigned int cpu, + struct sse_registered_event *reg_evt, + struct sse_event *event) +{ + reg_evt->event = event; + + return arch_sse_init_event(&reg_evt->arch, event->evt_id, cpu); +} + +static void sse_event_free_registered(struct sse_registered_event *reg_evt) +{ + arch_sse_free_event(&reg_evt->arch); +} + +static int sse_event_alloc_global(struct sse_event *event) +{ + int err; + struct sse_registered_event *reg_evt; + + reg_evt = kzalloc(sizeof(*reg_evt), GFP_KERNEL); + if (!reg_evt) + return -ENOMEM; + + event->global = reg_evt; + err = sse_event_init_registered(smp_processor_id(), reg_evt, event); + if (err) + kfree(reg_evt); + + return err; +} + +static int sse_event_alloc_local(struct sse_event *event) +{ + int err; + unsigned int cpu, err_cpu; + struct sse_registered_event *reg_evt; + struct sse_registered_event __percpu *reg_evts; + + reg_evts = alloc_percpu(struct sse_registered_event); + if 
(!reg_evts) + return -ENOMEM; + + event->local = reg_evts; + + for_each_possible_cpu(cpu) { + reg_evt = per_cpu_ptr(reg_evts, cpu); + err = sse_event_init_registered(cpu, reg_evt, event); + if (err) { + err_cpu = cpu; + goto err_free_per_cpu; + } + } + + return 0; + +err_free_per_cpu: + for_each_possible_cpu(cpu) { + if (cpu == err_cpu) + break; + reg_evt = per_cpu_ptr(reg_evts, cpu); + sse_event_free_registered(reg_evt); + } + + free_percpu(reg_evts); + + return err; +} + +static struct sse_event *sse_event_alloc(u32 evt, u32 priority, + sse_event_handler_fn *handler, void *arg) +{ + int err; + struct sse_event *event; + + event = kzalloc(sizeof(*event), GFP_KERNEL); + if (!event) + return ERR_PTR(-ENOMEM); + + event->evt_id = evt; + event->priority = priority; + event->handler_arg = arg; + event->handler = handler; + + if (sse_event_is_global(evt)) + err = sse_event_alloc_global(event); + else + err = sse_event_alloc_local(event); + + if (err) { + kfree(event); + return ERR_PTR(err); + } + + return event; +} + +static int sse_sbi_register_event(struct sse_event *event, + struct sse_registered_event *reg_evt) +{ + int ret; + + ret = sse_event_attr_set_nolock(reg_evt, SBI_SSE_ATTR_PRIO, + event->priority); + if (ret) + return ret; + + return arch_sse_register_event(&reg_evt->arch); +} + +static int sse_event_register_local(struct sse_event *event) +{ + int ret; + struct sse_registered_event *reg_evt; + + reg_evt = per_cpu_ptr(event->local, smp_processor_id()); + ret = sse_sbi_register_event(event, reg_evt); + if (ret) + pr_debug("Failed to register event %x: err %d\n", event->evt_id, + ret); + + return ret; +} + +static int sse_sbi_unregister_event(struct sse_event *event) +{ + return sse_sbi_event_func(event, SBI_SSE_EVENT_UNREGISTER); +} + +struct sse_per_cpu_evt { + struct sse_event *event; + unsigned long func; + cpumask_t error; +}; + +static void sse_event_per_cpu_func(void *info) +{ + int ret; + struct sse_per_cpu_evt *cpu_evt = info; + + if (cpu_evt->func == 
SBI_SSE_EVENT_REGISTER) + ret = sse_event_register_local(cpu_evt->event); + else + ret = sse_sbi_event_func(cpu_evt->event, cpu_evt->func); + + if (ret) + cpumask_set_cpu(smp_processor_id(), &cpu_evt->error); +} + +static void sse_event_free(struct sse_event *event) +{ + unsigned int cpu; + struct sse_registered_event *reg_evt; + + if (sse_event_is_global(event->evt_id)) { + sse_event_free_registered(event->global); + kfree(event->global); + } else { + for_each_possible_cpu(cpu) { + reg_evt = per_cpu_ptr(event->local, cpu); + sse_event_free_registered(reg_evt); + } + free_percpu(event->local); + } + + kfree(event); +} + +static int sse_on_each_cpu(struct sse_event *event, unsigned long func, + unsigned long revert_func) +{ + struct sse_per_cpu_evt cpu_evt; + + cpu_evt.event = event; + cpumask_clear(&cpu_evt.error); + cpu_evt.func = func; + on_each_cpu(sse_event_per_cpu_func, &cpu_evt, 1); + /* + * If there are some error reported by CPUs, revert event state on the + * other ones + */ + if (!cpumask_empty(&cpu_evt.error)) { + cpumask_t revert; + + cpumask_andnot(&revert, cpu_online_mask, &cpu_evt.error); + cpu_evt.func = revert_func; + on_each_cpu_mask(&revert, sse_event_per_cpu_func, &cpu_evt, 1); + + return -EIO; + } + + return 0; +} + +int sse_event_enable(struct sse_event *event) +{ + int ret = 0; + + scoped_guard(mutex, &sse_mutex) { + scoped_guard(cpus_read_lock) { + if (sse_event_is_global(event->evt_id)) { + ret = sse_event_enable_local(event); + } else { + ret = sse_on_each_cpu(event, + SBI_SSE_EVENT_ENABLE, + SBI_SSE_EVENT_DISABLE); + } + } + } + return ret; +} +EXPORT_SYMBOL_GPL(sse_event_enable); + +static int sse_events_mask(void) +{ + struct sbiret ret; + + ret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); +} + +static int sse_events_unmask(void) +{ + struct sbiret ret; + + ret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); 
+} + +static void sse_event_disable_nolock(struct sse_event *event) +{ + struct sse_per_cpu_evt cpu_evt; + + if (sse_event_is_global(event->evt_id)) { + sse_event_disable_local(event); + } else { + cpu_evt.event = event; + cpu_evt.func = SBI_SSE_EVENT_DISABLE; + on_each_cpu(sse_event_per_cpu_func, &cpu_evt, 1); + } +} + +void sse_event_disable(struct sse_event *event) +{ + scoped_guard(mutex, &sse_mutex) { + scoped_guard(cpus_read_lock) { + sse_event_disable_nolock(event); + } + } +} +EXPORT_SYMBOL_GPL(sse_event_disable); + +struct sse_event *sse_event_register(u32 evt, u32 priority, + sse_event_handler_fn *handler, void *arg) +{ + struct sse_event *event; + int cpu; + int ret = 0; + + if (!sse_available) + return ERR_PTR(-EOPNOTSUPP); + + guard(mutex)(&sse_mutex); + if (sse_event_get(evt)) + return ERR_PTR(-EEXIST); + + event = sse_event_alloc(evt, priority, handler, arg); + if (IS_ERR(event)) + return event; + + scoped_guard(cpus_read_lock) { + if (sse_event_is_global(evt)) { + unsigned long preferred_hart; + + ret = sse_event_attr_get_no_lock(event->global, + SBI_SSE_ATTR_PREFERRED_HART, + &preferred_hart); + if (ret) + goto err_event_free; + + cpu = riscv_hartid_to_cpuid(preferred_hart); + sse_global_event_update_cpu(event, cpu); + + ret = sse_sbi_register_event(event, event->global); + if (ret) + goto err_event_free; + + } else { + ret = sse_on_each_cpu(event, SBI_SSE_EVENT_REGISTER, + SBI_SSE_EVENT_DISABLE); + if (ret) + goto err_event_free; + } + } + + scoped_guard(spinlock, &events_list_lock) + list_add(&event->list, &events); + + return event; + +err_event_free: + sse_event_free(event); + + return ERR_PTR(ret); +} +EXPORT_SYMBOL_GPL(sse_event_register); + +static void sse_event_unregister_nolock(struct sse_event *event) +{ + struct sse_per_cpu_evt cpu_evt; + + if (sse_event_is_global(event->evt_id)) { + sse_sbi_unregister_event(event); + } else { + cpu_evt.event = event; + cpu_evt.func = SBI_SSE_EVENT_UNREGISTER; + on_each_cpu(sse_event_per_cpu_func, 
&cpu_evt, 1); + } +} + +void sse_event_unregister(struct sse_event *event) +{ + scoped_guard(mutex, &sse_mutex) { + scoped_guard(cpus_read_lock) + sse_event_unregister_nolock(event); + + scoped_guard(spinlock, &events_list_lock) + list_del(&event->list); + + sse_event_free(event); + } +} +EXPORT_SYMBOL_GPL(sse_event_unregister); + +static int sse_cpu_online(unsigned int cpu) +{ + struct sse_event *event; + + scoped_guard(spinlock, &events_list_lock) { + list_for_each_entry(event, &events, list) { + if (sse_event_is_global(event->evt_id)) + continue; + + sse_event_register_local(event); + if (sse_get_reg_evt(event)) + sse_event_enable_local(event); + } + } + + /* Ready to handle events. Unmask SSE. */ + return sse_events_unmask(); +} + +static int sse_cpu_teardown(unsigned int cpu) +{ + int ret = 0; + unsigned int next_cpu; + struct sse_event *event; + + /* Mask the sse events */ + ret = sse_events_mask(); + if (ret) + return ret; + + scoped_guard(spinlock, &events_list_lock) { + list_for_each_entry(event, &events, list) { + if (!sse_event_is_global(event->evt_id)) { + if (event->global->is_enabled) + sse_event_disable_local(event); + + sse_sbi_unregister_event(event); + continue; + } + + if (event->cpu != smp_processor_id()) + continue; + + /* Update destination hart for global event */ + next_cpu = cpumask_any_but(cpu_online_mask, cpu); + ret = sse_event_set_target_cpu_nolock(event, next_cpu); + } + } + + return ret; +} + +static void sse_reset(void) +{ + struct sse_event *event; + + list_for_each_entry(event, &events, list) { + sse_event_disable_nolock(event); + sse_event_unregister_nolock(event); + } +} + +static int sse_pm_notifier(struct notifier_block *nb, unsigned long action, + void *data) +{ + WARN_ON_ONCE(preemptible()); + + switch (action) { + case CPU_PM_ENTER: + sse_events_mask(); + break; + case CPU_PM_EXIT: + case CPU_PM_ENTER_FAILED: + sse_events_unmask(); + break; + default: + return NOTIFY_DONE; + } + + return NOTIFY_OK; +} + +static struct 
notifier_block sse_pm_nb = { + .notifier_call = sse_pm_notifier, +}; + +/* + * Mask all CPUs and unregister all events on panic, reboot or kexec. + */ +static int sse_reboot_notifier(struct notifier_block *nb, unsigned long action, + void *data) +{ + cpuhp_remove_state(sse_hp_state); + sse_reset(); + + return NOTIFY_OK; +} + +static struct notifier_block sse_reboot_nb = { + .notifier_call = sse_reboot_notifier, +}; + +static int __init sse_init(void) +{ + int ret; + + if (sbi_probe_extension(SBI_EXT_SSE) <= 0) { + pr_err("Missing SBI SSE extension\n"); + return -EOPNOTSUPP; + } + pr_info("SBI SSE extension detected\n"); + + ret = cpu_pm_register_notifier(&sse_pm_nb); + if (ret) { + pr_warn("Failed to register CPU PM notifier...\n"); + return ret; + } + + ret = register_reboot_notifier(&sse_reboot_nb); + if (ret) { + pr_warn("Failed to register reboot notifier...\n"); + goto remove_cpupm; + } + + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/sse:online", + sse_cpu_online, sse_cpu_teardown); + if (ret < 0) + goto remove_reboot; + + sse_hp_state = ret; + sse_available = true; + + return 0; + +remove_reboot: + unregister_reboot_notifier(&sse_reboot_nb); + +remove_cpupm: + cpu_pm_unregister_notifier(&sse_pm_nb); + + return ret; +} +arch_initcall(sse_init); diff --git a/include/linux/riscv_sbi_sse.h b/include/linux/riscv_sbi_sse.h new file mode 100644 index 000000000000..a1b58e89dd19 --- /dev/null +++ b/include/linux/riscv_sbi_sse.h @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Rivos Inc. 
+ */ + +#ifndef __LINUX_RISCV_SBI_SSE_H +#define __LINUX_RISCV_SBI_SSE_H + +#include +#include + +struct sse_event; +struct pt_regs; + +typedef int (sse_event_handler_fn)(u32 event_num, void *arg, + struct pt_regs *regs); + +#ifdef CONFIG_RISCV_SBI_SSE + +struct sse_event *sse_event_register(u32 event_num, u32 priority, + sse_event_handler_fn *handler, void *arg); + +void sse_event_unregister(struct sse_event *evt); + +int sse_event_set_target_cpu(struct sse_event *sse_evt, unsigned int cpu); + +int sse_event_enable(struct sse_event *sse_evt); + +void sse_event_disable(struct sse_event *sse_evt); + +int sse_event_enable_local(struct sse_event *sse_evt); +int sse_event_disable_local(struct sse_event *sse_evt); + +#else +static inline struct sse_event *sse_event_register(u32 event_num, u32 priority, + sse_event_handler_fn *handler, + void *arg) +{ + return ERR_PTR(-EOPNOTSUPP); +} + +static inline void sse_event_unregister(struct sse_event *evt) {} + +static inline int sse_event_set_target_cpu(struct sse_event *sse_evt, + unsigned int cpu) +{ + return -EOPNOTSUPP; +} + +static inline int sse_event_enable(struct sse_event *sse_evt) +{ + return -EOPNOTSUPP; +} + +static inline void sse_event_disable(struct sse_event *sse_evt) {} +#endif +#endif /* __LINUX_RISCV_SBI_SSE_H */ -- 2.43.0 From cleger at rivosinc.com Mon Sep 8 11:17:06 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 8 Sep 2025 18:17:06 +0000 Subject: [PATCH v7 4/5] perf: RISC-V: add support for SSE event In-Reply-To: <20250908181717.1997461-1-cleger@rivosinc.com> References: <20250908181717.1997461-1-cleger@rivosinc.com> Message-ID: <20250908181717.1997461-5-cleger@rivosinc.com> In order to use SSE within PMU drivers, register a SSE handler for the local PMU event. Reuse the existing overflow IRQ handler and pass appropriate pt_regs. Add a config option RISCV_PMU_SSE to select event delivery via SSE events. 
Signed-off-by: Clément Léger --- drivers/perf/Kconfig | 10 +++++ drivers/perf/riscv_pmu.c | 23 +++++++++++ drivers/perf/riscv_pmu_sbi.c | 71 +++++++++++++++++++++++++++++----- include/linux/perf/riscv_pmu.h | 5 +++ 4 files changed, 99 insertions(+), 10 deletions(-) diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index a9188dec36fe..bea08d4689b1 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -105,6 +105,16 @@ config RISCV_PMU_SBI full perf feature support i.e. counter overflow, privilege mode filtering, counter configuration. +config RISCV_PMU_SBI_SSE + depends on RISCV_PMU && RISCV_SBI_SSE + bool "RISC-V PMU SSE events" + default n + help + Say y if you want to use SSE events to deliver PMU interrupts. This + provides a way to profile the kernel at any level by using NMI-like + SSE events. SSE events being really intrusive, this option allows + to select it only if needed. + config STARFIVE_STARLINK_PMU depends on ARCH_STARFIVE || COMPILE_TEST depends on 64BIT diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu.c index 7644147d50b4..dda2814801c0 100644 --- a/drivers/perf/riscv_pmu.c +++ b/drivers/perf/riscv_pmu.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -254,6 +255,24 @@ void riscv_pmu_start(struct perf_event *event, int flags) perf_event_update_userpage(event); } +#ifdef CONFIG_RISCV_PMU_SBI_SSE +static void riscv_pmu_disable(struct pmu *pmu) +{ + struct riscv_pmu *rvpmu = to_riscv_pmu(pmu); + + if (rvpmu->sse_evt) + sse_event_disable_local(rvpmu->sse_evt); +} + +static void riscv_pmu_enable(struct pmu *pmu) +{ + struct riscv_pmu *rvpmu = to_riscv_pmu(pmu); + + if (rvpmu->sse_evt) + sse_event_enable_local(rvpmu->sse_evt); +} +#endif + static int riscv_pmu_add(struct perf_event *event, int flags) { struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); @@ -411,6 +430,10 @@ struct riscv_pmu *riscv_pmu_alloc(void) .event_mapped = riscv_pmu_event_mapped, .event_unmapped = riscv_pmu_event_unmapped,
.event_idx = riscv_pmu_event_idx, +#ifdef CONFIG_RISCV_PMU_SBI_SSE + .pmu_enable = riscv_pmu_enable, + .pmu_disable = riscv_pmu_disable, +#endif .add = riscv_pmu_add, .del = riscv_pmu_del, .start = riscv_pmu_start, diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index 698de8ddf895..a864a543ccc8 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -948,10 +949,10 @@ static void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu, pmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask); } -static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) +static irqreturn_t pmu_sbi_ovf_handler(struct cpu_hw_events *cpu_hw_evt, + struct pt_regs *regs, bool from_sse) { struct perf_sample_data data; - struct pt_regs *regs; struct hw_perf_event *hw_evt; union sbi_pmu_ctr_info *info; int lidx, hidx, fidx; @@ -959,7 +960,6 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) struct perf_event *event; u64 overflow; u64 overflowed_ctrs = 0; - struct cpu_hw_events *cpu_hw_evt = dev; u64 start_clock = sched_clock(); struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr; @@ -969,13 +969,15 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) /* Firmware counter don't support overflow yet */ fidx = find_first_bit(cpu_hw_evt->used_hw_ctrs, RISCV_MAX_COUNTERS); if (fidx == RISCV_MAX_COUNTERS) { - csr_clear(CSR_SIP, BIT(riscv_pmu_irq_num)); + if (!from_sse) + csr_clear(CSR_SIP, BIT(riscv_pmu_irq_num)); return IRQ_NONE; } event = cpu_hw_evt->events[fidx]; if (!event) { - ALT_SBI_PMU_OVF_CLEAR_PENDING(riscv_pmu_irq_mask); + if (!from_sse) + ALT_SBI_PMU_OVF_CLEAR_PENDING(riscv_pmu_irq_mask); return IRQ_NONE; } @@ -990,16 +992,16 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) /* * Overflow interrupt pending bit should only be cleared after stopping - * all the counters to avoid any race condition. 
+ * all the counters to avoid any race condition. When using SSE, + * interrupt is cleared when stopping counters. */ - ALT_SBI_PMU_OVF_CLEAR_PENDING(riscv_pmu_irq_mask); + if (!from_sse) + ALT_SBI_PMU_OVF_CLEAR_PENDING(riscv_pmu_irq_mask); /* No overflow bit is set */ if (!overflow) return IRQ_NONE; - regs = get_irq_regs(); - for_each_set_bit(lidx, cpu_hw_evt->used_hw_ctrs, RISCV_MAX_COUNTERS) { struct perf_event *event = cpu_hw_evt->events[lidx]; @@ -1055,6 +1057,51 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) return IRQ_HANDLED; } +static irqreturn_t pmu_sbi_ovf_irq_handler(int irq, void *dev) +{ + return pmu_sbi_ovf_handler(dev, get_irq_regs(), false); +} + +#ifdef CONFIG_RISCV_PMU_SBI_SSE +static int pmu_sbi_ovf_sse_handler(u32 evt, void *arg, struct pt_regs *regs) +{ + struct cpu_hw_events __percpu *hw_events = arg; + struct cpu_hw_events *hw_event = raw_cpu_ptr(hw_events); + + pmu_sbi_ovf_handler(hw_event, regs, true); + + return 0; +} + +static int pmu_sbi_setup_sse(struct riscv_pmu *pmu) +{ + int ret; + struct sse_event *evt; + struct cpu_hw_events __percpu *hw_events = pmu->hw_events; + + evt = sse_event_register(SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, 0, + pmu_sbi_ovf_sse_handler, hw_events); + if (IS_ERR(evt)) + return PTR_ERR(evt); + + ret = sse_event_enable(evt); + if (ret) { + sse_event_unregister(evt); + return ret; + } + + pr_info("using SSE for PMU event delivery\n"); + pmu->sse_evt = evt; + + return ret; +} +#else +static int pmu_sbi_setup_sse(struct riscv_pmu *pmu) +{ + return -EOPNOTSUPP; +} +#endif + static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node) { struct riscv_pmu *pmu = hlist_entry_safe(node, struct riscv_pmu, node); @@ -1105,6 +1152,10 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde struct cpu_hw_events __percpu *hw_events = pmu->hw_events; struct irq_domain *domain = NULL; + ret = pmu_sbi_setup_sse(pmu); + if (!ret) + return 0; + if 
(riscv_isa_extension_available(NULL, SSCOFPMF)) { riscv_pmu_irq_num = RV_IRQ_PMU; riscv_pmu_use_irq = true; @@ -1139,7 +1190,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde return -ENODEV; } - ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events); + ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_irq_handler, "riscv-pmu", hw_events); if (ret) { pr_err("registering percpu irq failed [%d]\n", ret); return ret; diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h index 701974639ff2..cd493fcab9b3 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -28,6 +28,8 @@ #define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1 +struct sse_event; + struct cpu_hw_events { /* currently enabled events */ int n_events; @@ -54,6 +56,9 @@ struct riscv_pmu { char *name; irqreturn_t (*handle_irq)(int irq_num, void *dev); +#ifdef CONFIG_RISCV_PMU_SBI_SSE + struct sse_event *sse_evt; +#endif unsigned long cmask; u64 (*ctr_read)(struct perf_event *event); -- 2.43.0 From cleger at rivosinc.com Mon Sep 8 11:17:07 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 8 Sep 2025 18:17:07 +0000 Subject: [PATCH v7 5/5] selftests/riscv: add SSE test module In-Reply-To: <20250908181717.1997461-1-cleger@rivosinc.com> References: <20250908181717.1997461-1-cleger@rivosinc.com> Message-ID: <20250908181717.1997461-6-cleger@rivosinc.com> This module, once loaded, will execute a series of tests using the SSE framework. The provided script will check for any error reported by the test module. 
Signed-off-by: Clément Léger --- tools/testing/selftests/riscv/Makefile | 2 +- tools/testing/selftests/riscv/sse/Makefile | 5 + .../selftests/riscv/sse/module/Makefile | 16 + .../riscv/sse/module/riscv_sse_test.c | 513 ++++++++++++++++++ .../selftests/riscv/sse/run_sse_test.sh | 44 ++ 5 files changed, 579 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/riscv/sse/Makefile create mode 100644 tools/testing/selftests/riscv/sse/module/Makefile create mode 100644 tools/testing/selftests/riscv/sse/module/riscv_sse_test.c create mode 100644 tools/testing/selftests/riscv/sse/run_sse_test.sh diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile index 099b8c1f46f8..c62f58414b29 100644 --- a/tools/testing/selftests/riscv/Makefile +++ b/tools/testing/selftests/riscv/Makefile @@ -5,7 +5,7 @@ ARCH ?= $(shell uname -m 2>/dev/null || echo not) ifneq (,$(filter $(ARCH),riscv)) -RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector +RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector sse else RISCV_SUBTARGETS := endif diff --git a/tools/testing/selftests/riscv/sse/Makefile b/tools/testing/selftests/riscv/sse/Makefile new file mode 100644 index 000000000000..67eaee06f213 --- /dev/null +++ b/tools/testing/selftests/riscv/sse/Makefile @@ -0,0 +1,5 @@ +TEST_GEN_MODS_DIR := module + +TEST_FILES := run_sse_test.sh + +include ../../lib.mk diff --git a/tools/testing/selftests/riscv/sse/module/Makefile b/tools/testing/selftests/riscv/sse/module/Makefile new file mode 100644 index 000000000000..02018f083456 --- /dev/null +++ b/tools/testing/selftests/riscv/sse/module/Makefile @@ -0,0 +1,16 @@ +TESTMODS_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= /lib/modules/$(shell uname -r)/build + +obj-m += riscv_sse_test.o + +# Ensure that KDIR exists, otherwise skip the compilation +modules: +ifneq ("$(wildcard $(KDIR))", "") + $(Q)$(MAKE) -C $(KDIR) modules KBUILD_EXTMOD=$(TESTMODS_DIR) +endif + +# Ensure that
KDIR exists, otherwise skip the clean target +clean: +ifneq ("$(wildcard $(KDIR))", "") + $(Q)$(MAKE) -C $(KDIR) clean KBUILD_EXTMOD=$(TESTMODS_DIR) +endif diff --git a/tools/testing/selftests/riscv/sse/module/riscv_sse_test.c b/tools/testing/selftests/riscv/sse/module/riscv_sse_test.c new file mode 100644 index 000000000000..e49a2f1179f6 --- /dev/null +++ b/tools/testing/selftests/riscv/sse/module/riscv_sse_test.c @@ -0,0 +1,513 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2024 Rivos Inc. + */ + +#define pr_fmt(fmt) "riscv_sse_test: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define RUN_LOOP_COUNT 1000 +#define SSE_FAILED_PREFIX "FAILED: " +#define sse_err(...) pr_err(SSE_FAILED_PREFIX __VA_ARGS__) + +struct sse_event_desc { + u32 evt_id; + const char *name; + bool can_inject; +}; + +static struct sse_event_desc sse_event_descs[] = { + { + .evt_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, + .name = "local_high_prio_ras", + }, + { + .evt_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, + .name = "local_double_trap", + }, + { + .evt_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, + .name = "global_high_prio_ras", + }, + { + .evt_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, + .name = "local_pmu_overflow", + }, + { + .evt_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, + .name = "local_low_prio_ras", + }, + { + .evt_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, + .name = "global_low_prio_ras", + }, + { + .evt_id = SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED, + .name = "local_software_injected", + }, + { + .evt_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED, + .name = "global_software_injected", + } +}; + +static struct sse_event_desc *sse_get_evt_desc(u32 evt) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_descs); i++) { + if (sse_event_descs[i].evt_id == evt) + return &sse_event_descs[i]; + } + + return NULL; +} + +static const char *sse_evt_name(u32 evt) +{ + struct sse_event_desc *desc = sse_get_evt_desc(evt); 
+ + return desc != NULL ? desc->name : NULL; +} + +static bool sse_test_can_inject_event(u32 evt) +{ + struct sse_event_desc *desc = sse_get_evt_desc(evt); + + return desc != NULL ? desc->can_inject : false; +} + +static struct sbiret sbi_sse_ecall(int fid, unsigned long arg0, unsigned long arg1) +{ + return sbi_ecall(SBI_EXT_SSE, fid, arg0, arg1, 0, 0, 0, 0); +} + +static int sse_event_attr_get(u32 evt, unsigned long attr_id, + unsigned long *val) +{ + struct sbiret sret; + unsigned long *attr_buf, phys; + + attr_buf = kmalloc(sizeof(unsigned long), GFP_KERNEL); + if (!attr_buf) + return -ENOMEM; + + phys = virt_to_phys(attr_buf); + + sret = sbi_ecall(SBI_EXT_SSE, SBI_SSE_EVENT_ATTR_READ, evt, attr_id, 1, + phys, 0, 0); + if (sret.error) + return sbi_err_map_linux_errno(sret.error); + + *val = *attr_buf; + + return 0; +} + +static int sse_test_signal(u32 evt, unsigned int cpu) +{ + unsigned int hart_id = cpuid_to_hartid_map(cpu); + struct sbiret ret; + + ret = sbi_sse_ecall(SBI_SSE_EVENT_INJECT, evt, hart_id); + if (ret.error) { + sse_err("Failed to signal event %x, error %ld\n", evt, ret.error); + return sbi_err_map_linux_errno(ret.error); + } + + return 0; +} + +static int sse_test_inject_event(struct sse_event *event, u32 evt, unsigned int cpu) +{ + int res; + unsigned long status; + + if (sse_event_is_global(evt)) { + /* + * Due to the fact the completion might happen faster than + * the call to SBI_SSE_COMPLETE in the handler, if the event was + * running on another CPU, we need to wait for the event status + * to be !RUNNING. 
+ */ + do { + res = sse_event_attr_get(evt, SBI_SSE_ATTR_STATUS, &status); + if (res) { + sse_err("Failed to get status for evt %x, error %d\n", evt, res); + return res; + } + status = status & SBI_SSE_ATTR_STATUS_STATE_MASK; + } while (status == SBI_SSE_STATE_RUNNING); + + res = sse_event_set_target_cpu(event, cpu); + if (res) { + sse_err("Failed to set cpu for evt %x, error %d\n", evt, res); + return res; + } + } + + return sse_test_signal(evt, cpu); +} + +struct fast_test_arg { + u32 evt; + int cpu; + bool completion; +}; + +static int sse_test_handler(u32 evt, void *arg, struct pt_regs *regs) +{ + int ret = 0; + struct fast_test_arg *targ = arg; + u32 test_evt = READ_ONCE(targ->evt); + int cpu = READ_ONCE(targ->cpu); + + if (evt != test_evt) { + sse_err("Received SSE event id %x instead of %x\n", test_evt, evt); + ret = -EINVAL; + } + + if (cpu != smp_processor_id()) { + sse_err("Received SSE event %d on CPU %d instead of %d\n", evt, smp_processor_id(), + cpu); + ret = -EINVAL; + } + + WRITE_ONCE(targ->completion, true); + + return ret; +} + +static void sse_run_fast_test(struct fast_test_arg *test_arg, struct sse_event *event, u32 evt) +{ + unsigned long timeout; + int ret, cpu; + + for_each_online_cpu(cpu) { + WRITE_ONCE(test_arg->completion, false); + WRITE_ONCE(test_arg->cpu, cpu); + /* Test arg is used on another CPU */ + smp_wmb(); + + ret = sse_test_inject_event(event, evt, cpu); + if (ret) { + sse_err("event %s injection failed, err %d\n", sse_evt_name(evt), ret); + return; + } + + timeout = jiffies + HZ / 100; + /* We can not use since they are not NMI safe */ + while (!READ_ONCE(test_arg->completion) && + time_before(jiffies, timeout)) { + cpu_relax(); + } + if (!time_before(jiffies, timeout)) { + sse_err("Failed to wait for event %s completion on CPU %d\n", + sse_evt_name(evt), cpu); + return; + } + } +} + +static void sse_test_injection_fast(void) +{ + int i, ret = 0, j; + u32 evt; + struct fast_test_arg test_arg; + struct sse_event *event; + + 
pr_info("Starting SSE test (fast)\n"); + + for (i = 0; i < ARRAY_SIZE(sse_event_descs); i++) { + evt = sse_event_descs[i].evt_id; + WRITE_ONCE(test_arg.evt, evt); + + if (!sse_event_descs[i].can_inject) + continue; + + event = sse_event_register(evt, 0, sse_test_handler, + (void *)&test_arg); + if (IS_ERR(event)) { + sse_err("Failed to register event %s, err %ld\n", sse_evt_name(evt), + PTR_ERR(event)); + goto out; + } + + ret = sse_event_enable(event); + if (ret) { + sse_err("Failed to enable event %s, err %d\n", sse_evt_name(evt), ret); + goto err_unregister; + } + + pr_info("Starting testing event %s\n", sse_evt_name(evt)); + + for (j = 0; j < RUN_LOOP_COUNT; j++) + sse_run_fast_test(&test_arg, event, evt); + + pr_info("Finished testing event %s\n", sse_evt_name(evt)); + + sse_event_disable(event); +err_unregister: + sse_event_unregister(event); + } +out: + pr_info("Finished SSE test (fast)\n"); +} + +struct priority_test_arg { + unsigned long evt; + struct sse_event *event; + bool called; + u32 prio; + struct priority_test_arg *next_evt_arg; + void (*check_func)(struct priority_test_arg *arg); +}; + +static int sse_hi_priority_test_handler(u32 evt, void *arg, + struct pt_regs *regs) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = READ_ONCE(targ->next_evt_arg); + + WRITE_ONCE(targ->called, 1); + + if (next) { + sse_test_signal(next->evt, smp_processor_id()); + if (!READ_ONCE(next->called)) { + sse_err("Higher priority event %s was not handled %s\n", + sse_evt_name(next->evt), sse_evt_name(evt)); + } + } + + return 0; +} + +static int sse_low_priority_test_handler(u32 evt, void *arg, struct pt_regs *regs) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = READ_ONCE(targ->next_evt_arg); + + WRITE_ONCE(targ->called, 1); + + if (next) { + sse_test_signal(next->evt, smp_processor_id()); + if (READ_ONCE(next->called)) { + sse_err("Lower priority event %s was handle before %s\n", + sse_evt_name(next->evt), 
sse_evt_name(evt)); + } + } + + return 0; +} + +static void sse_test_injection_priority_arg(struct priority_test_arg *args, unsigned int args_size, + sse_event_handler_fn handler, const char *test_name) +{ + unsigned int i; + int ret; + struct sse_event *event; + struct priority_test_arg *arg, *first_arg = NULL, *prev_arg = NULL; + + pr_info("Starting SSE priority test (%s)\n", test_name); + for (i = 0; i < args_size; i++) { + arg = &args[i]; + + if (!sse_test_can_inject_event(arg->evt)) + continue; + + WRITE_ONCE(arg->called, false); + WRITE_ONCE(arg->next_evt_arg, NULL); + if (prev_arg) + WRITE_ONCE(prev_arg->next_evt_arg, arg); + + prev_arg = arg; + + if (!first_arg) + first_arg = arg; + + event = sse_event_register(arg->evt, arg->prio, handler, (void *)arg); + if (IS_ERR(event)) { + sse_err("Failed to register event %s, err %ld\n", sse_evt_name(arg->evt), + PTR_ERR(event)); + goto release_events; + } + arg->event = event; + + if (sse_event_is_global(arg->evt)) { + /* Target event at current CPU */ + ret = sse_event_set_target_cpu(event, smp_processor_id()); + if (ret) { + sse_err("Failed to set event %s target CPU, err %d\n", + sse_evt_name(arg->evt), ret); + goto release_events; + } + } + + ret = sse_event_enable(event); + if (ret) { + sse_err("Failed to enable event %s, err %d\n", sse_evt_name(arg->evt), ret); + goto release_events; + } + } + + if (!first_arg) { + sse_err("No injectable event available\n"); + return; + } + + /* Inject first event, handler should trigger the others in chain. */ + ret = sse_test_inject_event(first_arg->event, first_arg->evt, smp_processor_id()); + if (ret) { + sse_err("SSE event %s injection failed\n", sse_evt_name(first_arg->evt)); + goto release_events; + } + + /* + * Event are injected directly on the current CPU after calling sse_test_inject_event() + * so that execution is premmpted right away, no need to wait for timeout. 
+ */ + arg = first_arg; + while (arg) { + if (!READ_ONCE(arg->called)) { + sse_err("Event %s handler was not called\n", + sse_evt_name(arg->evt)); + ret = -EINVAL; + } + + + event = arg->event; + arg = READ_ONCE(arg->next_evt_arg); + } + +release_events: + + arg = first_arg; + while (arg) { + event = arg->event; + if (!event) + break; + + sse_event_disable(event); + sse_event_unregister(event); + arg = READ_ONCE(arg->next_evt_arg); + } + + pr_info("Finished SSE priority test (%s)\n", test_name); +} + +static void sse_test_injection_priority(void) +{ + struct priority_test_arg default_hi_prio_args[] = { + { .evt = SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED }, + { .evt = SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED }, + { .evt = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW }, + { .evt = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP }, + { .evt = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS }, + }; + + struct priority_test_arg default_low_prio_args[] = { + { .evt = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP }, + { .evt = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW }, + { .evt = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS }, + { .evt = SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED }, + { .evt = SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED }, + + }; + struct priority_test_arg set_prio_args[] = { + { .evt = SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED, .prio = 5 }, + { .evt = SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED, .prio = 10 }, + { .evt = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 15 }, + { .evt = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20 }, + { .evt = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 25 }, + { .evt = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 30 }, + { .evt = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, .prio = 35 }, + { .evt = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 40 }, + }; + + 
struct priority_test_arg same_prio_args[] = { + { .evt = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0 }, + { .evt = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10 }, + { .evt = SBI_SSE_EVENT_LOCAL_SOFTWARE_INJECTED, .prio = 10 }, + { .evt = SBI_SSE_EVENT_GLOBAL_SOFTWARE_INJECTED, .prio = 10 }, + { .evt = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20 }, + }; + + sse_test_injection_priority_arg(default_hi_prio_args, ARRAY_SIZE(default_hi_prio_args), + sse_hi_priority_test_handler, "high"); + + sse_test_injection_priority_arg(default_low_prio_args, ARRAY_SIZE(default_low_prio_args), + sse_low_priority_test_handler, "low"); + + sse_test_injection_priority_arg(set_prio_args, ARRAY_SIZE(set_prio_args), + sse_low_priority_test_handler, "set"); + + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), + sse_low_priority_test_handler, "same_prio_args"); +} + + +static bool sse_get_inject_status(u32 evt) +{ + int ret; + unsigned long val; + + /* Check if injection is supported */ + ret = sse_event_attr_get(evt, SBI_SSE_ATTR_STATUS, &val); + if (ret) + return false; + + return !!(val & BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET)); +} + +static void sse_init_events(void) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_descs); i++) { + struct sse_event_desc *desc = &sse_event_descs[i]; + + desc->can_inject = sse_get_inject_status(desc->evt_id); + if (!desc->can_inject) + pr_info("Can not inject event %s, tests using this event will be skipped\n", + desc->name); + } +} + +static int __init sse_test_init(void) +{ + sse_init_events(); + + sse_test_injection_fast(); + sse_test_injection_priority(); + + return 0; +} + +static void __exit sse_test_exit(void) +{ +} + +module_init(sse_test_init); +module_exit(sse_test_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Clément Léger "); +MODULE_DESCRIPTION("Test module for SSE"); diff --git a/tools/testing/selftests/riscv/sse/run_sse_test.sh b/tools/testing/selftests/riscv/sse/run_sse_test.sh new file mode 100644
index 000000000000..888bc4a99cb3 --- /dev/null +++ b/tools/testing/selftests/riscv/sse/run_sse_test.sh @@ -0,0 +1,44 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (C) 2025 Rivos Inc. + +MODULE_NAME=riscv_sse_test +DRIVER="./module/${MODULE_NAME}.ko" + +check_test_failed_prefix() { + if dmesg | grep -q "${MODULE_NAME}: FAILED:";then + echo "${MODULE_NAME} failed, please check dmesg" + exit 1 + fi +} + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 + +check_test_requirements() +{ + uid=$(id -u) + if [ $uid -ne 0 ]; then + echo "$0: Must be run as root" + exit $ksft_skip + fi + + if ! which insmod > /dev/null 2>&1; then + echo "$0: You need insmod installed" + exit $ksft_skip + fi + + if [ ! -f $DRIVER ]; then + echo "$0: You need to compile ${MODULE_NAME} module" + exit $ksft_skip + fi +} + +check_test_requirements + +insmod $DRIVER > /dev/null 2>&1 +rmmod $MODULE_NAME +check_test_failed_prefix + +exit 0 -- 2.43.0 From conor at kernel.org Mon Sep 8 11:41:10 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 8 Sep 2025 19:41:10 +0100 Subject: [PATCH v1 0/2] riscv: dts: starfive: jh7110-common: drop no-mmc and power-on-delay-ms from mmc interfaces In-Reply-To: <20250903101346.861076-1-e@freeshell.de> References: <20250903101346.861076-1-e@freeshell.de> Message-ID: <20250908-prognosis-nimbly-3a10aa0bcb22@spud> From: Conor Dooley On Wed, 03 Sep 2025 03:13:34 -0700, E Shattow wrote: > Drop no-mmc and power-on-delay-ms properties. > > The committer cannot be reached for comment and per discussion [1] and > testing there is not any observable problem that is being solved by > having these properties for the VisionFive 2 or similar variant boards > through the jh7110-common.dtsi include. > > [...] Applied to riscv-dt-for-next, thanks! 
[1/2] riscv: dts: starfive: jh7110-common: drop no-mmc property from mmc1 https://git.kernel.org/conor/c/08128670a931 [2/2] riscv: dts: starfive: jh7110-common: drop mmc post-power-on-delay-ms https://git.kernel.org/conor/c/b5a861a438d1 Thanks, Conor. From ameryhung at gmail.com Mon Sep 8 15:34:24 2025 From: ameryhung at gmail.com (Amery Hung) Date: Mon, 8 Sep 2025 15:34:24 -0700 Subject: [PATCH v2 bpf-next] riscv, bpf: Sign extend struct ops return values properly In-Reply-To: References: <20250904103806.18937-1-hengqi.chen@gmail.com> <5829abcf-f1b9-4fb0-8811-b6098fdd8a29@gmail.com> Message-ID: On Thu, Sep 4, 2025 at 6:24 PM Hengqi Chen wrote: > > On Fri, Sep 5, 2025 at 6:42 AM Amery Hung wrote: > > > > > > > > On 9/4/25 3:38 AM, Hengqi Chen wrote: > > > The ns_bpf_qdisc selftest triggers a kernel panic: > > > > > > Unable to handle kernel paging request at virtual address ffffffffa38dbf58 > > > Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 > > > [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 > > > Oops [#1] > > > Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 dm_mod drm drm_panel_orientation_quirks configfs backlight btrfs blake2b_generic xor lzo_compress zlib_deflate raid6_pq efivarfs [last unloaded: bpf_testmod(OE)] > > > CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE > > > Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > > > Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 > > > epc : __qdisc_run+0x82/0x6f0 > > > ra : __qdisc_run+0x6e/0x6f0 > > > epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 > > > gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 > > > t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 > > > s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 > > > a2 : ff2000000eecb510 a3 :
0000000000000001 a4 : 0000000000000000 > > > a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 > > > s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 > > > s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 > > > s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 > > > s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 > > > t5 : 0000000000000000 t6 : ff60000093a6a8b6 > > > status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d > > > [] __qdisc_run+0x82/0x6f0 > > > [] __dev_queue_xmit+0x4c0/0x1128 > > > [] neigh_resolve_output+0xd0/0x170 > > > [] ip6_finish_output2+0x226/0x6c8 > > > [] ip6_finish_output+0x10c/0x2a0 > > > [] ip6_output+0x5e/0x178 > > > [] ip6_xmit+0x29a/0x608 > > > [] inet6_csk_xmit+0xe6/0x140 > > > [] __tcp_transmit_skb+0x45c/0xaa8 > > > [] tcp_connect+0x9ce/0xd10 > > > [] tcp_v6_connect+0x4ac/0x5e8 > > > [] __inet_stream_connect+0xd8/0x318 > > > [] inet_stream_connect+0x3e/0x68 > > > [] __sys_connect_file+0x50/0x88 > > > [] __sys_connect+0x96/0xc8 > > > [] __riscv_sys_connect+0x20/0x30 > > > [] do_trap_ecall_u+0x256/0x378 > > > [] handle_exception+0x14a/0x156 > > > Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 > > > ---[ end trace 0000000000000000 ]--- > > > > > > The bpf_fifo_dequeue prog returns a skb which is a pointer. > > > The pointer is treated as a 32bit value and sign extend to > > > 64bit in epilogue. This behavior is right for most bpf prog > > > types but wrong for struct ops which requires RISC-V ABI. > > > > > > So let's sign extend struct ops return values according to > > > the function model and RISC-V ABI([0]). 
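The truncation described in the commit message can be reproduced in plain C. The following is a minimal sketch, not the JIT code from the patch: `sext_w` and `model_ret_fixup` are hypothetical helper names, and the input pointer value used in the usage note is made up for illustration. It only models what a 32-bit sign extension does to a 64-bit kernel pointer, compared with a full-width move.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a 32-bit sign extension (what sext.w computes): keep the
 * low 32 bits and replicate bit 31 into the upper half. Written with
 * unsigned arithmetic only, so the result is well defined in C.
 */
static uint64_t sext_w(uint64_t v)
{
	uint64_t lo = v & 0xffffffffULL;

	return (lo ^ 0x80000000ULL) - 0x80000000ULL;
}

/*
 * Hypothetical model of a return-value fixup keyed on the return size
 * in bytes: 4-byte values are sign-extended, 8-byte values (such as
 * pointers) are passed through unchanged. Other sizes are not modeled.
 */
static uint64_t model_ret_fixup(uint64_t r, unsigned int ret_size)
{
	assert(ret_size == 4 || ret_size == 8);
	return ret_size == 4 ? sext_w(r) : r;
}
```

With a made-up pointer such as 0xff600000a38dbee8, `sext_w()` yields 0xffffffffa38dbee8, which has the same shape as the a0 value in the oops above, while `model_ret_fixup(..., 8)` leaves the pointer intact. Whether the faulting pointer really had that upper half is an assumption.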
> > > > > > [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf > > > > > > Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") > > > Signed-off-by: Hengqi Chen > > > --- > > > arch/riscv/net/bpf_jit_comp64.c | 38 ++++++++++++++++++++++++++++++++- > > > 1 file changed, 37 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c > > > index 549c3063c7f1..c7ae4d0a8361 100644 > > > --- a/arch/riscv/net/bpf_jit_comp64.c > > > +++ b/arch/riscv/net/bpf_jit_comp64.c > > > @@ -954,6 +954,35 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of > > > return ret; > > > } > > > > > > +/* > > > + * Sign-extend the register if necessary > > > + */ > > > +static int sign_extend(int rd, int rs, u8 size, u8 flags, struct rv_jit_context *ctx) > > > +{ > > > + if (!(flags & BTF_FMODEL_SIGNED_ARG) && (size == 1 || size == 2)) > > > + return 0; > > > + > > > + switch (size) { > > > + case 1: > > > + emit_sextb(rd, rs, ctx); > > > + break; > > > + case 2: > > > + emit_sexth(rd, rs, ctx); > > > + break; > > > + case 4: > > > + emit_sextw(rd, rs, ctx); > > > + break; > > > + case 8: > > > + emit_mv(rd, rs, ctx); > > > + break; > > > + default: > > > + pr_err("bpf-jit: invalid size %d for sign_extend\n", size); > > > + return -EINVAL; > > > > Will this accidentally rejects struct_ops functions that return void? > > > > No, see https://elixir.bootlin.com/linux/v6.16.4/source/kernel/bpf/bpf_struct_ops.c#L601-L602 Ah, I see. Thanks for pointing it out. 
> > > > + } > > > + > > > + return 0; > > > +} > > > + > > > static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > > > const struct btf_func_model *m, > > > struct bpf_tramp_links *tlinks, > > > @@ -1175,8 +1204,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, > > > restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); > > > > > > if (save_ret) { > > > - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > > > emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); > > > + if (is_struct_ops) { > > > + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], > > > + m->ret_size, m->ret_flags, ctx); > > > + if (ret) > > > + goto out; > > > + } else { > > > + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); > > > + } > > > } > > > > > > emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); > > From kuba at kernel.org Mon Sep 8 17:34:05 2025 From: kuba at kernel.org (Jakub Kicinski) Date: Mon, 8 Sep 2025 17:34:05 -0700 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> <20250905165908.69548ce0@kernel.org> Message-ID: <20250908173405.08aec56d@kernel.org> On Sun, 7 Sep 2025 16:22:44 +0800 Vivian Wang wrote: > "dstats" is meant for tunnels. This doesn't look like the right thing to > use, and no other pcpu_stat_type gives me tx_dropped. Do you think I > should use dstats anyway? 
You can use dstats From ziyao at disroot.org Mon Sep 8 19:51:29 2025 From: ziyao at disroot.org (Yao Zi) Date: Tue, 9 Sep 2025 02:51:29 +0000 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> Message-ID: On Mon, Sep 08, 2025 at 10:13:15PM +0800, Xukai Wang wrote: > > On 2025/9/7 11:13, Yao Zi wrote: > >> On Fri, Sep 05, 2025 at 11:10:23AM +0800, Xukai Wang wrote: > >> This patch provides basic support for the K230 clock, which covers > >> all clocks in K230 SoC. > >> > >> The clock tree of the K230 SoC consists of a 24MHZ external crystal > >> oscillator, PLLs and an external pulse input for timerX, and their > >> derived clocks. > >> > >> Co-developed-by: Troy Mitchell > >> Signed-off-by: Troy Mitchell > >> Signed-off-by: Xukai Wang > >> --- > >> drivers/clk/Kconfig | 6 + > >> drivers/clk/Makefile | 1 + > >> drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ > >> 3 files changed, 2463 insertions(+) ... > >> new file mode 100644 > >> index 0000000000000000000000000000000000000000..2ba74c008b30ae3400acbd8c08550e8315dfe205 > >> --- /dev/null > >> +++ b/drivers/clk/clk-k230.c > >> @@ -0,0 +1,2456 @@ ... 
> > > >> +static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate, > >> + unsigned long parent_rate) > >> +{ > >> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > >> + struct k230_clk_rate_self *rate_self = &clk->clk; > >> + u32 div, mul, mul_reg; > >> + > >> + if (rate > parent_rate) > >> + return -EINVAL; > >> + > >> + if (rate_self->read_only) > >> + return 0; > >> + > >> + if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max, > >> + rate_self->div_min, rate_self->div_max, > >> + rate, parent_rate, &div, &mul)) > >> + return -EINVAL; > >> + > >> + guard(spinlock)(rate_self->lock); > >> + > >> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); > >> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); > >> + mul_reg |= BIT(rate_self->write_enable_bit); > >> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); > >> + > >> + return 0; > >> +} > >> + > >> +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, > >> + unsigned long parent_rate) > >> +{ > >> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > >> + struct k230_clk_rate_self *rate_self = &clk->clk; > >> + u32 div, mul, div_reg; > >> + > >> + if (rate > parent_rate) > >> + return -EINVAL; > >> + > >> + if (rate_self->read_only) > >> + return 0; > >> + > >> + if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max, > >> + rate_self->div_min, rate_self->div_max, > >> + rate, parent_rate, &div, &mul)) > >> + return -EINVAL; > >> + > >> + guard(spinlock)(rate_self->lock); > >> + > >> + div_reg = readl(rate_self->reg + clk->div_reg_off); > >> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); > >> + div_reg |= BIT(rate_self->write_enable_bit); > >> + writel(div_reg, rate_self->reg + clk->div_reg_off); > >> + > >> + return 0; > >> +} > >> + > >> +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, > >> + unsigned long parent_rate) > >> +{ > >> + struct 
k230_clk_rate *clk = hw_to_k230_clk_rate(hw); > >> + struct k230_clk_rate_self *rate_self = &clk->clk; > >> + u32 div, mul, div_reg, mul_reg; > >> + > >> + if (rate > parent_rate) > >> + return -EINVAL; > >> + > >> + if (rate_self->read_only) > >> + return 0; > >> + > >> + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, > >> + rate_self->div_min, rate_self->div_max, > >> + rate, parent_rate, &div, &mul)) > >> + return -EINVAL; > >> + > >> + guard(spinlock)(rate_self->lock); > >> + > >> + div_reg = readl(rate_self->reg + clk->div_reg_off); > >> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); > >> + div_reg |= BIT(rate_self->write_enable_bit); > >> + writel(div_reg, rate_self->reg + clk->div_reg_off); > >> + > >> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); > >> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); > >> + mul_reg |= BIT(rate_self->write_enable_bit); > >> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); > >> + > >> + return 0; > >> +} > > There are three variants of rate clocks, mul-only, div-only and mul-div > > ones, which are similar to clk-multiplier, clk-divider, > > clk-fractional-divider. > > > > The only difference is to setup new parameters for K230's rate clocks, > > a register bit, described as k230_clk_rate_self.write_enable_bit, must > > be set first. > Actually, I think the differences are not limited to just the > write_enable_bit. There are also distinct mul_min, mul_max, div_min, and > div_max values, which are not typically just 1 and (1 << bit_width) as > in standard clock divider or multiplier structures. Oops, I missed these members, so there're more differences, but... > For example, the div_min for hs_sd_card_src_rate is 2, not 1. This > affects the calculation of the approximate divider, and cannot be fully > represented if we only use the clk_divider structure. 
Reading through the TRM[1], I cannot find why using one as divisor isn't valid for hs_sd_card_src_rate. The clock corresponds to field hs_SDCLK_CFG.sd_cclk_div, and is described as "Sd card clock divider. N: (N+1) divider. Sd0?sd1 cclk is divided from this clock". Do you have any extra information about the limitation? > Another example is ls_codec_adc_rate, where mul_min is 0x10, mul_max is > 0x1B9, div_min is 0xC35, and div_max is 0x3D09. These specific ranges > cannot be described using the normal clk_fractional_divider structure. According to the TRM, the two fields in control of the fractional clock are described as > codec clock stup. For example, audio_clk: 25644.1K, source clock: > 400M, 400M/(25644.1K) can be simplied to : 15625/441. sum is set to : > 15625, step is set to 441 and > codec clock sum still I cannot find any information about the range you described with mul_min and div_min. Could you confirm whether they're really necessary? > > > > What do you think of introducing support for such "write enable bit" to > > the generic implementation of multipler/divider/fractional? Then you > > could reuse the generic implementation in K230's driver, avoiding code > > duplication. > Therefore, in addition to the requirement of setting the > write_enable_bit, the customizable ranges for these parameters are also > important differences that should be considered. 
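The sum/step arithmetic quoted from the TRM can be checked with a few lines of C. This is only a sketch under stated assumptions: it reads the garbled "25644.1K" as 256 x 44.1 kHz (11289600 Hz), and `gcd_u64` and `reduce_ratio` are hypothetical helpers, not functions from the driver or the clock framework.

```c
#include <assert.h>
#include <stdint.h>

/* Euclid's algorithm; gcd_u64(a, 0) returns a. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Hypothetical helper: reduce parent_hz / rate_hz to lowest terms.
 * On return, *sum / *step equals parent_hz / rate_hz, i.e.
 * rate_hz == parent_hz * step / sum.
 */
static void reduce_ratio(uint64_t parent_hz, uint64_t rate_hz,
			 uint64_t *sum, uint64_t *step)
{
	uint64_t g;

	assert(parent_hz != 0 && rate_hz != 0);
	g = gcd_u64(parent_hz, rate_hz);
	*sum = parent_hz / g;
	*step = rate_hz / g;
}
```

For a 400 MHz source and an 11289600 Hz target this gives sum = 15625 (0x3D09) and step = 441 (0x1B9), which equal the div_max and mul_max values quoted above. Likewise 400 MHz and 2048000 Hz (256 x 8 kHz, also an assumed rate) reduce to 3125 (0xC35) and 16 (0x10), the quoted div_min and mul_min. That match is only arithmetic; it does not by itself confirm a hardware limit.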
Best regards, Yao Zi [1]: https://github.com/revyos/external-docs/blob/master/K230/en-us/K230_Technical_Reference_Manual_V0.3.1_20241118.pdf From kees at kernel.org Mon Sep 8 20:11:48 2025 From: kees at kernel.org (Kees Cook) Date: Mon, 8 Sep 2025 20:11:48 -0700 Subject: [PATCH v1] rust: cfi: only 64-bit arm and x86 support CFI_CLANG In-Reply-To: <20250908-distill-lint-1ae78bcf777c@spud> References: <20250908-distill-lint-1ae78bcf777c@spud> Message-ID: <202509082009.4A8DC97BD2@keescook> On Mon, Sep 08, 2025 at 02:12:35PM +0100, Conor Dooley wrote: > From: Conor Dooley > > The kernel uses the standard rustc targets for non-x86 targets, and out > of those only 64-bit arm's target has kcfi support enabled. For x86, the > custom 64-bit target enables kcfi. > > The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows > CFI_CLANG to be used in combination with RUST does not check whether the > rustc target supports kcfi. This breaks the build on riscv (and > presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same > time. > > Ordinarily, a rustc-option check would be used to detect target support > but unfortunately rustc-option filters out the target for reasons given > in commit 46e24a545cdb4 ("rust: kasan/kbuild: fix missing flags on first > build"). As a result, if the host supports kcfi but the target does not, > e.g. when building for riscv on x86_64, the build would remain broken. > > Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only > two architectures where the target used supports it to fix the build. I'm generally fine with this, but normally we do arch-specific stuff only in arch/$arch/Kconfig, and expose some kind of ARCH_HAS_CFI_ICALL_NORMALIZE_INTEGERS that would get tested here. Should we do that here too? 
-Kees > > CC: stable at vger.kernel.org > Fixes: ca627e636551e ("rust: cfi: add support for CFI_CLANG with Rust") > Signed-off-by: Conor Dooley > --- > CC: Paul Walmsley > CC: Palmer Dabbelt > CC: Alexandre Ghiti > CC: Miguel Ojeda > CC: Alex Gaynor > CC: Boqun Feng > CC: Gary Guo > CC: "Björn Roy Baron" > CC: Benno Lossin > CC: Andreas Hindborg > CC: Alice Ryhl > CC: Trevor Gross > CC: Danilo Krummrich > CC: Kees Cook > CC: Sami Tolvanen > CC: Matthew Maurer > CC: "Peter Zijlstra (Intel)" > CC: linux-kernel at vger.kernel.org > CC: linux-riscv at lists.infradead.org > CC: rust-for-linux at vger.kernel.org > --- > arch/Kconfig | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/Kconfig b/arch/Kconfig > index d1b4ffd6e0856..880cddff5eda7 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -917,6 +917,7 @@ config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC > def_bool y > depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG > depends on RUSTC_VERSION >= 107900 > + depends on ARM64 || X86_64 > # With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373 > depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \ > (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS) > -- > 2.47.2 > -- Kees Cook From cuiyunhui at bytedance.com Mon Sep 8 20:13:35 2025 From: cuiyunhui at bytedance.com (yunhui cui) Date: Tue, 9 Sep 2025 11:13:35 +0800 Subject: [REPORT] Should rdcycle be deprecated? Message-ID: Hi All, 1. To use rdcycle in user mode, one must first go through perf_user_access. However, in reality, the return value of rdcycle remains unchanged. This is because SBI_PMU_CY_IR_MASK in SBI includes the bit corresponding to "cycle", and the kernel's pmu_sbi_stop_all() function disables the counting of cycles. 2. Currently, some application software (e.g., DPDK) uses the rdcycle instruction. In fact, rdcycle is affected by WFI (Wait for Interrupt) and CPU frequency variations. 3. Some applications mainly run on server CPUs. 
Therefore, the precision design of rdtime should be higher. For example, the TSC (Time-Stamp Counter) of x86 architectures is generally around 2 GHz, which can meet the application's requirements for timestamp precision. 4. What are the future plans for rdcycle? Thanks, Yunhui From lkp at intel.com Mon Sep 8 21:20:46 2025 From: lkp at intel.com (kernel test robot) Date: Tue, 9 Sep 2025 12:20:46 +0800 Subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> References: <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> Message-ID: <202509091137.JnioPegN-lkp@intel.com> Hi Vivian, kernel test robot noticed the following build warnings: [auto build test WARNING on 062b3e4a1f880f104a8d4b90b767788786aa7b78] url: https://github.com/intel-lab-lkp/linux/commits/Vivian-Wang/dt-bindings-net-Add-support-for-SpacemiT-K1/20250908-203917 base: 062b3e4a1f880f104a8d4b90b767788786aa7b78 patch link: https://lore.kernel.org/r/20250908-net-k1-emac-v10-2-90d807ccd469%40iscas.ac.cn patch subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20250909/202509091137.JnioPegN-lkp at intel.com/config) compiler: m68k-linux-gcc (GCC) 15.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250909/202509091137.JnioPegN-lkp at intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202509091137.JnioPegN-lkp at intel.com/ All warnings (new ones prefixed by >>): In function 'emac_get_stat_tx_dropped', inlined from 'emac_get_stats64' at drivers/net/ethernet/spacemit/k1_emac.c:1234:24: >> drivers/net/ethernet/spacemit/k1_emac.c:1218:24: warning: 'result' is used uninitialized [-Wuninitialized] 1218 | result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu)); | ^~ drivers/net/ethernet/spacemit/k1_emac.c: In function 'emac_get_stats64': drivers/net/ethernet/spacemit/k1_emac.c:1214:13: note: 'result' was declared here 1214 | u64 result; | ^~~~~~ vim +/result +1218 drivers/net/ethernet/spacemit/k1_emac.c 1211 1212 static u64 emac_get_stat_tx_dropped(struct emac_priv *priv) 1213 { 1214 u64 result; 1215 int cpu; 1216 1217 for_each_possible_cpu(cpu) { > 1218 result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu)); 1219 } 1220 1221 return result; 1222 } 1223 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki From akpm at linux-foundation.org Mon Sep 8 21:25:18 2025 From: akpm at linux-foundation.org (Andrew Morton) Date: Mon, 8 Sep 2025 21:25:18 -0700 Subject: [PATCH v2 19/37] mm/gup: remove record_subpages() In-Reply-To: <64fe4c61-f9cc-4a5a-9c33-07bd0f089e94@redhat.com> References: <20250901150359.867252-1-david@redhat.com> <20250901150359.867252-20-david@redhat.com> <5090355d-546a-4d06-99e1-064354d156b5@redhat.com> <20250905230006.GA1776@sol> <64fe4c61-f9cc-4a5a-9c33-07bd0f089e94@redhat.com> Message-ID: <20250908212518.77671b31aaad2832c17eab07@linux-foundation.org> On Sat, 6 Sep 2025 08:57:37 +0200 David Hildenbrand wrote: > >> @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, > >> return 0; > >> } > >> + pages += *nr; > >> *nr += refs; > >> for (; refs; refs--) > >> *(pages++) = page++; > > > > Can this get folded in soon? 
This bug is causing crashes in AF_ALG too. > > Andrew immediately dropped the original patch, so it's gone from > mm-unstable and should be gone from next soon (today?). I restored it once you sent out the fix. It doesn't seem to be in present -next but it should be there in the next one. From kingxukai at zohomail.com Mon Sep 8 22:01:10 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Tue, 9 Sep 2025 13:01:10 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> Message-ID: On 2025/9/9 10:51, Yao Zi wrote: > On Mon, Sep 08, 2025 at 10:13:15PM +0800, Xukai Wang wrote: >> On 2025/9/7 11:13, Yao Zi wrote: >>>> On Fri, Sep 05, 2025 at 11:10:23AM +0800, Xukai Wang wrote: >>>> This patch provides basic support for the K230 clock, which covers >>>> all clocks in K230 SoC. >>>> >>>> The clock tree of the K230 SoC consists of a 24MHZ external crystal >>>> oscillator, PLLs and an external pulse input for timerX, and their >>>> derived clocks. >>>> >>>> Co-developed-by: Troy Mitchell >>>> Signed-off-by: Troy Mitchell >>>> Signed-off-by: Xukai Wang >>>> --- >>>> drivers/clk/Kconfig | 6 + >>>> drivers/clk/Makefile | 1 + >>>> drivers/clk/clk-k230.c | 2456 ++++++++++++++++++++++++++++++++++++++++++++++++ >>>> 3 files changed, 2463 insertions(+) > ... 
>>>> +static int k230_clk_set_rate_mul(struct clk_hw *hw, unsigned long rate, >>>> + unsigned long parent_rate) >>>> +{ >>>> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >>>> + struct k230_clk_rate_self *rate_self = &clk->clk; >>>> + u32 div, mul, mul_reg; >>>> + >>>> + if (rate > parent_rate) >>>> + return -EINVAL; >>>> + >>>> + if (rate_self->read_only) >>>> + return 0; >>>> + >>>> + if (k230_clk_find_approximate_mul(rate_self->mul_min, rate_self->mul_max, >>>> + rate_self->div_min, rate_self->div_max, >>>> + rate, parent_rate, &div, &mul)) >>>> + return -EINVAL; >>>> + >>>> + guard(spinlock)(rate_self->lock); >>>> + >>>> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >>>> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >>>> + mul_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static int k230_clk_set_rate_div(struct clk_hw *hw, unsigned long rate, >>>> + unsigned long parent_rate) >>>> +{ >>>> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >>>> + struct k230_clk_rate_self *rate_self = &clk->clk; >>>> + u32 div, mul, div_reg; >>>> + >>>> + if (rate > parent_rate) >>>> + return -EINVAL; >>>> + >>>> + if (rate_self->read_only) >>>> + return 0; >>>> + >>>> + if (k230_clk_find_approximate_div(rate_self->mul_min, rate_self->mul_max, >>>> + rate_self->div_min, rate_self->div_max, >>>> + rate, parent_rate, &div, &mul)) >>>> + return -EINVAL; >>>> + >>>> + guard(spinlock)(rate_self->lock); >>>> + >>>> + div_reg = readl(rate_self->reg + clk->div_reg_off); >>>> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >>>> + div_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(div_reg, rate_self->reg + clk->div_reg_off); >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, >>>> + unsigned long parent_rate) >>>> +{ >>>> + struct 
k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >>>> + struct k230_clk_rate_self *rate_self = &clk->clk; >>>> + u32 div, mul, div_reg, mul_reg; >>>> + >>>> + if (rate > parent_rate) >>>> + return -EINVAL; >>>> + >>>> + if (rate_self->read_only) >>>> + return 0; >>>> + >>>> + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, >>>> + rate_self->div_min, rate_self->div_max, >>>> + rate, parent_rate, &div, &mul)) >>>> + return -EINVAL; >>>> + >>>> + guard(spinlock)(rate_self->lock); >>>> + >>>> + div_reg = readl(rate_self->reg + clk->div_reg_off); >>>> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >>>> + div_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(div_reg, rate_self->reg + clk->div_reg_off); >>>> + >>>> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >>>> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >>>> + mul_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >>>> + >>>> + return 0; >>>> +} >>> There are three variants of rate clocks, mul-only, div-only and mul-div >>> ones, which are similar to clk-multiplier, clk-divider, >>> clk-fractional-divider. >>> >>> The only difference is to setup new parameters for K230's rate clocks, >>> a register bit, described as k230_clk_rate_self.write_enable_bit, must >>> be set first. >> Actually, I think the differences are not limited to just the >> write_enable_bit. There are also distinct mul_min, mul_max, div_min, and >> div_max values, which are not typically just 1 and (1 << bit_width) as >> in standard clock divider or multiplier structures. > Oops, I missed these members, so there're more differences, but... > >> For example, the div_min for hs_sd_card_src_rate is 2, not 1. This >> affects the calculation of the approximate divider, and cannot be fully >> represented if we only use the clk_divider structure. 
> Reading through the TRM[1], I cannot find why using one as divisor isn't > valid for hs_sd_card_src_rate. The clock corresponds to field > hs_SDCLK_CFG.sd_cclk_div, and is described as "Sd card clock divider. > N: (N+1) divider. Sd0?sd1 cclk is divided from this clock". > > Do you have any extra information about the limitation? This limitation comes from the vendor's hardware reference code[2], which indicates this constraint, but unfortunately it's not documented in the public TRM[1]. > >> Another example is ls_codec_adc_rate, where mul_min is 0x10, mul_max is >> 0x1B9, div_min is 0xC35, and div_max is 0x3D09. These specific ranges >> cannot be described using the normal clk_fractional_divider structure. > According to the TRM, the two fields in control of the fractional clock > are described as > >> codec clock stup. For example, audio_clk: 25644.1K, source clock: >> 400M, 400M/(25644.1K) can be simplied > to : 15625/441. sum is set to : >> 15625, step is set to 441 > and > >> codec clock sum > still I cannot find any information about the range you described with > mul_min and div_min. Could you confirm whether they're really > necessary? > >>> What do you think of introducing support for such "write enable bit" to >>> the generic implementation of multipler/divider/fractional? Then you >>> could reuse the generic implementation in K230's driver, avoiding code >>> duplication. >> Therefore, in addition to the requirement of setting the >> write_enable_bit, the customizable ranges for these parameters are also >> important differences that should be considered. 
> Best regards, > Yao Zi > > [1]: https://github.com/revyos/external-docs/blob/master/K230/en-us/K230_Technical_Reference_Manual_V0.3.1_20241118.pdf [2]: https://github.com/ruyisdk/linux-xuantie-kernel/blob/4d69bb363fd873f2b0ac7daa488ca0206d0b6760/arch/riscv/boot/dts/canaan/k230_clock_provider.dtsi#L918 From anup at brainfault.org Mon Sep 8 23:49:43 2025 From: anup at brainfault.org (Anup Patel) Date: Tue, 9 Sep 2025 12:19:43 +0530 Subject: [PATCH v3] riscv: skip csr restore if vcpu preempted reload In-Reply-To: <20250825121411.86573-1-tjytimi@163.com> References: <20250825121411.86573-1-tjytimi@163.com> Message-ID: On Mon, Aug 25, 2025 at 5:44?PM Jinyu Tang wrote: > > The kvm_arch_vcpu_load() function is called in two cases for riscv: > 1. When entering KVM_RUN from userspace ioctl. > 2. When a preempted VCPU is scheduled back. > > In the second case, if no other KVM VCPU has run on this CPU since the > current VCPU was preempted, the guest CSR (including AIA CSRS and HGTAP) > values are still valid in the hardware and do not need to be restored. > > This patch is to skip the CSR write path when: > 1. The VCPU was previously preempted > (vcpu->scheduled_out == 1). > 2. It is being reloaded on the same physical CPU > (vcpu->arch.last_exit_cpu == cpu). > 3. No other KVM VCPU has used this CPU in the meantime > (vcpu == __this_cpu_read(kvm_former_vcpu)). > > This reduces many CSR writes with frequent preemption on the same CPU. Currently, I see the following issues with this patch: 1) It's making Guest usage of IMSIC VS-files on the QEMU virt machine very unstable and Guest never boots. It could be some QEMU issue but I don't want to increase instability on QEMU since it is our primary development vehicle. 2) We have CSRs like hedeleg which can be updated by KVM user space via set_guest_debug() ioctl. The direction of the patch is fine but it is very fragile at the moment. 
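The skip condition from the quoted commit message, and the concern about CSRs changed from user space, can be modeled outside the kernel. The sketch below is a user-space model only: `struct vcpu_model`, `former_vcpu`, `model_vcpu_put` and `can_skip_csr_restore` are hypothetical names, and the `csr_dirty` flag is an assumption added for illustration, not something the posted patch contains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical, simplified stand-in for the per-VCPU state. */
struct vcpu_model {
	bool scheduled_out;	/* VCPU was preempted, not leaving KVM_RUN */
	int last_exit_cpu;	/* physical CPU the VCPU last ran on */
	bool csr_dirty;		/* assumed flag: CSR state changed in software */
};

/* Models the per-CPU "last VCPU that was put here" pointer. */
static struct vcpu_model *former_vcpu[MODEL_NR_CPUS];

/* Record which VCPU most recently left this physical CPU. */
static void model_vcpu_put(struct vcpu_model *v, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	former_vcpu[cpu] = v;
}

/*
 * The hardware CSRs can only be assumed valid if all three conditions
 * from the commit message hold and nothing modified the saved CSR
 * values while the VCPU was off the CPU.
 */
static bool can_skip_csr_restore(const struct vcpu_model *v, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return v->scheduled_out &&
	       v->last_exit_cpu == cpu &&
	       former_vcpu[cpu] == v &&
	       !v->csr_dirty;
}
```

In this model, anything that rewrites a saved CSR value (the set_guest_debug() path is the case mentioned above) would have to set `csr_dirty` so that the next load takes the full restore path.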
Regards, Anup > > Signed-off-by: Jinyu Tang > Reviewed-by: Nutty Liu > --- > v2 -> v3: > v2 was missing a critical check because I generated the patch from my > wrong (experimental) branch. This is fixed in v3. Sorry for my trouble. > > v1 -> v2: > Apply the logic to aia csr load. Thanks for > Andrew Jones's advice. > > arch/riscv/kvm/vcpu.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c > index f001e5640..66bd3ddd5 100644 > --- a/arch/riscv/kvm/vcpu.c > +++ b/arch/riscv/kvm/vcpu.c > @@ -25,6 +25,8 @@ > #define CREATE_TRACE_POINTS > #include "trace.h" > > +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_former_vcpu); > + > const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = { > KVM_GENERIC_VCPU_STATS(), > STATS_DESC_COUNTER(VCPU, ecall_exit_stat), > @@ -581,6 +583,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) > struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; > struct kvm_vcpu_config *cfg = &vcpu->arch.cfg; > > + if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) && > + vcpu->arch.last_exit_cpu == cpu) > + goto csr_restore_done; > + > if (kvm_riscv_nacl_sync_csr_available()) { > nsh = nacl_shmem(); > nacl_csr_write(nsh, CSR_VSSTATUS, csr->vsstatus); > @@ -624,6 +630,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) > > kvm_riscv_mmu_update_hgatp(vcpu); > > + kvm_riscv_vcpu_aia_load(vcpu, cpu); > + > +csr_restore_done: > kvm_riscv_vcpu_timer_restore(vcpu); > > kvm_riscv_vcpu_host_fp_save(&vcpu->arch.host_context); > @@ -633,8 +642,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) > kvm_riscv_vcpu_guest_vector_restore(&vcpu->arch.guest_context, > vcpu->arch.isa); > > - kvm_riscv_vcpu_aia_load(vcpu, cpu); > - > kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu); > > vcpu->cpu = cpu; > @@ -645,6 +652,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > void *nsh; > struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; > > + 
__this_cpu_write(kvm_former_vcpu, vcpu); > + > vcpu->cpu = -1; > > kvm_riscv_vcpu_aia_put(vcpu); > -- > 2.43.0 > From krzk at kernel.org Mon Sep 8 23:53:05 2025 From: krzk at kernel.org (Krzysztof Kozlowski) Date: Tue, 9 Sep 2025 08:53:05 +0200 Subject: [PATCH v3 2/6] dt-bindings: riscv: microchip: document icicle kit with production device In-Reply-To: <20250908115732.31092-3-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> <20250908115732.31092-3-valentina.fernandezalanis@microchip.com> Message-ID: <20250909-accurate-wisteria-spider-222341@kuoka> On Mon, Sep 08, 2025 at 12:57:28PM +0100, Valentina Fernandez wrote: > With the introduction of the Icicle Kit using the production MPFS250T > device, it's necessary to distinguish it from the engineering sample > (-es) variant. Engineering samples cannot write to flash from the MSS, > as noted in the PolarFire SoC FPGA ES errata. > > Add specific compatibles for the Icicle Kit with Production device > (MPFS250T) and Icicle Kit with Engineering Sample (MPFS250T_ES). > > The icicle kit reference designs in the v2025.07 release include the > Mi-V IHC IP v2, used to send/receive data between clusters when > using Asymmetric Multiprocessing (AMP) mode. > > In reference design releases prior to v2025.07, the MI-V IHC subsystem > was included as a proof of concept in the design prior to becoming an > IP available in the Libero catalog. > > Among other improvements, the new Mi-V IHC IP v2 includes some > changes to the register map. For this reason, make use of a new > reference design compatible to denote that v2025.07 reference design > releases are not backwards compatible. > > Signed-off-by: Valentina Fernandez > --- > Documentation/devicetree/bindings/riscv/microchip.yaml | 8 ++++++++ > 1 file changed, 8 insertions(+) Why are you sending patches which are already applied? For two weeks? 
Best regards, Krzysztof From wangruikang at iscas.ac.cn Tue Sep 9 00:02:55 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Tue, 9 Sep 2025 15:02:55 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> Message-ID: <8ca70773-42b0-4dcc-8b54-338594e9a8ea@iscas.ac.cn> On 9/8/25 22:13, Xukai Wang wrote: >>> [...] >>> >>> + >>> +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, >>> + unsigned long parent_rate) >>> +{ >>> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >>> + struct k230_clk_rate_self *rate_self = &clk->clk; >>> + u32 div, mul, div_reg, mul_reg; >>> + >>> + if (rate > parent_rate) >>> + return -EINVAL; >>> + >>> + if (rate_self->read_only) >>> + return 0; >>> + >>> + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, >>> + rate_self->div_min, rate_self->div_max, >>> + rate, parent_rate, &div, &mul)) >>> + return -EINVAL; >>> + >>> + guard(spinlock)(rate_self->lock); >>> + >>> + div_reg = readl(rate_self->reg + clk->div_reg_off); >>> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >>> + div_reg |= BIT(rate_self->write_enable_bit); >>> + writel(div_reg, rate_self->reg + clk->div_reg_off); >>> + >>> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >>> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >>> + mul_reg |= BIT(rate_self->write_enable_bit); >>> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >>> + >>> + return 0; >>> +} >> There are three variants of rate clocks, mul-only, div-only and mul-div >> ones, which are similar to clk-multiplier, clk-divider, >> clk-fractional-divider. 
>> >> The only difference is to setup new parameters for K230's rate clocks, >> a register bit, described as k230_clk_rate_self.write_enable_bit, must >> be set first. > Actually, I think the differences are not limited to just the > write_enable_bit. There are also distinct mul_min, mul_max, div_min, and > div_max values, which are not typically just 1 and (1 << bit_width) as > in standard clock divider or multiplier structures. So the part I have been thinking about is, consider just checking the {mul,div}_{min,max} values to determine which kind it is? As is this is just redundant information, since you can infer whether there is a configurable multiplier by checking if mul_{min,max} are equal. Same for div_{min,max}. Vivian "dramforever" Wang > For example, the div_min for hs_sd_card_src_rate is 2, not 1. This > affects the calculation of the approximate divider, and cannot be fully > represented if we only use the clk_divider structure. > > Another example is ls_codec_adc_rate, where mul_min is 0x10, mul_max is > 0x1B9, div_min is 0xC35, and div_max is 0x3D09. These specific ranges > cannot be described using the normal clk_fractional_divider structure. > From atishp at rivosinc.com Tue Sep 9 00:03:19 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:19 -0700 Subject: [PATCH v6 0/8] Add SBI v3.0 PMU enhancements Message-ID: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> SBI v3.0 specification[1] added two new improvements to the PMU chaper. The SBI v3.0 specification is frozen and under public review phase as per the RISC-V International guidelines. 1. Added an additional get_event_info function to query event availablity in bulk instead of individual SBI calls for each event. This helps in improving the boot time. 2. Raw event width allowed by the platform is widened to have 56 bits with RAW event v2 as per new clarification in the priv ISA[2]. 
Apart from implementing these new features, this series improves the gpa range check in KVM and updates the kvm SBI implementation to SBI v3.0. The opensbi patches have been merged. This series can be found at [3]. [1] https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v3.0-rc7/riscv-sbi.pdf [2] https://github.com/riscv/riscv-isa-manual/issues/1578 [3] https://github.com/atishp04/linux/tree/b4/pmu_event_info_v6 Signed-off-by: Atish Patra --- Changes in v6: - Dropped the helper function to check writable slot - Updated PATCH 7 to return invalid address error if vcpu_write_guest fails - Link to v5: https://lore.kernel.org/r/20250829-pmu_event_info-v5-0-9dca26139a33 at rivosinc.com Changes in v5: - Rebased on top of v6.17-rc3 - Updated PATCH 6 as per feedback to improve the generic helper function - Adapted PATCH 7 & 8 as per PATCH 6. - Link to v4: https://lore.kernel.org/r/20250721-pmu_event_info-v4-0-ac76758a4269 at rivosinc.com Changes in v4: - Rebased on top of v6.16-rc7 - Fixed a potential compilation issue in PATCH5. - Minor typos fixed PATCH2 and PATCH3. - Fixed variable ordering in PATCH6 - Link to v3: https://lore.kernel.org/r/20250522-pmu_event_info-v3-0-f7bba7fd9cfe at rivosinc.com Changes in v3: - Rebased on top of v6.15-rc7 - Link to v2: https://lore.kernel.org/r/20250115-pmu_event_info-v2-0-84815b70383b at rivosinc.com Changes in v2: - Dropped PATCH 2 to be taken during rcX. - Improved gpa range check validation by introducing a helper function and checking the entire range. 
- Link to v1: https://lore.kernel.org/r/20241119-pmu_event_info-v1-0-a4f9691421f8 at rivosinc.com --- Atish Patra (8): drivers/perf: riscv: Add SBI v3.0 flag drivers/perf: riscv: Add raw event v2 support RISC-V: KVM: Add support for Raw event v2 drivers/perf: riscv: Implement PMU event info function drivers/perf: riscv: Export PMU event info function RISC-V: KVM: No need of explicit writable slot check RISC-V: KVM: Implement get event info function RISC-V: KVM: Upgrade the supported SBI version to 3.0 arch/riscv/include/asm/kvm_vcpu_pmu.h | 3 + arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +- arch/riscv/include/asm/sbi.h | 13 +++ arch/riscv/kvm/vcpu_pmu.c | 74 +++++++++++-- arch/riscv/kvm/vcpu_sbi_pmu.c | 3 + arch/riscv/kvm/vcpu_sbi_sta.c | 9 +- drivers/perf/riscv_pmu_sbi.c | 191 +++++++++++++++++++++++++--------- include/linux/perf/riscv_pmu.h | 1 + 8 files changed, 229 insertions(+), 67 deletions(-) --- base-commit: e32a80927434907f973f38a88cd19d7e51991d24 change-id: 20241018-pmu_event_info-986e21ce6bd3 -- Regards, Atish patra From atishp at rivosinc.com Tue Sep 9 00:03:22 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:22 -0700 Subject: [PATCH v6 3/8] RISC-V: KVM: Add support for Raw event v2 In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-3-d8f80cacb884@rivosinc.com> SBI v3.0 introduced a new raw event type v2 for wider mhpmeventX programming. Add the support in kvm for that. 
Reviewed-by: Anup Patel Signed-off-by: Atish Patra --- arch/riscv/kvm/vcpu_pmu.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index 78ac3216a54d..15d71a7b75ba 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -60,6 +60,7 @@ static u32 kvm_pmu_get_perf_event_type(unsigned long eidx) type = PERF_TYPE_HW_CACHE; break; case SBI_PMU_EVENT_TYPE_RAW: + case SBI_PMU_EVENT_TYPE_RAW_V2: case SBI_PMU_EVENT_TYPE_FW: type = PERF_TYPE_RAW; break; @@ -128,6 +129,9 @@ static u64 kvm_pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data) case SBI_PMU_EVENT_TYPE_RAW: config = evt_data & RISCV_PMU_RAW_EVENT_MASK; break; + case SBI_PMU_EVENT_TYPE_RAW_V2: + config = evt_data & RISCV_PMU_RAW_EVENT_V2_MASK; + break; case SBI_PMU_EVENT_TYPE_FW: if (ecode < SBI_PMU_FW_MAX) config = (1ULL << 63) | ecode; -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:23 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:23 -0700 Subject: [PATCH v6 4/8] drivers/perf: riscv: Implement PMU event info function In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-4-d8f80cacb884@rivosinc.com> With the new SBI PMU event info function, we can query the availability of all the standard SBI PMU events at boot time with a single ecall. This improves boot time by avoiding an SBI call for each standard PMU event. Since this function is defined only in SBI v3.0, invoke it only if the underlying SBI implementation is v3.0 or higher.
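[Editorial illustration, not part of the patch] The bulk-query flow described above can be sketched in userspace C. This is a minimal sketch under stated assumptions: struct event_info copies the field layout of struct riscv_pmu_event_info from this patch, while mock_event_get_info(), check_events_bulk() and EVENT_NOT_FOUND are hypothetical names invented for the example. The mock stands in for the firmware and simply reports odd event indices as supported; a real caller would issue the SBI ecall instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field layout of struct riscv_pmu_event_info in the patch. */
struct event_info {
	uint32_t event_idx;  /* input: SBI event index to query */
	uint32_t output;     /* output: bit 0 set if the event is supported */
	uint64_t event_data; /* input: extra data, used by raw events */
};

#define EVENT_INFO_OUTPUT_MASK 0x01
#define EVENT_NOT_FOUND (-2) /* hypothetical stand-in for -ENOENT */

/* Hypothetical firmware stub: pretends only odd event indices exist. */
static void mock_event_get_info(struct event_info *shmem, size_t count)
{
	for (size_t i = 0; i < count; i++)
		shmem[i].output = shmem[i].event_idx & 1;
}

/*
 * Fill the request array, issue one bulk query, then invalidate every
 * event the firmware did not report as supported.
 */
static void check_events_bulk(int *event_map, struct event_info *shmem,
			      size_t count)
{
	for (size_t i = 0; i < count; i++)
		shmem[i].event_idx = (uint32_t)event_map[i];

	mock_event_get_info(shmem, count);

	for (size_t i = 0; i < count; i++) {
		if (!(shmem[i].output & EVENT_INFO_OUTPUT_MASK))
			event_map[i] = EVENT_NOT_FOUND;
	}
}
```

The point of the design is that the number of firmware calls stays at one regardless of how many events are checked; only the size of the shared array grows.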
Signed-off-by: Atish Patra --- arch/riscv/include/asm/sbi.h | 9 ++++++ drivers/perf/riscv_pmu_sbi.c | 69 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index b0c41ef56968..5ca7cebc13cc 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -136,6 +136,7 @@ enum sbi_ext_pmu_fid { SBI_EXT_PMU_COUNTER_FW_READ, SBI_EXT_PMU_COUNTER_FW_READ_HI, SBI_EXT_PMU_SNAPSHOT_SET_SHMEM, + SBI_EXT_PMU_EVENT_GET_INFO, }; union sbi_pmu_ctr_info { @@ -159,6 +160,14 @@ struct riscv_pmu_snapshot_data { u64 reserved[447]; }; +struct riscv_pmu_event_info { + u32 event_idx; + u32 output; + u64 event_data; +}; + +#define RISCV_PMU_EVENT_INFO_OUTPUT_MASK 0x01 + #define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0) #define RISCV_PMU_PLAT_FW_EVENT_MASK GENMASK_ULL(61, 0) /* SBI v3.0 allows extended hpmeventX width value */ diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index 3644bed4c8ab..a6c479f853e1 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -299,6 +299,66 @@ static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX] }, }; +static int pmu_sbi_check_event_info(void) +{ + int num_events = ARRAY_SIZE(pmu_hw_event_map) + PERF_COUNT_HW_CACHE_MAX * + PERF_COUNT_HW_CACHE_OP_MAX * PERF_COUNT_HW_CACHE_RESULT_MAX; + struct riscv_pmu_event_info *event_info_shmem; + phys_addr_t base_addr; + int i, j, k, result = 0, count = 0; + struct sbiret ret; + + event_info_shmem = kcalloc(num_events, sizeof(*event_info_shmem), GFP_KERNEL); + if (!event_info_shmem) + return -ENOMEM; + + for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) + event_info_shmem[count++].event_idx = pmu_hw_event_map[i].event_idx; + + for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) { + for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) { + for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) + 
event_info_shmem[count++].event_idx = + pmu_cache_event_map[i][j][k].event_idx; + } + } + + base_addr = __pa(event_info_shmem); + if (IS_ENABLED(CONFIG_32BIT)) + ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_EVENT_GET_INFO, lower_32_bits(base_addr), + upper_32_bits(base_addr), count, 0, 0, 0); + else + ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_EVENT_GET_INFO, base_addr, 0, + count, 0, 0, 0); + if (ret.error) { + result = -EOPNOTSUPP; + goto free_mem; + } + + for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) { + if (!(event_info_shmem[i].output & RISCV_PMU_EVENT_INFO_OUTPUT_MASK)) + pmu_hw_event_map[i].event_idx = -ENOENT; + } + + count = ARRAY_SIZE(pmu_hw_event_map); + + for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) { + for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) { + for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) { + if (!(event_info_shmem[count].output & + RISCV_PMU_EVENT_INFO_OUTPUT_MASK)) + pmu_cache_event_map[i][j][k].event_idx = -ENOENT; + count++; + } + } + } + +free_mem: + kfree(event_info_shmem); + + return result; +} + static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) { struct sbiret ret; @@ -316,6 +376,15 @@ static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) static void pmu_sbi_check_std_events(struct work_struct *work) { + int ret; + + if (sbi_v3_available) { + ret = pmu_sbi_check_event_info(); + if (ret) + pr_err("pmu_sbi_check_event_info failed with error %d\n", ret); + return; + } + for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) pmu_sbi_check_event(&pmu_hw_event_map[i]); -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:25 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:25 -0700 Subject: [PATCH v6 6/8] RISC-V: KVM: No need of explicit writable slot check In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: 
<20250909-pmu_event_info-v6-6-d8f80cacb884@rivosinc.com> There is not much value in checking if a memslot is writable explicitly before a write as it may change underneath after the check. Rather, return invalid address error when write_guest fails as it checks if the slot is writable anyways. Suggested-by: Sean Christopherson Signed-off-by: Atish Patra --- arch/riscv/kvm/vcpu_pmu.c | 11 ++--------- arch/riscv/kvm/vcpu_sbi_sta.c | 9 ++------- 2 files changed, 4 insertions(+), 16 deletions(-) diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index 15d71a7b75ba..f8514086bd6b 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -409,8 +409,6 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s int snapshot_area_size = sizeof(struct riscv_pmu_snapshot_data); int sbiret = 0; gpa_t saddr; - unsigned long hva; - bool writable; if (!kvpmu || flags) { sbiret = SBI_ERR_INVALID_PARAM; @@ -432,19 +430,14 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s goto out; } - hva = kvm_vcpu_gfn_to_hva_prot(vcpu, saddr >> PAGE_SHIFT, &writable); - if (kvm_is_error_hva(hva) || !writable) { - sbiret = SBI_ERR_INVALID_ADDRESS; - goto out; - } - kvpmu->sdata = kzalloc(snapshot_area_size, GFP_ATOMIC); if (!kvpmu->sdata) return -ENOMEM; + /* No need to check writable slot explicitly as kvm_vcpu_write_guest does it internally */ if (kvm_vcpu_write_guest(vcpu, saddr, kvpmu->sdata, snapshot_area_size)) { kfree(kvpmu->sdata); - sbiret = SBI_ERR_FAILURE; + sbiret = SBI_ERR_INVALID_ADDRESS; goto out; } diff --git a/arch/riscv/kvm/vcpu_sbi_sta.c b/arch/riscv/kvm/vcpu_sbi_sta.c index cc6cb7c8f0e4..caaa28460ca4 100644 --- a/arch/riscv/kvm/vcpu_sbi_sta.c +++ b/arch/riscv/kvm/vcpu_sbi_sta.c @@ -85,8 +85,6 @@ static int kvm_sbi_sta_steal_time_set_shmem(struct kvm_vcpu *vcpu) unsigned long shmem_phys_hi = cp->a1; u32 flags = cp->a2; struct sbi_sta_struct zero_sta = {0}; - unsigned long hva; - bool 
writable; gpa_t shmem; int ret; @@ -111,13 +109,10 @@ static int kvm_sbi_sta_steal_time_set_shmem(struct kvm_vcpu *vcpu) return SBI_ERR_INVALID_ADDRESS; } - hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable); - if (kvm_is_error_hva(hva) || !writable) - return SBI_ERR_INVALID_ADDRESS; - + /* No need to check writable slot explicitly as kvm_vcpu_write_guest does it internally */ ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); if (ret) - return SBI_ERR_FAILURE; + return SBI_ERR_INVALID_ADDRESS; vcpu->arch.sta.shmem = shmem; vcpu->arch.sta.last_steal = current->sched_info.run_delay; -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:26 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:26 -0700 Subject: [PATCH v6 7/8] RISC-V: KVM: Implement get event info function In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-7-d8f80cacb884@rivosinc.com> The new get_event_info function allows the guest to query the presence of multiple events with a single SBI call. Currently, the perf driver in a Linux guest invokes it for all the standard SBI PMU events. Support the SBI function implementation in KVM as well.
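[Editorial illustration, not part of the patch] The argument checks that the handler in this patch performs before touching guest memory can be sketched as a standalone function. This is a sketch under assumptions: event_info_check_args() and its is_32bit parameter are hypothetical names for the example, and the error values are the ones defined by the SBI specification. It follows the same order as the diff: flags, 16-byte alignment and a non-zero event count first, then the high half of the address.

```c
#include <stdint.h>

/* Error codes as defined by the SBI specification. */
#define SBI_SUCCESS              0
#define SBI_ERR_INVALID_PARAM   (-3)
#define SBI_ERR_INVALID_ADDRESS (-5)

/*
 * Hypothetical helper: validates a get_event_info request and, on
 * success, stores the guest physical address of the array in *gpa.
 */
static long event_info_check_args(uint64_t saddr_low, uint64_t saddr_high,
				  uint64_t num_events, uint64_t flags,
				  int is_32bit, uint64_t *gpa)
{
	/* flags must be zero, the array 16-byte aligned and non-empty */
	if (flags != 0 || (saddr_low & 15) || num_events == 0)
		return SBI_ERR_INVALID_PARAM;

	*gpa = saddr_low;
	if (saddr_high != 0) {
		/* a high half is only meaningful for 32-bit guests */
		if (!is_32bit)
			return SBI_ERR_INVALID_ADDRESS;
		*gpa |= saddr_high << 32;
	}

	return SBI_SUCCESS;
}
```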
Reviewed-by: Anup Patel Signed-off-by: Atish Patra --- arch/riscv/include/asm/kvm_vcpu_pmu.h | 3 ++ arch/riscv/kvm/vcpu_pmu.c | 59 +++++++++++++++++++++++++++++++++++ arch/riscv/kvm/vcpu_sbi_pmu.c | 3 ++ 3 files changed, 65 insertions(+) diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h index 1d85b6617508..9a930afc8f57 100644 --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h @@ -98,6 +98,9 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long saddr_low, unsigned long saddr_high, unsigned long flags, struct kvm_vcpu_sbi_return *retdata); +int kvm_riscv_vcpu_pmu_event_info(struct kvm_vcpu *vcpu, unsigned long saddr_low, + unsigned long saddr_high, unsigned long num_events, + unsigned long flags, struct kvm_vcpu_sbi_return *retdata); void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu); diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index f8514086bd6b..a2fae70ee174 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -449,6 +449,65 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s return 0; } +int kvm_riscv_vcpu_pmu_event_info(struct kvm_vcpu *vcpu, unsigned long saddr_low, + unsigned long saddr_high, unsigned long num_events, + unsigned long flags, struct kvm_vcpu_sbi_return *retdata) +{ + struct riscv_pmu_event_info *einfo = NULL; + int shmem_size = num_events * sizeof(*einfo); + gpa_t shmem; + u32 eidx, etype; + u64 econfig; + int ret; + + if (flags != 0 || (saddr_low & (SZ_16 - 1) || num_events == 0)) { + ret = SBI_ERR_INVALID_PARAM; + goto out; + } + + shmem = saddr_low; + if (saddr_high != 0) { + if (IS_ENABLED(CONFIG_32BIT)) { + shmem |= ((gpa_t)saddr_high << 32); + } else { + ret = SBI_ERR_INVALID_ADDRESS; + goto out; + } + } + + einfo = kzalloc(shmem_size, GFP_KERNEL); + if 
(!einfo) + return -ENOMEM; + + ret = kvm_vcpu_read_guest(vcpu, shmem, einfo, shmem_size); + if (ret) { + ret = SBI_ERR_FAILURE; + goto free_mem; + } + + for (int i = 0; i < num_events; i++) { + eidx = einfo[i].event_idx; + etype = kvm_pmu_get_perf_event_type(eidx); + econfig = kvm_pmu_get_perf_event_config(eidx, einfo[i].event_data); + ret = riscv_pmu_get_event_info(etype, econfig, NULL); + einfo[i].output = (ret > 0) ? 1 : 0; + } + + ret = kvm_vcpu_write_guest(vcpu, shmem, einfo, shmem_size); + if (ret) { + ret = SBI_ERR_INVALID_ADDRESS; + goto free_mem; + } + + ret = 0; +free_mem: + kfree(einfo); +out: + retdata->err_val = ret; + + return 0; +} + int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata) { diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c index e4be34e03e83..a020d979d179 100644 --- a/arch/riscv/kvm/vcpu_sbi_pmu.c +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c @@ -73,6 +73,9 @@ static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run, case SBI_EXT_PMU_SNAPSHOT_SET_SHMEM: ret = kvm_riscv_vcpu_pmu_snapshot_set_shmem(vcpu, cp->a0, cp->a1, cp->a2, retdata); break; + case SBI_EXT_PMU_EVENT_GET_INFO: + ret = kvm_riscv_vcpu_pmu_event_info(vcpu, cp->a0, cp->a1, cp->a2, cp->a3, retdata); + break; default: retdata->err_val = SBI_ERR_NOT_SUPPORTED; } -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:27 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:27 -0700 Subject: [PATCH v6 8/8] RISC-V: KVM: Upgrade the supported SBI version to 3.0 In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-8-d8f80cacb884@rivosinc.com> Upgrade the SBI version to v3.0 so that corresponding features can be enabled in the guest. 
Reviewed-by: Anup Patel Signed-off-by: Atish Patra --- arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index d678fd7e5973..f9c350ab84d9 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -11,7 +11,7 @@ #define KVM_SBI_IMPID 3 -#define KVM_SBI_VERSION_MAJOR 2 +#define KVM_SBI_VERSION_MAJOR 3 #define KVM_SBI_VERSION_MINOR 0 enum kvm_riscv_sbi_ext_status { -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:21 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:21 -0700 Subject: [PATCH v6 2/8] drivers/perf: riscv: Add raw event v2 support In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-2-d8f80cacb884@rivosinc.com> SBI v3.0 introduced a new raw event type that allows wider mhpmeventX width to be programmed via CFG_MATCH. Use the raw event v2 if SBI v3.0 is available. 
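[Editorial illustration, not part of the patch] The choice between the two raw event encodings can be sketched in userspace C. This is a simplified sketch under assumptions: raw_event_idx() is a hypothetical helper, GENMASK_ULL is re-implemented locally, and the check on bits 63:62 that the driver does first is left out. The mask widths (48 and 56 bits) and the event indices (0x20000 and 0x30000) are the values used in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l). */
#define GENMASK_ULL(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

#define RAW_EVENT_MASK    GENMASK_ULL(47, 0) /* raw event v1: 48 bits */
#define RAW_EVENT_V2_MASK GENMASK_ULL(55, 0) /* raw event v2: 56 bits */
#define RAW_EVENT_IDX     0x20000
#define RAW_EVENT_V2_IDX  0x30000

/*
 * Hypothetical helper: returns the SBI event index for a hardware raw
 * event, or -1 if config has bits set outside the allowed width.
 */
static int raw_event_idx(uint64_t config, int sbi_v3, uint64_t *econfig)
{
	uint64_t mask = sbi_v3 ? RAW_EVENT_V2_MASK : RAW_EVENT_MASK;

	if (config & ~mask)
		return -1;
	if (econfig)
		*econfig = config & mask;

	return sbi_v3 ? RAW_EVENT_V2_IDX : RAW_EVENT_IDX;
}
```

A config value with bit 50 set is therefore rejected when only the v1 encoding is available and accepted with the v2 encoding.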
Reviewed-by: Anup Patel Signed-off-by: Atish Patra --- arch/riscv/include/asm/sbi.h | 4 ++++ drivers/perf/riscv_pmu_sbi.c | 16 +++++++++++----- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 341e74238aa0..b0c41ef56968 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -161,7 +161,10 @@ struct riscv_pmu_snapshot_data { #define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0) #define RISCV_PMU_PLAT_FW_EVENT_MASK GENMASK_ULL(61, 0) +/* SBI v3.0 allows extended hpmeventX width value */ +#define RISCV_PMU_RAW_EVENT_V2_MASK GENMASK_ULL(55, 0) #define RISCV_PMU_RAW_EVENT_IDX 0x20000 +#define RISCV_PMU_RAW_EVENT_V2_IDX 0x30000 #define RISCV_PLAT_FW_EVENT 0xFFFF /** General pmu event codes specified in SBI PMU extension */ @@ -219,6 +222,7 @@ enum sbi_pmu_event_type { SBI_PMU_EVENT_TYPE_HW = 0x0, SBI_PMU_EVENT_TYPE_CACHE = 0x1, SBI_PMU_EVENT_TYPE_RAW = 0x2, + SBI_PMU_EVENT_TYPE_RAW_V2 = 0x3, SBI_PMU_EVENT_TYPE_FW = 0xf, }; diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index cfd6946fca42..3644bed4c8ab 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -59,7 +59,7 @@ asm volatile(ALTERNATIVE( \ #define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS) #define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY) -PMU_FORMAT_ATTR(event, "config:0-47"); +PMU_FORMAT_ATTR(event, "config:0-55"); PMU_FORMAT_ATTR(firmware, "config:62-63"); static bool sbi_v2_available; @@ -527,8 +527,10 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) break; case PERF_TYPE_RAW: /* - * As per SBI specification, the upper 16 bits must be unused - * for a hardware raw event. + * As per SBI v0.3 specification, + * -- the upper 16 bits must be unused for a hardware raw event. + * As per SBI v2.0 specification, + * -- the upper 8 bits must be unused for a hardware raw event. 
* Bits 63:62 are used to distinguish between raw events * 00 - Hardware raw event * 10 - SBI firmware events @@ -537,8 +539,12 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) switch (config >> 62) { case 0: - /* Return error any bits [48-63] is set as it is not allowed by the spec */ - if (!(config & ~RISCV_PMU_RAW_EVENT_MASK)) { + if (sbi_v3_available) { + if (!(config & ~RISCV_PMU_RAW_EVENT_V2_MASK)) { + *econfig = config & RISCV_PMU_RAW_EVENT_V2_MASK; + ret = RISCV_PMU_RAW_EVENT_V2_IDX; + } + } else if (!(config & ~RISCV_PMU_RAW_EVENT_MASK)) { *econfig = config & RISCV_PMU_RAW_EVENT_MASK; ret = RISCV_PMU_RAW_EVENT_IDX; } -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:20 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:20 -0700 Subject: [PATCH v6 1/8] drivers/perf: riscv: Add SBI v3.0 flag In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-1-d8f80cacb884@rivosinc.com> There are new PMU related features introduced in SBI v3.0. 1. Raw Event v2 which allows mhpmeventX value to be 56 bit wide. 2. Get Event info function to do a bulk query at one shot. 
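[Editorial illustration, not part of the patch] The version gating that the new flag relies on can be sketched in userspace C. This is a sketch assuming the SBI specification version layout (major number in bits 30:24, minor number in bits 23:0); mk_version(), struct pmu_features and probe_features() are hypothetical names for the example, not the kernel's own helpers.

```c
/* Field layout of the SBI specification version value. */
#define SBI_VERSION_MAJOR_SHIFT 24
#define SBI_VERSION_MAJOR_MASK  0x7fUL
#define SBI_VERSION_MINOR_MASK  0xffffffUL

/* Pack a major/minor pair into a single comparable version value. */
static unsigned long mk_version(unsigned long major, unsigned long minor)
{
	return ((major & SBI_VERSION_MAJOR_MASK) << SBI_VERSION_MAJOR_SHIFT) |
	       (minor & SBI_VERSION_MINOR_MASK);
}

/* Hypothetical container for the per-version feature flags. */
struct pmu_features {
	int v2_available;
	int v3_available;
};

/* Derive the feature flags from the reported specification version. */
static struct pmu_features probe_features(unsigned long spec_version)
{
	struct pmu_features f = { 0, 0 };

	if (spec_version >= mk_version(2, 0))
		f.v2_available = 1;
	if (spec_version >= mk_version(3, 0))
		f.v3_available = 1;

	return f;
}
```

Because the major number sits above the minor number, a plain integer comparison orders versions correctly, which is why a single >= check per feature is enough.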
Reviewed-by: Anup Patel Signed-off-by: Atish Patra --- drivers/perf/riscv_pmu_sbi.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index 698de8ddf895..cfd6946fca42 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -63,6 +63,7 @@ PMU_FORMAT_ATTR(event, "config:0-47"); PMU_FORMAT_ATTR(firmware, "config:62-63"); static bool sbi_v2_available; +static bool sbi_v3_available; static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available); #define sbi_pmu_snapshot_available() \ static_branch_unlikely(&sbi_pmu_snapshot_available) @@ -1452,6 +1453,9 @@ static int __init pmu_sbi_devinit(void) if (sbi_spec_version >= sbi_mk_version(2, 0)) sbi_v2_available = true; + if (sbi_spec_version >= sbi_mk_version(3, 0)) + sbi_v3_available = true; + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING, "perf/riscv/pmu:starting", pmu_sbi_starting_cpu, pmu_sbi_dying_cpu); -- 2.43.0 From atishp at rivosinc.com Tue Sep 9 00:03:24 2025 From: atishp at rivosinc.com (Atish Patra) Date: Tue, 09 Sep 2025 00:03:24 -0700 Subject: [PATCH v6 5/8] drivers/perf: riscv: Export PMU event info function In-Reply-To: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> Message-ID: <20250909-pmu_event_info-v6-5-d8f80cacb884@rivosinc.com> The event mapping function can be used in event info function to find out the corresponding SBI PMU event encoding during the get_event_info function as well. Refactor and export it so that it can be invoked from kvm and internal driver. 
Signed-off-by: Atish Patra Reviewed-by: Anup Patel --- drivers/perf/riscv_pmu_sbi.c | 122 ++++++++++++++++++++++------------------- include/linux/perf/riscv_pmu.h | 1 + 2 files changed, 68 insertions(+), 55 deletions(-) diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index a6c479f853e1..0392900d828e 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -100,6 +100,7 @@ static unsigned int riscv_pmu_irq; /* Cache the available counters in a bitmask */ static unsigned long cmask; +static int pmu_event_find_cache(u64 config); struct sbi_pmu_event_data { union { union { @@ -412,6 +413,71 @@ static bool pmu_sbi_ctr_is_fw(int cidx) return (info->type == SBI_PMU_CTR_TYPE_FW) ? true : false; } +int riscv_pmu_get_event_info(u32 type, u64 config, u64 *econfig) +{ + int ret = -ENOENT; + + switch (type) { + case PERF_TYPE_HARDWARE: + if (config >= PERF_COUNT_HW_MAX) + return -EINVAL; + ret = pmu_hw_event_map[config].event_idx; + break; + case PERF_TYPE_HW_CACHE: + ret = pmu_event_find_cache(config); + break; + case PERF_TYPE_RAW: + /* + * As per SBI v0.3 specification, + * -- the upper 16 bits must be unused for a hardware raw event. + * As per SBI v2.0 specification, + * -- the upper 8 bits must be unused for a hardware raw event. 
+ * Bits 63:62 are used to distinguish between raw events + * 00 - Hardware raw event + * 10 - SBI firmware events + * 11 - Risc-V platform specific firmware event + */ + switch (config >> 62) { + case 0: + if (sbi_v3_available) { + /* Return error any bits [56-63] is set as it is not allowed by the spec */ + if (!(config & ~RISCV_PMU_RAW_EVENT_V2_MASK)) { + if (econfig) + *econfig = config & RISCV_PMU_RAW_EVENT_V2_MASK; + ret = RISCV_PMU_RAW_EVENT_V2_IDX; + } + /* Return error any bits [48-63] is set as it is not allowed by the spec */ + } else if (!(config & ~RISCV_PMU_RAW_EVENT_MASK)) { + if (econfig) + *econfig = config & RISCV_PMU_RAW_EVENT_MASK; + ret = RISCV_PMU_RAW_EVENT_IDX; + } + break; + case 2: + ret = (config & 0xFFFF) | (SBI_PMU_EVENT_TYPE_FW << 16); + break; + case 3: + /* + * For Risc-V platform specific firmware events + * Event code - 0xFFFF + * Event data - raw event encoding + */ + ret = SBI_PMU_EVENT_TYPE_FW << 16 | RISCV_PLAT_FW_EVENT; + if (econfig) + *econfig = config & RISCV_PMU_PLAT_FW_EVENT_MASK; + break; + default: + break; + } + break; + default: + break; + } + + return ret; +} +EXPORT_SYMBOL_GPL(riscv_pmu_get_event_info); + /* * Returns the counter width of a programmable counter and number of hardware * counters. 
As we don't support heterogeneous CPUs yet, it is okay to just @@ -577,7 +643,6 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) { u32 type = event->attr.type; u64 config = event->attr.config; - int ret = -ENOENT; /* * Ensure we are finished checking standard hardware events for @@ -585,60 +650,7 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) */ flush_work(&check_std_events_work); - switch (type) { - case PERF_TYPE_HARDWARE: - if (config >= PERF_COUNT_HW_MAX) - return -EINVAL; - ret = pmu_hw_event_map[event->attr.config].event_idx; - break; - case PERF_TYPE_HW_CACHE: - ret = pmu_event_find_cache(config); - break; - case PERF_TYPE_RAW: - /* - * As per SBI v0.3 specification, - * -- the upper 16 bits must be unused for a hardware raw event. - * As per SBI v2.0 specification, - * -- the upper 8 bits must be unused for a hardware raw event. - * Bits 63:62 are used to distinguish between raw events - * 00 - Hardware raw event - * 10 - SBI firmware events - * 11 - Risc-V platform specific firmware event - */ - - switch (config >> 62) { - case 0: - if (sbi_v3_available) { - if (!(config & ~RISCV_PMU_RAW_EVENT_V2_MASK)) { - *econfig = config & RISCV_PMU_RAW_EVENT_V2_MASK; - ret = RISCV_PMU_RAW_EVENT_V2_IDX; - } - } else if (!(config & ~RISCV_PMU_RAW_EVENT_MASK)) { - *econfig = config & RISCV_PMU_RAW_EVENT_MASK; - ret = RISCV_PMU_RAW_EVENT_IDX; - } - break; - case 2: - ret = (config & 0xFFFF) | (SBI_PMU_EVENT_TYPE_FW << 16); - break; - case 3: - /* - * For Risc-V platform specific firmware events - * Event code - 0xFFFF - * Event data - raw event encoding - */ - ret = SBI_PMU_EVENT_TYPE_FW << 16 | RISCV_PLAT_FW_EVENT; - *econfig = config & RISCV_PMU_PLAT_FW_EVENT_MASK; - break; - default: - break; - } - break; - default: - break; - } - - return ret; + return riscv_pmu_get_event_info(type, config, econfig); } static void pmu_sbi_snapshot_free(struct riscv_pmu *pmu) diff --git a/include/linux/perf/riscv_pmu.h 
b/include/linux/perf/riscv_pmu.h index 701974639ff2..f82a28040594 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -89,6 +89,7 @@ static inline void riscv_pmu_legacy_skip_init(void) {}; struct riscv_pmu *riscv_pmu_alloc(void); #ifdef CONFIG_RISCV_PMU_SBI int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr); +int riscv_pmu_get_event_info(u32 type, u64 config, u64 *econfig); #endif #endif /* CONFIG_RISCV_PMU */ -- 2.43.0 From kamil.nakielski at bizosto.pl Tue Sep 9 00:35:43 2025 From: kamil.nakielski at bizosto.pl (Kamil Nakielski) Date: Tue, 9 Sep 2025 07:35:43 GMT Subject: Hiring a new person Message-ID: <20250909064500-0.1.3u.19xrl.0.oz779x1a3j@bizosto.pl> Hello, I am contacting you about employing workers from Ukraine at your plant. We regularly serve manufacturing companies in this area. If you need additional staff, please send me a message. Best regards, Kamil Nakielski From marek.kucharski at fundixo.pl Tue Sep 9 00:45:43 2025 From: marek.kucharski at fundixo.pl (Marek Kucharski) Date: Tue, 9 Sep 2025 07:45:43 GMT Subject: Please get in touch Message-ID: <20250909084501-0.1.hj.43v4x.0.dqmljqubde@CyberCitadel.pl> Hello, Would it be possible to establish cooperation with you? I would be glad to speak with the person responsible for sales-related activities. We help acquire new customers effectively. Please feel free to contact me.
Best regards, Marek Kucharski From conor at kernel.org Tue Sep 9 01:35:46 2025 From: conor at kernel.org (Conor Dooley) Date: Tue, 9 Sep 2025 09:35:46 +0100 Subject: [PATCH v3 2/6] dt-bindings: riscv: microchip: document icicle kit with production device In-Reply-To: <20250909-accurate-wisteria-spider-222341@kuoka> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> <20250908115732.31092-3-valentina.fernandezalanis@microchip.com> <20250909-accurate-wisteria-spider-222341@kuoka> Message-ID: <20250909-division-fetch-01bf5027f6d9@spud> On Tue, Sep 09, 2025 at 08:53:05AM +0200, Krzysztof Kozlowski wrote: > On Mon, Sep 08, 2025 at 12:57:28PM +0100, Valentina Fernandez wrote: > > With the introduction of the Icicle Kit using the production MPFS250T > > device, it's necessary to distinguish it from the engineering sample > > (-es) variant. Engineering samples cannot write to flash from the MSS, > > as noted in the PolarFire SoC FPGA ES errata. > > > > Add specific compatibles for the Icicle Kit with Production device > > (MPFS250T) and Icicle Kit with Engineering Sample (MPFS250T_ES). > > > > The icicle kit reference designs in the v2025.07 release include the > > Mi-V IHC IP v2, used to send/receive data between clusters when > > using Asymmetric Multiprocessing (AMP) mode. > > > > In reference design releases prior to v2025.07, the MI-V IHC subsystem > > was included as a proof of concept in the design prior to becoming an > > IP available in the Libero catalog. > > > > Among other improvements, the new Mi-V IHC IP v2 includes some > > changes to the register map. For this reason, make use of a new > > reference design compatible to denote that v2025.07 reference design > > releases are not backwards compatible. > > > > Signed-off-by: Valentina Fernandez > > --- > > Documentation/devicetree/bindings/riscv/microchip.yaml | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > Why are you sending patches which are already applied? For two weeks?
That's probably my bad, I dropped the series when you had complaints about the version that I applied and forgot to mention it. From ajd at linux.ibm.com Tue Sep 9 02:13:23 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:23 +1000 Subject: [PATCH v17 00/12] Support page table check on PowerPC Message-ID: <20250909091335.183439-1-ajd@linux.ibm.com> Support page table check on all PowerPC platforms. This works by serialising assignments, reassignments and clears of page table entries at each level in order to ensure that anonymous mappings have at most one writable consumer, and likewise that file-backed mappings are not simultaneously also anonymous mappings. In order to support this infrastructure, a number of helpers or stubs must be defined or updated for all powerpc platforms. Additionally, we separate set_pte_at() and set_pte_at_unchecked(), to allow for internal, uninstrumented mappings. On some PowerPC platforms, implementing {pte,pmd,pud}_user_accessible_page() requires the address. We revert previous changes that removed the address parameter from various interfaces, and add it to some other interfaces, in order to allow this. Note that on 32 bit systems with CONFIG_KFENCE=y, you need [0] to avoid possible failures in init code (this is a code patching/static keys issue, which was discovered by a user testing this series but isn't a bug in page table check). (This series was initially written by Rohan McLure, who has left IBM and is no longer working on powerpc.)
[0] https://lore.kernel.org/linuxppc-dev/4b5e6eb281d7b1ea77619bee17095f905a125168.1757003584.git.christophe.leroy@csgroup.eu/ v17: * Rebase on mm-new to fix build failure on commit 3f3806eff23f ("riscv: use an atomic xchg in pudp_huge_get_and_clear()") * Remove patch 10 ("powerpc: mm: Add pud_pfn() stub"), as the original reasoning for the stub is now wrong (pud_pfn() is now used more broadly in generic code, and commit 35a76f5c0863 ("mm/arch: provide pud_pfn() fallback") now provides a generic fallback). This fixes the build failure on some powerpc platforms (0day) v16: * Rebase on mainline Link: https://lore.kernel.org/all/20250813062614.51759-1-ajd@linux.ibm.com/ v15: * Rebase on mainline, including commit 91e40668e70a ("mm/page_table_check: Batch-check pmds/puds just like ptes") and associated arm64 changes * Clarify/fix some commit messages * Fix handling of address in a loop in __page_table_check_ptes_set() Link: https://lore.kernel.org/all/20250625063753.77511-1-ajd@linux.ibm.com/ v14: * Fix a call to page_table_check_pud_set() that was missed (akpm) Link: https://lore.kernel.org/all/20250411054354.511145-1-ajd@linux.ibm.com/ v13: * Rebase on mainline * Don't use set_pte_at_unchecked() for early boot purposes (Pasha) Link: https://lore.kernel.org/linuxppc-dev/20250211161404.850215-1-ajd@linux.ibm.com/ v12: * Rename commits that revert changes to instead reflect that we are reinstating old behaviour due to it providing more flexibility * Add return line to pud_pfn() stub * Instrument ptep_get_and_clear() for nohash Link: https://lore.kernel.org/linuxppc-dev/20240402051154.476244-1-rmclure@linux.ibm.com/ v11: * The pud_pfn() stub, which previously had no legitimate users on any powerpc platform, now has users in Book3s64 with transparent pages. Include a stub of the same name for each platform that does not define their own. * Drop patch that standardised use of p*d_leaf(), as already included upstream in v6.9.
* Provide fallback definitions of p{m,u}d_user_accessible_page() that do not reference p*d_leaf(), p*d_pte(), as they are defined after powerpc/mm headers by linux/mm headers. * Ensure that set_pte_at_unchecked() has the same checks as set_pte_at(). Link: https://lore.kernel.org/linuxppc-dev/20240328045535.194800-14-rmclure@linux.ibm.com/ v10: * Revert patches that removed address and mm parameters from page table check routines, including consuming code from arm64, x86_64 and riscv. * Implement *_user_accessible_page() routines in terms of pte_user() where available (64-bit, book3s) but otherwise by checking the address (on platforms where the pte does not imply whether the mapping is for user or kernel) * Internal set_pte_at() calls replaced with set_pte_at_unchecked(), which is identical, but prevents double instrumentation. Link: https://lore.kernel.org/linuxppc-dev/20240313042118.230397-9-rmclure@linux.ibm.com/T/ v9: * Adapt to using the set_ptes() API, using __set_pte_at() where we must avoid instrumentation. * Use the logic of *_access_permitted() for implementing *_user_accessible_page(), which are required routines for page table check. * Even though we no longer need p{m,u,4}d_leaf(), still default implement these to assist in refactoring out extant p{m,u,4}d_is_leaf(). * Add p{m,u}d_pte() stubs where asm-generic does not provide them, as page table check wants all *user_accessible_page() variants, and we would like to default implement the variants in terms of pte_user_accessible_page(). * Avoid the ugly pmdp_collapse_flush() macro nonsense! Just instrument its constituent calls instead for radix and hash. Link: https://lore.kernel.org/linuxppc-dev/20231130025404.37179-2-rmclure@linux.ibm.com/ v8: * Fix linux/page_table_check.h include in asm/pgtable.h breaking 32-bit.
Link: https://lore.kernel.org/linuxppc-dev/20230215231153.2147454-1-rmclure@linux.ibm.com/ v7: * Remove use of extern in set_pte prototypes * Clean up pmdp_collapse_flush macro * Replace set_pte_at with static inline function * Fix commit message for patch 7 Link: https://lore.kernel.org/linuxppc-dev/20230215020155.1969194-1-rmclure@linux.ibm.com/ v6: * Support huge pages and p{m,u}d accounting. * Remove instrumentation from set_pte from kernel internal pages. * 64s: Implement pmdp_collapse_flush in terms of __pmdp_collapse_flush as access to the mm_struct * is required. Link: https://lore.kernel.org/linuxppc-dev/20230214015939.1853438-1-rmclure@linux.ibm.com/ v5: Link: https://lore.kernel.org/linuxppc-dev/20221118002146.25979-1-rmclure@linux.ibm.com/ Andrew Donnellan (2): arm64/mm: Add addr parameter to __set_ptes_anysz() arm64/mm: Add addr parameter to __ptep_get_and_clear_anysz() Rohan McLure (10): mm/page_table_check: Reinstate address parameter in [__]page_table_check_pud[s]_set() mm/page_table_check: Reinstate address parameter in [__]page_table_check_pmd[s]_set() mm/page_table_check: Provide addr parameter to page_table_check_ptes_set() mm/page_table_check: Reinstate address parameter in [__]page_table_check_pud_clear() mm/page_table_check: Reinstate address parameter in [__]page_table_check_pmd_clear() mm/page_table_check: Reinstate address parameter in [__]page_table_check_pte_clear() mm: Provide address parameter to p{te,md,ud}_user_accessible_page() powerpc: mm: Implement *_user_accessible_page() for ptes powerpc: mm: Use set_pte_at_unchecked() for internal usages powerpc: mm: Support page table check arch/arm64/include/asm/pgtable.h | 46 ++++++------- arch/arm64/mm/hugetlbpage.c | 17 ++--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 12 +++- arch/powerpc/include/asm/book3s/64/pgtable.h | 62 +++++++++++++++--- arch/powerpc/include/asm/nohash/pgtable.h | 13 +++- arch/powerpc/include/asm/pgtable.h | 10 +++ 
arch/powerpc/mm/book3s64/hash_pgtable.c | 4 ++ arch/powerpc/mm/book3s64/pgtable.c | 17 +++-- arch/powerpc/mm/book3s64/radix_pgtable.c | 9 ++- arch/powerpc/mm/pgtable.c | 12 ++++ arch/riscv/include/asm/pgtable.h | 22 +++---- arch/x86/include/asm/pgtable.h | 22 +++---- include/linux/page_table_check.h | 69 ++++++++++++-------- include/linux/pgtable.h | 10 +-- mm/page_table_check.c | 41 ++++++------ 16 files changed, 240 insertions(+), 127 deletions(-) -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:24 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:24 +1000 Subject: [PATCH v17 01/12] arm64/mm: Add addr parameter to __set_ptes_anysz() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-2-ajd@linux.ibm.com> To provide support for page table check on powerpc, we need to reinstate the address parameter in several functions, including page_table_check_{ptes,pmds,puds}_set(). In preparation for this, add the addr parameter to arm64's __set_ptes_anysz() and change its callsites accordingly. 
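Once addr is threaded through, any caller that walks contiguous entries one at a time has to advance the address together with the entry pointer, otherwise every entry would be checked against the first address. A minimal userspace sketch of that pattern follows; for_each_contig_entry(), entry_hook_t, struct addr_log and record_addr() are made-up names for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-entry hook, standing in for a call that now takes the address. */
typedef void (*entry_hook_t)(unsigned long addr, void *ctx);

/* Visit ncontig entries, advancing the address by one page size per entry. */
static void for_each_contig_entry(unsigned long addr, unsigned long pgsize,
				  unsigned int ncontig, entry_hook_t hook,
				  void *ctx)
{
	unsigned int i;

	assert(hook != NULL);
	for (i = 0; i < ncontig; i++, addr += pgsize)
		hook(addr, ctx);
}

/* Small fixed-size log so the visited addresses can be inspected afterwards. */
struct addr_log {
	unsigned long seen[8];
	unsigned int n;
};

static void record_addr(unsigned long addr, void *ctx)
{
	struct addr_log *log = ctx;

	if (log->n < 8)
		log->seen[log->n++] = addr;
}
```

Calling for_each_contig_entry(base, pgsize, n, record_addr, &log) leaves base, base + pgsize, and so on in log.seen, which is the sequence of addresses a per-entry check would receive.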
Signed-off-by: Andrew Donnellan --- v15: new patch v16: rebase --- arch/arm64/include/asm/pgtable.h | 19 ++++++++----------- arch/arm64/mm/hugetlbpage.c | 10 +++++----- 2 files changed, 13 insertions(+), 16 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index abd2dee416b3..ed644be48d87 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -698,8 +698,8 @@ static inline pgprot_t pud_pgprot(pud_t pud) return __pgprot(pud_val(pfn_pud(pfn, __pgprot(0))) ^ pud_val(pud)); } -static inline void __set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, - pte_t pte, unsigned int nr, +static inline void __set_ptes_anysz(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr, unsigned long pgsize) { unsigned long stride = pgsize >> PAGE_SHIFT; @@ -734,26 +734,23 @@ static inline void __set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, __set_pte_complete(pte); } -static inline void __set_ptes(struct mm_struct *mm, - unsigned long __always_unused addr, +static inline void __set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr) { - __set_ptes_anysz(mm, ptep, pte, nr, PAGE_SIZE); + __set_ptes_anysz(mm, addr, ptep, pte, nr, PAGE_SIZE); } -static inline void __set_pmds(struct mm_struct *mm, - unsigned long __always_unused addr, +static inline void __set_pmds(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd, unsigned int nr) { - __set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE); + __set_ptes_anysz(mm, addr, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE); } #define set_pmd_at(mm, addr, pmdp, pmd) __set_pmds(mm, addr, pmdp, pmd, 1) -static inline void __set_puds(struct mm_struct *mm, - unsigned long __always_unused addr, +static inline void __set_puds(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud, unsigned int nr) { - __set_ptes_anysz(mm, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE); + 
__set_ptes_anysz(mm, addr, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE); } #define set_pud_at(mm, addr, pudp, pud) __set_puds(mm, addr, pudp, pud, 1) diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 1d90a7e75333..1003b5020752 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -225,8 +225,8 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, ncontig = num_contig_ptes(sz, &pgsize); if (!pte_present(pte)) { - for (i = 0; i < ncontig; i++, ptep++) - __set_ptes_anysz(mm, ptep, pte, 1, pgsize); + for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) + __set_ptes_anysz(mm, addr, ptep, pte, 1, pgsize); return; } @@ -234,7 +234,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, if (pte_cont(pte) && pte_valid(__ptep_get(ptep))) clear_flush(mm, addr, ptep, pgsize, ncontig); - __set_ptes_anysz(mm, ptep, pte, ncontig, pgsize); + __set_ptes_anysz(mm, addr, ptep, pte, ncontig, pgsize); } pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, @@ -449,7 +449,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, if (pte_young(orig_pte)) pte = pte_mkyoung(pte); - __set_ptes_anysz(mm, ptep, pte, ncontig, pgsize); + __set_ptes_anysz(mm, addr, ptep, pte, ncontig, pgsize); return 1; } @@ -473,7 +473,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig); pte = pte_wrprotect(pte); - __set_ptes_anysz(mm, ptep, pte, ncontig, pgsize); + __set_ptes_anysz(mm, addr, ptep, pte, ncontig, pgsize); } pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:25 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:25 +1000 Subject: [PATCH v17 02/12] arm64/mm: Add addr parameter to __ptep_get_and_clear_anysz() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: 
<20250909091335.183439-3-ajd@linux.ibm.com> To provide support for page table check on powerpc, we need to reinstate the address parameter in several functions, including page_table_check_{pte,pmd,pud}_clear(). In preparation for this, add the addr parameter to arm64's __ptep_get_and_clear_anysz() and change its callsites accordingly. Signed-off-by: Andrew Donnellan --- v15: new patch --- arch/arm64/include/asm/pgtable.h | 5 +++-- arch/arm64/mm/hugetlbpage.c | 7 ++++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index ed644be48d87..66b5309fcad8 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1357,6 +1357,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma, #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */ static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm, + unsigned long address, pte_t *ptep, unsigned long pgsize) { @@ -1384,7 +1385,7 @@ static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm, static inline pte_t __ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *ptep) { - return __ptep_get_and_clear_anysz(mm, ptep, PAGE_SIZE); + return __ptep_get_and_clear_anysz(mm, address, ptep, PAGE_SIZE); } static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr, @@ -1423,7 +1424,7 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm, static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long address, pmd_t *pmdp) { - return pte_pmd(__ptep_get_and_clear_anysz(mm, (pte_t *)pmdp, PMD_SIZE)); + return pte_pmd(__ptep_get_and_clear_anysz(mm, address, (pte_t *)pmdp, PMD_SIZE)); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 1003b5020752..bcc28031eb7a 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c 
@@ -159,11 +159,12 @@ static pte_t get_clear_contig(struct mm_struct *mm, pte_t pte, tmp_pte; bool present; - pte = __ptep_get_and_clear_anysz(mm, ptep, pgsize); + pte = __ptep_get_and_clear_anysz(mm, addr, ptep, pgsize); present = pte_present(pte); while (--ncontig) { ptep++; - tmp_pte = __ptep_get_and_clear_anysz(mm, ptep, pgsize); + addr += pgsize; + tmp_pte = __ptep_get_and_clear_anysz(mm, addr, ptep, pgsize); if (present) { if (pte_dirty(tmp_pte)) pte = pte_mkdirty(pte); @@ -207,7 +208,7 @@ static void clear_flush(struct mm_struct *mm, unsigned long i, saddr = addr; for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) - __ptep_get_and_clear_anysz(mm, ptep, pgsize); + __ptep_get_and_clear_anysz(mm, addr, ptep, pgsize); if (mm == &init_mm) flush_tlb_kernel_range(saddr, addr); -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:26 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:26 +1000 Subject: [PATCH v17 03/12] mm/page_table_check: Reinstate address parameter in [__]page_table_check_pud[s]_set() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-4-ajd@linux.ibm.com> From: Rohan McLure This reverts commit 6d144436d954 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_puds_set(), page_table_check_puds_set() and the page_table_check_pud_set() wrapper macro. 
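The three layers named above (out-of-line checker, inline wrapper that bails out early when checking is disabled, and a single-entry macro that forwards nr = 1) can be sketched in plain C. This is an illustration under assumptions: a bool stands in for the static key, and do_check_puds_set(), check_puds_set() and check_pud_set() are hypothetical names, not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the static key; checking starts out disabled in this sketch. */
static bool check_disabled = true;

/* Bookkeeping so the effect of a call can be observed. */
static unsigned long last_addr;
static unsigned int last_nr;
static unsigned int calls;

/* Out-of-line worker: only reached when checking is enabled. */
static void do_check_puds_set(unsigned long addr, unsigned int nr)
{
	assert(nr > 0);
	last_addr = addr;
	last_nr = nr;
	calls++;
}

/* Inline wrapper: cheap early return when checking is disabled. */
static inline void check_puds_set(unsigned long addr, unsigned int nr)
{
	if (check_disabled)
		return;
	do_check_puds_set(addr, nr);
}

/* Single-entry convenience macro: forwards the address with nr = 1. */
#define check_pud_set(addr) check_puds_set(addr, 1)
```

Adding the address therefore means touching all three layers plus every caller, which is why the diffstat for such a small change spans several architectures.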
[ajd at linux.ibm.com: rebase on riscv + arm64 changes, update commit message] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v13: remove inaccurate comment on riscv in the commit message v14: fix an x86 usage I missed (found by akpm) v15: rebase, amend commit message --- arch/arm64/include/asm/pgtable.h | 3 ++- arch/riscv/include/asm/pgtable.h | 4 ++-- arch/x86/include/asm/pgtable.h | 4 ++-- include/linux/page_table_check.h | 12 ++++++------ mm/page_table_check.c | 4 ++-- 5 files changed, 14 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 66b5309fcad8..8070b653c409 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -713,7 +713,8 @@ static inline void __set_ptes_anysz(struct mm_struct *mm, unsigned long addr, break; #ifndef __PAGETABLE_PMD_FOLDED case PUD_SIZE: - page_table_check_puds_set(mm, (pud_t *)ptep, pte_pud(pte), nr); + page_table_check_puds_set(mm, addr, (pud_t *)ptep, + pte_pud(pte), nr); break; #endif default: diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index e69346307e78..3a113c837605 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -812,7 +812,7 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud) { - page_table_check_pud_set(mm, pudp, pud); + page_table_check_pud_set(mm, addr, pudp, pud); return __set_pte_at(mm, (pte_t *)pudp, pud_pte(pud)); } @@ -969,7 +969,7 @@ static inline void update_mmu_cache_pud(struct vm_area_struct *vma, static inline pud_t pudp_establish(struct vm_area_struct *vma, unsigned long address, pud_t *pudp, pud_t pud) { - page_table_check_pud_set(vma->vm_mm, pudp, pud); + page_table_check_pud_set(vma->vm_mm, address, pudp, pud); 
return __pud(atomic_long_xchg((atomic_long_t *)pudp, pud_val(pud))); } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index e33df3da6980..0603793acb3a 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1220,7 +1220,7 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud) { - page_table_check_pud_set(mm, pudp, pud); + page_table_check_pud_set(mm, addr, pudp, pud); native_set_pud(pudp, pud); } @@ -1371,7 +1371,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, static inline pud_t pudp_establish(struct vm_area_struct *vma, unsigned long address, pud_t *pudp, pud_t pud) { - page_table_check_pud_set(vma->vm_mm, pudp, pud); + page_table_check_pud_set(vma->vm_mm, address, pudp, pud); if (IS_ENABLED(CONFIG_SMP)) { return xchg(pudp, pud); } else { diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 289620d4aad3..0bf18b884a12 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -21,8 +21,8 @@ void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte, unsigned int nr); void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, unsigned int nr); -void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud, - unsigned int nr); +void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud, unsigned int nr); void __page_table_check_pte_clear_range(struct mm_struct *mm, unsigned long addr, pmd_t pmd); @@ -86,12 +86,12 @@ static inline void page_table_check_pmds_set(struct mm_struct *mm, } static inline void page_table_check_puds_set(struct mm_struct *mm, - pud_t *pudp, pud_t pud, unsigned int nr) + unsigned long addr, pud_t *pudp, pud_t pud, unsigned int nr) { if (static_branch_likely(&page_table_check_disabled)) return; - 
__page_table_check_puds_set(mm, pudp, pud, nr); + __page_table_check_puds_set(mm, addr, pudp, pud, nr); } static inline void page_table_check_pte_clear_range(struct mm_struct *mm, @@ -137,7 +137,7 @@ static inline void page_table_check_pmds_set(struct mm_struct *mm, } static inline void page_table_check_puds_set(struct mm_struct *mm, - pud_t *pudp, pud_t pud, unsigned int nr) + unsigned long addr, pud_t *pudp, pud_t pud, unsigned int nr) { } @@ -150,6 +150,6 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm, #endif /* CONFIG_PAGE_TABLE_CHECK */ #define page_table_check_pmd_set(mm, pmdp, pmd) page_table_check_pmds_set(mm, pmdp, pmd, 1) -#define page_table_check_pud_set(mm, pudp, pud) page_table_check_puds_set(mm, pudp, pud, 1) +#define page_table_check_pud_set(mm, addr, pudp, pud) page_table_check_puds_set(mm, addr, pudp, pud, 1) #endif /* __LINUX_PAGE_TABLE_CHECK_H */ diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 4eeca782b888..3c39e4375886 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -236,8 +236,8 @@ void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, } EXPORT_SYMBOL(__page_table_check_pmds_set); -void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud, - unsigned int nr) +void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud, unsigned int nr) { unsigned long stride = PUD_SIZE >> PAGE_SHIFT; unsigned int i; -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:27 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:27 +1000 Subject: [PATCH v17 04/12] mm/page_table_check: Reinstate address parameter in [__]page_table_check_pmd[s]_set() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-5-ajd@linux.ibm.com> From: Rohan McLure This reverts commit a3b837130b58 ("mm/page_table_check: remove 
unused parameter in [__]page_table_check_pmd_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_pmds_set(), page_table_check_pmds_set(), and the page_table_check_pmd_set() wrapper macro. [ajd at linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v13: remove inaccurate comment on riscv in the commit message v14: rebase v15: rebase, amend commit message --- arch/arm64/include/asm/pgtable.h | 5 +++-- arch/riscv/include/asm/pgtable.h | 4 ++-- arch/x86/include/asm/pgtable.h | 4 ++-- include/linux/page_table_check.h | 12 ++++++------ mm/page_table_check.c | 4 ++-- 5 files changed, 15 insertions(+), 14 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 8070b653c409..9fe3af8b4cad 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -709,7 +709,8 @@ static inline void __set_ptes_anysz(struct mm_struct *mm, unsigned long addr, page_table_check_ptes_set(mm, ptep, pte, nr); break; case PMD_SIZE: - page_table_check_pmds_set(mm, (pmd_t *)ptep, pte_pmd(pte), nr); + page_table_check_pmds_set(mm, addr, (pmd_t *)ptep, + pte_pmd(pte), nr); break; #ifndef __PAGETABLE_PMD_FOLDED case PUD_SIZE: @@ -1514,7 +1515,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, static inline pmd_t pmdp_establish(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t pmd) { - page_table_check_pmd_set(vma->vm_mm, pmdp, pmd); + page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd); return __pmd(xchg_relaxed(&pmd_val(*pmdp), pmd_val(pmd))); } #endif diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 
3a113c837605..98e56d4ff840 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -805,7 +805,7 @@ static inline pud_t pud_mkspecial(pud_t pud) static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { - page_table_check_pmd_set(mm, pmdp, pmd); + page_table_check_pmd_set(mm, addr, pmdp, pmd); return __set_pte_at(mm, (pte_t *)pmdp, pmd_pte(pmd)); } @@ -876,7 +876,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, static inline pmd_t pmdp_establish(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t pmd) { - page_table_check_pmd_set(vma->vm_mm, pmdp, pmd); + page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd); return __pmd(atomic_long_xchg((atomic_long_t *)pmdp, pmd_val(pmd))); } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 0603793acb3a..8ee301b16b50 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1213,7 +1213,7 @@ static inline pud_t native_local_pudp_get_and_clear(pud_t *pudp) static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { - page_table_check_pmd_set(mm, pmdp, pmd); + page_table_check_pmd_set(mm, addr, pmdp, pmd); set_pmd(pmdp, pmd); } @@ -1356,7 +1356,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, static inline pmd_t pmdp_establish(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t pmd) { - page_table_check_pmd_set(vma->vm_mm, pmdp, pmd); + page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd); if (IS_ENABLED(CONFIG_SMP)) { return xchg(pmdp, pmd); } else { diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 0bf18b884a12..cf7c28d8d468 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -19,8 +19,8 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd); void __page_table_check_pud_clear(struct mm_struct *mm, 
pud_t pud); void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte, unsigned int nr); -void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, - unsigned int nr); +void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd, unsigned int nr); void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud, unsigned int nr); void __page_table_check_pte_clear_range(struct mm_struct *mm, @@ -77,12 +77,12 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm, } static inline void page_table_check_pmds_set(struct mm_struct *mm, - pmd_t *pmdp, pmd_t pmd, unsigned int nr) + unsigned long addr, pmd_t *pmdp, pmd_t pmd, unsigned int nr) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pmds_set(mm, pmdp, pmd, nr); + __page_table_check_pmds_set(mm, addr, pmdp, pmd, nr); } static inline void page_table_check_puds_set(struct mm_struct *mm, @@ -132,7 +132,7 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm, } static inline void page_table_check_pmds_set(struct mm_struct *mm, - pmd_t *pmdp, pmd_t pmd, unsigned int nr) + unsigned long addr, pmd_t *pmdp, pmd_t pmd, unsigned int nr) { } @@ -149,7 +149,7 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm, #endif /* CONFIG_PAGE_TABLE_CHECK */ -#define page_table_check_pmd_set(mm, pmdp, pmd) page_table_check_pmds_set(mm, pmdp, pmd, 1) +#define page_table_check_pmd_set(mm, addr, pmdp, pmd) page_table_check_pmds_set(mm, addr, pmdp, pmd, 1) #define page_table_check_pud_set(mm, addr, pudp, pud) page_table_check_puds_set(mm, addr, pudp, pud, 1) #endif /* __LINUX_PAGE_TABLE_CHECK_H */ diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 3c39e4375886..09258f2ad93f 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -218,8 +218,8 @@ static inline void page_table_check_pmd_flags(pmd_t pmd) 
WARN_ON_ONCE(swap_cached_writable(pmd_to_swp_entry(pmd))); } -void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, - unsigned int nr) +void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd, unsigned int nr) { unsigned long stride = PMD_SIZE >> PAGE_SHIFT; unsigned int i; -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:28 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:28 +1000 Subject: [PATCH v17 05/12] mm/page_table_check: Provide addr parameter to page_table_check_ptes_set() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-6-ajd@linux.ibm.com> From: Rohan McLure To provide support for powerpc platforms, provide an addr parameter to the __page_table_check_ptes_set() and page_table_check_ptes_set() routines. This parameter is needed on some powerpc platforms which do not encode whether a mapping is for user or kernel in the pte. On such platforms, this can be inferred from the addr parameter. 
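To make the inference concrete, here is a rough userspace sketch of a user-accessibility test that relies on the address rather than on a bit in the pte. Everything in it is an assumption for illustration: SKETCH_KERNEL_START is an arbitrary split, and struct sketch_pte, sketch_is_kernel_addr() and sketch_pte_user_accessible_page() are hypothetical names, not the powerpc helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user/kernel split; real platforms define their own boundary. */
#define SKETCH_KERNEL_START 0xc0000000UL

/* Minimal pte model: this platform's pte carries no user/kernel bit. */
struct sketch_pte {
	bool present;
};

static bool sketch_is_kernel_addr(unsigned long addr)
{
	return addr >= SKETCH_KERNEL_START;
}

/* A page counts as user accessible if it is mapped and the address is a user address. */
static bool sketch_pte_user_accessible_page(struct sketch_pte pte,
					    unsigned long addr)
{
	return pte.present && !sketch_is_kernel_addr(addr);
}

/* Usage example: the same pte value gives different answers at different addresses. */
static void sketch_example(void)
{
	struct sketch_pte pte = { .present = true };

	assert(sketch_pte_user_accessible_page(pte, 0x10000000UL));
	assert(!sketch_pte_user_accessible_page(pte, 0xc0001000UL));
}
```

Since the pte alone cannot answer the question on such platforms, the checker has to be handed the address at every set and clear site.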
[ajd at linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v15: rebase, amend commit message --- arch/arm64/include/asm/pgtable.h | 2 +- arch/riscv/include/asm/pgtable.h | 2 +- include/linux/page_table_check.h | 12 +++++++----- include/linux/pgtable.h | 2 +- mm/page_table_check.c | 4 ++-- 5 files changed, 12 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 9fe3af8b4cad..06ea6a4f300b 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -706,7 +706,7 @@ static inline void __set_ptes_anysz(struct mm_struct *mm, unsigned long addr, switch (pgsize) { case PAGE_SIZE: - page_table_check_ptes_set(mm, ptep, pte, nr); + page_table_check_ptes_set(mm, addr, ptep, pte, nr); break; case PMD_SIZE: page_table_check_pmds_set(mm, addr, (pmd_t *)ptep, diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 98e56d4ff840..bef95776504d 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -560,7 +560,7 @@ static inline void __set_pte_at(struct mm_struct *mm, pte_t *ptep, pte_t pteval) static inline void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pteval, unsigned int nr) { - page_table_check_ptes_set(mm, ptep, pteval, nr); + page_table_check_ptes_set(mm, addr, ptep, pteval, nr); for (;;) { __set_pte_at(mm, ptep, pteval); diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index cf7c28d8d468..66e109238416 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -17,8 +17,8 @@ void __page_table_check_zero(struct page *page, unsigned int order); void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte); void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd); void 
__page_table_check_pud_clear(struct mm_struct *mm, pud_t pud); -void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte, - unsigned int nr); +void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr); void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd, unsigned int nr); void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, @@ -68,12 +68,13 @@ static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud) } static inline void page_table_check_ptes_set(struct mm_struct *mm, - pte_t *ptep, pte_t pte, unsigned int nr) + unsigned long addr, pte_t *ptep, + pte_t pte, unsigned int nr) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_ptes_set(mm, ptep, pte, nr); + __page_table_check_ptes_set(mm, addr, ptep, pte, nr); } static inline void page_table_check_pmds_set(struct mm_struct *mm, @@ -127,7 +128,8 @@ static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud) } static inline void page_table_check_ptes_set(struct mm_struct *mm, - pte_t *ptep, pte_t pte, unsigned int nr) + unsigned long addr, pte_t *ptep, + pte_t pte, unsigned int nr) { } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 94249e671a7e..a422fdf31ffb 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -289,7 +289,7 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) static inline void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr) { - page_table_check_ptes_set(mm, ptep, pte, nr); + page_table_check_ptes_set(mm, addr, ptep, pte, nr); for (;;) { set_pte(ptep, pte); diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 09258f2ad93f..0957767a2940 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -193,8 +193,8 @@ static inline void page_table_check_pte_flags(pte_t pte) 
WARN_ON_ONCE(swap_cached_writable(pte_to_swp_entry(pte))); } -void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte, - unsigned int nr) +void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr) { unsigned int i; -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:29 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:29 +1000 Subject: [PATCH v17 06/12] mm/page_table_check: Reinstate address parameter in [__]page_table_check_pud_clear() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-7-ajd@linux.ibm.com> From: Rohan McLure This reverts commit 931c38e16499 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. 
[ajd at linux.ibm.com: rebase on arm64 changes] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Signed-off-by: Andrew Donnellan --- v15: rebase v17: rebase, fix conflict with riscv patch --- arch/arm64/include/asm/pgtable.h | 2 +- arch/riscv/include/asm/pgtable.h | 2 +- arch/x86/include/asm/pgtable.h | 2 +- include/linux/page_table_check.h | 11 +++++++---- include/linux/pgtable.h | 2 +- mm/page_table_check.c | 5 +++-- 6 files changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 06ea6a4f300b..81f06e5e32b2 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1374,7 +1374,7 @@ static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm, break; #ifndef __PAGETABLE_PMD_FOLDED case PUD_SIZE: - page_table_check_pud_clear(mm, pte_pud(pte)); + page_table_check_pud_clear(mm, address, pte_pud(pte)); break; #endif default: diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index bef95776504d..3d152933eb99 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -948,7 +948,7 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, { pud_t pud = __pud(atomic_long_xchg((atomic_long_t *)pudp, 0)); - page_table_check_pud_clear(mm, pud); + page_table_check_pud_clear(mm, address, pud); return pud; } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 8ee301b16b50..8b45e0c41923 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1329,7 +1329,7 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, { pud_t pud = native_pudp_get_and_clear(pudp); - page_table_check_pud_clear(mm, pud); + page_table_check_pud_clear(mm, addr, pud); return pud; } diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 66e109238416..808cc3a48c28 100644 --- 
a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -16,7 +16,8 @@ extern struct page_ext_operations page_table_check_ops; void __page_table_check_zero(struct page *page, unsigned int order); void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte); void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd); -void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud); +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud); void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr); void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, @@ -59,12 +60,13 @@ static inline void page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) __page_table_check_pmd_clear(mm, pmd); } -static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud) +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pud_clear(mm, pud); + __page_table_check_pud_clear(mm, addr, pud); } static inline void page_table_check_ptes_set(struct mm_struct *mm, @@ -123,7 +125,8 @@ static inline void page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) { } -static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud) +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) { } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index a422fdf31ffb..6d00d0948bf4 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -661,7 +661,7 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, pud_t pud = *pudp; pud_clear(pudp); - page_table_check_pud_clear(mm, pud); + page_table_check_pud_clear(mm, address, pud); return pud; } diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 
0957767a2940..bd1242087a35 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -167,7 +167,8 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) } EXPORT_SYMBOL(__page_table_check_pmd_clear); -void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud) +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud) { if (&init_mm == mm) return; @@ -246,7 +247,7 @@ void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, return; for (i = 0; i < nr; i++) - __page_table_check_pud_clear(mm, *(pudp + i)); + __page_table_check_pud_clear(mm, addr + PUD_SIZE * i, *(pudp + i)); if (pud_user_accessible_page(pud)) page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud)); } -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:30 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:30 +1000 Subject: [PATCH v17 07/12] mm/page_table_check: Reinstate address parameter in [__]page_table_check_pmd_clear() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-8-ajd@linux.ibm.com> From: Rohan McLure This reverts commit 1831414cd729 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pmd_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. 
[ajd at linux.ibm.com: rebase on arm64 changes] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v15: rebase --- arch/arm64/include/asm/pgtable.h | 2 +- arch/riscv/include/asm/pgtable.h | 2 +- arch/x86/include/asm/pgtable.h | 2 +- include/linux/page_table_check.h | 11 +++++++---- include/linux/pgtable.h | 2 +- mm/page_table_check.c | 5 +++-- 6 files changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 81f06e5e32b2..dfcdf051b114 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1370,7 +1370,7 @@ static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm, page_table_check_pte_clear(mm, pte); break; case PMD_SIZE: - page_table_check_pmd_clear(mm, pte_pmd(pte)); + page_table_check_pmd_clear(mm, address, pte_pmd(pte)); break; #ifndef __PAGETABLE_PMD_FOLDED case PUD_SIZE: diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 3d152933eb99..d8bf210b57aa 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -860,7 +860,7 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, { pmd_t pmd = __pmd(atomic_long_xchg((atomic_long_t *)pmdp, 0)); - page_table_check_pmd_clear(mm, pmd); + page_table_check_pmd_clear(mm, address, pmd); return pmd; } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 8b45e0c41923..b68bea15f32d 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1318,7 +1318,7 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long { pmd_t pmd = native_pmdp_get_and_clear(pmdp); - page_table_check_pmd_clear(mm, pmd); + page_table_check_pmd_clear(mm, addr, pmd); return pmd; } diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 
808cc3a48c28..3973b69ae294 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -15,7 +15,8 @@ extern struct page_ext_operations page_table_check_ops; void __page_table_check_zero(struct page *page, unsigned int order); void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte); -void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd); +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd); void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, pud_t pud); void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, @@ -52,12 +53,13 @@ static inline void page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) __page_table_check_pte_clear(mm, pte); } -static inline void page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pmd_clear(mm, pmd); + __page_table_check_pmd_clear(mm, addr, pmd); } static inline void page_table_check_pud_clear(struct mm_struct *mm, @@ -121,7 +123,8 @@ static inline void page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) { } -static inline void page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) { } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 6d00d0948bf4..46fe3daa4b18 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -648,7 +648,7 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, pmd_t pmd = *pmdp; pmd_clear(pmdp); - page_table_check_pmd_clear(mm, pmd); + page_table_check_pmd_clear(mm, address, pmd); return pmd; } diff --git a/mm/page_table_check.c b/mm/page_table_check.c index bd1242087a35..e8c26b616aed 100644 --- a/mm/page_table_check.c +++ 
b/mm/page_table_check.c @@ -156,7 +156,8 @@ void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) } EXPORT_SYMBOL(__page_table_check_pte_clear); -void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd) +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd) { if (&init_mm == mm) return; @@ -231,7 +232,7 @@ void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, page_table_check_pmd_flags(pmd); for (i = 0; i < nr; i++) - __page_table_check_pmd_clear(mm, *(pmdp + i)); + __page_table_check_pmd_clear(mm, addr + PMD_SIZE * i, *(pmdp + i)); if (pmd_user_accessible_page(pmd)) page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd)); } -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:31 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:31 +1000 Subject: [PATCH v17 08/12] mm/page_table_check: Reinstate address parameter in [__]page_table_check_pte_clear() In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-9-ajd@linux.ibm.com> From: Rohan McLure This reverts commit aa232204c468 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pte_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. 
[ajd at linux.ibm.com: rebase, fix additional occurrence and loop handling] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v13: fix an additional occurrence v15: rebase, fix loop handling --- arch/arm64/include/asm/pgtable.h | 2 +- arch/riscv/include/asm/pgtable.h | 2 +- arch/x86/include/asm/pgtable.h | 4 ++-- include/linux/page_table_check.h | 11 +++++++---- include/linux/pgtable.h | 4 ++-- mm/page_table_check.c | 7 ++++--- 6 files changed, 17 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index dfcdf051b114..2203ebac81d9 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1367,7 +1367,7 @@ static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm, switch (pgsize) { case PAGE_SIZE: - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, address, pte); break; case PMD_SIZE: page_table_check_pmd_clear(mm, address, pte_pmd(pte)); diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index d8bf210b57aa..65e8bc4ce45e 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -591,7 +591,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, { pte_t pte = __pte(atomic_long_xchg((atomic_long_t *)ptep, 0)); - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, address, pte); return pte; } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index b68bea15f32d..63350b76c0c6 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1251,7 +1251,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { pte_t pte = native_ptep_get_and_clear(ptep); - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, addr, pte); return pte; } @@ -1267,7 +1267,7 @@ 
static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, * care about updates and native needs no locking */ pte = native_local_ptep_get_and_clear(ptep); - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, addr, pte); } else { pte = ptep_get_and_clear(mm, addr, ptep); } diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 3973b69ae294..12268a32e8be 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -14,7 +14,8 @@ extern struct static_key_true page_table_check_disabled; extern struct page_ext_operations page_table_check_ops; void __page_table_check_zero(struct page *page, unsigned int order); -void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte); +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte); void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, pmd_t pmd); void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, @@ -45,12 +46,13 @@ static inline void page_table_check_free(struct page *page, unsigned int order) __page_table_check_zero(page, order); } -static inline void page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pte_clear(mm, pte); + __page_table_check_pte_clear(mm, addr, pte); } static inline void page_table_check_pmd_clear(struct mm_struct *mm, @@ -119,7 +121,8 @@ static inline void page_table_check_free(struct page *page, unsigned int order) { } -static inline void page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) { } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 46fe3daa4b18..e3f302ddb734 100644 --- a/include/linux/pgtable.h +++ 
b/include/linux/pgtable.h @@ -494,7 +494,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, { pte_t pte = ptep_get(ptep); pte_clear(mm, address, ptep); - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, address, pte); return pte; } #endif @@ -553,7 +553,7 @@ static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, * No need for ptep_get_and_clear(): page table check doesn't care about * any bits that could have been set by HW concurrently. */ - page_table_check_pte_clear(mm, pte); + page_table_check_pte_clear(mm, addr, pte); } #ifdef CONFIG_GUP_GET_PXX_LOW_HIGH diff --git a/mm/page_table_check.c b/mm/page_table_check.c index e8c26b616aed..1c33439b9c0b 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -145,7 +145,8 @@ void __page_table_check_zero(struct page *page, unsigned int order) rcu_read_unlock(); } -void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte) +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte) { if (&init_mm == mm) return; @@ -206,7 +207,7 @@ void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, page_table_check_pte_flags(pte); for (i = 0; i < nr; i++) - __page_table_check_pte_clear(mm, ptep_get(ptep + i)); + __page_table_check_pte_clear(mm, addr + PAGE_SIZE * i, ptep_get(ptep + i)); if (pte_user_accessible_page(pte)) page_table_check_set(pte_pfn(pte), nr, pte_write(pte)); } @@ -268,7 +269,7 @@ void __page_table_check_pte_clear_range(struct mm_struct *mm, if (WARN_ON(!ptep)) return; for (i = 0; i < PTRS_PER_PTE; i++) { - __page_table_check_pte_clear(mm, ptep_get(ptep)); + __page_table_check_pte_clear(mm, addr, ptep_get(ptep)); addr += PAGE_SIZE; ptep++; } -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:32 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:32 +1000 Subject: [PATCH v17 09/12] mm: Provide address parameter to p{te,md,ud}_user_accessible_page() In-Reply-To: 
<20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-10-ajd@linux.ibm.com> From: Rohan McLure On several powerpc platforms, a page table entry may not imply whether the relevant mapping is for userspace or kernelspace. Instead, such platforms infer this by the address which is being accessed. Add an additional address argument to each of these routines in order to provide support for page table check on powerpc. [ajd at linux.ibm.com: rebase on arm64 changes] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Acked-by: Ingo Molnar # x86 Acked-by: Alexandre Ghiti # riscv Signed-off-by: Andrew Donnellan --- v15: rebase --- arch/arm64/include/asm/pgtable.h | 6 +++--- arch/riscv/include/asm/pgtable.h | 6 +++--- arch/x86/include/asm/pgtable.h | 6 +++--- mm/page_table_check.c | 12 ++++++------ 4 files changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 2203ebac81d9..254265e9a423 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1290,17 +1290,17 @@ static inline int pmdp_set_access_flags(struct vm_area_struct *vma, #endif #ifdef CONFIG_PAGE_TABLE_CHECK -static inline bool pte_user_accessible_page(pte_t pte) +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) { return pte_valid(pte) && (pte_user(pte) || pte_user_exec(pte)); } -static inline bool pmd_user_accessible_page(pmd_t pmd) +static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr) { return pmd_valid(pmd) && !pmd_table(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd)); } -static inline bool pud_user_accessible_page(pud_t pud) +static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr) { return pud_valid(pud) && !pud_table(pud) && (pud_user(pud) || pud_user_exec(pud)); } diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 
65e8bc4ce45e..507afb8e8ce6 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -817,17 +817,17 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, } #ifdef CONFIG_PAGE_TABLE_CHECK -static inline bool pte_user_accessible_page(pte_t pte) +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) { return pte_present(pte) && pte_user(pte); } -static inline bool pmd_user_accessible_page(pmd_t pmd) +static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr) { return pmd_leaf(pmd) && pmd_user(pmd); } -static inline bool pud_user_accessible_page(pud_t pud) +static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr) { return pud_leaf(pud) && pud_user(pud); } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 63350b76c0c6..b977cebb5f44 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1679,17 +1679,17 @@ static inline bool arch_has_hw_nonleaf_pmd_young(void) #endif #ifdef CONFIG_PAGE_TABLE_CHECK -static inline bool pte_user_accessible_page(pte_t pte) +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) { return (pte_val(pte) & _PAGE_PRESENT) && (pte_val(pte) & _PAGE_USER); } -static inline bool pmd_user_accessible_page(pmd_t pmd) +static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr) { return pmd_leaf(pmd) && (pmd_val(pmd) & _PAGE_PRESENT) && (pmd_val(pmd) & _PAGE_USER); } -static inline bool pud_user_accessible_page(pud_t pud) +static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr) { return pud_leaf(pud) && (pud_val(pud) & _PAGE_PRESENT) && (pud_val(pud) & _PAGE_USER); } diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 1c33439b9c0b..abc2232ceb39 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -151,7 +151,7 @@ void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, if (&init_mm == mm) 
return; - if (pte_user_accessible_page(pte)) { + if (pte_user_accessible_page(pte, addr)) { page_table_check_clear(pte_pfn(pte), PAGE_SIZE >> PAGE_SHIFT); } } @@ -163,7 +163,7 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, if (&init_mm == mm) return; - if (pmd_user_accessible_page(pmd)) { + if (pmd_user_accessible_page(pmd, addr)) { page_table_check_clear(pmd_pfn(pmd), PMD_SIZE >> PAGE_SHIFT); } } @@ -175,7 +175,7 @@ void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, if (&init_mm == mm) return; - if (pud_user_accessible_page(pud)) { + if (pud_user_accessible_page(pud, addr)) { page_table_check_clear(pud_pfn(pud), PUD_SIZE >> PAGE_SHIFT); } } @@ -208,7 +208,7 @@ void __page_table_check_ptes_set(struct mm_struct *mm, unsigned long addr, for (i = 0; i < nr; i++) __page_table_check_pte_clear(mm, addr + PAGE_SIZE * i, ptep_get(ptep + i)); - if (pte_user_accessible_page(pte)) + if (pte_user_accessible_page(pte, addr)) page_table_check_set(pte_pfn(pte), nr, pte_write(pte)); } EXPORT_SYMBOL(__page_table_check_ptes_set); @@ -234,7 +234,7 @@ void __page_table_check_pmds_set(struct mm_struct *mm, unsigned long addr, for (i = 0; i < nr; i++) __page_table_check_pmd_clear(mm, addr + PMD_SIZE * i, *(pmdp + i)); - if (pmd_user_accessible_page(pmd)) + if (pmd_user_accessible_page(pmd, addr)) page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd)); } EXPORT_SYMBOL(__page_table_check_pmds_set); @@ -250,7 +250,7 @@ void __page_table_check_puds_set(struct mm_struct *mm, unsigned long addr, for (i = 0; i < nr; i++) __page_table_check_pud_clear(mm, addr + PUD_SIZE * i, *(pudp + i)); - if (pud_user_accessible_page(pud)) + if (pud_user_accessible_page(pud, addr)) page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud)); } EXPORT_SYMBOL(__page_table_check_puds_set); -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:33 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:33 +1000 Subject: 
[PATCH v17 10/12] powerpc: mm: Implement *_user_accessible_page() for ptes In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-11-ajd@linux.ibm.com> From: Rohan McLure Page table checking depends on architectures providing an implementation of p{te,md,ud}_user_accessible_page. With refactorisations made on powerpc/mm, the pte_access_permitted() and similar methods verify whether a userland page is accessible with the required permissions. Since page table checking is the only user of p{te,md,ud}_user_accessible_page(), implement these for all platforms, using some of the same preliminary checks taken by pte_access_permitted() on that platform. Since commit 8e9bd41e4ce1 ("powerpc/nohash: Replace pte_user() by pte_read()") pte_user() is no longer required to be present on all platforms as it may be equivalent to or implied by pte_read(). Hence implementations of pte_user_accessible_page() are specialised. 
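One way to read "implement these for all platforms" is the usual override-with-fallback header idiom: a platform header defines the helper and a same-named macro, and a common header supplies a constant-false fallback only when no such macro exists. The sketch below shows that idiom in isolation, assuming hypothetical sketch_* names and a made-up leaf bit; it is not the powerpc code itself.

```c
#include <stdbool.h>

/* Hypothetical values for illustration only. */
#define SKETCH_LEAF_BIT     0x1UL
#define SKETCH_KERNEL_START 0xc0000000UL

/* "Platform header": provides a real helper and announces it via a macro. */
#define sketch_pmd_accessible sketch_pmd_accessible
static inline bool sketch_pmd_accessible(unsigned long pmd, unsigned long addr)
{
	return (pmd & SKETCH_LEAF_BIT) && addr < SKETCH_KERNEL_START;
}

/* "Common header": fall back to false where the platform defined nothing. */
#ifndef sketch_pmd_accessible
#define sketch_pmd_accessible(pmd, addr) false
#endif

#ifndef sketch_pud_accessible
#define sketch_pud_accessible(pud, addr) false
#endif
```

Here the pmd helper resolves to the platform function, while the pud helper, which no platform header provided, collapses to false at compile time.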
[ajd at linux.ibm.com: rebase and fix commit message] Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Signed-off-by: Andrew Donnellan Acked-by: Madhavan Srinivasan --- arch/powerpc/include/asm/book3s/32/pgtable.h | 5 +++++ arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +++++++++++++++++ arch/powerpc/include/asm/nohash/pgtable.h | 5 +++++ arch/powerpc/include/asm/pgtable.h | 8 ++++++++ 4 files changed, 35 insertions(+) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 92d21c6faf1e..b225967f85ea 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -437,6 +437,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return true; } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && !is_kernel_addr(addr); +} + /* Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. * diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index c19800365315..48f3a41317dd 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -539,6 +539,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return arch_pte_access_permitted(pte_val(pte), write, 0); } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && pte_user(pte); +} + /* * Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. 
@@ -1381,5 +1386,17 @@ static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_va return false; } +#define pmd_user_accessible_page pmd_user_accessible_page +static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr) +{ + return pmd_leaf(pmd) && pte_user_accessible_page(pmd_pte(pmd), addr); +} + +#define pud_user_accessible_page pud_user_accessible_page +static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr) +{ + return pud_leaf(pud) && pte_user_accessible_page(pud_pte(pud), addr); +} + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */ diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 7d6b9e5b286e..a8bc4f24beb1 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -243,6 +243,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return true; } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && !is_kernel_addr(addr); +} + /* Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. 
* diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index d8f944a5a037..fa43f663e615 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -202,6 +202,14 @@ static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size) #endif /* CONFIG_PPC64 */ +#ifndef pmd_user_accessible_page +#define pmd_user_accessible_page(pmd, addr) false +#endif + +#ifndef pud_user_accessible_page +#define pud_user_accessible_page(pud, addr) false +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ -- 2.51.0 From ajd at linux.ibm.com Tue Sep 9 02:13:34 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:34 +1000 Subject: [PATCH v17 11/12] powerpc: mm: Use set_pte_at_unchecked() for internal usages In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-12-ajd@linux.ibm.com> From: Rohan McLure In the new set_ptes() API, set_pte_at() (a special case of set_ptes()) is intended to be instrumented by the page table check facility. There are however several other routines that constitute the API for setting page table entries, including set_pmd_at() among others. Such routines are themselves implemented in terms of set_pte_at(). A future patch providing support for page table checking on powerpc must take care to avoid duplicate calls to page_table_check_p{te,md,ud}_set(). Allow for assignment of pte entries without instrumentation through the set_pte_at_unchecked() routine introduced in this patch. Cause API-facing routines that call set_pte_at() to instead call set_pte_at_unchecked(), which will remain uninstrumented by page table check. set_ptes() is itself implemented by calls to __set_pte_at(), so this eliminates redundant code. 
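The double-instrumentation concern can be sketched outside the kernel as follows. This is only a model under stated assumptions: the sketch_* names, the counter standing in for the page table check hook, and the idea that the higher-level setter performs its own check are all hypothetical, the last one reflecting what the message says a future patch would do.

```c
#include <stdint.h>

/* Hypothetical model; the counter stands in for a page table check hook. */
static unsigned int sketch_check_calls;

static void sketch_check_hook(void)
{
	sketch_check_calls++;
}

/* Lowest level: just store the entry, no instrumentation. */
static void sketch_raw_set(uint64_t *slot, uint64_t val)
{
	*slot = val;
}

/* Public, instrumented setter: check once, then store. */
static void sketch_set_entry(uint64_t *slot, uint64_t val)
{
	sketch_check_hook();
	sketch_raw_set(slot, val);
}

/* Uninstrumented variant for callers that do their own check. */
static void sketch_set_entry_unchecked(uint64_t *slot, uint64_t val)
{
	sketch_raw_set(slot, val);
}

/*
 * Higher-level setter (assumed to be instrumented itself): calling
 * sketch_set_entry() here would run the hook twice for one update,
 * so it uses the unchecked variant instead.
 */
static void sketch_set_huge_entry(uint64_t *slot, uint64_t val)
{
	sketch_check_hook();
	sketch_set_entry_unchecked(slot, val);
}
```

In this model each update, whether through the plain or the huge setter, runs the hook exactly once.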
[ajd at linux.ibm.com: don't change to unchecked for early boot/kernel mappings] Signed-off-by: Rohan McLure Signed-off-by: Andrew Donnellan Acked-by: Madhavan Srinivasan --- v13: don't use the unchecked version for early-boot kernel mappings (Pasha) --- arch/powerpc/include/asm/pgtable.h | 2 ++ arch/powerpc/mm/book3s64/pgtable.c | 6 +++--- arch/powerpc/mm/book3s64/radix_pgtable.c | 6 +++--- arch/powerpc/mm/pgtable.c | 8 ++++++++ 4 files changed, 16 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index fa43f663e615..3983efb365cc 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -34,6 +34,8 @@ struct mm_struct; void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr); #define set_ptes set_ptes +void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); #define update_mmu_cache(vma, addr, ptep) \ update_mmu_cache_range(NULL, vma, addr, ptep, 1) diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index c9431ae7f78a..ff0c5a1988f8 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -127,7 +127,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pmd_leaf(pmd))); #endif trace_hugepage_set_pmd(addr, pmd_val(pmd)); - return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd)); + return set_pte_at_unchecked(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd)); } void set_pud_at(struct mm_struct *mm, unsigned long addr, @@ -144,7 +144,7 @@ void set_pud_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pud_leaf(pud))); #endif trace_hugepage_set_pud(addr, pud_val(pud)); - return set_pte_at(mm, addr, pudp_ptep(pudp), pud_pte(pud)); + return set_pte_at_unchecked(mm, addr, pudp_ptep(pudp), pud_pte(pud)); } static void do_serialize(void *arg) @@ -549,7 +549,7 @@ void ptep_modify_prot_commit(struct vm_area_struct 
*vma, unsigned long addr, if (radix_enabled()) return radix__ptep_modify_prot_commit(vma, addr, ptep, old_pte, pte); - set_pte_at(vma->vm_mm, addr, ptep, pte); + set_pte_at_unchecked(vma->vm_mm, addr, ptep, pte); } #ifdef CONFIG_TRANSPARENT_HUGEPAGE diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index 73977dbabcf2..b2541bf33d01 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1606,7 +1606,7 @@ void radix__ptep_modify_prot_commit(struct vm_area_struct *vma, (atomic_read(&mm->context.copros) > 0)) radix__flush_tlb_page(vma, addr); - set_pte_at(mm, addr, ptep, pte); + set_pte_at_unchecked(mm, addr, ptep, pte); } int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) @@ -1617,7 +1617,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) if (!radix_enabled()) return 0; - set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pud); + set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pud); return 1; } @@ -1664,7 +1664,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) if (!radix_enabled()) return 0; - set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pmd); + set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pmd); return 1; } diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 56d7e8960e77..7b69cd16e011 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -224,6 +224,14 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, } } +void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep)); + pte = set_pte_filter(pte, addr); + __set_pte_at(mm, addr, ptep, pte, 0); +} + void unmap_kernel_page(unsigned long va) { pmd_t *pmdp = pmd_off_k(va); -- 2.51.0 From brgl at bgdev.pl Tue Sep 9 02:15:27 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 
2025 11:15:27 +0200 Subject: [PATCH 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4 Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Here's the final part of the generic GPIO chip conversions. Once all the existing users are switched to the new API, the final patch in the series removes bgpio_init(), moves the gpio-mmio fields out of struct gpio_chip and into struct gpio_generic_chip and adjusts gpio-mmio.c to the new situation. Down the line we could probably improve gpio-mmio.c by using lock guards and replacing the - now obsolete - "bgpio" prefix with "gpio_generic" or something similar but this series is already big as is so I'm leaving that for the future. Signed-off-by: Bartosz Golaszewski --- Bartosz Golaszewski (15): gpio: loongson1: allow building the module with COMPILE_TEST enabled gpio: loongson1: use new generic GPIO chip API gpio: hlwd: use new generic GPIO chip API gpio: ath79: use new generic GPIO chip API gpio: ath79: use the generic GPIO chip lock for IRQ handling gpio: xgene-sb: use generic GPIO chip register read and write APIs gpio: brcmstb: use new generic GPIO chip API gpio: mt7621: use new generic GPIO chip API gpio: mt7621: use the generic GPIO chip lock for IRQ handling gpio: menz127: use new generic GPIO chip API gpio: sifive: use new generic GPIO chip API gpio: spacemit-k1: use new generic GPIO chip API gpio: sodaville: use new generic GPIO chip API gpio: mmio: use new generic GPIO chip API gpio: move gpio-mmio-specific fields out of struct gpio_chip drivers/gpio/Kconfig | 2 +- drivers/gpio/TODO | 5 - drivers/gpio/gpio-ath79.c | 88 +++++----- drivers/gpio/gpio-brcmstb.c | 112 +++++++------ drivers/gpio/gpio-hlwd.c | 105 ++++++------ drivers/gpio/gpio-loongson1.c | 40 +++-- drivers/gpio/gpio-menz127.c | 31 ++-- drivers/gpio/gpio-mlxbf2.c | 2 +- drivers/gpio/gpio-mmio.c | 350 +++++++++++++++++++++------------------- drivers/gpio/gpio-mpc8xxx.c | 5 +- drivers/gpio/gpio-mt7621.c 
 | 80 ++++-----
 drivers/gpio/gpio-sifive.c      | 73 +++++----
 drivers/gpio/gpio-sodaville.c   | 20 ++-
 drivers/gpio/gpio-spacemit-k1.c | 28 +++-
 drivers/gpio/gpio-xgene-sb.c    |  5 +-
 include/linux/gpio/driver.h     | 44 -----
 include/linux/gpio/generic.h    | 67 +++++---
 17 files changed, 548 insertions(+), 509 deletions(-)
---
base-commit: 65dd046ef55861190ecde44c6d9fcde54b9fb77d
change-id: 20250904-gpio-mmio-gpio-conv-part4-5e1f772ba724

Best regards,
--
Bartosz Golaszewski

From brgl at bgdev.pl Tue Sep 9 02:15:28 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Tue, 09 Sep 2025 11:15:28 +0200
Subject: [PATCH 01/15] gpio: loongson1: allow building the module with COMPILE_TEST enabled
In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org>
References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org>
Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-1-9f723dc3524a@linaro.org>

From: Bartosz Golaszewski

Increase build coverage by allowing the module to be built with
COMPILE_TEST=y.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 31f8bab4b09df1640c892f4d839860edaa2ad6a3..09cb144f076661e0a2069016175d0692257fb156 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -885,7 +885,7 @@ config GPIO_ZYNQMP_MODEPIN
 
 config GPIO_LOONGSON1
 	tristate "Loongson1 GPIO support"
-	depends on MACH_LOONGSON32
+	depends on MACH_LOONGSON32 || COMPILE_TEST
 	select GPIO_GENERIC
 	help
 	  Say Y or M here to support GPIO on Loongson1 SoCs.
-- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:29 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:29 +0200 Subject: [PATCH 02/15] gpio: loongson1: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-2-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-loongson1.c | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git a/drivers/gpio/gpio-loongson1.c b/drivers/gpio/gpio-loongson1.c index 6ca3b969db4df231517d021a7b4b5e3ddcf626f7..bb0e101e920889522aa4bbc69e5d6d2c49586cee 100644 --- a/drivers/gpio/gpio-loongson1.c +++ b/drivers/gpio/gpio-loongson1.c @@ -5,10 +5,11 @@ * Copyright (C) 2015-2023 Keguang Zhang */ +#include #include #include +#include #include -#include /* Loongson 1 GPIO Register Definitions */ #define GPIO_CFG 0x0 @@ -17,19 +18,18 @@ #define GPIO_OUTPUT 0x30 struct ls1x_gpio_chip { - struct gpio_chip gc; + struct gpio_generic_chip chip; void __iomem *reg_base; }; static int ls1x_gpio_request(struct gpio_chip *gc, unsigned int offset) { struct ls1x_gpio_chip *ls1x_gc = gpiochip_get_data(gc); - unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&ls1x_gc->chip); + __raw_writel(__raw_readl(ls1x_gc->reg_base + GPIO_CFG) | BIT(offset), ls1x_gc->reg_base + GPIO_CFG); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); return 0; } @@ -37,16 +37,16 @@ static int ls1x_gpio_request(struct gpio_chip *gc, unsigned int offset) static void ls1x_gpio_free(struct gpio_chip *gc, unsigned int offset) { struct ls1x_gpio_chip *ls1x_gc = gpiochip_get_data(gc); - unsigned long flags; - 
raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&ls1x_gc->chip); + __raw_writel(__raw_readl(ls1x_gc->reg_base + GPIO_CFG) & ~BIT(offset), ls1x_gc->reg_base + GPIO_CFG); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); } static int ls1x_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct device *dev = &pdev->dev; struct ls1x_gpio_chip *ls1x_gc; int ret; @@ -59,29 +59,35 @@ static int ls1x_gpio_probe(struct platform_device *pdev) if (IS_ERR(ls1x_gc->reg_base)) return PTR_ERR(ls1x_gc->reg_base); - ret = bgpio_init(&ls1x_gc->gc, dev, 4, ls1x_gc->reg_base + GPIO_DATA, - ls1x_gc->reg_base + GPIO_OUTPUT, NULL, - NULL, ls1x_gc->reg_base + GPIO_DIR, 0); + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = ls1x_gc->reg_base + GPIO_DATA, + .set = ls1x_gc->reg_base + GPIO_OUTPUT, + .dirin = ls1x_gc->reg_base + GPIO_DIR, + }; + + ret = gpio_generic_chip_init(&ls1x_gc->chip, &config); if (ret) goto err; - ls1x_gc->gc.owner = THIS_MODULE; - ls1x_gc->gc.request = ls1x_gpio_request; - ls1x_gc->gc.free = ls1x_gpio_free; + ls1x_gc->chip.gc.owner = THIS_MODULE; + ls1x_gc->chip.gc.request = ls1x_gpio_request; + ls1x_gc->chip.gc.free = ls1x_gpio_free; /* * Clear ngpio to let gpiolib get the correct number * by reading ngpios property */ - ls1x_gc->gc.ngpio = 0; + ls1x_gc->chip.gc.ngpio = 0; - ret = devm_gpiochip_add_data(dev, &ls1x_gc->gc, ls1x_gc); + ret = devm_gpiochip_add_data(dev, &ls1x_gc->chip.gc, ls1x_gc); if (ret) goto err; platform_set_drvdata(pdev, ls1x_gc); dev_info(dev, "GPIO controller registered with %d pins\n", - ls1x_gc->gc.ngpio); + ls1x_gc->chip.gc.ngpio); return 0; err: -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:30 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:30 +0200 Subject: [PATCH 03/15] gpio: hlwd: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: 
<20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-3-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-hlwd.c | 105 ++++++++++++++++++++++++----------------------- 1 file changed, 54 insertions(+), 51 deletions(-) diff --git a/drivers/gpio/gpio-hlwd.c b/drivers/gpio/gpio-hlwd.c index 0580f6712bea9a4d510bd332645982adbc5c6a32..137f17c9ff221d524a4281fdbf91d8f27ee24182 100644 --- a/drivers/gpio/gpio-hlwd.c +++ b/drivers/gpio/gpio-hlwd.c @@ -6,6 +6,7 @@ // Nintendo Wii (Hollywood) GPIO driver #include +#include #include #include #include @@ -48,7 +49,7 @@ #define HW_GPIO_OWNER 0x3c struct hlwd_gpio { - struct gpio_chip gpioc; + struct gpio_generic_chip gpioc; struct device *dev; void __iomem *regs; int irq; @@ -61,45 +62,44 @@ static void hlwd_gpio_irqhandler(struct irq_desc *desc) struct hlwd_gpio *hlwd = gpiochip_get_data(irq_desc_get_handler_data(desc)); struct irq_chip *chip = irq_desc_get_chip(desc); - unsigned long flags; unsigned long pending; int hwirq; u32 emulated_pending; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); - pending = ioread32be(hlwd->regs + HW_GPIOB_INTFLAG); - pending &= ioread32be(hlwd->regs + HW_GPIOB_INTMASK); + scoped_guard(gpio_generic_lock_irqsave, &hlwd->gpioc) { + pending = ioread32be(hlwd->regs + HW_GPIOB_INTFLAG); + pending &= ioread32be(hlwd->regs + HW_GPIOB_INTMASK); - /* Treat interrupts due to edge trigger emulation separately */ - emulated_pending = hlwd->edge_emulation & pending; - pending &= ~emulated_pending; - if (emulated_pending) { - u32 level, rising, falling; + /* Treat interrupts due to edge trigger emulation separately */ + emulated_pending = hlwd->edge_emulation & pending; + pending &= ~emulated_pending; + if (emulated_pending) { + u32 level, rising, falling; - level = ioread32be(hlwd->regs + 
HW_GPIOB_INTLVL); - rising = level & emulated_pending; - falling = ~level & emulated_pending; + level = ioread32be(hlwd->regs + HW_GPIOB_INTLVL); + rising = level & emulated_pending; + falling = ~level & emulated_pending; - /* Invert the levels */ - iowrite32be(level ^ emulated_pending, - hlwd->regs + HW_GPIOB_INTLVL); + /* Invert the levels */ + iowrite32be(level ^ emulated_pending, + hlwd->regs + HW_GPIOB_INTLVL); - /* Ack all emulated-edge interrupts */ - iowrite32be(emulated_pending, hlwd->regs + HW_GPIOB_INTFLAG); + /* Ack all emulated-edge interrupts */ + iowrite32be(emulated_pending, hlwd->regs + HW_GPIOB_INTFLAG); - /* Signal interrupts only on the correct edge */ - rising &= hlwd->rising_edge; - falling &= hlwd->falling_edge; + /* Signal interrupts only on the correct edge */ + rising &= hlwd->rising_edge; + falling &= hlwd->falling_edge; - /* Mark emulated interrupts as pending */ - pending |= rising | falling; + /* Mark emulated interrupts as pending */ + pending |= rising | falling; + } } - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); chained_irq_enter(chip, desc); for_each_set_bit(hwirq, &pending, 32) - generic_handle_domain_irq(hlwd->gpioc.irq.domain, hwirq); + generic_handle_domain_irq(hlwd->gpioc.gc.irq.domain, hwirq); chained_irq_exit(chip, desc); } @@ -116,30 +116,29 @@ static void hlwd_gpio_irq_mask(struct irq_data *data) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 mask; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); - mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); - mask &= ~BIT(data->hwirq); - iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); - gpiochip_disable_irq(&hlwd->gpioc, irqd_to_hwirq(data)); + scoped_guard(gpio_generic_lock_irqsave, &hlwd->gpioc) { + mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); + mask &= ~BIT(data->hwirq); + iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); + } + 
gpiochip_disable_irq(&hlwd->gpioc.gc, irqd_to_hwirq(data)); } static void hlwd_gpio_irq_unmask(struct irq_data *data) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 mask; - gpiochip_enable_irq(&hlwd->gpioc, irqd_to_hwirq(data)); - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); + gpiochip_enable_irq(&hlwd->gpioc.gc, irqd_to_hwirq(data)); + + guard(gpio_generic_lock_irqsave)(&hlwd->gpioc); + mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); mask |= BIT(data->hwirq); iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); } static void hlwd_gpio_irq_enable(struct irq_data *data) @@ -173,10 +172,9 @@ static int hlwd_gpio_irq_set_type(struct irq_data *data, unsigned int flow_type) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 level; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&hlwd->gpioc); hlwd->edge_emulation &= ~BIT(data->hwirq); @@ -197,11 +195,9 @@ static int hlwd_gpio_irq_set_type(struct irq_data *data, unsigned int flow_type) hlwd_gpio_irq_setup_emulation(hlwd, data->hwirq, flow_type); break; default: - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); return -EINVAL; } - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); return 0; } @@ -225,6 +221,7 @@ static const struct irq_chip hlwd_gpio_irq_chip = { static int hlwd_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct hlwd_gpio *hlwd; u32 ngpios; int res; @@ -244,25 +241,31 @@ static int hlwd_gpio_probe(struct platform_device *pdev) * systems where the AHBPROT memory firewall hasn't been configured to * permit PPC access to HW_GPIO_*. * - * Note that this has to happen before bgpio_init reads the - * HW_GPIOB_OUT and HW_GPIOB_DIR, because otherwise it reads the wrong - * values. 
+ * Note that this has to happen before gpio_generic_chip_init() reads + * the HW_GPIOB_OUT and HW_GPIOB_DIR, because otherwise it reads the + * wrong values. */ iowrite32be(0xffffffff, hlwd->regs + HW_GPIO_OWNER); - res = bgpio_init(&hlwd->gpioc, &pdev->dev, 4, - hlwd->regs + HW_GPIOB_IN, hlwd->regs + HW_GPIOB_OUT, - NULL, hlwd->regs + HW_GPIOB_DIR, NULL, - BGPIOF_BIG_ENDIAN_BYTE_ORDER); + config = (typeof(config)){ + .dev = &pdev->dev, + .sz = 4, + .dat = hlwd->regs + HW_GPIOB_IN, + .set = hlwd->regs + HW_GPIOB_OUT, + .dirout = hlwd->regs + HW_GPIOB_DIR, + .flags = BGPIOF_BIG_ENDIAN_BYTE_ORDER, + }; + + res = gpio_generic_chip_init(&hlwd->gpioc, &config); if (res < 0) { - dev_warn(&pdev->dev, "bgpio_init failed: %d\n", res); + dev_warn(&pdev->dev, "failed to initialize generic GPIO chip: %d\n", res); return res; } res = of_property_read_u32(pdev->dev.of_node, "ngpios", &ngpios); if (res) ngpios = 32; - hlwd->gpioc.ngpio = ngpios; + hlwd->gpioc.gc.ngpio = ngpios; /* Mask and ack all interrupts */ iowrite32be(0, hlwd->regs + HW_GPIOB_INTMASK); @@ -282,7 +285,7 @@ static int hlwd_gpio_probe(struct platform_device *pdev) return hlwd->irq; } - girq = &hlwd->gpioc.irq; + girq = &hlwd->gpioc.gc.irq; gpio_irq_chip_set_chip(girq, &hlwd_gpio_irq_chip); girq->parent_handler = hlwd_gpio_irqhandler; girq->num_parents = 1; @@ -296,7 +299,7 @@ static int hlwd_gpio_probe(struct platform_device *pdev) girq->handler = handle_level_irq; } - return devm_gpiochip_add_data(&pdev->dev, &hlwd->gpioc, hlwd); + return devm_gpiochip_add_data(&pdev->dev, &hlwd->gpioc.gc, hlwd); } static const struct of_device_id hlwd_gpio_match[] = { -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:31 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:31 +0200 Subject: [PATCH 04/15] gpio: ath79: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> 
Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-4-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-ath79.c | 39 ++++++++++++++++++++++++--------------- 1 file changed, 24 insertions(+), 15 deletions(-) diff --git a/drivers/gpio/gpio-ath79.c b/drivers/gpio/gpio-ath79.c index de4cc12e5e0399abcef61a89c8c91a1b203d20fb..1b2a59ddbec4088c95fb766277bb94ffff8692b2 100644 --- a/drivers/gpio/gpio-ath79.c +++ b/drivers/gpio/gpio-ath79.c @@ -10,6 +10,7 @@ #include #include +#include #include #include #include @@ -28,7 +29,7 @@ #define AR71XX_GPIO_REG_INT_MASK 0x24 struct ath79_gpio_ctrl { - struct gpio_chip gc; + struct gpio_generic_chip chip; void __iomem *base; raw_spinlock_t lock; unsigned long both_edges; @@ -37,8 +38,9 @@ struct ath79_gpio_ctrl { static struct ath79_gpio_ctrl *irq_data_to_ath79_gpio(struct irq_data *data) { struct gpio_chip *gc = irq_data_get_irq_chip_data(data); + struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc); - return container_of(gc, struct ath79_gpio_ctrl, gc); + return container_of(gen_gc, struct ath79_gpio_ctrl, chip); } static u32 ath79_gpio_read(struct ath79_gpio_ctrl *ctrl, unsigned reg) @@ -72,7 +74,7 @@ static void ath79_gpio_irq_unmask(struct irq_data *data) u32 mask = BIT(irqd_to_hwirq(data)); unsigned long flags; - gpiochip_enable_irq(&ctrl->gc, irqd_to_hwirq(data)); + gpiochip_enable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); raw_spin_lock_irqsave(&ctrl->lock, flags); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask); raw_spin_unlock_irqrestore(&ctrl->lock, flags); @@ -87,7 +89,7 @@ static void ath79_gpio_irq_mask(struct irq_data *data) raw_spin_lock_irqsave(&ctrl->lock, flags); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); raw_spin_unlock_irqrestore(&ctrl->lock, flags); - gpiochip_disable_irq(&ctrl->gc, irqd_to_hwirq(data)); + 
gpiochip_disable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); } static void ath79_gpio_irq_enable(struct irq_data *data) @@ -187,8 +189,9 @@ static void ath79_gpio_irq_handler(struct irq_desc *desc) { struct gpio_chip *gc = irq_desc_get_handler_data(desc); struct irq_chip *irqchip = irq_desc_get_chip(desc); + struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc); struct ath79_gpio_ctrl *ctrl = - container_of(gc, struct ath79_gpio_ctrl, gc); + container_of(gen_gc, struct ath79_gpio_ctrl, chip); unsigned long flags, pending; u32 both_edges, state; int irq; @@ -224,6 +227,7 @@ MODULE_DEVICE_TABLE(of, ath79_gpio_of_match); static int ath79_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct device *dev = &pdev->dev; struct ath79_gpio_ctrl *ctrl; struct gpio_irq_chip *girq; @@ -253,21 +257,26 @@ static int ath79_gpio_probe(struct platform_device *pdev) return PTR_ERR(ctrl->base); raw_spin_lock_init(&ctrl->lock); - err = bgpio_init(&ctrl->gc, dev, 4, - ctrl->base + AR71XX_GPIO_REG_IN, - ctrl->base + AR71XX_GPIO_REG_SET, - ctrl->base + AR71XX_GPIO_REG_CLEAR, - oe_inverted ? NULL : ctrl->base + AR71XX_GPIO_REG_OE, - oe_inverted ? ctrl->base + AR71XX_GPIO_REG_OE : NULL, - 0); + + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = ctrl->base + AR71XX_GPIO_REG_IN, + .set = ctrl->base + AR71XX_GPIO_REG_SET, + .clr = ctrl->base + AR71XX_GPIO_REG_CLEAR, + .dirout = oe_inverted ? NULL : ctrl->base + AR71XX_GPIO_REG_OE, + .dirin = oe_inverted ? 
ctrl->base + AR71XX_GPIO_REG_OE : NULL, + }; + + err = gpio_generic_chip_init(&ctrl->chip, &config); if (err) { - dev_err(dev, "bgpio_init failed\n"); + dev_err(dev, "failed to initialize generic GPIO chip\n"); return err; } /* Optional interrupt setup */ if (device_property_read_bool(dev, "interrupt-controller")) { - girq = &ctrl->gc.irq; + girq = &ctrl->chip.gc.irq; gpio_irq_chip_set_chip(girq, &ath79_gpio_irqchip); girq->parent_handler = ath79_gpio_irq_handler; girq->num_parents = 1; @@ -280,7 +289,7 @@ static int ath79_gpio_probe(struct platform_device *pdev) girq->handler = handle_simple_irq; } - return devm_gpiochip_add_data(dev, &ctrl->gc, ctrl); + return devm_gpiochip_add_data(dev, &ctrl->chip.gc, ctrl); } static struct platform_driver ath79_gpio_driver = { -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:32 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:32 +0200 Subject: [PATCH 05/15] gpio: ath79: use the generic GPIO chip lock for IRQ handling In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-5-9f723dc3524a@linaro.org> From: Bartosz Golaszewski This driver uses its own raw spinlock in interrupt routines while the generic GPIO chip callbacks use a separate one. This is, of course, racy so use the fact that the lock in generic GPIO chip is also a raw spinlock and convert the interrupt handling functions in this module to using the provided generic GPIO chip locking API. 
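
[Editorial sketch, not part of the patch.] The locking pattern this commit message describes can be illustrated in user space. The example below is a minimal model under stated assumptions: `struct fake_chip`, `FAKE_CHIP_GUARD` and the other `fake_*` names are hypothetical stand-ins, a C11 `atomic_flag` replaces the kernel raw spinlock, and interrupt disabling is not modelled at all. It only shows that every path touching the register takes the same lock, and that a scope-based guard releases it on every return path, including an early error return.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the generic chip: ONE lock guards the register. */
struct fake_chip {
	atomic_flag lock;
	uint32_t int_mask;	/* models a memory-mapped interrupt mask register */
};

static void fake_chip_lock(struct fake_chip *chip)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&chip->lock, memory_order_acquire))
		;
}

static void fake_chip_unlock(struct fake_chip *chip)
{
	atomic_flag_clear_explicit(&chip->lock, memory_order_release);
}

/* Cleanup handler: the compiler passes a pointer to the guard variable. */
static void fake_chip_guard_release(struct fake_chip **guard)
{
	fake_chip_unlock(*guard);
}

/*
 * Take the lock now and drop it automatically when the enclosing scope ends.
 * Relies on the GCC/Clang cleanup attribute; use at most once per scope.
 */
#define FAKE_CHIP_GUARD(chip)						\
	struct fake_chip *fake_guard					\
	__attribute__((cleanup(fake_chip_guard_release), unused)) =	\
		(fake_chip_lock(chip), (chip))

/* Read-modify-write under the shared lock. */
static void fake_update_bits(struct fake_chip *chip, uint32_t mask, uint32_t val)
{
	FAKE_CHIP_GUARD(chip);

	chip->int_mask = (chip->int_mask & ~mask) | (val & mask);
}

/* Early return: no explicit unlock is needed on the error path. */
static int fake_set_type(struct fake_chip *chip, uint32_t mask, int valid)
{
	FAKE_CHIP_GUARD(chip);

	if (!valid)
		return -1;

	chip->int_mask |= mask;
	return 0;
}
```

If the IRQ paths used a second, private lock instead, two read-modify-write sequences on the same register could interleave, which is the race the message refers to.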
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-ath79.c | 51 ++++++++++++++++++----------------------------- 1 file changed, 19 insertions(+), 32 deletions(-) diff --git a/drivers/gpio/gpio-ath79.c b/drivers/gpio/gpio-ath79.c index 1b2a59ddbec4088c95fb766277bb94ffff8692b2..75c9e3bf7db1b5fbfede960dd1c0b3a76d2ecb8f 100644 --- a/drivers/gpio/gpio-ath79.c +++ b/drivers/gpio/gpio-ath79.c @@ -31,7 +31,6 @@ struct ath79_gpio_ctrl { struct gpio_generic_chip chip; void __iomem *base; - raw_spinlock_t lock; unsigned long both_edges; }; @@ -72,23 +71,22 @@ static void ath79_gpio_irq_unmask(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; gpiochip_enable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); - raw_spin_lock_irqsave(&ctrl->lock, flags); + + guard(gpio_generic_lock_irqsave)(&ctrl->chip); + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static void ath79_gpio_irq_mask(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); - ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &ctrl->chip) + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); + gpiochip_disable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); } @@ -96,24 +94,20 @@ static void ath79_gpio_irq_enable(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, mask); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask); - 
raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static void ath79_gpio_irq_disable(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, 0); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static int ath79_gpio_irq_set_type(struct irq_data *data, @@ -122,7 +116,6 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); u32 type = 0, polarity = 0; - unsigned long flags; bool disabled; switch (flow_type) { @@ -144,7 +137,7 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, return -EINVAL; } - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); if (flow_type == IRQ_TYPE_EDGE_BOTH) { ctrl->both_edges |= mask; @@ -169,8 +162,6 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, ath79_gpio_update_bits( ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, mask); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); - return 0; } @@ -192,26 +183,24 @@ static void ath79_gpio_irq_handler(struct irq_desc *desc) struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc); struct ath79_gpio_ctrl *ctrl = container_of(gen_gc, struct ath79_gpio_ctrl, chip); - unsigned long flags, pending; + unsigned long pending; u32 both_edges, state; int irq; chained_irq_enter(irqchip, desc); - raw_spin_lock_irqsave(&ctrl->lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &ctrl->chip) { + pending = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_INT_PENDING); - pending = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_INT_PENDING); - - /* Update the polarity of the both edges irqs */ - both_edges = ctrl->both_edges & pending; - if (both_edges) { - state 
= ath79_gpio_read(ctrl, AR71XX_GPIO_REG_IN); - ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_POLARITY, - both_edges, ~state); + /* Update the polarity of the both edges irqs */ + both_edges = ctrl->both_edges & pending; + if (both_edges) { + state = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_IN); + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_POLARITY, + both_edges, ~state); + } } - raw_spin_unlock_irqrestore(&ctrl->lock, flags); - for_each_set_bit(irq, &pending, gc->ngpio) generic_handle_domain_irq(gc->irq.domain, irq); @@ -256,8 +245,6 @@ static int ath79_gpio_probe(struct platform_device *pdev) if (IS_ERR(ctrl->base)) return PTR_ERR(ctrl->base); - raw_spin_lock_init(&ctrl->lock); - config = (typeof(config)){ .dev = dev, .sz = 4, -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:33 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:33 +0200 Subject: [PATCH 06/15] gpio: xgene-sb: use generic GPIO chip register read and write APIs In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-6-9f723dc3524a@linaro.org> From: Bartosz Golaszewski The conversion to using the modernized generic GPIO chip API was incomplete without also converting the direct calls to write/read_reg() callbacks. Use the provided wrappers from linux/gpio/generic.h. 
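
[Editorial sketch, not part of the patch.] The point of the wrappers can be shown with a small user-space model. Everything named `fake_*` below is hypothetical, the accessor callbacks are simplified to plain 32-bit loads and stores, and a single-bit shift stands in for the driver's `GPIO_MASK()` macro, whose definition is not shown in this message. The caller never dereferences the callbacks directly, so they can later move into a different structure without touching the caller.

```c
#include <stdint.h>

/* Hypothetical model: the accessors live in the generic chip, not in the caller. */
struct fake_generic_chip {
	uint32_t (*read_reg)(volatile void *reg);
	void (*write_reg)(volatile void *reg, uint32_t val);
};

/* Simplified accessors: plain 32-bit load and store on ordinary memory. */
static uint32_t fake_read32(volatile void *reg)
{
	return *(volatile uint32_t *)reg;
}

static void fake_write32(volatile void *reg, uint32_t val)
{
	*(volatile uint32_t *)reg = val;
}

/* Wrappers: the only place that knows where the callbacks are stored. */
static inline uint32_t fake_generic_read_reg(struct fake_generic_chip *chip,
					     volatile void *reg)
{
	return chip->read_reg(reg);
}

static inline void fake_generic_write_reg(struct fake_generic_chip *chip,
					  volatile void *reg, uint32_t val)
{
	chip->write_reg(reg, val);
}

/* Read-modify-write of one bit, going through the wrappers only. */
static void fake_set_bit(struct fake_generic_chip *chip, volatile void *reg,
			 unsigned int gpio, int val)
{
	uint32_t data = fake_generic_read_reg(chip, reg);

	if (val)
		data |= UINT32_C(1) << gpio;
	else
		data &= ~(UINT32_C(1) << gpio);

	fake_generic_write_reg(chip, reg, data);
}
```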
Fixes: 38d98a822c14 ("gpio: xgene-sb: use new generic GPIO chip API") Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-xgene-sb.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-xgene-sb.c b/drivers/gpio/gpio-xgene-sb.c index c559a89aadf7a77bd9cce7e5a7d4a2b241307812..62545e358b6c4b1cab25e1135cb24ccc3e955078 100644 --- a/drivers/gpio/gpio-xgene-sb.c +++ b/drivers/gpio/gpio-xgene-sb.c @@ -63,14 +63,15 @@ struct xgene_gpio_sb { static void xgene_gpio_set_bit(struct gpio_chip *gc, void __iomem *reg, u32 gpio, int val) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); u32 data; - data = gc->read_reg(reg); + data = gpio_generic_read_reg(chip, reg); if (val) data |= GPIO_MASK(gpio); else data &= ~GPIO_MASK(gpio); - gc->write_reg(reg, data); + gpio_generic_write_reg(chip, reg, data); } static int xgene_gpio_sb_irq_set_type(struct irq_data *d, unsigned int type) -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:34 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:34 +0200 Subject: [PATCH 07/15] gpio: brcmstb: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-7-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
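
[Editorial sketch, not part of the patch.] The conversions in this series replace a long positional argument list, with `NULL` passed for each unused register, by a config structure filled through a compound literal. The user-space model below uses a hypothetical `struct fake_chip_config` whose field names follow the patches but whose types are simplified. It shows the property the pattern relies on: members not named in the compound literal are zero-initialized, so unused registers end up as `NULL` and `flags` as 0 without being spelled out. `__typeof__` is the GCC/Clang spelling of the `typeof` used in the kernel sources.

```c
#include <stddef.h>

/* Hypothetical config struct; field names follow the patch, types are simplified. */
struct fake_chip_config {
	unsigned int sz;
	unsigned char *dat;
	unsigned char *set;
	unsigned char *clr;
	unsigned char *dirout;
	unsigned char *dirin;
	unsigned long flags;
};

/* Describe a chip with only a data and a direction-in register. */
static struct fake_chip_config fake_make_config(unsigned char *base)
{
	struct fake_chip_config config;

	/* Members not listed here are implicitly zeroed. */
	config = (__typeof__(config)){
		.sz = 4,
		.dat = base,
		.dirin = base + 4,
	};

	return config;
}

/* Count how many register pointers were actually provided. */
static int fake_count_regs(const struct fake_chip_config *cfg)
{
	return (cfg->dat != NULL) + (cfg->set != NULL) + (cfg->clr != NULL) +
	       (cfg->dirout != NULL) + (cfg->dirin != NULL);
}
```

Compared with a positional call, a named field cannot silently land in the wrong slot, for example direction-out swapped with direction-in.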
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-brcmstb.c | 112 ++++++++++++++++++++++++-------------------- 1 file changed, 60 insertions(+), 52 deletions(-) diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c index e29a9589b3ccbd17d10f6671088dca3e76537927..d03ff4ed9ef4c9d75f3e8c9c6fcb39bc577bcb79 100644 --- a/drivers/gpio/gpio-brcmstb.c +++ b/drivers/gpio/gpio-brcmstb.c @@ -3,6 +3,7 @@ #include #include +#include #include #include #include @@ -37,7 +38,7 @@ enum gio_reg_index { struct brcmstb_gpio_bank { struct list_head node; int id; - struct gpio_chip gc; + struct gpio_generic_chip chip; struct brcmstb_gpio_priv *parent_priv; u32 width; u32 wake_active; @@ -72,19 +73,18 @@ __brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) { void __iomem *reg_base = bank->parent_priv->reg_base; - return bank->gc.read_reg(reg_base + GIO_STAT(bank->id)) & - bank->gc.read_reg(reg_base + GIO_MASK(bank->id)); + return gpio_generic_read_reg(&bank->chip, reg_base + GIO_STAT(bank->id)) & + gpio_generic_read_reg(&bank->chip, reg_base + GIO_MASK(bank->id)); } static unsigned long brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) { unsigned long status; - unsigned long flags; - raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&bank->chip); + status = __brcmstb_gpio_get_active_irqs(bank); - raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags); return status; } @@ -92,26 +92,26 @@ brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) static int brcmstb_gpio_hwirq_to_offset(irq_hw_number_t hwirq, struct brcmstb_gpio_bank *bank) { - return hwirq - bank->gc.offset; + return hwirq - bank->chip.gc.offset; } static void brcmstb_gpio_set_imask(struct brcmstb_gpio_bank *bank, unsigned int hwirq, bool enable) { - struct gpio_chip *gc = &bank->gc; struct brcmstb_gpio_priv *priv = bank->parent_priv; u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(hwirq, bank)); u32 imask; - unsigned long flags; - 
raw_spin_lock_irqsave(&gc->bgpio_lock, flags); - imask = gc->read_reg(priv->reg_base + GIO_MASK(bank->id)); + guard(gpio_generic_lock_irqsave)(&bank->chip); + + imask = gpio_generic_read_reg(&bank->chip, + priv->reg_base + GIO_MASK(bank->id)); if (enable) imask |= mask; else imask &= ~mask; - gc->write_reg(priv->reg_base + GIO_MASK(bank->id), imask); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_MASK(bank->id), imask); } static int brcmstb_gpio_to_irq(struct gpio_chip *gc, unsigned offset) @@ -150,7 +150,8 @@ static void brcmstb_gpio_irq_ack(struct irq_data *d) struct brcmstb_gpio_priv *priv = bank->parent_priv; u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(d->hwirq, bank)); - gc->write_reg(priv->reg_base + GIO_STAT(bank->id), mask); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_STAT(bank->id), mask); } static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) @@ -162,7 +163,6 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) u32 edge_insensitive, iedge_insensitive; u32 edge_config, iedge_config; u32 level, ilevel; - unsigned long flags; switch (type) { case IRQ_TYPE_LEVEL_LOW: @@ -194,23 +194,25 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) return -EINVAL; } - raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&bank->chip); - iedge_config = bank->gc.read_reg(priv->reg_base + - GIO_EC(bank->id)) & ~mask; - iedge_insensitive = bank->gc.read_reg(priv->reg_base + - GIO_EI(bank->id)) & ~mask; - ilevel = bank->gc.read_reg(priv->reg_base + - GIO_LEVEL(bank->id)) & ~mask; + iedge_config = gpio_generic_read_reg(&bank->chip, + priv->reg_base + GIO_EC(bank->id)) & ~mask; + iedge_insensitive = gpio_generic_read_reg(&bank->chip, + priv->reg_base + GIO_EI(bank->id)) & ~mask; + ilevel = gpio_generic_read_reg(&bank->chip, + priv->reg_base + GIO_LEVEL(bank->id)) & ~mask; - 
bank->gc.write_reg(priv->reg_base + GIO_EC(bank->id), - iedge_config | edge_config); - bank->gc.write_reg(priv->reg_base + GIO_EI(bank->id), - iedge_insensitive | edge_insensitive); - bank->gc.write_reg(priv->reg_base + GIO_LEVEL(bank->id), - ilevel | level); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_EC(bank->id), + iedge_config | edge_config); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_EI(bank->id), + iedge_insensitive | edge_insensitive); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_LEVEL(bank->id), + ilevel | level); - raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags); return 0; } @@ -263,7 +265,7 @@ static void brcmstb_gpio_irq_bank_handler(struct brcmstb_gpio_bank *bank) { struct brcmstb_gpio_priv *priv = bank->parent_priv; struct irq_domain *domain = priv->irq_domain; - int hwbase = bank->gc.offset; + int hwbase = bank->chip.gc.offset; unsigned long status; while ((status = brcmstb_gpio_get_active_irqs(bank))) { @@ -303,7 +305,7 @@ static struct brcmstb_gpio_bank *brcmstb_gpio_hwirq_to_bank( /* banks are in descending order */ list_for_each_entry_reverse(bank, &priv->bank_list, node) { - i += bank->gc.ngpio; + i += bank->chip.gc.ngpio; if (hwirq < i) return bank; } @@ -332,7 +334,7 @@ static int brcmstb_gpio_irq_map(struct irq_domain *d, unsigned int irq, dev_dbg(&pdev->dev, "Mapping irq %d for gpio line %d (bank %d)\n", irq, (int)hwirq, bank->id); - ret = irq_set_chip_data(irq, &bank->gc); + ret = irq_set_chip_data(irq, &bank->chip.gc); if (ret < 0) return ret; irq_set_lockdep_class(irq, &brcmstb_gpio_irq_lock_class, @@ -394,7 +396,7 @@ static void brcmstb_gpio_remove(struct platform_device *pdev) * more important to actually perform all of the steps. 
*/ list_for_each_entry(bank, &priv->bank_list, node) - gpiochip_remove(&bank->gc); + gpiochip_remove(&bank->chip.gc); } static int brcmstb_gpio_of_xlate(struct gpio_chip *gc, @@ -412,7 +414,7 @@ static int brcmstb_gpio_of_xlate(struct gpio_chip *gc, if (WARN_ON(gpiospec->args_count < gc->of_gpio_n_cells)) return -EINVAL; - offset = gpiospec->args[0] - bank->gc.offset; + offset = gpiospec->args[0] - bank->chip.gc.offset; if (offset >= gc->ngpio || offset < 0) return -EINVAL; @@ -493,19 +495,17 @@ static int brcmstb_gpio_irq_setup(struct platform_device *pdev, static void brcmstb_gpio_bank_save(struct brcmstb_gpio_priv *priv, struct brcmstb_gpio_bank *bank) { - struct gpio_chip *gc = &bank->gc; unsigned int i; for (i = 0; i < GIO_REG_STAT; i++) - bank->saved_regs[i] = gc->read_reg(priv->reg_base + - GIO_BANK_OFF(bank->id, i)); + bank->saved_regs[i] = gpio_generic_read_reg(&bank->chip, + priv->reg_base + GIO_BANK_OFF(bank->id, i)); } static void brcmstb_gpio_quiesce(struct device *dev, bool save) { struct brcmstb_gpio_priv *priv = dev_get_drvdata(dev); struct brcmstb_gpio_bank *bank; - struct gpio_chip *gc; u32 imask; /* disable non-wake interrupt */ @@ -513,8 +513,6 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save) disable_irq(priv->parent_irq); list_for_each_entry(bank, &priv->bank_list, node) { - gc = &bank->gc; - if (save) brcmstb_gpio_bank_save(priv, bank); @@ -523,8 +521,9 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save) imask = bank->wake_active; else imask = 0; - gc->write_reg(priv->reg_base + GIO_MASK(bank->id), - imask); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_MASK(bank->id), + imask); } } @@ -538,12 +537,12 @@ static void brcmstb_gpio_shutdown(struct platform_device *pdev) static void brcmstb_gpio_bank_restore(struct brcmstb_gpio_priv *priv, struct brcmstb_gpio_bank *bank) { - struct gpio_chip *gc = &bank->gc; unsigned int i; for (i = 0; i < GIO_REG_STAT; i++) - gc->write_reg(priv->reg_base + 
GIO_BANK_OFF(bank->id, i), - bank->saved_regs[i]); + gpio_generic_write_reg(&bank->chip, + priv->reg_base + GIO_BANK_OFF(bank->id, i), + bank->saved_regs[i]); } static int brcmstb_gpio_suspend(struct device *dev) @@ -585,6 +584,7 @@ static const struct dev_pm_ops brcmstb_gpio_pm_ops = { static int brcmstb_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct device *dev = &pdev->dev; struct device_node *np = dev->of_node; void __iomem *reg_base; @@ -665,17 +665,24 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) bank->width = bank_width; } + gc = &bank->chip.gc; + /* * Regs are 4 bytes wide, have data reg, no set/clear regs, * and direction bits have 0 = output and 1 = input */ - gc = &bank->gc; - err = bgpio_init(gc, dev, 4, - reg_base + GIO_DATA(bank->id), - NULL, NULL, NULL, - reg_base + GIO_IODIR(bank->id), flags); + + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = reg_base + GIO_DATA(bank->id), + .dirin = reg_base + GIO_IODIR(bank->id), + .flags = flags, + }; + + err = gpio_generic_chip_init(&bank->chip, &config); if (err) { - dev_err(dev, "bgpio_init() failed\n"); + dev_err(dev, "failed to initialize generic GPIO chip\n"); goto fail; } @@ -700,7 +707,8 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) * be retained from S5 cold boot */ need_wakeup_event |= !!__brcmstb_gpio_get_active_irqs(bank); - gc->write_reg(reg_base + GIO_MASK(bank->id), 0); + gpio_generic_write_reg(&bank->chip, + reg_base + GIO_MASK(bank->id), 0); err = gpiochip_add_data(gc, bank); if (err) { -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:35 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:35 +0200 Subject: [PATCH 08/15] gpio: mt7621: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: 
<20250909-gpio-mmio-gpio-conv-part4-v1-8-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-mt7621.c | 51 +++++++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 19 deletions(-) diff --git a/drivers/gpio/gpio-mt7621.c b/drivers/gpio/gpio-mt7621.c index 93facbebb80efadbdd3fb4500e0db14936287f1a..ed444cc8bc7c2b921be6588ce850027a2e3088b4 100644 --- a/drivers/gpio/gpio-mt7621.c +++ b/drivers/gpio/gpio-mt7621.c @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -30,7 +31,7 @@ struct mtk_gc { struct irq_chip irq_chip; - struct gpio_chip chip; + struct gpio_generic_chip chip; spinlock_t lock; int bank; u32 rising; @@ -59,27 +60,29 @@ struct mtk { static inline struct mtk_gc * to_mediatek_gpio(struct gpio_chip *chip) { - return container_of(chip, struct mtk_gc, chip); + struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(chip); + + return container_of(gen_gc, struct mtk_gc, chip); } static inline void mtk_gpio_w32(struct mtk_gc *rg, u32 offset, u32 val) { - struct gpio_chip *gc = &rg->chip; + struct gpio_chip *gc = &rg->chip.gc; struct mtk *mtk = gpiochip_get_data(gc); offset = (rg->bank * GPIO_BANK_STRIDE) + offset; - gc->write_reg(mtk->base + offset, val); + gpio_generic_write_reg(&rg->chip, mtk->base + offset, val); } static inline u32 mtk_gpio_r32(struct mtk_gc *rg, u32 offset) { - struct gpio_chip *gc = &rg->chip; + struct gpio_chip *gc = &rg->chip.gc; struct mtk *mtk = gpiochip_get_data(gc); offset = (rg->bank * GPIO_BANK_STRIDE) + offset; - return gc->read_reg(mtk->base + offset); + return gpio_generic_read_reg(&rg->chip, mtk->base + offset); } static irqreturn_t @@ -220,6 +223,7 @@ static const struct irq_chip mt7621_irq_chip = { static int mediatek_gpio_bank_probe(struct device *dev, int bank) { + struct gpio_generic_chip_config config; struct mtk *mtk = 
dev_get_drvdata(dev); struct mtk_gc *rg; void __iomem *dat, *set, *ctrl, *diro; @@ -236,21 +240,30 @@ mediatek_gpio_bank_probe(struct device *dev, int bank) ctrl = mtk->base + GPIO_REG_DCLR + (rg->bank * GPIO_BANK_STRIDE); diro = mtk->base + GPIO_REG_CTRL + (rg->bank * GPIO_BANK_STRIDE); - ret = bgpio_init(&rg->chip, dev, 4, dat, set, ctrl, diro, NULL, - BGPIOF_NO_SET_ON_INPUT); + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = dat, + .set = set, + .clr = ctrl, + .dirout = diro, + .flags = BGPIOF_NO_SET_ON_INPUT, + }; + + ret = gpio_generic_chip_init(&rg->chip, &config); if (ret) { - dev_err(dev, "bgpio_init() failed\n"); + dev_err(dev, "failed to initialize generic GPIO chip\n"); return ret; } - rg->chip.of_gpio_n_cells = 2; - rg->chip.of_xlate = mediatek_gpio_xlate; - rg->chip.label = devm_kasprintf(dev, GFP_KERNEL, "%s-bank%d", + rg->chip.gc.of_gpio_n_cells = 2; + rg->chip.gc.of_xlate = mediatek_gpio_xlate; + rg->chip.gc.label = devm_kasprintf(dev, GFP_KERNEL, "%s-bank%d", dev_name(dev), bank); - if (!rg->chip.label) + if (!rg->chip.gc.label) return -ENOMEM; - rg->chip.offset = bank * MTK_BANK_WIDTH; + rg->chip.gc.offset = bank * MTK_BANK_WIDTH; if (mtk->gpio_irq) { struct gpio_irq_chip *girq; @@ -261,7 +274,7 @@ mediatek_gpio_bank_probe(struct device *dev, int bank) */ ret = devm_request_irq(dev, mtk->gpio_irq, mediatek_gpio_irq_handler, IRQF_SHARED, - rg->chip.label, &rg->chip); + rg->chip.gc.label, &rg->chip.gc); if (ret) { dev_err(dev, "Error requesting IRQ %d: %d\n", @@ -269,7 +282,7 @@ mediatek_gpio_bank_probe(struct device *dev, int bank) return ret; } - girq = &rg->chip.irq; + girq = &rg->chip.gc.irq; gpio_irq_chip_set_chip(girq, &mt7621_irq_chip); /* This will let us handle the parent IRQ in the driver */ girq->parent_handler = NULL; @@ -279,17 +292,17 @@ mediatek_gpio_bank_probe(struct device *dev, int bank) girq->handler = handle_simple_irq; } - ret = devm_gpiochip_add_data(dev, &rg->chip, mtk); + ret = devm_gpiochip_add_data(dev, 
&rg->chip.gc, mtk); if (ret < 0) { dev_err(dev, "Could not register gpio %d, ret=%d\n", - rg->chip.ngpio, ret); + rg->chip.gc.ngpio, ret); return ret; } /* set polarity to low for all gpios */ mtk_gpio_w32(rg, GPIO_REG_POL, 0); - dev_info(dev, "registering %d gpios\n", rg->chip.ngpio); + dev_info(dev, "registering %d gpios\n", rg->chip.gc.ngpio); return 0; } -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:36 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:36 +0200 Subject: [PATCH 09/15] gpio: mt7621: use the generic GPIO chip lock for IRQ handling In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-9-9f723dc3524a@linaro.org> From: Bartosz Golaszewski This driver uses its own spinlock in interrupt routines while the generic GPIO chip callbacks use a separate one. This is, of course, racy so use the fact that the lock in generic GPIO chip is also a spinlock and convert the interrupt handling functions in this module to using the provided generic GPIO chip locking API. 
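The locking problem described above can be sketched in userspace. This is only an illustration under stated assumptions: a pthread mutex stands in for the generic chip's spinlock, a plain integer stands in for a register, and every name (fake_chip, chip_guard, chip_set_bit, chip_clear_bit) is hypothetical rather than taken from the driver. The scope-based guard relies on the GCC/Clang cleanup attribute, which is roughly the mechanism behind the kernel's guard() helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a chip: one lock, one "register". */
struct fake_chip {
	pthread_mutex_t lock;	/* the single lock every path must take */
	uint32_t reg;		/* stand-in for a memory-mapped register */
};

/* Called automatically when the guard variable leaves its scope. */
static void chip_unlock(struct fake_chip **chip)
{
	pthread_mutex_unlock(&(*chip)->lock);
}

static struct fake_chip *chip_lock(struct fake_chip *chip)
{
	pthread_mutex_lock(&chip->lock);
	return chip;
}

/*
 * Rough analogue of guard(): take the lock now, drop it at the end of
 * the enclosing scope. The fixed variable name allows one guard per scope.
 */
#define chip_guard(chip) \
	struct fake_chip *chip_guard_var \
		__attribute__((cleanup(chip_unlock), unused)) = chip_lock(chip)

/* "GPIO callback" path: read-modify-write that sets one bit. */
static void chip_set_bit(struct fake_chip *chip, unsigned int bit)
{
	assert(bit < 32);
	chip_guard(chip);
	chip->reg |= UINT32_C(1) << bit;
}

/* "IRQ mask" path: read-modify-write on the same register, same lock. */
static void chip_clear_bit(struct fake_chip *chip, unsigned int bit)
{
	assert(bit < 32);
	chip_guard(chip);
	chip->reg &= ~(UINT32_C(1) << bit);
}
```

Because both paths take the same lock, their read-modify-write sequences cannot interleave. If chip_clear_bit() used a private lock instead, it could overwrite a bit that chip_set_bit() had just written.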
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-mt7621.c | 29 ++++++++++++----------------- 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/drivers/gpio/gpio-mt7621.c b/drivers/gpio/gpio-mt7621.c index ed444cc8bc7c2b921be6588ce850027a2e3088b4..31736f12ca100ef615d4aa4b2c968db6b58ef4e1 100644 --- a/drivers/gpio/gpio-mt7621.c +++ b/drivers/gpio/gpio-mt7621.c @@ -11,7 +11,6 @@ #include #include #include -#include #define MTK_BANK_CNT 3 #define MTK_BANK_WIDTH 32 @@ -32,7 +31,6 @@ struct mtk_gc { struct irq_chip irq_chip; struct gpio_generic_chip chip; - spinlock_t lock; int bank; u32 rising; u32 falling; @@ -111,12 +109,12 @@ mediatek_gpio_irq_unmask(struct irq_data *d) struct gpio_chip *gc = irq_data_get_irq_chip_data(d); struct mtk_gc *rg = to_mediatek_gpio(gc); int pin = d->hwirq; - unsigned long flags; u32 rise, fall, high, low; gpiochip_enable_irq(gc, d->hwirq); - spin_lock_irqsave(&rg->lock, flags); + guard(gpio_generic_lock_irqsave)(&rg->chip); + rise = mtk_gpio_r32(rg, GPIO_REG_REDGE); fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE); high = mtk_gpio_r32(rg, GPIO_REG_HLVL); @@ -125,7 +123,6 @@ mediatek_gpio_irq_unmask(struct irq_data *d) mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall | (BIT(pin) & rg->falling)); mtk_gpio_w32(rg, GPIO_REG_HLVL, high | (BIT(pin) & rg->hlevel)); mtk_gpio_w32(rg, GPIO_REG_LLVL, low | (BIT(pin) & rg->llevel)); - spin_unlock_irqrestore(&rg->lock, flags); } static void @@ -134,19 +131,18 @@ mediatek_gpio_irq_mask(struct irq_data *d) struct gpio_chip *gc = irq_data_get_irq_chip_data(d); struct mtk_gc *rg = to_mediatek_gpio(gc); int pin = d->hwirq; - unsigned long flags; u32 rise, fall, high, low; - spin_lock_irqsave(&rg->lock, flags); - rise = mtk_gpio_r32(rg, GPIO_REG_REDGE); - fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE); - high = mtk_gpio_r32(rg, GPIO_REG_HLVL); - low = mtk_gpio_r32(rg, GPIO_REG_LLVL); - mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall & ~BIT(pin)); - mtk_gpio_w32(rg, GPIO_REG_REDGE, rise & ~BIT(pin)); - mtk_gpio_w32(rg, 
GPIO_REG_HLVL, high & ~BIT(pin)); - mtk_gpio_w32(rg, GPIO_REG_LLVL, low & ~BIT(pin)); - spin_unlock_irqrestore(&rg->lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &rg->chip) { + rise = mtk_gpio_r32(rg, GPIO_REG_REDGE); + fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE); + high = mtk_gpio_r32(rg, GPIO_REG_HLVL); + low = mtk_gpio_r32(rg, GPIO_REG_LLVL); + mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall & ~BIT(pin)); + mtk_gpio_w32(rg, GPIO_REG_REDGE, rise & ~BIT(pin)); + mtk_gpio_w32(rg, GPIO_REG_HLVL, high & ~BIT(pin)); + mtk_gpio_w32(rg, GPIO_REG_LLVL, low & ~BIT(pin)); + } gpiochip_disable_irq(gc, d->hwirq); } @@ -232,7 +228,6 @@ mediatek_gpio_bank_probe(struct device *dev, int bank) rg = &mtk->gc_map[bank]; memset(rg, 0, sizeof(*rg)); - spin_lock_init(&rg->lock); rg->bank = bank; dat = mtk->base + GPIO_REG_DATA + (rg->bank * GPIO_BANK_STRIDE); -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:37 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:37 +0200 Subject: [PATCH 10/15] gpio: menz127: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-10-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
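For readers unfamiliar with the config-structure style used in this conversion, here is a minimal sketch of the underlying C idiom, assuming a hypothetical struct demo_config that only mimics the field names seen in the diff (sz, dat, set, clr, dirout, dirin, flags). It is not the kernel structure. It shows that assigning a compound literal with designated initializers zeroes every member that is not named, which is why unused registers no longer need explicit NULL arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical config layout, loosely modelled on the fields in the diff. */
struct demo_config {
	int sz;
	void *dat;
	void *set;
	void *clr;
	void *dirout;
	void *dirin;
	unsigned long flags;
};

/* Stand-in for a block of device registers. */
static unsigned char demo_regs[16];

static struct demo_config demo_make_config(void)
{
	struct demo_config config;

	/* Fill with garbage first so the zeroing below is observable. */
	memset(&config, 0xff, sizeof(config));

	/* Members not named here (clr, dirin, flags) become NULL or 0. */
	config = (struct demo_config){
		.sz = 4,
		.dat = &demo_regs[4],
		.set = &demo_regs[0],
		.dirout = &demo_regs[8],
	};

	return config;
}

/* Counts how many register pointers the caller actually supplied. */
static int demo_count_registers(const struct demo_config *config)
{
	assert(config != NULL);

	return (config->dat != NULL) + (config->set != NULL) +
	       (config->clr != NULL) + (config->dirout != NULL) +
	       (config->dirin != NULL);
}
```

The patches spell the type as typeof(config). The sketch names the struct explicitly so that it also builds with compilers lacking that extension.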
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-menz127.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/gpio/gpio-menz127.c b/drivers/gpio/gpio-menz127.c index ebe5da4933bce730c70f83c1c0f86fc4a4cc9906..27cdbc36a5fd468dfcf9e0029651c8d22e176f56 100644 --- a/drivers/gpio/gpio-menz127.c +++ b/drivers/gpio/gpio-menz127.c @@ -12,6 +12,7 @@ #include #include #include +#include #define MEN_Z127_CTRL 0x00 #define MEN_Z127_PSR 0x04 @@ -30,7 +31,7 @@ (db <= MEN_Z127_DB_MAX_US)) struct men_z127_gpio { - struct gpio_chip gc; + struct gpio_generic_chip chip; void __iomem *reg_base; struct resource *mem; }; @@ -64,7 +65,7 @@ static int men_z127_debounce(struct gpio_chip *gc, unsigned gpio, debounce /= 50; } - raw_spin_lock(&gc->bgpio_lock); + guard(gpio_generic_lock)(&priv->chip); db_en = readl(priv->reg_base + MEN_Z127_DBER); @@ -79,8 +80,6 @@ static int men_z127_debounce(struct gpio_chip *gc, unsigned gpio, writel(db_en, priv->reg_base + MEN_Z127_DBER); writel(db_cnt, priv->reg_base + GPIO_TO_DBCNT_REG(gpio)); - raw_spin_unlock(&gc->bgpio_lock); - return 0; } @@ -91,7 +90,8 @@ static int men_z127_set_single_ended(struct gpio_chip *gc, struct men_z127_gpio *priv = gpiochip_get_data(gc); u32 od_en; - raw_spin_lock(&gc->bgpio_lock); + guard(gpio_generic_lock)(&priv->chip); + od_en = readl(priv->reg_base + MEN_Z127_ODER); if (param == PIN_CONFIG_DRIVE_OPEN_DRAIN) @@ -101,7 +101,6 @@ static int men_z127_set_single_ended(struct gpio_chip *gc, od_en &= ~BIT(offset); writel(od_en, priv->reg_base + MEN_Z127_ODER); - raw_spin_unlock(&gc->bgpio_lock); return 0; } @@ -137,6 +136,7 @@ static void men_z127_release_mem(void *data) static int men_z127_probe(struct mcb_device *mdev, const struct mcb_device_id *id) { + struct gpio_generic_chip_config config; struct men_z127_gpio *men_z127_gpio; struct device *dev = &mdev->dev; int ret; @@ -163,18 +163,21 @@ static int men_z127_probe(struct mcb_device *mdev, 
mcb_set_drvdata(mdev, men_z127_gpio); - ret = bgpio_init(&men_z127_gpio->gc, &mdev->dev, 4, - men_z127_gpio->reg_base + MEN_Z127_PSR, - men_z127_gpio->reg_base + MEN_Z127_CTRL, - NULL, - men_z127_gpio->reg_base + MEN_Z127_GPIODR, - NULL, 0); + config = (typeof(config)){ + .dev = &mdev->dev, + .sz = 4, + .dat = men_z127_gpio->reg_base + MEN_Z127_PSR, + .set = men_z127_gpio->reg_base + MEN_Z127_CTRL, + .dirout = men_z127_gpio->reg_base + MEN_Z127_GPIODR, + }; + + ret = gpio_generic_chip_init(&men_z127_gpio->chip, &config); if (ret) return ret; - men_z127_gpio->gc.set_config = men_z127_set_config; + men_z127_gpio->chip.gc.set_config = men_z127_set_config; - ret = devm_gpiochip_add_data(dev, &men_z127_gpio->gc, men_z127_gpio); + ret = devm_gpiochip_add_data(dev, &men_z127_gpio->chip.gc, men_z127_gpio); if (ret) return dev_err_probe(dev, ret, "failed to register MEN 16Z127 GPIO controller"); -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:38 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:38 +0200 Subject: [PATCH 11/15] gpio: sifive: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-11-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-sifive.c | 73 ++++++++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 35 deletions(-) diff --git a/drivers/gpio/gpio-sifive.c b/drivers/gpio/gpio-sifive.c index 98ef975c44d9a6c9238605cfd1d5820fd70a66ca..07ee5c0b4f8023978c76873f25119d5dc21d996c 100644 --- a/drivers/gpio/gpio-sifive.c +++ b/drivers/gpio/gpio-sifive.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -32,7 +33,7 @@ struct sifive_gpio { void __iomem *base; - struct gpio_chip gc; + struct gpio_generic_chip gen_gc; struct regmap *regs; unsigned long irq_state; unsigned int trigger[SIFIVE_GPIO_MAX]; @@ -41,10 +42,10 @@ struct sifive_gpio { static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) { - unsigned long flags; unsigned int trigger; - raw_spin_lock_irqsave(&chip->gc.bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&chip->gen_gc); + trigger = (chip->irq_state & BIT(offset)) ? chip->trigger[offset] : 0; regmap_update_bits(chip->regs, SIFIVE_GPIO_RISE_IE, BIT(offset), (trigger & IRQ_TYPE_EDGE_RISING) ? BIT(offset) : 0); @@ -54,7 +55,6 @@ static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) (trigger & IRQ_TYPE_LEVEL_HIGH) ? BIT(offset) : 0); regmap_update_bits(chip->regs, SIFIVE_GPIO_LOW_IE, BIT(offset), (trigger & IRQ_TYPE_LEVEL_LOW) ? 
BIT(offset) : 0); - raw_spin_unlock_irqrestore(&chip->gc.bgpio_lock, flags); } static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) @@ -72,13 +72,12 @@ static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) } static void sifive_gpio_irq_enable(struct irq_data *d) -{ + { struct gpio_chip *gc = irq_data_get_irq_chip_data(d); struct sifive_gpio *chip = gpiochip_get_data(gc); irq_hw_number_t hwirq = irqd_to_hwirq(d); int offset = hwirq % SIFIVE_GPIO_MAX; u32 bit = BIT(offset); - unsigned long flags; gpiochip_enable_irq(gc, hwirq); irq_chip_enable_parent(d); @@ -86,13 +85,13 @@ static void sifive_gpio_irq_enable(struct irq_data *d) /* Switch to input */ gc->direction_input(gc, offset); - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); - /* Clear any sticky pending interrupts */ - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); - regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { + /* Clear any sticky pending interrupts */ + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); + } /* Enable interrupts */ assign_bit(offset, &chip->irq_state, 1); @@ -118,15 +117,14 @@ static void sifive_gpio_irq_eoi(struct irq_data *d) struct sifive_gpio *chip = gpiochip_get_data(gc); int offset = irqd_to_hwirq(d) % SIFIVE_GPIO_MAX; u32 bit = BIT(offset); - unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); - /* Clear all pending interrupts */ - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); - regmap_write(chip->regs, 
SIFIVE_GPIO_LOW_IP, bit); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { + /* Clear all pending interrupts */ + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); + } irq_chip_eoi_parent(d); } @@ -179,6 +177,7 @@ static const struct regmap_config sifive_gpio_regmap_config = { static int sifive_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct device *dev = &pdev->dev; struct irq_domain *parent; struct gpio_irq_chip *girq; @@ -217,13 +216,17 @@ static int sifive_gpio_probe(struct platform_device *pdev) */ parent = irq_get_irq_data(chip->irq_number[0])->domain; - ret = bgpio_init(&chip->gc, dev, 4, - chip->base + SIFIVE_GPIO_INPUT_VAL, - chip->base + SIFIVE_GPIO_OUTPUT_VAL, - NULL, - chip->base + SIFIVE_GPIO_OUTPUT_EN, - chip->base + SIFIVE_GPIO_INPUT_EN, - BGPIOF_READ_OUTPUT_REG_SET); + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = chip->base + SIFIVE_GPIO_INPUT_VAL, + .set = chip->base + SIFIVE_GPIO_OUTPUT_VAL, + .dirout = chip->base + SIFIVE_GPIO_OUTPUT_EN, + .dirin = chip->base + SIFIVE_GPIO_INPUT_EN, + .flags = BGPIOF_READ_OUTPUT_REG_SET, + }; + + ret = gpio_generic_chip_init(&chip->gen_gc, &config); if (ret) { dev_err(dev, "unable to init generic GPIO\n"); return ret; @@ -236,12 +239,12 @@ static int sifive_gpio_probe(struct platform_device *pdev) regmap_write(chip->regs, SIFIVE_GPIO_LOW_IE, 0); chip->irq_state = 0; - chip->gc.base = -1; - chip->gc.ngpio = ngpio; - chip->gc.label = dev_name(dev); - chip->gc.parent = dev; - chip->gc.owner = THIS_MODULE; - girq = &chip->gc.irq; + chip->gen_gc.gc.base = -1; + chip->gen_gc.gc.ngpio = ngpio; + chip->gen_gc.gc.label = dev_name(dev); + chip->gen_gc.gc.parent = dev; + chip->gen_gc.gc.owner = THIS_MODULE; + girq = &chip->gen_gc.gc.irq; 
gpio_irq_chip_set_chip(girq, &sifive_gpio_irqchip); girq->fwnode = dev_fwnode(dev); girq->parent_domain = parent; @@ -249,7 +252,7 @@ static int sifive_gpio_probe(struct platform_device *pdev) girq->handler = handle_bad_irq; girq->default_type = IRQ_TYPE_NONE; - return gpiochip_add_data(&chip->gc, chip); + return gpiochip_add_data(&chip->gen_gc.gc, chip); } static const struct of_device_id sifive_gpio_match[] = { -- 2.48.1 From dlan at gentoo.org Tue Sep 9 02:15:03 2025 From: dlan at gentoo.org (Yixun Lan) Date: Tue, 9 Sep 2025 17:15:03 +0800 Subject: [GIT PULL] clk: spacemit: Updates for v6.18 Message-ID: <20250909171321-GYC7803064@gentoo.org> Hi Stephen, Please pull SpacemiT's clock changes for v6.18 Yixun Lan The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) are available in the Git repository at: https://github.com/spacemit-com/linux tags/spacemit-clk-for-6.18-1 for you to fetch changes up to d02c71cba7bba453d233a49497412ddbf2d44871: clk: spacemit: ccu_pll: convert from round_rate() to determine_rate() (2025-08-26 06:07:45 +0800) ---------------------------------------------------------------- RISC-V SpacemiT clock changes for 6.18 - Convert to use determine_rate() - Fix clocks of SSPA ---------------------------------------------------------------- Brian Masney (3): clk: spacemit: ccu_ddn: convert from round_rate() to determine_rate() clk: spacemit: ccu_mix: convert from round_rate() to determine_rate() clk: spacemit: ccu_pll: convert from round_rate() to determine_rate() Troy Mitchell (2): dt-bindings: clock: spacemit: CLK_SSPA_I2S_BCLK for SSPA clk: spacemit: fix sspax_clk drivers/clk/spacemit/ccu-k1.c | 29 ++++++++++++++++++++++---- drivers/clk/spacemit/ccu_ddn.c | 11 ++++++---- drivers/clk/spacemit/ccu_mix.c | 12 ++++++----- drivers/clk/spacemit/ccu_pll.c | 10 +++++---- include/dt-bindings/clock/spacemit,k1-syscon.h | 2 ++ 5 files changed, 47 insertions(+), 17 deletions(-) From brgl at 
bgdev.pl Tue Sep 9 02:15:39 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:39 +0200 Subject: [PATCH 12/15] gpio: spacemit-k1: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-12-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-spacemit-k1.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/gpio/gpio-spacemit-k1.c b/drivers/gpio/gpio-spacemit-k1.c index 3cc75c701ec40194e602b80d3f96f23204ce3b4d..9e57f43d3d13ad28fcd3327ecdc3f359691a44c9 100644 --- a/drivers/gpio/gpio-spacemit-k1.c +++ b/drivers/gpio/gpio-spacemit-k1.c @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -38,7 +39,7 @@ struct spacemit_gpio; struct spacemit_gpio_bank { - struct gpio_chip gc; + struct gpio_generic_chip chip; struct spacemit_gpio *sg; void __iomem *base; u32 irq_mask; @@ -72,7 +73,7 @@ static irqreturn_t spacemit_gpio_irq_handler(int irq, void *dev_id) return IRQ_NONE; for_each_set_bit(n, &pending, BITS_PER_LONG) - handle_nested_irq(irq_find_mapping(gb->gc.irq.domain, n)); + handle_nested_irq(irq_find_mapping(gb->chip.gc.irq.domain, n)); return IRQ_HANDLED; } @@ -143,7 +144,7 @@ static void spacemit_gpio_irq_print_chip(struct irq_data *data, struct seq_file { struct spacemit_gpio_bank *gb = irq_data_get_irq_chip_data(data); - seq_printf(p, "%s-%d", dev_name(gb->gc.parent), spacemit_gpio_bank_index(gb)); + seq_printf(p, "%s-%d", dev_name(gb->chip.gc.parent), spacemit_gpio_bank_index(gb)); } static struct irq_chip spacemit_gpio_chip = { @@ -165,7 +166,7 @@ static bool spacemit_of_node_instance_match(struct gpio_chip *gc, unsigned int i if (i >= 
SPACEMIT_NR_BANKS) return false; - return (gc == &sg->sgb[i].gc); + return (gc == &sg->sgb[i].chip.gc); } static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, @@ -173,7 +174,8 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, int index, int irq) { struct spacemit_gpio_bank *gb = &sg->sgb[index]; - struct gpio_chip *gc = &gb->gc; + struct gpio_generic_chip_config config; + struct gpio_chip *gc = &gb->chip.gc; struct device *dev = sg->dev; struct gpio_irq_chip *girq; void __iomem *dat, *set, *clr, *dirin, *dirout; @@ -187,9 +189,19 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, dirin = gb->base + SPACEMIT_GCDR; dirout = gb->base + SPACEMIT_GSDR; + config = (typeof(config)){ + .dev = dev, + .sz = 4, + .dat = dat, + .set = set, + .clr = clr, + .dirout = dirout, + .dirin = dirin, + .flags = BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR, + }; + /* This registers 32 GPIO lines per bank */ - ret = bgpio_init(gc, dev, 4, dat, set, clr, dirout, dirin, - BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR); + ret = gpio_generic_chip_init(&gb->chip, &config); if (ret) return dev_err_probe(dev, ret, "failed to init gpio chip\n"); @@ -221,7 +233,7 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, ret = devm_request_threaded_irq(dev, irq, NULL, spacemit_gpio_irq_handler, IRQF_ONESHOT | IRQF_SHARED, - gb->gc.label, gb); + gb->chip.gc.label, gb); if (ret < 0) return dev_err_probe(dev, ret, "failed to register IRQ\n"); -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:40 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:40 +0200 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic 
GPIO chip interfaces from linux/gpio/generic.h. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-sodaville.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpio/gpio-sodaville.c b/drivers/gpio/gpio-sodaville.c index abd13c79ace09db228e975f93c92e727d3864ef8..6bc224d3a561077bf3438a70591e1f313ac834f3 100644 --- a/drivers/gpio/gpio-sodaville.c +++ b/drivers/gpio/gpio-sodaville.c @@ -9,6 +9,7 @@ #include #include +#include #include #include #include @@ -39,7 +40,7 @@ struct sdv_gpio_chip_data { void __iomem *gpio_pub_base; struct irq_domain *id; struct irq_chip_generic *gc; - struct gpio_chip chip; + struct gpio_generic_chip gen_gc; }; static int sdv_gpio_pub_set_type(struct irq_data *d, unsigned int type) @@ -180,6 +181,7 @@ static int sdv_register_irqsupport(struct sdv_gpio_chip_data *sd, static int sdv_gpio_probe(struct pci_dev *pdev, const struct pci_device_id *pci_id) { + struct gpio_generic_chip_config config; struct sdv_gpio_chip_data *sd; int ret; u32 mux_val; @@ -206,15 +208,21 @@ static int sdv_gpio_probe(struct pci_dev *pdev, if (!ret) writel(mux_val, sd->gpio_pub_base + GPMUXCTL); - ret = bgpio_init(&sd->chip, &pdev->dev, 4, - sd->gpio_pub_base + GPINR, sd->gpio_pub_base + GPOUTR, - NULL, sd->gpio_pub_base + GPOER, NULL, 0); + config = (typeof(config)){ + .dev = &pdev->dev, + .sz = 4, + .dat = sd->gpio_pub_base + GPINR, + .set = sd->gpio_pub_base + GPOUTR, + .dirout = sd->gpio_pub_base + GPOER, + }; + + ret = gpio_generic_chip_init(&sd->gen_gc, &config); if (ret) return ret; - sd->chip.ngpio = SDV_NUM_PUB_GPIOS; + sd->gen_gc.gc.ngpio = SDV_NUM_PUB_GPIOS; - ret = devm_gpiochip_add_data(&pdev->dev, &sd->chip, sd); + ret = devm_gpiochip_add_data(&pdev->dev, &sd->gen_gc.gc, sd); if (ret < 0) { dev_err(&pdev->dev, "gpiochip_add() failed.\n"); return ret; -- 2.48.1 From ajd at linux.ibm.com Tue Sep 9 02:13:35 2025 From: ajd at linux.ibm.com (Andrew Donnellan) Date: Tue, 9 Sep 2025 19:13:35 +1000 Subject: 
[PATCH v17 12/12] powerpc: mm: Support page table check In-Reply-To: <20250909091335.183439-1-ajd@linux.ibm.com> References: <20250909091335.183439-1-ajd@linux.ibm.com> Message-ID: <20250909091335.183439-13-ajd@linux.ibm.com> From: Rohan McLure On creation and clearing of a page table mapping, instrument such calls by invoking page_table_check_pte_set and page_table_check_pte_clear respectively. These calls serve as a sanity check against illegal mappings. Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all platforms. See also: riscv support in commit 3fee229a8eb9 ("riscv/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") arm64 in commit 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") x86_64 in commit d283d422c6c4 ("x86: mm: add x86_64 support for page table check") [ajd at linux.ibm.com: rebase] Reviewed-by: Christophe Leroy Signed-off-by: Rohan McLure Reviewed-by: Pasha Tatashin Signed-off-by: Andrew Donnellan Acked-by: Madhavan Srinivasan --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 7 ++- arch/powerpc/include/asm/book3s/64/pgtable.h | 45 +++++++++++++++----- arch/powerpc/include/asm/nohash/pgtable.h | 8 +++- arch/powerpc/mm/book3s64/hash_pgtable.c | 4 ++ arch/powerpc/mm/book3s64/pgtable.c | 11 +++-- arch/powerpc/mm/book3s64/radix_pgtable.c | 3 ++ arch/powerpc/mm/pgtable.c | 4 ++ 8 files changed, 68 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 4730c676b6bf..0d3e26a6c308 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -171,6 +171,7 @@ config PPC select ARCH_STACKWALK select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC_BOOK3S || PPC_8xx + select ARCH_SUPPORTS_PAGE_TABLE_CHECK select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF if PPC64 select ARCH_USE_MEMTEST diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index b225967f85ea..68864a71ca5f 100644 --- 
a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -202,6 +202,7 @@ void unmap_kernel_page(unsigned long va); #ifndef __ASSEMBLY__ #include #include +#include /* Bits to mask out from a PGD to get to the PUD page */ #define PGD_MASKED_BITS 0 @@ -315,7 +316,11 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0)); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_SET_WRPROTECT diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 48f3a41317dd..81c220bcbd26 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -144,6 +144,8 @@ #define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX) #ifndef __ASSEMBLY__ +#include + /* * page table defines */ @@ -416,8 +418,11 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0); - return __pte(old); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL @@ -426,11 +431,16 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, pte_t *ptep, int full) { if (full && radix_enabled()) { + pte_t old_pte; + /* * We know that this is a full mm pte clear and * hence can be sure there is no parallel set_pte. 
*/ - return radix__ptep_get_and_clear_full(mm, addr, ptep, full); + old_pte = radix__ptep_get_and_clear_full(mm, addr, ptep, full); + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } return ptep_get_and_clear(mm, addr, ptep); } @@ -1289,19 +1299,34 @@ extern int pudp_test_and_clear_young(struct vm_area_struct *vma, static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { - if (radix_enabled()) - return radix__pmdp_huge_get_and_clear(mm, addr, pmdp); - return hash__pmdp_huge_get_and_clear(mm, addr, pmdp); + pmd_t old_pmd; + + if (radix_enabled()) { + old_pmd = radix__pmdp_huge_get_and_clear(mm, addr, pmdp); + } else { + old_pmd = hash__pmdp_huge_get_and_clear(mm, addr, pmdp); + } + + page_table_check_pmd_clear(mm, addr, old_pmd); + + return old_pmd; } #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pud_t *pudp) { - if (radix_enabled()) - return radix__pudp_huge_get_and_clear(mm, addr, pudp); - BUG(); - return *pudp; + pud_t old_pud; + + if (radix_enabled()) { + old_pud = radix__pudp_huge_get_and_clear(mm, addr, pudp); + } else { + BUG(); + } + + page_table_check_pud_clear(mm, addr, old_pud); + + return old_pud; } static inline pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index a8bc4f24beb1..3a6630dca615 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -29,6 +29,8 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p #ifndef __ASSEMBLY__ +#include + extern int icache_44x_need_flush; #ifndef pte_huge_size @@ -122,7 +124,11 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - return __pte(pte_update(mm, addr, 
ptep, ~0UL, 0, 0)); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_GET_AND_CLEAR diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c index 82d31177630b..ac2a24d15d2e 100644 --- a/arch/powerpc/mm/book3s64/hash_pgtable.c +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -230,6 +231,9 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres pmd = *pmdp; pmd_clear(pmdp); + + page_table_check_pmd_clear(vma->vm_mm, address, pmd); + /* * Wait for all pending hash_page to finish. This is needed * in case of subpage collapse. When we collapse normal pages diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index ff0c5a1988f8..8be06a3cfcbc 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include @@ -127,6 +128,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pmd_leaf(pmd))); #endif trace_hugepage_set_pmd(addr, pmd_val(pmd)); + page_table_check_pmd_set(mm, addr, pmdp, pmd); return set_pte_at_unchecked(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd)); } @@ -144,6 +146,7 @@ void set_pud_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pud_leaf(pud))); #endif trace_hugepage_set_pud(addr, pud_val(pud)); + page_table_check_pud_set(mm, addr, pudp, pud); return set_pte_at_unchecked(mm, addr, pudp_ptep(pudp), pud_pte(pud)); } @@ -179,12 +182,14 @@ void serialize_against_pte_lookup(struct mm_struct *mm) pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp) { - unsigned long old_pmd; + pmd_t old_pmd; VM_WARN_ON_ONCE(!pmd_present(*pmdp)); - old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID); + old_pmd = 
__pmd(pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID)); flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE); - return __pmd(old_pmd); + page_table_check_pmd_clear(vma->vm_mm, address, old_pmd); + + return old_pmd; } pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index b2541bf33d01..10aced261cff 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -1474,6 +1475,8 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre pmd = *pmdp; pmd_clear(pmdp); + page_table_check_pmd_clear(vma->vm_mm, address, pmd); + radix__flush_tlb_collapsed_pmd(vma->vm_mm, address); return pmd; diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 7b69cd16e011..a9be337be3e4 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -206,6 +207,9 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, * and not hw_valid ptes. Hence there is no translation cache flush * involved that need to be batched. */ + + page_table_check_ptes_set(mm, addr, ptep, pte, nr); + for (;;) { /* -- 2.51.0 From brgl at bgdev.pl Tue Sep 9 02:15:41 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:41 +0200 Subject: [PATCH 14/15] gpio: mmio: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-14-9f723dc3524a@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
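For readers unfamiliar with the config-struct style this conversion adopts, here is a minimal userspace sketch of the idea. It is not the kernel implementation: `mock_chip_config`, `mock_chip` and `mock_chip_init()` are hypothetical stand-ins. Only the three validation steps are taken from the diff in this series: the register size must be a power of two, the resulting width must fit in an unsigned long, and the data register is required.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct mock_chip_config {
	unsigned long sz;	/* register width in bytes */
	void *dat;		/* data register (required) */
	void *set;		/* optional "set" register */
	void *clr;		/* optional "clear" register */
	unsigned long flags;	/* behaviour flags, unused in this sketch */
};

struct mock_chip {
	unsigned long bits;	/* register width in bits */
	void *reg_dat;
	void *reg_set;
	void *reg_clr;
};

static int mock_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * All parameters travel in one config structure instead of a long
 * positional argument list, so callers can name the fields they set
 * and leave the rest zeroed.
 */
static int mock_chip_init(struct mock_chip *chip,
			  const struct mock_chip_config *cfg)
{
	if (!mock_is_power_of_2(cfg->sz))
		return -EINVAL;

	chip->bits = cfg->sz * 8;
	if (chip->bits > sizeof(unsigned long) * CHAR_BIT)
		return -EINVAL;

	if (!cfg->dat)
		return -EINVAL;

	chip->reg_dat = cfg->dat;
	chip->reg_set = cfg->set;
	chip->reg_clr = cfg->clr;

	return 0;
}
```

A caller would fill the structure with designated initializers, for example `struct mock_chip_config cfg = { .sz = 4, .dat = &reg };`, and pass its address to `mock_chip_init()`. Fields that are not named stay zero, which is why the optional registers can simply be omitted.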
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-mmio.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/gpio/gpio-mmio.c b/drivers/gpio/gpio-mmio.c index 79e1be149c94842cb6fa6b657343b11e78701220..a5e2f8a826af40ec96d2a3ea58240f1ca8ed250c 100644 --- a/drivers/gpio/gpio-mmio.c +++ b/drivers/gpio/gpio-mmio.c @@ -57,6 +57,7 @@ o ` ~~~~\___/~~~~ ` controller in FPGA is ,.` #include #include +#include #include "gpiolib.h" @@ -737,6 +738,8 @@ MODULE_DEVICE_TABLE(of, bgpio_of_match); static int bgpio_pdev_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; + struct gpio_generic_chip *gen_gc; struct device *dev = &pdev->dev; struct resource *r; void __iomem *dat; @@ -748,7 +751,6 @@ static int bgpio_pdev_probe(struct platform_device *pdev) unsigned long flags = 0; unsigned int base; int err; - struct gpio_chip *gc; const char *label; r = platform_get_resource_byname(pdev, IORESOURCE_MEM, "dat"); @@ -777,8 +779,8 @@ static int bgpio_pdev_probe(struct platform_device *pdev) if (IS_ERR(dirin)) return PTR_ERR(dirin); - gc = devm_kzalloc(&pdev->dev, sizeof(*gc), GFP_KERNEL); - if (!gc) + gen_gc = devm_kzalloc(&pdev->dev, sizeof(*gen_gc), GFP_KERNEL); + if (!gen_gc) return -ENOMEM; if (device_is_big_endian(dev)) @@ -787,13 +789,24 @@ static int bgpio_pdev_probe(struct platform_device *pdev) if (device_property_read_bool(dev, "no-output")) flags |= BGPIOF_NO_OUTPUT; - err = bgpio_init(gc, dev, sz, dat, set, clr, dirout, dirin, flags); + config = (typeof(config)){ + .dev = dev, + .sz = sz, + .dat = dat, + .set = set, + .clr = clr, + .dirout = dirout, + .dirin = dirin, + .flags = flags, + }; + + err = gpio_generic_chip_init(gen_gc, &config); if (err) return err; err = device_property_read_string(dev, "label", &label); if (!err) - gc->label = label; + gen_gc->gc.label = label; /* * This property *must not* be used in device-tree sources, it's only @@ -801,11 +814,11 @@ static int 
bgpio_pdev_probe(struct platform_device *pdev) */ err = device_property_read_u32(dev, "gpio-mmio,base", &base); if (!err && base <= INT_MAX) - gc->base = base; + gen_gc->gc.base = base; - platform_set_drvdata(pdev, gc); + platform_set_drvdata(pdev, &gen_gc->gc); - return devm_gpiochip_add_data(&pdev->dev, gc, NULL); + return devm_gpiochip_add_data(&pdev->dev, &gen_gc->gc, NULL); } static const struct platform_device_id bgpio_id_table[] = { -- 2.48.1 From brgl at bgdev.pl Tue Sep 9 02:15:42 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 09 Sep 2025 11:15:42 +0200 Subject: [PATCH 15/15] gpio: move gpio-mmio-specific fields out of struct gpio_chip In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> Message-ID: <20250909-gpio-mmio-gpio-conv-part4-v1-15-9f723dc3524a@linaro.org> From: Bartosz Golaszewski With all users of bgpio_init() converted to using the modernized generic GPIO chip API, we can now move the gpio-mmio-specific fields out of struct gpio_chip and into the dedicated struct gpio_generic_chip. To that end: adjust the gpio-mmio driver to the new layout, update the docs, etc. The changes in gpio-mlxbf2.c and gpio-mpc8xxx.c are here and not in their respective conversion commits because the former passes the address of the generic chip's lock to the __releases() annotation and we cannot really hide it while gpio-mpc8xxx.c accesses the shadow registers in a driver-specific workaround and there's no reason to make them available in a public API. Also: drop the relevant task from TODO as it's now done. 
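The layout change described above can be illustrated with a small userspace sketch. It assumes the wrapper is recovered from the embedded chip by pointer arithmetic (presumably what to_gpio_generic_chip() does via container_of(), though that helper's body is not shown in this series). All `mock_*` names are hypothetical. The mask computation follows bgpio_line2mask() from the diff.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct gpio_chip: only generic state lives here. */
struct mock_gpio_chip {
	unsigned int ngpio;
};

/*
 * Hypothetical stand-in for struct gpio_generic_chip: the MMIO-specific
 * fields sit next to the embedded chip instead of inside it.
 */
struct mock_generic_chip {
	struct mock_gpio_chip gc;
	unsigned int bits;	/* register width in bits */
	int be_bits;		/* non-zero: big-endian bit order */
	unsigned long sdata;	/* shadowed data register */
};

/* Recover the wrapper from a pointer to its embedded member. */
static struct mock_generic_chip *to_mock_generic_chip(struct mock_gpio_chip *gc)
{
	return (struct mock_generic_chip *)
		((char *)gc - offsetof(struct mock_generic_chip, gc));
}

/* Mirrors bgpio_line2mask(): line 0 is the top bit in big-endian bit order. */
static unsigned long mock_line2mask(struct mock_gpio_chip *gc, unsigned int line)
{
	struct mock_generic_chip *chip = to_mock_generic_chip(gc);

	if (chip->be_bits)
		return 1UL << (chip->bits - 1 - line);

	return 1UL << line;
}

/* Update the shadow register only; no locking or hardware write here. */
static void mock_set(struct mock_gpio_chip *gc, unsigned int line, int val)
{
	struct mock_generic_chip *chip = to_mock_generic_chip(gc);
	unsigned long mask = mock_line2mask(gc, line);

	if (val)
		chip->sdata |= mask;
	else
		chip->sdata &= ~mask;
}
```

Callbacks keep taking a pointer to the embedded chip, so their signatures stay unchanged, while chips that never use the helper library no longer carry the extra fields.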
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/TODO | 5 - drivers/gpio/gpio-mlxbf2.c | 2 +- drivers/gpio/gpio-mmio.c | 321 ++++++++++++++++++++++--------------------- drivers/gpio/gpio-mpc8xxx.c | 5 +- include/linux/gpio/driver.h | 44 ------ include/linux/gpio/generic.h | 67 ++++++--- 6 files changed, 211 insertions(+), 233 deletions(-) diff --git a/drivers/gpio/TODO b/drivers/gpio/TODO index b797499e627ee9fdb1ee9c564b8278241f720850..8ed74e05903a972e99e0789319ed19ebd8545a1a 100644 --- a/drivers/gpio/TODO +++ b/drivers/gpio/TODO @@ -131,11 +131,6 @@ Work items: helpers (x86 inb()/outb()) and convert port-mapped I/O drivers to use this with dry-coding and sending to maintainers to test -- Move the MMIO GPIO specific fields out of struct gpio_chip into a - dedicated structure. Currently every GPIO chip has them if gpio-mmio is - enabled in Kconfig even if it itself doesn't register with the helper - library. - ------------------------------------------------------------------------------- Generic regmap GPIO diff --git a/drivers/gpio/gpio-mlxbf2.c b/drivers/gpio/gpio-mlxbf2.c index f99f66cd189ca71c9d188dff0a0b42ef2223abb3..9520d26b20a5851ac8b5de239b8f5980dabc2820 100644 --- a/drivers/gpio/gpio-mlxbf2.c +++ b/drivers/gpio/gpio-mlxbf2.c @@ -156,7 +156,7 @@ static int mlxbf2_gpio_lock_acquire(struct mlxbf2_gpio_context *gs) * Release the YU arm_gpio_lock after changing the direction mode. 
*/ static void mlxbf2_gpio_lock_release(struct mlxbf2_gpio_context *gs) - __releases(&gs->chip.gc.bgpio_lock) + __releases(&gs->chip.lock) __releases(yu_arm_gpio_lock_param.lock) { writel(YU_ARM_GPIO_LOCK_RELEASE, yu_arm_gpio_lock_param.io); diff --git a/drivers/gpio/gpio-mmio.c b/drivers/gpio/gpio-mmio.c index a5e2f8a826af40ec96d2a3ea58240f1ca8ed250c..2fea986e10b87553f6847e96fe214ba3da76c0e9 100644 --- a/drivers/gpio/gpio-mmio.c +++ b/drivers/gpio/gpio-mmio.c @@ -125,20 +125,23 @@ static unsigned long bgpio_read32be(void __iomem *reg) static unsigned long bgpio_line2mask(struct gpio_chip *gc, unsigned int line) { - if (gc->be_bits) - return BIT(gc->bgpio_bits - 1 - line); + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + if (chip->be_bits) + return BIT(chip->bits - 1 - line); return BIT(line); } static int bgpio_get_set(struct gpio_chip *gc, unsigned int gpio) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long pinmask = bgpio_line2mask(gc, gpio); - bool dir = !!(gc->bgpio_dir & pinmask); + bool dir = !!(chip->sdir & pinmask); if (dir) - return !!(gc->read_reg(gc->reg_set) & pinmask); - else - return !!(gc->read_reg(gc->reg_dat) & pinmask); + return !!(chip->read_reg(chip->reg_set) & pinmask); + + return !!(chip->read_reg(chip->reg_dat) & pinmask); } /* @@ -148,26 +151,28 @@ static int bgpio_get_set(struct gpio_chip *gc, unsigned int gpio) static int bgpio_get_set_multiple(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { - unsigned long get_mask = 0; - unsigned long set_mask = 0; + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + unsigned long get_mask = 0, set_mask = 0; /* Make sure we first clear any bits that are zero when we read the register */ *bits &= ~*mask; - set_mask = *mask & gc->bgpio_dir; - get_mask = *mask & ~gc->bgpio_dir; + set_mask = *mask & chip->sdir; + get_mask = *mask & ~chip->sdir; if (set_mask) - *bits |= gc->read_reg(gc->reg_set) & set_mask; + *bits |= 
chip->read_reg(chip->reg_set) & set_mask; if (get_mask) - *bits |= gc->read_reg(gc->reg_dat) & get_mask; + *bits |= chip->read_reg(chip->reg_dat) & get_mask; return 0; } static int bgpio_get(struct gpio_chip *gc, unsigned int gpio) { - return !!(gc->read_reg(gc->reg_dat) & bgpio_line2mask(gc, gpio)); + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + return !!(chip->read_reg(chip->reg_dat) & bgpio_line2mask(gc, gpio)); } /* @@ -176,9 +181,11 @@ static int bgpio_get(struct gpio_chip *gc, unsigned int gpio) static int bgpio_get_multiple(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + /* Make sure we first clear any bits that are zero when we read the register */ *bits &= ~*mask; - *bits |= gc->read_reg(gc->reg_dat) & *mask; + *bits |= chip->read_reg(chip->reg_dat) & *mask; return 0; } @@ -188,6 +195,7 @@ static int bgpio_get_multiple(struct gpio_chip *gc, unsigned long *mask, static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long readmask = 0; unsigned long val; int bit; @@ -200,7 +208,7 @@ static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask, readmask |= bgpio_line2mask(gc, bit); /* Read the register */ - val = gc->read_reg(gc->reg_dat) & readmask; + val = chip->read_reg(chip->reg_dat) & readmask; /* * Mirror the result into the "bits" result, this will give line 0 @@ -219,19 +227,20 @@ static int bgpio_set_none(struct gpio_chip *gc, unsigned int gpio, int val) static int bgpio_set(struct gpio_chip *gc, unsigned int gpio, int val) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long mask = bgpio_line2mask(gc, gpio); unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + raw_spin_lock_irqsave(&chip->lock, flags); if (val) - gc->bgpio_data |= mask; + chip->sdata |= mask; else - gc->bgpio_data &= 
~mask; + chip->sdata &= ~mask; - gc->write_reg(gc->reg_dat, gc->bgpio_data); + chip->write_reg(chip->reg_dat, chip->sdata); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->lock, flags); return 0; } @@ -239,31 +248,32 @@ static int bgpio_set(struct gpio_chip *gc, unsigned int gpio, int val) static int bgpio_set_with_clear(struct gpio_chip *gc, unsigned int gpio, int val) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long mask = bgpio_line2mask(gc, gpio); if (val) - gc->write_reg(gc->reg_set, mask); + chip->write_reg(chip->reg_set, mask); else - gc->write_reg(gc->reg_clr, mask); + chip->write_reg(chip->reg_clr, mask); return 0; } static int bgpio_set_set(struct gpio_chip *gc, unsigned int gpio, int val) { - unsigned long mask = bgpio_line2mask(gc, gpio); - unsigned long flags; + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + unsigned long mask = bgpio_line2mask(gc, gpio), flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + raw_spin_lock_irqsave(&chip->lock, flags); if (val) - gc->bgpio_data |= mask; + chip->sdata |= mask; else - gc->bgpio_data &= ~mask; + chip->sdata &= ~mask; - gc->write_reg(gc->reg_set, gc->bgpio_data); + chip->write_reg(chip->reg_set, chip->sdata); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->lock, flags); return 0; } @@ -273,12 +283,13 @@ static void bgpio_multiple_get_masks(struct gpio_chip *gc, unsigned long *set_mask, unsigned long *clear_mask) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); int i; *set_mask = 0; *clear_mask = 0; - for_each_set_bit(i, mask, gc->bgpio_bits) { + for_each_set_bit(i, mask, chip->bits) { if (test_bit(i, bits)) *set_mask |= bgpio_line2mask(gc, i); else @@ -291,25 +302,27 @@ static void bgpio_set_multiple_single_reg(struct gpio_chip *gc, unsigned long *bits, void __iomem *reg) { - unsigned long flags; - unsigned long set_mask, clear_mask; + struct gpio_generic_chip *chip 
= to_gpio_generic_chip(gc); + unsigned long flags, set_mask, clear_mask; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + raw_spin_lock_irqsave(&chip->lock, flags); bgpio_multiple_get_masks(gc, mask, bits, &set_mask, &clear_mask); - gc->bgpio_data |= set_mask; - gc->bgpio_data &= ~clear_mask; + chip->sdata |= set_mask; + chip->sdata &= ~clear_mask; - gc->write_reg(reg, gc->bgpio_data); + chip->write_reg(reg, chip->sdata); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->lock, flags); } static int bgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { - bgpio_set_multiple_single_reg(gc, mask, bits, gc->reg_dat); + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + bgpio_set_multiple_single_reg(gc, mask, bits, chip->reg_dat); return 0; } @@ -317,7 +330,9 @@ static int bgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask, static int bgpio_set_multiple_set(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { - bgpio_set_multiple_single_reg(gc, mask, bits, gc->reg_set); + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + bgpio_set_multiple_single_reg(gc, mask, bits, chip->reg_set); return 0; } @@ -326,21 +341,24 @@ static int bgpio_set_multiple_with_clear(struct gpio_chip *gc, unsigned long *mask, unsigned long *bits) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long set_mask, clear_mask; bgpio_multiple_get_masks(gc, mask, bits, &set_mask, &clear_mask); if (set_mask) - gc->write_reg(gc->reg_set, set_mask); + chip->write_reg(chip->reg_set, set_mask); if (clear_mask) - gc->write_reg(gc->reg_clr, clear_mask); + chip->write_reg(chip->reg_clr, clear_mask); return 0; } static int bgpio_dir_return(struct gpio_chip *gc, unsigned int gpio, bool dir_out) { - if (!gc->bgpio_pinctrl) + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + if (!chip->pinctrl) return 0; if (dir_out) @@ -375,39 +393,42 @@ static int 
bgpio_simple_dir_out(struct gpio_chip *gc, unsigned int gpio, static int bgpio_dir_in(struct gpio_chip *gc, unsigned int gpio) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + raw_spin_lock_irqsave(&chip->lock, flags); - gc->bgpio_dir &= ~bgpio_line2mask(gc, gpio); + chip->sdir &= ~bgpio_line2mask(gc, gpio); - if (gc->reg_dir_in) - gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir); - if (gc->reg_dir_out) - gc->write_reg(gc->reg_dir_out, gc->bgpio_dir); + if (chip->reg_dir_in) + chip->write_reg(chip->reg_dir_in, ~chip->sdir); + if (chip->reg_dir_out) + chip->write_reg(chip->reg_dir_out, chip->sdir); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->lock, flags); return bgpio_dir_return(gc, gpio, false); } static int bgpio_get_dir(struct gpio_chip *gc, unsigned int gpio) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + /* Return 0 if output, 1 if input */ - if (gc->bgpio_dir_unreadable) { - if (gc->bgpio_dir & bgpio_line2mask(gc, gpio)) + if (chip->dir_unreadable) { + if (chip->sdir & bgpio_line2mask(gc, gpio)) return GPIO_LINE_DIRECTION_OUT; return GPIO_LINE_DIRECTION_IN; } - if (gc->reg_dir_out) { - if (gc->read_reg(gc->reg_dir_out) & bgpio_line2mask(gc, gpio)) + if (chip->reg_dir_out) { + if (chip->read_reg(chip->reg_dir_out) & bgpio_line2mask(gc, gpio)) return GPIO_LINE_DIRECTION_OUT; return GPIO_LINE_DIRECTION_IN; } - if (gc->reg_dir_in) - if (!(gc->read_reg(gc->reg_dir_in) & bgpio_line2mask(gc, gpio))) + if (chip->reg_dir_in) + if (!(chip->read_reg(chip->reg_dir_in) & bgpio_line2mask(gc, gpio))) return GPIO_LINE_DIRECTION_OUT; return GPIO_LINE_DIRECTION_IN; @@ -415,18 +436,19 @@ static int bgpio_get_dir(struct gpio_chip *gc, unsigned int gpio) static void bgpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); unsigned long flags; - 
raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + raw_spin_lock_irqsave(&chip->lock, flags); - gc->bgpio_dir |= bgpio_line2mask(gc, gpio); + chip->sdir |= bgpio_line2mask(gc, gpio); - if (gc->reg_dir_in) - gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir); - if (gc->reg_dir_out) - gc->write_reg(gc->reg_dir_out, gc->bgpio_dir); + if (chip->reg_dir_in) + chip->write_reg(chip->reg_dir_in, ~chip->sdir); + if (chip->reg_dir_out) + chip->write_reg(chip->reg_dir_out, chip->sdir); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); + raw_spin_unlock_irqrestore(&chip->lock, flags); } static int bgpio_dir_out_dir_first(struct gpio_chip *gc, unsigned int gpio, @@ -446,31 +468,30 @@ static int bgpio_dir_out_val_first(struct gpio_chip *gc, unsigned int gpio, } static int bgpio_setup_accessors(struct device *dev, - struct gpio_chip *gc, + struct gpio_generic_chip *chip, bool byte_be) { - - switch (gc->bgpio_bits) { + switch (chip->bits) { case 8: - gc->read_reg = bgpio_read8; - gc->write_reg = bgpio_write8; + chip->read_reg = bgpio_read8; + chip->write_reg = bgpio_write8; break; case 16: if (byte_be) { - gc->read_reg = bgpio_read16be; - gc->write_reg = bgpio_write16be; + chip->read_reg = bgpio_read16be; + chip->write_reg = bgpio_write16be; } else { - gc->read_reg = bgpio_read16; - gc->write_reg = bgpio_write16; + chip->read_reg = bgpio_read16; + chip->write_reg = bgpio_write16; } break; case 32: if (byte_be) { - gc->read_reg = bgpio_read32be; - gc->write_reg = bgpio_write32be; + chip->read_reg = bgpio_read32be; + chip->write_reg = bgpio_write32be; } else { - gc->read_reg = bgpio_read32; - gc->write_reg = bgpio_write32; + chip->read_reg = bgpio_read32; + chip->write_reg = bgpio_write32; } break; #if BITS_PER_LONG >= 64 @@ -480,13 +501,13 @@ static int bgpio_setup_accessors(struct device *dev, "64 bit big endian byte order unsupported\n"); return -EINVAL; } else { - gc->read_reg = bgpio_read64; - gc->write_reg = bgpio_write64; + chip->read_reg = bgpio_read64; + chip->write_reg = 
bgpio_write64; } break; #endif /* BITS_PER_LONG >= 64 */ default: - dev_err(dev, "unsupported data width %u bits\n", gc->bgpio_bits); + dev_err(dev, "unsupported data width %u bits\n", chip->bits); return -EINVAL; } @@ -515,27 +536,25 @@ static int bgpio_setup_accessors(struct device *dev, * - an input direction register (named "dirin") where a 1 bit indicates * the GPIO is an input. */ -static int bgpio_setup_io(struct gpio_chip *gc, - void __iomem *dat, - void __iomem *set, - void __iomem *clr, - unsigned long flags) +static int bgpio_setup_io(struct gpio_generic_chip *chip, + const struct gpio_generic_chip_config *cfg) { + struct gpio_chip *gc = &chip->gc; - gc->reg_dat = dat; - if (!gc->reg_dat) + chip->reg_dat = cfg->dat; + if (!chip->reg_dat) return -EINVAL; - if (set && clr) { - gc->reg_set = set; - gc->reg_clr = clr; + if (cfg->set && cfg->clr) { + chip->reg_set = cfg->set; + chip->reg_clr = cfg->clr; gc->set = bgpio_set_with_clear; gc->set_multiple = bgpio_set_multiple_with_clear; - } else if (set && !clr) { - gc->reg_set = set; + } else if (cfg->set && !cfg->clr) { + chip->reg_set = cfg->set; gc->set = bgpio_set_set; gc->set_multiple = bgpio_set_multiple_set; - } else if (flags & BGPIOF_NO_OUTPUT) { + } else if (cfg->flags & BGPIOF_NO_OUTPUT) { gc->set = bgpio_set_none; gc->set_multiple = NULL; } else { @@ -543,10 +562,10 @@ static int bgpio_setup_io(struct gpio_chip *gc, gc->set_multiple = bgpio_set_multiple; } - if (!(flags & BGPIOF_UNREADABLE_REG_SET) && - (flags & BGPIOF_READ_OUTPUT_REG_SET)) { + if (!(cfg->flags & BGPIOF_UNREADABLE_REG_SET) && + (cfg->flags & BGPIOF_READ_OUTPUT_REG_SET)) { gc->get = bgpio_get_set; - if (!gc->be_bits) + if (!chip->be_bits) gc->get_multiple = bgpio_get_set_multiple; /* * We deliberately avoid assigning the ->get_multiple() call @@ -557,7 +576,7 @@ static int bgpio_setup_io(struct gpio_chip *gc, */ } else { gc->get = bgpio_get; - if (gc->be_bits) + if (chip->be_bits) gc->get_multiple = bgpio_get_multiple_be; else 
gc->get_multiple = bgpio_get_multiple; @@ -566,27 +585,27 @@ static int bgpio_setup_io(struct gpio_chip *gc, return 0; } -static int bgpio_setup_direction(struct gpio_chip *gc, - void __iomem *dirout, - void __iomem *dirin, - unsigned long flags) +static int bgpio_setup_direction(struct gpio_generic_chip *chip, + const struct gpio_generic_chip_config *cfg) { - if (dirout || dirin) { - gc->reg_dir_out = dirout; - gc->reg_dir_in = dirin; - if (flags & BGPIOF_NO_SET_ON_INPUT) + struct gpio_chip *gc = &chip->gc; + + if (cfg->dirout || cfg->dirin) { + chip->reg_dir_out = cfg->dirout; + chip->reg_dir_in = cfg->dirin; + if (cfg->flags & BGPIOF_NO_SET_ON_INPUT) gc->direction_output = bgpio_dir_out_dir_first; else gc->direction_output = bgpio_dir_out_val_first; gc->direction_input = bgpio_dir_in; gc->get_direction = bgpio_get_dir; } else { - if (flags & BGPIOF_NO_OUTPUT) + if (cfg->flags & BGPIOF_NO_OUTPUT) gc->direction_output = bgpio_dir_out_err; else gc->direction_output = bgpio_simple_dir_out; - if (flags & BGPIOF_NO_INPUT) + if (cfg->flags & BGPIOF_NO_INPUT) gc->direction_input = bgpio_dir_in_err; else gc->direction_input = bgpio_simple_dir_in; @@ -595,117 +614,101 @@ static int bgpio_setup_direction(struct gpio_chip *gc, return 0; } -static int bgpio_request(struct gpio_chip *chip, unsigned gpio_pin) +static int bgpio_request(struct gpio_chip *gc, unsigned int gpio_pin) { - if (gpio_pin >= chip->ngpio) + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); + + if (gpio_pin >= gc->ngpio) return -EINVAL; - if (chip->bgpio_pinctrl) - return gpiochip_generic_request(chip, gpio_pin); + if (chip->pinctrl) + return gpiochip_generic_request(gc, gpio_pin); return 0; } /** - * bgpio_init() - Initialize generic GPIO accessor functions - * @gc: the GPIO chip to set up - * @dev: the parent device of the new GPIO chip (compulsory) - * @sz: the size (width) of the MMIO registers in bytes, typically 1, 2 or 4 - * @dat: MMIO address for the register to READ the value of the GPIO 
lines, it - * is expected that a 1 in the corresponding bit in this register means the - * line is asserted - * @set: MMIO address for the register to SET the value of the GPIO lines, it is - * expected that we write the line with 1 in this register to drive the GPIO line - * high. - * @clr: MMIO address for the register to CLEAR the value of the GPIO lines, it is - * expected that we write the line with 1 in this register to drive the GPIO line - * low. It is allowed to leave this address as NULL, in that case the SET register - * will be assumed to also clear the GPIO lines, by actively writing the line - * with 0. - * @dirout: MMIO address for the register to set the line as OUTPUT. It is assumed - * that setting a line to 1 in this register will turn that line into an - * output line. Conversely, setting the line to 0 will turn that line into - * an input. - * @dirin: MMIO address for the register to set this line as INPUT. It is assumed - * that setting a line to 1 in this register will turn that line into an - * input line. Conversely, setting the line to 0 will turn that line into - * an output. - * @flags: Different flags that will affect the behaviour of the device, such as - * endianness etc. + * gpio_generic_chip_init() - Initialize a generic GPIO chip. + * @chip: Generic GPIO chip to set up. + * @cfg: Generic GPIO chip configuration. + * + * Returns 0 on success, negative error number on failure. 
*/ -int bgpio_init(struct gpio_chip *gc, struct device *dev, - unsigned long sz, void __iomem *dat, void __iomem *set, - void __iomem *clr, void __iomem *dirout, void __iomem *dirin, - unsigned long flags) +int gpio_generic_chip_init(struct gpio_generic_chip *chip, + const struct gpio_generic_chip_config *cfg) { + struct gpio_chip *gc = &chip->gc; + unsigned long flags = cfg->flags; + struct device *dev = cfg->dev; int ret; - if (!is_power_of_2(sz)) + if (!is_power_of_2(cfg->sz)) return -EINVAL; - gc->bgpio_bits = sz * 8; - if (gc->bgpio_bits > BITS_PER_LONG) + chip->bits = cfg->sz * 8; + if (chip->bits > BITS_PER_LONG) return -EINVAL; - raw_spin_lock_init(&gc->bgpio_lock); + raw_spin_lock_init(&chip->lock); gc->parent = dev; gc->label = dev_name(dev); gc->base = -1; gc->request = bgpio_request; - gc->be_bits = !!(flags & BGPIOF_BIG_ENDIAN); + chip->be_bits = !!(flags & BGPIOF_BIG_ENDIAN); ret = gpiochip_get_ngpios(gc, dev); if (ret) - gc->ngpio = gc->bgpio_bits; + gc->ngpio = chip->bits; - ret = bgpio_setup_io(gc, dat, set, clr, flags); + ret = bgpio_setup_io(chip, cfg); if (ret) return ret; - ret = bgpio_setup_accessors(dev, gc, flags & BGPIOF_BIG_ENDIAN_BYTE_ORDER); + ret = bgpio_setup_accessors(dev, chip, + flags & BGPIOF_BIG_ENDIAN_BYTE_ORDER); if (ret) return ret; - ret = bgpio_setup_direction(gc, dirout, dirin, flags); + ret = bgpio_setup_direction(chip, cfg); if (ret) return ret; if (flags & BGPIOF_PINCTRL_BACKEND) { - gc->bgpio_pinctrl = true; + chip->pinctrl = true; /* Currently this callback is only used for pincontrol */ gc->free = gpiochip_generic_free; } - gc->bgpio_data = gc->read_reg(gc->reg_dat); + chip->sdata = chip->read_reg(chip->reg_dat); if (gc->set == bgpio_set_set && !(flags & BGPIOF_UNREADABLE_REG_SET)) - gc->bgpio_data = gc->read_reg(gc->reg_set); + chip->sdata = chip->read_reg(chip->reg_set); if (flags & BGPIOF_UNREADABLE_REG_DIR) - gc->bgpio_dir_unreadable = true; + chip->dir_unreadable = true; /* * Inspect hardware to find initial 
direction setting. */ - if ((gc->reg_dir_out || gc->reg_dir_in) && + if ((chip->reg_dir_out || chip->reg_dir_in) && !(flags & BGPIOF_UNREADABLE_REG_DIR)) { - if (gc->reg_dir_out) - gc->bgpio_dir = gc->read_reg(gc->reg_dir_out); - else if (gc->reg_dir_in) - gc->bgpio_dir = ~gc->read_reg(gc->reg_dir_in); + if (chip->reg_dir_out) + chip->sdir = chip->read_reg(chip->reg_dir_out); + else if (chip->reg_dir_in) + chip->sdir = ~chip->read_reg(chip->reg_dir_in); /* * If we have two direction registers, synchronise * input setting to output setting, the library * can not handle a line being input and output at * the same time. */ - if (gc->reg_dir_out && gc->reg_dir_in) - gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir); + if (chip->reg_dir_out && chip->reg_dir_in) + chip->write_reg(chip->reg_dir_in, ~chip->sdir); } return ret; } -EXPORT_SYMBOL_GPL(bgpio_init); +EXPORT_SYMBOL_GPL(gpio_generic_chip_init); #if IS_ENABLED(CONFIG_GPIO_GENERIC_PLATFORM) diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c index 38643fb813c562957076aab48d804f8048cee5e4..2bb6100840ea27fb63ce7cdc3e1eb3e43526eb4d 100644 --- a/drivers/gpio/gpio-mpc8xxx.c +++ b/drivers/gpio/gpio-mpc8xxx.c @@ -71,7 +71,7 @@ static int mpc8572_gpio_get(struct gpio_chip *gc, unsigned int gpio) mpc8xxx_gc->regs + GPIO_DIR); val = gpio_generic_read_reg(&mpc8xxx_gc->chip, mpc8xxx_gc->regs + GPIO_DAT) & ~out_mask; - out_shadow = gc->bgpio_data & out_mask; + out_shadow = mpc8xxx_gc->chip.sdata & out_mask; return !!((val | out_shadow) & mpc_pin2mask(gpio)); } @@ -399,7 +399,8 @@ static int mpc8xxx_probe(struct platform_device *pdev) gpio_generic_write_reg(&mpc8xxx_gc->chip, mpc8xxx_gc->regs + GPIO_IBE, 0xffffffff); /* Also, latch state of GPIOs configured as output by bootloader. 
*/ - gc->bgpio_data = gpio_generic_read_reg(&mpc8xxx_gc->chip, + mpc8xxx_gc->chip.sdata = + gpio_generic_read_reg(&mpc8xxx_gc->chip, mpc8xxx_gc->regs + GPIO_DAT) & gpio_generic_read_reg(&mpc8xxx_gc->chip, mpc8xxx_gc->regs + GPIO_DIR); diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h index 9fcd4a988081f74d25dc88535705ba9265e56fd2..9b14fd20f13eee7d465e065e7ded2c92e2bbc78e 100644 --- a/include/linux/gpio/driver.h +++ b/include/linux/gpio/driver.h @@ -388,28 +388,6 @@ struct gpio_irq_chip { * implies that if the chip supports IRQs, these IRQs need to be threaded * as the chip access may sleep when e.g. reading out the IRQ status * registers. - * @read_reg: reader function for generic GPIO - * @write_reg: writer function for generic GPIO - * @be_bits: if the generic GPIO has big endian bit order (bit 31 is representing - * line 0, bit 30 is line 1 ... bit 0 is line 31) this is set to true by the - * generic GPIO core. It is for internal housekeeping only. - * @reg_dat: data (in) register for generic GPIO - * @reg_set: output set register (out=high) for generic GPIO - * @reg_clr: output clear register (out=low) for generic GPIO - * @reg_dir_out: direction out setting register for generic GPIO - * @reg_dir_in: direction in setting register for generic GPIO - * @bgpio_dir_unreadable: indicates that the direction register(s) cannot - * be read and we need to rely on out internal state tracking. - * @bgpio_pinctrl: the generic GPIO uses a pin control backend. - * @bgpio_bits: number of register bits used for a generic GPIO i.e. - * * 8 - * @bgpio_lock: used to lock chip->bgpio_data. Also, this is needed to keep - * shadowed and real data registers writes together. - * @bgpio_data: shadowed data register for generic GPIO to clear/set bits - * safely. - * @bgpio_dir: shadowed direction register for generic GPIO to clear/set - * direction safely. A "1" in this word means the line is set as - * output. 
* * A gpio_chip can help platforms abstract various sources of GPIOs so * they can all be accessed through a common programming interface. @@ -475,23 +453,6 @@ struct gpio_chip { const char *const *names; bool can_sleep; -#if IS_ENABLED(CONFIG_GPIO_GENERIC) - unsigned long (*read_reg)(void __iomem *reg); - void (*write_reg)(void __iomem *reg, unsigned long data); - bool be_bits; - void __iomem *reg_dat; - void __iomem *reg_set; - void __iomem *reg_clr; - void __iomem *reg_dir_out; - void __iomem *reg_dir_in; - bool bgpio_dir_unreadable; - bool bgpio_pinctrl; - int bgpio_bits; - raw_spinlock_t bgpio_lock; - unsigned long bgpio_data; - unsigned long bgpio_dir; -#endif /* CONFIG_GPIO_GENERIC */ - #ifdef CONFIG_GPIOLIB_IRQCHIP /* * With CONFIG_GPIOLIB_IRQCHIP we get an irqchip inside the gpiolib @@ -723,11 +684,6 @@ int gpiochip_populate_parent_fwspec_fourcell(struct gpio_chip *gc, #endif /* CONFIG_IRQ_DOMAIN_HIERARCHY */ -int bgpio_init(struct gpio_chip *gc, struct device *dev, - unsigned long sz, void __iomem *dat, void __iomem *set, - void __iomem *clr, void __iomem *dirout, void __iomem *dirin, - unsigned long flags); - #define BGPIOF_BIG_ENDIAN BIT(0) #define BGPIOF_UNREADABLE_REG_SET BIT(1) /* reg_set is unreadable */ #define BGPIOF_UNREADABLE_REG_DIR BIT(2) /* reg_dir is unreadable */ diff --git a/include/linux/gpio/generic.h b/include/linux/gpio/generic.h index 4c0626b53ec90388a034bc7797eefa53e7ea064e..162430d96660e96b995eb4a2e64183503fc618e3 100644 --- a/include/linux/gpio/generic.h +++ b/include/linux/gpio/generic.h @@ -50,9 +50,44 @@ struct gpio_generic_chip_config { * struct gpio_generic_chip - Generic GPIO chip implementation. * @gc: The underlying struct gpio_chip object, implementing low-level GPIO * chip routines. + * @read_reg: reader function for generic GPIO + * @write_reg: writer function for generic GPIO + * @be_bits: if the generic GPIO has big endian bit order (bit 31 is + * representing line 0, bit 30 is line 1 ... 
bit 0 is line 31) this + * is set to true by the generic GPIO core. It is for internal + * housekeeping only. + * @reg_dat: data (in) register for generic GPIO + * @reg_set: output set register (out=high) for generic GPIO + * @reg_clr: output clear register (out=low) for generic GPIO + * @reg_dir_out: direction out setting register for generic GPIO + * @reg_dir_in: direction in setting register for generic GPIO + * @dir_unreadable: indicates that the direction register(s) cannot be read and + * we need to rely on out internal state tracking. + * @pinctrl: the generic GPIO uses a pin control backend. + * @bits: number of register bits used for a generic GPIO + * i.e. * 8 + * @lock: used to lock chip->sdata. Also, this is needed to keep + * shadowed and real data registers writes together. + * @sdata: shadowed data register for generic GPIO to clear/set bits safely. + * @sdir: shadowed direction register for generic GPIO to clear/set direction + * safely. A "1" in this word means the line is set as output. */ struct gpio_generic_chip { struct gpio_chip gc; + unsigned long (*read_reg)(void __iomem *reg); + void (*write_reg)(void __iomem *reg, unsigned long data); + bool be_bits; + void __iomem *reg_dat; + void __iomem *reg_set; + void __iomem *reg_clr; + void __iomem *reg_dir_out; + void __iomem *reg_dir_in; + bool dir_unreadable; + bool pinctrl; + int bits; + raw_spinlock_t lock; + unsigned long sdata; + unsigned long sdir; }; static inline struct gpio_generic_chip * @@ -61,20 +96,8 @@ to_gpio_generic_chip(struct gpio_chip *gc) return container_of(gc, struct gpio_generic_chip, gc); } -/** - * gpio_generic_chip_init() - Initialize a generic GPIO chip. - * @chip: Generic GPIO chip to set up. - * @cfg: Generic GPIO chip configuration. - * - * Returns 0 on success, negative error number on failure. 
- */ -static inline int -gpio_generic_chip_init(struct gpio_generic_chip *chip, - const struct gpio_generic_chip_config *cfg) -{ - return bgpio_init(&chip->gc, cfg->dev, cfg->sz, cfg->dat, cfg->set, - cfg->clr, cfg->dirout, cfg->dirin, cfg->flags); -} +int gpio_generic_chip_init(struct gpio_generic_chip *chip, + const struct gpio_generic_chip_config *cfg); /** * gpio_generic_chip_set() - Set the GPIO line value of the generic GPIO chip. @@ -110,10 +133,10 @@ gpio_generic_chip_set(struct gpio_generic_chip *chip, unsigned int offset, static inline unsigned long gpio_generic_read_reg(struct gpio_generic_chip *chip, void __iomem *reg) { - if (WARN_ON(!chip->gc.read_reg)) + if (WARN_ON(!chip->read_reg)) return 0; - return chip->gc.read_reg(reg); + return chip->read_reg(reg); } /** @@ -125,23 +148,23 @@ gpio_generic_read_reg(struct gpio_generic_chip *chip, void __iomem *reg) static inline void gpio_generic_write_reg(struct gpio_generic_chip *chip, void __iomem *reg, unsigned long val) { - if (WARN_ON(!chip->gc.write_reg)) + if (WARN_ON(!chip->write_reg)) return; - chip->gc.write_reg(reg, val); + chip->write_reg(reg, val); } #define gpio_generic_chip_lock(gen_gc) \ - raw_spin_lock(&(gen_gc)->gc.bgpio_lock) + raw_spin_lock(&(gen_gc)->lock) #define gpio_generic_chip_unlock(gen_gc) \ - raw_spin_unlock(&(gen_gc)->gc.bgpio_lock) + raw_spin_unlock(&(gen_gc)->lock) #define gpio_generic_chip_lock_irqsave(gen_gc, flags) \ - raw_spin_lock_irqsave(&(gen_gc)->gc.bgpio_lock, flags) + raw_spin_lock_irqsave(&(gen_gc)->lock, flags) #define gpio_generic_chip_unlock_irqrestore(gen_gc, flags) \ - raw_spin_unlock_irqrestore(&(gen_gc)->gc.bgpio_lock, flags) + raw_spin_unlock_irqrestore(&(gen_gc)->lock, flags) DEFINE_LOCK_GUARD_1(gpio_generic_lock, struct gpio_generic_chip, -- 2.48.1 From dlan at gentoo.org Tue Sep 9 02:39:58 2025 From: dlan at gentoo.org (Yixun Lan) Date: Tue, 9 Sep 2025 17:39:58 +0800 Subject: [PATCH 12/15] gpio: spacemit-k1: use new generic GPIO chip API In-Reply-To: 
<20250909-gpio-mmio-gpio-conv-part4-v1-12-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-12-9f723dc3524a@linaro.org> Message-ID: <20250909093958-GYA1207638@gentoo.org> On 11:15 Tue 09 Sep , Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. > > Signed-off-by: Bartosz Golaszewski Thanks for converting this Reviewed-by: Yixun Lan > --- > drivers/gpio/gpio-spacemit-k1.c | 28 ++++++++++++++++++++-------- > 1 file changed, 20 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpio/gpio-spacemit-k1.c b/drivers/gpio/gpio-spacemit-k1.c > index 3cc75c701ec40194e602b80d3f96f23204ce3b4d..9e57f43d3d13ad28fcd3327ecdc3f359691a44c9 100644 > --- a/drivers/gpio/gpio-spacemit-k1.c > +++ b/drivers/gpio/gpio-spacemit-k1.c > @@ -6,6 +6,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -38,7 +39,7 @@ > struct spacemit_gpio; > > struct spacemit_gpio_bank { > - struct gpio_chip gc; > + struct gpio_generic_chip chip; > struct spacemit_gpio *sg; > void __iomem *base; > u32 irq_mask; > @@ -72,7 +73,7 @@ static irqreturn_t spacemit_gpio_irq_handler(int irq, void *dev_id) > return IRQ_NONE; > > for_each_set_bit(n, &pending, BITS_PER_LONG) > - handle_nested_irq(irq_find_mapping(gb->gc.irq.domain, n)); > + handle_nested_irq(irq_find_mapping(gb->chip.gc.irq.domain, n)); > > return IRQ_HANDLED; > } > @@ -143,7 +144,7 @@ static void spacemit_gpio_irq_print_chip(struct irq_data *data, struct seq_file > { > struct spacemit_gpio_bank *gb = irq_data_get_irq_chip_data(data); > > - seq_printf(p, "%s-%d", dev_name(gb->gc.parent), spacemit_gpio_bank_index(gb)); > + seq_printf(p, "%s-%d", dev_name(gb->chip.gc.parent), spacemit_gpio_bank_index(gb)); > } > > static struct irq_chip spacemit_gpio_chip = { > @@ -165,7 +166,7 @@ static bool spacemit_of_node_instance_match(struct 
gpio_chip *gc, unsigned int i > if (i >= SPACEMIT_NR_BANKS) > return false; > > - return (gc == &sg->sgb[i].gc); > + return (gc == &sg->sgb[i].chip.gc); > } > > static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, > @@ -173,7 +174,8 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, > int index, int irq) > { > struct spacemit_gpio_bank *gb = &sg->sgb[index]; > - struct gpio_chip *gc = &gb->gc; > + struct gpio_generic_chip_config config; > + struct gpio_chip *gc = &gb->chip.gc; > struct device *dev = sg->dev; > struct gpio_irq_chip *girq; > void __iomem *dat, *set, *clr, *dirin, *dirout; > @@ -187,9 +189,19 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, > dirin = gb->base + SPACEMIT_GCDR; > dirout = gb->base + SPACEMIT_GSDR; > > + config = (typeof(config)){ > + .dev = dev, > + .sz = 4, > + .dat = dat, > + .set = set, > + .clr = clr, > + .dirout = dirout, > + .dirin = dirin, > + .flags = BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR, > + }; > + > /* This registers 32 GPIO lines per bank */ > - ret = bgpio_init(gc, dev, 4, dat, set, clr, dirout, dirin, > - BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR); > + ret = gpio_generic_chip_init(&gb->chip, &config); > if (ret) > return dev_err_probe(dev, ret, "failed to init gpio chip\n"); > > @@ -221,7 +233,7 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg, > ret = devm_request_threaded_irq(dev, irq, NULL, > spacemit_gpio_irq_handler, > IRQF_ONESHOT | IRQF_SHARED, > - gb->gc.label, gb); > + gb->chip.gc.label, gb); > if (ret < 0) > return dev_err_probe(dev, ret, "failed to register IRQ\n"); > > > -- > 2.48.1 > -- Yixun Lan (dlan) From david at redhat.com Tue Sep 9 02:55:51 2025 From: david at redhat.com (David Hildenbrand) Date: Tue, 9 Sep 2025 11:55:51 +0200 Subject: [PATCH v2 22/37] mm/cma: refuse handing out non-contiguous page ranges In-Reply-To: <20250901150359.867252-23-david@redhat.com> References: <20250901150359.867252-1-david@redhat.com> 
 <20250901150359.867252-23-david@redhat.com>
Message-ID: <6ec933b1-b3f7-41c0-95d8-e518bb87375e@redhat.com>

On 01.09.25 17:03, David Hildenbrand wrote:
> Let's disallow handing out PFN ranges with non-contiguous pages, so we
> can remove the nth-page usage in __cma_alloc(), and so any callers don't
> have to worry about that either when wanting to blindly iterate pages.
>
> This is really only a problem in configs with SPARSEMEM but without
> SPARSEMEM_VMEMMAP, and only when we would cross memory sections in some
> cases.
>
> Will this cause harm? Probably not, because it's mostly 32bit that does
> not support SPARSEMEM_VMEMMAP. If this ever becomes a problem we could
> look into allocating the memmap for the memory sections spanned by a
> single CMA region in one go from memblock.
>
> Reviewed-by: Alexandru Elisei
> Reviewed-by: Lorenzo Stoakes
> Signed-off-by: David Hildenbrand
> ---

@Andrew, the following fixup on top. I'm still cross-compiling it, but
at the time you read this mail my cross compiles should have been done.

From cbfa2763e1820b917ce3430f45e5f3a55eb2970f Mon Sep 17 00:00:00 2001
From: David Hildenbrand
Date: Tue, 9 Sep 2025 05:50:13 -0400
Subject: [PATCH] fixup: mm/cma: refuse handing out non-contiguous page ranges

Apparently we can have NOMMU configs with SPARSEMEM enabled.
Signed-off-by: David Hildenbrand
---
 mm/util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 248f877f629b6..6c1d64ed02211 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1306,6 +1306,7 @@ unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
 {
 	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, 0);
 }
+#endif /* CONFIG_MMU */
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 /**
@@ -1342,4 +1343,3 @@ bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
 }
 EXPORT_SYMBOL(page_range_contiguous);
 #endif
-#endif /* CONFIG_MMU */
-- 
2.50.1

-- 
Cheers

David / dhildenb

From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:10 2025
From: zhangchunyan at iscas.ac.cn (Chunyan Zhang)
Date: Tue, 9 Sep 2025 17:56:10 +0800
Subject: [PATCH V10 4/5] riscv: mm: Add soft-dirty page tracking support
In-Reply-To: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn>
References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn>
Message-ID: <20250909095611.803898-5-zhangchunyan@iscas.ac.cn>

The Svrsw60t59b extension frees the PTE reserved bits 60 and 59 for
software use; this patch uses bit 59 for soft-dirty.

To add swap PTE soft-dirty tracking, we borrow bit 3, which is available
for swap PTEs on RISC-V systems.
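The bit assignment described in this commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions: every `model_` / `MODEL_` name is hypothetical, the `model_has_svrsw60t59b` flag stands in for the runtime extension check used by the patch, and a PTE is reduced to a bare 64-bit integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 59 only fits in a 64-bit PTE value. */
static_assert(sizeof(uint64_t) * 8 > 59, "model needs a 64-bit PTE");

/* Hypothetical stand-in for the runtime Svrsw60t59b extension check. */
static bool model_has_svrsw60t59b = true;

/*
 * Present PTEs use bit 59, swap PTEs borrow bit 3.
 * Both masks collapse to 0 when the extension is missing.
 */
#define MODEL_PAGE_SOFT_DIRTY \
	(model_has_svrsw60t59b ? (UINT64_C(1) << 59) : UINT64_C(0))
#define MODEL_PAGE_SWP_SOFT_DIRTY \
	(model_has_svrsw60t59b ? (UINT64_C(1) << 3) : UINT64_C(0))

static inline bool model_pte_soft_dirty(uint64_t pte)
{
	return (pte & MODEL_PAGE_SOFT_DIRTY) != 0;
}

static inline uint64_t model_pte_mksoft_dirty(uint64_t pte)
{
	return pte | MODEL_PAGE_SOFT_DIRTY;
}

static inline uint64_t model_pte_clear_soft_dirty(uint64_t pte)
{
	return pte & ~MODEL_PAGE_SOFT_DIRTY;
}

static inline bool model_pte_swp_soft_dirty(uint64_t pte)
{
	return (pte & MODEL_PAGE_SWP_SOFT_DIRTY) != 0;
}

static inline uint64_t model_pte_swp_mksoft_dirty(uint64_t pte)
{
	return pte | MODEL_PAGE_SWP_SOFT_DIRTY;
}
```

Because the masks evaluate to 0 without the extension, the set and clear helpers in this model become no-ops and the test helpers always report "not soft-dirty", which appears to be the intent of the conditional `_PAGE_SOFT_DIRTY` definition in the patch.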
Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 19 +++++++ arch/riscv/include/asm/pgtable.h | 73 ++++++++++++++++++++++++++- 3 files changed, 91 insertions(+), 2 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index d99df67cc7a4..53b73e4bdf3f 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -141,6 +141,7 @@ config RISCV select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET select HAVE_ARCH_SECCOMP_FILTER + select HAVE_ARCH_SOFT_DIRTY if 64BIT && MMU && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 179bd4afece4..8ffe81bf66d2 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -19,6 +19,25 @@ #define _PAGE_SOFT (3 << 8) /* Reserved for software */ #define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */ + +#ifdef CONFIG_MEM_SOFT_DIRTY + +/* ext_svrsw60t59b: bit 59 for software dirty tracking */ +#define _PAGE_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 59) : 0) +/* + * Bit 3 is always zero for swap entry computation, so we + * can borrow it for swap page soft-dirty tracking. + */ +#define _PAGE_SWP_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + _PAGE_EXEC : 0) +#else +#define _PAGE_SOFT_DIRTY 0 +#define _PAGE_SWP_SOFT_DIRTY 0 +#endif /* CONFIG_MEM_SOFT_DIRTY */ + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 91697fbf1f90..b2d00d129d81 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -427,7 +427,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return __pte(pte_val(pte) | _PAGE_DIRTY); + return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkclean(pte_t pte) @@ -455,6 +455,40 @@ static inline pte_t pte_mkhuge(pte_t pte) return pte; } +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +#define pte_soft_dirty_available() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY)); +} + +static inline bool pte_swp_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY)); +} +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + #ifdef CONFIG_RISCV_ISA_SVNAPOT #define pte_leaf_size(pte) (pte_napot(pte) ? 
\ napot_cont_size(napot_cont_order(pte)) :\ @@ -802,6 +836,40 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +static inline bool pmd_soft_dirty(pmd_t pmd) +{ + return pte_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd))); +} + +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION +static inline bool pmd_swp_soft_dirty(pmd_t pmd) +{ + return pte_swp_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd))); +} +#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { @@ -983,7 +1051,8 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * * Format of swap PTE: * bit 0: _PAGE_PRESENT (zero) - * bit 1 to 3: _PAGE_LEAF (zero) + * bit 1 to 2: (zero) + * bit 3: _PAGE_SWP_SOFT_DIRTY * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:07 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Tue, 9 Sep 2025 17:56:07 +0800 Subject: [PATCH V10 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> Some platforms can customize the PTE soft dirty bit and make it unavailable even if the architecture allows providing the PTE resource. 
Add an API which architectures can define their specific implementations to detect if the PTE soft-dirty bit is available, on which the kernel is running. Signed-off-by: Chunyan Zhang --- fs/proc/task_mmu.c | 17 ++++++++++++++++- include/linux/pgtable.h | 10 ++++++++++ mm/debug_vm_pgtable.c | 9 +++++---- mm/huge_memory.c | 10 ++++++---- mm/internal.h | 2 +- mm/mremap.c | 10 ++++++---- mm/userfaultfd.c | 6 ++++-- 7 files changed, 48 insertions(+), 16 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 29cca0e6d0ff..20a609ec1ba6 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) * -Werror=unterminated-string-initialization warning * with GCC 15 */ - static const char mnemonics[BITS_PER_LONG][3] = { + static char mnemonics[BITS_PER_LONG][3] = { /* * In case if we meet a flag we don't know about. */ @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_SEALED)] = "sl", #endif }; +/* + * We should remove the VM_SOFTDIRTY flag if the PTE soft-dirty bit is + * unavailable on which the kernel is running, even if the architecture + * allows providing the PTE resource and soft-dirty is compiled in. + */ +#ifdef CONFIG_MEM_SOFT_DIRTY + if (!pte_soft_dirty_available()) + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; +#endif + size_t i; seq_puts(m, "VmFlags: "); @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, static inline void clear_soft_dirty(struct vm_area_struct *vma, unsigned long addr, pte_t *pte) { + if (!pte_soft_dirty_available()) + return; /* * The soft-dirty tracker uses #PF-s to catch writes * to pages, so write-protect the pte as well. 
See the @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, { pmd_t old, pmd = *pmdp; + if (!pte_soft_dirty_available()) + return; + if (pmd_present(pmd)) { /* See comment in change_huge_pmd() */ old = pmdp_invalidate(vma, addr, pmdp); diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 4c035637eeb7..c0e2a6dc69f4 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) #endif #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY + +/* + * Some platforms can customize the PTE soft dirty bit and make it unavailable + * even if the architecture allows providing the PTE resource. + */ +#ifndef pte_soft_dirty_available +#define pte_soft_dirty_available() (true) +#endif + #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) { @@ -1555,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) } #endif #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ +#define pte_soft_dirty_available() (false) static inline int pte_soft_dirty(pte_t pte) { return 0; diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 830107b6dd08..98ed7e22ccec 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; pr_debug("Validating PTE soft dirty\n"); @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte; - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; pr_debug("Validating PTE swap soft dirty\n"); @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) { pmd_t pmd; 
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return; if (!has_transparent_hugepage()) @@ -735,7 +735,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) pmd_t pmd; if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) + !pte_soft_dirty_available() || + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) return; if (!has_transparent_hugepage()) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9c38a95e9f09..4e4fd56c0c18 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2272,10 +2272,12 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, static pmd_t move_soft_dirty_pmd(pmd_t pmd) { #ifdef CONFIG_MEM_SOFT_DIRTY - if (unlikely(is_pmd_migration_entry(pmd))) - pmd = pmd_swp_mksoft_dirty(pmd); - else if (pmd_present(pmd)) - pmd = pmd_mksoft_dirty(pmd); + if (pte_soft_dirty_available()) { + if (unlikely(is_pmd_migration_entry(pmd))) + pmd = pmd_swp_mksoft_dirty(pmd); + else if (pmd_present(pmd)) + pmd = pmd_mksoft_dirty(pmd); + } #endif return pmd; } diff --git a/mm/internal.h b/mm/internal.h index 45b725c3dc03..8a5b20fac892 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) * will be constantly true. */ - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) return false; /* diff --git a/mm/mremap.c b/mm/mremap.c index e618a706aff5..788dd8aaae47 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -163,10 +163,12 @@ static pte_t move_soft_dirty_pte(pte_t pte) * in userspace the ptes were moved. 
*/ #ifdef CONFIG_MEM_SOFT_DIRTY - if (pte_present(pte)) - pte = pte_mksoft_dirty(pte); - else if (is_swap_pte(pte)) - pte = pte_swp_mksoft_dirty(pte); + if (pte_soft_dirty_available()) { + if (pte_present(pte)) + pte = pte_mksoft_dirty(pte); + else if (is_swap_pte(pte)) + pte = pte_swp_mksoft_dirty(pte); + } #endif return pte; } diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 45e6290e2e8b..94f159a680a4 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -1066,7 +1066,8 @@ static int move_present_pte(struct mm_struct *mm, orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot); /* Set soft dirty bit so userspace can notice the pte was moved */ #ifdef CONFIG_MEM_SOFT_DIRTY - orig_dst_pte = pte_mksoft_dirty(orig_dst_pte); + if (pte_soft_dirty_available()) + orig_dst_pte = pte_mksoft_dirty(orig_dst_pte); #endif if (pte_dirty(orig_src_pte)) orig_dst_pte = pte_mkdirty(orig_dst_pte); @@ -1135,7 +1136,8 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma, orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte); #ifdef CONFIG_MEM_SOFT_DIRTY - orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte); + if (pte_soft_dirty_available()) + orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte); #endif set_pte_at(mm, dst_addr, dst_pte, orig_src_pte); double_pt_unlock(dst_ptl, src_ptl); -- 2.34.1 From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:06 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Tue, 9 Sep 2025 17:56:06 +0800 Subject: [PATCH V10 0/5] riscv: mm: Add soft-dirty and uffd-wp support Message-ID: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> This patchset adds support for Svrsw60t59b [1] extension which is ratified now, also soft dirty and userfaultfd write protect tracking for RISC-V. The patches 1 and 2 add macros to detect if the soft-dirty / uffd_wp PTE bits are available, in other words, the Svrsw60t59b extension is supported for the RISC-V device on which the kernel is running. 
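The availability macros follow an "arch may override, generic code supplies a default" pattern. A minimal userspace sketch of that pattern, assuming hypothetical `model_` names (the real macros are `pte_soft_dirty_available()` and `pte_uffd_wp_available()`) and an arbitrary bit value:

```c
#include <assert.h>
#include <stdbool.h>

/* "Arch header" (hypothetical): result of a boot-time extension probe. */
static bool model_cpu_has_ext;
#define model_soft_dirty_available() (model_cpu_has_ext)

/*
 * "Generic header": fall back to always-available when the
 * architecture did not provide its own definition.
 */
#ifndef model_soft_dirty_available
#define model_soft_dirty_available() (true)
#endif
#ifndef model_uffd_wp_available
#define model_uffd_wp_available() (true)
#endif

#define MODEL_SOFT_DIRTY_BIT 0x100UL /* arbitrary bit for this model */

/* Caller pattern: only touch the bit when it is usable at runtime. */
static unsigned long model_move_soft_dirty(unsigned long pte)
{
	if (model_soft_dirty_available())
		pte |= MODEL_SOFT_DIRTY_BIT;
	return pte;
}

/* Usage: exercise both probe outcomes. */
void model_availability_selfcheck(void)
{
	model_cpu_has_ext = false;
	assert(model_move_soft_dirty(0x1UL) == 0x1UL);

	model_cpu_has_ext = true;
	assert(model_move_soft_dirty(0x1UL) == (0x1UL | MODEL_SOFT_DIRTY_BIT));

	/* No arch override was given, so the generic default applies. */
	assert(model_uffd_wp_available());
}
```

In this sketch the soft-dirty check is overridden and follows the probe result, while the uffd-wp check keeps the generic default of `true`, so an architecture only pays for a runtime test where it actually needs one.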
This patchset has been tested with the kselftest mm suite, in which
soft-dirty, madv_populate, test_unmerge_uffd_wp, and uffd-unit-tests run
and pass, and no regressions are observed in any of the other tests.

This patchset applies on top of v6.17-rc4.

V10:
- Fixed the issue reported by kernel test robot.

V9: https://lore.kernel.org/all/20250905103651.489197-1-zhangchunyan@iscas.ac.cn/
- Add pte_soft_dirty/uffd_wp_available() API to allow dynamically checking
  if the PTE bit is available for the platform on which the kernel is
  running.

V8: https://lore.kernel.org/all/20250619065232.1786470-1-zhangchunyan@iscas.ac.cn/
- Rebase on v6.16-rc1;
- Add dependencies to MMU && 64BIT for RISCV_ISA_SVRSW60T59B;
- Use 'Svrsw60t59b' instead of 'SVRSW60T59B' in Kconfig help paragraph;
- Add Alex's Reviewed-by tag in patch 1.

V7: https://lore.kernel.org/all/20250409095320.224100-1-zhangchunyan@iscas.ac.cn/
- Add Svrsw60t59b [1] extension support;
- Make soft-dirty and uffd-wp depend on the Svrsw60t59b extension to
  avoid crashes on hardware that doesn't have this extension.

V6: https://lore.kernel.org/all/20250408084301.68186-1-zhangchunyan@iscas.ac.cn/
- Changed to use bits 59-60, which are provided by the Svrsw60t59b
  extension, for soft dirty and userfaultfd write protect tracking.

V5: https://lore.kernel.org/all/20241113095833.1805746-1-zhangchunyan@iscas.ac.cn/
- Fixed typos and corrected some words in Kconfig and commit message;
- Removed pte_wrprotect() from pte_swp_mkuffd_wp(), which was a
  copy-paste error;
- Added Alex's Reviewed-by tag in patch 2.

V4: https://lore.kernel.org/all/20240830011101.3189522-1-zhangchunyan@iscas.ac.cn/
- Added bit(4) description to "Format of swap PTE".

V3: https://lore.kernel.org/all/20240805095243.44809-1-zhangchunyan@iscas.ac.cn/
- Fixed the issue reported by kernel test robot.
V2: https://lore.kernel.org/all/20240731040444.3384790-1-zhangchunyan@iscas.ac.cn/
- Add uffd-wp support;
- Make soft-dirty, uffd-wp and devmap mutually exclusive, since they all
  use the same PTE bit;
- Add test results of CRIU in the cover letter.

[1] https://github.com/riscv-non-isa/riscv-iommu/pull/543

Chunyan Zhang (5):
  mm: softdirty: Add pte_soft_dirty_available()
  mm: uffd_wp: Add pte_uffd_wp_available()
  riscv: Add RISC-V Svrsw60t59b extension support
  riscv: mm: Add soft-dirty page tracking support
  riscv: mm: Add uffd write-protect support

 arch/riscv/Kconfig                    |  16 +++
 arch/riscv/include/asm/hwcap.h        |   1 +
 arch/riscv/include/asm/pgtable-bits.h |  37 +++++++
 arch/riscv/include/asm/pgtable.h      | 140 +++++++++++++++++++++++++-
 arch/riscv/kernel/cpufeature.c        |   1 +
 fs/proc/task_mmu.c                    |  17 +++-
 fs/userfaultfd.c                      |  25 +++--
 include/asm-generic/pgtable_uffd.h    |  12 +++
 include/linux/mm_inline.h             |   7 ++
 include/linux/pgtable.h               |  10 ++
 include/linux/userfaultfd_k.h         |  44 +++++---
 mm/debug_vm_pgtable.c                 |   9 +-
 mm/huge_memory.c                      |  10 +-
 mm/internal.h                         |   2 +-
 mm/memory.c                           |   6 +-
 mm/mremap.c                           |  10 +-
 mm/userfaultfd.c                      |   6 +-
 17 files changed, 306 insertions(+), 47 deletions(-)

-- 
2.34.1

From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:08 2025
From: zhangchunyan at iscas.ac.cn (Chunyan Zhang)
Date: Tue, 9 Sep 2025 17:56:08 +0800
Subject: [PATCH V10 2/5] mm: uffd_wp: Add pte_uffd_wp_available()
In-Reply-To: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn>
References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn>
Message-ID: <20250909095611.803898-3-zhangchunyan@iscas.ac.cn>

Some platforms can customize the PTE uffd_wp bit and make it unavailable
even if the architecture allows providing the PTE resource.
This patch adds a macro API that lets each architecture define its own
check for whether the PTE uffd_wp bit is available.
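One effect of this change is that a compile-time option and a runtime check are combined into a single condition before write-protect features are advertised. A hedged userspace sketch of that combination follows; the `MODEL_` constants are arbitrary values rather than the real UAPI feature bits, and `model_hw_has_uffd_wp_bit` is a hypothetical stand-in for the architecture's probe result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the compile-time options; 1 means "configured in". */
#define MODEL_HAVE_ARCH_UFFD_WP  1
#define MODEL_PTE_MARKER_UFFD_WP 1

/* Hypothetical runtime probe result, overriding the generic default. */
static bool model_hw_has_uffd_wp_bit;
#define model_pte_uffd_wp_available() (model_hw_has_uffd_wp_bit)

/* Arbitrary bit values for this model, not the real UAPI constants. */
#define MODEL_FEATURE_PAGEFAULT_FLAG_WP (UINT32_C(1) << 0)
#define MODEL_FEATURE_WP_ASYNC          (UINT32_C(1) << 1)
#define MODEL_FEATURE_OTHER             (UINT32_C(1) << 2)

/* Drop write-protect features unless both checks pass. */
static uint32_t model_supported_features(uint32_t features)
{
	if (!MODEL_HAVE_ARCH_UFFD_WP || !model_pte_uffd_wp_available())
		features &= ~MODEL_FEATURE_PAGEFAULT_FLAG_WP;

	if (!MODEL_PTE_MARKER_UFFD_WP || !model_pte_uffd_wp_available())
		features &= ~MODEL_FEATURE_WP_ASYNC;

	return features;
}

/* Usage: the same binary reports different features per machine. */
void model_uffd_features_selfcheck(void)
{
	const uint32_t all = MODEL_FEATURE_PAGEFAULT_FLAG_WP |
			     MODEL_FEATURE_WP_ASYNC | MODEL_FEATURE_OTHER;

	model_hw_has_uffd_wp_bit = false;
	assert(model_supported_features(all) == MODEL_FEATURE_OTHER);

	model_hw_has_uffd_wp_bit = true;
	assert(model_supported_features(all) == all);
}
```

Compared with a preprocessor-only guard, this form keeps both branches visible to the compiler while still letting one kernel image behave differently on hardware with and without the PTE bit.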
Signed-off-by: Chunyan Zhang --- fs/userfaultfd.c | 25 +++++++++-------- include/asm-generic/pgtable_uffd.h | 12 ++++++++ include/linux/mm_inline.h | 7 +++++ include/linux/userfaultfd_k.h | 44 +++++++++++++++++++----------- mm/memory.c | 6 ++-- 5 files changed, 65 insertions(+), 29 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 54c6cc7fe9c6..68e5006e5158 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1270,9 +1270,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING) vm_flags |= VM_UFFD_MISSING; if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) { -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - goto out; -#endif + if (!IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) || + !pte_uffd_wp_available()) + goto out; + vm_flags |= VM_UFFD_WP; } if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR) { @@ -1980,14 +1981,16 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, uffdio_api.features &= ~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM); #endif -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; -#endif -#ifndef CONFIG_PTE_MARKER_UFFD_WP - uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; - uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; - uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; -#endif + if (!IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) || + !pte_uffd_wp_available()) + uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; + + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pte_uffd_wp_available()) { + uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; + uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; + uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; + } ret = -EINVAL; if (features & ~uffdio_api.features) diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h index 828966d4c281..abab46bd718b 100644 --- a/include/asm-generic/pgtable_uffd.h +++ 
b/include/asm-generic/pgtable_uffd.h @@ -61,6 +61,18 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd) { return pmd; } +#define pte_uffd_wp_available() (false) +#else +/* + * Some platforms can customize the PTE uffd_wp bit and make it unavailable + * even if the architecture allows providing the PTE resource. + * It allows architectures to define their APIs to check if the PTE + * uffd_wp bit is available on the specific devices. + */ +#ifndef pte_uffd_wp_available +#define pte_uffd_wp_available() (true) +#endif + #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */ diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 89b518ff097e..4e5a8a265642 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -571,6 +571,13 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, pte_t *pte, pte_t pteval) { #ifdef CONFIG_PTE_MARKER_UFFD_WP + /* + * Some platforms can customize the PTE uffd_wp bit and make it unavailable + * even if the architecture allows providing the PTE resource. + */ + if (!pte_uffd_wp_available()) + return false; + bool arm_uffd_pte = false; /* The current status of the pte should be "cleared" before calling */ diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index c0e716aec26a..ec4a815286c8 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -228,15 +228,15 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma, if (wp_async && (vm_flags == VM_UFFD_WP)) return true; -#ifndef CONFIG_PTE_MARKER_UFFD_WP /* * If user requested uffd-wp but not enabled pte markers for * uffd-wp, then shmem & hugetlbfs are not supported but only * anonymous. 
*/ - if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) + if ((!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pte_uffd_wp_available()) && + (vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) return false; -#endif /* By default, allow any of anon|shmem|hugetlb */ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || @@ -437,8 +437,11 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma) static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - return is_pte_marker_entry(entry) && - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + if (pte_uffd_wp_available()) + return is_pte_marker_entry(entry) && + (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + else + return false; #else return false; #endif @@ -447,14 +450,19 @@ static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) static inline bool pte_marker_uffd_wp(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - swp_entry_t entry; + if (pte_uffd_wp_available()) { + swp_entry_t entry; - if (!is_swap_pte(pte)) - return false; + if (!is_swap_pte(pte)) + return false; - entry = pte_to_swp_entry(pte); + entry = pte_to_swp_entry(pte); + + return pte_marker_entry_uffd_wp(entry); + } else { + return false; + } - return pte_marker_entry_uffd_wp(entry); #else return false; #endif @@ -467,14 +475,18 @@ static inline bool pte_marker_uffd_wp(pte_t pte) static inline bool pte_swp_uffd_wp_any(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - if (!is_swap_pte(pte)) - return false; + if (pte_uffd_wp_available()) { + if (!is_swap_pte(pte)) + return false; - if (pte_swp_uffd_wp(pte)) - return true; + if (pte_swp_uffd_wp(pte)) + return true; - if (pte_marker_uffd_wp(pte)) - return true; + if (pte_marker_uffd_wp(pte)) + return true; + } else { + return false; + } #endif return false; } diff --git a/mm/memory.c b/mm/memory.c index 0ba4f6b71847..d6c874221433 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1465,7 +1465,9 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct 
*vma, { bool was_installed = false; -#ifdef CONFIG_PTE_MARKER_UFFD_WP + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pte_uffd_wp_available()) + return false; + /* Zap on anonymous always means dropping everything */ if (vma_is_anonymous(vma)) return false; @@ -1482,7 +1484,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte++; addr += PAGE_SIZE; } -#endif + return was_installed; } -- 2.34.1 From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:11 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Tue, 9 Sep 2025 17:56:11 +0800 Subject: [PATCH V10 5/5] riscv: mm: Add uffd write-protect support In-Reply-To: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250909095611.803898-6-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software, this patch uses bit 60 for uffd-wp tracking Additionally for tracking the uffd-wp state as a PTE swap bit, we borrow bit 4 which is not involved into swap entry computation. 
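The bit usage described in this commit message can be illustrated with a small userspace sketch. It is not the kernel code: the DEMO_ and demo_ names are invented, a PTE is modelled as a bare 64-bit integer, and the masks are unconditional, whereas the patch makes them evaluate to 0 when the extension is absent.

```c
#include <stdbool.h>
#include <stdint.h>

/* RISC-V PTE bit positions; the DEMO_ names are illustrative only. */
#define DEMO_PAGE_WRITE       (UINT64_C(1) << 2)  /* W permission bit */
#define DEMO_PAGE_USER        (UINT64_C(1) << 4)  /* U bit */
#define DEMO_PAGE_UFFD_WP     (UINT64_C(1) << 60) /* freed by Svrsw60t59b */
#define DEMO_PAGE_SWP_UFFD_WP DEMO_PAGE_USER      /* bit 4, unused in swap entries */

/* Present PTE: is the uffd-wp marker set? */
static bool demo_pte_uffd_wp(uint64_t pte)
{
	return (pte & DEMO_PAGE_UFFD_WP) != 0;
}

/* Set the marker and drop write permission so the next write faults. */
static uint64_t demo_pte_mkuffd_wp(uint64_t pte)
{
	return (pte | DEMO_PAGE_UFFD_WP) & ~DEMO_PAGE_WRITE;
}

static uint64_t demo_pte_clear_uffd_wp(uint64_t pte)
{
	return pte & ~DEMO_PAGE_UFFD_WP;
}

/* Swap PTE: the same state is carried in bit 4 instead of bit 60. */
static bool demo_pte_swp_uffd_wp(uint64_t pte)
{
	return (pte & DEMO_PAGE_SWP_UFFD_WP) != 0;
}

static uint64_t demo_pte_swp_mkuffd_wp(uint64_t pte)
{
	return pte | DEMO_PAGE_SWP_UFFD_WP;
}

static uint64_t demo_pte_swp_clear_uffd_wp(uint64_t pte)
{
	return pte & ~DEMO_PAGE_SWP_UFFD_WP;
}
```

Keeping separate masks for present and swap entries mirrors the split between _PAGE_UFFD_WP and _PAGE_SWP_UFFD_WP in the patch.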
Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 18 +++++++ arch/riscv/include/asm/pgtable.h | 67 +++++++++++++++++++++++++++ 3 files changed, 86 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 53b73e4bdf3f..f928768bb14a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -147,6 +147,7 @@ config RISCV select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD + select HAVE_ARCH_USERFAULTFD_WP if 64BIT && MMU && USERFAULTFD && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_VMAP_STACK if MMU && 64BIT select HAVE_ASM_MODVERSIONS select HAVE_CONTEXT_TRACKING_USER diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 8ffe81bf66d2..894b2a24fc49 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -38,6 +38,24 @@ #define _PAGE_SWP_SOFT_DIRTY 0 #endif /* CONFIG_MEM_SOFT_DIRTY */ +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */ +#define _PAGE_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 60) : 0) +/* + * Bit 4 is not involved into swap entry computation, so we + * can borrow it for swap page uffd-wp tracking. + */ +#define _PAGE_SWP_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + _PAGE_USER : 0) +#else +#define _PAGE_UFFD_WP 0 +#define _PAGE_SWP_UFFD_WP 0 +#endif + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index b2d00d129d81..94cc97d3dbff 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -416,6 +416,40 @@ static inline pte_t pte_wrprotect(pte_t pte) return __pte(pte_val(pte) & ~(_PAGE_WRITE)); } +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +#define pte_uffd_wp_available() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_UFFD_WP); +} + +static inline pte_t pte_mkuffd_wp(pte_t pte) +{ + return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP)); +} + +static inline pte_t pte_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP)); +} + +static inline bool pte_swp_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_mkuffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP)); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + /* static inline pte_t pte_mkread(pte_t pte) */ static inline pte_t pte_mkwrite_novma(pte_t pte) @@ -836,6 +870,38 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +static inline bool pmd_uffd_wp(pmd_t pmd) +{ + return pte_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd))); +} + +static inline bool pmd_swp_uffd_wp(pmd_t pmd) +{ + return pte_swp_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t 
pmd_swp_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd))); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline bool pmd_soft_dirty(pmd_t pmd) { @@ -1053,6 +1119,7 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * bit 0: _PAGE_PRESENT (zero) * bit 1 to 2: (zero) * bit 3: _PAGE_SWP_SOFT_DIRTY + * bit 4: _PAGE_SWP_UFFD_WP * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From zhangchunyan at iscas.ac.cn Tue Sep 9 02:56:09 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Tue, 9 Sep 2025 17:56:09 +0800 Subject: [PATCH V10 3/5] riscv: Add RISC-V Svrsw60t59b extension support In-Reply-To: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250909095611.803898-4-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software to use. Reviewed-by: Alexandre Ghiti Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 14 ++++++++++++++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 3 files changed, 16 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..d99df67cc7a4 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -862,6 +862,20 @@ config RISCV_ISA_ZICBOP If you don't know what to do here, say Y. +config RISCV_ISA_SVRSW60T59B + bool "Svrsw60t59b extension support for using PTE bits 60 and 59" + depends on MMU && 64BIT + depends on RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the Svrsw60t59b + extension and enable its usage. + + The Svrsw60t59b extension allows to free the PTE reserved bits 60 + and 59 for software to use. + + If you don't know what to do here, say Y. 
+ config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI def_bool y # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a..f98fcb5c17d5 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_SVRSW60T59B 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572..de29562096ff 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -540,6 +540,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), __RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC), + __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B), }; const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext); -- 2.34.1 From andriy.shevchenko at intel.com Tue Sep 9 04:31:02 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 14:31:02 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. ... > + config = (typeof(config)){ This looks unusual. Why can't properly formed compound literal be used as in many other places in the kernel? 
> + .dev = &pdev->dev, > + .sz = 4, > + .dat = sd->gpio_pub_base + GPINR, > + .set = sd->gpio_pub_base + GPOUTR, > + .dirout = sd->gpio_pub_base + GPOER, > + }; -- With Best Regards, Andy Shevchenko From brgl at bgdev.pl Tue Sep 9 04:35:04 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 9 Sep 2025 13:35:04 +0200 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 1:31?PM Andy Shevchenko wrote: > > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: > > > > Convert the driver to using the new generic GPIO chip interfaces from > > linux/gpio/generic.h. > > ... > > > + config = (typeof(config)){ > > This looks unusual. Why can't properly formed compound literal be used as in > many other places in the kernel? > It is correct C and checkpatch doesn't raise any warnings. It's the same kind of argument as between kmalloc(sizeof(struct foo)) vs kmalloc(sizeof(f)). I guess it's personal taste but I like this version better. Bartosz > > + .dev = &pdev->dev, > > + .sz = 4, > > + .dat = sd->gpio_pub_base + GPINR, > > + .set = sd->gpio_pub_base + GPOUTR, > > + .dirout = sd->gpio_pub_base + GPOER, > > + }; From david at redhat.com Tue Sep 9 04:42:26 2025 From: david at redhat.com (David Hildenbrand) Date: Tue, 9 Sep 2025 13:42:26 +0200 Subject: [PATCH V10 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> Message-ID: <6b2f12aa-8ed9-476d-a69d-f05ea526f16a@redhat.com> On 09.09.25 11:56, Chunyan Zhang wrote: > Some platforms can customize the PTE soft dirty bit and make it unavailable > even if the architecture allows providing the PTE resource. 
> > Add an API which architectures can define their specific implementations > to detect if the PTE soft-dirty bit is available, on which the kernel > is running. > > Signed-off-by: Chunyan Zhang > --- > fs/proc/task_mmu.c | 17 ++++++++++++++++- > include/linux/pgtable.h | 10 ++++++++++ > mm/debug_vm_pgtable.c | 9 +++++---- > mm/huge_memory.c | 10 ++++++---- > mm/internal.h | 2 +- > mm/mremap.c | 10 ++++++---- > mm/userfaultfd.c | 6 ++++-- > 7 files changed, 48 insertions(+), 16 deletions(-) > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > index 29cca0e6d0ff..20a609ec1ba6 100644 > --- a/fs/proc/task_mmu.c > +++ b/fs/proc/task_mmu.c > @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > * -Werror=unterminated-string-initialization warning > * with GCC 15 > */ > - static const char mnemonics[BITS_PER_LONG][3] = { > + static char mnemonics[BITS_PER_LONG][3] = { > /* > * In case if we meet a flag we don't know about. > */ > @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > [ilog2(VM_SEALED)] = "sl", > #endif > }; > +/* > + * We should remove the VM_SOFTDIRTY flag if the PTE soft-dirty bit is > + * unavailable on which the kernel is running, even if the architecture > + * allows providing the PTE resource and soft-dirty is compiled in. > + */ > +#ifdef CONFIG_MEM_SOFT_DIRTY > + if (!pte_soft_dirty_available()) > + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > +#endif > + > size_t i; > > seq_puts(m, "VmFlags: "); > @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, > static inline void clear_soft_dirty(struct vm_area_struct *vma, > unsigned long addr, pte_t *pte) > { > + if (!pte_soft_dirty_available()) > + return; > /* > * The soft-dirty tracker uses #PF-s to catch writes > * to pages, so write-protect the pte as well. 
See the > @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, > { > pmd_t old, pmd = *pmdp; > > + if (!pte_soft_dirty_available()) > + return; > + > if (pmd_present(pmd)) { > /* See comment in change_huge_pmd() */ > old = pmdp_invalidate(vma, addr, pmdp); > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > index 4c035637eeb7..c0e2a6dc69f4 100644 > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > #endif > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > + > +/* > + * Some platforms can customize the PTE soft dirty bit and make it unavailable > + * even if the architecture allows providing the PTE resource. > + */ > +#ifndef pte_soft_dirty_available > +#define pte_soft_dirty_available() (true) > +#endif > + > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > { > @@ -1555,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) > } > #endif > #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ > +#define pte_soft_dirty_available() (false) > static inline int pte_soft_dirty(pte_t pte) > { > return 0; > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > index 830107b6dd08..98ed7e22ccec 100644 > --- a/mm/debug_vm_pgtable.c > +++ b/mm/debug_vm_pgtable.c > @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > { > pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) I suggest that you instead make pte_soft_dirty_available() be false without CONFIG_MEM_SOFT_DIRTY. e.g., for the default implementation define pte_soft_dirty_available() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) That way you can avoid some ifefs and cleanup these checks. 
But as we do also have PMD soft-dirty support, I guess we would want to call this something more abstract "pgtable_soft_dirty_available" or "pgtable_soft_dirty_supported" -- Cheers David / dhildenb From david at redhat.com Tue Sep 9 04:43:28 2025 From: david at redhat.com (David Hildenbrand) Date: Tue, 9 Sep 2025 13:43:28 +0200 Subject: [PATCH V10 2/5] mm: uffd_wp: Add pte_uffd_wp_available() In-Reply-To: <20250909095611.803898-3-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-3-zhangchunyan@iscas.ac.cn> Message-ID: <0847f0b4-a02e-42d7-9bf3-797a753d3304@redhat.com> On 09.09.25 11:56, Chunyan Zhang wrote: > Some platforms can customize the PTE uffd_wp bit and make it unavailable > even if the architecture allows providing the PTE resource. > This patch adds a macro API which allows architectures to define > their specific ones for checking if the PTE uffd_wp bit is available. > > Signed-off-by: Chunyan Zhang > --- > fs/userfaultfd.c | 25 +++++++++-------- > include/asm-generic/pgtable_uffd.h | 12 ++++++++ > include/linux/mm_inline.h | 7 +++++ > include/linux/userfaultfd_k.h | 44 +++++++++++++++++++----------- > mm/memory.c | 6 ++-- > 5 files changed, 65 insertions(+), 29 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 54c6cc7fe9c6..68e5006e5158 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -1270,9 +1270,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, > if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING) > vm_flags |= VM_UFFD_MISSING; > if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) { > -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP > - goto out; > -#endif > + if (!IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) || > + !pte_uffd_wp_available()) > + goto out; > + Same comment as for the other patch: make the CONFIG_HAVE_ARCH_USERFAULTFD_WP part of the pte_uffd_wp_available() check and better call it "pgtable_uffd_wp_" ... available/supported. 
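One possible reading of this suggestion, sketched in userspace C: fold the config check into the helper so callers test a single predicate. This is an assumption about the intended shape, not code from the thread; the demo_ names, the DEMO_CONFIG_ constant, and the "_supported" spelling are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig symbol and the arch probe. */
#define DEMO_CONFIG_HAVE_ARCH_USERFAULTFD_WP 1
static bool demo_arch_has_wp_bit = true;

/* One helper answers both "compiled in?" and "usable on this CPU?". */
static bool demo_pgtable_uffd_wp_supported(void)
{
	return DEMO_CONFIG_HAVE_ARCH_USERFAULTFD_WP && demo_arch_has_wp_bit;
}

/* The caller needs one check instead of a config test plus a probe. */
static int demo_register_wp_mode(void)
{
	if (!demo_pgtable_uffd_wp_supported())
		return -EINVAL;
	return 0;
}
```

With the config constant set to 0 the helper is a compile-time false, so the compiler can still discard the guarded code much as the #ifdef did.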
-- Cheers David / dhildenb From anup at brainfault.org Tue Sep 9 06:08:06 2025 From: anup at brainfault.org (Anup Patel) Date: Tue, 9 Sep 2025 18:38:06 +0530 Subject: [PATCH v6 4/8] drivers/perf: riscv: Implement PMU event info function In-Reply-To: <20250909-pmu_event_info-v6-4-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> <20250909-pmu_event_info-v6-4-d8f80cacb884@rivosinc.com> Message-ID: On Tue, Sep 9, 2025 at 12:33 PM Atish Patra wrote: > > With the new SBI PMU event info function, we can query the availability > of the all standard SBI PMU events at boot time with a single ecall. > This improves the bootime by avoiding making an SBI call for each > standard PMU event. Since this function is defined only in SBI v3.0, > invoke this only if the underlying SBI implementation is v3.0 or higher. > > Signed-off-by: Atish Patra LGTM. Reviewed-by: Anup Patel Regards, Anup > --- > arch/riscv/include/asm/sbi.h | 9 ++++++ > drivers/perf/riscv_pmu_sbi.c | 69 ++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 78 insertions(+) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index b0c41ef56968..5ca7cebc13cc 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -136,6 +136,7 @@ enum sbi_ext_pmu_fid { > SBI_EXT_PMU_COUNTER_FW_READ, > SBI_EXT_PMU_COUNTER_FW_READ_HI, > SBI_EXT_PMU_SNAPSHOT_SET_SHMEM, > + SBI_EXT_PMU_EVENT_GET_INFO, > }; > > union sbi_pmu_ctr_info { > @@ -159,6 +160,14 @@ struct riscv_pmu_snapshot_data { > u64 reserved[447]; > }; > > +struct riscv_pmu_event_info { > + u32 event_idx; > + u32 output; > + u64 event_data; > +}; > + > +#define RISCV_PMU_EVENT_INFO_OUTPUT_MASK 0x01 > + > #define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0) > #define RISCV_PMU_PLAT_FW_EVENT_MASK GENMASK_ULL(61, 0) > /* SBI v3.0 allows extended hpmeventX width value */ > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c > index
3644bed4c8ab..a6c479f853e1 100644 > --- a/drivers/perf/riscv_pmu_sbi.c > +++ b/drivers/perf/riscv_pmu_sbi.c > @@ -299,6 +299,66 @@ static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX] > }, > }; > > +static int pmu_sbi_check_event_info(void) > +{ > + int num_events = ARRAY_SIZE(pmu_hw_event_map) + PERF_COUNT_HW_CACHE_MAX * > + PERF_COUNT_HW_CACHE_OP_MAX * PERF_COUNT_HW_CACHE_RESULT_MAX; > + struct riscv_pmu_event_info *event_info_shmem; > + phys_addr_t base_addr; > + int i, j, k, result = 0, count = 0; > + struct sbiret ret; > + > + event_info_shmem = kcalloc(num_events, sizeof(*event_info_shmem), GFP_KERNEL); > + if (!event_info_shmem) > + return -ENOMEM; > + > + for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) > + event_info_shmem[count++].event_idx = pmu_hw_event_map[i].event_idx; > + > + for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) { > + for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) { > + for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) > + event_info_shmem[count++].event_idx = > + pmu_cache_event_map[i][j][k].event_idx; > + } > + } > + > + base_addr = __pa(event_info_shmem); > + if (IS_ENABLED(CONFIG_32BIT)) > + ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_EVENT_GET_INFO, lower_32_bits(base_addr), > + upper_32_bits(base_addr), count, 0, 0, 0); > + else > + ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_EVENT_GET_INFO, base_addr, 0, > + count, 0, 0, 0); > + if (ret.error) { > + result = -EOPNOTSUPP; > + goto free_mem; > + } > + > + for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) { > + if (!(event_info_shmem[i].output & RISCV_PMU_EVENT_INFO_OUTPUT_MASK)) > + pmu_hw_event_map[i].event_idx = -ENOENT; > + } > + > + count = ARRAY_SIZE(pmu_hw_event_map); > + > + for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) { > + for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) { > + for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) { > + if (!(event_info_shmem[count].output & > + 
RISCV_PMU_EVENT_INFO_OUTPUT_MASK)) > + pmu_cache_event_map[i][j][k].event_idx = -ENOENT; > + count++; > + } > + } > + } > + > +free_mem: > + kfree(event_info_shmem); > + > + return result; > +} > + > static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) > { > struct sbiret ret; > @@ -316,6 +376,15 @@ static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) > > static void pmu_sbi_check_std_events(struct work_struct *work) > { > + int ret; > + > + if (sbi_v3_available) { > + ret = pmu_sbi_check_event_info(); > + if (ret) > + pr_err("pmu_sbi_check_event_info failed with error %d\n", ret); > + return; > + } > + > for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) > pmu_sbi_check_event(&pmu_hw_event_map[i]); > > > -- > 2.43.0 > From anup at brainfault.org Tue Sep 9 06:09:26 2025 From: anup at brainfault.org (Anup Patel) Date: Tue, 9 Sep 2025 18:39:26 +0530 Subject: [PATCH v6 6/8] RISC-V: KVM: No need of explicit writable slot check In-Reply-To: <20250909-pmu_event_info-v6-6-d8f80cacb884@rivosinc.com> References: <20250909-pmu_event_info-v6-0-d8f80cacb884@rivosinc.com> <20250909-pmu_event_info-v6-6-d8f80cacb884@rivosinc.com> Message-ID: On Tue, Sep 9, 2025 at 12:33 PM Atish Patra wrote: > > There is not much value in checking if a memslot is writable explicitly > before a write as it may change underneath after the check. Rather, return > invalid address error when write_guest fails as it checks if the slot > is writable anyways. > > Suggested-by: Sean Christopherson > Signed-off-by: Atish Patra LGTM.
Reviewed-by: Anup Patel Regards, Anup > --- > arch/riscv/kvm/vcpu_pmu.c | 11 ++--------- > arch/riscv/kvm/vcpu_sbi_sta.c | 9 ++------- > 2 files changed, 4 insertions(+), 16 deletions(-) > > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c > index 15d71a7b75ba..f8514086bd6b 100644 > --- a/arch/riscv/kvm/vcpu_pmu.c > +++ b/arch/riscv/kvm/vcpu_pmu.c > @@ -409,8 +409,6 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s > int snapshot_area_size = sizeof(struct riscv_pmu_snapshot_data); > int sbiret = 0; > gpa_t saddr; > - unsigned long hva; > - bool writable; > > if (!kvpmu || flags) { > sbiret = SBI_ERR_INVALID_PARAM; > @@ -432,19 +430,14 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s > goto out; > } > > - hva = kvm_vcpu_gfn_to_hva_prot(vcpu, saddr >> PAGE_SHIFT, &writable); > - if (kvm_is_error_hva(hva) || !writable) { > - sbiret = SBI_ERR_INVALID_ADDRESS; > - goto out; > - } > - > kvpmu->sdata = kzalloc(snapshot_area_size, GFP_ATOMIC); > if (!kvpmu->sdata) > return -ENOMEM; > > + /* No need to check writable slot explicitly as kvm_vcpu_write_guest does it internally */ > if (kvm_vcpu_write_guest(vcpu, saddr, kvpmu->sdata, snapshot_area_size)) { > kfree(kvpmu->sdata); > - sbiret = SBI_ERR_FAILURE; > + sbiret = SBI_ERR_INVALID_ADDRESS; > goto out; > } > > diff --git a/arch/riscv/kvm/vcpu_sbi_sta.c b/arch/riscv/kvm/vcpu_sbi_sta.c > index cc6cb7c8f0e4..caaa28460ca4 100644 > --- a/arch/riscv/kvm/vcpu_sbi_sta.c > +++ b/arch/riscv/kvm/vcpu_sbi_sta.c > @@ -85,8 +85,6 @@ static int kvm_sbi_sta_steal_time_set_shmem(struct kvm_vcpu *vcpu) > unsigned long shmem_phys_hi = cp->a1; > u32 flags = cp->a2; > struct sbi_sta_struct zero_sta = {0}; > - unsigned long hva; > - bool writable; > gpa_t shmem; > int ret; > > @@ -111,13 +109,10 @@ static int kvm_sbi_sta_steal_time_set_shmem(struct kvm_vcpu *vcpu) > return SBI_ERR_INVALID_ADDRESS; > } > > - hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> 
PAGE_SHIFT, &writable); > - if (kvm_is_error_hva(hva) || !writable) > - return SBI_ERR_INVALID_ADDRESS; > - > + /* No need to check writable slot explicitly as kvm_vcpu_write_guest does it internally */ > ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta)); > if (ret) > - return SBI_ERR_FAILURE; > + return SBI_ERR_INVALID_ADDRESS; > > vcpu->arch.sta.shmem = shmem; > vcpu->arch.sta.last_steal = current->sched_info.run_delay; > > -- > 2.43.0 > From andriy.shevchenko at intel.com Tue Sep 9 06:13:04 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 16:13:04 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 01:35:04PM +0200, Bartosz Golaszewski wrote: > On Tue, Sep 9, 2025 at 1:31 PM Andy Shevchenko > wrote: > > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: ... > > > + config = (typeof(config)){ > > > > This looks unusual. Why can't properly formed compound literal be used as in > > many other places in the kernel? > > It is correct C If it compiles, it doesn't mean it's correct C, it might be non-standard. Have you checked with the standard (note, I read that part in the past, but I may forgot the details, so I don't know the answer to this)? > and checkpatch doesn't raise any warnings. checkpatch is far from being useful in the questions like this. It false positively complains for for_each*() macros all over the kernel, for example. > It's the > same kind of argument as between kmalloc(sizeof(struct foo)) vs > kmalloc(sizeof(f)). Maybe, but it introduces a new style while all other cases use the other, _established_ style. So we have a precedent and the form the code is written in is against the de facto usage of the compound literals.
> I guess it's personal taste but I like this version better. In kernel we also try to be consistent. This add inconsistency. Am I wrong? > > > + .dev = &pdev->dev, > > > + .sz = 4, > > > + .dat = sd->gpio_pub_base + GPINR, > > > + .set = sd->gpio_pub_base + GPOUTR, > > > + .dirout = sd->gpio_pub_base + GPOER, > > > + }; -- With Best Regards, Andy Shevchenko From brgl at bgdev.pl Tue Sep 9 06:24:23 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 9 Sep 2025 08:24:23 -0500 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, 9 Sep 2025 15:13:04 +0200, Andy Shevchenko said: > On Tue, Sep 09, 2025 at 01:35:04PM +0200, Bartosz Golaszewski wrote: >> On Tue, Sep 9, 2025 at 1:31?PM Andy Shevchenko >> wrote: >> > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: > > ... > >> > > + config = (typeof(config)){ >> > >> > This looks unusual. Why can't properly formed compound literal be used as in >> > many other places in the kernel? >> >> It is correct C > > If it compiles, it doesn't mean it's correct C, it might be non-standard. > Have you checked with the standard (note, I read that part in the past, > but I may forgot the details, so I don't know the answer to this)? > It's a GNU extension alright but it's supported in the kernel as it evaluates to a simple cast. >> and checkpatch doesn't raise any warnings. > > checkpatch is far from being useful in the questions like this. > It false positively complains for for_each*() macros all over > the kernel, for example. > >> It's the >> same kind of argument as between kmalloc(sizeof(struct foo)) vs >> kmalloc(sizeof(f)). > > Maybe, but it introduces a new style while all other cases use the other, > _established_ style. 
So we have a precedent and the form the code is written > in is against the de facto usage of the compound literals. > It may not be *very* common but it's hardly new style: $ git grep -P "\(typeof\(.*\)\) ?\{" | wc 108 529 7315 Bart >> I guess it's personal taste but I like this version better. > > In kernel we also try to be consistent. This add inconsistency. Am I wrong? > >> > > + .dev = &pdev->dev, >> > > + .sz = 4, >> > > + .dat = sd->gpio_pub_base + GPINR, >> > > + .set = sd->gpio_pub_base + GPOUTR, >> > > + .dirout = sd->gpio_pub_base + GPOER, >> > > + }; > > -- > With Best Regards, > Andy Shevchenko > > > From aliceryhl at google.com Tue Sep 9 06:41:24 2025 From: aliceryhl at google.com (Alice Ryhl) Date: Tue, 9 Sep 2025 13:41:24 +0000 Subject: [PATCH v1] rust: cfi: only 64-bit arm and x86 support CFI_CLANG In-Reply-To: <202509082009.4A8DC97BD2@keescook> References: <20250908-distill-lint-1ae78bcf777c@spud> <202509082009.4A8DC97BD2@keescook> Message-ID: On Mon, Sep 08, 2025 at 08:11:48PM -0700, Kees Cook wrote: > On Mon, Sep 08, 2025 at 02:12:35PM +0100, Conor Dooley wrote: > > From: Conor Dooley > > > > The kernel uses the standard rustc targets for non-x86 targets, and out > > of those only 64-bit arm's target has kcfi support enabled. For x86, the > > custom 64-bit target enables kcfi. > > > > The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows > > CFI_CLANG to be used in combination with RUST does not check whether the > > rustc target supports kcfi. This breaks the build on riscv (and > > presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same > > time. > > > > Ordinarily, a rustc-option check would be used to detect target support > > but unfortunately rustc-option filters out the target for reasons given > > in commit 46e24a545cdb4 ("rust: kasan/kbuild: fix missing flags on first > > build"). As a result, if the host supports kcfi but the target does not, > > e.g. 
when building for riscv on x86_64, the build would remain broken. > > > > Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only > > two architectures where the target used supports it to fix the build. > > I'm generally fine with this, but normally we do arch-specific stuff > only in arch/$arch/Kconfig, and expose some kind of > ARCH_HAS_CFI_ICALL_NORMALIZE_INTEGERS that would get tested here. Should > we do that here too? I'm thinking in this case it makes sense to keep this patch simple as it's a fix. Once rustc supports cfi on riscv (which should really just be changing the target to list it as supported), we can reorganize it to match what you're describing at that point. > > CC: stable at vger.kernel.org > > Fixes: ca627e636551e ("rust: cfi: add support for CFI_CLANG with Rust") > > Signed-off-by: Conor Dooley > > --- > > CC: Paul Walmsley > > CC: Palmer Dabbelt > > CC: Alexandre Ghiti > > CC: Miguel Ojeda > > CC: Alex Gaynor > > CC: Boqun Feng > > CC: Gary Guo > > CC: "Björn Roy Baron" > > CC: Benno Lossin > > CC: Andreas Hindborg > > CC: Alice Ryhl > > CC: Trevor Gross > > CC: Danilo Krummrich > > CC: Kees Cook > > CC: Sami Tolvanen > > CC: Matthew Maurer > > CC: "Peter Zijlstra (Intel)" > > CC: linux-kernel at vger.kernel.org > > CC: linux-riscv at lists.infradead.org > > CC: rust-for-linux at vger.kernel.org > > --- > > arch/Kconfig | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/arch/Kconfig b/arch/Kconfig > > index d1b4ffd6e0856..880cddff5eda7 100644 > > --- a/arch/Kconfig > > +++ b/arch/Kconfig > > @@ -917,6 +917,7 @@ config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC > > def_bool y > > depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG > > depends on RUSTC_VERSION >= 107900 > > + depends on ARM64 || X86_64 > > # With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373 > > depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \ > > (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS) >
> -- > > 2.47.2 > > > > -- > Kees Cook From andriy.shevchenko at intel.com Tue Sep 9 06:45:31 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 16:45:31 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 08:24:23AM -0500, Bartosz Golaszewski wrote: > On Tue, 9 Sep 2025 15:13:04 +0200, Andy Shevchenko > said: > > On Tue, Sep 09, 2025 at 01:35:04PM +0200, Bartosz Golaszewski wrote: > >> On Tue, Sep 9, 2025 at 1:31 PM Andy Shevchenko > >> wrote: > >> > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: ... > >> > > + config = (typeof(config)){ > >> > > >> > This looks unusual. Why can't properly formed compound literal be used as in > >> > many other places in the kernel? > >> > >> It is correct C > > > > If it compiles, it doesn't mean it's correct C, it might be non-standard. > > Have you checked with the standard (note, I read that part in the past, > > but I may forgot the details, so I don't know the answer to this)? > > It's a GNU extension alright clang, I suppose, also okay with this? > but it's supported in the kernel as it evaluates > to a simple cast. There is no cast. And that's make a big difference to what the code tries to do. > >> and checkpatch doesn't raise any warnings. > > > > checkpatch is far from being useful in the questions like this. > > It false positively complains for for_each*() macros all over > > the kernel, for example. > > > >> It's the > >> same kind of argument as between kmalloc(sizeof(struct foo)) vs > >> kmalloc(sizeof(f)). > > > > Maybe, but it introduces a new style while all other cases use the other, > > _established_ style. So we have a precedent and the form the code is written > > in is against the de facto usage of the compound literals.
> > It may not be *very* common but it's hardly new style: I think your statement is incorrect see below why. > $ git grep -P "\(typeof\(.*\)\) ?\{" | wc > 108 529 7315 Not correct. The correct output will be closer to $ git grep -l -P "\(typeof\(.*\)\) ?\{" | wc -l 15 And if you looked at the output carefully, you see the bug in the RE you used. So, even closer will be this one: $ git grep -l -P "=[[:space:]]+\(typeof\(.*\)\) ?\{" | wc -l 7 2 out of which are related to libeth, effectively makes this 6. No, this is completely non-standard and unusual thing in the kernel. > >> I guess it's personal taste but I like this version better. > > > > In kernel we also try to be consistent. This add inconsistency. Am I wrong? > > > >> > > + .dev = &pdev->dev, > >> > > + .sz = 4, > >> > > + .dat = sd->gpio_pub_base + GPINR, > >> > > + .set = sd->gpio_pub_base + GPOUTR, > >> > > + .dirout = sd->gpio_pub_base + GPOER, > >> > > + }; -- With Best Regards, Andy Shevchenko From andriy.shevchenko at intel.com Tue Sep 9 06:47:09 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 16:47:09 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 04:45:31PM +0300, Andy Shevchenko wrote: > On Tue, Sep 09, 2025 at 08:24:23AM -0500, Bartosz Golaszewski wrote: > > On Tue, 9 Sep 2025 15:13:04 +0200, Andy Shevchenko > > said: > > > On Tue, Sep 09, 2025 at 01:35:04PM +0200, Bartosz Golaszewski wrote: > > >> On Tue, Sep 9, 2025 at 1:31 PM Andy Shevchenko > > >> wrote: > > >> > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: ... > > >> > > + config = (typeof(config)){ > > >> > > > >> > This looks unusual. Why can't properly formed compound literal be used as in > > >> > many other places in the kernel?
> > >> > > >> It is correct C > > > > > > If it compiles, it doesn't mean it's correct C, it might be non-standard. > > > Have you checked with the standard (note, I read that part in the past, > > > but I may forgot the details, so I don't know the answer to this)? > > > > It's a GNU extension alright > > clang, I suppose, also okay with this? > > > but it's supported in the kernel as it evaluates > > to a simple cast. > > There is no cast. And that's make a big difference to what the code tries to do. > > > >> and checkpatch doesn't raise any warnings. > > > > > > checkpatch is far from being useful in the questions like this. > > > It false positively complains for for_each*() macros all over > > > the kernel, for example. > > > > > >> It's the > > >> same kind of argument as between kmalloc(sizeof(struct foo)) vs > > >> kmalloc(sizeof(f)). > > > > > > Maybe, but it introduces a new style while all other cases use the other, > > > _established_ style. So we have a precedent and the form the code is written > > > in is against the de facto usage of the compound literals. > > > > It may not be *very* common but it's hardly new style: > > I think your statement is incorrect see below why. > > > $ git grep -P "\(typeof\(.*\)\) ?\{" | wc > > 108 529 7315 > > Not correct. The correct output will be closer to > > $ git grep -l -P "\(typeof\(.*\)\) ?\{" | wc -l > 15 > > And if you looked at the output carefully, you see the bug in the RE you used. > > So, even closer will be this one: > > $ git grep -l -P "=[[:space:]]+\(typeof\(.*\)\) ?\{" | wc -l > 7 > > 2 out of which are related to libeth, effectively makes this 6. TBH, I think those 6 all made the same mistake, i.e. thinking of the compound literal as a cast. Which is not! > No, this is completely non-standard and unusual thing in the kernel. > > > >> I guess it's personal taste but I like this version better. > > > > > > In kernel we also try to be consistent. This add inconsistency. Am I wrong? 
> > > > > >> > > + .dev = &pdev->dev, > > >> > > + .sz = 4, > > >> > > + .dat = sd->gpio_pub_base + GPINR, > > >> > > + .set = sd->gpio_pub_base + GPOUTR, > > >> > > + .dirout = sd->gpio_pub_base + GPOER, > > >> > > + }; -- With Best Regards, Andy Shevchenko From andriy.shevchenko at intel.com Tue Sep 9 06:56:12 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 16:56:12 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 04:47:09PM +0300, Andy Shevchenko wrote: > On Tue, Sep 09, 2025 at 04:45:31PM +0300, Andy Shevchenko wrote: > > On Tue, Sep 09, 2025 at 08:24:23AM -0500, Bartosz Golaszewski wrote: > > > On Tue, 9 Sep 2025 15:13:04 +0200, Andy Shevchenko > > > said: > > > > On Tue, Sep 09, 2025 at 01:35:04PM +0200, Bartosz Golaszewski wrote: > > > >> On Tue, Sep 9, 2025 at 1:31 PM Andy Shevchenko > > > >> wrote: > > > >> > On Tue, Sep 09, 2025 at 11:15:40AM +0200, Bartosz Golaszewski wrote: ... > > > >> > > + config = (typeof(config)){ > > > >> > > > > >> > This looks unusual. Why can't properly formed compound literal be used as in > > > >> > many other places in the kernel? > > > >> > > > >> It is correct C > > > > > > > > If it compiles, it doesn't mean it's correct C, it might be non-standard. > > > > Have you checked with the standard (note, I read that part in the past, > > > > but I may forgot the details, so I don't know the answer to this)? > > > > > > It's a GNU extension alright > > > > clang, I suppose, also okay with this? > > > > > but it's supported in the kernel as it evaluates > > > to a simple cast. > > > > There is no cast. And that's make a big difference to what the code tries to do. > > > > > >> and checkpatch doesn't raise any warnings.
> > > > > > > > checkpatch is far from being useful in the questions like this. > > > > It false positively complains for for_each*() macros all over > > > > the kernel, for example. > > > > > > > >> It's the > > > >> same kind of argument as between kmalloc(sizeof(struct foo)) vs > > > >> kmalloc(sizeof(f)). > > > > > > > > Maybe, but it introduces a new style while all other cases use the other, > > > > _established_ style. So we have a precedent and the form the code is written > > > > in is against the de facto usage of the compound literals. > > > > > > It may not be *very* common but it's hardly new style: > > > > I think your statement is incorrect see below why. > > > > > $ git grep -P "\(typeof\(.*\)\) ?\{" | wc > > > 108 529 7315 > > > > Not correct. The correct output will be closer to > > > > $ git grep -l -P "\(typeof\(.*\)\) ?\{" | wc -l > > 15 > > > > And if you looked at the output carefully, you see the bug in the RE you used. > > > > So, even closer will be this one: > > > > $ git grep -l -P "=[[:space:]]+\(typeof\(.*\)\) ?\{" | wc -l > > 7 > > > > 2 out of which are related to libeth, effectively makes this 6. And for of fullness the picture: $ git grep -l -P "=[[:space:]]+\(struct [^[:space:]]*\) ?\{" | wc -l 501 So, it's 1:100 ratio. > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > literal as a cast. Which is not! > > > No, this is completely non-standard and unusual thing in the kernel. > > > > > >> I guess it's personal taste but I like this version better. > > > > > > > > In kernel we also try to be consistent. This add inconsistency. Am I wrong? 
> > > > > > > >> > > + .dev = &pdev->dev, > > > >> > > + .sz = 4, > > > >> > > + .dat = sd->gpio_pub_base + GPINR, > > > >> > > + .set = sd->gpio_pub_base + GPOUTR, > > > >> > > + .dirout = sd->gpio_pub_base + GPOER, > > > >> > > + }; -- With Best Regards, Andy Shevchenko From brgl at bgdev.pl Tue Sep 9 06:56:41 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 9 Sep 2025 15:56:41 +0200 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko wrote: > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > literal as a cast. Which is not! > What do you suggest? And are we not allowed to use C99 features now anyway? Bartosz From andriy.shevchenko at intel.com Tue Sep 9 07:02:12 2025 From: andriy.shevchenko at intel.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 17:02:12 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 09, 2025 at 03:56:41PM +0200, Bartosz Golaszewski wrote: > On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko > wrote: > > > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > > literal as a cast. Which is not! > > What do you suggest? Write it in less odd way :-) foo = (struct bar) { ... }; > And are we not allowed to use C99 features now anyway? It's fine, it's not about the C standard number.
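To make the two spellings under discussion concrete, here is a minimal user-space sketch. struct demo_config, its members, and the register offsets are hypothetical stand-ins, not the real gpio-mmio configuration type, and __typeof__ is used where kernel code would write typeof(). Assuming a compiler that accepts the typeof form (GCC and Clang do), both functions build a compound literal of the same type; the disagreement in the thread is about which spelling is conventional in the kernel, not about the resulting value.

```c
#include <assert.h>

/* Hypothetical stand-in for a driver config struct (not the gpio-mmio one). */
struct demo_config {
	unsigned int sz;
	unsigned long dat;
	unsigned long set;
};

/* Spelling suggested in the thread: name the struct type explicitly. */
static struct demo_config fill_named(unsigned long base)
{
	struct demo_config config;

	config = (struct demo_config) {
		.sz = 4,
		.dat = base + 0x04,	/* offsets are made up for illustration */
		.set = base + 0x08,
	};
	return config;
}

/* Disputed spelling: derive the type name from the variable itself. */
static struct demo_config fill_typeof(unsigned long base)
{
	struct demo_config config;

	config = (__typeof__(config)) {
		.sz = 4,
		.dat = base + 0x04,
		.set = base + 0x08,
	};
	return config;
}

/* Returns 1 when both spellings produce identical member values. */
static int same_result(unsigned long base)
{
	struct demo_config a = fill_named(base);
	struct demo_config b = fill_typeof(base);

	return a.sz == b.sz && a.dat == b.dat && a.set == b.set;
}
```

Calling same_result() with any base address is expected to return 1 under the assumptions above, since in neither function is anything cast: the parenthesized type name is part of the compound literal syntax.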
-- With Best Regards, Andy Shevchenko From brgl at bgdev.pl Tue Sep 9 07:05:41 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 9 Sep 2025 16:05:41 +0200 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 4:02 PM Andy Shevchenko wrote: > > On Tue, Sep 09, 2025 at 03:56:41PM +0200, Bartosz Golaszewski wrote: > > On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko > > wrote: > > > > > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > > > literal as a cast. Which is not! > > > > What do you suggest? > > Write it in less odd way :-) > > foo = (struct bar) { ... }; I don't get your reasoning. typeof() itself is well established in the kernel and doesn't foo = (struct bar){ ... }; evaluate to the same thing as foo = (typeof(foo)){ ... }; ? Isn't it still the same compound literal? Bartosz > > > And are we not allowed to use C99 features now anyway? > > It's fine, it's not about the C standard number. > From andy.shevchenko at gmail.com Tue Sep 9 08:15:28 2025 From: andy.shevchenko at gmail.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 18:15:28 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 5:05 PM Bartosz Golaszewski wrote: > On Tue, Sep 9, 2025 at 4:02 PM Andy Shevchenko > wrote: > > On Tue, Sep 09, 2025 at 03:56:41PM +0200, Bartosz Golaszewski wrote: > > > On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko > > > wrote: ... > > > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > > > > literal as a cast. Which is not! > > > > > > What do you suggest?
> > > > Write it in less odd way :-) > > > > foo = (struct bar) { ... }; > > I don't get your reasoning. typeof() itself is well established in the > kernel and doesn't > > foo = (struct bar){ ... }; > > evaluate to the same thing as > > foo = (typeof(foo)){ ... }; > > ? Isn't it still the same compound literal? It makes it so, but typeof() usually is used for casts and not for compound literals. That's (usage typeof() for compound literals) what I am against in this case. > > > And are we not allowed to use C99 features now anyway? > > > > It's fine, it's not about the C standard number. -- With Best Regards, Andy Shevchenko From andy.shevchenko at gmail.com Tue Sep 9 08:25:45 2025 From: andy.shevchenko at gmail.com (Andy Shevchenko) Date: Tue, 9 Sep 2025 18:25:45 +0300 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 6:15 PM Andy Shevchenko wrote: > On Tue, Sep 9, 2025 at 5:05 PM Bartosz Golaszewski wrote: > > On Tue, Sep 9, 2025 at 4:02 PM Andy Shevchenko > > wrote: > > > On Tue, Sep 09, 2025 at 03:56:41PM +0200, Bartosz Golaszewski wrote: > > > > On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko > > > > wrote: ... > > > > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > > > > > literal as a cast. Which is not! > > > > > > > > What do you suggest? > > > > > > Write it in less odd way :-) > > > > > > foo = (struct bar) { ... }; > > > > I don't get your reasoning. typeof() itself is well established in the > > kernel and doesn't > > > > foo = (struct bar){ ... }; > > > > evaluate to the same thing as > > > > foo = (typeof(foo)){ ... }; > > > > ? Isn't it still the same compound literal? > > It makes it so, but typeof() usually is used for casts and not for > compound literals.
That's (usage typeof() for compound literals) what > I am against in this case. FWIW, brief googling showed that nobody (okay, I haven't found yet reddit/SO/GCC or LLVM documentation) uses typeof() for compound literals. So, this makes me feel right, that the form of typeof() is weird and works due to unknown reasons. Any pointers to the documentation you read about it? > > > > And are we not allowed to use C99 features now anyway? > > > > > > It's fine, it's not about the C standard number. E.g., https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcc/Compound-Literals.html (8.1.0 is the same). -- With Best Regards, Andy Shevchenko From horms at kernel.org Tue Sep 9 09:15:49 2025 From: horms at kernel.org (Simon Horman) Date: Tue, 9 Sep 2025 17:15:49 +0100 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> Message-ID: <20250909161549.GC20205@horms.kernel.org> On Sat, Sep 06, 2025 at 12:35:37AM +0800, Vivian Wang wrote: > On 9/6/25 00:01, Simon Horman wrote: > > > On Fri, Sep 05, 2025 at 11:45:29PM +0800, Vivian Wang wrote: > > > > ... > > > > Hi Vivian, > > > >>>> + status = emac_rx_frame_status(priv, rx_desc); > >>>> + if (unlikely(status == RX_FRAME_DISCARD)) { > >>>> + ndev->stats.rx_dropped++; > >>> As per the comment in struct net-device, > >>> ndev->stats should not be used in modern drivers. > >>> > >>> Probably you want to implement NETDEV_PCPU_STAT_TSTATS. > >>> > >>> Sorry for not mentioning this in an earlier review of > >>> stats in this driver. 
> >>> > >> On a closer look, these counters in ndev->stats seems to be redundant > >> with the hardware-tracked statistics, so maybe I should just not bother > >> with updating ndev->stats. Does that make sense? > > For rx/tx packets/bytes I think that makes sense. > > But what about rx/tx drops? > > Right... but tstats doesn't have *_dropped. It seems that tx_dropped and > rx_dropped are considered "slow path" for real devices. It makes sense > to me that those should be very rare. > > So it seems that what I should do is to just track tx_dropped and > rx_dropped myself in a member in emac_priv and report in the > ndo_get_stats64 callback, and use the hardware stuff for the rest, as > implemented now. Thanks, that makes sense to me. From horms at kernel.org Tue Sep 9 09:18:26 2025 From: horms at kernel.org (Simon Horman) Date: Tue, 9 Sep 2025 17:18:26 +0100 Subject: [PATCH net-next v9 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250909161549.GC20205@horms.kernel.org> References: <20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn> <20250905-net-k1-emac-v9-2-f1649b98a19c@iscas.ac.cn> <20250905153500.GH553991@horms.kernel.org> <0605f176-5cdb-4f5b-9a6b-afa139c96732@iscas.ac.cn> <20250905160158.GI553991@horms.kernel.org> <45053235-3b01-42d8-98aa-042681104d11@iscas.ac.cn> <20250909161549.GC20205@horms.kernel.org> Message-ID: <20250909161826.GA23218@horms.kernel.org> On Tue, Sep 09, 2025 at 05:15:56PM +0100, Simon Horman wrote: > On Sat, Sep 06, 2025 at 12:35:37AM +0800, Vivian Wang wrote: > > On 9/6/25 00:01, Simon Horman wrote: > > > > > On Fri, Sep 05, 2025 at 11:45:29PM +0800, Vivian Wang wrote: > > > > > > ... > > > > > > Hi Vivian, > > > > > >>>> + status = emac_rx_frame_status(priv, rx_desc); > > >>>> + if (unlikely(status == RX_FRAME_DISCARD)) { > > >>>> + ndev->stats.rx_dropped++; > > >>> As per the comment in struct net-device, > > >>> ndev->stats should not be used in modern drivers. 
> > >>> > > >>> Probably you want to implement NETDEV_PCPU_STAT_TSTATS. > > >>> > > >>> Sorry for not mentioning this in an earlier review of > > >>> stats in this driver. > > >>> > > >> On a closer look, these counters in ndev->stats seems to be redundant > > >> with the hardware-tracked statistics, so maybe I should just not bother > > >> with updating ndev->stats. Does that make sense? > > > For rx/tx packets/bytes I think that makes sense. > > > But what about rx/tx drops? > > > > Right... but tstats doesn't have *_dropped. It seems that tx_dropped and > > rx_dropped are considered "slow path" for real devices. It makes sense > > to me that those should be very rare. > > > > So it seems that what I should do is to just track tx_dropped and > > rx_dropped myself in a member in emac_priv and report in the > > ndo_get_stats64 callback, and use the hardware stuff for the rest, as > > implemented now. > > Thanks, that makes sense to me. Oops, for the 2nd time today I see I've responded where others have already done so. In this case, please take Jakub's advice elsewhere in this thread. From brgl at bgdev.pl Tue Sep 9 09:20:07 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Tue, 9 Sep 2025 18:20:07 +0200 Subject: [PATCH 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-13-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 5:26 PM Andy Shevchenko wrote: > > On Tue, Sep 9, 2025 at 6:15 PM Andy Shevchenko > wrote: > > On Tue, Sep 9, 2025 at 5:05 PM Bartosz Golaszewski wrote: > > > On Tue, Sep 9, 2025 at 4:02 PM Andy Shevchenko > > > wrote: > > > > On Tue, Sep 09, 2025 at 03:56:41PM +0200, Bartosz Golaszewski wrote: > > > > > On Tue, Sep 9, 2025 at 3:47 PM Andy Shevchenko > > > > > wrote: > > ... > > > > > > > TBH, I think those 6 all made the same mistake, i.e. thinking of the compound > > > > > > literal as a cast.
Which is not! > > > > > > > > > > What do you suggest? > > > > > > > > Write it in less odd way :-) > > > > > > > > foo = (struct bar) { ... }; > > > > > > I don't get your reasoning. typeof() itself is well established in the > > > kernel and doesn't > > > > > > foo = (struct bar){ ... }; > > > > > > evaluate to the same thing as > > > > > > foo = (typeof(foo)){ ... }; > > > > > > ? Isn't it still the same compound literal? > > > > It makes it so, but typeof() usually is used for casts and not for > > compound literals. That's (usage typeof() for compound literals) what > > I am against in this case. > > FWIW, brief googling showed that nobody (okay, I haven't found yet > reddit/SO/GCC or LLVM documentation) uses typeof() for compound > literals. So, this makes me feel right, that the form of typeof() is > weird and works due to unknown reasons. Any pointers to the > documentation you read about it? > Ok I'll change it. I also need to change it in existing patches that already landed in next then. > > > > > And are we not allowed to use C99 features now anyway? > > > > > > > > It's fine, it's not about the C standard number. > > E.g., https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcc/Compound-Literals.html > (8.1.0 is the same). > I get it, I understood incorrectly how they work, no need to rub it in. :) Bart From spriteovo at gmail.com Tue Sep 9 09:53:11 2025 From: spriteovo at gmail.com (Asuna Yang) Date: Tue, 09 Sep 2025 18:53:11 +0200 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds Message-ID: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> Commit 33549fcf37ec ("RISC-V: disallow gcc + rust builds") disabled GCC + Rust builds for RISC-V due to differences in extension handling compared to LLVM. Add a Kconfig symbol to indicate the version of libclang used by Rust bindgen and add conditions for the availability of libclang to the RISC-V extension Kconfig symbols that depend on the cc-option function. 
For the Zicsr/Zifencei special handling: since LLVM/Clang always enables these two extensions, either don't pass them to -march, or pass them explicitly, in which case the Rust bindgen libclang must recognize them. Clang does not support the -mno-riscv-attribute flag, so filter it out to resolve the error: unknown argument: '-mno-riscv-attribute'. Define BINDGEN_TARGET_riscv to pass the target triplet to the Rust bindgen libclang for RISC-V to resolve the error: unsupported argument 'medany' to option '-mcmodel=' for target 'unknown'. Output a clearer error message if the target triplet is undefined for the Rust bindgen libclang. Update the documentation: GCC + Rust builds are now supported. --- Discussion: https://lore.kernel.org/linux-riscv/68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com/ Patch v1: https://lore.kernel.org/linux-riscv/20250903190806.2604757-1-SpriteOvO@gmail.com/ GCC + Rust builds for RISC-V were disabled about a year ago due to differences in extension handling compared to LLVM, as discussed in https://lore.kernel.org/all/20240917000848.720765-1-jmontleo@redhat.com/ This patch re-enables GCC + Rust builds. Compared to v1, v2 reverts the separation of get-rust-bindgen-libclang script and improves Kconfig conditions based on Conor's review. The separation of get-rust-bindgen-libclang script is reverted based on the concerns raised by Miguel. However, it's worth noting that we now have 3 different places rust/Makefile scripts/{Kconfig.include,rust_is_available.sh} where manually calling bindgen rust_is_available_bindgen_libclang.h + sed to get the version of libclang, and in particular, for our newly added Kconfig symbol, we now use awk to canonicalize the version to an integer. I would still like to do the script separation later for better maintainability and readability if possible, which can be discussed further later when Miguel has time.
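As an illustration of the version canonicalization described above, the sed + awk step can be sketched in C. This is only a sketch under stated assumptions: the function name libclang_version_code is hypothetical, the kernel performs this step in shell inside scripts/Kconfig.include rather than in C, and this version matches the first "clang version " occurrence in the text, whereas the greedy sed expression in the patch may pick a later one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "clang version X.Y.Z" in the given text and
 * encode it as X * 10000 + Y * 100 + Z, the same arithmetic as the awk
 * expression in the patch. Returns -1 when no version can be parsed.
 */
static long libclang_version_code(const char *text)
{
	static const char marker[] = "clang version ";
	const char *p = strstr(text, marker);
	int major, minor, patch;

	if (!p)
		return -1;

	/* Skip the marker itself; sizeof() counts the trailing NUL. */
	p += sizeof(marker) - 1;

	if (sscanf(p, "%d.%d.%d", &major, &minor, &patch) != 3)
		return -1;

	return major * 10000L + minor * 100L + patch;
}
```

With this encoding, thresholds such as 140000 and 170000 in the Kconfig conditions of the patch correspond to libclang 14.0.0 and 17.0.0 respectively.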
Signed-off-by: Asuna Yang --- Documentation/rust/arch-support.rst | 2 +- arch/riscv/Kconfig | 30 +++++++++++++++++++++++++++++- init/Kconfig | 6 ++++++ rust/Makefile | 7 ++++++- scripts/Kconfig.include | 1 + 5 files changed, 43 insertions(+), 3 deletions(-) diff --git a/Documentation/rust/arch-support.rst b/Documentation/rust/arch-support.rst index 6e6a515d08991a130a8e79dc4ad7ad09da244020..5282e0e174e8de66b4c6fec354cf329fd2aec873 100644 --- a/Documentation/rust/arch-support.rst +++ b/Documentation/rust/arch-support.rst @@ -18,7 +18,7 @@ Architecture Level of support Constraints ``arm`` Maintained ARMv7 Little Endian only. ``arm64`` Maintained Little Endian only. ``loongarch`` Maintained \- -``riscv`` Maintained ``riscv64`` and LLVM/Clang only. +``riscv`` Maintained ``riscv64`` only. ``um`` Maintained \- ``x86`` Maintained ``x86_64`` only. ============= ================ ============================================== diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 51dcd8eaa24356d947ebe0f1c4a701a3cfc6b757..3e892864f930778218073e8ee5980eb8f4e1594a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -191,7 +191,7 @@ config RISCV select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RETHOOK if !XIP_KERNEL select HAVE_RSEQ - select HAVE_RUST if RUSTC_SUPPORTS_RISCV && CC_IS_CLANG + select HAVE_RUST if RUSTC_SUPPORTS_RISCV && TOOLCHAIN_MATCHES_ZICSR_ZIFENCEI select HAVE_SAMPLE_FTRACE_DIRECT select HAVE_SAMPLE_FTRACE_DIRECT_MULTI select HAVE_STACKPROTECTOR @@ -629,6 +629,8 @@ config TOOLCHAIN_HAS_V depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32imv) depends on LLD_VERSION >= 140000 || LD_VERSION >= 23800 depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/e6de53b4de4aecca4ac892500a0907805896ed27 + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 config RISCV_ISA_V bool "Vector extension support" @@ -693,6 +695,8 @@ config TOOLCHAIN_HAS_ZABHA depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zabha) depends 
on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zabha) depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/6b7444964a8d028989beee554a1f5c61d16a1cac + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 190100 config RISCV_ISA_ZABHA bool "Zabha extension support for atomic byte/halfword operations" @@ -711,6 +715,8 @@ config TOOLCHAIN_HAS_ZACAS depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zacas) depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zacas) depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/614aeda93b2225c6eb42b00ba189ba7ca2585c60 + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 200100 config RISCV_ISA_ZACAS bool "Zacas extension support for atomic CAS" @@ -730,6 +736,8 @@ config TOOLCHAIN_HAS_ZBB depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb) depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 # This symbol indicates that the toolchain supports all v1.0 vector crypto # extensions, including Zvk*, Zvbb, and Zvbc. LLVM added all of these at once. 
@@ -745,6 +753,8 @@ config TOOLCHAIN_HAS_ZBA depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zba) depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 config RISCV_ISA_ZBA bool "Zba extension support for bit manipulation instructions" @@ -780,6 +790,8 @@ config TOOLCHAIN_HAS_ZBC depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbc) depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/33d008b169f3c813a4a45da220d0952f795ac477 + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 config RISCV_ISA_ZBC bool "Zbc extension support for carry-less multiplication instructions" @@ -803,6 +815,8 @@ config TOOLCHAIN_HAS_ZBKB depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbkb) depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 depends on AS_HAS_OPTION_ARCH + # https://github.com/llvm/llvm-project/commit/7ee1c162cc53d37f717f9a138276ad64fa6863bc + depends on !RUST || RUST_BINDGEN_LIBCLANG_VERSION >= 140000 config RISCV_ISA_ZBKB bool "Zbkb extension support for bit manipulation instructions" @@ -890,6 +904,20 @@ config TOOLCHAIN_NEEDS_OLD_ISA_SPEC versions of clang and GCC to be passed to GAS, which has the same result as passing zicsr and zifencei to -march. 
+config TOOLCHAIN_MATCHES_ZICSR_ZIFENCEI + def_bool y + # https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e15bf8a16 + depends on TOOLCHAIN_NEEDS_OLD_ISA_SPEC || !TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI || RUST_BINDGEN_LIBCLANG_VERSION >= 170000 + help + LLVM/Clang >= 17.0.0 starts recognizing Zicsr/Zifencei in -march, passing + them to -march doesn't generate an error anymore, and passing them or not + doesn't have any real difference, it still follows ISA before version + 20190608 - Zicsr/Zifencei are included in base ISA. + + The current latest version of LLVM/Clang still does not require explicit + Zicsr/Zifencei to enable these two extensions, Clang just accepts them in + -march and then silently ignores them. + config FPU bool "FPU support" default y diff --git a/init/Kconfig b/init/Kconfig index e3eb63eadc8757a10b091c74bbee8008278c0521..0859d308a48591df769c7dbaef6f035324892bd3 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -82,6 +82,12 @@ config RUSTC_LLVM_VERSION int default $(rustc-llvm-version) +config RUST_BINDGEN_LIBCLANG_VERSION + int + default $(rustc-bindgen-libclang-version) + help + This is the version of `libclang` used by the Rust bindings generator. + config CC_CAN_LINK bool default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m64-flag)) if 64BIT diff --git a/rust/Makefile b/rust/Makefile index bfa915b0e58854045b367557342727fee4fe2808..8c6f84487c41880816d1e55ba4c0df0e5af4e8fd 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -290,20 +290,25 @@ bindgen_skip_c_flags := -mno-fp-ret-in-387 -mpreferred-stack-boundary=% \ -fno-inline-functions-called-once -fsanitize=bounds-strict \ -fstrict-flex-arrays=% -fmin-function-alignment=% \ -fzero-init-padding-bits=% -mno-fdpic \ - --param=% --param asan-% + --param=% --param asan-% -mno-riscv-attribute # Derived from `scripts/Makefile.clang`. 
BINDGEN_TARGET_x86 := x86_64-linux-gnu BINDGEN_TARGET_arm64 := aarch64-linux-gnu BINDGEN_TARGET_arm := arm-linux-gnueabi BINDGEN_TARGET_loongarch := loongarch64-linux-gnusf +BINDGEN_TARGET_riscv := riscv64-linux-gnu BINDGEN_TARGET_um := $(BINDGEN_TARGET_$(SUBARCH)) BINDGEN_TARGET := $(BINDGEN_TARGET_$(SRCARCH)) +ifeq ($(BINDGEN_TARGET),) +$(error add '--target=' option to rust/Makefile) +else # All warnings are inhibited since GCC builds are very experimental, # many GCC warnings are not supported by Clang, they may only appear in # some configurations, with new GCC versions, etc. bindgen_extra_c_flags = -w --target=$(BINDGEN_TARGET) +endif # Auto variable zero-initialization requires an additional special option with # clang that is going to be removed sometime in the future (likely in diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include index 33193ca6e8030e659d6b321acaea1acd42c387a4..00462b29030515fcaaa49613e87e2a33320468ae 100644 --- a/scripts/Kconfig.include +++ b/scripts/Kconfig.include @@ -67,6 +67,7 @@ m64-flag := $(cc-option-bit,-m64) rustc-version := $(shell,$(srctree)/scripts/rustc-version.sh $(RUSTC)) rustc-llvm-version := $(shell,$(srctree)/scripts/rustc-llvm-version.sh $(RUSTC)) +rustc-bindgen-libclang-version := $(shell,$(BINDGEN) $(srctree)/scripts/rust_is_available_bindgen_libclang.h 2>&1 | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p' | awk -F'.' 
'{print $1 * 10000 + $2 * 100 + $3}') # $(rustc-option,) # Return y if the Rust compiler supports , n otherwise --- base-commit: f777d1112ee597d7f7dd3ca232220873a34ad0c8 change-id: 20250909-gcc-rust-v2-7084003bc619 Best regards, -- Asuna Yang From miguel.ojeda.sandonis at gmail.com Tue Sep 9 10:12:06 2025 From: miguel.ojeda.sandonis at gmail.com (Miguel Ojeda) Date: Tue, 9 Sep 2025 19:12:06 +0200 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> References: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> Message-ID: On Tue, Sep 9, 2025 at 6:55 PM Asuna Yang wrote: > > The separation of get-rust-bindgen-libclang script is reverted based on the > concerns raised by Miguel. However, it's worth noting that we now have 3 > different places rust/Makefile scripts/{Kconfig.include,rust_is_available.sh} > where manually calling bindgen rust_is_available_bindgen_libclang.h + sed to get > the version of libclang, and in particular, for our newly added Kconfig symbol, > we now use awk to canonicalize the version to an integer. I would still like to > do the script separation later for better maintainability and readability if > possible, which can be discussed further later when Miguel has time. To clarify, since this probably targets the next cycle, there is time to discuss and get feedback to do whatever we feel it is best (personally, I can take a look after Kangrejos at some point). Is there a particular rush for this? Having said that, this v2 looks substantially simpler than v1, which is nice, and perhaps RISC-V wants to land it already. Up to them in that case. (I see the `ifeq ($(BINDGEN_TARGET),)` is still there -- in general I would suggest splitting things if they don't depend on each other, but it is not a huge deal. I would also probably have split the `rustc-bindgen-libclang-version` into its own, but at least that is a dependency). Thanks!
Cheers, Miguel From ajones at ventanamicro.com Tue Sep 9 10:12:59 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Tue, 9 Sep 2025 12:12:59 -0500 Subject: [PATCH V10 3/5] riscv: Add RISC-V Svrsw60t59b extension support In-Reply-To: <20250909095611.803898-4-zhangchunyan@iscas.ac.cn> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-4-zhangchunyan@iscas.ac.cn> Message-ID: <20250909-2130daabd7f57a8a357c677f@orel> On Tue, Sep 09, 2025 at 05:56:09PM +0800, Chunyan Zhang wrote: > The Svrsw60t59b extension allows to free the PTE reserved bits 60 > and 59 for software to use. > > Reviewed-by: Alexandre Ghiti > Signed-off-by: Chunyan Zhang > --- > arch/riscv/Kconfig | 14 ++++++++++++++ > arch/riscv/include/asm/hwcap.h | 1 + > arch/riscv/kernel/cpufeature.c | 1 + > 3 files changed, 16 insertions(+) > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index a4b233a0659e..d99df67cc7a4 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -862,6 +862,20 @@ config RISCV_ISA_ZICBOP > > If you don't know what to do here, say Y. > > +config RISCV_ISA_SVRSW60T59B > + bool "Svrsw60t59b extension support for using PTE bits 60 and 59" > + depends on MMU && 64BIT > + depends on RISCV_ALTERNATIVE > + default y > + help > + Adds support to dynamically detect the presence of the Svrsw60t59b > + extension and enable its usage. > + > + The Svrsw60t59b extension allows to free the PTE reserved bits 60 > + and 59 for software to use. > + > + If you don't know what to do here, say Y. 
> + > config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI > def_bool y > # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc > diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h > index affd63e11b0a..f98fcb5c17d5 100644 > --- a/arch/riscv/include/asm/hwcap.h > +++ b/arch/riscv/include/asm/hwcap.h > @@ -106,6 +106,7 @@ > #define RISCV_ISA_EXT_ZAAMO 97 > #define RISCV_ISA_EXT_ZALRSC 98 > #define RISCV_ISA_EXT_ZICBOP 99 > +#define RISCV_ISA_EXT_SVRSW60T59B 100 > > #define RISCV_ISA_EXT_XLINUXENVCFG 127 > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c > index 743d53415572..de29562096ff 100644 > --- a/arch/riscv/kernel/cpufeature.c > +++ b/arch/riscv/kernel/cpufeature.c > @@ -540,6 +540,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { > __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), > __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), > __RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC), > + __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B), svrsw60t59b should come before svvptc. See the ordering rule comment at the top of the array. 
Otherwise, Reviewed-by: Andrew Jones > }; > > const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext); > -- > 2.34.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv From wei.liu at kernel.org Tue Sep 9 10:20:34 2025 From: wei.liu at kernel.org (Wei Liu) Date: Tue, 9 Sep 2025 17:20:34 +0000 Subject: [PATCH v2 0/7] Drivers: hv: Fix NEED_RESCHED_LAZY and use common APIs In-Reply-To: References: <20250828000156.23389-1-seanjc@google.com> Message-ID: On Thu, Sep 04, 2025 at 10:39:37PM -0700, Sean Christopherson wrote: > On Thu, Sep 04, 2025, Wei Liu wrote: > > On Wed, Aug 27, 2025 at 05:01:49PM -0700, Sean Christopherson wrote: > > > Fix a bug where MSHV root partitions (and upper-level VTL code) don't honor > > > NEED_RESCHED_LAZY, and then deduplicate the TIF related MSHV code by turning > > > the "kvm" entry APIs into more generic "virt" APIs. > > > > > > This version is based on > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git hyperv-next > > > > > > in order to pickup the VTL changes that are queued for 6.18. I also > > > squashed the NEED_RESCHED_LAZY fixes for root and VTL modes into a single > > > patch, as it should be easy/straightforward to drop the VTL change as needed > > > if we want this in 6.17 or earlier. > > > > > > That effectively means the full series is dependent on the VTL changes being > > > fully merged for 6.18. But I think that's ok as it's really only the MSHV > > > changes that have any urgency whatsoever, and I assume that Microsoft is > > > the only user that truly cares about the MSHV root fix. I.e. if the whole > > > thing gets delayed, I think it's only the Hyper-V folks that are impacted. > > > > > > I have no preference what tree this goes through, or when, and can respin > > > and/or split as needed. 
> > > > > > As with v1, the Hyper-V stuff and non-x86 architectures are compile-tested > > > only. > > > > > > v2: > > > - Rebase on hyperv-next. > > > - Fix and converge the VTL code as well. [Peter, Nuno] > > > > > > v1: https://lore.kernel.org/all/20250825200622.3759571-1-seanjc at google.com > > > > > > > I dropped the mshv_vtl changes in this series and applied the rest > > (including the KVM changes) to hyperv-next. > > mshv_do_pre_guest_mode_work() ended up getting left behind since its removal was > in the last mshv_vtl patch. > > $ git grep mshv_do_pre_guest_mode_work > drivers/hv/mshv.h:int mshv_do_pre_guest_mode_work(ulong th_flags); > drivers/hv/mshv_common.c:int mshv_do_pre_guest_mode_work(ulong th_flags) > drivers/hv/mshv_common.c:EXPORT_SYMBOL_GPL(mshv_do_pre_guest_mode_work); > > Want to squash this into 3786d7d6b3c0 ("mshv: Use common "entry virt" APIs to do > work in root before running guest")? > It's done. Thanks for pointing it out. Wei From spriteovo at gmail.com Tue Sep 9 10:26:33 2025 From: spriteovo at gmail.com (Asuna) Date: Wed, 10 Sep 2025 01:26:33 +0800 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds In-Reply-To: References: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> Message-ID: On 9/10/25 1:12 AM, Miguel Ojeda wrote: > To clarify, since this probably targets the next cycle, there is time > to discuss and get feedback to do whatever we feel it is best > (personally, I can take a look after Kangrejos at some point). Is > there a particular rush for this? Nah, no rush, as long as it works so that RISC-V distros can deliver Rust-based components, that's the main issue I'm trying to address. 
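For readers following the "[PATCH v2] RISC-V: re-enable gcc + rust builds" thread, the version canonicalization it discusses (sed to extract the libclang version, awk to turn it into an integer for Kconfig comparisons such as `>= 170000`) can be sketched roughly as below. This is only an illustration under stated assumptions: the function name `canonicalize_version` and the sample banner string are hypothetical and do not appear in the patch, which feeds the output of `$(BINDGEN)` into the same sed and awk expressions.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): convert "MAJOR.MINOR.PATCH"
# into the integer MAJOR * 10000 + MINOR * 100 + PATCH, using the same awk
# expression as the rustc-bindgen-libclang-version macro.
canonicalize_version() {
	echo "$1" | awk -F'.' '{print $1 * 10000 + $2 * 100 + $3}'
}

# Assumed sample input, shaped like a libclang version banner.
banner='clang version 17.0.6'

# Same sed expression as in the patch: keep only the dotted version number.
version=$(echo "$banner" | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p')

canonicalize_version "$version"
```

With this encoding, a dependency such as `RUST_BINDGEN_LIBCLANG_VERSION >= 140000` reads as "libclang 14.0.0 or newer", and the comparison stays a plain integer test inside Kconfig.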
From conor at kernel.org Tue Sep 9 12:58:17 2025 From: conor at kernel.org (Conor Dooley) Date: Tue, 9 Sep 2025 20:58:17 +0100 Subject: [PATCH v1] riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding Message-ID: <20250909-frown-wrinkle-f16df243a970@spud> From: Conor Dooley The i2c controller binding does not permit the node name to contain "gpio", resulting in two warnings: i2c-gpio-0 (i2c-gpio): $nodename:0: 'i2c-gpio-0' does not match '^i2c(@.+|-[a-z0-9]+)?$' i2c-gpio-0 (i2c-gpio): Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'adc at 54' were unexpected) Drop it to satisfy dtbs_check. Signed-off-by: Conor Dooley --- CC: Rob Herring CC: Krzysztof Kozlowski CC: Conor Dooley CC: Chen-Yu Tsai CC: Jernej Skrabec CC: Samuel Holland CC: devicetree at vger.kernel.org CC: linux-arm-kernel at lists.infradead.org CC: linux-sunxi at lists.linux.dev CC: linux-riscv at lists.infradead.org CC: linux-kernel at vger.kernel.org --- arch/riscv/boot/dts/allwinner/sun20i-d1-devterm-v3.14.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/boot/dts/allwinner/sun20i-d1-devterm-v3.14.dts b/arch/riscv/boot/dts/allwinner/sun20i-d1-devterm-v3.14.dts index bc5c84f227622..5f2e5cc3e3d55 100644 --- a/arch/riscv/boot/dts/allwinner/sun20i-d1-devterm-v3.14.dts +++ b/arch/riscv/boot/dts/allwinner/sun20i-d1-devterm-v3.14.dts @@ -17,7 +17,7 @@ fan { #cooling-cells = <2>; }; - i2c-gpio-0 { + i2c-0 { compatible = "i2c-gpio"; sda-gpios = <&pio 3 14 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>; /* PD14/GPIO44 */ scl-gpios = <&pio 3 15 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>; /* PD15/GPIO45 */ -- 2.47.2 From samuel.holland at sifive.com Tue Sep 9 15:41:27 2025 From: samuel.holland at sifive.com (Samuel Holland) Date: Tue, 9 Sep 2025 15:41:27 -0700 Subject: [PATCH] cache: sifive_ccache: Optimize cache flushes Message-ID: <20250909224131.276800-1-samuel.holland@sifive.com> Fence instructions are required only at the beginning and the
end of a flush operation, not separately for each cache line being flushed. Speed up cache flushes by about 15% by removing the extra fences. Signed-off-by: Samuel Holland --- drivers/cache/sifive_ccache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/cache/sifive_ccache.c b/drivers/cache/sifive_ccache.c index e1a283805ea7f..a86800b123b9e 100644 --- a/drivers/cache/sifive_ccache.c +++ b/drivers/cache/sifive_ccache.c @@ -151,16 +151,16 @@ static void ccache_flush_range(phys_addr_t start, size_t len) if (!len) return; - mb(); + mb(); /* complete earlier memory accesses before the cache flush */ for (line = ALIGN_DOWN(start, SIFIVE_CCACHE_LINE_SIZE); line < end; line += SIFIVE_CCACHE_LINE_SIZE) { #ifdef CONFIG_32BIT - writel(line >> 4, ccache_base + SIFIVE_CCACHE_FLUSH32); + writel_relaxed(line >> 4, ccache_base + SIFIVE_CCACHE_FLUSH32); #else - writeq(line, ccache_base + SIFIVE_CCACHE_FLUSH64); + writeq_relaxed(line, ccache_base + SIFIVE_CCACHE_FLUSH64); #endif - mb(); } + mb(); /* issue later memory accesses after the cache flush */ } static const struct riscv_nonstd_cache_ops ccache_mgmt_ops __initconst = { -- 2.47.2 base-commit: 9dd1835ecda5b96ac88c166f4a87386f3e727bd9 branch: up/ccache-flush-opt From unicornxw at gmail.com Tue Sep 9 19:07:11 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:07:11 +0800 Subject: [PATCH v2 0/7] Add PCIe support to Sophgo SG2042 SoC Message-ID: From: Chen Wang Sophgo's SG2042 SoC uses Cadence PCIe core to implement RC mode. This is a completely rewritten PCIe driver for SG2042. It inherits some previously submitted patch codes (not merged into the upstream mainline), but the biggest difference is that the support for compatibility with old 32-bit PCIe devices has been removed in this new version. 
This is because after discussing with community users, we felt that there was not much demand for support for old devices, so we made a new design based on the simplified design and practical needs. If someone really needs to play with old devices, we can provide them with some necessary hack patches in the downstream repository. Since the new design is quite different from the old code, I will release it as a new patch series. The old patch series can be found in here [old-series]. Note, regarding [2/7] of this patchset, this fix is introduced because the pcie->ops pointer is not filled in SG2042 PCIe driver. This is not a must-have parameter, if we use it w/o checking will cause a null pointer access error during runtime. Link: https://lore.kernel.org/linux-riscv/cover.1736923025.git.unicorn_wang at outlook.com/ [old-series] Thanks, Chen --- Changes in v2: This patchset is based on v6.17-rc1. Fixed following issues based on feedbacks from Rob Herring, Manivannan Sadhasivam, Bjorn Helgaas, ALOK TIWARI, thanks. - Driver binding: - Removed vendor-id/device-id from "required" property. - Improve drivers code: - Have separated pci_ops for the root bus and child buses. - Make the driver tristate and as a module. - Change the configuration name from PCIE_SG2042 to PCIE_SG2042_HOST. - Removed "Fixes" tag from commit [2/7], since this is not for an existing bug fix. - Other code cleanups and optimizations - DT: - Add PCIe support for SG2042 EVB boards. Changes in v1: The patch series is based on v6.17-rc1. You can simply review or test the patches at the link [1]. Link: https://lore.kernel.org/linux-riscv/cover.1756344464.git.unicorn_wang at outlook.com/ [1] --- Chen Wang (7): dt-bindings: pci: Add Sophgo SG2042 PCIe host PCI: cadence: Check pcie-ops before using it. 
PCI: sg2042: Add Sophgo SG2042 PCIe driver riscv: sophgo: dts: add PCIe controllers for SG2042 riscv: sophgo: dts: enable PCIe for PioneerBox riscv: sophgo: dts: enable PCIe for SG2042_EVB_V1.X riscv: sophgo: dts: enable PCIe for SG2042_EVB_V2.0 .../bindings/pci/sophgo,sg2042-pcie-host.yaml | 64 +++++++++++ arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts | 12 ++ arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts | 12 ++ .../boot/dts/sophgo/sg2042-milkv-pioneer.dts | 12 ++ arch/riscv/boot/dts/sophgo/sg2042.dtsi | 88 +++++++++++++++ drivers/pci/controller/cadence/Kconfig | 10 ++ drivers/pci/controller/cadence/Makefile | 1 + .../controller/cadence/pcie-cadence-host.c | 2 +- drivers/pci/controller/cadence/pcie-cadence.c | 4 +- drivers/pci/controller/cadence/pcie-cadence.h | 6 +- drivers/pci/controller/cadence/pcie-sg2042.c | 104 ++++++++++++++++++ 11 files changed, 309 insertions(+), 6 deletions(-) create mode 100644 Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:07:57 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:07:57 +0800 Subject: [PATCH v2 1/7] dt-bindings: pci: Add Sophgo SG2042 PCIe host In-Reply-To: References: Message-ID: <2755f145755b6096247c26852b63671a6fea4dbf.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Add binding for Sophgo SG2042 PCIe host controller. 
Reviewed-by: Rob Herring (Arm) Signed-off-by: Chen Wang --- .../bindings/pci/sophgo,sg2042-pcie-host.yaml | 64 +++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml diff --git a/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml b/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml new file mode 100644 index 000000000000..f8b7ca57fff1 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml @@ -0,0 +1,64 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/pci/sophgo,sg2042-pcie-host.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Sophgo SG2042 PCIe Host (Cadence PCIe Wrapper) + +description: + Sophgo SG2042 PCIe host controller is based on the Cadence PCIe core. + +maintainers: + - Chen Wang + +properties: + compatible: + const: sophgo,sg2042-pcie-host + + reg: + maxItems: 2 + + reg-names: + items: + - const: reg + - const: cfg + + vendor-id: + const: 0x1f1c + + device-id: + const: 0x2042 + + msi-parent: true + +allOf: + - $ref: cdns-pcie-host.yaml# + +required: + - compatible + - reg + - reg-names + +unevaluatedProperties: false + +examples: + - | + #include + + pcie at 62000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x62000000 0x00800000>, + <0x48000000 0x00001000>; + reg-names = "reg", "cfg"; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x81000000 0 0x00000000 0xde000000 0 0x00010000>, + <0x82000000 0 0xd0400000 0xd0400000 0 0x0d000000>; + bus-range = <0x00 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + }; -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:08:16 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:08:16 +0800 Subject: [PATCH v2 2/7] PCI: cadence: Check pcie-ops before using it. 
In-Reply-To: References: Message-ID: <18aba25b853d00caf10cc784093c0b91fdc1747d.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang ops of struct cdns_pcie may be NULL, direct use will result in a null pointer error. Add checking of pcie->ops before using it for new driver that may not supply pcie->ops. Signed-off-by: Chen Wang --- drivers/pci/controller/cadence/pcie-cadence-host.c | 2 +- drivers/pci/controller/cadence/pcie-cadence.c | 4 ++-- drivers/pci/controller/cadence/pcie-cadence.h | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c index 59a4631de79f..fffd63d6665e 100644 --- a/drivers/pci/controller/cadence/pcie-cadence-host.c +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c @@ -531,7 +531,7 @@ static int cdns_pcie_host_init_address_translation(struct cdns_pcie_rc *rc) cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_PCI_ADDR1(0), addr1); cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(0), desc1); - if (pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(12) | diff --git a/drivers/pci/controller/cadence/pcie-cadence.c b/drivers/pci/controller/cadence/pcie-cadence.c index 70a19573440e..61806bbd8aa3 100644 --- a/drivers/pci/controller/cadence/pcie-cadence.c +++ b/drivers/pci/controller/cadence/pcie-cadence.c @@ -92,7 +92,7 @@ void cdns_pcie_set_outbound_region(struct cdns_pcie *pcie, u8 busnr, u8 fn, cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(r), desc1); /* Set the CPU address */ - if (pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(nbits) | @@ -123,7 +123,7 @@ void cdns_pcie_set_outbound_region_for_normal_msg(struct cdns_pcie *pcie, } /* Set the CPU address */ - if 
(pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(17) | diff --git a/drivers/pci/controller/cadence/pcie-cadence.h b/drivers/pci/controller/cadence/pcie-cadence.h index 1d81c4bf6c6d..2f07ba661bda 100644 --- a/drivers/pci/controller/cadence/pcie-cadence.h +++ b/drivers/pci/controller/cadence/pcie-cadence.h @@ -468,7 +468,7 @@ static inline u32 cdns_pcie_ep_fn_readl(struct cdns_pcie *pcie, u8 fn, u32 reg) static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) { - if (pcie->ops->start_link) + if (pcie->ops && pcie->ops->start_link) return pcie->ops->start_link(pcie); return 0; @@ -476,13 +476,13 @@ static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) static inline void cdns_pcie_stop_link(struct cdns_pcie *pcie) { - if (pcie->ops->stop_link) + if (pcie->ops && pcie->ops->stop_link) pcie->ops->stop_link(pcie); } static inline bool cdns_pcie_link_up(struct cdns_pcie *pcie) { - if (pcie->ops->link_up) + if (pcie->ops && pcie->ops->link_up) return pcie->ops->link_up(pcie); return true; -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:08:39 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:08:39 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: References: Message-ID: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Add support for PCIe controller in SG2042 SoC. The controller uses the Cadence PCIe core programmed by pcie-cadence*.c. The PCIe controller will work in host mode only, supporting data rate(gen4) and lanes(x16 or x8). 
Signed-off-by: Chen Wang --- drivers/pci/controller/cadence/Kconfig | 10 ++ drivers/pci/controller/cadence/Makefile | 1 + drivers/pci/controller/cadence/pcie-sg2042.c | 104 +++++++++++++++++++ 3 files changed, 115 insertions(+) create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c diff --git a/drivers/pci/controller/cadence/Kconfig b/drivers/pci/controller/cadence/Kconfig index 666e16b6367f..02a639e55fd8 100644 --- a/drivers/pci/controller/cadence/Kconfig +++ b/drivers/pci/controller/cadence/Kconfig @@ -42,6 +42,15 @@ config PCIE_CADENCE_PLAT_EP endpoint mode. This PCIe controller may be embedded into many different vendors SoCs. +config PCIE_SG2042_HOST + tristate "Sophgo SG2042 PCIe controller (host mode)" + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) + select PCIE_CADENCE_HOST + help + Say Y here if you want to support the Sophgo SG2042 PCIe platform + controller in host mode. Sophgo SG2042 PCIe controller uses Cadence + PCIe core. + config PCI_J721E tristate select PCIE_CADENCE_HOST if PCI_J721E_HOST != n @@ -67,4 +76,5 @@ config PCI_J721E_EP Say Y here if you want to support the TI J721E PCIe platform controller in endpoint mode. TI J721E PCIe controller uses Cadence PCIe core. 
+ endmenu diff --git a/drivers/pci/controller/cadence/Makefile b/drivers/pci/controller/cadence/Makefile index 9bac5fb2f13d..5e23f8539ecc 100644 --- a/drivers/pci/controller/cadence/Makefile +++ b/drivers/pci/controller/cadence/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o obj-$(CONFIG_PCIE_CADENCE_PLAT) += pcie-cadence-plat.o obj-$(CONFIG_PCI_J721E) += pci-j721e.o +obj-$(CONFIG_PCIE_SG2042_HOST) += pcie-sg2042.o diff --git a/drivers/pci/controller/cadence/pcie-sg2042.c b/drivers/pci/controller/cadence/pcie-sg2042.c new file mode 100644 index 000000000000..c026e1ca5d6e --- /dev/null +++ b/drivers/pci/controller/cadence/pcie-sg2042.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * pcie-sg2042 - PCIe controller driver for Sophgo SG2042 SoC + * + * Copyright (C) 2025 Sophgo Technology Inc. + * Copyright (C) 2025 Chen Wang + */ + +#include +#include +#include +#include + +#include "pcie-cadence.h" + +/* + * SG2042 only supports 4-byte aligned access, so for the rootbus (i.e. to + * read/write the Root Port itself, read32/write32 is required. For + * non-rootbus (i.e. to read/write the PCIe peripheral registers, supports + * 1/2/4 byte aligned access, so directly using read/write should be fine. 
+ */ + +static struct pci_ops sg2042_pcie_root_ops = { + .map_bus = cdns_pci_map_bus, + .read = pci_generic_config_read32, + .write = pci_generic_config_write32, +}; + +static struct pci_ops sg2042_pcie_child_ops = { + .map_bus = cdns_pci_map_bus, + .read = pci_generic_config_read, + .write = pci_generic_config_write, +}; + +static int sg2042_pcie_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct pci_host_bridge *bridge; + struct cdns_pcie *pcie; + struct cdns_pcie_rc *rc; + int ret; + + bridge = devm_pci_alloc_host_bridge(dev, sizeof(*rc)); + if (!bridge) { + dev_err_probe(dev, -ENOMEM, "Failed to alloc host bridge!\n"); + return -ENOMEM; + } + + bridge->ops = &sg2042_pcie_root_ops; + bridge->child_ops = &sg2042_pcie_child_ops; + + rc = pci_host_bridge_priv(bridge); + pcie = &rc->pcie; + pcie->dev = dev; + + platform_set_drvdata(pdev, pcie); + + pm_runtime_set_active(dev); + pm_runtime_no_callbacks(dev); + devm_pm_runtime_enable(dev); + + ret = cdns_pcie_init_phy(dev, pcie); + if (ret) { + dev_err_probe(dev, ret, "Failed to init phy!\n"); + return ret; + } + + ret = cdns_pcie_host_setup(rc); + if (ret) { + dev_err_probe(dev, ret, "Failed to setup host!\n"); + cdns_pcie_disable_phy(pcie); + return ret; + } + + return 0; +} + +static void sg2042_pcie_remove(struct platform_device *pdev) +{ + struct cdns_pcie *pcie = platform_get_drvdata(pdev); + + cdns_pcie_disable_phy(pcie); +} + +static const struct of_device_id sg2042_pcie_of_match[] = { + { .compatible = "sophgo,sg2042-pcie-host" }, + {}, +}; +MODULE_DEVICE_TABLE(of, sg2042_pcie_of_match); + +static struct platform_driver sg2042_pcie_driver = { + .driver = { + .name = "sg2042-pcie", + .of_match_table = sg2042_pcie_of_match, + .pm = &cdns_pcie_pm_ops, + }, + .probe = sg2042_pcie_probe, + .remove = sg2042_pcie_remove, +}; +module_platform_driver(sg2042_pcie_driver); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("PCIe controller driver for SG2042 SoCs"); +MODULE_AUTHOR("Chen Wang 
"); -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:09:13 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:09:13 +0800 Subject: [PATCH v2 4/7] riscv: sophgo: dts: add PCIe controllers for SG2042 In-Reply-To: References: Message-ID: <5cecf3c854253e508a88995011dd4631fa0c6eae.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Add PCIe controller nodes in DTS for Sophgo SG2042. Default they are disabled. Signed-off-by: Inochi Amaoto Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042.dtsi | 88 ++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042.dtsi b/arch/riscv/boot/dts/sophgo/sg2042.dtsi index b3e4d3c18fdc..b521f674283e 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2042.dtsi @@ -220,6 +220,94 @@ clkgen: clock-controller at 7030012000 { #clock-cells = <1>; }; + pcie_rc0: pcie at 7060000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x60000000 0x0 0x00800000>, + <0x40 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <0>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0000000 0x40 0xc0000000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x40 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x40 0xe0000000 0x0 0x20000000>, + <0x43000000 0x42 0x00000000 0x42 0x00000000 0x2 0x00000000>, + <0x03000000 0x41 0x00000000 0x41 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + + pcie_rc1: pcie at 7060800000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x60800000 0x0 0x00800000>, + <0x44 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <1>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0400000 
0x44 0xc0400000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x44 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x44 0xe0000000 0x0 0x20000000>, + <0x43000000 0x46 0x00000000 0x46 0x00000000 0x2 0x00000000>, + <0x03000000 0x45 0x00000000 0x45 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + + pcie_rc2: pcie at 7062000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x62000000 0x0 0x00800000>, + <0x48 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <2>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0800000 0x48 0xc0800000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x48 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x48 0xe0000000 0x0 0x20000000>, + <0x03000000 0x49 0x00000000 0x49 0x00000000 0x1 0x00000000>, + <0x43000000 0x4a 0x00000000 0x4a 0x00000000 0x2 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + + pcie_rc3: pcie at 7062800000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x62800000 0x0 0x00800000>, + <0x4c 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <3>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0c00000 0x4c 0xc0c00000 0x0 0x00400000>, + <0x42000000 0x0 0xf8000000 0x4c 0xf8000000 0x0 0x04000000>, + <0x02000000 0x0 0xfc000000 0x4c 0xfc000000 0x0 0x04000000>, + <0x43000000 0x4e 0x00000000 0x4e 0x00000000 0x2 0x00000000>, + <0x03000000 0x4d 0x00000000 0x4d 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + clint_mswi: interrupt-controller at 
7094000000 { compatible = "sophgo,sg2042-aclint-mswi", "thead,c900-aclint-mswi"; reg = <0x00000070 0x94000000 0x00000000 0x00004000>; -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:09:38 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:09:38 +0800 Subject: [PATCH v2 5/7] riscv: sophgo: dts: enable PCIe for PioneerBox In-Reply-To: References: Message-ID: <4e885f30470ea07f499c9a83ab5dd327e00774ca.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Enable PCIe controllers for PioneerBox, which uses SG2042 SoC. Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts b/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts index ef3a602172b1..c4d5f8d7d4ad 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts @@ -128,6 +128,18 @@ uart0-rx-pins { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + +&pcie_rc3 { + status = "okay"; +}; + &sd { pinctrl-0 = <&sd_cfg>; pinctrl-names = "default"; -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:10:01 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:10:01 +0800 Subject: [PATCH v2 6/7] riscv: sophgo: dts: enable PCIe for SG2042_EVB_V1.X In-Reply-To: References: Message-ID: <2d85c8b221bf4aceae6f3dfaef6d53221daf7e70.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Enable PCIe controllers for Sophgo SG2042_EVB_V1.X board, which uses SG2042 SoC. 
Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts b/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts index 3320bc1dd2c6..a186d036cf36 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts @@ -164,6 +164,18 @@ phy0: phy@0 { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc1 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + &pinctrl { emmc_cfg: sdhci-emmc-cfg { sdhci-emmc-wp-pins { -- 2.34.1 From unicornxw at gmail.com Tue Sep 9 19:10:20 2025 From: unicornxw at gmail.com (Chen Wang) Date: Wed, 10 Sep 2025 10:10:20 +0800 Subject: [PATCH v2 7/7] riscv: sophgo: dts: enable PCIe for SG2042_EVB_V2.0 In-Reply-To: References: Message-ID: <023eb6dbd2d9d808c3992e954ad7eb3840da8260.1757467895.git.unicorn_wang@outlook.com> From: Chen Wang Enable PCIe controllers for Sophgo SG2042_EVB_V2.0 board, which uses SG2042 SoC.
Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts b/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts index 46980e41b886..0cd0dc0f537c 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts @@ -152,6 +152,18 @@ phy0: phy@0 { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc1 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + &pinctrl { emmc_cfg: sdhci-emmc-cfg { sdhci-emmc-wp-pins { -- 2.34.1 From inochiama at gmail.com Tue Sep 9 19:56:23 2025 From: inochiama at gmail.com (Inochi Amaoto) Date: Wed, 10 Sep 2025 10:56:23 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> Message-ID: On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: > From: Chen Wang > > Add support for PCIe controller in SG2042 SoC. The controller > uses the Cadence PCIe core programmed by pcie-cadence*.c. The > PCIe controller will work in host mode only, supporting data > rate(gen4) and lanes(x16 or x8). > > Signed-off-by: Chen Wang > --- > drivers/pci/controller/cadence/Kconfig | 10 ++ > drivers/pci/controller/cadence/Makefile | 1 + > drivers/pci/controller/cadence/pcie-sg2042.c | 104 +++++++++++++++++++ > 3 files changed, 115 insertions(+) > create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c > > diff --git a/drivers/pci/controller/cadence/Kconfig b/drivers/pci/controller/cadence/Kconfig > index 666e16b6367f..02a639e55fd8 100644 > --- a/drivers/pci/controller/cadence/Kconfig > +++ b/drivers/pci/controller/cadence/Kconfig > @@ -42,6 +42,15 @@ config PCIE_CADENCE_PLAT_EP > endpoint mode.
This PCIe controller may be embedded into many > different vendors SoCs. > > +config PCIE_SG2042_HOST > + tristate "Sophgo SG2042 PCIe controller (host mode)" > + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) > + select PCIE_CADENCE_HOST > + help > + Say Y here if you want to support the Sophgo SG2042 PCIe platform > + controller in host mode. Sophgo SG2042 PCIe controller uses Cadence > + PCIe core. > + > config PCI_J721E > tristate > select PCIE_CADENCE_HOST if PCI_J721E_HOST != n > @@ -67,4 +76,5 @@ config PCI_J721E_EP > Say Y here if you want to support the TI J721E PCIe platform > controller in endpoint mode. TI J721E PCIe controller uses Cadence PCIe > core. > + > endmenu > diff --git a/drivers/pci/controller/cadence/Makefile b/drivers/pci/controller/cadence/Makefile > index 9bac5fb2f13d..5e23f8539ecc 100644 > --- a/drivers/pci/controller/cadence/Makefile > +++ b/drivers/pci/controller/cadence/Makefile > @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o > obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o > obj-$(CONFIG_PCIE_CADENCE_PLAT) += pcie-cadence-plat.o > obj-$(CONFIG_PCI_J721E) += pci-j721e.o > +obj-$(CONFIG_PCIE_SG2042_HOST) += pcie-sg2042.o > diff --git a/drivers/pci/controller/cadence/pcie-sg2042.c b/drivers/pci/controller/cadence/pcie-sg2042.c > new file mode 100644 > index 000000000000..c026e1ca5d6e > --- /dev/null > +++ b/drivers/pci/controller/cadence/pcie-sg2042.c > @@ -0,0 +1,104 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * pcie-sg2042 - PCIe controller driver for Sophgo SG2042 SoC > + * > + * Copyright (C) 2025 Sophgo Technology Inc. > + * Copyright (C) 2025 Chen Wang > + */ > + > +#include > +#include > +#include > +#include > + > +#include "pcie-cadence.h" > + > +/* > + * SG2042 only supports 4-byte aligned access, so for the rootbus (i.e. to > + * read/write the Root Port itself, read32/write32 is required. For > + * non-rootbus (i.e. 
to read/write the PCIe peripheral registers, supports > + * 1/2/4 byte aligned access, so directly using read/write should be fine. > + */ > + > +static struct pci_ops sg2042_pcie_root_ops = { > + .map_bus = cdns_pci_map_bus, > + .read = pci_generic_config_read32, > + .write = pci_generic_config_write32, > +}; > + > +static struct pci_ops sg2042_pcie_child_ops = { > + .map_bus = cdns_pci_map_bus, > + .read = pci_generic_config_read, > + .write = pci_generic_config_write, > +}; > + > +static int sg2042_pcie_probe(struct platform_device *pdev) > +{ > + struct device *dev = &pdev->dev; > + struct pci_host_bridge *bridge; > + struct cdns_pcie *pcie; > + struct cdns_pcie_rc *rc; > + int ret; > + > + bridge = devm_pci_alloc_host_bridge(dev, sizeof(*rc)); > + if (!bridge) { > + dev_err_probe(dev, -ENOMEM, "Failed to alloc host bridge!\n"); > + return -ENOMEM; > + } > + > + bridge->ops = &sg2042_pcie_root_ops; > + bridge->child_ops = &sg2042_pcie_child_ops; > + > + rc = pci_host_bridge_priv(bridge); > + pcie = &rc->pcie; > + pcie->dev = dev; > + > + platform_set_drvdata(pdev, pcie); > + > + pm_runtime_set_active(dev); > + pm_runtime_no_callbacks(dev); > + devm_pm_runtime_enable(dev); > + > + ret = cdns_pcie_init_phy(dev, pcie); > + if (ret) { > + dev_err_probe(dev, ret, "Failed to init phy!\n"); > + return ret; > + } > + > + ret = cdns_pcie_host_setup(rc); > + if (ret) { > + dev_err_probe(dev, ret, "Failed to setup host!\n"); > + cdns_pcie_disable_phy(pcie); > + return ret; > + } > + > + return 0; > +} > + > +static void sg2042_pcie_remove(struct platform_device *pdev) > +{ > + struct cdns_pcie *pcie = platform_get_drvdata(pdev); > + > + cdns_pcie_disable_phy(pcie); > +} > + I think this remove is useless, as it is almost impossible to remove a pcie at runtime. 
Regards, Inochi From unicorn_wang at outlook.com Tue Sep 9 20:20:50 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Wed, 10 Sep 2025 11:20:50 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> Message-ID: On 9/10/2025 10:56 AM, Inochi Amaoto wrote: > On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: [......] >> +static void sg2042_pcie_remove(struct platform_device *pdev) >> +{ >> + struct cdns_pcie *pcie = platform_get_drvdata(pdev); >> + >> + cdns_pcie_disable_phy(pcie); >> +} >> + > I think this remove is useless, as it is almost impossible to > remove a pcie at runtime. I think since we implemented the driver as a module, supporting remove is also a requirement for completeness. So I add this as per request from Manivannan in the last review. Thanks, Chen > Regards, > Inochi From brgl at bgdev.pl Wed Sep 10 00:12:36 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:36 +0200 Subject: [PATCH v2 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4 Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Here's the final part of the generic GPIO chip conversions. Once all the existing users are switched to the new API, the final patch in the series removes bgpio_init(), moves the gpio-mmio fields out of struct gpio_chip and into struct gpio_generic_chip and adjusts gpio-mmio.c to the new situation. Down the line we could probably improve gpio-mmio.c by using lock guards and replacing the - now obsolete - "bgpio" prefix with "gpio_generic" or something similar but this series is already big as is so I'm leaving that for the future. Tested in qemu on vexpress-a9.
Signed-off-by: Bartosz Golaszewski --- Changes in v2: - Use a more common syntax for compound literals - Link to v1: https://lore.kernel.org/r/20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org --- Bartosz Golaszewski (15): gpio: loongson1: allow building the module with COMPILE_TEST enabled gpio: loongson1: use new generic GPIO chip API gpio: hlwd: use new generic GPIO chip API gpio: ath79: use new generic GPIO chip API gpio: ath79: use the generic GPIO chip lock for IRQ handling gpio: xgene-sb: use generic GPIO chip register read and write APIs gpio: brcmstb: use new generic GPIO chip API gpio: mt7621: use new generic GPIO chip API gpio: mt7621: use the generic GPIO chip lock for IRQ handling gpio: menz127: use new generic GPIO chip API gpio: sifive: use new generic GPIO chip API gpio: spacemit-k1: use new generic GPIO chip API gpio: sodaville: use new generic GPIO chip API gpio: mmio: use new generic GPIO chip API gpio: move gpio-mmio-specific fields out of struct gpio_chip drivers/gpio/Kconfig | 2 +- drivers/gpio/TODO | 5 - drivers/gpio/gpio-ath79.c | 88 +++++----- drivers/gpio/gpio-brcmstb.c | 112 +++++++------ drivers/gpio/gpio-hlwd.c | 105 ++++++------ drivers/gpio/gpio-loongson1.c | 40 +++-- drivers/gpio/gpio-menz127.c | 31 ++-- drivers/gpio/gpio-mlxbf2.c | 2 +- drivers/gpio/gpio-mmio.c | 350 +++++++++++++++++++++------------------- drivers/gpio/gpio-mpc8xxx.c | 5 +- drivers/gpio/gpio-mt7621.c | 80 ++++----- drivers/gpio/gpio-sifive.c | 73 +++++---- drivers/gpio/gpio-sodaville.c | 20 ++- drivers/gpio/gpio-spacemit-k1.c | 28 +++- drivers/gpio/gpio-xgene-sb.c | 5 +- include/linux/gpio/driver.h | 44 ----- include/linux/gpio/generic.h | 67 +++++--- 17 files changed, 548 insertions(+), 509 deletions(-) --- base-commit: 65dd046ef55861190ecde44c6d9fcde54b9fb77d change-id: 20250904-gpio-mmio-gpio-conv-part4-5e1f772ba724 Best regards, -- Bartosz Golaszewski From brgl at bgdev.pl Wed Sep 10 00:12:37 2025 From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:37 +0200 Subject: [PATCH v2 01/15] gpio: loongson1: allow building the module with COMPILE_TEST enabled In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-1-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski Increase build coverage by allowing the module to be built with COMPILE_TEST=y. Signed-off-by: Bartosz Golaszewski --- drivers/gpio/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 31f8bab4b09df1640c892f4d839860edaa2ad6a3..09cb144f076661e0a2069016175d0692257fb156 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -885,7 +885,7 @@ config GPIO_ZYNQMP_MODEPIN config GPIO_LOONGSON1 tristate "Loongson1 GPIO support" - depends on MACH_LOONGSON32 + depends on MACH_LOONGSON32 || COMPILE_TEST select GPIO_GENERIC help Say Y or M here to support GPIO on Loongson1 SoCs. -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:38 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:38 +0200 Subject: [PATCH v2 02/15] gpio: loongson1: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-2-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-loongson1.c | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git a/drivers/gpio/gpio-loongson1.c b/drivers/gpio/gpio-loongson1.c index 6ca3b969db4df231517d021a7b4b5e3ddcf626f7..9750a7a175081781624a49a794926b3f1e45b4d2 100644 --- a/drivers/gpio/gpio-loongson1.c +++ b/drivers/gpio/gpio-loongson1.c @@ -5,10 +5,11 @@ * Copyright (C) 2015-2023 Keguang Zhang */ +#include #include #include +#include #include -#include /* Loongson 1 GPIO Register Definitions */ #define GPIO_CFG 0x0 @@ -17,19 +18,18 @@ #define GPIO_OUTPUT 0x30 struct ls1x_gpio_chip { - struct gpio_chip gc; + struct gpio_generic_chip chip; void __iomem *reg_base; }; static int ls1x_gpio_request(struct gpio_chip *gc, unsigned int offset) { struct ls1x_gpio_chip *ls1x_gc = gpiochip_get_data(gc); - unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&ls1x_gc->chip); + __raw_writel(__raw_readl(ls1x_gc->reg_base + GPIO_CFG) | BIT(offset), ls1x_gc->reg_base + GPIO_CFG); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); return 0; } @@ -37,16 +37,16 @@ static int ls1x_gpio_request(struct gpio_chip *gc, unsigned int offset) static void ls1x_gpio_free(struct gpio_chip *gc, unsigned int offset) { struct ls1x_gpio_chip *ls1x_gc = gpiochip_get_data(gc); - unsigned long flags; - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&ls1x_gc->chip); + __raw_writel(__raw_readl(ls1x_gc->reg_base + GPIO_CFG) & ~BIT(offset), ls1x_gc->reg_base + GPIO_CFG); - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); } static int ls1x_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct device *dev = &pdev->dev; struct ls1x_gpio_chip *ls1x_gc; int ret; @@ -59,29 +59,35 @@ static int ls1x_gpio_probe(struct platform_device *pdev) if (IS_ERR(ls1x_gc->reg_base)) return PTR_ERR(ls1x_gc->reg_base); - ret = 
bgpio_init(&ls1x_gc->gc, dev, 4, ls1x_gc->reg_base + GPIO_DATA, - ls1x_gc->reg_base + GPIO_OUTPUT, NULL, - NULL, ls1x_gc->reg_base + GPIO_DIR, 0); + config = (struct gpio_generic_chip_config) { + .dev = dev, + .sz = 4, + .dat = ls1x_gc->reg_base + GPIO_DATA, + .set = ls1x_gc->reg_base + GPIO_OUTPUT, + .dirin = ls1x_gc->reg_base + GPIO_DIR, + }; + + ret = gpio_generic_chip_init(&ls1x_gc->chip, &config); if (ret) goto err; - ls1x_gc->gc.owner = THIS_MODULE; - ls1x_gc->gc.request = ls1x_gpio_request; - ls1x_gc->gc.free = ls1x_gpio_free; + ls1x_gc->chip.gc.owner = THIS_MODULE; + ls1x_gc->chip.gc.request = ls1x_gpio_request; + ls1x_gc->chip.gc.free = ls1x_gpio_free; /* * Clear ngpio to let gpiolib get the correct number * by reading ngpios property */ - ls1x_gc->gc.ngpio = 0; + ls1x_gc->chip.gc.ngpio = 0; - ret = devm_gpiochip_add_data(dev, &ls1x_gc->gc, ls1x_gc); + ret = devm_gpiochip_add_data(dev, &ls1x_gc->chip.gc, ls1x_gc); if (ret) goto err; platform_set_drvdata(pdev, ls1x_gc); dev_info(dev, "GPIO controller registered with %d pins\n", - ls1x_gc->gc.ngpio); + ls1x_gc->chip.gc.ngpio); return 0; err: -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:39 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:39 +0200 Subject: [PATCH v2 03/15] gpio: hlwd: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-3-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-hlwd.c | 105 ++++++++++++++++++++++++----------------------- 1 file changed, 54 insertions(+), 51 deletions(-) diff --git a/drivers/gpio/gpio-hlwd.c b/drivers/gpio/gpio-hlwd.c index 0580f6712bea9a4d510bd332645982adbc5c6a32..a395f87436ac4df386ce2ee345fc0a7cc34c843d 100644 --- a/drivers/gpio/gpio-hlwd.c +++ b/drivers/gpio/gpio-hlwd.c @@ -6,6 +6,7 @@ // Nintendo Wii (Hollywood) GPIO driver #include +#include #include #include #include @@ -48,7 +49,7 @@ #define HW_GPIO_OWNER 0x3c struct hlwd_gpio { - struct gpio_chip gpioc; + struct gpio_generic_chip gpioc; struct device *dev; void __iomem *regs; int irq; @@ -61,45 +62,44 @@ static void hlwd_gpio_irqhandler(struct irq_desc *desc) struct hlwd_gpio *hlwd = gpiochip_get_data(irq_desc_get_handler_data(desc)); struct irq_chip *chip = irq_desc_get_chip(desc); - unsigned long flags; unsigned long pending; int hwirq; u32 emulated_pending; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); - pending = ioread32be(hlwd->regs + HW_GPIOB_INTFLAG); - pending &= ioread32be(hlwd->regs + HW_GPIOB_INTMASK); + scoped_guard(gpio_generic_lock_irqsave, &hlwd->gpioc) { + pending = ioread32be(hlwd->regs + HW_GPIOB_INTFLAG); + pending &= ioread32be(hlwd->regs + HW_GPIOB_INTMASK); - /* Treat interrupts due to edge trigger emulation separately */ - emulated_pending = hlwd->edge_emulation & pending; - pending &= ~emulated_pending; - if (emulated_pending) { - u32 level, rising, falling; + /* Treat interrupts due to edge trigger emulation separately */ + emulated_pending = hlwd->edge_emulation & pending; + pending &= ~emulated_pending; + if (emulated_pending) { + u32 level, rising, falling; - level = ioread32be(hlwd->regs + HW_GPIOB_INTLVL); - rising = level & emulated_pending; - falling = ~level & emulated_pending; + level = ioread32be(hlwd->regs + HW_GPIOB_INTLVL); + rising = level & emulated_pending; + falling = ~level & emulated_pending; - /* Invert the levels */ - iowrite32be(level 
^ emulated_pending, - hlwd->regs + HW_GPIOB_INTLVL); + /* Invert the levels */ + iowrite32be(level ^ emulated_pending, + hlwd->regs + HW_GPIOB_INTLVL); - /* Ack all emulated-edge interrupts */ - iowrite32be(emulated_pending, hlwd->regs + HW_GPIOB_INTFLAG); + /* Ack all emulated-edge interrupts */ + iowrite32be(emulated_pending, hlwd->regs + HW_GPIOB_INTFLAG); - /* Signal interrupts only on the correct edge */ - rising &= hlwd->rising_edge; - falling &= hlwd->falling_edge; + /* Signal interrupts only on the correct edge */ + rising &= hlwd->rising_edge; + falling &= hlwd->falling_edge; - /* Mark emulated interrupts as pending */ - pending |= rising | falling; + /* Mark emulated interrupts as pending */ + pending |= rising | falling; + } } - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); chained_irq_enter(chip, desc); for_each_set_bit(hwirq, &pending, 32) - generic_handle_domain_irq(hlwd->gpioc.irq.domain, hwirq); + generic_handle_domain_irq(hlwd->gpioc.gc.irq.domain, hwirq); chained_irq_exit(chip, desc); } @@ -116,30 +116,29 @@ static void hlwd_gpio_irq_mask(struct irq_data *data) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 mask; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); - mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); - mask &= ~BIT(data->hwirq); - iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); - gpiochip_disable_irq(&hlwd->gpioc, irqd_to_hwirq(data)); + scoped_guard(gpio_generic_lock_irqsave, &hlwd->gpioc) { + mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); + mask &= ~BIT(data->hwirq); + iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); + } + gpiochip_disable_irq(&hlwd->gpioc.gc, irqd_to_hwirq(data)); } static void hlwd_gpio_irq_unmask(struct irq_data *data) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 mask; - gpiochip_enable_irq(&hlwd->gpioc, 
irqd_to_hwirq(data)); - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); + gpiochip_enable_irq(&hlwd->gpioc.gc, irqd_to_hwirq(data)); + + guard(gpio_generic_lock_irqsave)(&hlwd->gpioc); + mask = ioread32be(hlwd->regs + HW_GPIOB_INTMASK); mask |= BIT(data->hwirq); iowrite32be(mask, hlwd->regs + HW_GPIOB_INTMASK); - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); } static void hlwd_gpio_irq_enable(struct irq_data *data) @@ -173,10 +172,9 @@ static int hlwd_gpio_irq_set_type(struct irq_data *data, unsigned int flow_type) { struct hlwd_gpio *hlwd = gpiochip_get_data(irq_data_get_irq_chip_data(data)); - unsigned long flags; u32 level; - raw_spin_lock_irqsave(&hlwd->gpioc.bgpio_lock, flags); + guard(gpio_generic_lock_irqsave)(&hlwd->gpioc); hlwd->edge_emulation &= ~BIT(data->hwirq); @@ -197,11 +195,9 @@ static int hlwd_gpio_irq_set_type(struct irq_data *data, unsigned int flow_type) hlwd_gpio_irq_setup_emulation(hlwd, data->hwirq, flow_type); break; default: - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); return -EINVAL; } - raw_spin_unlock_irqrestore(&hlwd->gpioc.bgpio_lock, flags); return 0; } @@ -225,6 +221,7 @@ static const struct irq_chip hlwd_gpio_irq_chip = { static int hlwd_gpio_probe(struct platform_device *pdev) { + struct gpio_generic_chip_config config; struct hlwd_gpio *hlwd; u32 ngpios; int res; @@ -244,25 +241,31 @@ static int hlwd_gpio_probe(struct platform_device *pdev) * systems where the AHBPROT memory firewall hasn't been configured to * permit PPC access to HW_GPIO_*. * - * Note that this has to happen before bgpio_init reads the - * HW_GPIOB_OUT and HW_GPIOB_DIR, because otherwise it reads the wrong - * values. + * Note that this has to happen before gpio_generic_chip_init() reads + * the HW_GPIOB_OUT and HW_GPIOB_DIR, because otherwise it reads the + * wrong values. 
*/ iowrite32be(0xffffffff, hlwd->regs + HW_GPIO_OWNER); - res = bgpio_init(&hlwd->gpioc, &pdev->dev, 4, - hlwd->regs + HW_GPIOB_IN, hlwd->regs + HW_GPIOB_OUT, - NULL, hlwd->regs + HW_GPIOB_DIR, NULL, - BGPIOF_BIG_ENDIAN_BYTE_ORDER); + config = (struct gpio_generic_chip_config) { + .dev = &pdev->dev, + .sz = 4, + .dat = hlwd->regs + HW_GPIOB_IN, + .set = hlwd->regs + HW_GPIOB_OUT, + .dirout = hlwd->regs + HW_GPIOB_DIR, + .flags = BGPIOF_BIG_ENDIAN_BYTE_ORDER, + }; + + res = gpio_generic_chip_init(&hlwd->gpioc, &config); if (res < 0) { - dev_warn(&pdev->dev, "bgpio_init failed: %d\n", res); + dev_warn(&pdev->dev, "failed to initialize generic GPIO chip: %d\n", res); return res; } res = of_property_read_u32(pdev->dev.of_node, "ngpios", &ngpios); if (res) ngpios = 32; - hlwd->gpioc.ngpio = ngpios; + hlwd->gpioc.gc.ngpio = ngpios; /* Mask and ack all interrupts */ iowrite32be(0, hlwd->regs + HW_GPIOB_INTMASK); @@ -282,7 +285,7 @@ static int hlwd_gpio_probe(struct platform_device *pdev) return hlwd->irq; } - girq = &hlwd->gpioc.irq; + girq = &hlwd->gpioc.gc.irq; gpio_irq_chip_set_chip(girq, &hlwd_gpio_irq_chip); girq->parent_handler = hlwd_gpio_irqhandler; girq->num_parents = 1; @@ -296,7 +299,7 @@ static int hlwd_gpio_probe(struct platform_device *pdev) girq->handler = handle_level_irq; } - return devm_gpiochip_add_data(&pdev->dev, &hlwd->gpioc, hlwd); + return devm_gpiochip_add_data(&pdev->dev, &hlwd->gpioc.gc, hlwd); } static const struct of_device_id hlwd_gpio_match[] = { -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:42 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:42 +0200 Subject: [PATCH v2 06/15] gpio: xgene-sb: use generic GPIO chip register read and write APIs In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-6-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski The 
conversion to using the modernized generic GPIO chip API was incomplete without also converting the direct calls to write/read_reg() callbacks. Use the provided wrappers from linux/gpio/generic.h. Fixes: 38d98a822c14 ("gpio: xgene-sb: use new generic GPIO chip API") Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-xgene-sb.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-xgene-sb.c b/drivers/gpio/gpio-xgene-sb.c index c559a89aadf7a77bd9cce7e5a7d4a2b241307812..62545e358b6c4b1cab25e1135cb24ccc3e955078 100644 --- a/drivers/gpio/gpio-xgene-sb.c +++ b/drivers/gpio/gpio-xgene-sb.c @@ -63,14 +63,15 @@ struct xgene_gpio_sb { static void xgene_gpio_set_bit(struct gpio_chip *gc, void __iomem *reg, u32 gpio, int val) { + struct gpio_generic_chip *chip = to_gpio_generic_chip(gc); u32 data; - data = gc->read_reg(reg); + data = gpio_generic_read_reg(chip, reg); if (val) data |= GPIO_MASK(gpio); else data &= ~GPIO_MASK(gpio); - gc->write_reg(reg, data); + gpio_generic_write_reg(chip, reg, data); } static int xgene_gpio_sb_irq_set_type(struct irq_data *d, unsigned int type) -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:41 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:41 +0200 Subject: [PATCH v2 05/15] gpio: ath79: use the generic GPIO chip lock for IRQ handling In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-5-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski This driver uses its own raw spinlock in interrupt routines while the generic GPIO chip callbacks use a separate one. This is, of course, racy so use the fact that the lock in generic GPIO chip is also a raw spinlock and convert the interrupt handling functions in this module to using the provided generic GPIO chip locking API. 
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-ath79.c | 51 ++++++++++++++++++----------------------------- 1 file changed, 19 insertions(+), 32 deletions(-) diff --git a/drivers/gpio/gpio-ath79.c b/drivers/gpio/gpio-ath79.c index 8879f23f1871ed323513082f4d2ebb2c40544cde..2ad9f6ac66362fba8cdab152a2b2c782dddf427c 100644 --- a/drivers/gpio/gpio-ath79.c +++ b/drivers/gpio/gpio-ath79.c @@ -31,7 +31,6 @@ struct ath79_gpio_ctrl { struct gpio_generic_chip chip; void __iomem *base; - raw_spinlock_t lock; unsigned long both_edges; }; @@ -72,23 +71,22 @@ static void ath79_gpio_irq_unmask(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; gpiochip_enable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); - raw_spin_lock_irqsave(&ctrl->lock, flags); + + guard(gpio_generic_lock_irqsave)(&ctrl->chip); + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static void ath79_gpio_irq_mask(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); - ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &ctrl->chip) + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); + gpiochip_disable_irq(&ctrl->chip.gc, irqd_to_hwirq(data)); } @@ -96,24 +94,20 @@ static void ath79_gpio_irq_enable(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, mask); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask); - 
raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static void ath79_gpio_irq_disable(struct irq_data *data) { struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); - unsigned long flags; - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0); ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, 0); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); } static int ath79_gpio_irq_set_type(struct irq_data *data, @@ -122,7 +116,6 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, struct ath79_gpio_ctrl *ctrl = irq_data_to_ath79_gpio(data); u32 mask = BIT(irqd_to_hwirq(data)); u32 type = 0, polarity = 0; - unsigned long flags; bool disabled; switch (flow_type) { @@ -144,7 +137,7 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, return -EINVAL; } - raw_spin_lock_irqsave(&ctrl->lock, flags); + guard(gpio_generic_lock_irqsave)(&ctrl->chip); if (flow_type == IRQ_TYPE_EDGE_BOTH) { ctrl->both_edges |= mask; @@ -169,8 +162,6 @@ static int ath79_gpio_irq_set_type(struct irq_data *data, ath79_gpio_update_bits( ctrl, AR71XX_GPIO_REG_INT_ENABLE, mask, mask); - raw_spin_unlock_irqrestore(&ctrl->lock, flags); - return 0; } @@ -192,26 +183,24 @@ static void ath79_gpio_irq_handler(struct irq_desc *desc) struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc); struct ath79_gpio_ctrl *ctrl = container_of(gen_gc, struct ath79_gpio_ctrl, chip); - unsigned long flags, pending; + unsigned long pending; u32 both_edges, state; int irq; chained_irq_enter(irqchip, desc); - raw_spin_lock_irqsave(&ctrl->lock, flags); + scoped_guard(gpio_generic_lock_irqsave, &ctrl->chip) { + pending = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_INT_PENDING); - pending = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_INT_PENDING); - - /* Update the polarity of the both edges irqs */ - both_edges = ctrl->both_edges & pending; - if (both_edges) { - state 
= ath79_gpio_read(ctrl, AR71XX_GPIO_REG_IN); - ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_POLARITY, - both_edges, ~state); + /* Update the polarity of the both edges irqs */ + both_edges = ctrl->both_edges & pending; + if (both_edges) { + state = ath79_gpio_read(ctrl, AR71XX_GPIO_REG_IN); + ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_POLARITY, + both_edges, ~state); + } } - raw_spin_unlock_irqrestore(&ctrl->lock, flags); - for_each_set_bit(irq, &pending, gc->ngpio) generic_handle_domain_irq(gc->irq.domain, irq); @@ -256,8 +245,6 @@ static int ath79_gpio_probe(struct platform_device *pdev) if (IS_ERR(ctrl->base)) return PTR_ERR(ctrl->base); - raw_spin_lock_init(&ctrl->lock); - config = (struct gpio_generic_chip_config) { .dev = dev, .sz = 4, -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:40 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:40 +0200 Subject: [PATCH v2 04/15] gpio: ath79: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-4-f3d1a4c57124@linaro.org> From: Bartosz Golaszewski Convert the driver to using the new generic GPIO chip interfaces from linux/gpio/generic.h. 

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-ath79.c | 39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/gpio/gpio-ath79.c b/drivers/gpio/gpio-ath79.c
index de4cc12e5e0399abcef61a89c8c91a1b203d20fb..8879f23f1871ed323513082f4d2ebb2c40544cde 100644
--- a/drivers/gpio/gpio-ath79.c
+++ b/drivers/gpio/gpio-ath79.c
@@ -10,6 +10,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -28,7 +29,7 @@
 #define AR71XX_GPIO_REG_INT_MASK	0x24

 struct ath79_gpio_ctrl {
-	struct gpio_chip gc;
+	struct gpio_generic_chip chip;
 	void __iomem *base;
 	raw_spinlock_t lock;
 	unsigned long both_edges;
@@ -37,8 +38,9 @@ struct ath79_gpio_ctrl {
 static struct ath79_gpio_ctrl *irq_data_to_ath79_gpio(struct irq_data *data)
 {
 	struct gpio_chip *gc = irq_data_get_irq_chip_data(data);
+	struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc);

-	return container_of(gc, struct ath79_gpio_ctrl, gc);
+	return container_of(gen_gc, struct ath79_gpio_ctrl, chip);
 }

 static u32 ath79_gpio_read(struct ath79_gpio_ctrl *ctrl, unsigned reg)
@@ -72,7 +74,7 @@ static void ath79_gpio_irq_unmask(struct irq_data *data)
 	u32 mask = BIT(irqd_to_hwirq(data));
 	unsigned long flags;

-	gpiochip_enable_irq(&ctrl->gc, irqd_to_hwirq(data));
+	gpiochip_enable_irq(&ctrl->chip.gc, irqd_to_hwirq(data));
 	raw_spin_lock_irqsave(&ctrl->lock, flags);
 	ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, mask);
 	raw_spin_unlock_irqrestore(&ctrl->lock, flags);
@@ -87,7 +89,7 @@ static void ath79_gpio_irq_mask(struct irq_data *data)
 	raw_spin_lock_irqsave(&ctrl->lock, flags);
 	ath79_gpio_update_bits(ctrl, AR71XX_GPIO_REG_INT_MASK, mask, 0);
 	raw_spin_unlock_irqrestore(&ctrl->lock, flags);
-	gpiochip_disable_irq(&ctrl->gc, irqd_to_hwirq(data));
+	gpiochip_disable_irq(&ctrl->chip.gc, irqd_to_hwirq(data));
 }

 static void ath79_gpio_irq_enable(struct irq_data *data)
@@ -187,8 +189,9 @@ static void ath79_gpio_irq_handler(struct irq_desc *desc)
 {
 	struct gpio_chip *gc = irq_desc_get_handler_data(desc);
 	struct irq_chip *irqchip = irq_desc_get_chip(desc);
+	struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(gc);
 	struct ath79_gpio_ctrl *ctrl =
-		container_of(gc, struct ath79_gpio_ctrl, gc);
+		container_of(gen_gc, struct ath79_gpio_ctrl, chip);
 	unsigned long flags, pending;
 	u32 both_edges, state;
 	int irq;
@@ -224,6 +227,7 @@ MODULE_DEVICE_TABLE(of, ath79_gpio_of_match);
 static int ath79_gpio_probe(struct platform_device *pdev)
 {
+	struct gpio_generic_chip_config config;
 	struct device *dev = &pdev->dev;
 	struct ath79_gpio_ctrl *ctrl;
 	struct gpio_irq_chip *girq;
@@ -253,21 +257,26 @@ static int ath79_gpio_probe(struct platform_device *pdev)
 		return PTR_ERR(ctrl->base);

 	raw_spin_lock_init(&ctrl->lock);
-	err = bgpio_init(&ctrl->gc, dev, 4,
-			ctrl->base + AR71XX_GPIO_REG_IN,
-			ctrl->base + AR71XX_GPIO_REG_SET,
-			ctrl->base + AR71XX_GPIO_REG_CLEAR,
-			oe_inverted ? NULL : ctrl->base + AR71XX_GPIO_REG_OE,
-			oe_inverted ? ctrl->base + AR71XX_GPIO_REG_OE : NULL,
-			0);
+
+	config = (struct gpio_generic_chip_config) {
+		.dev = dev,
+		.sz = 4,
+		.dat = ctrl->base + AR71XX_GPIO_REG_IN,
+		.set = ctrl->base + AR71XX_GPIO_REG_SET,
+		.clr = ctrl->base + AR71XX_GPIO_REG_CLEAR,
+		.dirout = oe_inverted ? NULL : ctrl->base + AR71XX_GPIO_REG_OE,
+		.dirin = oe_inverted ? ctrl->base + AR71XX_GPIO_REG_OE : NULL,
+	};
+
+	err = gpio_generic_chip_init(&ctrl->chip, &config);
 	if (err) {
-		dev_err(dev, "bgpio_init failed\n");
+		dev_err(dev, "failed to initialize generic GPIO chip\n");
 		return err;
 	}

 	/* Optional interrupt setup */
 	if (device_property_read_bool(dev, "interrupt-controller")) {
-		girq = &ctrl->gc.irq;
+		girq = &ctrl->chip.gc.irq;
 		gpio_irq_chip_set_chip(girq, &ath79_gpio_irqchip);
 		girq->parent_handler = ath79_gpio_irq_handler;
 		girq->num_parents = 1;
@@ -280,7 +289,7 @@ static int ath79_gpio_probe(struct platform_device *pdev)
 		girq->handler = handle_simple_irq;
 	}

-	return devm_gpiochip_add_data(dev, &ctrl->gc, ctrl);
+	return devm_gpiochip_add_data(dev, &ctrl->chip.gc, ctrl);
 }

 static struct platform_driver ath79_gpio_driver = {
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:43 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:43 +0200
Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-brcmstb.c | 112 ++++++++++++++++++++++++--------------------
 1 file changed, 60 insertions(+), 52 deletions(-)

diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c
index e29a9589b3ccbd17d10f6671088dca3e76537927..be3ff916e134a674d3e1d334a7d431b7ad767a33 100644
--- a/drivers/gpio/gpio-brcmstb.c
+++ b/drivers/gpio/gpio-brcmstb.c
@@ -3,6 +3,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -37,7 +38,7 @@ enum gio_reg_index {
 struct brcmstb_gpio_bank {
 	struct list_head node;
 	int id;
-	struct gpio_chip gc;
+	struct gpio_generic_chip chip;
 	struct brcmstb_gpio_priv *parent_priv;
 	u32 width;
 	u32 wake_active;
@@ -72,19 +73,18 @@ __brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank)
 {
 	void __iomem *reg_base = bank->parent_priv->reg_base;

-	return bank->gc.read_reg(reg_base + GIO_STAT(bank->id)) &
-		bank->gc.read_reg(reg_base + GIO_MASK(bank->id));
+	return gpio_generic_read_reg(&bank->chip, reg_base + GIO_STAT(bank->id)) &
+		gpio_generic_read_reg(&bank->chip, reg_base + GIO_MASK(bank->id));
 }

 static unsigned long
 brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank)
 {
 	unsigned long status;
-	unsigned long flags;

-	raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags);
+	guard(gpio_generic_lock_irqsave)(&bank->chip);
+
 	status = __brcmstb_gpio_get_active_irqs(bank);
-	raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags);

 	return status;
 }
@@ -92,26 +92,26 @@ brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank)
 static int brcmstb_gpio_hwirq_to_offset(irq_hw_number_t hwirq,
 					struct brcmstb_gpio_bank *bank)
 {
-	return hwirq - bank->gc.offset;
+	return hwirq - bank->chip.gc.offset;
 }

 static void brcmstb_gpio_set_imask(struct brcmstb_gpio_bank *bank,
 		unsigned int hwirq, bool enable)
 {
-	struct gpio_chip *gc = &bank->gc;
 	struct brcmstb_gpio_priv *priv = bank->parent_priv;
 	u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(hwirq, bank));
 	u32 imask;
-	unsigned long flags;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
-	imask = gc->read_reg(priv->reg_base + GIO_MASK(bank->id));
+	guard(gpio_generic_lock_irqsave)(&bank->chip);
+
+	imask = gpio_generic_read_reg(&bank->chip,
+				      priv->reg_base + GIO_MASK(bank->id));
 	if (enable)
 		imask |= mask;
 	else
 		imask &= ~mask;
-	gc->write_reg(priv->reg_base + GIO_MASK(bank->id), imask);
-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	gpio_generic_write_reg(&bank->chip,
+			       priv->reg_base + GIO_MASK(bank->id), imask);
 }

 static int brcmstb_gpio_to_irq(struct gpio_chip *gc, unsigned offset)
@@ -150,7 +150,8 @@ static void brcmstb_gpio_irq_ack(struct irq_data *d)
 	struct brcmstb_gpio_priv *priv = bank->parent_priv;
 	u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(d->hwirq, bank));

-	gc->write_reg(priv->reg_base + GIO_STAT(bank->id), mask);
+	gpio_generic_write_reg(&bank->chip,
+			       priv->reg_base + GIO_STAT(bank->id), mask);
 }

 static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type)
@@ -162,7 +163,6 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type)
 	u32 edge_insensitive, iedge_insensitive;
 	u32 edge_config, iedge_config;
 	u32 level, ilevel;
-	unsigned long flags;

 	switch (type) {
 	case IRQ_TYPE_LEVEL_LOW:
@@ -194,23 +194,25 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type)
 		return -EINVAL;
 	}

-	raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags);
+	guard(gpio_generic_lock_irqsave)(&bank->chip);

-	iedge_config = bank->gc.read_reg(priv->reg_base +
-			GIO_EC(bank->id)) & ~mask;
-	iedge_insensitive = bank->gc.read_reg(priv->reg_base +
-			GIO_EI(bank->id)) & ~mask;
-	ilevel = bank->gc.read_reg(priv->reg_base +
-			GIO_LEVEL(bank->id)) & ~mask;
+	iedge_config = gpio_generic_read_reg(&bank->chip,
+				priv->reg_base + GIO_EC(bank->id)) & ~mask;
+	iedge_insensitive = gpio_generic_read_reg(&bank->chip,
+				priv->reg_base + GIO_EI(bank->id)) & ~mask;
+	ilevel = gpio_generic_read_reg(&bank->chip,
+				priv->reg_base + GIO_LEVEL(bank->id)) & ~mask;

-	bank->gc.write_reg(priv->reg_base + GIO_EC(bank->id),
-			iedge_config | edge_config);
-	bank->gc.write_reg(priv->reg_base + GIO_EI(bank->id),
-			iedge_insensitive | edge_insensitive);
-	bank->gc.write_reg(priv->reg_base + GIO_LEVEL(bank->id),
-			ilevel | level);
+	gpio_generic_write_reg(&bank->chip,
+			       priv->reg_base + GIO_EC(bank->id),
+			       iedge_config | edge_config);
+	gpio_generic_write_reg(&bank->chip,
+			       priv->reg_base + GIO_EI(bank->id),
+			       iedge_insensitive | edge_insensitive);
+	gpio_generic_write_reg(&bank->chip,
+			       priv->reg_base + GIO_LEVEL(bank->id),
+			       ilevel | level);

-	raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags);
 	return 0;
 }
@@ -263,7 +265,7 @@ static void brcmstb_gpio_irq_bank_handler(struct brcmstb_gpio_bank *bank)
 {
 	struct brcmstb_gpio_priv *priv = bank->parent_priv;
 	struct irq_domain *domain = priv->irq_domain;
-	int hwbase = bank->gc.offset;
+	int hwbase = bank->chip.gc.offset;
 	unsigned long status;

 	while ((status = brcmstb_gpio_get_active_irqs(bank))) {
@@ -303,7 +305,7 @@ static struct brcmstb_gpio_bank *brcmstb_gpio_hwirq_to_bank(
 	/* banks are in descending order */
 	list_for_each_entry_reverse(bank, &priv->bank_list, node) {
-		i += bank->gc.ngpio;
+		i += bank->chip.gc.ngpio;
 		if (hwirq < i)
 			return bank;
 	}
@@ -332,7 +334,7 @@ static int brcmstb_gpio_irq_map(struct irq_domain *d, unsigned int irq,
 	dev_dbg(&pdev->dev, "Mapping irq %d for gpio line %d (bank %d)\n",
 		irq, (int)hwirq, bank->id);
-	ret = irq_set_chip_data(irq, &bank->gc);
+	ret = irq_set_chip_data(irq, &bank->chip.gc);
 	if (ret < 0)
 		return ret;
 	irq_set_lockdep_class(irq, &brcmstb_gpio_irq_lock_class,
@@ -394,7 +396,7 @@ static void brcmstb_gpio_remove(struct platform_device *pdev)
 	 * more important to actually perform all of the steps.
 	 */
 	list_for_each_entry(bank, &priv->bank_list, node)
-		gpiochip_remove(&bank->gc);
+		gpiochip_remove(&bank->chip.gc);
 }

 static int brcmstb_gpio_of_xlate(struct gpio_chip *gc,
@@ -412,7 +414,7 @@ static int brcmstb_gpio_of_xlate(struct gpio_chip *gc,
 	if (WARN_ON(gpiospec->args_count < gc->of_gpio_n_cells))
 		return -EINVAL;

-	offset = gpiospec->args[0] - bank->gc.offset;
+	offset = gpiospec->args[0] - bank->chip.gc.offset;
 	if (offset >= gc->ngpio || offset < 0)
 		return -EINVAL;
@@ -493,19 +495,17 @@ static int brcmstb_gpio_irq_setup(struct platform_device *pdev,
 static void brcmstb_gpio_bank_save(struct brcmstb_gpio_priv *priv,
 				   struct brcmstb_gpio_bank *bank)
 {
-	struct gpio_chip *gc = &bank->gc;
 	unsigned int i;

 	for (i = 0; i < GIO_REG_STAT; i++)
-		bank->saved_regs[i] = gc->read_reg(priv->reg_base +
-				GIO_BANK_OFF(bank->id, i));
+		bank->saved_regs[i] = gpio_generic_read_reg(&bank->chip,
+					priv->reg_base + GIO_BANK_OFF(bank->id, i));
 }

 static void brcmstb_gpio_quiesce(struct device *dev, bool save)
 {
 	struct brcmstb_gpio_priv *priv = dev_get_drvdata(dev);
 	struct brcmstb_gpio_bank *bank;
-	struct gpio_chip *gc;
 	u32 imask;

 	/* disable non-wake interrupt */
@@ -513,8 +513,6 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save)
 		disable_irq(priv->parent_irq);

 	list_for_each_entry(bank, &priv->bank_list, node) {
-		gc = &bank->gc;
-
 		if (save)
 			brcmstb_gpio_bank_save(priv, bank);
@@ -523,8 +521,9 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save)
 			imask = bank->wake_active;
 		else
 			imask = 0;
-		gc->write_reg(priv->reg_base + GIO_MASK(bank->id),
-			       imask);
+		gpio_generic_write_reg(&bank->chip,
+				       priv->reg_base + GIO_MASK(bank->id),
+				       imask);
 	}
 }
@@ -538,12 +537,12 @@ static void brcmstb_gpio_shutdown(struct platform_device *pdev)
 static void brcmstb_gpio_bank_restore(struct brcmstb_gpio_priv *priv,
 				      struct brcmstb_gpio_bank *bank)
 {
-	struct gpio_chip *gc = &bank->gc;
 	unsigned int i;

 	for (i = 0; i < GIO_REG_STAT; i++)
-		gc->write_reg(priv->reg_base + GIO_BANK_OFF(bank->id, i),
-				bank->saved_regs[i]);
+		gpio_generic_write_reg(&bank->chip,
+				       priv->reg_base + GIO_BANK_OFF(bank->id, i),
+				       bank->saved_regs[i]);
 }

 static int brcmstb_gpio_suspend(struct device *dev)
@@ -585,6 +584,7 @@ static const struct dev_pm_ops brcmstb_gpio_pm_ops = {
 static int brcmstb_gpio_probe(struct platform_device *pdev)
 {
+	struct gpio_generic_chip_config config;
 	struct device *dev = &pdev->dev;
 	struct device_node *np = dev->of_node;
 	void __iomem *reg_base;
@@ -665,17 +665,24 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
 			bank->width = bank_width;
 		}

+		gc = &bank->chip.gc;
+
 		/*
 		 * Regs are 4 bytes wide, have data reg, no set/clear regs,
 		 * and direction bits have 0 = output and 1 = input
 		 */
-		gc = &bank->gc;
-		err = bgpio_init(gc, dev, 4,
-				reg_base + GIO_DATA(bank->id),
-				NULL, NULL, NULL,
-				reg_base + GIO_IODIR(bank->id), flags);
+
+		config = (struct gpio_generic_chip_config) {
+			.dev = dev,
+			.sz = 4,
+			.dat = reg_base + GIO_DATA(bank->id),
+			.dirin = reg_base + GIO_IODIR(bank->id),
+			.flags = flags,
+		};
+
+		err = gpio_generic_chip_init(&bank->chip, &config);
 		if (err) {
-			dev_err(dev, "bgpio_init() failed\n");
+			dev_err(dev, "failed to initialize generic GPIO chip\n");
 			goto fail;
 		}
@@ -700,7 +707,8 @@ static int brcmstb_gpio_probe(struct platform_device *pdev)
 		 * be retained from S5 cold boot
 		 */
 		need_wakeup_event |= !!__brcmstb_gpio_get_active_irqs(bank);
-		gc->write_reg(reg_base + GIO_MASK(bank->id), 0);
+		gpio_generic_write_reg(&bank->chip,
+				       reg_base + GIO_MASK(bank->id), 0);

 		err = gpiochip_add_data(gc, bank);
 		if (err) {
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:44 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:44 +0200
Subject: [PATCH v2 08/15] gpio: mt7621: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-8-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-mt7621.c | 51 +++++++++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/gpio/gpio-mt7621.c b/drivers/gpio/gpio-mt7621.c
index 93facbebb80efadbdd3fb4500e0db14936287f1a..e56812a1721151c8f3b32b5093aee5c74bb798bc 100644
--- a/drivers/gpio/gpio-mt7621.c
+++ b/drivers/gpio/gpio-mt7621.c
@@ -6,6 +6,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -30,7 +31,7 @@
 struct mtk_gc {
 	struct irq_chip irq_chip;
-	struct gpio_chip chip;
+	struct gpio_generic_chip chip;
 	spinlock_t lock;
 	int bank;
 	u32 rising;
@@ -59,27 +60,29 @@ struct mtk {
 static inline struct mtk_gc *
 to_mediatek_gpio(struct gpio_chip *chip)
 {
-	return container_of(chip, struct mtk_gc, chip);
+	struct gpio_generic_chip *gen_gc = to_gpio_generic_chip(chip);
+
+	return container_of(gen_gc, struct mtk_gc, chip);
 }

 static inline void
 mtk_gpio_w32(struct mtk_gc *rg, u32 offset, u32 val)
 {
-	struct gpio_chip *gc = &rg->chip;
+	struct gpio_chip *gc = &rg->chip.gc;
 	struct mtk *mtk = gpiochip_get_data(gc);

 	offset = (rg->bank * GPIO_BANK_STRIDE) + offset;
-	gc->write_reg(mtk->base + offset, val);
+	gpio_generic_write_reg(&rg->chip, mtk->base + offset, val);
 }

 static inline u32
 mtk_gpio_r32(struct mtk_gc *rg, u32 offset)
 {
-	struct gpio_chip *gc = &rg->chip;
+	struct gpio_chip *gc = &rg->chip.gc;
 	struct mtk *mtk = gpiochip_get_data(gc);

 	offset = (rg->bank * GPIO_BANK_STRIDE) + offset;
-	return gc->read_reg(mtk->base + offset);
+	return gpio_generic_read_reg(&rg->chip, mtk->base + offset);
 }

 static irqreturn_t
@@ -220,6 +223,7 @@ static const struct irq_chip mt7621_irq_chip = {
 static int
 mediatek_gpio_bank_probe(struct device *dev, int bank)
 {
+	struct gpio_generic_chip_config config;
 	struct mtk *mtk = dev_get_drvdata(dev);
 	struct mtk_gc *rg;
 	void __iomem *dat, *set, *ctrl, *diro;
@@ -236,21 +240,30 @@ mediatek_gpio_bank_probe(struct device *dev, int bank)
 	ctrl = mtk->base + GPIO_REG_DCLR + (rg->bank * GPIO_BANK_STRIDE);
 	diro = mtk->base + GPIO_REG_CTRL + (rg->bank * GPIO_BANK_STRIDE);

-	ret = bgpio_init(&rg->chip, dev, 4, dat, set, ctrl, diro, NULL,
-			 BGPIOF_NO_SET_ON_INPUT);
+	config = (struct gpio_generic_chip_config) {
+		.dev = dev,
+		.sz = 4,
+		.dat = dat,
+		.set = set,
+		.clr = ctrl,
+		.dirout = diro,
+		.flags = BGPIOF_NO_SET_ON_INPUT,
+	};
+
+	ret = gpio_generic_chip_init(&rg->chip, &config);
 	if (ret) {
-		dev_err(dev, "bgpio_init() failed\n");
+		dev_err(dev, "failed to initialize generic GPIO chip\n");
 		return ret;
 	}

-	rg->chip.of_gpio_n_cells = 2;
-	rg->chip.of_xlate = mediatek_gpio_xlate;
-	rg->chip.label = devm_kasprintf(dev, GFP_KERNEL, "%s-bank%d",
+	rg->chip.gc.of_gpio_n_cells = 2;
+	rg->chip.gc.of_xlate = mediatek_gpio_xlate;
+	rg->chip.gc.label = devm_kasprintf(dev, GFP_KERNEL, "%s-bank%d",
 					dev_name(dev), bank);
-	if (!rg->chip.label)
+	if (!rg->chip.gc.label)
 		return -ENOMEM;

-	rg->chip.offset = bank * MTK_BANK_WIDTH;
+	rg->chip.gc.offset = bank * MTK_BANK_WIDTH;

 	if (mtk->gpio_irq) {
 		struct gpio_irq_chip *girq;
@@ -261,7 +274,7 @@ mediatek_gpio_bank_probe(struct device *dev, int bank)
 		 */
 		ret = devm_request_irq(dev, mtk->gpio_irq,
 				       mediatek_gpio_irq_handler, IRQF_SHARED,
-				       rg->chip.label, &rg->chip);
+				       rg->chip.gc.label, &rg->chip.gc);

 		if (ret) {
 			dev_err(dev, "Error requesting IRQ %d: %d\n",
@@ -269,7 +282,7 @@ mediatek_gpio_bank_probe(struct device *dev, int bank)
 			return ret;
 		}

-		girq = &rg->chip.irq;
+		girq = &rg->chip.gc.irq;
 		gpio_irq_chip_set_chip(girq, &mt7621_irq_chip);
 		/* This will let us handle the parent IRQ in the driver */
 		girq->parent_handler = NULL;
@@ -279,17 +292,17 @@ mediatek_gpio_bank_probe(struct device *dev, int bank)
 		girq->handler = handle_simple_irq;
 	}

-	ret = devm_gpiochip_add_data(dev, &rg->chip, mtk);
+	ret = devm_gpiochip_add_data(dev, &rg->chip.gc, mtk);
 	if (ret < 0) {
 		dev_err(dev, "Could not register gpio %d, ret=%d\n",
-			rg->chip.ngpio, ret);
+			rg->chip.gc.ngpio, ret);
 		return ret;
 	}

 	/* set polarity to low for all gpios */
 	mtk_gpio_w32(rg, GPIO_REG_POL, 0);

-	dev_info(dev, "registering %d gpios\n", rg->chip.ngpio);
+	dev_info(dev, "registering %d gpios\n", rg->chip.gc.ngpio);

 	return 0;
 }
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:45 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:45 +0200
Subject: [PATCH v2 09/15] gpio: mt7621: use the generic GPIO chip lock for IRQ handling
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-9-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

This driver uses its own spinlock in interrupt routines while the
generic GPIO chip callbacks use a separate one. This is, of course, racy
so use the fact that the lock in generic GPIO chip is also a spinlock
and convert the interrupt handling functions in this module to using the
provided generic GPIO chip locking API.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-mt7621.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/gpio/gpio-mt7621.c b/drivers/gpio/gpio-mt7621.c
index e56812a1721151c8f3b32b5093aee5c74bb798bc..e7bb9b2cd6cf32baa71b4185ea274075a7bc2d8f 100644
--- a/drivers/gpio/gpio-mt7621.c
+++ b/drivers/gpio/gpio-mt7621.c
@@ -11,7 +11,6 @@
 #include
 #include
 #include
-#include

 #define MTK_BANK_CNT	3
 #define MTK_BANK_WIDTH	32
@@ -32,7 +31,6 @@
 struct mtk_gc {
 	struct irq_chip irq_chip;
 	struct gpio_generic_chip chip;
-	spinlock_t lock;
 	int bank;
 	u32 rising;
 	u32 falling;
@@ -111,12 +109,12 @@ mediatek_gpio_irq_unmask(struct irq_data *d)
 	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
 	struct mtk_gc *rg = to_mediatek_gpio(gc);
 	int pin = d->hwirq;
-	unsigned long flags;
 	u32 rise, fall, high, low;

 	gpiochip_enable_irq(gc, d->hwirq);

-	spin_lock_irqsave(&rg->lock, flags);
+	guard(gpio_generic_lock_irqsave)(&rg->chip);
+
 	rise = mtk_gpio_r32(rg, GPIO_REG_REDGE);
 	fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE);
 	high = mtk_gpio_r32(rg, GPIO_REG_HLVL);
@@ -125,7 +123,6 @@ mediatek_gpio_irq_unmask(struct irq_data *d)
 	mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall | (BIT(pin) & rg->falling));
 	mtk_gpio_w32(rg, GPIO_REG_HLVL, high | (BIT(pin) & rg->hlevel));
 	mtk_gpio_w32(rg, GPIO_REG_LLVL, low | (BIT(pin) & rg->llevel));
-	spin_unlock_irqrestore(&rg->lock, flags);
 }

 static void
@@ -134,19 +131,18 @@ mediatek_gpio_irq_mask(struct irq_data *d)
 	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
 	struct mtk_gc *rg = to_mediatek_gpio(gc);
 	int pin = d->hwirq;
-	unsigned long flags;
 	u32 rise, fall, high, low;

-	spin_lock_irqsave(&rg->lock, flags);
-	rise = mtk_gpio_r32(rg, GPIO_REG_REDGE);
-	fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE);
-	high = mtk_gpio_r32(rg, GPIO_REG_HLVL);
-	low = mtk_gpio_r32(rg, GPIO_REG_LLVL);
-	mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall & ~BIT(pin));
-	mtk_gpio_w32(rg, GPIO_REG_REDGE, rise & ~BIT(pin));
-	mtk_gpio_w32(rg, GPIO_REG_HLVL, high & ~BIT(pin));
-	mtk_gpio_w32(rg, GPIO_REG_LLVL, low & ~BIT(pin));
-	spin_unlock_irqrestore(&rg->lock, flags);
+	scoped_guard(gpio_generic_lock_irqsave, &rg->chip) {
+		rise = mtk_gpio_r32(rg, GPIO_REG_REDGE);
+		fall = mtk_gpio_r32(rg, GPIO_REG_FEDGE);
+		high = mtk_gpio_r32(rg, GPIO_REG_HLVL);
+		low = mtk_gpio_r32(rg, GPIO_REG_LLVL);
+		mtk_gpio_w32(rg, GPIO_REG_FEDGE, fall & ~BIT(pin));
+		mtk_gpio_w32(rg, GPIO_REG_REDGE, rise & ~BIT(pin));
+		mtk_gpio_w32(rg, GPIO_REG_HLVL, high & ~BIT(pin));
+		mtk_gpio_w32(rg, GPIO_REG_LLVL, low & ~BIT(pin));
+	}

 	gpiochip_disable_irq(gc, d->hwirq);
 }
@@ -232,7 +228,6 @@ mediatek_gpio_bank_probe(struct device *dev, int bank)
 	rg = &mtk->gc_map[bank];
 	memset(rg, 0, sizeof(*rg));

-	spin_lock_init(&rg->lock);
 	rg->bank = bank;

 	dat = mtk->base + GPIO_REG_DATA + (rg->bank * GPIO_BANK_STRIDE);
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:46 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:46 +0200
Subject: [PATCH v2 10/15] gpio: menz127: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-10-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-menz127.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/gpio/gpio-menz127.c b/drivers/gpio/gpio-menz127.c
index ebe5da4933bce730c70f83c1c0f86fc4a4cc9906..da2bf9381cc43cd489f6a8593636bbbc95ab5660 100644
--- a/drivers/gpio/gpio-menz127.c
+++ b/drivers/gpio/gpio-menz127.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include

 #define MEN_Z127_CTRL	0x00
 #define MEN_Z127_PSR	0x04
@@ -30,7 +31,7 @@
 				 (db <= MEN_Z127_DB_MAX_US))

 struct men_z127_gpio {
-	struct gpio_chip gc;
+	struct gpio_generic_chip chip;
 	void __iomem *reg_base;
 	struct resource *mem;
 };
@@ -64,7 +65,7 @@ static int men_z127_debounce(struct gpio_chip *gc, unsigned gpio,
 		debounce /= 50;
 	}

-	raw_spin_lock(&gc->bgpio_lock);
+	guard(gpio_generic_lock)(&priv->chip);

 	db_en = readl(priv->reg_base + MEN_Z127_DBER);
@@ -79,8 +80,6 @@ static int men_z127_debounce(struct gpio_chip *gc, unsigned gpio,
 	writel(db_en, priv->reg_base + MEN_Z127_DBER);
 	writel(db_cnt, priv->reg_base + GPIO_TO_DBCNT_REG(gpio));

-	raw_spin_unlock(&gc->bgpio_lock);
-
 	return 0;
 }
@@ -91,7 +90,8 @@ static int men_z127_set_single_ended(struct gpio_chip *gc,
 	struct men_z127_gpio *priv = gpiochip_get_data(gc);
 	u32 od_en;

-	raw_spin_lock(&gc->bgpio_lock);
+	guard(gpio_generic_lock)(&priv->chip);
+
 	od_en = readl(priv->reg_base + MEN_Z127_ODER);

 	if (param == PIN_CONFIG_DRIVE_OPEN_DRAIN)
@@ -101,7 +101,6 @@ static int men_z127_set_single_ended(struct gpio_chip *gc,
 		od_en &= ~BIT(offset);

 	writel(od_en, priv->reg_base + MEN_Z127_ODER);
-	raw_spin_unlock(&gc->bgpio_lock);

 	return 0;
 }
@@ -137,6 +136,7 @@ static void men_z127_release_mem(void *data)
 static int men_z127_probe(struct mcb_device *mdev,
 			  const struct mcb_device_id *id)
 {
+	struct gpio_generic_chip_config config;
 	struct men_z127_gpio *men_z127_gpio;
 	struct device *dev = &mdev->dev;
 	int ret;
@@ -163,18 +163,21 @@ static int men_z127_probe(struct mcb_device *mdev,
 	mcb_set_drvdata(mdev, men_z127_gpio);

-	ret = bgpio_init(&men_z127_gpio->gc, &mdev->dev, 4,
-			 men_z127_gpio->reg_base + MEN_Z127_PSR,
-			 men_z127_gpio->reg_base + MEN_Z127_CTRL,
-			 NULL,
-			 men_z127_gpio->reg_base + MEN_Z127_GPIODR,
-			 NULL, 0);
+	config = (struct gpio_generic_chip_config) {
+		.dev = &mdev->dev,
+		.sz = 4,
+		.dat = men_z127_gpio->reg_base + MEN_Z127_PSR,
+		.set = men_z127_gpio->reg_base + MEN_Z127_CTRL,
+		.dirout = men_z127_gpio->reg_base + MEN_Z127_GPIODR,
+	};
+
+	ret = gpio_generic_chip_init(&men_z127_gpio->chip, &config);
 	if (ret)
 		return ret;

-	men_z127_gpio->gc.set_config = men_z127_set_config;
+	men_z127_gpio->chip.gc.set_config = men_z127_set_config;

-	ret = devm_gpiochip_add_data(dev, &men_z127_gpio->gc, men_z127_gpio);
+	ret = devm_gpiochip_add_data(dev, &men_z127_gpio->chip.gc, men_z127_gpio);
 	if (ret)
 		return dev_err_probe(dev, ret,
 			"failed to register MEN 16Z127 GPIO controller");
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:47 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:47 +0200
Subject: [PATCH v2 11/15] gpio: sifive: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-11-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-sifive.c | 73 ++++++++++++++++++++++++----------------------
 1 file changed, 38 insertions(+), 35 deletions(-)

diff --git a/drivers/gpio/gpio-sifive.c b/drivers/gpio/gpio-sifive.c
index 98ef975c44d9a6c9238605cfd1d5820fd70a66ca..2ced87ffd3bbf219c11857391eb4ea808adc0527 100644
--- a/drivers/gpio/gpio-sifive.c
+++ b/drivers/gpio/gpio-sifive.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -32,7 +33,7 @@
 struct sifive_gpio {
 	void __iomem *base;
-	struct gpio_chip gc;
+	struct gpio_generic_chip gen_gc;
 	struct regmap *regs;
 	unsigned long irq_state;
 	unsigned int trigger[SIFIVE_GPIO_MAX];
@@ -41,10 +42,10 @@ struct sifive_gpio {
 static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset)
 {
-	unsigned long flags;
 	unsigned int trigger;

-	raw_spin_lock_irqsave(&chip->gc.bgpio_lock, flags);
+	guard(gpio_generic_lock_irqsave)(&chip->gen_gc);
+
 	trigger = (chip->irq_state & BIT(offset)) ? chip->trigger[offset] : 0;
 	regmap_update_bits(chip->regs, SIFIVE_GPIO_RISE_IE, BIT(offset),
 			   (trigger & IRQ_TYPE_EDGE_RISING) ? BIT(offset) : 0);
@@ -54,7 +55,6 @@ static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset)
 			   (trigger & IRQ_TYPE_LEVEL_HIGH) ? BIT(offset) : 0);
 	regmap_update_bits(chip->regs, SIFIVE_GPIO_LOW_IE, BIT(offset),
 			   (trigger & IRQ_TYPE_LEVEL_LOW) ? BIT(offset) : 0);
-	raw_spin_unlock_irqrestore(&chip->gc.bgpio_lock, flags);
 }

 static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger)
@@ -72,13 +72,12 @@ static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger)
 }

 static void sifive_gpio_irq_enable(struct irq_data *d)
-{
+ {
 	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
 	struct sifive_gpio *chip = gpiochip_get_data(gc);
 	irq_hw_number_t hwirq = irqd_to_hwirq(d);
 	int offset = hwirq % SIFIVE_GPIO_MAX;
 	u32 bit = BIT(offset);
-	unsigned long flags;

 	gpiochip_enable_irq(gc, hwirq);
 	irq_chip_enable_parent(d);
@@ -86,13 +85,13 @@ static void sifive_gpio_irq_enable(struct irq_data *d)
 	/* Switch to input */
 	gc->direction_input(gc, offset);

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
-	/* Clear any sticky pending interrupts */
-	regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit);
-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) {
+		/* Clear any sticky pending interrupts */
+		regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit);
+	}

 	/* Enable interrupts */
 	assign_bit(offset, &chip->irq_state, 1);
@@ -118,15 +117,14 @@ static void sifive_gpio_irq_eoi(struct irq_data *d)
 	struct sifive_gpio *chip = gpiochip_get_data(gc);
 	int offset = irqd_to_hwirq(d) % SIFIVE_GPIO_MAX;
 	u32 bit = BIT(offset);
-	unsigned long flags;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
-	/* Clear all pending interrupts */
-	regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit);
-	regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit);
-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) {
+		/* Clear all pending interrupts */
+		regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit);
+		regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit);
+	}

 	irq_chip_eoi_parent(d);
 }
@@ -179,6 +177,7 @@ static const struct regmap_config sifive_gpio_regmap_config = {
 static int sifive_gpio_probe(struct platform_device *pdev)
 {
+	struct gpio_generic_chip_config config;
 	struct device *dev = &pdev->dev;
 	struct irq_domain *parent;
 	struct gpio_irq_chip *girq;
@@ -217,13 +216,17 @@ static int sifive_gpio_probe(struct platform_device *pdev)
 	 */
 	parent = irq_get_irq_data(chip->irq_number[0])->domain;

-	ret = bgpio_init(&chip->gc, dev, 4,
-			 chip->base + SIFIVE_GPIO_INPUT_VAL,
-			 chip->base + SIFIVE_GPIO_OUTPUT_VAL,
-			 NULL,
-			 chip->base + SIFIVE_GPIO_OUTPUT_EN,
-			 chip->base + SIFIVE_GPIO_INPUT_EN,
-			 BGPIOF_READ_OUTPUT_REG_SET);
+	config = (struct gpio_generic_chip_config) {
+		.dev = dev,
+		.sz = 4,
+		.dat = chip->base + SIFIVE_GPIO_INPUT_VAL,
+		.set = chip->base + SIFIVE_GPIO_OUTPUT_VAL,
+		.dirout = chip->base + SIFIVE_GPIO_OUTPUT_EN,
+		.dirin = chip->base + SIFIVE_GPIO_INPUT_EN,
+		.flags = BGPIOF_READ_OUTPUT_REG_SET,
+	};
+
+	ret = gpio_generic_chip_init(&chip->gen_gc, &config);
 	if (ret) {
 		dev_err(dev, "unable to init generic GPIO\n");
 		return ret;
@@ -236,12 +239,12 @@ static int sifive_gpio_probe(struct platform_device *pdev)
 	regmap_write(chip->regs, SIFIVE_GPIO_LOW_IE, 0);
 	chip->irq_state = 0;

-	chip->gc.base = -1;
-	chip->gc.ngpio = ngpio;
-	chip->gc.label = dev_name(dev);
-	chip->gc.parent = dev;
-	chip->gc.owner = THIS_MODULE;
-	girq = &chip->gc.irq;
+	chip->gen_gc.gc.base = -1;
+	chip->gen_gc.gc.ngpio = ngpio;
+	chip->gen_gc.gc.label = dev_name(dev);
+	chip->gen_gc.gc.parent = dev;
+	chip->gen_gc.gc.owner = THIS_MODULE;
+	girq = &chip->gen_gc.gc.irq;
 	gpio_irq_chip_set_chip(girq, &sifive_gpio_irqchip);
 	girq->fwnode = dev_fwnode(dev);
 	girq->parent_domain = parent;
@@ -249,7 +252,7 @@ static int sifive_gpio_probe(struct platform_device *pdev)
 	girq->handler = handle_bad_irq;
 	girq->default_type = IRQ_TYPE_NONE;

-	return gpiochip_add_data(&chip->gc, chip);
+	return gpiochip_add_data(&chip->gen_gc.gc, chip);
 }

 static const struct of_device_id sifive_gpio_match[] = {
-- 
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:49 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:49 +0200
Subject: [PATCH v2 13/15] gpio: sodaville: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-13-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.
Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-sodaville.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpio/gpio-sodaville.c b/drivers/gpio/gpio-sodaville.c index abd13c79ace09db228e975f93c92e727d3864ef8..37c1338377295fa2995bac98f1ae2db892209602 100644 --- a/drivers/gpio/gpio-sodaville.c +++ b/drivers/gpio/gpio-sodaville.c @@ -9,6 +9,7 @@ #include #include +#include #include #include #include @@ -39,7 +40,7 @@ struct sdv_gpio_chip_data { void __iomem *gpio_pub_base; struct irq_domain *id; struct irq_chip_generic *gc; - struct gpio_chip chip; + struct gpio_generic_chip gen_gc; }; static int sdv_gpio_pub_set_type(struct irq_data *d, unsigned int type) @@ -180,6 +181,7 @@ static int sdv_register_irqsupport(struct sdv_gpio_chip_data *sd, static int sdv_gpio_probe(struct pci_dev *pdev, const struct pci_device_id *pci_id) { + struct gpio_generic_chip_config config; struct sdv_gpio_chip_data *sd; int ret; u32 mux_val; @@ -206,15 +208,21 @@ static int sdv_gpio_probe(struct pci_dev *pdev, if (!ret) writel(mux_val, sd->gpio_pub_base + GPMUXCTL); - ret = bgpio_init(&sd->chip, &pdev->dev, 4, - sd->gpio_pub_base + GPINR, sd->gpio_pub_base + GPOUTR, - NULL, sd->gpio_pub_base + GPOER, NULL, 0); + config = (struct gpio_generic_chip_config) { + .dev = &pdev->dev, + .sz = 4, + .dat = sd->gpio_pub_base + GPINR, + .set = sd->gpio_pub_base + GPOUTR, + .dirout = sd->gpio_pub_base + GPOER, + }; + + ret = gpio_generic_chip_init(&sd->gen_gc, &config); if (ret) return ret; - sd->chip.ngpio = SDV_NUM_PUB_GPIOS; + sd->gen_gc.gc.ngpio = SDV_NUM_PUB_GPIOS; - ret = devm_gpiochip_add_data(&pdev->dev, &sd->chip, sd); + ret = devm_gpiochip_add_data(&pdev->dev, &sd->gen_gc.gc, sd); if (ret < 0) { dev_err(&pdev->dev, "gpiochip_add() failed.\n"); return ret; -- 2.48.1 From brgl at bgdev.pl Wed Sep 10 00:12:50 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:12:50 +0200 Subject: [PATCH v2 14/15] gpio: mmio: use new 
generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-14-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-mmio.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpio/gpio-mmio.c b/drivers/gpio/gpio-mmio.c
index 79e1be149c94842cb6fa6b657343b11e78701220..b4f0ab0daaeb11bd88723f8b1c15bd09225f1d97 100644
--- a/drivers/gpio/gpio-mmio.c
+++ b/drivers/gpio/gpio-mmio.c
@@ -57,6 +57,7 @@ o ` ~~~~\___/~~~~ ` controller in FPGA is ,.`
 #include
 #include
+#include

 #include "gpiolib.h"

@@ -737,6 +738,8 @@ MODULE_DEVICE_TABLE(of, bgpio_of_match);

 static int bgpio_pdev_probe(struct platform_device *pdev)
 {
+	struct gpio_generic_chip_config config;
+	struct gpio_generic_chip *gen_gc;
 	struct device *dev = &pdev->dev;
 	struct resource *r;
 	void __iomem *dat;
@@ -748,7 +751,6 @@ static int bgpio_pdev_probe(struct platform_device *pdev)
 	unsigned long flags = 0;
 	unsigned int base;
 	int err;
-	struct gpio_chip *gc;
 	const char *label;

 	r = platform_get_resource_byname(pdev, IORESOURCE_MEM, "dat");
@@ -777,8 +779,8 @@ static int bgpio_pdev_probe(struct platform_device *pdev)
 	if (IS_ERR(dirin))
 		return PTR_ERR(dirin);

-	gc = devm_kzalloc(&pdev->dev, sizeof(*gc), GFP_KERNEL);
-	if (!gc)
+	gen_gc = devm_kzalloc(&pdev->dev, sizeof(*gen_gc), GFP_KERNEL);
+	if (!gen_gc)
 		return -ENOMEM;

 	if (device_is_big_endian(dev))
@@ -787,13 +789,24 @@ static int bgpio_pdev_probe(struct platform_device *pdev)
 	if (device_property_read_bool(dev, "no-output"))
 		flags |= BGPIOF_NO_OUTPUT;

-	err = bgpio_init(gc, dev, sz, dat, set, clr, dirout, dirin, flags);
+	config = (struct gpio_generic_chip_config) {
+		.dev = dev,
+		.sz = sz,
+		.dat = dat,
+
.set = set,
+		.clr = clr,
+		.dirout = dirout,
+		.dirin = dirin,
+		.flags = flags,
+	};
+
+	err = gpio_generic_chip_init(gen_gc, &config);
 	if (err)
 		return err;

 	err = device_property_read_string(dev, "label", &label);
 	if (!err)
-		gc->label = label;
+		gen_gc->gc.label = label;

 	/*
 	 * This property *must not* be used in device-tree sources, it's only
@@ -801,11 +814,11 @@ static int bgpio_pdev_probe(struct platform_device *pdev)
 	 */
 	err = device_property_read_u32(dev, "gpio-mmio,base", &base);
 	if (!err && base <= INT_MAX)
-		gc->base = base;
+		gen_gc->gc.base = base;

-	platform_set_drvdata(pdev, gc);
+	platform_set_drvdata(pdev, &gen_gc->gc);

-	return devm_gpiochip_add_data(&pdev->dev, gc, NULL);
+	return devm_gpiochip_add_data(&pdev->dev, &gen_gc->gc, NULL);
 }

 static const struct platform_device_id bgpio_id_table[] = {
--
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:48 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:48 +0200
Subject: [PATCH v2 12/15] gpio: spacemit-k1: use new generic GPIO chip API
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-12-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

Convert the driver to using the new generic GPIO chip interfaces from
linux/gpio/generic.h.
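
The conversions in this series all follow one shape: fill a configuration
structure with designated initializers, leave unused registers out so they
default to NULL, and hand the structure to a single init call. Below is a
minimal user-space sketch of that shape, not kernel code. The names
struct chip_config, struct chip and chip_init() are hypothetical stand-ins
for struct gpio_generic_chip_config, struct gpio_generic_chip and
gpio_generic_chip_init(); the checks mirror the ones visible in the
gpio-mmio.c diff (power-of-two width, width no larger than a long, a
mandatory data register).

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct gpio_generic_chip_config. */
struct chip_config {
	unsigned long sz;	/* register width in bytes */
	void *dat;		/* data register, required */
	void *set;		/* optional "set" register */
	void *clr;		/* optional "clear" register */
	unsigned long flags;
};

/* Hypothetical stand-in for struct gpio_generic_chip. */
struct chip {
	unsigned int bits;
	void *reg_dat;
	void *reg_set;
	void *reg_clr;
};

/* Validate the configuration and copy it into the chip. */
static int chip_init(struct chip *chip, const struct chip_config *cfg)
{
	unsigned long bits;

	/* The width must be a non-zero power of two. */
	if (cfg->sz == 0 || (cfg->sz & (cfg->sz - 1)))
		return -EINVAL;

	/* The register must fit into an unsigned long. */
	bits = cfg->sz * 8;
	if (bits > sizeof(unsigned long) * CHAR_BIT)
		return -EINVAL;

	/* A data register is mandatory. */
	if (!cfg->dat)
		return -EINVAL;

	chip->bits = (unsigned int)bits;
	chip->reg_dat = cfg->dat;
	chip->reg_set = cfg->set;	/* NULL if the caller left it out */
	chip->reg_clr = cfg->clr;	/* NULL if the caller left it out */

	return 0;
}
```

A caller would write something like
"struct chip_config cfg = { .sz = 4, .dat = regs, .set = regs + 1 };" and
pass &cfg to chip_init(); members that are not named (here .clr and .flags)
are zero-initialized, which is what lets the converted drivers drop the
explicit NULL and 0 arguments that bgpio_init() required.
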
Reviewed-by: Yixun Lan
Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/gpio-spacemit-k1.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/gpio/gpio-spacemit-k1.c b/drivers/gpio/gpio-spacemit-k1.c
index 3cc75c701ec40194e602b80d3f96f23204ce3b4d..a0af23f732819be9329af1cb62887dc6eb100ac9 100644
--- a/drivers/gpio/gpio-spacemit-k1.c
+++ b/drivers/gpio/gpio-spacemit-k1.c
@@ -6,6 +6,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -38,7 +39,7 @@
 struct spacemit_gpio;

 struct spacemit_gpio_bank {
-	struct gpio_chip gc;
+	struct gpio_generic_chip chip;
 	struct spacemit_gpio *sg;
 	void __iomem *base;
 	u32 irq_mask;
@@ -72,7 +73,7 @@ static irqreturn_t spacemit_gpio_irq_handler(int irq, void *dev_id)
 		return IRQ_NONE;

 	for_each_set_bit(n, &pending, BITS_PER_LONG)
-		handle_nested_irq(irq_find_mapping(gb->gc.irq.domain, n));
+		handle_nested_irq(irq_find_mapping(gb->chip.gc.irq.domain, n));

 	return IRQ_HANDLED;
 }
@@ -143,7 +144,7 @@ static void spacemit_gpio_irq_print_chip(struct irq_data *data, struct seq_file
 {
 	struct spacemit_gpio_bank *gb = irq_data_get_irq_chip_data(data);

-	seq_printf(p, "%s-%d", dev_name(gb->gc.parent), spacemit_gpio_bank_index(gb));
+	seq_printf(p, "%s-%d", dev_name(gb->chip.gc.parent), spacemit_gpio_bank_index(gb));
 }

 static struct irq_chip spacemit_gpio_chip = {
@@ -165,7 +166,7 @@ static bool spacemit_of_node_instance_match(struct gpio_chip *gc, unsigned int i
 	if (i >= SPACEMIT_NR_BANKS)
 		return false;

-	return (gc == &sg->sgb[i].gc);
+	return (gc == &sg->sgb[i].chip.gc);
 }

 static int spacemit_gpio_add_bank(struct spacemit_gpio *sg,
@@ -173,7 +174,8 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg,
 				  int index, int irq)
 {
 	struct spacemit_gpio_bank *gb = &sg->sgb[index];
-	struct gpio_chip *gc = &gb->gc;
+	struct gpio_generic_chip_config config;
+	struct gpio_chip *gc = &gb->chip.gc;
 	struct device *dev = sg->dev;
 	struct gpio_irq_chip *girq;
 	void __iomem *dat, *set, *clr,
*dirin, *dirout;
@@ -187,9 +189,19 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg,
 	dirin = gb->base + SPACEMIT_GCDR;
 	dirout = gb->base + SPACEMIT_GSDR;

+	config = (struct gpio_generic_chip_config) {
+		.dev = dev,
+		.sz = 4,
+		.dat = dat,
+		.set = set,
+		.clr = clr,
+		.dirout = dirout,
+		.dirin = dirin,
+		.flags = BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR,
+	};
+
 	/* This registers 32 GPIO lines per bank */
-	ret = bgpio_init(gc, dev, 4, dat, set, clr, dirout, dirin,
-			 BGPIOF_UNREADABLE_REG_SET | BGPIOF_UNREADABLE_REG_DIR);
+	ret = gpio_generic_chip_init(&gb->chip, &config);
 	if (ret)
 		return dev_err_probe(dev, ret, "failed to init gpio chip\n");

@@ -221,7 +233,7 @@ static int spacemit_gpio_add_bank(struct spacemit_gpio *sg,
 	ret = devm_request_threaded_irq(dev, irq, NULL,
 					spacemit_gpio_irq_handler,
 					IRQF_ONESHOT | IRQF_SHARED,
-					gb->gc.label, gb);
+					gb->chip.gc.label, gb);
 	if (ret < 0)
 		return dev_err_probe(dev, ret, "failed to register IRQ\n");

--
2.48.1

From brgl at bgdev.pl Wed Sep 10 00:12:51 2025
From: brgl at bgdev.pl (Bartosz Golaszewski)
Date: Wed, 10 Sep 2025 09:12:51 +0200
Subject: [PATCH v2 15/15] gpio: move gpio-mmio-specific fields out of struct gpio_chip
In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID: <20250910-gpio-mmio-gpio-conv-part4-v2-15-f3d1a4c57124@linaro.org>

From: Bartosz Golaszewski

With all users of bgpio_init() converted to using the modernized generic
GPIO chip API, we can now move the gpio-mmio-specific fields out of
struct gpio_chip and into the dedicated struct gpio_generic_chip. To
that end: adjust the gpio-mmio driver to the new layout, update the
docs, etc.
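
The move described above depends on embedding the common structure inside
the specialized one and recovering the outer structure from a pointer to
the embedded member. Below is a minimal user-space sketch of that pattern,
not kernel code. struct base_chip, struct generic_chip, to_generic_chip()
and set_line() are hypothetical stand-ins for struct gpio_chip,
struct gpio_generic_chip, to_gpio_generic_chip() and the bgpio_*()
callbacks, and container_of() is redefined locally with offsetof() as an
assumption for user space.

```c
#include <stddef.h>

/* Local user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct gpio_chip: only common fields remain. */
struct base_chip {
	const char *label;
	unsigned int ngpio;
};

/* Hypothetical stand-in for struct gpio_generic_chip. */
struct generic_chip {
	struct base_chip gc;	/* embedded, so callbacks still take a base_chip */
	unsigned long sdata;	/* shadow of the data register */
	unsigned long sdir;	/* shadow of the direction register */
};

/* Recover the wrapper from a pointer to its embedded base_chip. */
static struct generic_chip *to_generic_chip(struct base_chip *gc)
{
	return container_of(gc, struct generic_chip, gc);
}

/* A callback that only receives the base pointer. */
static void set_line(struct base_chip *gc, unsigned int line, int val)
{
	struct generic_chip *chip = to_generic_chip(gc);

	if (val)
		chip->sdata |= 1UL << line;
	else
		chip->sdata &= ~(1UL << line);
}
```

With this layout a chip that never uses the MMIO helpers carries none of
the shadow-register state, while helper callbacks keep their existing
signatures and reach the extra fields through to_generic_chip().
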
The changes in gpio-mlxbf2.c and gpio-mpc8xxx.c are here and not in their
respective conversion commits because the former passes the address of
the generic chip's lock to the __releases() annotation and we cannot
really hide it while gpio-mpc8xxx.c accesses the shadow registers in a
driver-specific workaround and there's no reason to make them available
in a public API.

Also: drop the relevant task from TODO as it's now done.

Signed-off-by: Bartosz Golaszewski
---
 drivers/gpio/TODO            |   5 -
 drivers/gpio/gpio-mlxbf2.c   |   2 +-
 drivers/gpio/gpio-mmio.c     | 321 ++++++++++++++++++++++---------------------
 drivers/gpio/gpio-mpc8xxx.c  |   5 +-
 include/linux/gpio/driver.h  |  44 ------
 include/linux/gpio/generic.h |  67 ++++++---
 6 files changed, 211 insertions(+), 233 deletions(-)

diff --git a/drivers/gpio/TODO b/drivers/gpio/TODO
index b797499e627ee9fdb1ee9c564b8278241f720850..8ed74e05903a972e99e0789319ed19ebd8545a1a 100644
--- a/drivers/gpio/TODO
+++ b/drivers/gpio/TODO
@@ -131,11 +131,6 @@ Work items:
   helpers (x86 inb()/outb()) and convert port-mapped I/O drivers to use
   this with dry-coding and sending to maintainers to test

-- Move the MMIO GPIO specific fields out of struct gpio_chip into a
-  dedicated structure. Currently every GPIO chip has them if gpio-mmio is
-  enabled in Kconfig even if it itself doesn't register with the helper
-  library.
-
 -------------------------------------------------------------------------------

 Generic regmap GPIO
diff --git a/drivers/gpio/gpio-mlxbf2.c b/drivers/gpio/gpio-mlxbf2.c
index f99f66cd189ca71c9d188dff0a0b42ef2223abb3..9520d26b20a5851ac8b5de239b8f5980dabc2820 100644
--- a/drivers/gpio/gpio-mlxbf2.c
+++ b/drivers/gpio/gpio-mlxbf2.c
@@ -156,7 +156,7 @@ static int mlxbf2_gpio_lock_acquire(struct mlxbf2_gpio_context *gs)
  * Release the YU arm_gpio_lock after changing the direction mode.
  */
 static void mlxbf2_gpio_lock_release(struct mlxbf2_gpio_context *gs)
-	__releases(&gs->chip.gc.bgpio_lock)
+	__releases(&gs->chip.lock)
 	__releases(yu_arm_gpio_lock_param.lock)
 {
 	writel(YU_ARM_GPIO_LOCK_RELEASE, yu_arm_gpio_lock_param.io);
diff --git a/drivers/gpio/gpio-mmio.c b/drivers/gpio/gpio-mmio.c
index b4f0ab0daaeb11bd88723f8b1c15bd09225f1d97..a3df14d672a92ac771014315458cb50933b6c539 100644
--- a/drivers/gpio/gpio-mmio.c
+++ b/drivers/gpio/gpio-mmio.c
@@ -125,20 +125,23 @@ static unsigned long bgpio_read32be(void __iomem *reg)

 static unsigned long bgpio_line2mask(struct gpio_chip *gc, unsigned int line)
 {
-	if (gc->be_bits)
-		return BIT(gc->bgpio_bits - 1 - line);
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	if (chip->be_bits)
+		return BIT(chip->bits - 1 - line);
 	return BIT(line);
 }

 static int bgpio_get_set(struct gpio_chip *gc, unsigned int gpio)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long pinmask = bgpio_line2mask(gc, gpio);
-	bool dir = !!(gc->bgpio_dir & pinmask);
+	bool dir = !!(chip->sdir & pinmask);

 	if (dir)
-		return !!(gc->read_reg(gc->reg_set) & pinmask);
-	else
-		return !!(gc->read_reg(gc->reg_dat) & pinmask);
+		return !!(chip->read_reg(chip->reg_set) & pinmask);
+
+	return !!(chip->read_reg(chip->reg_dat) & pinmask);
 }

 /*
@@ -148,26 +151,28 @@ static int bgpio_get_set(struct gpio_chip *gc, unsigned int gpio)
 static int bgpio_get_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 				  unsigned long *bits)
 {
-	unsigned long get_mask = 0;
-	unsigned long set_mask = 0;
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+	unsigned long get_mask = 0, set_mask = 0;

 	/* Make sure we first clear any bits that are zero when we read the register */
 	*bits &= ~*mask;

-	set_mask = *mask & gc->bgpio_dir;
-	get_mask = *mask & ~gc->bgpio_dir;
+	set_mask = *mask & chip->sdir;
+	get_mask = *mask & ~chip->sdir;

 	if (set_mask)
-		*bits |= gc->read_reg(gc->reg_set) & set_mask;
+		*bits |=
chip->read_reg(chip->reg_set) & set_mask;
 	if (get_mask)
-		*bits |= gc->read_reg(gc->reg_dat) & get_mask;
+		*bits |= chip->read_reg(chip->reg_dat) & get_mask;

 	return 0;
 }

 static int bgpio_get(struct gpio_chip *gc, unsigned int gpio)
 {
-	return !!(gc->read_reg(gc->reg_dat) & bgpio_line2mask(gc, gpio));
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	return !!(chip->read_reg(chip->reg_dat) & bgpio_line2mask(gc, gpio));
 }

 /*
@@ -176,9 +181,11 @@ static int bgpio_get(struct gpio_chip *gc, unsigned int gpio)
 static int bgpio_get_multiple(struct gpio_chip *gc, unsigned long *mask,
 			      unsigned long *bits)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
 	/* Make sure we first clear any bits that are zero when we read the register */
 	*bits &= ~*mask;
-	*bits |= gc->read_reg(gc->reg_dat) & *mask;
+	*bits |= chip->read_reg(chip->reg_dat) & *mask;
 	return 0;
 }

@@ -188,6 +195,7 @@ static int bgpio_get_multiple(struct gpio_chip *gc, unsigned long *mask,
 static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask,
 				 unsigned long *bits)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long readmask = 0;
 	unsigned long val;
 	int bit;
@@ -200,7 +208,7 @@ static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask,
 		readmask |= bgpio_line2mask(gc, bit);

 	/* Read the register */
-	val = gc->read_reg(gc->reg_dat) & readmask;
+	val = chip->read_reg(chip->reg_dat) & readmask;

 	/*
 	 * Mirror the result into the "bits" result, this will give line 0
@@ -219,19 +227,20 @@ static int bgpio_set_none(struct gpio_chip *gc, unsigned int gpio, int val)

 static int bgpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long mask = bgpio_line2mask(gc, gpio);
 	unsigned long flags;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
+	raw_spin_lock_irqsave(&chip->lock, flags);

 	if (val)
-		gc->bgpio_data |= mask;
+		chip->sdata |= mask;
 	else
-		gc->bgpio_data &=
~mask;
+		chip->sdata &= ~mask;

-	gc->write_reg(gc->reg_dat, gc->bgpio_data);
+	chip->write_reg(chip->reg_dat, chip->sdata);

-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	raw_spin_unlock_irqrestore(&chip->lock, flags);

 	return 0;
 }
@@ -239,31 +248,32 @@ static int bgpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
 static int bgpio_set_with_clear(struct gpio_chip *gc, unsigned int gpio,
 				int val)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long mask = bgpio_line2mask(gc, gpio);

 	if (val)
-		gc->write_reg(gc->reg_set, mask);
+		chip->write_reg(chip->reg_set, mask);
 	else
-		gc->write_reg(gc->reg_clr, mask);
+		chip->write_reg(chip->reg_clr, mask);

 	return 0;
 }

 static int bgpio_set_set(struct gpio_chip *gc, unsigned int gpio, int val)
 {
-	unsigned long mask = bgpio_line2mask(gc, gpio);
-	unsigned long flags;
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+	unsigned long mask = bgpio_line2mask(gc, gpio), flags;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
+	raw_spin_lock_irqsave(&chip->lock, flags);

 	if (val)
-		gc->bgpio_data |= mask;
+		chip->sdata |= mask;
 	else
-		gc->bgpio_data &= ~mask;
+		chip->sdata &= ~mask;

-	gc->write_reg(gc->reg_set, gc->bgpio_data);
+	chip->write_reg(chip->reg_set, chip->sdata);

-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	raw_spin_unlock_irqrestore(&chip->lock, flags);

 	return 0;
 }
@@ -273,12 +283,13 @@ static void bgpio_multiple_get_masks(struct gpio_chip *gc,
 				     unsigned long *set_mask,
 				     unsigned long *clear_mask)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	int i;

 	*set_mask = 0;
 	*clear_mask = 0;

-	for_each_set_bit(i, mask, gc->bgpio_bits) {
+	for_each_set_bit(i, mask, chip->bits) {
 		if (test_bit(i, bits))
 			*set_mask |= bgpio_line2mask(gc, i);
 		else
@@ -291,25 +302,27 @@ static void bgpio_set_multiple_single_reg(struct gpio_chip *gc,
 					  unsigned long *bits,
 					  void __iomem *reg)
 {
-	unsigned long flags;
-	unsigned long set_mask, clear_mask;
+	struct gpio_generic_chip *chip
= to_gpio_generic_chip(gc);
+	unsigned long flags, set_mask, clear_mask;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
+	raw_spin_lock_irqsave(&chip->lock, flags);

 	bgpio_multiple_get_masks(gc, mask, bits, &set_mask, &clear_mask);

-	gc->bgpio_data |= set_mask;
-	gc->bgpio_data &= ~clear_mask;
+	chip->sdata |= set_mask;
+	chip->sdata &= ~clear_mask;

-	gc->write_reg(reg, gc->bgpio_data);
+	chip->write_reg(reg, chip->sdata);

-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	raw_spin_unlock_irqrestore(&chip->lock, flags);
 }

 static int bgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 			       unsigned long *bits)
 {
-	bgpio_set_multiple_single_reg(gc, mask, bits, gc->reg_dat);
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	bgpio_set_multiple_single_reg(gc, mask, bits, chip->reg_dat);

 	return 0;
 }
@@ -317,7 +330,9 @@ static int bgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 static int bgpio_set_multiple_set(struct gpio_chip *gc, unsigned long *mask,
 				  unsigned long *bits)
 {
-	bgpio_set_multiple_single_reg(gc, mask, bits, gc->reg_set);
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	bgpio_set_multiple_single_reg(gc, mask, bits, chip->reg_set);

 	return 0;
 }
@@ -326,21 +341,24 @@ static int bgpio_set_multiple_with_clear(struct gpio_chip *gc,
 					  unsigned long *mask,
 					  unsigned long *bits)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long set_mask, clear_mask;

 	bgpio_multiple_get_masks(gc, mask, bits, &set_mask, &clear_mask);

 	if (set_mask)
-		gc->write_reg(gc->reg_set, set_mask);
+		chip->write_reg(chip->reg_set, set_mask);
 	if (clear_mask)
-		gc->write_reg(gc->reg_clr, clear_mask);
+		chip->write_reg(chip->reg_clr, clear_mask);

 	return 0;
 }

 static int bgpio_dir_return(struct gpio_chip *gc, unsigned int gpio, bool dir_out)
 {
-	if (!gc->bgpio_pinctrl)
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	if (!chip->pinctrl)
 		return 0;

 	if (dir_out)
@@ -375,39 +393,42 @@ static int
bgpio_simple_dir_out(struct gpio_chip *gc, unsigned int gpio,

 static int bgpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long flags;

-	raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
+	raw_spin_lock_irqsave(&chip->lock, flags);

-	gc->bgpio_dir &= ~bgpio_line2mask(gc, gpio);
+	chip->sdir &= ~bgpio_line2mask(gc, gpio);

-	if (gc->reg_dir_in)
-		gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir);
-	if (gc->reg_dir_out)
-		gc->write_reg(gc->reg_dir_out, gc->bgpio_dir);
+	if (chip->reg_dir_in)
+		chip->write_reg(chip->reg_dir_in, ~chip->sdir);
+	if (chip->reg_dir_out)
+		chip->write_reg(chip->reg_dir_out, chip->sdir);

-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	raw_spin_unlock_irqrestore(&chip->lock, flags);

 	return bgpio_dir_return(gc, gpio, false);
 }

 static int bgpio_get_dir(struct gpio_chip *gc, unsigned int gpio)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
 	/* Return 0 if output, 1 if input */
-	if (gc->bgpio_dir_unreadable) {
-		if (gc->bgpio_dir & bgpio_line2mask(gc, gpio))
+	if (chip->dir_unreadable) {
+		if (chip->sdir & bgpio_line2mask(gc, gpio))
 			return GPIO_LINE_DIRECTION_OUT;
 		return GPIO_LINE_DIRECTION_IN;
 	}

-	if (gc->reg_dir_out) {
-		if (gc->read_reg(gc->reg_dir_out) & bgpio_line2mask(gc, gpio))
+	if (chip->reg_dir_out) {
+		if (chip->read_reg(chip->reg_dir_out) & bgpio_line2mask(gc, gpio))
 			return GPIO_LINE_DIRECTION_OUT;
 		return GPIO_LINE_DIRECTION_IN;
 	}

-	if (gc->reg_dir_in)
-		if (!(gc->read_reg(gc->reg_dir_in) & bgpio_line2mask(gc, gpio)))
+	if (chip->reg_dir_in)
+		if (!(chip->read_reg(chip->reg_dir_in) & bgpio_line2mask(gc, gpio)))
 			return GPIO_LINE_DIRECTION_OUT;

 	return GPIO_LINE_DIRECTION_IN;
@@ -415,18 +436,19 @@ static int bgpio_get_dir(struct gpio_chip *gc, unsigned int gpio)

 static void bgpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
 {
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
 	unsigned long flags;

-
raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
+	raw_spin_lock_irqsave(&chip->lock, flags);

-	gc->bgpio_dir |= bgpio_line2mask(gc, gpio);
+	chip->sdir |= bgpio_line2mask(gc, gpio);

-	if (gc->reg_dir_in)
-		gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir);
-	if (gc->reg_dir_out)
-		gc->write_reg(gc->reg_dir_out, gc->bgpio_dir);
+	if (chip->reg_dir_in)
+		chip->write_reg(chip->reg_dir_in, ~chip->sdir);
+	if (chip->reg_dir_out)
+		chip->write_reg(chip->reg_dir_out, chip->sdir);

-	raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
+	raw_spin_unlock_irqrestore(&chip->lock, flags);
 }

 static int bgpio_dir_out_dir_first(struct gpio_chip *gc, unsigned int gpio,
@@ -446,31 +468,30 @@ static int bgpio_dir_out_val_first(struct gpio_chip *gc, unsigned int gpio,
 }

 static int bgpio_setup_accessors(struct device *dev,
-				 struct gpio_chip *gc,
+				 struct gpio_generic_chip *chip,
 				 bool byte_be)
 {
-
-	switch (gc->bgpio_bits) {
+	switch (chip->bits) {
 	case 8:
-		gc->read_reg = bgpio_read8;
-		gc->write_reg = bgpio_write8;
+		chip->read_reg = bgpio_read8;
+		chip->write_reg = bgpio_write8;
 		break;
 	case 16:
 		if (byte_be) {
-			gc->read_reg = bgpio_read16be;
-			gc->write_reg = bgpio_write16be;
+			chip->read_reg = bgpio_read16be;
+			chip->write_reg = bgpio_write16be;
 		} else {
-			gc->read_reg = bgpio_read16;
-			gc->write_reg = bgpio_write16;
+			chip->read_reg = bgpio_read16;
+			chip->write_reg = bgpio_write16;
 		}
 		break;
 	case 32:
 		if (byte_be) {
-			gc->read_reg = bgpio_read32be;
-			gc->write_reg = bgpio_write32be;
+			chip->read_reg = bgpio_read32be;
+			chip->write_reg = bgpio_write32be;
 		} else {
-			gc->read_reg = bgpio_read32;
-			gc->write_reg = bgpio_write32;
+			chip->read_reg = bgpio_read32;
+			chip->write_reg = bgpio_write32;
 		}
 		break;
 #if BITS_PER_LONG >= 64
@@ -480,13 +501,13 @@ static int bgpio_setup_accessors(struct device *dev,
 				"64 bit big endian byte order unsupported\n");
 			return -EINVAL;
 		} else {
-			gc->read_reg = bgpio_read64;
-			gc->write_reg = bgpio_write64;
+			chip->read_reg = bgpio_read64;
+			chip->write_reg =
bgpio_write64;
 		}
 		break;
 #endif /* BITS_PER_LONG >= 64 */
 	default:
-		dev_err(dev, "unsupported data width %u bits\n", gc->bgpio_bits);
+		dev_err(dev, "unsupported data width %u bits\n", chip->bits);
 		return -EINVAL;
 	}

@@ -515,27 +536,25 @@ static int bgpio_setup_accessors(struct device *dev,
  *	- an input direction register (named "dirin") where a 1 bit indicates
  *	the GPIO is an input.
  */
-static int bgpio_setup_io(struct gpio_chip *gc,
-			  void __iomem *dat,
-			  void __iomem *set,
-			  void __iomem *clr,
-			  unsigned long flags)
+static int bgpio_setup_io(struct gpio_generic_chip *chip,
+			  const struct gpio_generic_chip_config *cfg)
 {
+	struct gpio_chip *gc = &chip->gc;

-	gc->reg_dat = dat;
-	if (!gc->reg_dat)
+	chip->reg_dat = cfg->dat;
+	if (!chip->reg_dat)
 		return -EINVAL;

-	if (set && clr) {
-		gc->reg_set = set;
-		gc->reg_clr = clr;
+	if (cfg->set && cfg->clr) {
+		chip->reg_set = cfg->set;
+		chip->reg_clr = cfg->clr;
 		gc->set = bgpio_set_with_clear;
 		gc->set_multiple = bgpio_set_multiple_with_clear;
-	} else if (set && !clr) {
-		gc->reg_set = set;
+	} else if (cfg->set && !cfg->clr) {
+		chip->reg_set = cfg->set;
 		gc->set = bgpio_set_set;
 		gc->set_multiple = bgpio_set_multiple_set;
-	} else if (flags & BGPIOF_NO_OUTPUT) {
+	} else if (cfg->flags & BGPIOF_NO_OUTPUT) {
 		gc->set = bgpio_set_none;
 		gc->set_multiple = NULL;
 	} else {
@@ -543,10 +562,10 @@ static int bgpio_setup_io(struct gpio_chip *gc,
 		gc->set_multiple = bgpio_set_multiple;
 	}

-	if (!(flags & BGPIOF_UNREADABLE_REG_SET) &&
-	    (flags & BGPIOF_READ_OUTPUT_REG_SET)) {
+	if (!(cfg->flags & BGPIOF_UNREADABLE_REG_SET) &&
+	    (cfg->flags & BGPIOF_READ_OUTPUT_REG_SET)) {
 		gc->get = bgpio_get_set;
-		if (!gc->be_bits)
+		if (!chip->be_bits)
 			gc->get_multiple = bgpio_get_set_multiple;
 		/*
 		 * We deliberately avoid assigning the ->get_multiple() call
@@ -557,7 +576,7 @@ static int bgpio_setup_io(struct gpio_chip *gc,
 		 */
 	} else {
 		gc->get = bgpio_get;
-		if (gc->be_bits)
+		if (chip->be_bits)
 			gc->get_multiple = bgpio_get_multiple_be;
 		else
gc->get_multiple = bgpio_get_multiple;
@@ -566,27 +585,27 @@ static int bgpio_setup_io(struct gpio_chip *gc,
 	return 0;
 }

-static int bgpio_setup_direction(struct gpio_chip *gc,
-				 void __iomem *dirout,
-				 void __iomem *dirin,
-				 unsigned long flags)
+static int bgpio_setup_direction(struct gpio_generic_chip *chip,
+				 const struct gpio_generic_chip_config *cfg)
 {
-	if (dirout || dirin) {
-		gc->reg_dir_out = dirout;
-		gc->reg_dir_in = dirin;
-		if (flags & BGPIOF_NO_SET_ON_INPUT)
+	struct gpio_chip *gc = &chip->gc;
+
+	if (cfg->dirout || cfg->dirin) {
+		chip->reg_dir_out = cfg->dirout;
+		chip->reg_dir_in = cfg->dirin;
+		if (cfg->flags & BGPIOF_NO_SET_ON_INPUT)
 			gc->direction_output = bgpio_dir_out_dir_first;
 		else
 			gc->direction_output = bgpio_dir_out_val_first;
 		gc->direction_input = bgpio_dir_in;
 		gc->get_direction = bgpio_get_dir;
 	} else {
-		if (flags & BGPIOF_NO_OUTPUT)
+		if (cfg->flags & BGPIOF_NO_OUTPUT)
 			gc->direction_output = bgpio_dir_out_err;
 		else
 			gc->direction_output = bgpio_simple_dir_out;

-		if (flags & BGPIOF_NO_INPUT)
+		if (cfg->flags & BGPIOF_NO_INPUT)
 			gc->direction_input = bgpio_dir_in_err;
 		else
 			gc->direction_input = bgpio_simple_dir_in;
@@ -595,117 +614,101 @@ static int bgpio_setup_direction(struct gpio_chip *gc,
 	return 0;
 }

-static int bgpio_request(struct gpio_chip *chip, unsigned gpio_pin)
+static int bgpio_request(struct gpio_chip *gc, unsigned int gpio_pin)
 {
-	if (gpio_pin >= chip->ngpio)
+	struct gpio_generic_chip *chip = to_gpio_generic_chip(gc);
+
+	if (gpio_pin >= gc->ngpio)
 		return -EINVAL;

-	if (chip->bgpio_pinctrl)
-		return gpiochip_generic_request(chip, gpio_pin);
+	if (chip->pinctrl)
+		return gpiochip_generic_request(gc, gpio_pin);

 	return 0;
 }

 /**
- * bgpio_init() - Initialize generic GPIO accessor functions
- * @gc: the GPIO chip to set up
- * @dev: the parent device of the new GPIO chip (compulsory)
- * @sz: the size (width) of the MMIO registers in bytes, typically 1, 2 or 4
- * @dat: MMIO address for the register to READ the value of the GPIO
lines, it
- *	is expected that a 1 in the corresponding bit in this register means the
- *	line is asserted
- * @set: MMIO address for the register to SET the value of the GPIO lines, it is
- *	expected that we write the line with 1 in this register to drive the GPIO line
- *	high.
- * @clr: MMIO address for the register to CLEAR the value of the GPIO lines, it is
- *	expected that we write the line with 1 in this register to drive the GPIO line
- *	low. It is allowed to leave this address as NULL, in that case the SET register
- *	will be assumed to also clear the GPIO lines, by actively writing the line
- *	with 0.
- * @dirout: MMIO address for the register to set the line as OUTPUT. It is assumed
- *	that setting a line to 1 in this register will turn that line into an
- *	output line. Conversely, setting the line to 0 will turn that line into
- *	an input.
- * @dirin: MMIO address for the register to set this line as INPUT. It is assumed
- *	that setting a line to 1 in this register will turn that line into an
- *	input line. Conversely, setting the line to 0 will turn that line into
- *	an output.
- * @flags: Different flags that will affect the behaviour of the device, such as
- *	endianness etc.
+ * gpio_generic_chip_init() - Initialize a generic GPIO chip.
+ * @chip: Generic GPIO chip to set up.
+ * @cfg: Generic GPIO chip configuration.
+ *
+ * Returns 0 on success, negative error number on failure.
  */
-int bgpio_init(struct gpio_chip *gc, struct device *dev,
-	       unsigned long sz, void __iomem *dat, void __iomem *set,
-	       void __iomem *clr, void __iomem *dirout, void __iomem *dirin,
-	       unsigned long flags)
+int gpio_generic_chip_init(struct gpio_generic_chip *chip,
+			   const struct gpio_generic_chip_config *cfg)
 {
+	struct gpio_chip *gc = &chip->gc;
+	unsigned long flags = cfg->flags;
+	struct device *dev = cfg->dev;
 	int ret;

-	if (!is_power_of_2(sz))
+	if (!is_power_of_2(cfg->sz))
 		return -EINVAL;

-	gc->bgpio_bits = sz * 8;
-	if (gc->bgpio_bits > BITS_PER_LONG)
+	chip->bits = cfg->sz * 8;
+	if (chip->bits > BITS_PER_LONG)
 		return -EINVAL;

-	raw_spin_lock_init(&gc->bgpio_lock);
+	raw_spin_lock_init(&chip->lock);
 	gc->parent = dev;
 	gc->label = dev_name(dev);
 	gc->base = -1;
 	gc->request = bgpio_request;
-	gc->be_bits = !!(flags & BGPIOF_BIG_ENDIAN);
+	chip->be_bits = !!(flags & BGPIOF_BIG_ENDIAN);

 	ret = gpiochip_get_ngpios(gc, dev);
 	if (ret)
-		gc->ngpio = gc->bgpio_bits;
+		gc->ngpio = chip->bits;

-	ret = bgpio_setup_io(gc, dat, set, clr, flags);
+	ret = bgpio_setup_io(chip, cfg);
 	if (ret)
 		return ret;

-	ret = bgpio_setup_accessors(dev, gc, flags & BGPIOF_BIG_ENDIAN_BYTE_ORDER);
+	ret = bgpio_setup_accessors(dev, chip,
+				    flags & BGPIOF_BIG_ENDIAN_BYTE_ORDER);
 	if (ret)
 		return ret;

-	ret = bgpio_setup_direction(gc, dirout, dirin, flags);
+	ret = bgpio_setup_direction(chip, cfg);
 	if (ret)
 		return ret;

 	if (flags & BGPIOF_PINCTRL_BACKEND) {
-		gc->bgpio_pinctrl = true;
+		chip->pinctrl = true;
 		/* Currently this callback is only used for pincontrol */
 		gc->free = gpiochip_generic_free;
 	}

-	gc->bgpio_data = gc->read_reg(gc->reg_dat);
+	chip->sdata = chip->read_reg(chip->reg_dat);
 	if (gc->set == bgpio_set_set &&
 			!(flags & BGPIOF_UNREADABLE_REG_SET))
-		gc->bgpio_data = gc->read_reg(gc->reg_set);
+		chip->sdata = chip->read_reg(chip->reg_set);

 	if (flags & BGPIOF_UNREADABLE_REG_DIR)
-		gc->bgpio_dir_unreadable = true;
+		chip->dir_unreadable = true;

 	/*
 	 * Inspect hardware to find initial
direction setting.
 	 */
-	if ((gc->reg_dir_out || gc->reg_dir_in) &&
+	if ((chip->reg_dir_out || chip->reg_dir_in) &&
 	    !(flags & BGPIOF_UNREADABLE_REG_DIR)) {
-		if (gc->reg_dir_out)
-			gc->bgpio_dir = gc->read_reg(gc->reg_dir_out);
-		else if (gc->reg_dir_in)
-			gc->bgpio_dir = ~gc->read_reg(gc->reg_dir_in);
+		if (chip->reg_dir_out)
+			chip->sdir = chip->read_reg(chip->reg_dir_out);
+		else if (chip->reg_dir_in)
+			chip->sdir = ~chip->read_reg(chip->reg_dir_in);
 		/*
 		 * If we have two direction registers, synchronise
 		 * input setting to output setting, the library
 		 * can not handle a line being input and output at
 		 * the same time.
 		 */
-		if (gc->reg_dir_out && gc->reg_dir_in)
-			gc->write_reg(gc->reg_dir_in, ~gc->bgpio_dir);
+		if (chip->reg_dir_out && chip->reg_dir_in)
+			chip->write_reg(chip->reg_dir_in, ~chip->sdir);
 	}

 	return ret;
 }
-EXPORT_SYMBOL_GPL(bgpio_init);
+EXPORT_SYMBOL_GPL(gpio_generic_chip_init);

 #if IS_ENABLED(CONFIG_GPIO_GENERIC_PLATFORM)

diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c
index 38643fb813c562957076aab48d804f8048cee5e4..2bb6100840ea27fb63ce7cdc3e1eb3e43526eb4d 100644
--- a/drivers/gpio/gpio-mpc8xxx.c
+++ b/drivers/gpio/gpio-mpc8xxx.c
@@ -71,7 +71,7 @@ static int mpc8572_gpio_get(struct gpio_chip *gc, unsigned int gpio)
 					mpc8xxx_gc->regs + GPIO_DIR);
 	val = gpio_generic_read_reg(&mpc8xxx_gc->chip,
 				    mpc8xxx_gc->regs + GPIO_DAT) & ~out_mask;
-	out_shadow = gc->bgpio_data & out_mask;
+	out_shadow = mpc8xxx_gc->chip.sdata & out_mask;

 	return !!((val | out_shadow) & mpc_pin2mask(gpio));
 }
@@ -399,7 +399,8 @@ static int mpc8xxx_probe(struct platform_device *pdev)
 		gpio_generic_write_reg(&mpc8xxx_gc->chip,
 				       mpc8xxx_gc->regs + GPIO_IBE, 0xffffffff);
 		/* Also, latch state of GPIOs configured as output by bootloader.
*/
-		gc->bgpio_data = gpio_generic_read_reg(&mpc8xxx_gc->chip,
+		mpc8xxx_gc->chip.sdata =
+			gpio_generic_read_reg(&mpc8xxx_gc->chip,
 						mpc8xxx_gc->regs + GPIO_DAT) &
 			gpio_generic_read_reg(&mpc8xxx_gc->chip,
 						mpc8xxx_gc->regs + GPIO_DIR);
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
index 9fcd4a988081f74d25dc88535705ba9265e56fd2..9b14fd20f13eee7d465e065e7ded2c92e2bbc78e 100644
--- a/include/linux/gpio/driver.h
+++ b/include/linux/gpio/driver.h
@@ -388,28 +388,6 @@ struct gpio_irq_chip {
  *	implies that if the chip supports IRQs, these IRQs need to be threaded
  *	as the chip access may sleep when e.g. reading out the IRQ status
  *	registers.
- * @read_reg: reader function for generic GPIO
- * @write_reg: writer function for generic GPIO
- * @be_bits: if the generic GPIO has big endian bit order (bit 31 is representing
- *	line 0, bit 30 is line 1 ... bit 0 is line 31) this is set to true by the
- *	generic GPIO core. It is for internal housekeeping only.
- * @reg_dat: data (in) register for generic GPIO
- * @reg_set: output set register (out=high) for generic GPIO
- * @reg_clr: output clear register (out=low) for generic GPIO
- * @reg_dir_out: direction out setting register for generic GPIO
- * @reg_dir_in: direction in setting register for generic GPIO
- * @bgpio_dir_unreadable: indicates that the direction register(s) cannot
- *	be read and we need to rely on out internal state tracking.
- * @bgpio_pinctrl: the generic GPIO uses a pin control backend.
- * @bgpio_bits: number of register bits used for a generic GPIO i.e.
- *	* 8
- * @bgpio_lock: used to lock chip->bgpio_data. Also, this is needed to keep
- *	shadowed and real data registers writes together.
- * @bgpio_data: shadowed data register for generic GPIO to clear/set bits
- *	safely.
- * @bgpio_dir: shadowed direction register for generic GPIO to clear/set
- *	direction safely. A "1" in this word means the line is set as
- *	output.
* * A gpio_chip can help platforms abstract various sources of GPIOs so * they can all be accessed through a common programming interface. @@ -475,23 +453,6 @@ struct gpio_chip { const char *const *names; bool can_sleep; -#if IS_ENABLED(CONFIG_GPIO_GENERIC) - unsigned long (*read_reg)(void __iomem *reg); - void (*write_reg)(void __iomem *reg, unsigned long data); - bool be_bits; - void __iomem *reg_dat; - void __iomem *reg_set; - void __iomem *reg_clr; - void __iomem *reg_dir_out; - void __iomem *reg_dir_in; - bool bgpio_dir_unreadable; - bool bgpio_pinctrl; - int bgpio_bits; - raw_spinlock_t bgpio_lock; - unsigned long bgpio_data; - unsigned long bgpio_dir; -#endif /* CONFIG_GPIO_GENERIC */ - #ifdef CONFIG_GPIOLIB_IRQCHIP /* * With CONFIG_GPIOLIB_IRQCHIP we get an irqchip inside the gpiolib @@ -723,11 +684,6 @@ int gpiochip_populate_parent_fwspec_fourcell(struct gpio_chip *gc, #endif /* CONFIG_IRQ_DOMAIN_HIERARCHY */ -int bgpio_init(struct gpio_chip *gc, struct device *dev, - unsigned long sz, void __iomem *dat, void __iomem *set, - void __iomem *clr, void __iomem *dirout, void __iomem *dirin, - unsigned long flags); - #define BGPIOF_BIG_ENDIAN BIT(0) #define BGPIOF_UNREADABLE_REG_SET BIT(1) /* reg_set is unreadable */ #define BGPIOF_UNREADABLE_REG_DIR BIT(2) /* reg_dir is unreadable */ diff --git a/include/linux/gpio/generic.h b/include/linux/gpio/generic.h index 4c0626b53ec90388a034bc7797eefa53e7ea064e..162430d96660e96b995eb4a2e64183503fc618e3 100644 --- a/include/linux/gpio/generic.h +++ b/include/linux/gpio/generic.h @@ -50,9 +50,44 @@ struct gpio_generic_chip_config { * struct gpio_generic_chip - Generic GPIO chip implementation. * @gc: The underlying struct gpio_chip object, implementing low-level GPIO * chip routines. + * @read_reg: reader function for generic GPIO + * @write_reg: writer function for generic GPIO + * @be_bits: if the generic GPIO has big endian bit order (bit 31 is + * representing line 0, bit 30 is line 1 ... 
bit 0 is line 31) this + * is set to true by the generic GPIO core. It is for internal + * housekeeping only. + * @reg_dat: data (in) register for generic GPIO + * @reg_set: output set register (out=high) for generic GPIO + * @reg_clr: output clear register (out=low) for generic GPIO + * @reg_dir_out: direction out setting register for generic GPIO + * @reg_dir_in: direction in setting register for generic GPIO + * @dir_unreadable: indicates that the direction register(s) cannot be read and + * we need to rely on out internal state tracking. + * @pinctrl: the generic GPIO uses a pin control backend. + * @bits: number of register bits used for a generic GPIO + * i.e. * 8 + * @lock: used to lock chip->sdata. Also, this is needed to keep + * shadowed and real data registers writes together. + * @sdata: shadowed data register for generic GPIO to clear/set bits safely. + * @sdir: shadowed direction register for generic GPIO to clear/set direction + * safely. A "1" in this word means the line is set as output. */ struct gpio_generic_chip { struct gpio_chip gc; + unsigned long (*read_reg)(void __iomem *reg); + void (*write_reg)(void __iomem *reg, unsigned long data); + bool be_bits; + void __iomem *reg_dat; + void __iomem *reg_set; + void __iomem *reg_clr; + void __iomem *reg_dir_out; + void __iomem *reg_dir_in; + bool dir_unreadable; + bool pinctrl; + int bits; + raw_spinlock_t lock; + unsigned long sdata; + unsigned long sdir; }; static inline struct gpio_generic_chip * @@ -61,20 +96,8 @@ to_gpio_generic_chip(struct gpio_chip *gc) return container_of(gc, struct gpio_generic_chip, gc); } -/** - * gpio_generic_chip_init() - Initialize a generic GPIO chip. - * @chip: Generic GPIO chip to set up. - * @cfg: Generic GPIO chip configuration. - * - * Returns 0 on success, negative error number on failure. 
- */ -static inline int -gpio_generic_chip_init(struct gpio_generic_chip *chip, - const struct gpio_generic_chip_config *cfg) -{ - return bgpio_init(&chip->gc, cfg->dev, cfg->sz, cfg->dat, cfg->set, - cfg->clr, cfg->dirout, cfg->dirin, cfg->flags); -} +int gpio_generic_chip_init(struct gpio_generic_chip *chip, + const struct gpio_generic_chip_config *cfg); /** * gpio_generic_chip_set() - Set the GPIO line value of the generic GPIO chip. @@ -110,10 +133,10 @@ gpio_generic_chip_set(struct gpio_generic_chip *chip, unsigned int offset, static inline unsigned long gpio_generic_read_reg(struct gpio_generic_chip *chip, void __iomem *reg) { - if (WARN_ON(!chip->gc.read_reg)) + if (WARN_ON(!chip->read_reg)) return 0; - return chip->gc.read_reg(reg); + return chip->read_reg(reg); } /** @@ -125,23 +148,23 @@ gpio_generic_read_reg(struct gpio_generic_chip *chip, void __iomem *reg) static inline void gpio_generic_write_reg(struct gpio_generic_chip *chip, void __iomem *reg, unsigned long val) { - if (WARN_ON(!chip->gc.write_reg)) + if (WARN_ON(!chip->write_reg)) return; - chip->gc.write_reg(reg, val); + chip->write_reg(reg, val); } #define gpio_generic_chip_lock(gen_gc) \ - raw_spin_lock(&(gen_gc)->gc.bgpio_lock) + raw_spin_lock(&(gen_gc)->lock) #define gpio_generic_chip_unlock(gen_gc) \ - raw_spin_unlock(&(gen_gc)->gc.bgpio_lock) + raw_spin_unlock(&(gen_gc)->lock) #define gpio_generic_chip_lock_irqsave(gen_gc, flags) \ - raw_spin_lock_irqsave(&(gen_gc)->gc.bgpio_lock, flags) + raw_spin_lock_irqsave(&(gen_gc)->lock, flags) #define gpio_generic_chip_unlock_irqrestore(gen_gc, flags) \ - raw_spin_unlock_irqrestore(&(gen_gc)->gc.bgpio_lock, flags) + raw_spin_unlock_irqrestore(&(gen_gc)->lock, flags) DEFINE_LOCK_GUARD_1(gpio_generic_lock, struct gpio_generic_chip, -- 2.48.1 From andy.shevchenko at gmail.com Wed Sep 10 00:19:07 2025 From: andy.shevchenko at gmail.com (Andy Shevchenko) Date: Wed, 10 Sep 2025 10:19:07 +0300 Subject: [PATCH v2 13/15] gpio: sodaville: use new generic GPIO 
chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-13-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-13-f3d1a4c57124@linaro.org> Message-ID: On Wed, Sep 10, 2025 at 10:13 AM Bartosz Golaszewski wrote: > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. In case you want to take it Reviewed-by: Andy Shevchenko Otherwise I can take it via my tree and then PR to you. -- With Best Regards, Andy Shevchenko From brgl at bgdev.pl Wed Sep 10 00:28:59 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Wed, 10 Sep 2025 09:28:59 +0200 Subject: [PATCH v2 13/15] gpio: sodaville: use new generic GPIO chip API In-Reply-To: References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-13-f3d1a4c57124@linaro.org> Message-ID: On Wed, Sep 10, 2025 at 9:19 AM Andy Shevchenko wrote: > > On Wed, Sep 10, 2025 at 10:13 AM Bartosz Golaszewski wrote: > > > > Convert the driver to using the new generic GPIO chip interfaces from > > linux/gpio/generic.h. > > In case you want to take it > Reviewed-by: Andy Shevchenko > Otherwise I can take it via my tree and then PR to you. > I would prefer to apply the whole series directly, this way the conversion will be done in one go.
Bart From zhang.lyra at gmail.com Wed Sep 10 01:25:28 2025 From: zhang.lyra at gmail.com (Chunyan Zhang) Date: Wed, 10 Sep 2025 16:25:28 +0800 Subject: [PATCH V10 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: <6b2f12aa-8ed9-476d-a69d-f05ea526f16a@redhat.com> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> <6b2f12aa-8ed9-476d-a69d-f05ea526f16a@redhat.com> Message-ID: Hi David, On Tue, 9 Sept 2025 at 19:42, David Hildenbrand wrote: > > On 09.09.25 11:56, Chunyan Zhang wrote: > > Some platforms can customize the PTE soft dirty bit and make it unavailable > > even if the architecture allows providing the PTE resource. > > > > Add an API which architectures can define their specific implementations > > to detect if the PTE soft-dirty bit is available, on which the kernel > > is running. > > > > Signed-off-by: Chunyan Zhang > > --- > > fs/proc/task_mmu.c | 17 ++++++++++++++++- > > include/linux/pgtable.h | 10 ++++++++++ > > mm/debug_vm_pgtable.c | 9 +++++---- > > mm/huge_memory.c | 10 ++++++---- > > mm/internal.h | 2 +- > > mm/mremap.c | 10 ++++++---- > > mm/userfaultfd.c | 6 ++++-- > > 7 files changed, 48 insertions(+), 16 deletions(-) > > > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > > index 29cca0e6d0ff..20a609ec1ba6 100644 > > --- a/fs/proc/task_mmu.c > > +++ b/fs/proc/task_mmu.c > > @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > > * -Werror=unterminated-string-initialization warning > > * with GCC 15 > > */ > > - static const char mnemonics[BITS_PER_LONG][3] = { > > + static char mnemonics[BITS_PER_LONG][3] = { > > /* > > * In case if we meet a flag we don't know about. 
> > */ > > @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > > [ilog2(VM_SEALED)] = "sl", > > #endif > > }; > > +/* > > + * We should remove the VM_SOFTDIRTY flag if the PTE soft-dirty bit is > > + * unavailable on which the kernel is running, even if the architecture > > + * allows providing the PTE resource and soft-dirty is compiled in. > > + */ > > +#ifdef CONFIG_MEM_SOFT_DIRTY > > + if (!pte_soft_dirty_available()) > > + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > > +#endif > > + > > size_t i; > > > > seq_puts(m, "VmFlags: "); > > @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, > > static inline void clear_soft_dirty(struct vm_area_struct *vma, > > unsigned long addr, pte_t *pte) > > { > > + if (!pte_soft_dirty_available()) > > + return; > > /* > > * The soft-dirty tracker uses #PF-s to catch writes > > * to pages, so write-protect the pte as well. See the > > @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, > > { > > pmd_t old, pmd = *pmdp; > > > > + if (!pte_soft_dirty_available()) > > + return; > > + > > if (pmd_present(pmd)) { > > /* See comment in change_huge_pmd() */ > > old = pmdp_invalidate(vma, addr, pmdp); > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > > index 4c035637eeb7..c0e2a6dc69f4 100644 > > --- a/include/linux/pgtable.h > > +++ b/include/linux/pgtable.h > > @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > > #endif > > > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > > + > > +/* > > + * Some platforms can customize the PTE soft dirty bit and make it unavailable > > + * even if the architecture allows providing the PTE resource. 
> > + */ > > +#ifndef pte_soft_dirty_available > > +#define pte_soft_dirty_available() (true) > > +#endif > > + > > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > > { > > @@ -1555,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) > > } > > #endif > > #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ > > +#define pte_soft_dirty_available() (false) > > static inline int pte_soft_dirty(pte_t pte) > > { > > return 0; > > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > > index 830107b6dd08..98ed7e22ccec 100644 > > --- a/mm/debug_vm_pgtable.c > > +++ b/mm/debug_vm_pgtable.c > > @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > > { > > pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > > > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > > + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) > > I suggest that you instead make pte_soft_dirty_available() be false without CONFIG_MEM_SOFT_DIRTY. > > e.g., for the default implementation > > define pte_soft_dirty_available() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > > That way you can avoid some ifefs and cleanup these checks. 
Do you mean something like this: --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1538,6 +1538,16 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) #endif #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +#ifndef arch_soft_dirty_available +#define arch_soft_dirty_available() (true) +#endif +#define pgtable_soft_dirty_supported() (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && arch_soft_dirty_available()) + #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) { @@ -1555,6 +1565,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) } #endif #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ +#define pgtable_soft_dirty_supported() (false) > > > But as we do also have PMD soft-dirty support, I guess we would want to call this > something more abstract "pgtable_soft_dirty_available" or "pgtable_soft_dirty_supported" > > -- > Cheers > > David / dhildenb > From kingxukai at zohomail.com Wed Sep 10 01:38:11 2025 From: kingxukai at zohomail.com (Xukai Wang) Date: Wed, 10 Sep 2025 16:38:11 +0800 Subject: [PATCH v8 2/3] clk: canaan: Add clock driver for Canaan K230 In-Reply-To: <8ca70773-42b0-4dcc-8b54-338594e9a8ea@iscas.ac.cn> References: <20250905-b4-k230-clk-v8-0-96caa02d5428@zohomail.com> <20250905-b4-k230-clk-v8-2-96caa02d5428@zohomail.com> <0947d9cc-86ba-46e0-92aa-04f4714e7a20@zohomail.com> <8ca70773-42b0-4dcc-8b54-338594e9a8ea@iscas.ac.cn> Message-ID: <9a0eedb5-1d90-4bdb-9bc3-4b3ade29cc2f@zohomail.com> On 2025/9/9 15:02, Vivian Wang wrote: > On 9/8/25 22:13, Xukai Wang wrote: >>>> [...] 
>>>> >>>> + >>>> +static int k230_clk_set_rate_mul_div(struct clk_hw *hw, unsigned long rate, >>>> + unsigned long parent_rate) >>>> +{ >>>> + struct k230_clk_rate *clk = hw_to_k230_clk_rate(hw); >>>> + struct k230_clk_rate_self *rate_self = &clk->clk; >>>> + u32 div, mul, div_reg, mul_reg; >>>> + >>>> + if (rate > parent_rate) >>>> + return -EINVAL; >>>> + >>>> + if (rate_self->read_only) >>>> + return 0; >>>> + >>>> + if (k230_clk_find_approximate_mul_div(rate_self->mul_min, rate_self->mul_max, >>>> + rate_self->div_min, rate_self->div_max, >>>> + rate, parent_rate, &div, &mul)) >>>> + return -EINVAL; >>>> + >>>> + guard(spinlock)(rate_self->lock); >>>> + >>>> + div_reg = readl(rate_self->reg + clk->div_reg_off); >>>> + div_reg |= ((div - 1) & rate_self->div_mask) << (rate_self->div_shift); >>>> + div_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(div_reg, rate_self->reg + clk->div_reg_off); >>>> + >>>> + mul_reg = readl(rate_self->reg + clk->mul_reg_off); >>>> + mul_reg |= ((mul - 1) & rate_self->mul_mask) << (rate_self->mul_shift); >>>> + mul_reg |= BIT(rate_self->write_enable_bit); >>>> + writel(mul_reg, rate_self->reg + clk->mul_reg_off); >>>> + >>>> + return 0; >>>> +} >>> There are three variants of rate clocks, mul-only, div-only and mul-div >>> ones, which are similar to clk-multiplier, clk-divider, >>> clk-fractional-divider. >>> >>> The only difference is to setup new parameters for K230's rate clocks, >>> a register bit, described as k230_clk_rate_self.write_enable_bit, must >>> be set first. >> Actually, I think the differences are not limited to just the >> write_enable_bit. There are also distinct mul_min, mul_max, div_min, and >> div_max values, which are not typically just 1 and (1 << bit_width) as >> in standard clock divider or multiplier structures. > So the part I have been thinking about is, consider just checking the > {mul,div}_{min,max} values to determine which kind it is? 
As is this is > just redundant information, since you can infer whether there is a > configurable multiplier by checking if mul_{min,max} are equal. Same for > div_{min,max}. > > Vivian "dramforever" Wang Thanks for pointing it out. I see your idea, but I don't think it's necessary to determine the clock type from {mul,div}_{min,max} dynamically since we already statically specify each mul, div, and mul-div clock by different macros. From david at redhat.com Wed Sep 10 01:51:10 2025 From: david at redhat.com (David Hildenbrand) Date: Wed, 10 Sep 2025 10:51:10 +0200 Subject: [PATCH V10 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> <6b2f12aa-8ed9-476d-a69d-f05ea526f16a@redhat.com> Message-ID: <8f9a4a13-2881-4baf-ab62-3d0d79e0cd3c@redhat.com> On 10.09.25 10:25, Chunyan Zhang wrote: > Hi David, > > On Tue, 9 Sept 2025 at 19:42, David Hildenbrand wrote: >> >> On 09.09.25 11:56, Chunyan Zhang wrote: >>> Some platforms can customize the PTE soft dirty bit and make it unavailable >>> even if the architecture allows providing the PTE resource. >>> >>> Add an API which architectures can define their specific implementations >>> to detect if the PTE soft-dirty bit is available, on which the kernel >>> is running.
>>> >>> Signed-off-by: Chunyan Zhang >>> --- >>> fs/proc/task_mmu.c | 17 ++++++++++++++++- >>> include/linux/pgtable.h | 10 ++++++++++ >>> mm/debug_vm_pgtable.c | 9 +++++---- >>> mm/huge_memory.c | 10 ++++++---- >>> mm/internal.h | 2 +- >>> mm/mremap.c | 10 ++++++---- >>> mm/userfaultfd.c | 6 ++++-- >>> 7 files changed, 48 insertions(+), 16 deletions(-) >>> >>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c >>> index 29cca0e6d0ff..20a609ec1ba6 100644 >>> --- a/fs/proc/task_mmu.c >>> +++ b/fs/proc/task_mmu.c >>> @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) >>> * -Werror=unterminated-string-initialization warning >>> * with GCC 15 >>> */ >>> - static const char mnemonics[BITS_PER_LONG][3] = { >>> + static char mnemonics[BITS_PER_LONG][3] = { >>> /* >>> * In case if we meet a flag we don't know about. >>> */ >>> @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) >>> [ilog2(VM_SEALED)] = "sl", >>> #endif >>> }; >>> +/* >>> + * We should remove the VM_SOFTDIRTY flag if the PTE soft-dirty bit is >>> + * unavailable on which the kernel is running, even if the architecture >>> + * allows providing the PTE resource and soft-dirty is compiled in. >>> + */ >>> +#ifdef CONFIG_MEM_SOFT_DIRTY >>> + if (!pte_soft_dirty_available()) >>> + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; >>> +#endif >>> + >>> size_t i; >>> >>> seq_puts(m, "VmFlags: "); >>> @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, >>> static inline void clear_soft_dirty(struct vm_area_struct *vma, >>> unsigned long addr, pte_t *pte) >>> { >>> + if (!pte_soft_dirty_available()) >>> + return; >>> /* >>> * The soft-dirty tracker uses #PF-s to catch writes >>> * to pages, so write-protect the pte as well. 
See the >>> @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, >>> { >>> pmd_t old, pmd = *pmdp; >>> >>> + if (!pte_soft_dirty_available()) >>> + return; >>> + >>> if (pmd_present(pmd)) { >>> /* See comment in change_huge_pmd() */ >>> old = pmdp_invalidate(vma, addr, pmdp); >>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >>> index 4c035637eeb7..c0e2a6dc69f4 100644 >>> --- a/include/linux/pgtable.h >>> +++ b/include/linux/pgtable.h >>> @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) >>> #endif >>> >>> #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY >>> + >>> +/* >>> + * Some platforms can customize the PTE soft dirty bit and make it unavailable >>> + * even if the architecture allows providing the PTE resource. >>> + */ >>> +#ifndef pte_soft_dirty_available >>> +#define pte_soft_dirty_available() (true) >>> +#endif >>> + >>> #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION >>> static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) >>> { >>> @@ -1555,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) >>> } >>> #endif >>> #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ >>> +#define pte_soft_dirty_available() (false) >>> static inline int pte_soft_dirty(pte_t pte) >>> { >>> return 0; >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >>> index 830107b6dd08..98ed7e22ccec 100644 >>> --- a/mm/debug_vm_pgtable.c >>> +++ b/mm/debug_vm_pgtable.c >>> @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) >>> { >>> pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); >>> >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) >>> + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) >> >> I suggest that you instead make pte_soft_dirty_available() be false without CONFIG_MEM_SOFT_DIRTY.
>> >> e.g., for the default implementation >> >> define pte_soft_dirty_available() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) >> >> That way you can avoid some ifefs and cleanup these checks. > > Do you mean something like this: > > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -1538,6 +1538,16 @@ static inline pgprot_t pgprot_modify(pgprot_t > oldprot, pgprot_t newprot) > #endif > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > +#ifndef arch_soft_dirty_available > +#define arch_soft_dirty_available() (true) > +#endif > +#define pgtable_soft_dirty_supported() > (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && arch_soft_dirty_available()) > + > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > { > @@ -1555,6 +1565,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) > } > #endif > #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ > +#define pgtable_soft_dirty_supported() (false) Maybe we can simplify to #ifndef pgtable_soft_dirty_supported #define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) #endif And then just let the arch that overrides this function just make it respect IS_ENABLED(CONFIG_MEM_SOFT_DIRTY). -- Cheers David / dhildenb From tianruidong at linux.alibaba.com Wed Sep 10 02:33:42 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:42 +0800 Subject: [RFC PATCH 0/5] riscv: Handle synchronous hardware error exception Message-ID: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Hi all, This patch series introduces support for handling synchronous hardware errors on RISC-V, laying the groundwork for more robust kernel-mode error recovery. 1. Background Hardware error reporting mechanisms typically fall into two categories: asynchronous and synchronous. - Asynchronous errors (e.g., memory scrubbing errors), reported by an asynchronous exception or an interrupt, are usually handled by the GHES subsystem.
For instance, ARM uses SDEI, and a similar SSE specification is being proposed for RISC-V. - Synchronous errors (e.g., reading poisoned data) cause the processor core to take a precise exception. This is known as a Synchronous External Abort (SEA) on ARM, a Machine Check Exception (MCE) on x86, and is designated as a trap with mcause 19 on RISC-V. Discussions within the RVI PRS TG have already led to proposals[0] to UEFI for standardizing two notification methods, SSE and Hardware Error Exception, on RISC-V. This series focuses on implementing Hardware Error Exception notification to handle synchronous errors. Himanshu Chauhan has already started working on SSE[1]. 2. Motivation When a synchronous hardware error occurs in kernel context (e.g., during get_user, put_user, CoW, etc.), the kernel requires a fixup mechanism (via extable) to recover from the error and prevent a system panic. However, the APEI/GHES subsystem, being asynchronous, cannot directly leverage the synchronous extable fixup path. By handling the synchronous exception directly, we enable the use of this fixup mechanism, allowing the kernel to gracefully recover from hardware errors encountered during kernel execution. This brings RISC-V's error handling capabilities closer to the robustness found on ARM[2] and x86[3]. 3. What This Patch Series Does This initial series lays the foundational infrastructure. It primarily: - Introduces a new exception handler for synchronous hardware errors (mcause=19). - Establishes the core exception path, which is a prerequisite for kernel context error recovery. Please note that this version does not yet implement the full kernel fixup logic for recovery. That functionality is planned for the next formal version. Some adaptations for GHES are included, based on the work from Himanshu Chauhan[1]. 4. Future Plans - Implement full kernel fixup support to handle and recover from errors in some kernel contexts[2]. - Add support for handling "double trap" scenarios.
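To make the extable fixup idea from section 2 concrete, here is a minimal userspace sketch of the control flow. It is not code from this series (which does not implement the fixup logic yet) and not the kernel's implementation: all names (toy_regs, ex_entry, toy_search_extable, toy_handle_hw_error) are hypothetical, the table layout is simplified, and the table is assumed to be sorted by faulting instruction address. Only the cause number 19 is taken from the letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cause number for the hardware error exception, as stated in the letter. */
#define CAUSE_HW_ERROR 19u

/* Hypothetical entry: if a fault hits `insn`, resume at `fixup` instead. */
struct ex_entry {
	uintptr_t insn;
	uintptr_t fixup;
};

/* Toy register frame; a real trap frame carries far more state. */
struct toy_regs {
	uintptr_t epc;      /* faulting program counter */
	unsigned int cause; /* trap cause number */
	int user_mode;      /* non-zero if the trap came from user space */
};

enum hw_error_action {
	HW_ERROR_FIXED_UP,    /* kernel access recovered via fixup */
	HW_ERROR_SIGNAL_USER, /* report the error to the user task */
	HW_ERROR_PANIC,       /* no recovery path known */
};

/* Binary search over a table sorted by ascending insn address. */
static const struct ex_entry *
toy_search_extable(const struct ex_entry *tbl, size_t n, uintptr_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return &tbl[mid];
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Decide what to do with a synchronous hardware error trap. */
static enum hw_error_action
toy_handle_hw_error(struct toy_regs *regs, const struct ex_entry *tbl, size_t n)
{
	const struct ex_entry *e;

	assert(regs != NULL);
	assert(regs->cause == CAUSE_HW_ERROR);

	/* Errors consumed in user space are reported to the task. */
	if (regs->user_mode)
		return HW_ERROR_SIGNAL_USER;

	/* Kernel context: recoverable only if the access has a fixup. */
	e = toy_search_extable(tbl, n, regs->epc);
	if (!e)
		return HW_ERROR_PANIC;

	regs->epc = e->fixup; /* resume at the recovery code */
	return HW_ERROR_FIXED_UP;
}
```

In this model, an access such as get_user would contribute one table entry, so a poisoned read at that instruction resumes at the fixup address rather than ending in a panic. Because the lookup is keyed on the faulting program counter, it needs the precise, synchronous trap and cannot be driven from an asynchronous notification.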
5. Testing Methodology test program: ras-tools: https://kernel.googlesource.com/pub/scm/linux/kernel/git/aegl/ras-tools/ qemu: https://github.com/winterddd/qemu official opensbi and edk2: - Run qemu: qemu-system-riscv64 -M virt,pflash0=pflash0,pflash1=pflash1,acpi=on,aia=aplic-imsic -cpu max -m 64G -smp 64 -device virtio-gpu-pci -full-screen -device qemu-xhci -device usb-kbd -device virtio-rng-pci -blockdev node-name=pflash0,driver=file,read-only=on,filename=RISCV_VIRT_CODE.fd -blockdev node-name=pflash1,driver=file,filename=RISCV_VIRT_VARS.fd -bios fw_dynamic.bin -device virtio-net-device,netdev=net0 -netdev user,id=net0,hostfwd=tcp::2223-:22 -kernel Image -initrd rootfs -append "rdinit=/sbin/init earlycon verbose debug strict_devmem=0 nokaslr" -monitor telnet:127.0.0.1:5557,server,nowait -nographic - Run ras-tools: ./einj_mem_uc -j -k single & $ 0: single ? vaddr = 0x7fff86ff4400 paddr = 107d11b400 - Inject poison telnet localhost 5557 poison_enable on poison_add 0x107d11b400 - Read poison echo trigger > ./trigger_start $ triggering ...
$ signal 7 code 3 addr 0x7fff86ff4400 [0]: https://lists.riscv.org/g/tech-prs/topic/risc_v_ras_related_ecrs/113685653 [1]: https://patchew.org/linux/20250227123628.2931490-1-hchauhan at ventanamicro.com/ [2]: https://lore.kernel.org/lkml/20241209024257.3618492-1-tongtiangen at huawei.com/ [3]: https://github.com/torvalds/linux/blob/9dd1835ecda5b96ac88c166f4a87386f3e727bd9/arch/x86/kernel/cpu/mce/core.c#L1514 Himanshu Chauhan (2): riscv: Define ioremap_cache for RISC-V riscv: Define arch_apei_get_mem_attribute for RISC-V Ruidong Tian (3): acpi: Introduce SSE and HEE in HEST notification types riscv: Introduce HEST HEE notification handlers for APEI riscv: Add Hardware Error Exception trap handler arch/riscv/Kconfig | 1 + arch/riscv/include/asm/acpi.h | 22 +++++++++++++ arch/riscv/include/asm/fixmap.h | 6 ++++ arch/riscv/include/asm/io.h | 3 ++ arch/riscv/kernel/acpi.c | 55 +++++++++++++++++++++++++++++++ arch/riscv/kernel/entry.S | 4 +++ arch/riscv/kernel/traps.c | 19 +++++++++++ drivers/acpi/apei/Kconfig | 12 +++++++ drivers/acpi/apei/ghes.c | 58 +++++++++++++++++++++++++++++++++ include/acpi/actbl1.h | 4 ++- include/acpi/ghes.h | 6 ++++ 11 files changed, 189 insertions(+), 1 deletion(-) -- 2.43.7 From tianruidong at linux.alibaba.com Wed Sep 10 02:33:43 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:43 +0800 Subject: [RFC PATCH 1/5] riscv: Define ioremap_cache for RISC-V In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: <20250910093347.75822-2-tianruidong@linux.alibaba.com> From: Himanshu Chauhan bert and einj drivers use ioremap_cache for mapping entries but ioremap_cache is not defined for RISC-V. 
Signed-off-by: Himanshu Chauhan --- arch/riscv/include/asm/io.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h index a0e51840b9db..56eca6b3031f 100644 --- a/arch/riscv/include/asm/io.h +++ b/arch/riscv/include/asm/io.h @@ -30,6 +30,9 @@ #define PCI_IOBASE ((void __iomem *)PCI_IO_START) #endif /* CONFIG_MMU */ +#define ioremap_cache(addr, size) \ + ((__force void *)ioremap_prot((addr), (size), __pgprot(_PAGE_KERNEL))) + /* * Emulation routines for the port-mapped IO space used by some PCI drivers. * These are defined as being "fully synchronous", but also "not guaranteed to -- 2.43.7 From tianruidong at linux.alibaba.com Wed Sep 10 02:33:44 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:44 +0800 Subject: [RFC PATCH 2/5] riscv: Define arch_apei_get_mem_attribute for RISC-V In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: <20250910093347.75822-3-tianruidong@linux.alibaba.com> From: Himanshu Chauhan ghes_map function uses arch_apei_get_mem_attribute to get the protection bits for a given physical address. These protection bits are then used to map the physical address. Signed-off-by: Himanshu Chauhan --- arch/riscv/include/asm/acpi.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/arch/riscv/include/asm/acpi.h b/arch/riscv/include/asm/acpi.h index 6e13695120bc..0c599452ef48 100644 --- a/arch/riscv/include/asm/acpi.h +++ b/arch/riscv/include/asm/acpi.h @@ -27,6 +27,26 @@ extern int acpi_disabled; extern int acpi_noirq; extern int acpi_pci_disabled; +#ifdef CONFIG_ACPI_APEI +/* + * acpi_disable_cmcff is used in drivers/acpi/apei/hest.c for disabling + * IA-32 Architecture Corrected Machine Check (CMC) Firmware-First mode + * with a kernel command line parameter "acpi=nocmcoff". 
But we don't + * have this IA-32 specific feature on RISC-V, this definition is only + * for compatibility. + */ +#define acpi_disable_cmcff 1 +static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr) +{ + /* + * Until we have a way to look for EFI memory attributes. + */ + return PAGE_KERNEL; +} +#else /* CONFIG_ACPI_APEI */ +#define acpi_disable_cmcff 0 +#endif /* !CONFIG_ACPI_APEI */ + static inline void disable_acpi(void) { acpi_disabled = 1; -- 2.43.7 From tianruidong at linux.alibaba.com Wed Sep 10 02:33:45 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:45 +0800 Subject: [RFC PATCH 3/5] acpi: Introduce SSE and HEE in HEST notification types In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: <20250910093347.75822-4-tianruidong@linux.alibaba.com> Introduce two new HEST notification types for RISC-V Hardware Error Exception and SSE. The GHES entry's notification structure contains the notification to be used for a given error source.
Signed-off-by: Ruidong Tian --- include/acpi/actbl1.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h index 99fd1588ff38..0f04ef10f510 100644 --- a/include/acpi/actbl1.h +++ b/include/acpi/actbl1.h @@ -1534,7 +1534,9 @@ enum acpi_hest_notify_types { ACPI_HEST_NOTIFY_SEI = 9, /* ACPI 6.1 */ ACPI_HEST_NOTIFY_GSIV = 10, /* ACPI 6.1 */ ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED = 11, /* ACPI 6.2 */ - ACPI_HEST_NOTIFY_RESERVED = 12 /* 12 and greater are reserved */ + ACPI_HEST_NOTIFY_SSE = 12, /* RISCV SSE */ + ACPI_HEST_NOTIFY_HEE = 13, /* RISCV Hardware Error Exception */ + ACPI_HEST_NOTIFY_RESERVED = 14 /* 14 and greater are reserved */ }; /* Values for config_write_enable bitfield above */ -- 2.43.7 From tianruidong at linux.alibaba.com Wed Sep 10 02:33:46 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:46 +0800 Subject: [RFC PATCH 4/5] riscv: Introduce HEST HEE notification handlers for APEI In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: <20250910093347.75822-5-tianruidong@linux.alibaba.com> Add functions to register a ghes entry with HEE, allowing the OS to receive hardware error notifications from firmware through standardized ACPI interfaces. 
Signed-off-by: Ruidong Tian --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/fixmap.h | 6 ++++ drivers/acpi/apei/Kconfig | 12 +++++++ drivers/acpi/apei/ghes.c | 58 +++++++++++++++++++++++++++++++++ include/acpi/ghes.h | 6 ++++ 5 files changed, 83 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..b085e172b355 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -23,6 +23,7 @@ config RISCV select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2 select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE + select HAVE_ACPI_APEI if (ACPI && EFI) select ARCH_HAS_BINFMT_FLAT select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL if MMU diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h index 0a55099bb734..07421edc9daa 100644 --- a/arch/riscv/include/asm/fixmap.h +++ b/arch/riscv/include/asm/fixmap.h @@ -38,6 +38,12 @@ enum fixed_addresses { FIX_TEXT_POKE0, FIX_EARLYCON_MEM_BASE, +#ifdef CONFIG_ACPI_APEI_HEE + /* Used for GHES mapping from assorted contexts */ + FIX_APEI_GHES_IRQ, + FIX_APEI_GHES_HEE, +#endif /* CONFIG_ACPI_APEI_GHES */ + __end_of_permanent_fixed_addresses, /* * Temporary boot-time mappings, used by early_ioremap(), diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig index 070c07d68dfb..d54a295cfc8d 100644 --- a/drivers/acpi/apei/Kconfig +++ b/drivers/acpi/apei/Kconfig @@ -46,6 +46,18 @@ config ACPI_APEI_SEA depends on ARM64 && ACPI_APEI_GHES default y +config ACPI_APEI_HEE + bool "APEI Hardware Error Exception support" + depends on RISCV && ACPI_APEI_GHES + default y + help + Enable support for RISC-V Hardware Error Exception (HEE) notification + in ACPI Platform Error Interface (APEI). This allows firmware + to report hardware errors through RISC-V exception mechanism. + + Say Y if you want to support firmware-first error handling + on RISC-V platforms with ACPI. 
+ config ACPI_APEI_MEMORY_FAILURE bool "APEI memory error recovering support" depends on ACPI_APEI && MEMORY_FAILURE diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index a0d54993edb3..1011e28091dc 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -97,6 +97,11 @@ #define FIX_APEI_GHES_SDEI_CRITICAL __end_of_fixed_addresses #endif +#if !defined(CONFIG_X86) && !defined(CONFIG_ARM64) +#define FIX_APEI_GHES_NMI __end_of_fixed_addresses +#define FIX_APEI_GHES_SEA __end_of_fixed_addresses +#endif + static ATOMIC_NOTIFIER_HEAD(ghes_report_chain); static inline bool is_hest_type_generic_v2(struct ghes *ghes) @@ -1415,6 +1420,45 @@ static inline void ghes_sea_add(struct ghes *ghes) { } static inline void ghes_sea_remove(struct ghes *ghes) { } #endif /* CONFIG_ACPI_APEI_SEA */ +#ifdef CONFIG_ACPI_APEI_HEE +static LIST_HEAD(ghes_hee); + +/* + * Return 0 only if one of the HEE error sources successfully reported an error + * record sent from the firmware. + */ +int ghes_notify_hee(void) +{ + static DEFINE_RAW_SPINLOCK(ghes_notify_lock_hee); + int rv; + + raw_spin_lock(&ghes_notify_lock_hee); + rv = ghes_in_nmi_spool_from_list(&ghes_hee, FIX_APEI_GHES_HEE); + raw_spin_unlock(&ghes_notify_lock_hee); + + return rv; +} +EXPORT_SYMBOL_GPL(ghes_notify_hee); + +static void ghes_hee_add(struct ghes *ghes) +{ + mutex_lock(&ghes_list_mutex); + list_add_rcu(&ghes->list, &ghes_hee); + mutex_unlock(&ghes_list_mutex); +} + +static void ghes_hee_remove(struct ghes *ghes) +{ + mutex_lock(&ghes_list_mutex); + list_del_rcu(&ghes->list); + mutex_unlock(&ghes_list_mutex); + synchronize_rcu(); +} +#else /* CONFIG_ACPI_APEI_HEE */ +static inline void ghes_hee_add(struct ghes *ghes) { } +static inline void ghes_hee_remove(struct ghes *ghes) { } +#endif /* CONFIG_ACPI_APEI_HEE */ + #ifdef CONFIG_HAVE_ACPI_APEI_NMI /* * NMI may be triggered on any CPU, so ghes_in_nmi is used for @@ -1558,6 +1602,14 @@ static int ghes_probe(struct platform_device *ghes_dev) goto 
err; } break; + case ACPI_HEST_NOTIFY_HEE: + if (!IS_ENABLED(CONFIG_ACPI_APEI_HEE)) { + pr_warn(GHES_PFX "Generic hardware error source: %d notified via HEE is not supported\n", + generic->header.source_id); + rc = -ENOTSUPP; + goto err; + } + break; case ACPI_HEST_NOTIFY_NMI: if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) { pr_warn(GHES_PFX "Generic hardware error source: %d notified via NMI interrupt is not supported!\n", @@ -1631,6 +1683,9 @@ static int ghes_probe(struct platform_device *ghes_dev) case ACPI_HEST_NOTIFY_SEA: ghes_sea_add(ghes); break; + case ACPI_HEST_NOTIFY_HEE: + ghes_hee_add(ghes); + break; case ACPI_HEST_NOTIFY_NMI: ghes_nmi_add(ghes); break; @@ -1698,6 +1753,9 @@ static void ghes_remove(struct platform_device *ghes_dev) case ACPI_HEST_NOTIFY_SEA: ghes_sea_remove(ghes); break; + case ACPI_HEST_NOTIFY_HEE: + ghes_hee_remove(ghes); + break; case ACPI_HEST_NOTIFY_NMI: ghes_nmi_remove(ghes); break; diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h index ebd21b05fe6e..8046e1b30c21 100644 --- a/include/acpi/ghes.h +++ b/include/acpi/ghes.h @@ -127,6 +127,12 @@ int ghes_notify_sea(void); static inline int ghes_notify_sea(void) { return -ENOENT; } #endif +#ifdef CONFIG_ACPI_APEI_HEE +int ghes_notify_hee(void); +#else +static inline int ghes_notify_hee(void) { return -ENOENT; } +#endif + struct notifier_block; extern void ghes_register_report_chain(struct notifier_block *nb); extern void ghes_unregister_report_chain(struct notifier_block *nb); -- 2.43.7 From tianruidong at linux.alibaba.com Wed Sep 10 02:33:47 2025 From: tianruidong at linux.alibaba.com (Ruidong Tian) Date: Wed, 10 Sep 2025 17:33:47 +0800 Subject: [RFC PATCH 5/5] riscv: Add Hardware Error Exception trap handler In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: <20250910093347.75822-6-tianruidong@linux.alibaba.com> Implement the Hardware Error Exception trap handler for RISC-V 
architecture synchronous hardware error handling. This enables the OS to receive hardware error notifications from firmware through the standardized ACPI HEST (Hardware Error Source Table) interface. The implementation includes: - A new exception vector entry for Hardware Error Exception - A trap handler (do_trap_hardware_error) that processes hardware errors in both kernel mode (panic for now) and user mode (SIGBUS) - Integration with APEI GHES (Generic Hardware Error Source) to report hardware errors from firmware This change enables RISC-V systems with ACPI to handle synchronous hardware errors in a firmware-first manner. Signed-off-by: Ruidong Tian --- arch/riscv/include/asm/acpi.h | 2 ++ arch/riscv/kernel/acpi.c | 55 +++++++++++++++++++++++++++++++++++ arch/riscv/kernel/entry.S | 4 +++ arch/riscv/kernel/traps.c | 19 ++++++++++++ 4 files changed, 80 insertions(+) diff --git a/arch/riscv/include/asm/acpi.h b/arch/riscv/include/asm/acpi.h index 0c599452ef48..ae861885b97d 100644 --- a/arch/riscv/include/asm/acpi.h +++ b/arch/riscv/include/asm/acpi.h @@ -91,6 +91,7 @@ int acpi_get_riscv_isa(struct acpi_table_header *table, void acpi_get_cbo_block_size(struct acpi_table_header *table, u32 *cbom_size, u32 *cboz_size, u32 *cbop_size); +int apei_claim_hee(struct pt_regs *regs); #else static inline void acpi_init_rintc_map(void) { } static inline struct acpi_madt_rintc *acpi_cpu_get_madt_rintc(int cpu) @@ -108,6 +109,7 @@ static inline void acpi_get_cbo_block_size(struct acpi_table_header *table, u32 *cbom_size, u32 *cboz_size, u32 *cbop_size) { } +static inline int apei_claim_hee(struct pt_regs *regs) { return -ENOENT; } #endif /* CONFIG_ACPI */ #ifdef CONFIG_ACPI_NUMA diff --git a/arch/riscv/kernel/acpi.c b/arch/riscv/kernel/acpi.c index 3f6d5a6789e8..928f9474bfee 100644 --- a/arch/riscv/kernel/acpi.c +++ b/arch/riscv/kernel/acpi.c @@ -20,6 +20,11 @@ #include #include #include +#include +#include +#include +#include +#include int acpi_noirq = 1; /* skip ACPI IRQ initialization
*/ int acpi_disabled = 1; @@ -334,3 +339,53 @@ int raw_pci_write(unsigned int domain, unsigned int bus, } #endif /* CONFIG_PCI */ + +/* + * Claim Hardware Error Exception as a firmware first notification. + * + * Used by RISC-V exception handler for hardware error processing. + * @regs may be NULL when called from process context. + */ +int apei_claim_hee(struct pt_regs *regs) +{ + int err = -ENOENT; + bool return_to_irqs_enabled; + unsigned long flags; + + if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES)) + return err; + + /* Save current interrupt state */ + local_irq_save(flags); + return_to_irqs_enabled = !irqs_disabled(); + + if (regs) + return_to_irqs_enabled = (regs->status & SR_SIE) != 0; + + /* + * HEE can interrupt other operations, handle as NMI-like context + * to ensure proper APEI processing + */ + nmi_enter(); + err = ghes_notify_hee(); + nmi_exit(); + + /* + * APEI NMI-like notifications are deferred to irq_work. Unless + * we interrupted irqs-masked code, we can do that now. + */ + if (!err) { + if (return_to_irqs_enabled) { + local_irq_restore(flags); + irq_work_run(); + } else { + pr_warn_ratelimited("APEI work queued but not completed"); + err = -EINPROGRESS; + } + } else { + local_irq_restore(flags); + } + + return err; +} +EXPORT_SYMBOL(apei_claim_hee); diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 3a0ec6fd5956..1cbefe934d84 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -459,6 +459,10 @@ SYM_DATA_START_LOCAL(excp_vect_table) RISCV_PTR do_page_fault /* load page fault */ RISCV_PTR do_trap_unknown RISCV_PTR do_page_fault /* store page fault */ + RISCV_PTR do_trap_unknown + RISCV_PTR do_trap_unknown + RISCV_PTR do_trap_unknown + RISCV_PTR do_trap_hardware_error /* Hardware Error */ SYM_DATA_END_LABEL(excp_vect_table, SYM_L_LOCAL, excp_vect_table_end) #ifndef CONFIG_MMU diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 80230de167de..48f1ea1e03e6 100644 --- 
a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -442,3 +443,21 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs) wait_for_interrupt(); } #endif + +asmlinkage __visible __trap_section void do_trap_hardware_error(struct pt_regs *regs) +{ + if (user_mode(regs)) { + irqentry_enter_from_user_mode(regs); + + if (apei_claim_hee(regs)) + do_trap_error(regs, SIGBUS, BUS_OBJERR, regs->badaddr, "Hardware Error"); + + irqentry_exit_to_user_mode(regs); + } else { + irqentry_state_t state = irqentry_nmi_enter(regs); + + die(regs, "Hardware Error"); + + irqentry_nmi_exit(regs, state); + } +} -- 2.43.7 From guoyaxing at bosc.ac.cn Wed Sep 10 02:54:30 2025 From: guoyaxing at bosc.ac.cn (Yaxing Guo) Date: Wed, 10 Sep 2025 17:54:30 +0800 Subject: [PATCH v1] riscv: iommu: Fix irq failure due to idx mismatch in icvec Message-ID: <20250910095430.93868-1-guoyaxing@bosc.ac.cn> In icvec, the indices of civ, fiv, pmiv and piv are 0, 1, 2 and 3 respectively (according to spec 5.27). The interrupt-names property of the riscv-iommu node in the dts usually follows the same order (the qemu virt machine does). The driver, however, assigns vector 2 to piv and vector 3 to pmiv, which causes wrong hardware irq numbers (especially when using the qemu virt machine to start Linux). By the way, should interfaces such as platform_get_irq_byname be used to implement this further?
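The index order described in the message above can be illustrated with a small user-space sketch. It is not the driver's code: the helper names `icvec_pack` and `icvec_field` are hypothetical, and the 4-bit field widths and shifts are an assumption about the icvec layout rather than something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed icvec layout: four 4-bit fields in index order
 * civ (0), fiv (1), pmiv (2), piv (3). The widths and shifts are an
 * assumption of this sketch, not taken from the patch.
 */
enum icvec_shift {
	ICVEC_CIV_SHIFT  = 0,
	ICVEC_FIV_SHIFT  = 4,
	ICVEC_PMIV_SHIFT = 8,
	ICVEC_PIV_SHIFT  = 12,
};

/* Extract one 4-bit vector field from an icvec value. */
static uint64_t icvec_field(uint64_t icvec, unsigned int shift)
{
	return (icvec >> shift) & 0xfULL;
}

/*
 * Hypothetical helper: map each interrupt source to the vector matching
 * its index, wrapping by the number of available irqs in the same way
 * as the driver's "% iommu->irqs_count" expressions.
 */
static uint64_t icvec_pack(unsigned int irqs_count)
{
	assert(irqs_count > 0);	/* a modulo by zero would be undefined */

	return ((uint64_t)(0 % irqs_count) << ICVEC_CIV_SHIFT) |
	       ((uint64_t)(1 % irqs_count) << ICVEC_FIV_SHIFT) |
	       ((uint64_t)(2 % irqs_count) << ICVEC_PMIV_SHIFT) |
	       ((uint64_t)(3 % irqs_count) << ICVEC_PIV_SHIFT);
}
```

Under these assumptions, `icvec_pack(4)` gives pmiv vector 2 and piv vector 3, which is the assignment the diff below switches to; with fewer irqs the vectors simply wrap around.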
Signed-off-by: Yaxing Guo --- drivers/iommu/riscv/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 52d5e3d76019..00f11368bf24 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1583,8 +1583,8 @@ static int riscv_iommu_init_check(struct riscv_iommu_device *iommu) return -EINVAL; iommu->icvec = FIELD_PREP(RISCV_IOMMU_ICVEC_FIV, 1 % iommu->irqs_count) | - FIELD_PREP(RISCV_IOMMU_ICVEC_PIV, 2 % iommu->irqs_count) | - FIELD_PREP(RISCV_IOMMU_ICVEC_PMIV, 3 % iommu->irqs_count); + FIELD_PREP(RISCV_IOMMU_ICVEC_PIV, 3 % iommu->irqs_count) | + FIELD_PREP(RISCV_IOMMU_ICVEC_PMIV, 2 % iommu->irqs_count); riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_ICVEC, iommu->icvec); iommu->icvec = riscv_iommu_readq(iommu, RISCV_IOMMU_REG_ICVEC); if (max(max(FIELD_GET(RISCV_IOMMU_ICVEC_CIV, iommu->icvec), -- 2.34.1 From guoyaxing at bosc.ac.cn Wed Sep 10 03:11:25 2025 From: guoyaxing at bosc.ac.cn (Yaxing Guo) Date: Wed, 10 Sep 2025 18:11:25 +0800 Subject: [PATCH v1] riscv: iommu: Kconfig: Add RISC-V to IOMMU_DMA def_bool Message-ID: <20250910101125.94345-1-guoyaxing@bosc.ac.cn> The IOMMU_DMA configuration option is used to enable DMA mapping support via IOMMU for platforms that require it. Currently, it is enabled for ARM64, X86, and S390 architectures. This patch adds RISC-V (RISCV) to the def_bool condition, enabling IOMMU_DMA support for RISC-V platforms that have an IOMMU and require DMA remapping. 
Signed-off-by: Yaxing Guo --- drivers/iommu/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 70d29b14d851..9d8c90690275 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -150,7 +150,7 @@ config OF_IOMMU # IOMMU-agnostic DMA-mapping layer config IOMMU_DMA - def_bool ARM64 || X86 || S390 + def_bool ARM64 || X86 || S390 || RISCV select DMA_OPS_HELPERS select IOMMU_API select IOMMU_IOVA -- 2.34.1 From rabenda.cn at gmail.com Wed Sep 10 03:55:31 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Wed, 10 Sep 2025 18:55:31 +0800 Subject: [PATCH] dts: sophgo: sg2042: added numa id description Message-ID: <20250910105531.519897-1-rabenda.cn@gmail.com> According to the description of [1], sg2042 is divided into 4 numa. STREAM test performance will improve. Before: Function Best Rate MB/s Avg time Min time Max time Copy: 10739.7 0.015687 0.014898 0.016385 Scale: 10865.9 0.015628 0.014725 0.016757 Add: 10622.3 0.023276 0.022594 0.023899 Triad: 10583.4 0.023653 0.022677 0.024761 After: Function Best Rate MB/s Avg time Min time Max time Copy: 34254.9 0.005142 0.004671 0.005995 Scale: 37735.5 0.004752 0.004240 0.005407 Add: 44206.8 0.005983 0.005429 0.006461 Triad: 43040.6 0.006320 0.005576 0.006996 [1] https://github.com/sophgo/sophgo-doc/blob/main/SG2042/TRM/source/pic/mesh.png Signed-off-by: Han Gao --- arch/riscv/boot/dts/sophgo/sg2042-cpus.dtsi | 64 +++++++++++++++++++++ arch/riscv/boot/dts/sophgo/sg2042.dtsi | 20 +++++++ 2 files changed, 84 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-cpus.dtsi b/arch/riscv/boot/dts/sophgo/sg2042-cpus.dtsi index 77ded5304272..94a4b71acad3 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-cpus.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2042-cpus.dtsi @@ -272,6 +272,7 @@ cpu0: cpu at 0 { d-cache-sets = <512>; next-level-cache = <&l2_cache0>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu0_intc: interrupt-controller { compatible = 
"riscv,cpu-intc"; @@ -299,6 +300,7 @@ cpu1: cpu at 1 { d-cache-sets = <512>; next-level-cache = <&l2_cache0>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu1_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -326,6 +328,7 @@ cpu2: cpu at 2 { d-cache-sets = <512>; next-level-cache = <&l2_cache0>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu2_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -353,6 +356,7 @@ cpu3: cpu at 3 { d-cache-sets = <512>; next-level-cache = <&l2_cache0>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu3_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -380,6 +384,7 @@ cpu4: cpu at 4 { d-cache-sets = <512>; next-level-cache = <&l2_cache1>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu4_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -407,6 +412,7 @@ cpu5: cpu at 5 { d-cache-sets = <512>; next-level-cache = <&l2_cache1>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu5_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -434,6 +440,7 @@ cpu6: cpu at 6 { d-cache-sets = <512>; next-level-cache = <&l2_cache1>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu6_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -461,6 +468,7 @@ cpu7: cpu at 7 { d-cache-sets = <512>; next-level-cache = <&l2_cache1>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu7_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -488,6 +496,7 @@ cpu8: cpu at 8 { d-cache-sets = <512>; next-level-cache = <&l2_cache4>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu8_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -515,6 +524,7 @@ cpu9: cpu at 9 { d-cache-sets = <512>; next-level-cache = <&l2_cache4>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu9_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -542,6 +552,7 @@ cpu10: cpu at 10 { d-cache-sets = <512>; next-level-cache = <&l2_cache4>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; 
cpu10_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -569,6 +580,7 @@ cpu11: cpu at 11 { d-cache-sets = <512>; next-level-cache = <&l2_cache4>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu11_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -596,6 +608,7 @@ cpu12: cpu at 12 { d-cache-sets = <512>; next-level-cache = <&l2_cache5>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu12_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -623,6 +636,7 @@ cpu13: cpu at 13 { d-cache-sets = <512>; next-level-cache = <&l2_cache5>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu13_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -650,6 +664,7 @@ cpu14: cpu at 14 { d-cache-sets = <512>; next-level-cache = <&l2_cache5>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu14_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -677,6 +692,7 @@ cpu15: cpu at 15 { d-cache-sets = <512>; next-level-cache = <&l2_cache5>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu15_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -704,6 +720,7 @@ cpu16: cpu at 16 { d-cache-sets = <512>; next-level-cache = <&l2_cache2>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu16_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -731,6 +748,7 @@ cpu17: cpu at 17 { d-cache-sets = <512>; next-level-cache = <&l2_cache2>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu17_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -758,6 +776,7 @@ cpu18: cpu at 18 { d-cache-sets = <512>; next-level-cache = <&l2_cache2>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu18_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -785,6 +804,7 @@ cpu19: cpu at 19 { d-cache-sets = <512>; next-level-cache = <&l2_cache2>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu19_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -812,6 +832,7 @@ cpu20: cpu at 20 { d-cache-sets = <512>; 
next-level-cache = <&l2_cache3>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu20_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -839,6 +860,7 @@ cpu21: cpu at 21 { d-cache-sets = <512>; next-level-cache = <&l2_cache3>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu21_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -866,6 +888,7 @@ cpu22: cpu at 22 { d-cache-sets = <512>; next-level-cache = <&l2_cache3>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu22_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -893,6 +916,7 @@ cpu23: cpu at 23 { d-cache-sets = <512>; next-level-cache = <&l2_cache3>; mmu-type = "riscv,sv39"; + numa-node-id = <0>; cpu23_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -920,6 +944,7 @@ cpu24: cpu at 24 { d-cache-sets = <512>; next-level-cache = <&l2_cache6>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu24_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -947,6 +972,7 @@ cpu25: cpu at 25 { d-cache-sets = <512>; next-level-cache = <&l2_cache6>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu25_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -974,6 +1000,7 @@ cpu26: cpu at 26 { d-cache-sets = <512>; next-level-cache = <&l2_cache6>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu26_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1001,6 +1028,7 @@ cpu27: cpu at 27 { d-cache-sets = <512>; next-level-cache = <&l2_cache6>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu27_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1028,6 +1056,7 @@ cpu28: cpu at 28 { d-cache-sets = <512>; next-level-cache = <&l2_cache7>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu28_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1055,6 +1084,7 @@ cpu29: cpu at 29 { d-cache-sets = <512>; next-level-cache = <&l2_cache7>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu29_intc: interrupt-controller { compatible = 
"riscv,cpu-intc"; @@ -1082,6 +1112,7 @@ cpu30: cpu at 30 { d-cache-sets = <512>; next-level-cache = <&l2_cache7>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu30_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1109,6 +1140,7 @@ cpu31: cpu at 31 { d-cache-sets = <512>; next-level-cache = <&l2_cache7>; mmu-type = "riscv,sv39"; + numa-node-id = <1>; cpu31_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1136,6 +1168,7 @@ cpu32: cpu at 32 { d-cache-sets = <512>; next-level-cache = <&l2_cache8>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu32_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1163,6 +1196,7 @@ cpu33: cpu at 33 { d-cache-sets = <512>; next-level-cache = <&l2_cache8>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu33_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1190,6 +1224,7 @@ cpu34: cpu at 34 { d-cache-sets = <512>; next-level-cache = <&l2_cache8>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu34_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1217,6 +1252,7 @@ cpu35: cpu at 35 { d-cache-sets = <512>; next-level-cache = <&l2_cache8>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu35_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1244,6 +1280,7 @@ cpu36: cpu at 36 { d-cache-sets = <512>; next-level-cache = <&l2_cache9>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu36_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1271,6 +1308,7 @@ cpu37: cpu at 37 { d-cache-sets = <512>; next-level-cache = <&l2_cache9>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu37_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1298,6 +1336,7 @@ cpu38: cpu at 38 { d-cache-sets = <512>; next-level-cache = <&l2_cache9>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu38_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1325,6 +1364,7 @@ cpu39: cpu at 39 { d-cache-sets = <512>; next-level-cache = <&l2_cache9>; 
mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu39_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1352,6 +1392,7 @@ cpu40: cpu at 40 { d-cache-sets = <512>; next-level-cache = <&l2_cache12>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu40_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1379,6 +1420,7 @@ cpu41: cpu at 41 { d-cache-sets = <512>; next-level-cache = <&l2_cache12>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu41_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1406,6 +1448,7 @@ cpu42: cpu at 42 { d-cache-sets = <512>; next-level-cache = <&l2_cache12>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu42_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1433,6 +1476,7 @@ cpu43: cpu at 43 { d-cache-sets = <512>; next-level-cache = <&l2_cache12>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu43_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1460,6 +1504,7 @@ cpu44: cpu at 44 { d-cache-sets = <512>; next-level-cache = <&l2_cache13>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu44_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1487,6 +1532,7 @@ cpu45: cpu at 45 { d-cache-sets = <512>; next-level-cache = <&l2_cache13>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu45_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1514,6 +1560,7 @@ cpu46: cpu at 46 { d-cache-sets = <512>; next-level-cache = <&l2_cache13>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu46_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1541,6 +1588,7 @@ cpu47: cpu at 47 { d-cache-sets = <512>; next-level-cache = <&l2_cache13>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu47_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1568,6 +1616,7 @@ cpu48: cpu at 48 { d-cache-sets = <512>; next-level-cache = <&l2_cache10>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu48_intc: interrupt-controller { compatible = 
"riscv,cpu-intc"; @@ -1595,6 +1644,7 @@ cpu49: cpu at 49 { d-cache-sets = <512>; next-level-cache = <&l2_cache10>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu49_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1622,6 +1672,7 @@ cpu50: cpu at 50 { d-cache-sets = <512>; next-level-cache = <&l2_cache10>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu50_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1649,6 +1700,7 @@ cpu51: cpu at 51 { d-cache-sets = <512>; next-level-cache = <&l2_cache10>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu51_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1676,6 +1728,7 @@ cpu52: cpu at 52 { d-cache-sets = <512>; next-level-cache = <&l2_cache11>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu52_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1703,6 +1756,7 @@ cpu53: cpu at 53 { d-cache-sets = <512>; next-level-cache = <&l2_cache11>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu53_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1730,6 +1784,7 @@ cpu54: cpu at 54 { d-cache-sets = <512>; next-level-cache = <&l2_cache11>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu54_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1757,6 +1812,7 @@ cpu55: cpu at 55 { d-cache-sets = <512>; next-level-cache = <&l2_cache11>; mmu-type = "riscv,sv39"; + numa-node-id = <2>; cpu55_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1784,6 +1840,7 @@ cpu56: cpu at 56 { d-cache-sets = <512>; next-level-cache = <&l2_cache14>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu56_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1811,6 +1868,7 @@ cpu57: cpu at 57 { d-cache-sets = <512>; next-level-cache = <&l2_cache14>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu57_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1838,6 +1896,7 @@ cpu58: cpu at 58 { d-cache-sets = <512>; next-level-cache = 
<&l2_cache14>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu58_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1865,6 +1924,7 @@ cpu59: cpu at 59 { d-cache-sets = <512>; next-level-cache = <&l2_cache14>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu59_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1892,6 +1952,7 @@ cpu60: cpu at 60 { d-cache-sets = <512>; next-level-cache = <&l2_cache15>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu60_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1919,6 +1980,7 @@ cpu61: cpu at 61 { d-cache-sets = <512>; next-level-cache = <&l2_cache15>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu61_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1946,6 +2008,7 @@ cpu62: cpu at 62 { d-cache-sets = <512>; next-level-cache = <&l2_cache15>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu62_intc: interrupt-controller { compatible = "riscv,cpu-intc"; @@ -1973,6 +2036,7 @@ cpu63: cpu at 63 { d-cache-sets = <512>; next-level-cache = <&l2_cache15>; mmu-type = "riscv,sv39"; + numa-node-id = <3>; cpu63_intc: interrupt-controller { compatible = "riscv,cpu-intc"; diff --git a/arch/riscv/boot/dts/sophgo/sg2042.dtsi b/arch/riscv/boot/dts/sophgo/sg2042.dtsi index b3e4d3c18fdc..029561b6ad81 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2042.dtsi @@ -19,6 +19,26 @@ / { #size-cells = <2>; dma-noncoherent; + distance-map { + compatible = "numa-distance-map-v1"; + distance-matrix = <0 0 10>, + <0 1 15>, + <0 2 25>, + <0 3 30>, + <1 0 15>, + <1 1 10>, + <1 2 30>, + <1 3 25>, + <2 0 25>, + <2 1 30>, + <2 2 10>, + <2 3 15>, + <3 0 30>, + <3 1 25>, + <3 2 15>, + <3 3 10>; + }; + aliases { serial0 = &uart0; }; -- 2.47.3 From rabenda.cn at gmail.com Wed Sep 10 04:24:01 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Wed, 10 Sep 2025 19:24:01 +0800 Subject: [PATCH] riscv: acpi: chose to boot from acpi then disable FDT Message-ID: 
<20250910112401.552987-1-rabenda.cn@gmail.com> When booting from ACPI, skip unflattening the device tree to avoid errors caused by repeated driver initialization. See commit 3505f30fb6a9 ("ARM64 / ACPI: If we chose to boot from acpi then disable FDT") for the arm64 equivalent. Signed-off-by: Han Gao --- arch/riscv/kernel/setup.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index f90cce7a3ace..d7ee62837aa4 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -330,11 +330,14 @@ void __init setup_arch(char **cmdline_p) /* Parse the ACPI tables for possible boot-time configuration */ acpi_boot_table_init(); + if (acpi_disabled) { #if IS_ENABLED(CONFIG_BUILTIN_DTB) - unflatten_and_copy_device_tree(); + unflatten_and_copy_device_tree(); #else - unflatten_device_tree(); + unflatten_device_tree(); #endif + } + misc_mem_init(); init_resources(); -- 2.47.3 From cp0613 at linux.alibaba.com Wed Sep 10 05:11:19 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Wed, 10 Sep 2025 20:11:19 +0800 Subject: [PATCH 0/2] perf vendor events riscv: Add T-HEAD C930 JSON files Message-ID: <20250910121121.7203-1-cp0613@linux.alibaba.com> From: Chen Pei Add PMU JSON files for T-HEAD C930, including topdown and some other metric groups.
Chen Pei (2): perf vendor events riscv: Add T-HEAD C930 JSON file perf vendor events riscv: Add T-HEAD C930 metrics tools/perf/pmu-events/arch/riscv/mapfile.csv | 1 + .../arch/riscv/thead/c930/basic.json | 117 ++++ .../pmu-events/arch/riscv/thead/c930/ieu.json | 97 ++++ .../pmu-events/arch/riscv/thead/c930/ifu.json | 62 ++ .../pmu-events/arch/riscv/thead/c930/l2c.json | 87 +++ .../pmu-events/arch/riscv/thead/c930/lsu.json | 182 ++++++ .../arch/riscv/thead/c930/metrics.json | 538 ++++++++++++++++++ .../arch/riscv/thead/c930/vfpu.json | 177 ++++++ 8 files changed, 1261 insertions(+) create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/basic.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/ieu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/ifu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/l2c.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/lsu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/metrics.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/vfpu.json -- 2.49.0 From cp0613 at linux.alibaba.com Wed Sep 10 05:11:20 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Wed, 10 Sep 2025 20:11:20 +0800 Subject: [PATCH 1/2] perf vendor events riscv: Add T-HEAD C930 JSON file In-Reply-To: <20250910121121.7203-1-cp0613@linux.alibaba.com> References: <20250910121121.7203-1-cp0613@linux.alibaba.com> Message-ID: <20250910121121.7203-2-cp0613@linux.alibaba.com> From: Chen Pei Add pmu json file of T-HEAD C930. 
Signed-off-by: Chen Pei --- tools/perf/pmu-events/arch/riscv/mapfile.csv | 1 + .../arch/riscv/thead/c930/basic.json | 117 +++++++++++ .../pmu-events/arch/riscv/thead/c930/ieu.json | 97 ++++++++++ .../pmu-events/arch/riscv/thead/c930/ifu.json | 62 ++++++ .../pmu-events/arch/riscv/thead/c930/l2c.json | 87 +++++++++ .../pmu-events/arch/riscv/thead/c930/lsu.json | 182 ++++++++++++++++++ .../arch/riscv/thead/c930/vfpu.json | 177 +++++++++++++++++ 7 files changed, 723 insertions(+) create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/basic.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/ieu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/ifu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/l2c.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/lsu.json create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/vfpu.json diff --git a/tools/perf/pmu-events/arch/riscv/mapfile.csv b/tools/perf/pmu-events/arch/riscv/mapfile.csv index 0a7e7dcc81be..93ba00fa5890 100644 --- a/tools/perf/pmu-events/arch/riscv/mapfile.csv +++ b/tools/perf/pmu-events/arch/riscv/mapfile.csv @@ -20,5 +20,6 @@ 0x489-0x8000000000000008-0x[[:xdigit:]]+,v1,sifive/p550,core 0x489-0x8000000000000[1-6]08-0x[9b][[:xdigit:]]+,v1,sifive/p650,core 0x5b7-0x0-0x0,v1,thead/c900-legacy,core +0x5b7-0x8000000009201600-0x[[:xdigit:]]+,v1,thead/c930,core 0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core 0x31e-0x8000000000008a45-0x[[:xdigit:]]+,v1,andes/ax45,core diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/basic.json b/tools/perf/pmu-events/arch/riscv/thead/c930/basic.json new file mode 100644 index 000000000000..afb4bec67af9 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/basic.json @@ -0,0 +1,117 @@ +[ + { + "EventName": "cycles.hart", + "EventCode": "0x00000001", + "BriefDescription": "cpu execute cycle" + }, + { + "EventName": "inst.ret", + "EventCode": "0x00000002", + 
"BriefDescription": "inst retire num" + }, + { + "EventName": "inst.brjmp.spec", + "EventCode": "0x00000003", + "BriefDescription": "speculative execute br num" + }, + { + "EventName": "inst.mispred.brjmp.spec", + "EventCode": "0x00000004", + "BriefDescription": "speculative execute br mispred num" + }, + { + "EventName": "cache.l1i.rd.access", + "EventCode": "0x00000005", + "BriefDescription": "l1 icache acc num" + }, + { + "EventName": "cache.l1i.rd.miss", + "EventCode": "0x00000006", + "BriefDescription": "l1 icache acc miss num" + }, + { + "EventName": "cache.l1d.rd.access", + "EventCode": "0x00000007", + "BriefDescription": "l1 dcache load acc num" + }, + { + "EventName": "cache.l1d.rd.miss", + "EventCode": "0x00000008", + "BriefDescription": "l1 dcache load acc miss num" + }, + { + "EventName": "tlb.l1i.access", + "EventCode": "0x00000009", + "BriefDescription": "itlb acc num" + }, + { + "EventName": "tlb.l1i.miss", + "EventCode": "0x0000000a", + "BriefDescription": "itlb acc miss num" + }, + { + "EventName": "tlb.l1d.access", + "EventCode": "0x0000000b", + "BriefDescription": "dtlb acc num" + }, + { + "EventName": "tlb.l1d.miss", + "EventCode": "0x0000000c", + "BriefDescription": "dtlb acc miss num" + }, + { + "EventName": "tlb.pf.access", + "EventCode": "0x0000000d", + "BriefDescription": "ptlb acc num" + }, + { + "EventName": "tlb.pf.miss", + "EventCode": "0x0000000e", + "BriefDescription": "ptlb acc miss num" + }, + { + "EventName": "cache.l2.access", + "EventCode": "0x0000000f", + "BriefDescription": "l2 acc num" + }, + { + "EventName": "cache.l2.miss", + "EventCode": "0x00000010", + "BriefDescription": "l2 acc miss num" + }, + { + "EventName": "uop.spec", + "EventCode": "0x00000011", + "BriefDescription": "rename stage issue slots num" + }, + { + "EventName": "topdown.frontend_bound.slots", + "EventCode": "0x00000012", + "BriefDescription": "rename stage no stall and no in recovery, rename stage bubble slot num" + }, + { + "EventName": 
"topdown.bad_speculation.recovery_bubbles", + "EventCode": "0x00000013", + "BriefDescription": "backend flush, rename stage stall cycle num" + }, + { + "EventName": "topdown.frontend_bound.latency.slots", + "EventCode": "0x00000014", + "BriefDescription": "rename stage no stall, frontend waste slots num" + }, + { + "EventName": "topdown.backend_bound.memory.load", + "EventCode": "0x00000015", + "BriefDescription": "issue queue full, and exist inflight load cycle num" + }, + { + "EventName": "topdown.backend_bound.memory.store", + "EventCode": "0x00000016", + "BriefDescription": "issue queue full, and exist inflight store cycle num" + }, + { + "EventName": "uop.ret", + "EventCode": "0x00000017", + "BriefDescription": "retire uop num" + } +] diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/ieu.json b/tools/perf/pmu-events/arch/riscv/thead/c930/ieu.json new file mode 100644 index 000000000000..61e57c0e415b --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/ieu.json @@ -0,0 +1,97 @@ +[ + { + "EventName": "topdown.backend_bound.core.barrier_csr", + "EventCode": "0x0000003c", + "BriefDescription": "Stall cycles caused by CSR barrier at Rename" + }, + { + "EventName": "topdown.backend_bound.core.highload", + "EventCode": "0x0000003d", + "BriefDescription": "Stall cycles caused by high load (ptag full, etc.) 
at Rename" + }, + { + "EventName": "topdown.backend_bound.core.rob_full", + "EventCode": "0x0000003e", + "BriefDescription": "Stall cycles caused by ROB full at Rename" + }, + { + "EventName": "topdown.backend_bound.core.flush_or_rebuild", + "EventCode": "0x0000003f", + "BriefDescription": "Stall cycles caused by flush or unfinished rebuilding at Rename" + }, + { + "EventName": "ieu.de.inst_cnt", + "EventCode": "0x00000040", + "BriefDescription": "instr nums in de" + }, + { + "EventName": "ieu.rn.inst_cnt", + "EventCode": "0x00000041", + "BriefDescription": "instr nums in rn" + }, + { + "EventName": "topdown.bad_speculation.exception_flush", + "EventCode": "0x00000042", + "BriefDescription": "Flushes generated due to exceptions" + }, + { + "EventName": "topdown.bad_speculation.interrupt_flush", + "EventCode": "0x00000043", + "BriefDescription": "Flushes generated due to interrupts" + }, + { + "EventName": "topdown.bad_speculation.other_flush", + "EventCode": "0x00000044", + "BriefDescription": "Other flushes generated" + }, + { + "EventName": "inst.int.alu.spec", + "EventCode": "0x00000045", + "BriefDescription": "Completed ALU instructions" + }, + { + "EventName": "inst.int.mul.spec", + "EventCode": "0x00000046", + "BriefDescription": "Completed MULT instructions" + }, + { + "EventName": "inst.int.div.spec", + "EventCode": "0x00000047", + "BriefDescription": "Completed DIV instructions" + }, + { + "EventName": "inst.int.csr.spec", + "EventCode": "0x00000048", + "BriefDescription": "Completed CSR instructions" + }, + { + "EventName": "ieu.is.siq.stall", + "EventCode": "0x000000c0", + "BriefDescription": "cycle nums of siq full stall cycles" + }, + { + "EventName": "ieu.is.miq.stall", + "EventCode": "0x000000c1", + "BriefDescription": "cycle nums of miq full stall cycles" + }, + { + "EventName": "ieu.is.biq.stall", + "EventCode": "0x000000c2", + "BriefDescription": "cycle nums of biq full stall cycles" + }, + { + "EventName": "ieu.is.lsiq.stall", + "EventCode": 
"0x000000c3", + "BriefDescription": "cycle nums of lsiq full stall cycles" + }, + { + "EventName": "ieu.is.vfpq.stall", + "EventCode": "0x000000c4", + "BriefDescription": "cycle nums of fpiq full stall cycles" + }, + { + "EventName": "topdown.backend_bound.core.div_busy", + "EventCode": "0x000000c5", + "BriefDescription": "cycle nums of div busy stall cycles" + } +] diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/ifu.json b/tools/perf/pmu-events/arch/riscv/thead/c930/ifu.json new file mode 100644 index 000000000000..11057f66f797 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/ifu.json @@ -0,0 +1,62 @@ +[ + { + "EventName": "inst.mispred.branch", + "EventCode": "0x00000018", + "BriefDescription": "speculative cond br mispred num" + }, + { + "EventName": "inst.mispred.uncond_branch", + "EventCode": "0x00000019", + "BriefDescription": "speculative uncond br mispred num" + }, + { + "EventName": "inst.mispred.ind", + "EventCode": "0x0000001a", + "BriefDescription": "speculative indir mispred num" + }, + { + "EventName": "inst.mispred.ret", + "EventCode": "0x0000001b", + "BriefDescription": "speculative return mispred num" + }, + { + "EventName": "inst.brjmp.branch.spec", + "EventCode": "0x0000001c", + "BriefDescription": "speculative execute cond br num" + }, + { + "EventName": "inst.brjmp.uncond_branch.spec", + "EventCode": "0x0000001d", + "BriefDescription": "speculative execute uncond br num" + }, + { + "EventName": "inst.brjmp.ind.spec", + "EventCode": "0x0000001e", + "BriefDescription": "speculative execute indir br num" + }, + { + "EventName": "inst.brjmp.ret.spec", + "EventCode": "0x0000001f", + "BriefDescription": "speculative execute return num" + }, + { + "EventName": "inst.brjmp.branch.tk", + "EventCode": "0x00000020", + "BriefDescription": "speculative br taken num" + }, + { + "EventName": "cache.l1i.rd.miss.latency", + "EventCode": "0x000000b8", + "BriefDescription": "stall cycle because of l1 icache miss" + }, + { + "EventName": 
"tlb.l1i.miss.latency", + "EventCode": "0x000000b9", + "BriefDescription": "stall cycle because of l1 itle miss" + }, + { + "EventName": "inst.mispred.brjmp.latency", + "EventCode": "0x000000ba", + "BriefDescription": "stall cycle because of br miss" + } +] diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/l2c.json b/tools/perf/pmu-events/arch/riscv/thead/c930/l2c.json new file mode 100644 index 000000000000..885c554d5025 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/l2c.json @@ -0,0 +1,87 @@ +[ + { + "EventName": "cache.l2.wb", + "EventCode": "0x000000a2", + "BriefDescription": "l2 cache wb number, dirty snpresp data included." + }, + { + "EventName": "cache.l2.rd", + "EventCode": "0x000000a3", + "BriefDescription": "l2 cache read access." + }, + { + "EventName": "cache.l2.wr", + "EventCode": "0x000000a4", + "BriefDescription": "l2 cache store miss req from lsu." + }, + { + "EventName": "cache.l2.refill.rd", + "EventCode": "0x000000a5", + "BriefDescription": "l2 cache refill raised by lsu/ifu." + }, + { + "EventName": "cache.l2.refill.wr", + "EventCode": "0x000000a6", + "BriefDescription": "l2 cache refill raised by lsu stream write." + }, + { + "EventName": "cache.l2.wb.victim", + "EventCode": "0x000000a7", + "BriefDescription": "l2 cache write back to next-level cache raised by cache replace." + }, + { + "EventName": "cache.l2.wb.clean", + "EventCode": "0x000000a8", + "BriefDescription": "l2 cache write back to next-level cache raised by CMO or snoop." + }, + { + "EventName": "cache.l2.inval", + "EventCode": "0x000000a9", + "BriefDescription": "l2 cache invalidation to next-level cache raised by CMO or snoop." + }, + { + "EventName": "cache.l2.refill.inst", + "EventCode": "0x000000aa", + "BriefDescription": "l2 cache refill raised by ifu load miss." + }, + { + "EventName": "bus.access", + "EventCode": "0x000000ab", + "BriefDescription": "bus req count." 
+ }, + { + "EventName": "bus.rd.access", + "EventCode": "0x000000ac", + "BriefDescription": "bus read access count." + }, + { + "EventName": "bus.wr.access", + "EventCode": "0x000000af", + "BriefDescription": "bus evict/write access count." + }, + { + "EventName": "topdown.backend_bound.memory.demand_read.l3", + "EventCode": "0x000000bb", + "BriefDescription": "cacheable demand read data from l3" + }, + { + "EventName": "topdown.backend_bound.memory.demand_read.peer_core", + "EventCode": "0x000000bc", + "BriefDescription": "cacheable demand read data from peer core" + }, + { + "EventName": "topdown.backend_bound.memory.demand_read.dram", + "EventCode": "0x000000bd", + "BriefDescription": "cacheable demand read data from dram" + }, + { + "EventName": "topdown.backend_bound.memory.demand_ostd_read", + "EventCode": "0x000000be", + "BriefDescription": "cacheable demand read with l2 miss and already sent bus req cycle" + }, + { + "EventName": "topdown.backend_bound.memory.demand_read", + "EventCode": "0x000000bf", + "BriefDescription": "cacheable demand read with l2 miss" + } +] diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/lsu.json b/tools/perf/pmu-events/arch/riscv/thead/c930/lsu.json new file mode 100644 index 000000000000..5779692951e4 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/lsu.json @@ -0,0 +1,182 @@ +[ + { + "EventName": "topdown.backend_bound.memory.store.l2_miss", + "EventCode": "0x00000079", + "BriefDescription": "store l2 miss and results in issue block" + }, + { + "EventName": "topdown.backend_bound.memory.store.l1_miss", + "EventCode": "0x0000007a", + "BriefDescription": "store l1 miss and results in issue block" + }, + { + "EventName": "topdown.backend_bound.memory.load.l2_miss", + "EventCode": "0x0000007b", + "BriefDescription": "load l2 miss and results in issue block" + }, + { + "EventName": "topdown.backend_bound.memory.load.l1_miss", + "EventCode": "0x0000007c", + "BriefDescription": "load l1 miss and results in
issue block" + }, + { + "EventName": "topdown.backend_bound.memory.load.struct_hazard", + "EventCode": "0x0000007d", + "BriefDescription": "load struct hazards and results in issue block" + }, + { + "EventName": "topdown.bad_speculation.rar_hazard_early_flush", + "EventCode": "0x0000007e", + "BriefDescription": "rar hazard results in early flush" + }, + { + "EventName": "topdown.bad_speculation.rar_hazard_abnr_flush", + "EventCode": "0x0000007f", + "BriefDescription": "rar hazard results in retire flush" + }, + { + "EventName": "topdown.bad_speculation.raw_hazard_early_flush", + "EventCode": "0x00000080", + "BriefDescription": "raw hazard results in early flush" + }, + { + "EventName": "topdown.bad_speculation.raw_hazard_abnr_flush", + "EventCode": "0x00000081", + "BriefDescription": "raw hazard results in retire flush" + }, + { + "EventName": "inst.load.unalign", + "EventCode": "0x00000083", + "BriefDescription": "load unalign split" + }, + { + "EventName": "inst.store.unalign", + "EventCode": "0x00000084", + "BriefDescription": "store unalign split" + }, + { + "EventName": "cache.l1d.wr.access", + "EventCode": "0x0000008a", + "BriefDescription": "store access l1 dcache" + }, + { + "EventName": "cache.l1d.wr.miss", + "EventCode": "0x0000008b", + "BriefDescription": "store l1 dcache miss" + }, + { + "EventName": "cache.l1d.refill.inner", + "EventCode": "0x0000008c", + "BriefDescription": "l1 dcache miss and l2c hit" + }, + { + "EventName": "cache.l1d.refill.outer", + "EventCode": "0x0000008d", + "BriefDescription": "l1 dcache miss and l2c miss" + }, + { + "EventName": "cache.l1d.wb", + "EventCode": "0x0000008e", + "BriefDescription": "l1 dcache dirty line eviction" + }, + { + "EventName": "cache.l1d.wb.victim", + "EventCode": "0x0000008f", + "BriefDescription": "l1 dcache dirty line evicted by new cache line refill" + }, + { + "EventName": "cache.l1d.wb.clean", + "EventCode": "0x00000090", + "BriefDescription": "l1 dcache dirty line evicted by cmo or snoop" + }, + 
{ + "EventName": "cache.l1d.inval", + "EventCode": "0x00000091", + "BriefDescription": "l1 dcache line invalidated by cmo or snoop" + }, + { + "EventName": "inst.ldst.load.spec", + "EventCode": "0x00000092", + "BriefDescription": "load inst, not include prefetch" + }, + { + "EventName": "inst.ldst.store.spec", + "EventCode": "0x00000093", + "BriefDescription": "store inst, not include cmo" + }, + { + "EventName": "inst.ldst.lr.spec", + "EventCode": "0x00000094", + "BriefDescription": "lr inst" + }, + { + "EventName": "inst.ldst.sc", + "EventCode": "0x00000095", + "BriefDescription": "sc inst" + }, + { + "EventName": "inst.ldst.sc.pass", + "EventCode": "0x00000096", + "BriefDescription": "sc pass" + }, + { + "EventName": "inst.ldst.sc.fail", + "EventCode": "0x00000097", + "BriefDescription": "sc fail" + }, + { + "EventName": "inst.ldst.amo", + "EventCode": "0x00000098", + "BriefDescription": "amo inst" + }, + { + "EventName": "inst.ldst.load_acquire.spec", + "EventCode": "0x00000099", + "BriefDescription": "load acquire inst" + }, + { + "EventName": "inst.ldst.store_release.spec", + "EventCode": "0x0000009a", + "BriefDescription": "store release inst" + }, + { + "EventName": "inst.ldst.fence", + "EventCode": "0x0000009b", + "BriefDescription": "fence inst" + }, + { + "EventName": "inst.ldst.fencei", + "EventCode": "0x0000009c", + "BriefDescription": "fencei inst" + }, + { + "EventName": "inst.ldst.dvm_sync", + "EventCode": "0x0000009d", + "BriefDescription": "dvm sync inst" + }, + { + "EventName": "inst.ldst.vec_load.spec", + "EventCode": "0x0000009e", + "BriefDescription": "vector load inst(each split inst counts 1)" + }, + { + "EventName": "inst.ldst.vec_store.spec", + "EventCode": "0x0000009f", + "BriefDescription": "vector store inst(each split inst counts 1)" + }, + { + "EventName": "inst.ldst.float_load.spec", + "EventCode": "0x000000a0", + "BriefDescription": "float load inst" + }, + { + "EventName": "inst.ldst.float_store.spec", + "EventCode": "0x000000a1", 
+ "BriefDescription": "float store inst" + }, + { + "EventName": "inst.ldst.spec", + "EventCode": "0x000000b3", + "BriefDescription": "lsu inst cmplt num" + } +] diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/vfpu.json b/tools/perf/pmu-events/arch/riscv/thead/c930/vfpu.json new file mode 100644 index 000000000000..4412e4acc817 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/vfpu.json @@ -0,0 +1,177 @@ +[ + { + "EventName": "vfpu_ex1_rep", + "EventCode": "0x00000059", + "BriefDescription": "Number of inst replayed in VFPU EX1 stage" + }, + { + "EventName": "topdown.backend_bound.core.rvv_stall", + "EventCode": "0x0000005a", + "BriefDescription": "Number of stall cycles caused by vfpu blocking execution instructions" + }, + { + "EventName": "inst.sca.fp.arith.spec", + "EventCode": "0x0000005b", + "BriefDescription": "Number of executed scalar floating-point instructions" + }, + { + "EventName": "inst.fp.arith.half.spec", + "EventCode": "0x0000005c", + "BriefDescription": "Number of executed half-precision scalar floating-point micro-instructions" + }, + { + "EventName": "inst.fp.arith.single.spec", + "EventCode": "0x0000005d", + "BriefDescription": "Number of executed single-precision scalar floating-point micro-instructions" + }, + { + "EventName": "inst.fp.arith.double.spec", + "EventCode": "0x0000005e", + "BriefDescription": "Number of executed double-precision scalar floating-point micro-instructions" + }, + { + "EventName": "inst.rvv.arith.spec", + "EventCode": "0x0000005f", + "BriefDescription": "Number of executed vector macro-instructions" + }, + { + "EventName": "uop.rvv.arith.spec", + "EventCode": "0x00000060", + "BriefDescription": "Number of executed vector micro-instructions" + }, + { + "EventName": "uop.rvv.arith.fp.spec", + "EventCode": "0x00000061", + "BriefDescription": "Number of executed vector floating-point micro-instructions" + }, + { + "EventName": "inst.rvv.arith.fp.spec", + "EventCode": "0x00000062", + 
"BriefDescription": "Number of executed vector floating-point macro-instructions" + }, + { + "EventName": "uop.sca.fp.arith.bf.spec", + "EventCode": "0x00000063", + "BriefDescription": "Number of executed bfloat16 micro-instructions" + }, + { + "EventName": "uop.rvv.arith.vint8.spec", + "EventCode": "0x00000064", + "BriefDescription": "Number of vector micro-instructions executed with source operands of type int8" + }, + { + "EventName": "uop.rvv.arith.vint16.spec", + "EventCode": "0x00000065", + "BriefDescription": "Number of vector micro-instructions executed with source operands of type int16" + }, + { + "EventName": "uop.rvv.arith.vint32.spec", + "EventCode": "0x00000066", + "BriefDescription": "Number of vetor micro-instructions executed with source operands of type int32" + }, + { + "EventName": "uop.rvv.arith.vint64.spec", + "EventCode": "0x00000067", + "BriefDescription": "Number of vector micro-instructions executed with source operands of type int64" + }, + { + "EventName": "uop.rvv.arith.fix_point.spec", + "EventCode": "0x00000068", + "BriefDescription": "Number of executed fixed-point micro-instructions" + }, + { + "EventName": "uop.fp.arith.fdiv.spec", + "EventCode": "0x00000069", + "BriefDescription": "Number of executed floting-point division micro-instructions" + }, + { + "EventName": "uop.rvv.arith.idiv.spec", + "EventCode": "0x0000006a", + "BriefDescription": "Number of executed integer division micro-instructions" + }, + { + "EventName": "uop.rvv.arith.zvkn.spec", + "EventCode": "0x0000006b", + "BriefDescription": "Number of executed ZVKN-extension micro-instructions" + }, + { + "EventName": "uop.rvv.arith.zvks.spec", + "EventCode": "0x0000006c", + "BriefDescription": "Number of executed ZVKS-extension micro-instructions" + }, + { + "EventName": "uop.rvv.arith.vmulu64.spec", + "EventCode": "0x0000006d", + "BriefDescription": "Number of micro-instructions executed in vector 64-bit integer multiplication unit" + }, + { + "EventName": 
"uop.rvv.arith.vmulu.spec", + "EventCode": "0x0000006e", + "BriefDescription": "Number of micro-instructions executed in vector integer multiplication unit" + }, + { + "EventName": "uop.rvv.arith.vdot.spec", + "EventCode": "0x0000006f", + "BriefDescription": "Number of executed RISC-V and Xuantie dot-extension micro-instructions" + }, + { + "EventName": "uop.rvv.arith.vfmul.spec", + "EventCode": "0x00000070", + "BriefDescription": "Number of executed floating-point multiplication micro-instructions" + }, + { + "EventName": "uop.rvv.arith.vfadd.spec", + "EventCode": "0x00000071", + "BriefDescription": "Number of executed floating-point addition micro-instructions" + }, + { + "EventName": "uop.rvv.arith.perm.spec", + "EventCode": "0x00000072", + "BriefDescription": "Number of executed permutation micro-instructions" + }, + { + "EventName": "uop.rvv.arith.redu.spec", + "EventCode": "0x00000073", + "BriefDescription": "Number of executed integer reduction micro-instructions" + }, + { + "EventName": "uop.rvv.arith.fred.spec", + "EventCode": "0x00000074", + "BriefDescription": "Number of executed floating-point reduction micro-instructions" + }, + { + "EventName": "uop.rvv.arith.mask.spec", + "EventCode": "0x00000075", + "BriefDescription": "Number of executed vector masked micro-instructions" + }, + { + "EventName": "uop.rvv.arith.vlmax.spec", + "EventCode": "0x00000076", + "BriefDescription": "Number of vector micro-instructions excuted with max number of element" + }, + { + "EventName": "uop.rvv.arith.agnostic.spec", + "EventCode": "0x00000077", + "BriefDescription": "Number of vector micro-instructions excuted with agnostic policy" + }, + { + "EventName": "uop.rvv.arith.idiv_direct.spec", + "EventCode": "0x00000078", + "BriefDescription": "Number of vector integer division micro-instructions producing special value types" + }, + { + "EventName": "uop.rvv.vadd.spec", + "EventCode": "0x000000ae", + "BriefDescription": "Number of executed vector integer addition 
micro-instructions" + }, + { + "EventName": "uop.rvv.vmacc.spec", + "EventCode": "0x000000af", + "BriefDescription": "Number of executed vector integer multiply-add micro-instructions" + }, + { + "EventName": "uop.rvv.vfmacc.spec", + "EventCode": "0x000000b0", + "BriefDescription": "Number of executed vector floating-point multiply-add micro-instructions" + } +] -- 2.49.0 From cp0613 at linux.alibaba.com Wed Sep 10 05:11:21 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Wed, 10 Sep 2025 20:11:21 +0800 Subject: [PATCH 2/2] perf vendor events riscv: Add T-HEAD C930 metrics In-Reply-To: <20250910121121.7203-1-cp0613@linux.alibaba.com> References: <20250910121121.7203-1-cp0613@linux.alibaba.com> Message-ID: <20250910121121.7203-3-cp0613@linux.alibaba.com> From: Chen Pei This patch adds T-HEAD C930 metrics, including topdown and some other metric groups. Signed-off-by: Chen Pei --- .../arch/riscv/thead/c930/metrics.json | 538 ++++++++++++++++++ 1 file changed, 538 insertions(+) create mode 100644 tools/perf/pmu-events/arch/riscv/thead/c930/metrics.json diff --git a/tools/perf/pmu-events/arch/riscv/thead/c930/metrics.json b/tools/perf/pmu-events/arch/riscv/thead/c930/metrics.json new file mode 100644 index 000000000000..689bae6209dc --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/thead/c930/metrics.json @@ -0,0 +1,538 @@ +[ + { + "MetricExpr": "(topdown.frontend_bound.slots - topdown.bad_speculation.recovery_bubbles*8) / topdown.slots ", + "PublicDescription": "Fraction of slots unused due to the frontend's inability to supply enough uops", + "BriefDescription": "Fraction of slots unused due to the frontend's inability to supply enough uops", + "CommonMetricgroupName": "TopdownL1", + "MetricGroup": "Common;TopdownL1", + "MetricName": "topdown.frontend_bound.rate" + }, + { + "MetricExpr": "(uop.spec - uop.ret + topdown.bad_speculation.recovery_bubbles*8) / topdown.slots", + "PublicDescription": "Fraction of slots wasted due to incorrect 
speculations", + "BriefDescription": "Fraction of slots wasted due to incorrect speculations", + "CommonMetricgroupName": "TopdownL1", + "MetricGroup": "Common;TopdownL1", + "MetricName": "topdown.bad_speculation.rate" + }, + { + "MetricExpr": "uop.ret / topdown.slots", + "PublicDescription": "Fraction of slots that retired", + "BriefDescription": "Fraction of slots that retired", + "CommonMetricgroupName": "TopdownL1", + "MetricGroup": "Common;TopdownL1", + "MetricName": "topdown.retiring.rate" + }, + { + "MetricExpr": "1 - (topdown.frontend_bound.rate + topdown.bad_speculation.rate + topdown.retiring.rate)", + "PublicDescription": "Fraction of slots unused due to a lack of backend resources", + "BriefDescription": "Fraction of slots unused due to a lack of backend resources", + "CommonMetricgroupName": "TopdownL1", + "MetricGroup": "Common;TopdownL1", + "MetricName": "topdown.backend_bound.rate" + }, + { + "MetricExpr": "topdown.frontend_bound.latency.slots / topdown.slots", + "PublicDescription": "Fetch latency bound L2 topdown metric", + "BriefDescription": "Fetch latency bound L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.frontend_bound.latency.rate" + }, + { + "MetricExpr": "topdown.frontend_bound.rate - topdown.frontend_bound.latency.rate", + "PublicDescription": "Fetch bandwidth bound L2 topdown metric", + "BriefDescription": "Fetch bandwidth bound L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.frontend_bound.bandwidth.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.rate * inst.mispred.brjmp.spec / (inst.mispred.brjmp.spec + topdown.bad_speculation.exception_flush + topdown.bad_speculation.interrupt_flush + topdown.bad_speculation.other_flush + topdown.bad_speculation.rar_hazard_early_flush + topdown.bad_speculation.raw_hazard_early_flush)", + "PublicDescription": "Branch mispredicts L2 topdown metric", + "BriefDescription": "Branch mispredicts L2 topdown metric", + "MetricGroup": "TopdownL2", 
+ "MetricName": "topdown.bad_speculation.branch_mispredicts.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.rate * (topdown.bad_speculation.raw_hazard_early_flush + topdown.bad_speculation.rar_hazard_early_flush) / (inst.mispred.brjmp.spec + topdown.bad_speculation.exception_flush + topdown.bad_speculation.interrupt_flush + topdown.bad_speculation.other_flush + topdown.bad_speculation.rar_hazard_early_flush + topdown.bad_speculation.raw_hazard_early_flush)", + "PublicDescription": "Load/Store early flush L2 topdown metric", + "BriefDescription": "Load/Store early flush L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.bad_speculation.ldst_early_flush.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.rate - (topdown.bad_speculation.branch_mispredicts.rate + topdown.bad_speculation.ldst_early_flush.rate)", + "PublicDescription": "Machine clears L2 topdown metric", + "BriefDescription": "Machine clears L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.bad_speculation.machine_clears.rate" + }, + { + "MetricExpr": "topdown.backend_bound.rate * (topdown.backend_bound.memory.load + topdown.backend_bound.memory.store)/cycles.hart", + "PublicDescription": "Memory bound L2 topdown metric", + "BriefDescription": "Memory bound L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.backend_bound.memory_bound.rate" + }, + { + "MetricExpr": "topdown.backend_bound.rate - topdown.backend_bound.memory_bound.rate", + "PublicDescription": "Core bound L2 topdown metric", + "BriefDescription": "Core bound L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.backend_bound.core_bound.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * (inst.int.alu.spec + inst.int.mul.spec + inst.int.div.spec + inst.int.csr.spec) / uop.spec", + "PublicDescription": "Integer operations L2 topdown metric", + "BriefDescription": "Integer operations L2 topdown metric", + "MetricGroup": "TopdownL2", + 
"MetricName": "topdown.retiring.int.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.brjmp.spec / uop.spec", + "PublicDescription": "Branch operations L2 topdown metric", + "BriefDescription": "Branch operations L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.retiring.brjmp.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.spec / uop.spec", + "PublicDescription": "Load/Store operations L2 topdown metric", + "BriefDescription": "Load/Store operations L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.retiring.ldst.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.sca.fp.arith.spec / uop.spec", + "PublicDescription": "Scalar float point arithmetic operations L2 topdown metric", + "BriefDescription": "Scalar float point arithmetic operations L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.retiring.fp.arith.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.rvv.arith.spec / uop.spec", + "PublicDescription": "Vector arithmetic operations L2 topdown metric", + "BriefDescription": "Vector arithmetic operations L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.retiring.rvv.arith.rate" + }, + { + "MetricExpr": "topdown.retiring.rate - topdown.retiring.int.rate - topdown.retiring.brjmp.rate - topdown.retiring.ldst.rate - topdown.retiring.fp.arith.rate - topdown.retiring.rvv.arith.rate", + "PublicDescription": "Other operations L2 topdown metric", + "BriefDescription": "Other operations L2 topdown metric", + "MetricGroup": "TopdownL2", + "MetricName": "topdown.retiring.other.rate" + }, + { + "MetricExpr": "cache.l1i.rd.miss.latency / cycles.hart", + "PublicDescription": "Idle by icache miss L3 topdown metric", + "BriefDescription": "Idle by icache miss L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.frontend_bound.latency.data.rate" + }, + { + "MetricExpr": "tlb.l1i.miss.latency / cycles.hart", + 
"PublicDescription": "Idle by itlb miss L3 topdown metric", + "BriefDescription": "Idle by itlb miss L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.frontend_bound.latency.addr.rate" + }, + { + "MetricExpr": "inst.mispred.brjmp.latency / cycles.hart", + "PublicDescription": "Idle by fetch pipeline bubbles L3 topdown metric", + "BriefDescription": "Idle by fetch pipeline bubbles L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.frontend_bound.latency.redirect.rate" + }, + { + "MetricExpr": "inst.mispred.brjmp.spec / inst.brjmp.spec", + "PublicDescription": "Branch misprediction rate L3 topdown metric", + "BriefDescription": "Branch misprediction rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.rate" + }, + { + "MetricExpr": "inst.mispred.brjmp.spec * 1000 / uop.spec", + "PublicDescription": "Branch misprediction per 1000 instructions L3 topdown metric", + "BriefDescription": "Branch misprediction per 1000 instructions L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.pki" + }, + { + "MetricExpr": "inst.mispred.branch / inst.brjmp.branch.spec", + "PublicDescription": "Condition branch misprediction rate L3 topdown metric", + "BriefDescription": "Condition branch misprediction rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.branch.rate" + }, + { + "MetricExpr": "inst.mispred.branch *1000 / uop.spec", + "PublicDescription": "Condition branch misprediction per 1000 instructions L3 topdown metric", + "BriefDescription": "Condition branch misprediction per 1000 instructions L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.branch.pki" + }, + { + "MetricExpr": "inst.mispred.uncond_branch / inst.brjmp.uncond_branch.spec", + "PublicDescription": "Unconditional branch (exclude indirect branch and function return) misprediction rate L3 topdown metric", + "BriefDescription": "Unconditional branch 
(exclude indirect branch and function return) misprediction rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.uncond_branch.rate" + }, + { + "MetricExpr": "inst.mispred.uncond_branch *1000 / uop.spec", + "PublicDescription": "Unconditional branch (exclude indirect branch and function return) misprediction per 1000 instructions L3 topdown metric", + "BriefDescription": "Unconditional branch (exclude indirect branch and function return) misprediction per 1000 instructions L3 topdown metric", + "MetricGroup": "TopdownL3;Per-instruction", + "MetricName": "inst.mispred.uncond_branch.pki" + }, + { + "MetricExpr": "inst.mispred.ind / inst.brjmp.ind.spec", + "PublicDescription": "Indirect branch misprediction rate L3 topdown metric", + "BriefDescription": "Indirect branch misprediction rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.ind.rate" + }, + { + "MetricExpr": "inst.mispred.ind *1000 / uop.spec", + "PublicDescription": "Indirect branch misprediction per 1000 instructions L3 topdown metric", + "BriefDescription": "Indirect branch misprediction per 1000 instructions L3 topdown metric", + "MetricGroup": "TopdownL3;Per-instruction", + "MetricName": "inst.mispred.ind.pki" + }, + { + "MetricExpr": "inst.mispred.ret / inst.brjmp.ret.spec", + "PublicDescription": "Function return misprediction rate L3 topdown metric", + "BriefDescription": "Function return misprediction rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "inst.mispred.ret.rate" + }, + { + "MetricExpr": "inst.mispred.ret *1000 / uop.spec", + "PublicDescription": "Function return misprediction per 1000 instructions L3 topdown metric", + "BriefDescription": "Function return misprediction per 1000 instructions L3 topdown metric", + "MetricGroup": "TopdownL3;Per-instruction", + "MetricName": "inst.mispred.ret.pki" + }, + { + "MetricExpr": "topdown.bad_speculation.raw_hazard_early_flush / 
(topdown.bad_speculation.raw_hazard_early_flush + topdown.bad_speculation.rar_hazard_early_flush)", + "PublicDescription": "RAW early flush rate L3 topdown metric", + "BriefDescription": "RAW early flush rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.bad_speculation.raw_hazard_early_flush.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.rar_hazard_early_flush / (topdown.bad_speculation.raw_hazard_early_flush + topdown.bad_speculation.rar_hazard_early_flush)", + "PublicDescription": "RAR early flush rate L3 topdown metric", + "BriefDescription": "RAR early flush rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.bad_speculation.rar_hazard_early_flush.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.exception_flush / (topdown.bad_speculation.exception_flush + topdown.bad_speculation.interrupt_flush + topdown.bad_speculation.other_flush)", + "PublicDescription": "exception flush rate L3 topdown metric", + "BriefDescription": "exception flush rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.bad_speculation.exception_flush.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.interrupt_flush / (topdown.bad_speculation.exception_flush + topdown.bad_speculation.interrupt_flush + topdown.bad_speculation.other_flush)", + "PublicDescription": "interrupt flush rate L3 topdown metric", + "BriefDescription": "interrupt flush rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.bad_speculation.interrupt_flush.rate" + }, + { + "MetricExpr": "topdown.bad_speculation.other_flush / (topdown.bad_speculation.exception_flush + topdown.bad_speculation.interrupt_flush + topdown.bad_speculation.other_flush)", + "PublicDescription": "other flush rate L3 topdown metric", + "BriefDescription": "other flush rate L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.bad_speculation.other_flush.rate" + }, + { + "MetricExpr": 
"topdown.retiring.rate * inst.int.alu.spec / uop.spec", + "PublicDescription": "Arithmetic operations L3 topdown metric", + "BriefDescription": "Arithmetic operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.int.alu.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.int.mul.spec / uop.spec", + "PublicDescription": "Multiplication operations L3 topdown metric", + "BriefDescription": "Multiplication operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.int.mul.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.int.div.spec / uop.spec", + "PublicDescription": "Division operations L3 topdown metric", + "BriefDescription": "Division operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.int.div.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.int.csr.spec / uop.spec", + "PublicDescription": "CSR access operations L3 topdown metric", + "BriefDescription": "CSR access operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.int.csr.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.brjmp.branch.spec / uop.spec", + "PublicDescription": "Conditional branch operations L3 topdown metric", + "BriefDescription": "Conditional branch operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.brjmp.branch.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.brjmp.ind.spec / uop.spec", + "PublicDescription": "Indirect branch operations L3 topdown metric", + "BriefDescription": "Indirect branch operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.brjmp.ind.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.brjmp.ret.spec / uop.spec", + "PublicDescription": "Function return operations L3 topdown metric", + "BriefDescription": "Function return operations L3 topdown metric", + "MetricGroup": 
"TopdownL3", + "MetricName": "topdown.retiring.brjmp.ret.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.brjmp.uncond_branch.spec / uop.spec", + "PublicDescription": "Unconditional branch operations L3 topdown metric", + "BriefDescription": "Unconditional branch operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.brjmp.uncond_branch.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.load.spec / uop.spec", + "PublicDescription": "Load operations L3 topdown metric", + "BriefDescription": "Load operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.load.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.store.spec / uop.spec", + "PublicDescription": "Store operations L3 topdown metric", + "BriefDescription": "Store operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.store.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.float_load.spec / uop.spec", + "PublicDescription": "Float load operations L3 topdown metric", + "BriefDescription": "Float laod operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.float_load.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.float_store.spec / uop.spec", + "PublicDescription": "Float store operations L3 topdown metric", + "BriefDescription": "Float store operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.float_store.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.vec_load.spec / uop.spec", + "PublicDescription": "Vector load operations L3 topdown metric", + "BriefDescription": "Vector load operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.vec_load.rate" + }, + { + "MetricExpr": "topdown.retiring.rate * inst.ldst.vec_store.spec / uop.spec", + "PublicDescription": 
"Vector store operations L3 topdown metric", + "BriefDescription": "Vector store operations L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.retiring.ldst.vec_store.rate" + }, + { + "MetricExpr": "topdown.backend_bound.core.barrier_csr / cycles.hart", + "PublicDescription": "Core bound barrier stall L3 topdown metric", + "BriefDescription": "Core bound barrier stall L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.barrier.rate" + }, + { + "MetricExpr": "topdown.backend_bound.core.highload / cycles.hart", + "PublicDescription": "Core bound high load stall L3 topdown metric", + "BriefDescription": "Core bound high load stall L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.highload.rate" + }, + { + "MetricExpr": "topdown.backend_bound.core.rob_full / cycles.hart", + "PublicDescription": "Core bound rob full stall L3 topdown metric", + "BriefDescription": "Core bound rob full stall L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.rob_full.rate" + }, + { + "MetricExpr": " (ieu.is.siq.stall + ieu.is.miq.stall + ieu.is.biq.stall + ieu.is.lsiq.stall + ieu.is.vfpq.stall) / cycles.hart", + "PublicDescription": "Core bound issue queue stall L3 topdown metric", + "BriefDescription": "Core bound issueu queue stall L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.is_stall.rate" + }, + { + "MetricExpr": " topdown.backend_bound.core.div_busy / cycles.hart", + "PublicDescription": "Core bound div busy stall L3 topdown metric", + "BriefDescription": "Core bound div busy stall L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.div_busy.rate" + }, + { + "MetricExpr": " topdown.backend_bound.core.rvv_stall / cycles.hart", + "PublicDescription": "Core bound rvv stall L3 topdown metric", + "BriefDescription": "Core bound rvv stall L3 topdown 
metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.core.rvv_stall.rate" + }, + { + "MetricExpr": "topdown.backend_bound.memory.load.l2_miss / (topdown.backend_bound.memory.load + topdown.backend_bound.memory.store)", + "PublicDescription": "External memory bound L3 topdown metric", + "BriefDescription": "External memory bound L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.memory.ext_mem_bound.rate" + }, + { + "MetricExpr": "topdown.backend_bound.memory.load.l1_miss / (topdown.backend_bound.memory.load + topdown.backend_bound.memory.store)", + "PublicDescription": "L2 memory bound L3 topdown metric", + "BriefDescription": "L2 memory bound L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.memory.l2_bound.rate" + }, + { + "MetricExpr": "1 - topdown.backend_bound.memory.ext_mem_bound.rate - topdown.backend_bound.memory.l2_bound.rate - topdown.backend_bound.memory.store_bound.rate", + "PublicDescription": "L1 memory bound L3 topdown metric", + "BriefDescription": "L1 memory bound L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.memory.l1_bound.rate" + }, + { + "MetricExpr": "topdown.backend_bound.memory.store / (topdown.backend_bound.memory.load + topdown.backend_bound.memory.store)", + "PublicDescription": "Store bound L3 topdown metric", + "BriefDescription": "Store bound L3 topdown metric", + "MetricGroup": "TopdownL3", + "MetricName": "topdown.backend_bound.memory.store_bound.rate" + }, + { + "MetricExpr": "cycles.hart * 8", + "PublicDescription": "Total slots", + "BriefDescription": "Total slots", + "MetricGroup": "Common", + "MetricName": "topdown.slots" + }, + { + "MetricExpr": "uop.ret / cycles.hart", + "PublicDescription": "Instructions per cycle", + "BriefDescription": "Instructions per cycle", + "MetricGroup": "Common;Per-cycle", + "MetricName": "ipc" + }, + { + "MetricExpr": "cache.l1d.rd.miss * 1000 / uop.ret", 
+ "PublicDescription": "l1 dcache access misses per 1000 instructions", + "BriefDescription": "l1 dcache access misses per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "cache.l1d.rd.data.mpki" + }, + { + "MetricExpr": "cache.l1d.wr.miss * 1000 / uop.ret", + "PublicDescription": "l1 dcache write access misses per 1000 instructions", + "BriefDescription": "l1 dcache write access misses per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "cache.l1d.wr.data.mpki" + }, + { + "MetricExpr": "cache.l1i.rd.miss * 1000 / uop.ret", + "PublicDescription": "l1 icache access misses per 1000 instructions", + "BriefDescription": "l1 icache access misses per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "cache.l1i.rd.code.mpki" + }, + { + "MetricExpr": "cache.l2.miss * 1000 / uop.ret", + "PublicDescription": "l2 cache access misses per 1000 instructions", + "BriefDescription": "l2 cache access misses per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "cache.l2.mpki" + }, + { + "MetricExpr": "tlb.l1d.miss * 1000 / uop.ret", + "PublicDescription": "L1 TLB misses caused by data loads or stores per 1000 instructions", + "BriefDescription": "L1 TLB misses caused by data loads or stores per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "tlb.l1d.ldst.mpki" + }, + { + "MetricExpr": "tlb.l1i.miss * 1000 / uop.ret", + "PublicDescription": "L1 TLB misses caused by instruction fetch per 1000 instructions", + "BriefDescription": "L1 TLB misses caused by instruction fetch per 1000 instructions", + "MetricGroup": "Common;Per-instruction", + "MetricName": "tlb.l1i.code.mpki" + }, + { + "MetricExpr": "cache.l1d.rd.miss / cache.l1d.rd.access", + "PublicDescription": "l1 dcache access miss rate", + "BriefDescription": "l1 dcache access miss rate", + "MetricGroup": "Common", + "MetricName": "cache.l1d.rd.data.miss.rate" + }, + { + 
"MetricExpr": "cache.l1d.wr.miss / cache.l1d.wr.access", + "PublicDescription": "l1 dcache write access miss rate", + "BriefDescription": "l1 dcache write access miss rate", + "MetricGroup": "Common", + "MetricName": "cache.l1d.wr.data.miss.rate" + }, + { + "MetricExpr": "cache.l1i.rd.miss / cache.l1i.rd.access", + "PublicDescription": "l1 icache access miss rate", + "BriefDescription": "l1 icache access miss rate", + "MetricGroup": "Common", + "MetricName": "cache.l1i.rd.code.miss.rate" + }, + { + "MetricExpr": "cache.l2.miss / cache.l2.access", + "PublicDescription": "l2 cache access miss rate", + "BriefDescription": "l2 cache access miss rate", + "MetricGroup": "Common", + "MetricName": "cache.l2.rd.miss.rate" + }, + { + "MetricExpr": "1 - topdown.backend_bound.memory.demand_read.l3 / topdown.backend_bound.memory.demand_read", + "PublicDescription": "LLC access miss rate", + "BriefDescription": "lLC access miss rate", + "MetricGroup": "Common", + "MetricName": "cache.l3.rd.miss.rate" + }, + { + "MetricExpr": "tlb.l1d.miss / tlb.l1d.access", + "PublicDescription": "L1 TLB misses caused by data loads or stores rate", + "BriefDescription": "L1 TLB misses caused by data loads or stores rate", + "MetricGroup": "Common", + "MetricName": "tlb.l1d.ldst.miss.rate" + }, + { + "MetricExpr": "tlb.l1i.miss / tlb.l1i.access", + "PublicDescription": "L1 TLB misses caused by instruction fetch rate", + "BriefDescription": "L1 TLB misses caused by instruction fetch rate", + "MetricGroup": "Common", + "MetricName": "tlb.l1i.code.miss.rate" + }, + { + "MetricExpr": " bus.rd.access / bus.access", + "PublicDescription": "Bus read access rate", + "BriefDescription": "Bus read access rate", + "MetricGroup": "Common", + "MetricName": "bus.rd.rate" + }, + { + "MetricExpr": " bus.wr.access / bus.access", + "PublicDescription": "Bus write access rate", + "BriefDescription": "Bus write access rate", + "MetricGroup": "Common", + "MetricName": "bus.wr.rate" + } +] -- 2.49.0 From wens at 
csie.org Wed Sep 10 05:57:02 2025 From: wens at csie.org (Chen-Yu Tsai) Date: Wed, 10 Sep 2025 20:57:02 +0800 Subject: [PATCH v1] riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding In-Reply-To: <20250909-frown-wrinkle-f16df243a970@spud> References: <20250909-frown-wrinkle-f16df243a970@spud> Message-ID: <175750902214.2590389.9826563009120753959.b4-ty@csie.org> On Tue, 09 Sep 2025 20:58:17 +0100, Conor Dooley wrote: > The i2c controller binding does not permit the node name to > contain "gpio", resulting in two warnings: > > i2c-gpio-0 (i2c-gpio): $nodename:0: 'i2c-gpio-0' does not match '^i2c(@.+|-[a-z0-9]+)?$' > i2c-gpio-0 (i2c-gpio): Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'adc@54' were unexpected) > > Drop it to satisfy dtbs_check. > > [...] Applied to sunxi/fixes-for-6.17 in local tree, thanks! [1/1] riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding commit: a5d7a8ab4b21747173a2f8f0ebf71d72692793c3 Best regards, -- Chen-Yu Tsai From helgaas at kernel.org Wed Sep 10 07:23:21 2025 From: helgaas at kernel.org (Bjorn Helgaas) Date: Wed, 10 Sep 2025 09:23:21 -0500 Subject: [PATCH v2 2/7] PCI: cadence: Check pcie-ops before using it. In-Reply-To: <18aba25b853d00caf10cc784093c0b91fdc1747d.1757467895.git.unicorn_wang@outlook.com> Message-ID: <20250910142321.GA1533672@bhelgaas> Drop period at end of subject. On Wed, Sep 10, 2025 at 10:08:16AM +0800, Chen Wang wrote: > From: Chen Wang > > ops of struct cdns_pcie may be NULL, direct use > will result in a null pointer error. > > Add checking of pcie->ops before using it for new > driver that may not supply pcie->ops. 
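For illustration only, the guard pattern described in the commit message can be sketched in plain userspace C. The demo_pcie and demo_pcie_ops types below are hypothetical stand-ins, not the real struct cdns_pcie definitions; only the "ops && ops->callback" check mirrors what the patch does, and the fallback return values are copied from the helpers in the quoted diff.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_pcie;

/* Hypothetical callback table; a driver may leave the whole table NULL. */
struct demo_pcie_ops {
	int (*start_link)(struct demo_pcie *pcie);
	bool (*link_up)(struct demo_pcie *pcie);
};

/* Hypothetical controller object holding an optional ops pointer. */
struct demo_pcie {
	const struct demo_pcie_ops *ops;
};

/* Check the table first, then the individual callback, before calling. */
static int demo_pcie_start_link(struct demo_pcie *pcie)
{
	if (pcie->ops && pcie->ops->start_link)
		return pcie->ops->start_link(pcie);

	return 0; /* nothing to do when no callback is supplied */
}

static bool demo_pcie_link_up(struct demo_pcie *pcie)
{
	if (pcie->ops && pcie->ops->link_up)
		return pcie->ops->link_up(pcie);

	return true; /* assume the link is up when it cannot be queried */
}
```

With this shape, a driver that sets ops to NULL and a driver that supplies a table with some callbacks left NULL both take the fallback path instead of dereferencing a NULL pointer.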
> > Signed-off-by: Chen Wang > --- > drivers/pci/controller/cadence/pcie-cadence-host.c | 2 +- > drivers/pci/controller/cadence/pcie-cadence.c | 4 ++-- > drivers/pci/controller/cadence/pcie-cadence.h | 6 +++--- > 3 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c > index 59a4631de79f..fffd63d6665e 100644 > --- a/drivers/pci/controller/cadence/pcie-cadence-host.c > +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c > @@ -531,7 +531,7 @@ static int cdns_pcie_host_init_address_translation(struct cdns_pcie_rc *rc) > cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_PCI_ADDR1(0), addr1); > cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(0), desc1); > > - if (pcie->ops->cpu_addr_fixup) > + if (pcie->ops && pcie->ops->cpu_addr_fixup) > cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); > > addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(12) | > diff --git a/drivers/pci/controller/cadence/pcie-cadence.c b/drivers/pci/controller/cadence/pcie-cadence.c > index 70a19573440e..61806bbd8aa3 100644 > --- a/drivers/pci/controller/cadence/pcie-cadence.c > +++ b/drivers/pci/controller/cadence/pcie-cadence.c > @@ -92,7 +92,7 @@ void cdns_pcie_set_outbound_region(struct cdns_pcie *pcie, u8 busnr, u8 fn, > cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(r), desc1); > > /* Set the CPU address */ > - if (pcie->ops->cpu_addr_fixup) > + if (pcie->ops && pcie->ops->cpu_addr_fixup) > cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); > > addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(nbits) | > @@ -123,7 +123,7 @@ void cdns_pcie_set_outbound_region_for_normal_msg(struct cdns_pcie *pcie, > } > > /* Set the CPU address */ > - if (pcie->ops->cpu_addr_fixup) > + if (pcie->ops && pcie->ops->cpu_addr_fixup) > cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); > > addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(17) | > diff --git 
a/drivers/pci/controller/cadence/pcie-cadence.h b/drivers/pci/controller/cadence/pcie-cadence.h > index 1d81c4bf6c6d..2f07ba661bda 100644 > --- a/drivers/pci/controller/cadence/pcie-cadence.h > +++ b/drivers/pci/controller/cadence/pcie-cadence.h > @@ -468,7 +468,7 @@ static inline u32 cdns_pcie_ep_fn_readl(struct cdns_pcie *pcie, u8 fn, u32 reg) > > static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) > { > - if (pcie->ops->start_link) > + if (pcie->ops && pcie->ops->start_link) > return pcie->ops->start_link(pcie); > > return 0; > @@ -476,13 +476,13 @@ static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) > > static inline void cdns_pcie_stop_link(struct cdns_pcie *pcie) > { > - if (pcie->ops->stop_link) > + if (pcie->ops && pcie->ops->stop_link) > pcie->ops->stop_link(pcie); > } > > static inline bool cdns_pcie_link_up(struct cdns_pcie *pcie) > { > - if (pcie->ops->link_up) > + if (pcie->ops && pcie->ops->link_up) > return pcie->ops->link_up(pcie); > > return true; > -- > 2.34.1 > From conor at kernel.org Wed Sep 10 07:27:19 2025 From: conor at kernel.org (Conor Dooley) Date: Wed, 10 Sep 2025 15:27:19 +0100 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> References: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> Message-ID: <20250910-harmless-bamboo-ebc94758fdad@spud> On Tue, Sep 09, 2025 at 06:53:11PM +0200, Asuna Yang wrote: > Commit 33549fcf37ec ("RISC-V: disallow gcc + rust builds") disabled GCC > + Rust builds for RISC-V due to differences in extension handling > compared to LLVM. > > Add a Kconfig symbol to indicate the version of libclang used by Rust > bindgen and add conditions for the availability of libclang to the > RISC-V extension Kconfig symbols that depend on the cc-option function. 
> > For Zicsr/Zifencei special handling, since LLVM/Clang always enables > these two extensions, either don't pass them to -march, or pass them > explicitly and Rust bindgen libclang must recognize them. > > Clang does not support -mno-riscv-attribute flag, filter it out to > resolve error: unknown argument: '-mno-riscv-attribute'. > > Define BINDGEN_TARGET_riscv to pass the target triplet to Rust bindgen > libclang for RISC-V to resolve error: unsupported argument 'medany' to > option '-mcmodel=' for target 'unknown'. Improve to output a clearer > error message if the target triplet is undefined for Rust bindgen > libclang. > > Update the documentation, GCC + Rust builds are now supported. > > --- FWIW, this --- breaks git, and anything after this line (including your signoff) is lost when the patch is applied. > Discussion: > https://lore.kernel.org/linux-riscv/68496eed-b5a4-4739-8d84-dcc428a08e20@gmail.com/ > Patch v1: > https://lore.kernel.org/linux-riscv/20250903190806.2604757-1-SpriteOvO@gmail.com/ > > GCC + Rust builds for RISC-V were disabled about a year ago due to differences in > extension handling compared to LLVM, as discussed in > https://lore.kernel.org/all/20240917000848.720765-1-jmontleo@redhat.com/ > > This patch re-enables GCC + Rust builds. Compared to v1, v2 reverts the > separation of get-rust-bindgen-libclang script and improves Kconfig conditions > based on Conor's review. > > The separation of get-rust-bindgen-libclang script is reverted based on the > concerns raised by Miguel. However, it's worth noting that we now have 3 > different places, rust/Makefile and scripts/{Kconfig.include,rust_is_available.sh}, > that manually call bindgen on rust_is_available_bindgen_libclang.h + sed to get > the version of libclang, and in particular, for our newly added Kconfig symbol, > we now use awk to canonicalize the version to an integer. 
I would still like to > do the script separation later for better maintainability and readability if > possible, which can be discussed further later when Miguel has time. > > Signed-off-by: Asuna Yang > diff --git a/init/Kconfig b/init/Kconfig > index e3eb63eadc8757a10b091c74bbee8008278c0521..0859d308a48591df769c7dbaef6f035324892bd3 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -82,6 +82,12 @@ config RUSTC_LLVM_VERSION > int > default $(rustc-llvm-version) > > +config RUST_BINDGEN_LIBCLANG_VERSION > + int > + default $(rustc-bindgen-libclang-version) > + help > + This is the version of `libclang` used by the Rust bindings generator. The riscv patchwork CI stuff is really unhappy with this change: init/Kconfig:87: syntax error init/Kconfig:87: invalid statement init/Kconfig:88: invalid statement init/Kconfig:89:warning: ignoring unsupported character '`' init/Kconfig:89:warning: ignoring unsupported character '`' init/Kconfig:89:warning: ignoring unsupported character '.' init/Kconfig:89: unknown statement "This" Is this bogus, or can rustc-bindgen-libclang-version return nothing under some conditions where rust is not available? Should this have 2 default lines like some other options in the file? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: not available URL: From helgaas at kernel.org Wed Sep 10 07:34:53 2025 From: helgaas at kernel.org (Bjorn Helgaas) Date: Wed, 10 Sep 2025 09:34:53 -0500 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> Message-ID: <20250910143453.GA1533730@bhelgaas> On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: > From: Chen Wang > > Add support for PCIe controller in SG2042 SoC. The controller > uses the Cadence PCIe core programmed by pcie-cadence*.c. 
The > PCIe controller will work in host mode only, supporting data > rate(gen4) and lanes(x16 or x8). Strictly speaking, "gen4" is a spec revision, not a data rate. Include the GT/s rate instead or in addition. We can fix this when merging if there's no other reason to repost (I assume you mean 16 GT/s). Will also add spaces before the open "(". From joelagnelf at nvidia.com Wed Sep 10 07:45:54 2025 From: joelagnelf at nvidia.com (Joel Fernandes) Date: Wed, 10 Sep 2025 10:45:54 -0400 Subject: [PATCH v2 5/7] entry: Rename "kvm" entry code assets to "virt" to genericize APIs In-Reply-To: <20250828000156.23389-6-seanjc@google.com> References: <20250828000156.23389-1-seanjc@google.com> <20250828000156.23389-6-seanjc@google.com> Message-ID: <20250910144554.GA563958@joelbox2> On Wed, Aug 27, 2025 at 05:01:54PM -0700, Sean Christopherson wrote: > Rename the "kvm" entry code files and Kconfigs to use generic "virt" > nomenclature so that the code can be reused by other hypervisors (or > rather, their root/dom0 partition drivers), without incorrectly suggesting > the code somehow relies on and/or involves KVM. > > No functional change intended. 
> > Signed-off-by: Sean Christopherson > --- > MAINTAINERS | 2 +- > arch/arm64/kvm/Kconfig | 2 +- > arch/loongarch/kvm/Kconfig | 2 +- > arch/riscv/kvm/Kconfig | 2 +- > arch/x86/kvm/Kconfig | 2 +- > include/linux/{entry-kvm.h => entry-virt.h} | 8 ++++---- > include/linux/kvm_host.h | 6 +++--- > include/linux/rcupdate.h | 2 +- > kernel/entry/Makefile | 2 +- > kernel/entry/{kvm.c => virt.c} | 2 +- > kernel/rcu/tree.c | 6 +++--- For RCU part, Reviewed-by: Joel Fernandes thanks, - Joel > virt/kvm/Kconfig | 2 +- > 12 files changed, 19 insertions(+), 19 deletions(-) > rename include/linux/{entry-kvm.h => entry-virt.h} (94%) > rename kernel/entry/{kvm.c => virt.c} (97%) > > diff --git a/MAINTAINERS b/MAINTAINERS > index fe168477caa4..c255048333f0 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -10200,7 +10200,7 @@ L: linux-kernel at vger.kernel.org > S: Maintained > T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/entry > F: include/linux/entry-common.h > -F: include/linux/entry-kvm.h > +F: include/linux/entry-virt.h > F: include/linux/irq-entry-common.h > F: kernel/entry/ > > diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig > index 713248f240e0..6f4fc3caa31a 100644 > --- a/arch/arm64/kvm/Kconfig > +++ b/arch/arm64/kvm/Kconfig > @@ -25,7 +25,7 @@ menuconfig KVM > select HAVE_KVM_CPU_RELAX_INTERCEPT > select KVM_MMIO > select KVM_GENERIC_DIRTYLOG_READ_PROTECT > - select KVM_XFER_TO_GUEST_WORK > + select VIRT_XFER_TO_GUEST_WORK > select KVM_VFIO > select HAVE_KVM_DIRTY_RING_ACQ_REL > select NEED_KVM_DIRTY_RING_WITH_BITMAP > diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig > index 40eea6da7c25..ae64bbdf83a7 100644 > --- a/arch/loongarch/kvm/Kconfig > +++ b/arch/loongarch/kvm/Kconfig > @@ -31,7 +31,7 @@ config KVM > select KVM_GENERIC_HARDWARE_ENABLING > select KVM_GENERIC_MMU_NOTIFIER > select KVM_MMIO > - select KVM_XFER_TO_GUEST_WORK > + select VIRT_XFER_TO_GUEST_WORK > select SCHED_INFO > select GUEST_PERF_EVENTS if 
PERF_EVENTS > help > diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig > index 5a62091b0809..c50328212917 100644 > --- a/arch/riscv/kvm/Kconfig > +++ b/arch/riscv/kvm/Kconfig > @@ -30,7 +30,7 @@ config KVM > select KVM_GENERIC_DIRTYLOG_READ_PROTECT > select KVM_GENERIC_HARDWARE_ENABLING > select KVM_MMIO > - select KVM_XFER_TO_GUEST_WORK > + select VIRT_XFER_TO_GUEST_WORK > select KVM_GENERIC_MMU_NOTIFIER > select SCHED_INFO > select GUEST_PERF_EVENTS if PERF_EVENTS > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig > index 2c86673155c9..f81074b0c0a8 100644 > --- a/arch/x86/kvm/Kconfig > +++ b/arch/x86/kvm/Kconfig > @@ -40,7 +40,7 @@ config KVM_X86 > select HAVE_KVM_MSI > select HAVE_KVM_CPU_RELAX_INTERCEPT > select HAVE_KVM_NO_POLL > - select KVM_XFER_TO_GUEST_WORK > + select VIRT_XFER_TO_GUEST_WORK > select KVM_GENERIC_DIRTYLOG_READ_PROTECT > select KVM_VFIO > select HAVE_KVM_PM_NOTIFIER if PM > diff --git a/include/linux/entry-kvm.h b/include/linux/entry-virt.h > similarity index 94% > rename from include/linux/entry-kvm.h > rename to include/linux/entry-virt.h > index 3644de7e6019..42c89e3e5ca7 100644 > --- a/include/linux/entry-kvm.h > +++ b/include/linux/entry-virt.h > @@ -1,6 +1,6 @@ > /* SPDX-License-Identifier: GPL-2.0 */ > -#ifndef __LINUX_ENTRYKVM_H > -#define __LINUX_ENTRYKVM_H > +#ifndef __LINUX_ENTRYVIRT_H > +#define __LINUX_ENTRYVIRT_H > > #include > #include > @@ -10,7 +10,7 @@ > #include > > /* Transfer to guest mode work */ > -#ifdef CONFIG_KVM_XFER_TO_GUEST_WORK > +#ifdef CONFIG_VIRT_XFER_TO_GUEST_WORK > > #ifndef ARCH_XFER_TO_GUEST_MODE_WORK > # define ARCH_XFER_TO_GUEST_MODE_WORK (0) > @@ -90,6 +90,6 @@ static inline bool xfer_to_guest_mode_work_pending(void) > lockdep_assert_irqs_disabled(); > return __xfer_to_guest_mode_work_pending(); > } > -#endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ > +#endif /* CONFIG_VIRT_XFER_TO_GUEST_WORK */ > > #endif > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 
598b9473e46d..70ac2267d5d0 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -2,7 +2,7 @@ > #ifndef __KVM_HOST_H > #define __KVM_HOST_H > > -#include > +#include > #include > #include > #include > @@ -2444,7 +2444,7 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) > } > #endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */ > > -#ifdef CONFIG_KVM_XFER_TO_GUEST_WORK > +#ifdef CONFIG_VIRT_XFER_TO_GUEST_WORK > static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) > { > vcpu->run->exit_reason = KVM_EXIT_INTR; > @@ -2461,7 +2461,7 @@ static inline int kvm_xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu) > } > return r; > } > -#endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ > +#endif /* CONFIG_VIRT_XFER_TO_GUEST_WORK */ > > /* > * If more than one page is being (un)accounted, @virt must be the address of > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index 120536f4c6eb..1e1f3aa375d9 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -129,7 +129,7 @@ static inline void rcu_sysrq_start(void) { } > static inline void rcu_sysrq_end(void) { } > #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */ > > -#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)) > +#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_VIRT_XFER_TO_GUEST_WORK)) > void rcu_irq_work_resched(void); > #else > static __always_inline void rcu_irq_work_resched(void) { } > diff --git a/kernel/entry/Makefile b/kernel/entry/Makefile > index 77fcd83dd663..2333d70802e4 100644 > --- a/kernel/entry/Makefile > +++ b/kernel/entry/Makefile > @@ -14,4 +14,4 @@ CFLAGS_common.o += -fno-stack-protector > > obj-$(CONFIG_GENERIC_IRQ_ENTRY) += common.o > obj-$(CONFIG_GENERIC_SYSCALL) += syscall-common.o syscall_user_dispatch.o > -obj-$(CONFIG_KVM_XFER_TO_GUEST_WORK) += kvm.o > +obj-$(CONFIG_VIRT_XFER_TO_GUEST_WORK) += virt.o > diff --git 
a/kernel/entry/kvm.c b/kernel/entry/virt.c > similarity index 97% > rename from kernel/entry/kvm.c > rename to kernel/entry/virt.c > index 6fc762eaacca..c52f99249763 100644 > --- a/kernel/entry/kvm.c > +++ b/kernel/entry/virt.c > @@ -1,6 +1,6 @@ > // SPDX-License-Identifier: GPL-2.0 > > -#include <linux/entry-kvm.h> > +#include <linux/entry-virt.h> > > static int xfer_to_guest_mode_work(unsigned long ti_work) > { > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index 174ee243b349..995489b72535 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -573,7 +573,7 @@ void rcutorture_format_gp_seqs(unsigned long long seqs, char *cp, size_t len) > } > EXPORT_SYMBOL_GPL(rcutorture_format_gp_seqs); > > -#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)) > +#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_VIRT_XFER_TO_GUEST_WORK)) > /* > * An empty function that will trigger a reschedule on > * IRQ tail once IRQs get re-enabled on userspace/guest resume. 
> @@ -602,7 +602,7 @@ noinstr void rcu_irq_work_resched(void) > if (IS_ENABLED(CONFIG_GENERIC_ENTRY) && !(current->flags & PF_VCPU)) > return; > > - if (IS_ENABLED(CONFIG_KVM_XFER_TO_GUEST_WORK) && (current->flags & PF_VCPU)) > + if (IS_ENABLED(CONFIG_VIRT_XFER_TO_GUEST_WORK) && (current->flags & PF_VCPU)) > return; > > instrumentation_begin(); > @@ -611,7 +611,7 @@ noinstr void rcu_irq_work_resched(void) > } > instrumentation_end(); > } > -#endif /* #if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)) */ > +#endif /* #if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_VIRT_XFER_TO_GUEST_WORK)) */ > > #ifdef CONFIG_PROVE_RCU > /** > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig > index 727b542074e7..ce843db53831 100644 > --- a/virt/kvm/Kconfig > +++ b/virt/kvm/Kconfig > @@ -87,7 +87,7 @@ config HAVE_KVM_VCPU_RUN_PID_CHANGE > config HAVE_KVM_NO_POLL > bool > > -config KVM_XFER_TO_GUEST_WORK > +config VIRT_XFER_TO_GUEST_WORK > bool > > config HAVE_KVM_PM_NOTIFIER > -- > 2.51.0.268.g9569e192d0-goog > From fvogt at suse.de Wed Sep 10 08:25:13 2025 From: fvogt at suse.de (Fabian Vogt) Date: Wed, 10 Sep 2025 17:25:13 +0200 Subject: [PATCH] riscv: kprobes: Fix probe address validation Message-ID: <6191817.lOV4Wx5bFT@fvogt-thinkpad> When adding a kprobe such as "p:probe/tcp_sendmsg _text+15392192", arch_check_kprobe would start iterating all instructions starting from _text until the probed address. Not only is this very inefficient, but literal values in there (e.g. left by function patching) are misinterpreted in a way that causes a desync. Fix this by doing it like x86: start the iteration at the closest preceding symbol instead of the given starting point. 
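As a rough userspace sketch of the idea (not the kernel code itself): decode forward from the start of the enclosing symbol and accept the probe address only if it lands exactly on an instruction boundary. The names insn_length() and offset_is_insn_boundary() are hypothetical. The only ISA fact assumed is that a 16-bit parcel whose low two bits are 0b11 starts a 32-bit instruction, and any other value starts a 16-bit compressed one. In the kernel, the symbol start comes from the kallsyms lookup shown in the diff; here it is simply the start of a byte buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of the instruction that starts with this 16-bit parcel. */
static size_t insn_length(uint16_t parcel)
{
	return (parcel & 0x3) == 0x3 ? 4 : 2;
}

/*
 * Hypothetical helper: 'code' holds the bytes of one symbol, starting at
 * the symbol's first instruction. Walk instruction by instruction and
 * report whether 'offset' is the start of an instruction.
 */
static bool offset_is_insn_boundary(const uint8_t *code, size_t size,
				    size_t offset)
{
	size_t pos = 0;

	while (pos <= offset && pos + 2 <= size) {
		uint16_t parcel;

		if (pos == offset)
			return true;

		/* RISC-V instruction parcels are little-endian. */
		parcel = (uint16_t)(code[pos] | (code[pos + 1] << 8));
		pos += insn_length(parcel);
	}

	return false;
}
```

For a buffer holding c.nop, nop, c.nop (bytes 01 00, 13 00 00 00, 01 00), offsets 0, 2 and 6 are accepted and offset 4, which falls in the middle of the 32-bit nop, is rejected. Starting the walk at the symbol rather than at _text keeps it short and avoids decoding unrelated data that may sit between functions.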
Fixes: 87f48c7ccc73 ("riscv: kprobe: Fixup kernel panic when probing an illegal position") Signed-off-by: Fabian Vogt Signed-off-by: Marvin Friedrich --- arch/riscv/kernel/probes/kprobes.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index c0738d6c6498..8723390c7cad 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -49,10 +49,15 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs) post_kprobe_handler(p, kcb, regs); } -static bool __kprobes arch_check_kprobe(struct kprobe *p) +static bool __kprobes arch_check_kprobe(unsigned long addr) { - unsigned long tmp = (unsigned long)p->addr - p->offset; - unsigned long addr = (unsigned long)p->addr; + unsigned long tmp, offset; + + /* start iterating at the closest preceding symbol */ + if (!kallsyms_lookup_size_offset(addr, NULL, &offset)) + return false; + + tmp = addr - offset; while (tmp <= addr) { if (tmp == addr) @@ -71,7 +76,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) if ((unsigned long)insn & 0x1) return -EILSEQ; - if (!arch_check_kprobe(p)) + if (!arch_check_kprobe((unsigned long)p->addr)) return -EILSEQ; /* copy instruction */ -- 2.51.0 From vkoul at kernel.org Wed Sep 10 09:47:20 2025 From: vkoul at kernel.org (Vinod Koul) Date: Wed, 10 Sep 2025 22:17:20 +0530 Subject: [PATCH v5 0/2] riscv: sophgo: add USB phy support for CV18XX series In-Reply-To: <20250708063038.497473-1-inochiama@gmail.com> References: <20250708063038.497473-1-inochiama@gmail.com> Message-ID: <175752284008.484319.10400801407053619384.b4-ty@kernel.org> On Tue, 08 Jul 2025 14:30:35 +0800, Inochi Amaoto wrote: > Add USB PHY support for CV18XX/SG200X series > > Changed from v4: > - https://lore.kernel.org/all/20250611081804.1196397-1-inochiama at gmail.com > 1. patch 1: apply Conor's tag > 2. patch 2: remove dr_mode debugfs entry. > 3. 
patch 2: simplify the cv1800_usb_phy_set_clock function > > [...] Applied, thanks! [1/2] dt-bindings: phy: Add Sophgo CV1800 USB phy commit: cdb2511bf3925ce095c31e1647c12086d34f9cc2 [2/2] phy: sophgo: Add USB 2.0 PHY driver for Sophgo CV18XX/SG200X commit: f0c6d776f74d1d8bda94f6f042b2919bcd615280 Best regards, -- ~Vinod From apatel at ventanamicro.com Wed Sep 10 10:20:47 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 10 Sep 2025 22:50:47 +0530 Subject: [RFC PATCH 0/5] riscv: Handle synchronous hardware error exception In-Reply-To: <20250910093347.75822-1-tianruidong@linux.alibaba.com> References: <20250910093347.75822-1-tianruidong@linux.alibaba.com> Message-ID: On Wed, Sep 10, 2025 at 3:04 PM Ruidong Tian wrote: > > Hi all, > This patch series introduces support for handling synchronous hardware errors > on RISC-V, laying the groundwork for more robust kernel-mode error recovery. > > 1. Background > Hardware error reporting mechanisms typically fall into two categories: > asynchronous and synchronous. > > - Asynchronous errors (e.g., memory scrubbing errors), reported by an asynchronous > exception or an interrupt, are usually handled by GHES subsystems. For instance, > ARM uses SDEI, and a similar SSE specification is being proposed for RISC-V. > - Synchronous errors (e.g., reading poisoned data) cause the processor core to > take a precise exception. This is known as a Synchronous External Abort (SEA) > on ARM, a Machine Check Exception (MCE) on x86, and is designated as a trap with > mcause 19 on RISC-V. > > Discussions within the RVI PRS TG have already led to proposals[0] to UEFI for > standardizing two notification methods, SSE and Hardware Error Exception, > on RISC-V. > This series focuses on implementing Hardware Error Exception notification to > handle synchronous errors. Himanshu Chauhan has already started working on SSE[1]. > > 2.
Motivation > When a synchronous hardware error occurs in kernel context (e.g., during > get_user, put_user, CoW, etc.), the kernel requires a fixup mechanism (via > extable) to recover from such errors and prevent a system panic. However, the > APEI/GHES subsystem, being asynchronous, cannot directly leverage the synchronous > extable fixup path. > > By handling the synchronous exception directly, we enable the use of this fixup > mechanism, allowing the kernel to gracefully recover from hardware errors > encountered during kernel execution. This brings RISC-V's error handling > capabilities closer to the robustness found on ARM[2] and x86[3]. > > 3. What This Patch Series Does > This initial series lays the foundational infrastructure. It primarily: > - Introduces a new exception handler for synchronous hardware errors (mcause=19). > - Establishes the core exception path, which is a prerequisite for kernel > context error recovery. > > Please note that this version does not yet implement the full kernel fixup logic > for recovery. That functionality is planned for the next formal version. > > Some adaptations for GHES are included, based on the work from Himanshu Chauhan[1]. > > 4. Future Plans > - Implement full kernel fixup support to handle and recover from errors in > some kernel contexts[2]. > - Add support for handling "double trap" scenarios. > > 5.
Testing Methodology > > test program: ras-tools: https://kernel.googlesource.com/pub/scm/linux/kernel/git/aegl/ras-tools/ > qemu: https://github.com/winterddd/qemu > official opensbi and edk2: > > - Run qemu: > qemu-system-riscv64 -M virt,pflash0=pflash0,pflash1=pflash1,acpi=on,aia=aplic-imsic > -cpu max -m 64G -smp 64 -device virtio-gpu-pci -full-screen -device qemu-xhci > -device usb-kbd -device virtio-rng-pci > -blockdev node-name=pflash0,driver=file,read-only=on,filename=RISCV_VIRT_CODE.fd > -blockdev node-name=pflash1,driver=file,filename=RISCV_VIRT_VARS.fd > -bios fw_dynamic.bin -device virtio-net-device,netdev=net0 > -netdev user,id=net0,hostfwd=tcp::2223-:22 > -kernel Image -initrd rootfs > -append "rdinit=/sbin/init earlycon verbose debug strict_devmem=0 nokaslr" > -monitor telnet:127.0.0.1:5557,server,nowait -nographic > > - Run ras-tools: > ./einj_mem_uc -j -k single & > $ 0: single vaddr = 0x7fff86ff4400 paddr = 107d11b400 > > - Inject poison > telnet localhost 5557 > poison_enable on > poison_add 0x107d11b400 > > - Read poison > echo trigger > ./trigger_start > $ triggering ... > $ signal 7 code 3 addr 0x7fff86ff4400 > > [0]: https://lists.riscv.org/g/tech-prs/topic/risc_v_ras_related_ecrs/113685653 > [1]: https://patchew.org/linux/20250227123628.2931490-1-hchauhan@ventanamicro.com/ > [2]: https://lore.kernel.org/lkml/20241209024257.3618492-1-tongtiangen@huawei.com/ > [3]: https://github.com/torvalds/linux/blob/9dd1835ecda5b96ac88c166f4a87386f3e727bd9/arch/x86/kernel/cpu/mce/core.c#L1514 > > Himanshu Chauhan (2): > riscv: Define ioremap_cache for RISC-V > riscv: Define arch_apei_get_mem_attribute for RISC-V > > Ruidong Tian (3): > acpi: Introduce SSE and HEE in HEST notification types > riscv: Introduce HEST HEE notification handlers for APEI > riscv: Add Hardware Error Exception trap handler > Himanshu had already sent out RFC v1 way back in Feb 2025 [1] which did not receive any comments or feedback.
Instead of sending out a half-baked series, it will be helpful if you can review Himanshu's series. Regards, Anup [1] https://patchew.org/linux/20250227123628.2931490-1-hchauhan@ventanamicro.com/ From conor at kernel.org Wed Sep 10 10:55:51 2025 From: conor at kernel.org (Conor Dooley) Date: Wed, 10 Sep 2025 18:55:51 +0100 Subject: [PATCH v3 0/6] Icicle Kit with prod device and Discovery Kit support In-Reply-To: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> References: <20250908115732.31092-1-valentina.fernandezalanis@microchip.com> Message-ID: <20250910-visa-fanfare-ec0bfa5c588c@spud> On Mon, Sep 08, 2025 at 12:57:26PM +0100, Valentina Fernandez wrote: > Hi all, > > With the introduction of the Icicle Kit with the production device > (MPFS250T) to the market, it's necessary to distinguish it from the > engineering sample (-es) variant. This is because engineering samples > cannot write to flash from the MSS, as noted in the PolarFire SoC > FPGA ES errata. > > This series adds a common board DTSI for the Icicle Kit, containing > hardware shared by both the engineering sample and production > versions, as well as a DTS for each Icicle Kit variant. > > The last two patches add support for the PolarFire SoC Discovery Kit > board. > > Changes since v2: > - rename ccc clock to clock-cccref to match fixed clock binding > > Changes since v1: > - fix order of properties in mailbox nodes > - drop redundant status property from ddrc_cache nodes > - fix lowercase hex in reserved memory regions I've replaced v1 with this version in my tree.
From linus.walleij at linaro.org Wed Sep 10 14:32:26 2025 From: linus.walleij at linaro.org (Linus Walleij) Date: Wed, 10 Sep 2025 23:32:26 +0200 Subject: [PATCH v2 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4 In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: On Wed, Sep 10, 2025 at 9:12 AM Bartosz Golaszewski wrote: > Here's the final part of the generic GPIO chip conversions. Once all the > existing users are switched to the new API, the final patch in the > series removes bgpio_init(), moves the gpio-mmio fields out of struct > gpio_chip and into struct gpio_generic_chip and adjusts gpio-mmio.c to > the new situation. > > Down the line we could probably improve gpio-mmio.c by using lock guards > and replacing the - now obsolete - "bgpio" prefix with "gpio_generic" or > something similar but this series is already big as is so I'm leaving > that for the future. > > Tested in qemu on vexpress-a9. > > Signed-off-by: Bartosz Golaszewski The patch set is a beauty, hands down. Reviewed-by: Linus Walleij I especially like where you caught local spinlocks being (ab)used instead of the generic irqchip ones. I don't know about merging patch 15/15 into just the GPIO tree, that can make things fail in other subsystems depending on merge order into Torvalds tree or linux-next if your tree is merged first. I would merge the first 14 and keep the last for the later part of the merge window when all other trees with conversions are merged. (You probably already thought of this.)
Yours, Linus Walleij From florian.fainelli at broadcom.com Wed Sep 10 15:05:11 2025 From: florian.fainelli at broadcom.com (Florian Fainelli) Date: Wed, 10 Sep 2025 15:05:11 -0700 Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> Message-ID: On 9/10/25 00:12, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. > > Signed-off-by: Bartosz Golaszewski Reviewed-by: Florian Fainelli Tested-by: Florian Fainelli -- Florian From Frank.li at nxp.com Wed Sep 10 15:18:39 2025 From: Frank.li at nxp.com (Frank Li) Date: Wed, 10 Sep 2025 18:18:39 -0400 Subject: [PATCH v7 1/2] dt-bindings: usb: dwc3: add support for SpacemiT K1 In-Reply-To: <20250729-dwc3_generic-v7-1-5c791bba826f@linux.dev> References: <20250729-dwc3_generic-v7-0-5c791bba826f@linux.dev> <20250729-dwc3_generic-v7-1-5c791bba826f@linux.dev> Message-ID: On Tue, Jul 29, 2025 at 12:33:55AM +0800, Ze Huang wrote: > Add support for the USB 3.0 Dual-Role Device (DRD) controller embedded > in the SpacemiT K1 SoC. The controller is based on the Synopsys > DesignWare Core USB 3 (DWC3) IP, supporting USB3.0 host mode and USB 2.0 > DRD mode. > > Reviewed-by: Krzysztof Kozlowski > Signed-off-by: Ze Huang > --- Ze Huang: I see Krzysztof and Thinh Nguyen have already acked these patches. Are you waiting for Greg to pick them up, or do they need a respin? One of my Layerscape USB patches depends on this one!
Frank > .../devicetree/bindings/usb/spacemit,k1-dwc3.yaml | 124 +++++++++++++++++++++ > 1 file changed, 124 insertions(+) > > diff --git a/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml b/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml > new file mode 100644 > index 0000000000000000000000000000000000000000..7007e2bd42016ae0e50c4007e75d26bada34d983 > --- /dev/null > +++ b/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml > @@ -0,0 +1,124 @@ > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/usb/spacemit,k1-dwc3.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: SpacemiT K1 SuperSpeed DWC3 USB SoC Controller > + > +maintainers: > + - Ze Huang > + > +description: | > + The SpacemiT K1 embeds a DWC3 USB IP Core which supports Host functions > + for USB 3.0 and DRD for USB 2.0. > + > + Key features: > + - USB3.0 SuperSpeed and USB2.0 High/Full/Low-Speed support > + - Supports low-power modes (USB2.0 suspend, USB3.0 U1/U2/U3) > + - Internal DMA controller and flexible endpoint FIFO sizing > + > + Communication Interface: > + - Use of PIPE3 (125MHz) interface for USB3.0 PHY > + - Use of UTMI+ (30/60MHz) interface for USB2.0 PHY > + > +allOf: > + - $ref: snps,dwc3-common.yaml# > + > +properties: > + compatible: > + const: spacemit,k1-dwc3 > + > + reg: > + maxItems: 1 > + > + clocks: > + maxItems: 1 > + > + clock-names: > + const: usbdrd30 > + > + interrupts: > + maxItems: 1 > + > + phys: > + items: > + - description: phandle to USB2/HS PHY > + - description: phandle to USB3/SS PHY > + > + phy-names: > + items: > + - const: usb2-phy > + - const: usb3-phy > + > + resets: > + items: > + - description: USB3.0 AHB reset > + - description: USB3.0 VCC reset > + - description: USB3.0 PHY reset > + - description: PCIE0 global reset (for combo phy) > + > + reset-names: > + items: > + - const: ahb > + - const: vcc > + - const: phy > + - const: pcie0 > + > + 
reset-delay: > + $ref: /schemas/types.yaml#/definitions/uint32 > + default: 2 > + description: delay after reset sequence [us] > + > + vbus-supply: > + description: A phandle to the regulator supplying the VBUS voltage. > + > +required: > + - compatible > + - reg > + - clocks > + - clock-names > + - interrupts > + - phys > + - phy-names > + - resets > + - reset-names > + > +unevaluatedProperties: false > + > +examples: > + - | > + usb@c0a00000 { > + compatible = "spacemit,k1-dwc3"; > + reg = <0xc0a00000 0x10000>; > + clocks = <&syscon_apmu 16>; > + clock-names = "usbdrd30"; > + interrupts = <125>; > + phys = <&usb2phy>, <&usb3phy>; > + phy-names = "usb2-phy", "usb3-phy"; > + resets = <&syscon_apmu 8>, > + <&syscon_apmu 9>, > + <&syscon_apmu 10>, > + <&syscon_apmu 26>; > + reset-names = "ahb", "vcc", "phy", "pcie0"; > + reset-delay = <2>; > + vbus-supply = <&usb3_vbus>; > + #address-cells = <1>; > + #size-cells = <0>; > + > + hub_2_0: hub@1 { > + compatible = "usb2109,2817"; > + reg = <1>; > + vdd-supply = <&usb3_vhub>; > + peer-hub = <&hub_3_0>; > + reset-gpios = <&gpio 3 28 1>; > + }; > + > + hub_3_0: hub@2 { > + compatible = "usb2109,817"; > + reg = <2>; > + vdd-supply = <&usb3_vhub>; > + peer-hub = <&hub_2_0>; > + reset-gpios = <&gpio 3 28 1>; > + }; > + }; > > -- > 2.50.1 > From unicorn_wang at outlook.com Wed Sep 10 17:09:24 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Thu, 11 Sep 2025 08:09:24 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: <20250910143453.GA1533730@bhelgaas> References: <20250910143453.GA1533730@bhelgaas> Message-ID: On 9/10/2025 10:34 PM, Bjorn Helgaas wrote: > On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: >> From: Chen Wang >> >> Add support for PCIe controller in SG2042 SoC. The controller >> uses the Cadence PCIe core programmed by pcie-cadence*.c. The >> PCIe controller will work in host mode only, supporting data >> rate(gen4) and lanes(x16 or x8).
> Strictly speaking, "gen4" is a spec revision, not a data rate. > Include the GT/s rate instead or in addition. We can fix this when > merging if there's no other reason to repost (I assume you mean 16 > GT/s). Will also add spaces before the open "(". Yes, I meant 16 GT/s. Please help fix this when merging together with dropping period at end of subject for the [2/7], if no repost. Thanks, Chen From opendmb at gmail.com Wed Sep 10 17:11:26 2025 From: opendmb at gmail.com (Doug Berger) Date: Wed, 10 Sep 2025 17:11:26 -0700 Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> Message-ID: On 9/10/2025 12:12 AM, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. 
> > Signed-off-by: Bartosz Golaszewski > --- > drivers/gpio/gpio-brcmstb.c | 112 ++++++++++++++++++++++++-------------------- > 1 file changed, 60 insertions(+), 52 deletions(-) > > diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c > index e29a9589b3ccbd17d10f6671088dca3e76537927..be3ff916e134a674d3e1d334a7d431b7ad767a33 100644 > --- a/drivers/gpio/gpio-brcmstb.c > +++ b/drivers/gpio/gpio-brcmstb.c > @@ -3,6 +3,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -37,7 +38,7 @@ enum gio_reg_index { > struct brcmstb_gpio_bank { > struct list_head node; > int id; > - struct gpio_chip gc; > + struct gpio_generic_chip chip; > struct brcmstb_gpio_priv *parent_priv; > u32 width; > u32 wake_active; > @@ -72,19 +73,18 @@ __brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) > { > void __iomem *reg_base = bank->parent_priv->reg_base; > > - return bank->gc.read_reg(reg_base + GIO_STAT(bank->id)) & > - bank->gc.read_reg(reg_base + GIO_MASK(bank->id)); > + return gpio_generic_read_reg(&bank->chip, reg_base + GIO_STAT(bank->id)) & > + gpio_generic_read_reg(&bank->chip, reg_base + GIO_MASK(bank->id)); > } > > static unsigned long > brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) > { > unsigned long status; > - unsigned long flags; > > - raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags); > + guard(gpio_generic_lock_irqsave)(&bank->chip); > + > status = __brcmstb_gpio_get_active_irqs(bank); > - raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags); > > return status; > } > @@ -92,26 +92,26 @@ brcmstb_gpio_get_active_irqs(struct brcmstb_gpio_bank *bank) > static int brcmstb_gpio_hwirq_to_offset(irq_hw_number_t hwirq, > struct brcmstb_gpio_bank *bank) > { > - return hwirq - bank->gc.offset; > + return hwirq - bank->chip.gc.offset; > } > > static void brcmstb_gpio_set_imask(struct brcmstb_gpio_bank *bank, > unsigned int hwirq, bool enable) > { > - struct gpio_chip *gc = &bank->gc; > struct brcmstb_gpio_priv *priv 
= bank->parent_priv; > u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(hwirq, bank)); > u32 imask; > - unsigned long flags; > > - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); > - imask = gc->read_reg(priv->reg_base + GIO_MASK(bank->id)); > + guard(gpio_generic_lock_irqsave)(&bank->chip); > + > + imask = gpio_generic_read_reg(&bank->chip, > + priv->reg_base + GIO_MASK(bank->id)); > if (enable) > imask |= mask; > else > imask &= ~mask; > - gc->write_reg(priv->reg_base + GIO_MASK(bank->id), imask); > - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_MASK(bank->id), imask); > } > > static int brcmstb_gpio_to_irq(struct gpio_chip *gc, unsigned offset) > @@ -150,7 +150,8 @@ static void brcmstb_gpio_irq_ack(struct irq_data *d) > struct brcmstb_gpio_priv *priv = bank->parent_priv; > u32 mask = BIT(brcmstb_gpio_hwirq_to_offset(d->hwirq, bank)); > > - gc->write_reg(priv->reg_base + GIO_STAT(bank->id), mask); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_STAT(bank->id), mask); > } > > static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) > @@ -162,7 +163,6 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) > u32 edge_insensitive, iedge_insensitive; > u32 edge_config, iedge_config; > u32 level, ilevel; > - unsigned long flags; > > switch (type) { > case IRQ_TYPE_LEVEL_LOW: > @@ -194,23 +194,25 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type) > return -EINVAL; > } > > - raw_spin_lock_irqsave(&bank->gc.bgpio_lock, flags); > + guard(gpio_generic_lock_irqsave)(&bank->chip); > > - iedge_config = bank->gc.read_reg(priv->reg_base + > - GIO_EC(bank->id)) & ~mask; > - iedge_insensitive = bank->gc.read_reg(priv->reg_base + > - GIO_EI(bank->id)) & ~mask; > - ilevel = bank->gc.read_reg(priv->reg_base + > - GIO_LEVEL(bank->id)) & ~mask; > + iedge_config = gpio_generic_read_reg(&bank->chip, > + priv->reg_base + 
GIO_EC(bank->id)) & ~mask; > + iedge_insensitive = gpio_generic_read_reg(&bank->chip, > + priv->reg_base + GIO_EI(bank->id)) & ~mask; > + ilevel = gpio_generic_read_reg(&bank->chip, > + priv->reg_base + GIO_LEVEL(bank->id)) & ~mask; > > - bank->gc.write_reg(priv->reg_base + GIO_EC(bank->id), > - iedge_config | edge_config); > - bank->gc.write_reg(priv->reg_base + GIO_EI(bank->id), > - iedge_insensitive | edge_insensitive); > - bank->gc.write_reg(priv->reg_base + GIO_LEVEL(bank->id), > - ilevel | level); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_EC(bank->id), > + iedge_config | edge_config); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_EI(bank->id), > + iedge_insensitive | edge_insensitive); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_LEVEL(bank->id), > + ilevel | level); > > - raw_spin_unlock_irqrestore(&bank->gc.bgpio_lock, flags); > return 0; > } > > @@ -263,7 +265,7 @@ static void brcmstb_gpio_irq_bank_handler(struct brcmstb_gpio_bank *bank) > { > struct brcmstb_gpio_priv *priv = bank->parent_priv; > struct irq_domain *domain = priv->irq_domain; > - int hwbase = bank->gc.offset; > + int hwbase = bank->chip.gc.offset; > unsigned long status; > > while ((status = brcmstb_gpio_get_active_irqs(bank))) { > @@ -303,7 +305,7 @@ static struct brcmstb_gpio_bank *brcmstb_gpio_hwirq_to_bank( > > /* banks are in descending order */ > list_for_each_entry_reverse(bank, &priv->bank_list, node) { > - i += bank->gc.ngpio; > + i += bank->chip.gc.ngpio; > if (hwirq < i) > return bank; > } > @@ -332,7 +334,7 @@ static int brcmstb_gpio_irq_map(struct irq_domain *d, unsigned int irq, > > dev_dbg(&pdev->dev, "Mapping irq %d for gpio line %d (bank %d)\n", > irq, (int)hwirq, bank->id); > - ret = irq_set_chip_data(irq, &bank->gc); > + ret = irq_set_chip_data(irq, &bank->chip.gc); > if (ret < 0) > return ret; > irq_set_lockdep_class(irq, &brcmstb_gpio_irq_lock_class, > @@ -394,7 +396,7 @@ static void 
brcmstb_gpio_remove(struct platform_device *pdev) > * more important to actually perform all of the steps. > */ > list_for_each_entry(bank, &priv->bank_list, node) > - gpiochip_remove(&bank->gc); > + gpiochip_remove(&bank->chip.gc); > } > > static int brcmstb_gpio_of_xlate(struct gpio_chip *gc, > @@ -412,7 +414,7 @@ static int brcmstb_gpio_of_xlate(struct gpio_chip *gc, > if (WARN_ON(gpiospec->args_count < gc->of_gpio_n_cells)) > return -EINVAL; > > - offset = gpiospec->args[0] - bank->gc.offset; > + offset = gpiospec->args[0] - bank->chip.gc.offset; > if (offset >= gc->ngpio || offset < 0) > return -EINVAL; > > @@ -493,19 +495,17 @@ static int brcmstb_gpio_irq_setup(struct platform_device *pdev, > static void brcmstb_gpio_bank_save(struct brcmstb_gpio_priv *priv, > struct brcmstb_gpio_bank *bank) > { > - struct gpio_chip *gc = &bank->gc; > unsigned int i; > > for (i = 0; i < GIO_REG_STAT; i++) > - bank->saved_regs[i] = gc->read_reg(priv->reg_base + > - GIO_BANK_OFF(bank->id, i)); > + bank->saved_regs[i] = gpio_generic_read_reg(&bank->chip, > + priv->reg_base + GIO_BANK_OFF(bank->id, i)); > } > > static void brcmstb_gpio_quiesce(struct device *dev, bool save) > { > struct brcmstb_gpio_priv *priv = dev_get_drvdata(dev); > struct brcmstb_gpio_bank *bank; > - struct gpio_chip *gc; > u32 imask; > > /* disable non-wake interrupt */ > @@ -513,8 +513,6 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save) > disable_irq(priv->parent_irq); > > list_for_each_entry(bank, &priv->bank_list, node) { > - gc = &bank->gc; > - > if (save) > brcmstb_gpio_bank_save(priv, bank); > > @@ -523,8 +521,9 @@ static void brcmstb_gpio_quiesce(struct device *dev, bool save) > imask = bank->wake_active; > else > imask = 0; > - gc->write_reg(priv->reg_base + GIO_MASK(bank->id), > - imask); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_MASK(bank->id), > + imask); > } > } > > @@ -538,12 +537,12 @@ static void brcmstb_gpio_shutdown(struct platform_device *pdev) > 
static void brcmstb_gpio_bank_restore(struct brcmstb_gpio_priv *priv, > struct brcmstb_gpio_bank *bank) > { > - struct gpio_chip *gc = &bank->gc; > unsigned int i; > > for (i = 0; i < GIO_REG_STAT; i++) > - gc->write_reg(priv->reg_base + GIO_BANK_OFF(bank->id, i), > - bank->saved_regs[i]); > + gpio_generic_write_reg(&bank->chip, > + priv->reg_base + GIO_BANK_OFF(bank->id, i), > + bank->saved_regs[i]); > } > > static int brcmstb_gpio_suspend(struct device *dev) > @@ -585,6 +584,7 @@ static const struct dev_pm_ops brcmstb_gpio_pm_ops = { > > static int brcmstb_gpio_probe(struct platform_device *pdev) > { > + struct gpio_generic_chip_config config; > struct device *dev = &pdev->dev; > struct device_node *np = dev->of_node; > void __iomem *reg_base; > @@ -665,17 +665,24 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) > bank->width = bank_width; > } > > + gc = &bank->chip.gc; > + > /* > * Regs are 4 bytes wide, have data reg, no set/clear regs, > * and direction bits have 0 = output and 1 = input > */ > - gc = &bank->gc; > - err = bgpio_init(gc, dev, 4, > - reg_base + GIO_DATA(bank->id), > - NULL, NULL, NULL, > - reg_base + GIO_IODIR(bank->id), flags); > + > + config = (struct gpio_generic_chip_config) { > + .dev = dev, > + .sz = 4, > + .dat = reg_base + GIO_DATA(bank->id), > + .dirin = reg_base + GIO_IODIR(bank->id), > + .flags = flags, > + }; > + > + err = gpio_generic_chip_init(&bank->chip, &config); > if (err) { > - dev_err(dev, "bgpio_init() failed\n"); > + dev_err(dev, "failed to initialize generic GPIO chip\n"); > goto fail; > } > > @@ -700,7 +707,8 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) > * be retained from S5 cold boot > */ > need_wakeup_event |= !!__brcmstb_gpio_get_active_irqs(bank); > - gc->write_reg(reg_base + GIO_MASK(bank->id), 0); > + gpio_generic_write_reg(&bank->chip, > + reg_base + GIO_MASK(bank->id), 0); > > err = gpiochip_add_data(gc, bank); > if (err) { > I suppose I'm OK with all of this, but I'm just curious 
about the longer term plans for the member accesses. Is there an intent to have helpers for things like?: chip.gc.offset chip.gc.ngpio Thanks, Doug From samuel.holland at sifive.com Wed Sep 10 17:37:46 2025 From: samuel.holland at sifive.com (Samuel Holland) Date: Wed, 10 Sep 2025 19:37:46 -0500 Subject: [PATCH v2 11/15] gpio: sifive: use new generic GPIO chip API In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-11-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-11-f3d1a4c57124@linaro.org> Message-ID: <01a7cc78-fdae-4a1e-bf78-961e7ec214b2@sifive.com> Hi Bartosz, On 2025-09-10 2:12 AM, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. > > Signed-off-by: Bartosz Golaszewski > --- > drivers/gpio/gpio-sifive.c | 73 ++++++++++++++++++++++++---------------------- > 1 file changed, 38 insertions(+), 35 deletions(-) > > diff --git a/drivers/gpio/gpio-sifive.c b/drivers/gpio/gpio-sifive.c > index 98ef975c44d9a6c9238605cfd1d5820fd70a66ca..2ced87ffd3bbf219c11857391eb4ea808adc0527 100644 > --- a/drivers/gpio/gpio-sifive.c > +++ b/drivers/gpio/gpio-sifive.c > @@ -7,6 +7,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -32,7 +33,7 @@ > > struct sifive_gpio { > void __iomem *base; > - struct gpio_chip gc; > + struct gpio_generic_chip gen_gc; > struct regmap *regs; > unsigned long irq_state; > unsigned int trigger[SIFIVE_GPIO_MAX]; > @@ -41,10 +42,10 @@ struct sifive_gpio { > > static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) > { > - unsigned long flags; > unsigned int trigger; > > - raw_spin_lock_irqsave(&chip->gc.bgpio_lock, flags); > + guard(gpio_generic_lock_irqsave)(&chip->gen_gc); > + > trigger = (chip->irq_state & BIT(offset)) ? 
chip->trigger[offset] : 0; > regmap_update_bits(chip->regs, SIFIVE_GPIO_RISE_IE, BIT(offset), > (trigger & IRQ_TYPE_EDGE_RISING) ? BIT(offset) : 0); > @@ -54,7 +55,6 @@ static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) > (trigger & IRQ_TYPE_LEVEL_HIGH) ? BIT(offset) : 0); > regmap_update_bits(chip->regs, SIFIVE_GPIO_LOW_IE, BIT(offset), > (trigger & IRQ_TYPE_LEVEL_LOW) ? BIT(offset) : 0); > - raw_spin_unlock_irqrestore(&chip->gc.bgpio_lock, flags); > } > > static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) > @@ -72,13 +72,12 @@ static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) > } > > static void sifive_gpio_irq_enable(struct irq_data *d) > -{ > + { This looks like an unintentional whitespace change. > struct gpio_chip *gc = irq_data_get_irq_chip_data(d); > struct sifive_gpio *chip = gpiochip_get_data(gc); > irq_hw_number_t hwirq = irqd_to_hwirq(d); > int offset = hwirq % SIFIVE_GPIO_MAX; > u32 bit = BIT(offset); > - unsigned long flags; > > gpiochip_enable_irq(gc, hwirq); > irq_chip_enable_parent(d); > @@ -86,13 +85,13 @@ static void sifive_gpio_irq_enable(struct irq_data *d) > /* Switch to input */ > gc->direction_input(gc, offset); > > - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); > - /* Clear any sticky pending interrupts */ > - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); > + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { > + /* Clear any sticky pending interrupts */ > + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > + } This block (and the copy below) don't actually 
need any locking, since these are R/W1C bits. From the manual: "Once the interrupt is pending, it will remain set until a 1 is written to the *_ip register at that bit." I can send this as a follow-up improvement if you want to keep this limited to the API conversion. So with the minor whitespace fix: Reviewed-by: Samuel Holland Regards, Samuel > > /* Enable interrupts */ > assign_bit(offset, &chip->irq_state, 1); > @@ -118,15 +117,14 @@ static void sifive_gpio_irq_eoi(struct irq_data *d) > struct sifive_gpio *chip = gpiochip_get_data(gc); > int offset = irqd_to_hwirq(d) % SIFIVE_GPIO_MAX; > u32 bit = BIT(offset); > - unsigned long flags; > > - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); > - /* Clear all pending interrupts */ > - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > - regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); > + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { > + /* Clear all pending interrupts */ > + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > + } > > irq_chip_eoi_parent(d); > } > @@ -179,6 +177,7 @@ static const struct regmap_config sifive_gpio_regmap_config = { > > static int sifive_gpio_probe(struct platform_device *pdev) > { > + struct gpio_generic_chip_config config; > struct device *dev = &pdev->dev; > struct irq_domain *parent; > struct gpio_irq_chip *girq; > @@ -217,13 +216,17 @@ static int sifive_gpio_probe(struct platform_device *pdev) > */ > parent = irq_get_irq_data(chip->irq_number[0])->domain; > > - ret = bgpio_init(&chip->gc, dev, 4, > - chip->base + SIFIVE_GPIO_INPUT_VAL, > - chip->base + SIFIVE_GPIO_OUTPUT_VAL, > - NULL, > - chip->base + SIFIVE_GPIO_OUTPUT_EN, > - 
chip->base + SIFIVE_GPIO_INPUT_EN, > - BGPIOF_READ_OUTPUT_REG_SET); > + config = (struct gpio_generic_chip_config) { > + .dev = dev, > + .sz = 4, > + .dat = chip->base + SIFIVE_GPIO_INPUT_VAL, > + .set = chip->base + SIFIVE_GPIO_OUTPUT_VAL, > + .dirout = chip->base + SIFIVE_GPIO_OUTPUT_EN, > + .dirin = chip->base + SIFIVE_GPIO_INPUT_EN, > + .flags = BGPIOF_READ_OUTPUT_REG_SET, > + }; > + > + ret = gpio_generic_chip_init(&chip->gen_gc, &config); > if (ret) { > dev_err(dev, "unable to init generic GPIO\n"); > return ret; > @@ -236,12 +239,12 @@ static int sifive_gpio_probe(struct platform_device *pdev) > regmap_write(chip->regs, SIFIVE_GPIO_LOW_IE, 0); > chip->irq_state = 0; > > - chip->gc.base = -1; > - chip->gc.ngpio = ngpio; > - chip->gc.label = dev_name(dev); > - chip->gc.parent = dev; > - chip->gc.owner = THIS_MODULE; > - girq = &chip->gc.irq; > + chip->gen_gc.gc.base = -1; > + chip->gen_gc.gc.ngpio = ngpio; > + chip->gen_gc.gc.label = dev_name(dev); > + chip->gen_gc.gc.parent = dev; > + chip->gen_gc.gc.owner = THIS_MODULE; > + girq = &chip->gen_gc.gc.irq; > gpio_irq_chip_set_chip(girq, &sifive_gpio_irqchip); > girq->fwnode = dev_fwnode(dev); > girq->parent_domain = parent; > @@ -249,7 +252,7 @@ static int sifive_gpio_probe(struct platform_device *pdev) > girq->handler = handle_bad_irq; > girq->default_type = IRQ_TYPE_NONE; > > - return gpiochip_add_data(&chip->gc, chip); > + return gpiochip_add_data(&chip->gen_gc.gc, chip); > } > > static const struct of_device_id sifive_gpio_match[] = { > From unicorn_wang at outlook.com Wed Sep 10 18:10:33 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Thu, 11 Sep 2025 09:10:33 +0800 Subject: [PATCH] dts: sophgo: sg2042: added numa id description In-Reply-To: <20250910105531.519897-1-rabenda.cn@gmail.com> References: <20250910105531.519897-1-rabenda.cn@gmail.com> Message-ID: On 9/10/2025 6:55 PM, Han Gao wrote: > According to the description of [1], sg2042 is divided into 4 numa. 
> STREAM test performance will improve.
>
> Before:
> Function    Best Rate MB/s  Avg time  Min time  Max time
> Copy:       10739.7         0.015687  0.014898  0.016385
> Scale:      10865.9         0.015628  0.014725  0.016757
> Add:        10622.3         0.023276  0.022594  0.023899
> Triad:      10583.4         0.023653  0.022677  0.024761
>
> After:
> Function    Best Rate MB/s  Avg time  Min time  Max time
> Copy:       34254.9         0.005142  0.004671  0.005995
> Scale:      37735.5         0.004752  0.004240  0.005407
> Add:        44206.8         0.005983  0.005429  0.006461
> Triad:      43040.6         0.006320  0.005576  0.006996
>
> [1] https://github.com/sophgo/sophgo-doc/blob/main/SG2042/TRM/source/pic/mesh.png
>
> Signed-off-by: Han Gao
The subject of a patch that modifies the device tree is usually prefixed with "riscv: sophgo: dts: xxx". This isn't a big deal; just keep it in mind next time. If you don't mind, we can modify it when merging. Otherwise LGTM. Reviewed-by: Chen Wang Thanks, Chen [......]
From zhang.lyra at gmail.com Wed Sep 10 19:51:54 2025 From: zhang.lyra at gmail.com (Chunyan Zhang) Date: Thu, 11 Sep 2025 10:51:54 +0800 Subject: [PATCH V10 1/5] mm: softdirty: Add pte_soft_dirty_available() In-Reply-To: <8f9a4a13-2881-4baf-ab62-3d0d79e0cd3c@redhat.com> References: <20250909095611.803898-1-zhangchunyan@iscas.ac.cn> <20250909095611.803898-2-zhangchunyan@iscas.ac.cn> <6b2f12aa-8ed9-476d-a69d-f05ea526f16a@redhat.com> <8f9a4a13-2881-4baf-ab62-3d0d79e0cd3c@redhat.com> Message-ID: On Wed, 10 Sept 2025 at 16:51, David Hildenbrand wrote: > > On 10.09.25 10:25, Chunyan Zhang wrote: > > Hi David, > > > > On Tue, 9 Sept 2025 at 19:42, David Hildenbrand wrote: > >> > >> On 09.09.25 11:56, Chunyan Zhang wrote: > >>> Some platforms can customize the PTE soft dirty bit and make it unavailable > >>> even if the architecture allows providing the PTE resource. > >>> > >>> Add an API which architectures can define their specific implementations > >>> to detect if the PTE soft-dirty bit is available, on which the kernel > >>> is running.
> >>> > >>> Signed-off-by: Chunyan Zhang > >>> --- > >>> fs/proc/task_mmu.c | 17 ++++++++++++++++- > >>> include/linux/pgtable.h | 10 ++++++++++ > >>> mm/debug_vm_pgtable.c | 9 +++++---- > >>> mm/huge_memory.c | 10 ++++++---- > >>> mm/internal.h | 2 +- > >>> mm/mremap.c | 10 ++++++---- > >>> mm/userfaultfd.c | 6 ++++-- > >>> 7 files changed, 48 insertions(+), 16 deletions(-) > >>> > >>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > >>> index 29cca0e6d0ff..20a609ec1ba6 100644 > >>> --- a/fs/proc/task_mmu.c > >>> +++ b/fs/proc/task_mmu.c > >>> @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > >>> * -Werror=unterminated-string-initialization warning > >>> * with GCC 15 > >>> */ > >>> - static const char mnemonics[BITS_PER_LONG][3] = { > >>> + static char mnemonics[BITS_PER_LONG][3] = { > >>> /* > >>> * In case if we meet a flag we don't know about. > >>> */ > >>> @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > >>> [ilog2(VM_SEALED)] = "sl", > >>> #endif > >>> }; > >>> +/* > >>> + * We should remove the VM_SOFTDIRTY flag if the PTE soft-dirty bit is > >>> + * unavailable on which the kernel is running, even if the architecture > >>> + * allows providing the PTE resource and soft-dirty is compiled in. > >>> + */ > >>> +#ifdef CONFIG_MEM_SOFT_DIRTY > >>> + if (!pte_soft_dirty_available()) > >>> + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > >>> +#endif > >>> + > >>> size_t i; > >>> > >>> seq_puts(m, "VmFlags: "); > >>> @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, > >>> static inline void clear_soft_dirty(struct vm_area_struct *vma, > >>> unsigned long addr, pte_t *pte) > >>> { > >>> + if (!pte_soft_dirty_available()) > >>> + return; > >>> /* > >>> * The soft-dirty tracker uses #PF-s to catch writes > >>> * to pages, so write-protect the pte as well. 
See the > >>> @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, > >>> { > >>> pmd_t old, pmd = *pmdp; > >>> > >>> + if (!pte_soft_dirty_available()) > >>> + return; > >>> + > >>> if (pmd_present(pmd)) { > >>> /* See comment in change_huge_pmd() */ > >>> old = pmdp_invalidate(vma, addr, pmdp); > >>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > >>> index 4c035637eeb7..c0e2a6dc69f4 100644 > >>> --- a/include/linux/pgtable.h > >>> +++ b/include/linux/pgtable.h > >>> @@ -1538,6 +1538,15 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > >>> #endif > >>> > >>> #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > >>> + > >>> +/* > >>> + * Some platforms can customize the PTE soft dirty bit and make it unavailable > >>> + * even if the architecture allows providing the PTE resource. > >>> + */ > >>> +#ifndef pte_soft_dirty_available > >>> +#define pte_soft_dirty_available() (true) > >>> +#endif > >>> + > >>> #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > >>> static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > >>> { > >>> @@ -1555,6 +1564,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) > >>> } > >>> #endif > >>> #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ > >>> +#define pte_soft_dirty_available() (false) > >>> static inline int pte_soft_dirty(pte_t pte) > >>> { > >>> return 0; > >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > >>> index 830107b6dd08..98ed7e22ccec 100644 > >>> --- a/mm/debug_vm_pgtable.c > >>> +++ b/mm/debug_vm_pgtable.c > >>> @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > >>> { > >>> pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > >>> > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > >>> + if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || !pte_soft_dirty_available()) > >> > >> I suggest that you instead make pte_soft_dirty_available() be false without CONFIG_MEM_SOFT_DIRTY. 
> >> > >> e.g., for the default implementation > >> > >> define pte_soft_dirty_available() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > >> > >> That way you can avoid some ifefs and cleanup these checks. > > > > Do you mean something like this: > > > > --- a/include/linux/pgtable.h > > +++ b/include/linux/pgtable.h > > @@ -1538,6 +1538,16 @@ static inline pgprot_t pgprot_modify(pgprot_t > > oldprot, pgprot_t newprot) > > #endif > > > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > > +#ifndef arch_soft_dirty_available > > +#define arch_soft_dirty_available() (true) > > +#endif > > +#define pgtable_soft_dirty_supported() > > (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && arch_soft_dirty_available()) > > + > > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > > { > > @@ -1555,6 +1565,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) > > } > > #endif > > #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */ > > +#define pgtable_soft_dirty_supported() (false) > > Maybe we can simplify to > > #ifndef pgtable_soft_dirty_supported > #define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > #endif > > And then just let the arch that overrides this function just make it > respect IS_ENABLED(CONFIG_MEM_SOFT_DIRTY). Ok, got you, I will address it. Thanks for your review, Chunyan > > -- > Cheers > > David / dhildenb > From huangze at whut.edu.cn Wed Sep 10 19:53:41 2025 From: huangze at whut.edu.cn (Ze Huang) Date: Thu, 11 Sep 2025 10:53:41 +0800 Subject: [PATCH v7 1/2] dt-bindings: usb: dwc3: add support for SpacemiT K1 In-Reply-To: References: <20250729-dwc3_generic-v7-0-5c791bba826f@linux.dev> <20250729-dwc3_generic-v7-1-5c791bba826f@linux.dev> Message-ID: On Wed, Sep 10, 2025 at 06:18:39PM -0400, Frank Li wrote: > On Tue, Jul 29, 2025 at 12:33:55AM +0800, Ze Huang wrote: > > Add support for the USB 3.0 Dual-Role Device (DRD) controller embedded > > in the SpacemiT K1 SoC. 
The controller is based on the Synopsys > > DesignWare Core USB 3 (DWC3) IP, supporting USB3.0 host mode and USB 2.0 > > DRD mode. > > > > Reviewed-by: Krzysztof Kozlowski > > Signed-off-by: Ze Huang > > --- > > Ze Huang: > > I seen Krzysztof and Thinh Nguyen already acked this patches. Do you > wait for greg pick it up or need respin? > > My one layerscape usb patch depend on this one! > > Frank Hi Frank, I'll remove the PCIe reset in the update - since Alex's latest combo PHY work now manages this functionality. The patch is otherwise in good shape though. Look for the updated series from me before end of week. Best, Ze From troy.mitchell at linux.spacemit.com Wed Sep 10 20:34:04 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 11:34:04 +0800 Subject: [PATCH RESEND v4 2/3] clk: spacemit: introduce pre-div for ddn clock In-Reply-To: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> References: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> Message-ID: <20250911-k1-clk-i2s-generation-v4-2-cba204a50d48@linux.spacemit.com> The original DDN operations applied an implicit divide-by-2, which should not be a default behavior. This patch removes that assumption, letting each clock define its actual behavior explicitly. 
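As a rough illustration of the pre-divider described above, the following standalone C sketch mirrors the arithmetic shape of ccu_ddn_calc_rate() in this patch (rate = prate * den / pre_div / num). It is not kernel code: the function name ddn_calc_rate, the 64-bit types, and the num/den values used in the note below are assumptions chosen for illustration only.

```c
#include <stdint.h>

/*
 * Illustrative only: same arithmetic shape as ccu_ddn_calc_rate() in the
 * patch, written with 64-bit integers so the intermediate product cannot
 * overflow on hosts where unsigned long is 32 bits wide.
 */
static uint64_t ddn_calc_rate(uint64_t prate, uint64_t num, uint64_t den,
			      unsigned int pre_div)
{
	/* Multiply first, then apply the pre-divider and the numerator. */
	return prate * den / pre_div / num;
}
```

With a 153.6 MHz parent and the assumed values num = 125, den = 24, a pre_div of 2 yields 14745600 Hz, while a pre_div of 1 yields 29491200 Hz. This is why the existing slow_uart clocks pass 2 explicitly to keep their previous behavior.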
Reviewed-by: Haylen Chu Signed-off-by: Troy Mitchell --- drivers/clk/spacemit/ccu-k1.c | 4 ++-- drivers/clk/spacemit/ccu_ddn.c | 12 ++++++------ drivers/clk/spacemit/ccu_ddn.h | 6 ++++-- 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/clk/spacemit/ccu-k1.c b/drivers/clk/spacemit/ccu-k1.c index 65e6de030717afa60eefab7bda88f9a13b857650..7155824673fb450971439873b6b6163faf48c7e5 100644 --- a/drivers/clk/spacemit/ccu-k1.c +++ b/drivers/clk/spacemit/ccu-k1.c @@ -136,8 +136,8 @@ CCU_GATE_DEFINE(pll1_d3_819p2, CCU_PARENT_HW(pll1_d3), MPMU_ACGR, BIT(14), 0); CCU_GATE_DEFINE(pll1_d2_1228p8, CCU_PARENT_HW(pll1_d2), MPMU_ACGR, BIT(16), 0); CCU_GATE_DEFINE(slow_uart, CCU_PARENT_NAME(osc), MPMU_ACGR, BIT(1), CLK_IGNORE_UNUSED); -CCU_DDN_DEFINE(slow_uart1_14p74, pll1_d16_153p6, MPMU_SUCCR, 16, 13, 0, 13, 0); -CCU_DDN_DEFINE(slow_uart2_48, pll1_d4_614p4, MPMU_SUCCR_1, 16, 13, 0, 13, 0); +CCU_DDN_DEFINE(slow_uart1_14p74, pll1_d16_153p6, MPMU_SUCCR, 16, 13, 0, 13, 2, 0); +CCU_DDN_DEFINE(slow_uart2_48, pll1_d4_614p4, MPMU_SUCCR_1, 16, 13, 0, 13, 2, 0); CCU_GATE_DEFINE(wdt_clk, CCU_PARENT_HW(pll1_d96_25p6), MPMU_WDTPCR, BIT(1), 0); diff --git a/drivers/clk/spacemit/ccu_ddn.c b/drivers/clk/spacemit/ccu_ddn.c index be311b045698e95a688a35858a8ac1bcfbffd2c7..06d86748182bd1959cdab5c18d0a882ee25dcade 100644 --- a/drivers/clk/spacemit/ccu_ddn.c +++ b/drivers/clk/spacemit/ccu_ddn.c @@ -22,21 +22,21 @@ #include "ccu_ddn.h" -static unsigned long ccu_ddn_calc_rate(unsigned long prate, - unsigned long num, unsigned long den) +static unsigned long ccu_ddn_calc_rate(unsigned long prate, unsigned long num, + unsigned long den, unsigned int pre_div) { - return prate * den / 2 / num; + return prate * den / pre_div / num; } static unsigned long ccu_ddn_calc_best_rate(struct ccu_ddn *ddn, unsigned long rate, unsigned long prate, unsigned long *num, unsigned long *den) { - rational_best_approximation(rate, prate / 2, + rational_best_approximation(rate, prate / ddn->pre_div, 
ddn->den_mask >> ddn->den_shift, ddn->num_mask >> ddn->num_shift, den, num); - return ccu_ddn_calc_rate(prate, *num, *den); + return ccu_ddn_calc_rate(prate, *num, *den, ddn->pre_div); } static long ccu_ddn_round_rate(struct clk_hw *hw, unsigned long rate, @@ -58,7 +58,7 @@ static unsigned long ccu_ddn_recalc_rate(struct clk_hw *hw, unsigned long prate) num = (val & ddn->num_mask) >> ddn->num_shift; den = (val & ddn->den_mask) >> ddn->den_shift; - return ccu_ddn_calc_rate(prate, num, den); + return ccu_ddn_calc_rate(prate, num, den, ddn->pre_div); } static int ccu_ddn_set_rate(struct clk_hw *hw, unsigned long rate, diff --git a/drivers/clk/spacemit/ccu_ddn.h b/drivers/clk/spacemit/ccu_ddn.h index a52fabe77d62eba16426867a9c13481e72f025c0..4838414a8e8dc04af49d3b8d39280efedbd75616 100644 --- a/drivers/clk/spacemit/ccu_ddn.h +++ b/drivers/clk/spacemit/ccu_ddn.h @@ -18,13 +18,14 @@ struct ccu_ddn { unsigned int num_shift; unsigned int den_mask; unsigned int den_shift; + unsigned int pre_div; }; #define CCU_DDN_INIT(_name, _parent, _flags) \ CLK_HW_INIT_HW(#_name, &_parent.common.hw, &spacemit_ccu_ddn_ops, _flags) #define CCU_DDN_DEFINE(_name, _parent, _reg_ctrl, _num_shift, _num_width, \ - _den_shift, _den_width, _flags) \ + _den_shift, _den_width, _pre_div, _flags) \ static struct ccu_ddn _name = { \ .common = { \ .reg_ctrl = _reg_ctrl, \ @@ -33,7 +34,8 @@ static struct ccu_ddn _name = { \ .num_mask = GENMASK(_num_shift + _num_width - 1, _num_shift), \ .num_shift = _num_shift, \ .den_mask = GENMASK(_den_shift + _den_width - 1, _den_shift), \ - .den_shift = _den_shift, \ + .den_shift = _den_shift, \ + .pre_div = _pre_div, \ } static inline struct ccu_ddn *hw_to_ccu_ddn(struct clk_hw *hw) -- 2.51.0 From troy.mitchell at linux.spacemit.com Wed Sep 10 20:34:02 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 11:34:02 +0800 Subject: [PATCH RESEND v4 0/3] clk: spacemit: fix i2s clock Message-ID: 
<20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> Previously, the driver defined two clocks for the I2S controller: i2s_bclk and its parent i2s_sysclk. Both i2s_bclk and i2s_sysclk were treated as fixed-rate clocks, which clearly does not reflect the practical requirements for I2S operation. Additionally, the original driver overlooked some upstream clock sources. To fix the I2S clock, this series also introduces several new clock definition macros. The I2S clock hierarchy can be found here [1]. Link: https://developer.spacemit.com/documentation?token=LCrKwWDasiJuROkVNusc2pWTnEb [1] Signed-off-by: Troy Mitchell --- Since v4 of this series has not received any comments for a while, I'm resending the patchset. --- Changes in v4: - drop the erroneous change in ccu_mix.h(patch3/3) - modify comment - modify commit msg - Link to v3: https://lore.kernel.org/r/20250818-k1-clk-i2s-generation-v3-0-8139b22ae709 at linux.spacemit.com Changes in v3: - remove factor for CCU_DIV_GATE_DEFINE - introduce I2S_BCLK_FACTOR as I2S_BCLK parent clock - adjust consumers in patch2/3 - Link to v2: https://lore.kernel.org/all/20250811-k1-clk-i2s-generation-v2-0-e4d3ec268b7a at linux.spacemit.com/ Changes in v2: - remove CCU_DDN_GATE_DEFINE - remove CCU_DIV_TABLE_GATE_DEFINE - move gate of i2s_sysclk from DDN to MUX - introduce factor for CCU_DIV_GATE_DEFINE - modify commit message - split patch2/2 into separate patches - remove reformatting in k1-syscon.h - Link to v1: https://lore.kernel.org/r/20250807-k1-clk-i2s-generation-v1-0-7dc25eb4e4d3 at linux.spacemit.com --- Troy Mitchell (3): dt-bindings: clock: spacemit: introduce i2s pre-clock to fix i2s clock clk: spacemit: introduce pre-div for ddn clock clk: spacemit: fix i2s clock drivers/clk/spacemit/ccu-k1.c | 32 ++++++++++++++++++++++---- drivers/clk/spacemit/ccu_ddn.c | 12 +++++----- drivers/clk/spacemit/ccu_ddn.h | 6 +++-- include/dt-bindings/clock/spacemit,k1-syscon.h | 4 ++++ include/soc/spacemit/k1-syscon.h | 1 + 
5 files changed, 43 insertions(+), 12 deletions(-) --- base-commit: e3324912fe5a05a3ea439df476625e7c8efc2b9a change-id: 20250804-k1-clk-i2s-generation-eee7049ee17a Best regards, -- Troy Mitchell From troy.mitchell at linux.spacemit.com Wed Sep 10 20:34:05 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 11:34:05 +0800 Subject: [PATCH RESEND v4 3/3] clk: spacemit: fix i2s clock In-Reply-To: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> References: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> Message-ID: <20250911-k1-clk-i2s-generation-v4-3-cba204a50d48@linux.spacemit.com> Defining i2s_bclk and i2s_sysclk as fixed-rate clocks is insufficient for real I2S use cases. Moreover, the current I2S clock configuration does not work as expected due to missing parent clocks. This patch adds the missing parent clocks, defines i2s_sysclk as a DDN clock, and i2s_bclk as a DIV clock. A special note for i2s_bclk: From the register definition, the i2s_bclk divider always implies an additional 1/2 factor. The following table shows the correspondence between index and frequency division coefficients:

| index | div   |
|-------|-------|
| 0     | 2     |
| 1     | 4     |
| 2     | 6     |
| 3     | 8     |

From a software perspective, introducing i2s_bclk_factor as the parent of i2s_bclk is sufficient to address the issue. The I2S-related clock registers can be found here [1].
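The index-to-divisor table above can be reproduced with a small standalone C sketch. It models the fixed 1/2 factor (i2s_bclk_factor) followed by a 2-bit divider field; the helper names and the assumption that the field value maps to a divisor of index + 1 are illustrative, chosen only because they are consistent with the table, and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: effective division for a 2-bit i2s_bclk index. */
static unsigned int i2s_bclk_effective_div(unsigned int index)
{
	const unsigned int fixed_factor = 2;			/* models i2s_bclk_factor */
	const unsigned int field_div = (index & 0x3) + 1;	/* assumed index + 1 mapping */

	return fixed_factor * field_div;
}

/* Hypothetical helper: resulting bclk rate for a given sysclk rate. */
static uint64_t i2s_bclk_rate(uint64_t sysclk_rate, unsigned int index)
{
	return sysclk_rate / i2s_bclk_effective_div(index);
}
```

Under these assumptions, indexes 0 to 3 give divisors 2, 4, 6 and 8, matching the table, and a 12.288 MHz sysclk with index 3 would give a 1.536 MHz bclk.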
Link: https://developer.spacemit.com/documentation?token=LCrKwWDasiJuROkVNusc2pWTnEb [1] Fixes: 1b72c59db0add ("clk: spacemit: Add clock support for SpacemiT K1 SoC") Co-developer: Jinmei Wei Suggested-by: Haylen Chu Signed-off-by: Jinmei Wei Signed-off-by: Troy Mitchell --- drivers/clk/spacemit/ccu-k1.c | 28 ++++++++++++++++++++++++++-- include/soc/spacemit/k1-syscon.h | 1 + 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/clk/spacemit/ccu-k1.c b/drivers/clk/spacemit/ccu-k1.c index 7155824673fb450971439873b6b6163faf48c7e5..50b472a2721121414f33e9fac6370f544e6b8229 100644 --- a/drivers/clk/spacemit/ccu-k1.c +++ b/drivers/clk/spacemit/ccu-k1.c @@ -141,8 +141,28 @@ CCU_DDN_DEFINE(slow_uart2_48, pll1_d4_614p4, MPMU_SUCCR_1, 16, 13, 0, 13, 2, 0); CCU_GATE_DEFINE(wdt_clk, CCU_PARENT_HW(pll1_d96_25p6), MPMU_WDTPCR, BIT(1), 0); -CCU_FACTOR_GATE_DEFINE(i2s_sysclk, CCU_PARENT_HW(pll1_d16_153p6), MPMU_ISCCR, BIT(31), 50, 1); -CCU_FACTOR_GATE_DEFINE(i2s_bclk, CCU_PARENT_HW(i2s_sysclk), MPMU_ISCCR, BIT(29), 1, 1); +CCU_FACTOR_DEFINE(i2s_153p6, CCU_PARENT_HW(pll1_d8_307p2), 2, 1); + +static const struct clk_parent_data i2s_153p6_base_parents[] = { + CCU_PARENT_HW(i2s_153p6), + CCU_PARENT_HW(pll1_d8_307p2), +}; +CCU_MUX_DEFINE(i2s_153p6_base, i2s_153p6_base_parents, MPMU_FCCR, 29, 1, 0); + +static const struct clk_parent_data i2s_sysclk_src_parents[] = { + CCU_PARENT_HW(pll1_d96_25p6), + CCU_PARENT_HW(i2s_153p6_base) +}; +CCU_MUX_GATE_DEFINE(i2s_sysclk_src, i2s_sysclk_src_parents, MPMU_ISCCR, 30, 1, BIT(31), 0); + +CCU_DDN_DEFINE(i2s_sysclk, i2s_sysclk_src, MPMU_ISCCR, 0, 15, 15, 12, 1, 0); + +CCU_FACTOR_DEFINE(i2s_bclk_factor, CCU_PARENT_HW(i2s_sysclk), 2, 1); +/* + * Divider of i2s_bclk always implies a 1/2 factor, which is + * described by i2s_bclk_factor. 
+ */ +CCU_DIV_GATE_DEFINE(i2s_bclk, CCU_PARENT_HW(i2s_bclk_factor), MPMU_ISCCR, 27, 2, BIT(29), 0); static const struct clk_parent_data apb_parents[] = { CCU_PARENT_HW(pll1_d96_25p6), @@ -756,6 +776,10 @@ static struct clk_hw *k1_ccu_mpmu_hws[] = { [CLK_I2S_BCLK] = &i2s_bclk.common.hw, [CLK_APB] = &apb_clk.common.hw, [CLK_WDT_BUS] = &wdt_bus_clk.common.hw, + [CLK_I2S_153P6] = &i2s_153p6.common.hw, + [CLK_I2S_153P6_BASE] = &i2s_153p6_base.common.hw, + [CLK_I2S_SYSCLK_SRC] = &i2s_sysclk_src.common.hw, + [CLK_I2S_BCLK_FACTOR] = &i2s_bclk_factor.common.hw, }; static const struct spacemit_ccu_data k1_ccu_mpmu_data = { diff --git a/include/soc/spacemit/k1-syscon.h b/include/soc/spacemit/k1-syscon.h index c59bd7a38e5b4219121341b9c0d9ffda13a9c3e2..354751562c55523ef8a22be931ddd8aca9651084 100644 --- a/include/soc/spacemit/k1-syscon.h +++ b/include/soc/spacemit/k1-syscon.h @@ -30,6 +30,7 @@ to_spacemit_ccu_adev(struct auxiliary_device *adev) /* MPMU register offset */ #define MPMU_POSR 0x0010 +#define MPMU_FCCR 0x0008 #define POSR_PLL1_LOCK BIT(27) #define POSR_PLL2_LOCK BIT(28) #define POSR_PLL3_LOCK BIT(29) -- 2.51.0 From troy.mitchell at linux.spacemit.com Wed Sep 10 20:34:03 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 11:34:03 +0800 Subject: [PATCH RESEND v4 1/3] dt-bindings: clock: spacemit: introduce i2s pre-clock to fix i2s clock In-Reply-To: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> References: <20250911-k1-clk-i2s-generation-v4-0-cba204a50d48@linux.spacemit.com> Message-ID: <20250911-k1-clk-i2s-generation-v4-1-cba204a50d48@linux.spacemit.com> Previously, the K1 clock driver did not include the parent clocks of the I2S sysclk. Introduce pre-clock to fix I2S clock. Otherwise, the I2S clock may not work as expected. This patch adds their definitions to allow proper registration in the driver and usage in the device tree. 
Fixes: 1b72c59db0add ("clk: spacemit: Add clock support for SpacemiT K1 SoC") Acked-by: Krzysztof Kozlowski Signed-off-by: Troy Mitchell --- include/dt-bindings/clock/spacemit,k1-syscon.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/dt-bindings/clock/spacemit,k1-syscon.h b/include/dt-bindings/clock/spacemit,k1-syscon.h index 2714c3fe66cd5b49e12c8b20689f5b01da36b774..ad62525be43a909633f8d3a65ece1acd60ba8052 100644 --- a/include/dt-bindings/clock/spacemit,k1-syscon.h +++ b/include/dt-bindings/clock/spacemit,k1-syscon.h @@ -77,6 +77,10 @@ #define CLK_I2S_BCLK 30 #define CLK_APB 31 #define CLK_WDT_BUS 32 +#define CLK_I2S_153P6 33 +#define CLK_I2S_153P6_BASE 34 +#define CLK_I2S_SYSCLK_SRC 35 +#define CLK_I2S_BCLK_FACTOR 36 /* MPMU resets */ #define RESET_WDT 0 -- 2.51.0 From spriteovo at gmail.com Wed Sep 10 21:46:01 2025 From: spriteovo at gmail.com (Asuna) Date: Thu, 11 Sep 2025 12:46:01 +0800 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds In-Reply-To: <20250910-harmless-bamboo-ebc94758fdad@spud> References: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> <20250910-harmless-bamboo-ebc94758fdad@spud> Message-ID: <6bceca9d-44cd-4373-a456-7c2129b418e3@gmail.com> On 9/10/25 10:27 PM, Conor Dooley wrote: > FWIW, this --- breaks git, and anything after this line (including your > signoff) is lost when the patch is applied. I used b4 command to prepare and send the cover letter and patch for v2, not sure what happened. I see that other people's patches have a [PATCH 0/n] email as a start that describes their patch series, this is called a cover-letter in b4 and git-send-email right? > The riscv patchwork CI stuff is really unhappy with this change: > init/Kconfig:87: syntax error > init/Kconfig:87: invalid statement > init/Kconfig:88: invalid statement > init/Kconfig:89:warning: ignoring unsupported character '`' > init/Kconfig:89:warning: ignoring unsupported character '`' > init/Kconfig:89:warning: ignoring unsupported character '.' 
> init/Kconfig:89: unknown statement "This" > > Is this bogus, or can rustc-bindgen-libclang-version return nothing > under some conditions where rust is not available? > Should this have 2 default lines like some other options in the file? This is because rustc-bindgen-libclang-version can't find the bindgen and returns nothing. Sorry I forgot to mention this, it's another reason why I wanted to separate the script, in a separate script we can easily fallback to return 0 when an error is encountered. Adding a second line `default 0` doesn't work, I'll try to fix it. BTW, when I fix it, if the diff isn't too large, do I need to open a v3 patch, or simply replying to the thread just fine? From troy.mitchell at linux.spacemit.com Wed Sep 10 22:47:09 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 13:47:09 +0800 Subject: [PATCH v3 0/2] ASoC: spacemit: add i2s support to K1 SoC Message-ID: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> On the K1 SoC, there is a full-duplex I2S controller. The I2S is programmable, with the sample width configurable to 8, 16, 18, or 32 bits. A dedicated FIFO is provided for transmit (TXFIFO) and another for receive (RXFIFO). In non-packed mode, both FIFOs are 32 entries deep and 32 bits wide, giving a total of 32 samples each. 
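To make the sample-width configuration above more concrete, here is a minimal standalone C sketch. It assumes, based on the comments in the driver patch of this series, that the data size field occupies bits 9:5 of SSCR and stores the width minus one, and that the bit clock is set to channels * rate * bits as in the driver's hw_params path. The macro and function names are hypothetical and are not part of the driver.

```c
#include <stdint.h>

#define DSS_SHIFT	5			/* assumed: data size field starts at bit 5 */
#define DSS_MASK	(0x1fu << DSS_SHIFT)	/* assumed: 5-bit field, bits 9:5 */

/* Encode a sample width of 1..32 bits as a field value of 0..31. */
static uint32_t sscr_dss_field(unsigned int bits)
{
	return ((uint32_t)(bits - 1) << DSS_SHIFT) & DSS_MASK;
}

/* Bit clock needed to carry the given stream without padding. */
static uint64_t i2s_bclk_for(unsigned int channels, unsigned int rate,
			     unsigned int bits)
{
	return (uint64_t)channels * rate * bits;
}
```

For example, a stereo 48 kHz stream with 16-bit samples would need a 1.536 MHz bit clock under this model.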
The register definitions can be found here[1] Link: https://developer.spacemit.com/documentation?token=Rn9Kw3iFHirAMgkIpTAcV2Arnkf#18.2-spi%2Fi2s [1] Signed-off-by: Troy Mitchell --- Changes in v3: - Patch 1/2: - simplify dma-names definition - Patch 2/2 - remove empty spacemit_i2s_remove() - move FSRT setup for DSP_A into switch-case in spacemit_i2s_set_fmt() - Link to v2: https://lore.kernel.org/r/20250828-k1-i2s-v2-0-09e7b40f002c at linux.spacemit.com Changes in v2: - Patch 1/2: - modify commit message - remove unused third cell from pdma dmas property - update SPDX license in spacemit,k1-i2s.yaml to (GPL-2.0-only OR BSD-2-Clause) - Patch 2/2: - modify commit message - reset_assert in dai_ops::remove - select CMA and DMA_CMA in Kconfig - use devm_reset_control_get_exclusive - Link to v1: https://lore.kernel.org/r/20250814-k1-i2s-v1-0-c31149b29041 at linux.spacemit.com --- Troy Mitchell (2): ASoC: dt-bindings: Add bindings for SpacemiT K1 ASoC: spacemit: add i2s support for K1 SoC .../devicetree/bindings/sound/spacemit,k1-i2s.yaml | 87 ++++ sound/soc/Kconfig | 1 + sound/soc/Makefile | 1 + sound/soc/spacemit/Kconfig | 16 + sound/soc/spacemit/Makefile | 5 + sound/soc/spacemit/k1_i2s.c | 444 +++++++++++++++++++++ 6 files changed, 554 insertions(+) --- base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 change-id: 20250813-k1-i2s-115bf65eaac8 prerequisite-change-id: 20250701-working_dma_0701_v2-7d2cf506aad7:v5 prerequisite-patch-id: 3fe97698036c32c20d03b1b835a5735e8ee8126c prerequisite-patch-id: bf64cb2fbb9699d2ace64ae517532f13c6f8d277 prerequisite-patch-id: 49263c65c84a0b045f9b5ae6831dc011c4dea52f prerequisite-patch-id: 2b43599bf7568e6432faa2f6aca5b2db792cd1c1 prerequisite-patch-id: 1b840918a99543f4497b6475ee52977bdb59f1c3 prerequisite-patch-id: 2f77be523fd5423bd011e3081a3635d130410096 prerequisite-patch-id: 78bcc660796fc4f73b884d17a1b63e62f99dfdd0 prerequisite-patch-id: 62d0b3678cf825bca51424ad85cf35ebdd6dc171 prerequisite-message-id: 
<20250911-k1-clk-i2s-generation-v4-0-cba204a50d48 at linux.spacemit.com> prerequisite-patch-id: b46d4007c5b20f11845db739fc78ffccc54f4dab prerequisite-patch-id: 1e193c412de1206c024a674e2dd7da88092976b9 prerequisite-patch-id: af07a4bca4109b13a74c0b20a12f96af863090ef prerequisite-message-id: <20250824-k1-clk-i2s-v5-0-217b6b7cea06 at linux.spacemit.com> prerequisite-patch-id: 6f2626811da4833395f52f712d9f2a5fb553cb48 prerequisite-patch-id: d2594982f7a8f39c2aa4f21490a19e93ab67254d Best regards, -- Troy Mitchell From troy.mitchell at linux.spacemit.com Wed Sep 10 22:47:11 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 13:47:11 +0800 Subject: [PATCH v3 2/2] ASoC: spacemit: add i2s support for K1 SoC In-Reply-To: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> References: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> Message-ID: <20250911-k1-i2s-v3-2-57f173732f9c@linux.spacemit.com> Add ASoC platform driver for the SpacemiT K1 SoC full-duplex I2S controller. 
Co-developer: Jinmei Wei Signed-off-by: Jinmei Wei Signed-off-by: Troy Mitchell --- sound/soc/Kconfig | 1 + sound/soc/Makefile | 1 + sound/soc/spacemit/Kconfig | 16 ++ sound/soc/spacemit/Makefile | 5 + sound/soc/spacemit/k1_i2s.c | 444 ++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 467 insertions(+) diff --git a/sound/soc/Kconfig b/sound/soc/Kconfig index ce74818bd7152dbe110b9fff7d908b0ddf34a9f5..36e0d443ba0ebe584ffe797c378c838f448ffcb9 100644 --- a/sound/soc/Kconfig +++ b/sound/soc/Kconfig @@ -127,6 +127,7 @@ source "sound/soc/renesas/Kconfig" source "sound/soc/rockchip/Kconfig" source "sound/soc/samsung/Kconfig" source "sound/soc/sdca/Kconfig" +source "sound/soc/spacemit/Kconfig" source "sound/soc/spear/Kconfig" source "sound/soc/sprd/Kconfig" source "sound/soc/starfive/Kconfig" diff --git a/sound/soc/Makefile b/sound/soc/Makefile index 462322c38aa42d4c394736239de0317d5918d5a7..8c0480e6484e75eb0b6db306630ba77d259ba8e3 100644 --- a/sound/soc/Makefile +++ b/sound/soc/Makefile @@ -70,6 +70,7 @@ obj-$(CONFIG_SND_SOC) += rockchip/ obj-$(CONFIG_SND_SOC) += samsung/ obj-$(CONFIG_SND_SOC) += sdca/ obj-$(CONFIG_SND_SOC) += sof/ +obj-$(CONFIG_SND_SOC) += spacemit/ obj-$(CONFIG_SND_SOC) += spear/ obj-$(CONFIG_SND_SOC) += sprd/ obj-$(CONFIG_SND_SOC) += starfive/ diff --git a/sound/soc/spacemit/Kconfig b/sound/soc/spacemit/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..2179f94f3f179c54cd06e6ced5523ed3f5225cf4 --- /dev/null +++ b/sound/soc/spacemit/Kconfig @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0-only +menu "SpacemiT" + depends on COMPILE_TEST || ARCH_SPACEMIT + depends on HAVE_CLK + +config SND_SOC_K1_I2S + tristate "K1 I2S Device Driver" + select SND_SOC_GENERIC_DMAENGINE_PCM + select CMA + select DMA_CMA + help + Say Y or M if you want to add support for I2S driver for + K1 I2S controller. The device supports up to maximum of + 2 channels each for play and record. 
+ +endmenu diff --git a/sound/soc/spacemit/Makefile b/sound/soc/spacemit/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..9069de8ef89c84db8cc7d3a4d3b154fff9bd7aff --- /dev/null +++ b/sound/soc/spacemit/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +# K1 Platform Support +snd-soc-k1-i2s-y := k1_i2s.o + +obj-$(CONFIG_SND_SOC_K1_I2S) += snd-soc-k1-i2s.o diff --git a/sound/soc/spacemit/k1_i2s.c b/sound/soc/spacemit/k1_i2s.c new file mode 100644 index 0000000000000000000000000000000000000000..bd3eb178e51cb76f824d9960a093eae0af55bfac --- /dev/null +++ b/sound/soc/spacemit/k1_i2s.c @@ -0,0 +1,444 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Troy Mitchell */ + +#include +#include +#include +#include +#include +#include + +#define SSCR 0x00 /* SPI/I2S top control register */ +#define SSFCR 0x04 /* SPI/I2S FIFO control register */ +#define SSINTEN 0x08 /* SPI/I2S interrupt enable register */ +#define SSDATR 0x10 /* SPI/I2S data register */ +#define SSPSP 0x18 /* SPI/I2S programmable serial protocol control register */ +#define SSRWT 0x24 /* SPI/I2S root control register */ + +/* SPI/I2S Work data size, register bits value 0~31 indicated data size 1~32 bits */ +#define SSCR_FIELD_DSS GENMASK(9, 5) +#define SSCR_DW_8BYTE FIELD_PREP(SSCR_FIELD_DSS, 0x7) +#define SSCR_DW_16BYTE FIELD_PREP(SSCR_FIELD_DSS, 0xf) +#define SSCR_DW_18BYTE FIELD_PREP(SSCR_FIELD_DSS, 0x11) +#define SSCR_DW_32BYTE FIELD_PREP(SSCR_FIELD_DSS, 0x1f) + +#define SSCR_SSE BIT(0) /* SPI/I2S Enable */ +#define SSCR_FRF_PSP GENMASK(2, 1) /* Frame Format*/ +#define SSCR_TRAIL BIT(13) /* Trailing Byte */ + +#define SSFCR_FIELD_TFT GENMASK(3, 0) /* TXFIFO Trigger Threshold */ +#define SSFCR_FIELD_RFT GENMASK(8, 5) /* RXFIFO Trigger Threshold */ +#define SSFCR_TSRE BIT(10) /* Transmit Service Request Enable */ +#define SSFCR_RSRE BIT(11) /* Receive Service Request Enable */ + +#define SSPSP_FSRT BIT(3) /* Frame Sync Relative Timing Bit */ +#define 
SSPSP_SFRMP BIT(4) /* Serial Frame Polarity */ +#define SSPSP_FIELD_SFRMWDTH GENMASK(17, 12) /* Serial Frame Width field */ + +#define SSRWT_RWOT BIT(0) /* Receive Without Transmit */ + +#define SPACEMIT_PCM_RATES SNDRV_PCM_RATE_8000_192000 +#define SPACEMIT_PCM_FORMATS (SNDRV_PCM_FMTBIT_S8 | \ + SNDRV_PCM_FMTBIT_S16_LE | \ + SNDRV_PCM_FMTBIT_S24_LE | \ + SNDRV_PCM_FMTBIT_S32_LE) + +#define SPACEMIT_I2S_PERIOD_SIZE 1024 + +struct spacemit_i2s_dev { + struct device *dev; + + void __iomem *base; + + struct reset_control *reset; + + struct clk *sysclk; + struct clk *bclk; + struct clk *sspa_clk; + + struct snd_dmaengine_dai_dma_data capture_dma_data; + struct snd_dmaengine_dai_dma_data playback_dma_data; + + bool has_capture; + bool has_playback; + + int dai_fmt; + + int started_count; +}; + +static const struct snd_pcm_hardware spacemit_pcm_hardware = { + .info = SNDRV_PCM_INFO_INTERLEAVED | + SNDRV_PCM_INFO_BATCH, + .formats = SPACEMIT_PCM_FORMATS, + .rates = SPACEMIT_PCM_RATES, + .rate_min = SNDRV_PCM_RATE_8000, + .rate_max = SNDRV_PCM_RATE_192000, + .channels_min = 1, + .channels_max = 2, + .buffer_bytes_max = SPACEMIT_I2S_PERIOD_SIZE * 4 * 4, + .period_bytes_min = SPACEMIT_I2S_PERIOD_SIZE * 2, + .period_bytes_max = SPACEMIT_I2S_PERIOD_SIZE * 4, + .periods_min = 2, + .periods_max = 4, +}; + +static const struct snd_dmaengine_pcm_config spacemit_dmaengine_pcm_config = { + .pcm_hardware = &spacemit_pcm_hardware, + .prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config, + .chan_names = {"tx", "rx"}, + .prealloc_buffer_size = 32 * 1024, +}; + +static void spacemit_i2s_init(struct spacemit_i2s_dev *i2s) +{ + u32 sscr_val, sspsp_val, ssfcr_val, ssrwt_val; + + sscr_val = SSCR_TRAIL | SSCR_FRF_PSP; + ssfcr_val = FIELD_PREP(SSFCR_FIELD_TFT, 5) | + FIELD_PREP(SSFCR_FIELD_RFT, 5) | + SSFCR_RSRE | SSFCR_TSRE; + ssrwt_val = SSRWT_RWOT; + + /* SSPSP register was set by set_fmt */ + sspsp_val = readl(i2s->base + SSPSP); + sspsp_val |= SSPSP_SFRMP; + + writel(sscr_val, 
i2s->base + SSCR); + writel(ssfcr_val, i2s->base + SSFCR); + writel(sspsp_val, i2s->base + SSPSP); + writel(ssrwt_val, i2s->base + SSRWT); + writel(0, i2s->base + SSINTEN); +} + +static int spacemit_i2s_hw_params(struct snd_pcm_substream *substream, + struct snd_pcm_hw_params *params, + struct snd_soc_dai *dai) +{ + struct spacemit_i2s_dev *i2s = snd_soc_dai_get_drvdata(dai); + struct snd_dmaengine_dai_dma_data *dma_data; + u32 data_width, data_bits; + unsigned long bclk_rate; + u32 val; + int ret; + + val = readl(i2s->base + SSCR); + if (val & SSCR_SSE) + return 0; + + dma_data = &i2s->playback_dma_data; + + if (substream->stream == SNDRV_PCM_STREAM_CAPTURE) + dma_data = &i2s->capture_dma_data; + + switch (params_format(params)) { + case SNDRV_PCM_FORMAT_S8: + data_bits = 8; + data_width = SSCR_DW_8BYTE; + dma_data->maxburst = 8; + dma_data->addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE; + break; + case SNDRV_PCM_FORMAT_S16_LE: + data_bits = 16; + data_width = SSCR_DW_16BYTE; + dma_data->maxburst = 16; + dma_data->addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES; + if ((i2s->dai_fmt & SND_SOC_DAIFMT_FORMAT_MASK) == SND_SOC_DAIFMT_I2S) { + data_width = SSCR_DW_32BYTE; + dma_data->maxburst = 32; + dma_data->addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES; + } + break; + case SNDRV_PCM_FORMAT_S32_LE: + data_bits = 32; + data_width = SSCR_DW_32BYTE; + dma_data->maxburst = 32; + dma_data->addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES; + break; + default: + dev_dbg(i2s->dev, "unexpected data width type"); + return -EINVAL; + } + + val = readl(i2s->base + SSCR); + val &= ~SSCR_DW_32BYTE; + val |= data_width; + writel(val, i2s->base + SSCR); + + bclk_rate = params_channels(params) * + params_rate(params) * + data_bits; + + ret = clk_set_rate(i2s->bclk, bclk_rate); + if (ret) + return ret; + + return clk_set_rate(i2s->sspa_clk, bclk_rate); +} + +static int spacemit_i2s_set_sysclk(struct snd_soc_dai *cpu_dai, int clk_id, + unsigned int freq, int dir) +{ + struct spacemit_i2s_dev *i2s = 
dev_get_drvdata(cpu_dai->dev); + + if (freq == 0) + return 0; + + return clk_set_rate(i2s->sysclk, freq); +} + +static int spacemit_i2s_set_fmt(struct snd_soc_dai *cpu_dai, + unsigned int fmt) +{ + struct spacemit_i2s_dev *i2s = dev_get_drvdata(cpu_dai->dev); + u32 sspsp_val; + + sspsp_val = readl(i2s->base + SSPSP); + sspsp_val &= ~SSPSP_FIELD_SFRMWDTH; + + i2s->dai_fmt = fmt; + + switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { + case SND_SOC_DAIFMT_I2S: + cpu_dai->driver->playback.formats = SNDRV_PCM_FMTBIT_S16_LE; + cpu_dai->driver->capture.formats = SNDRV_PCM_FMTBIT_S16_LE; + sspsp_val |= FIELD_PREP(SSPSP_FIELD_SFRMWDTH, 0x10) | + SSPSP_FSRT; + break; + case SND_SOC_DAIFMT_DSP_A: + sspsp_val |= SSPSP_FSRT; + case SND_SOC_DAIFMT_DSP_B: + cpu_dai->driver->playback.channels_min = 1; + cpu_dai->driver->playback.channels_max = 1; + cpu_dai->driver->capture.channels_min = 1; + cpu_dai->driver->capture.channels_max = 1; + cpu_dai->driver->playback.formats = SNDRV_PCM_FMTBIT_S32_LE; + cpu_dai->driver->capture.formats = SNDRV_PCM_FMTBIT_S32_LE; + sspsp_val |= FIELD_PREP(SSPSP_FIELD_SFRMWDTH, 0x1); + break; + default: + dev_dbg(i2s->dev, "unexpected format type"); + return -EINVAL; + } + + writel(sspsp_val, i2s->base + SSPSP); + + return 0; +} + +static int spacemit_i2s_trigger(struct snd_pcm_substream *substream, + int cmd, struct snd_soc_dai *dai) +{ + struct spacemit_i2s_dev *i2s = snd_soc_dai_get_drvdata(dai); + u32 val; + + switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + case SNDRV_PCM_TRIGGER_RESUME: + case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: + if (!i2s->started_count) { + val = readl(i2s->base + SSCR); + val |= SSCR_SSE; + writel(val, i2s->base + SSCR); + } + i2s->started_count++; + break; + case SNDRV_PCM_TRIGGER_STOP: + case SNDRV_PCM_TRIGGER_SUSPEND: + case SNDRV_PCM_TRIGGER_PAUSE_PUSH: + if (i2s->started_count) + i2s->started_count--; + + if (!i2s->started_count) { + val = readl(i2s->base + SSCR); + val &= ~SSCR_SSE; + writel(val, i2s->base + SSCR); + } + break; + 
default: + return -EINVAL; + } + + return 0; +} + +static int spacemit_i2s_dai_probe(struct snd_soc_dai *dai) +{ + struct spacemit_i2s_dev *i2s = snd_soc_dai_get_drvdata(dai); + + snd_soc_dai_init_dma_data(dai, + i2s->has_playback ? &i2s->playback_dma_data : NULL, + i2s->has_capture ? &i2s->capture_dma_data : NULL); + + reset_control_deassert(i2s->reset); + + spacemit_i2s_init(i2s); + + return 0; +} + +static int spacemit_i2s_dai_remove(struct snd_soc_dai *dai) +{ + struct spacemit_i2s_dev *i2s = snd_soc_dai_get_drvdata(dai); + + reset_control_assert(i2s->reset); + + return 0; +} + +static const struct snd_soc_dai_ops spacemit_i2s_dai_ops = { + .probe = spacemit_i2s_dai_probe, + .remove = spacemit_i2s_dai_remove, + .hw_params = spacemit_i2s_hw_params, + .set_sysclk = spacemit_i2s_set_sysclk, + .set_fmt = spacemit_i2s_set_fmt, + .trigger = spacemit_i2s_trigger, +}; + +static struct snd_soc_dai_driver spacemit_i2s_dai = { + .ops = &spacemit_i2s_dai_ops, + .playback = { + .channels_min = 1, + .channels_max = 2, + .rates = SPACEMIT_PCM_RATES, + .rate_min = SNDRV_PCM_RATE_8000, + .rate_max = SNDRV_PCM_RATE_192000, + .formats = SPACEMIT_PCM_FORMATS, + }, + .capture = { + .channels_min = 1, + .channels_max = 2, + .rates = SPACEMIT_PCM_RATES, + .rate_min = SNDRV_PCM_RATE_8000, + .rate_max = SNDRV_PCM_RATE_192000, + .formats = SPACEMIT_PCM_FORMATS, + }, + .symmetric_rate = 1, +}; + +static int spacemit_i2s_init_dai(struct spacemit_i2s_dev *i2s, + struct snd_soc_dai_driver **dp, + dma_addr_t addr) +{ + struct device_node *node = i2s->dev->of_node; + struct snd_soc_dai_driver *dai; + struct property *dma_names; + const char *dma_name; + + of_property_for_each_string(node, "dma-names", dma_names, dma_name) { + if (!strcmp(dma_name, "tx")) + i2s->has_playback = true; + if (!strcmp(dma_name, "rx")) + i2s->has_capture = true; + } + + dai = devm_kmemdup(i2s->dev, &spacemit_i2s_dai, + sizeof(*dai), GFP_KERNEL); + if (!dai) + return -ENOMEM; + + if (i2s->has_playback) { + 
dai->playback.stream_name = "Playback"; + dai->playback.channels_min = 1; + dai->playback.channels_max = 2; + dai->playback.rates = SPACEMIT_PCM_RATES; + dai->playback.formats = SPACEMIT_PCM_FORMATS; + + i2s->playback_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES; + i2s->playback_dma_data.maxburst = 32; + i2s->playback_dma_data.addr = addr; + } + + if (i2s->has_capture) { + dai->capture.stream_name = "Capture"; + dai->capture.channels_min = 1; + dai->capture.channels_max = 2; + dai->capture.rates = SPACEMIT_PCM_RATES; + dai->capture.formats = SPACEMIT_PCM_FORMATS; + + i2s->capture_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES; + i2s->capture_dma_data.maxburst = 32; + i2s->capture_dma_data.addr = addr; + } + + if (dp) + *dp = dai; + + return 0; +} + +static const struct snd_soc_component_driver spacemit_i2s_component = { + .name = "i2s-k1", + .legacy_dai_naming = 1, +}; + +static int spacemit_i2s_probe(struct platform_device *pdev) +{ + struct snd_soc_dai_driver *dai; + struct spacemit_i2s_dev *i2s; + struct resource *res; + struct clk *clk; + int ret; + + i2s = devm_kzalloc(&pdev->dev, sizeof(*i2s), GFP_KERNEL); + if (!i2s) + return -ENOMEM; + + i2s->dev = &pdev->dev; + + i2s->sysclk = devm_clk_get_enabled(i2s->dev, "sysclk"); + if (IS_ERR(i2s->sysclk)) + return dev_err_probe(i2s->dev, PTR_ERR(i2s->sysclk), + "failed to enable sysbase clock\n"); + + i2s->bclk = devm_clk_get_enabled(i2s->dev, "bclk"); + if (IS_ERR(i2s->bclk)) + return dev_err_probe(i2s->dev, PTR_ERR(i2s->bclk), "failed to enable bit clock\n"); + + clk = devm_clk_get_enabled(i2s->dev, "sspa_bus"); + if (IS_ERR(clk)) + return dev_err_probe(i2s->dev, PTR_ERR(clk), "failed to enable sspa_bus clock\n"); + + i2s->sspa_clk = devm_clk_get_enabled(i2s->dev, "sspa"); + if (IS_ERR(i2s->sspa_clk)) + return dev_err_probe(i2s->dev, PTR_ERR(i2s->sspa_clk), "failed to enable sspa clock\n"); + + i2s->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res); + if (IS_ERR(i2s->base)) + return dev_err_probe(i2s->dev, 
PTR_ERR(i2s->base), "failed to map registers\n"); + + i2s->reset = devm_reset_control_get_exclusive(&pdev->dev, NULL); + if (IS_ERR(i2s->reset)) + return dev_err_probe(i2s->dev, PTR_ERR(i2s->reset), + "failed to get reset control"); + + dev_set_drvdata(i2s->dev, i2s); + + spacemit_i2s_init_dai(i2s, &dai, res->start + SSDATR); + + ret = devm_snd_soc_register_component(i2s->dev, + &spacemit_i2s_component, + dai, 1); + if (ret) + return dev_err_probe(i2s->dev, ret, "failed to register component"); + + return devm_snd_dmaengine_pcm_register(&pdev->dev, &spacemit_dmaengine_pcm_config, 0); +} + +static const struct of_device_id spacemit_i2s_of_match[] = { + { .compatible = "spacemit,k1-i2s", }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, spacemit_i2s_of_match); + +static struct platform_driver spacemit_i2s_driver = { + .probe = spacemit_i2s_probe, + .driver = { + .name = "i2s-k1", + .of_match_table = spacemit_i2s_of_match, + }, +}; +module_platform_driver(spacemit_i2s_driver); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("I2S bus driver for SpacemiT K1 SoC"); -- 2.51.0 From troy.mitchell at linux.spacemit.com Wed Sep 10 22:47:10 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 13:47:10 +0800 Subject: [PATCH v3 1/2] ASoC: dt-bindings: Add bindings for SpacemiT K1 In-Reply-To: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> References: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> Message-ID: <20250911-k1-i2s-v3-1-57f173732f9c@linux.spacemit.com> Add dt-binding for the i2s driver of SpacemiT's K1 SoC. 
Signed-off-by: Troy Mitchell --- .../devicetree/bindings/sound/spacemit,k1-i2s.yaml | 87 ++++++++++++++++++++++ 1 file changed, 87 insertions(+) diff --git a/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml b/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml new file mode 100644 index 0000000000000000000000000000000000000000..55bd0b307d22b3611d0fefb1e925e56812848dd1 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/spacemit,k1-i2s.yaml @@ -0,0 +1,87 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/sound/spacemit,k1-i2s.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: K1 I2S controller + +description: + The I2S bus (Inter-IC sound bus) is a serial link for digital + audio data transfer between devices in the system. + +maintainers: + - Troy Mitchell + +allOf: + - $ref: dai-common.yaml# + +properties: + compatible: + const: spacemit,k1-i2s + + reg: + maxItems: 1 + + clocks: + items: + - description: clock for I2S sysclk + - description: clock for I2S bclk + - description: clock for I2S bus + - description: clock for I2S controller + + clock-names: + items: + - const: sysclk + - const: bclk + - const: bus + - const: func + + dmas: + minItems: 1 + maxItems: 2 + + dma-names: + minItems: 1 + items: + - const: tx + - const: rx + + resets: + maxItems: 1 + + port: + $ref: audio-graph-port.yaml# + unevaluatedProperties: false + + "#sound-dai-cells": + const: 0 + +required: + - compatible + - reg + - clocks + - clock-names + - dmas + - dma-names + - resets + - "#sound-dai-cells" + +unevaluatedProperties: false + +examples: + - | + #include + i2s@d4026000 { + compatible = "spacemit,k1-i2s"; + reg = <0xd4026000 0x30>; + clocks = <&syscon_mpmu CLK_I2S_SYSCLK>, + <&syscon_mpmu CLK_I2S_BCLK>, + <&syscon_apbc CLK_SSPA0_BUS>, + <&syscon_apbc CLK_SSPA0>; + clock-names = "sysclk", "bclk", "bus", "func"; + dmas = <&pdma0 21>, <&pdma0 22>; + dma-names = "tx", "rx"; 
+ resets = <&syscon_apbc RESET_SSPA0>; + #sound-dai-cells = <0>; + }; -- 2.51.0 From troy.mitchell at linux.spacemit.com Wed Sep 10 22:50:36 2025 From: troy.mitchell at linux.spacemit.com (Troy Mitchell) Date: Thu, 11 Sep 2025 13:50:36 +0800 Subject: [PATCH v3] i2c: spacemit: configure ILCR for accurate SCL frequency In-Reply-To: <20250814-k1-i2c-ilcr-v3-1-317723e74bcd@linux.spacemit.com> References: <20250814-k1-i2c-ilcr-v3-1-317723e74bcd@linux.spacemit.com> Message-ID: <0E357339968F18D8+aMJjLLhA66Pe17I3@LT-Guozexi> On Thu, Aug 14, 2025 at 05:06:01PM +0800, Troy Mitchell wrote: > The SpacemiT I2C controller's SCL (Serial Clock Line) frequency for > master mode operations is determined by the ILCR (I2C Load Count Register). > Previously, the driver relied on the hardware's reset default > values for this register. > > The hardware's default ILCR values (SLV=0x156, FLV=0x5d) yield SCL > frequencies lower than intended. For example, with the default > 31.5 MHz input clock, these default settings result in an SCL > frequency of approximately 93 kHz (standard mode) when targeting 100 kHz, > and approximately 338 kHz (fast mode) when targeting 400 kHz. > These frequencies are below the 100 kHz/400 kHz nominal speeds. > > This patch integrates the SCL frequency management into > the Common Clock Framework (CCF). Specifically, the ILCR register, > which acts as a frequency divider for the SCL clock, is now registered > as a managed clock (scl_clk) within the CCF. > > This patch also cleans up unnecessary whitespace > in the included header files. > > Signed-off-by: Troy Mitchell > --- > Changelog in v3: > - use MASK macro in `recalc_rate` function > - rename clock name > - Link to v2: https://lore.kernel.org/r/20250718-k1-i2c-ilcr-v2-1-b4c68f13dcb1 at linux.spacemit.com Gentle ping. Any comments on this patch? It was last resent about 3 weeks ago. - Troy > > Changelog in v2: > - Align line breaks. > - Check `lv` in `clk_set_rate` function. 
> - Force fast mode when SCL frequency is illegal or unavailable. > - Change "linux/bits.h" to > - Kconfig: Add dependency on CCF. > - Link to v1: https://lore.kernel.org/all/20250710-k1-i2c-ilcr-v1-1-188d1f460c7d at linux.spacemit.com/ > --- > drivers/i2c/busses/Kconfig | 2 +- > drivers/i2c/busses/i2c-k1.c | 180 ++++++++++++++++++++++++++++++++++++++++---- > 2 files changed, 167 insertions(+), 15 deletions(-) > > diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig > index c8d115b58e449b59a38339b439190dcb0e332965..1382b6c257fa4ba4cf5098d684c1bbd5e2636fd4 100644 > --- a/drivers/i2c/busses/Kconfig > +++ b/drivers/i2c/busses/Kconfig > @@ -797,7 +797,7 @@ config I2C_JZ4780 > config I2C_K1 > tristate "SpacemiT K1 I2C adapter" > depends on ARCH_SPACEMIT || COMPILE_TEST > - depends on OF > + depends on OF && COMMON_CLK > help > This option enables support for the I2C interface on the SpacemiT K1 > platform. > diff --git a/drivers/i2c/busses/i2c-k1.c b/drivers/i2c/busses/i2c-k1.c > index b68a21fff0b56b59fe2032ccb7ca6953423aad32..34b22969cf6789e99de58dd93dda6f0951069f85 100644 > --- a/drivers/i2c/busses/i2c-k1.c > +++ b/drivers/i2c/busses/i2c-k1.c > @@ -3,17 +3,20 @@ > * Copyright (C) 2024-2025 Troy Mitchell > */ > > - #include > - #include > - #include > - #include > - #include > - #include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > > /* spacemit i2c registers */ > #define SPACEMIT_ICR 0x0 /* Control register */ > #define SPACEMIT_ISR 0x4 /* Status register */ > #define SPACEMIT_IDBR 0xc /* Data buffer register */ > +#define SPACEMIT_ILCR 0x10 /* Load Count Register */ > #define SPACEMIT_IBMR 0x1c /* Bus monitor register */ > > /* SPACEMIT_ICR register fields */ > @@ -80,6 +83,19 @@ > #define SPACEMIT_BMR_SDA BIT(0) /* SDA line level */ > #define SPACEMIT_BMR_SCL BIT(1) /* SCL line level */ > > +#define SPACEMIT_LCR_LV_STANDARD_SHIFT 0 > +#define SPACEMIT_LCR_LV_FAST_SHIFT 9 > +#define 
SPACEMIT_LCR_LV_STANDARD_WIDTH 9 > +#define SPACEMIT_LCR_LV_FAST_WIDTH 9 > +#define SPACEMIT_LCR_LV_STANDARD_MAX_VALUE GENMASK(SPACEMIT_LCR_LV_STANDARD_WIDTH - 1, 0) > +#define SPACEMIT_LCR_LV_FAST_MAX_VALUE GENMASK(SPACEMIT_LCR_LV_FAST_WIDTH - 1, 0) > +#define SPACEMIT_LCR_LV_STANDARD_MASK GENMASK(SPACEMIT_LCR_LV_STANDARD_SHIFT +\ > + SPACEMIT_LCR_LV_STANDARD_WIDTH - 1,\ > + SPACEMIT_LCR_LV_STANDARD_SHIFT) > +#define SPACEMIT_LCR_LV_FAST_MASK GENMASK(SPACEMIT_LCR_LV_FAST_SHIFT +\ > + SPACEMIT_LCR_LV_FAST_WIDTH - 1,\ > + SPACEMIT_LCR_LV_FAST_SHIFT) > + > /* i2c bus recover timeout: us */ > #define SPACEMIT_I2C_BUS_BUSY_TIMEOUT 100000 > > @@ -95,11 +111,20 @@ enum spacemit_i2c_state { > SPACEMIT_STATE_WRITE, > }; > > +enum spacemit_i2c_mode { > + SPACEMIT_MODE_STANDARD, > + SPACEMIT_MODE_FAST > +}; > + > /* i2c-spacemit driver's main struct */ > struct spacemit_i2c_dev { > struct device *dev; > struct i2c_adapter adapt; > > + struct clk_hw scl_clk_hw; > + struct clk *scl_clk; > + enum spacemit_i2c_mode mode; > + > /* hardware resources */ > void __iomem *base; > int irq; > @@ -120,6 +145,88 @@ struct spacemit_i2c_dev { > u32 status; > }; > > +static void spacemit_i2c_scl_clk_disable_unprepare(void *data) > +{ > + struct spacemit_i2c_dev *i2c = data; > + > + clk_disable_unprepare(i2c->scl_clk); > +} > + > +static void spacemit_i2c_scl_clk_exclusive_put(void *data) > +{ > + struct spacemit_i2c_dev *i2c = data; > + > + clk_rate_exclusive_put(i2c->scl_clk); > +} > + > +static int spacemit_i2c_clk_set_rate(struct clk_hw *hw, unsigned long rate, > + unsigned long parent_rate) > +{ > + struct spacemit_i2c_dev *i2c = container_of(hw, struct spacemit_i2c_dev, scl_clk_hw); > + u32 lv, lcr, mask, shift, max_lv; > + > + lv = DIV_ROUND_UP(parent_rate, rate); > + > + if (i2c->mode == SPACEMIT_MODE_STANDARD) { > + mask = SPACEMIT_LCR_LV_STANDARD_MASK; > + shift = SPACEMIT_LCR_LV_STANDARD_SHIFT; > + max_lv = SPACEMIT_LCR_LV_STANDARD_MAX_VALUE; > + } else if (i2c->mode == 
SPACEMIT_MODE_FAST) { > + mask = SPACEMIT_LCR_LV_FAST_MASK; > + shift = SPACEMIT_LCR_LV_FAST_SHIFT; > + max_lv = SPACEMIT_LCR_LV_FAST_MAX_VALUE; > + } > + > + if (!lv || lv > max_lv) { > + dev_err(i2c->dev, "set scl clock failed: lv 0x%x", lv); > + return -EINVAL; > + } > + > + lcr = readl(i2c->base + SPACEMIT_ILCR); > + lcr &= ~mask; > + lcr |= lv << shift; > + writel(lcr, i2c->base + SPACEMIT_ILCR); > + > + return 0; > +} > + > +static long spacemit_i2c_clk_round_rate(struct clk_hw *hw, unsigned long rate, > + unsigned long *parent_rate) > +{ > + u32 lv, freq; > + > + lv = DIV_ROUND_UP(*parent_rate, rate); > + freq = DIV_ROUND_UP(*parent_rate, lv); > + > + return freq; > +} > + > +static unsigned long spacemit_i2c_clk_recalc_rate(struct clk_hw *hw, > + unsigned long parent_rate) > +{ > + struct spacemit_i2c_dev *i2c = container_of(hw, struct spacemit_i2c_dev, scl_clk_hw); > + u32 lcr, lv = 0; > + > + lcr = readl(i2c->base + SPACEMIT_ILCR); > + > + if (i2c->mode == SPACEMIT_MODE_STANDARD) > + lv = (lcr & SPACEMIT_LCR_LV_STANDARD_MASK) >> > + SPACEMIT_LCR_LV_STANDARD_SHIFT; > + else if (i2c->mode == SPACEMIT_MODE_FAST) > + lv = (lcr & SPACEMIT_LCR_LV_FAST_MASK) >> > + SPACEMIT_LCR_LV_FAST_SHIFT; > + else > + return 0; > + > + return DIV_ROUND_UP(parent_rate, lv); > +} > + > +static const struct clk_ops spacemit_i2c_clk_ops = { > + .set_rate = spacemit_i2c_clk_set_rate, > + .round_rate = spacemit_i2c_clk_round_rate, > + .recalc_rate = spacemit_i2c_clk_recalc_rate, > +}; > + > static void spacemit_i2c_enable(struct spacemit_i2c_dev *i2c) > { > u32 val; > @@ -138,6 +245,27 @@ static void spacemit_i2c_disable(struct spacemit_i2c_dev *i2c) > writel(val, i2c->base + SPACEMIT_ICR); > } > > +static struct clk *spacemit_i2c_register_scl_clk(struct spacemit_i2c_dev *i2c, > + struct clk *parent) > +{ > + struct clk_init_data init; > + char name[32]; > + > + snprintf(name, sizeof(name), "%s_scl_clk", dev_name(i2c->dev)); > + > + init.name = name; > + init.ops = 
&spacemit_i2c_clk_ops; > + init.parent_data = (struct clk_parent_data[]) { > + { .fw_name = "func" }, > + }; > + init.num_parents = 1; > + init.flags = 0; > + > + i2c->scl_clk_hw.init = &init; > + > + return devm_clk_register(i2c->dev, &i2c->scl_clk_hw); > +} > + > static void spacemit_i2c_reset(struct spacemit_i2c_dev *i2c) > { > writel(SPACEMIT_CR_UR, i2c->base + SPACEMIT_ICR); > @@ -224,7 +352,7 @@ static void spacemit_i2c_init(struct spacemit_i2c_dev *i2c) > */ > val |= SPACEMIT_CR_DRFIE; > > - if (i2c->clock_freq == SPACEMIT_I2C_MAX_FAST_MODE_FREQ) > + if (i2c->mode == SPACEMIT_MODE_FAST) > val |= SPACEMIT_CR_MODE_FAST; > > /* disable response to general call */ > @@ -519,14 +647,15 @@ static int spacemit_i2c_probe(struct platform_device *pdev) > dev_warn(dev, "failed to read clock-frequency property: %d\n", ret); > > /* For now, this driver doesn't support high-speed. */ > - if (!i2c->clock_freq || i2c->clock_freq > SPACEMIT_I2C_MAX_FAST_MODE_FREQ) { > - dev_warn(dev, "unsupported clock frequency %u; using %u\n", > - i2c->clock_freq, SPACEMIT_I2C_MAX_FAST_MODE_FREQ); > + if (i2c->clock_freq > SPACEMIT_I2C_MAX_STANDARD_MODE_FREQ && > + i2c->clock_freq <= SPACEMIT_I2C_MAX_FAST_MODE_FREQ) { > + i2c->mode = SPACEMIT_MODE_FAST; > + } else if (i2c->clock_freq && i2c->clock_freq <= SPACEMIT_I2C_MAX_STANDARD_MODE_FREQ) { > + i2c->mode = SPACEMIT_MODE_STANDARD; > + } else { > + dev_warn(i2c->dev, "invalid clock-frequency, using fast mode"); > + i2c->mode = SPACEMIT_MODE_FAST; > i2c->clock_freq = SPACEMIT_I2C_MAX_FAST_MODE_FREQ; > - } else if (i2c->clock_freq < SPACEMIT_I2C_MAX_STANDARD_MODE_FREQ) { > - dev_warn(dev, "unsupported clock frequency %u; using %u\n", > - i2c->clock_freq, SPACEMIT_I2C_MAX_STANDARD_MODE_FREQ); > - i2c->clock_freq = SPACEMIT_I2C_MAX_STANDARD_MODE_FREQ; > } > > i2c->dev = &pdev->dev; > @@ -548,10 +677,33 @@ static int spacemit_i2c_probe(struct platform_device *pdev) > if (IS_ERR(clk)) > return dev_err_probe(dev, PTR_ERR(clk), "failed to enable 
func clock"); > > + i2c->scl_clk = spacemit_i2c_register_scl_clk(i2c, clk); > + if (IS_ERR(i2c->scl_clk)) > + return dev_err_probe(&pdev->dev, PTR_ERR(i2c->scl_clk), > + "failed to register scl clock\n"); > + > clk = devm_clk_get_enabled(dev, "bus"); > if (IS_ERR(clk)) > return dev_err_probe(dev, PTR_ERR(clk), "failed to enable bus clock"); > > + ret = clk_set_rate_exclusive(i2c->scl_clk, i2c->clock_freq); > + if (ret) > + return dev_err_probe(&pdev->dev, ret, "failed to set exclusive rate for SCL clock"); > + > + ret = devm_add_action_or_reset(dev, spacemit_i2c_scl_clk_exclusive_put, i2c); > + if (ret) > + return dev_err_probe(&pdev->dev, ret, > + "failed to register cleanup action for exclusive SCL clock rate"); > + > + ret = clk_prepare_enable(i2c->scl_clk); > + if (ret) > + return dev_err_probe(&pdev->dev, ret, "failed to prepare and enable clock"); > + > + ret = devm_add_action_or_reset(dev, spacemit_i2c_scl_clk_disable_unprepare, i2c); > + if (ret) > + return dev_err_probe(&pdev->dev, ret, > + "failed to register cleanup action for clk disable and unprepare"); > + > spacemit_i2c_reset(i2c); > > i2c_set_adapdata(&i2c->adapt, i2c); > > --- > base-commit: 733923397fd95405a48f165c9b1fbc8c4b0a4681 > change-id: 20250709-k1-i2c-ilcr-ea347e0850a4 > > Best regards, > -- > Troy Mitchell > From fustini at kernel.org Wed Sep 10 23:15:29 2025 From: fustini at kernel.org (Drew Fustini) Date: Wed, 10 Sep 2025 23:15:29 -0700 Subject: [PATCH 1/2] RISC-V: Detect the Ssqosid extension In-Reply-To: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> References: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> Message-ID: <20250910-ssqosid-v6-17-rc5-v1-1-72cb8f144615@kernel.org> Ssqosid is the RISC-V Quality-of-Service (QoS) Identifiers specification which defines the Supervisor Resource Management Configuration (srmcfg) register. 
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0 Signed-off-by: Kornel Dulęba [fustini: rebase on v6.17-rc5] Signed-off-by: Drew Fustini --- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a344c33a73647351ac02a94e42981..b4239f4f092d036ee3d037177b990e317d34a77f 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_SSQOSID 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572e071fb22851161bd079ef3158b7c..e202564f6f7b550f3b44a0826b5a67d5c4ebee96 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -533,6 +533,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), __RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts), + __RISCV_ISA_EXT_DATA(ssqosid, RISCV_ISA_EXT_SSQOSID), __RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC), __RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE), __RISCV_ISA_EXT_DATA_VALIDATE(svadu, RISCV_ISA_EXT_SVADU, riscv_ext_svadu_validate), -- 2.34.1 From fustini at kernel.org Wed Sep 10 23:15:30 2025 From: fustini at kernel.org (Drew Fustini) Date: Wed, 10 Sep 2025 23:15:30 -0700 Subject: [PATCH 2/2] RISC-V: Add support for srmcfg CSR from Ssqosid ext In-Reply-To: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> References: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> Message-ID: <20250910-ssqosid-v6-17-rc5-v1-2-72cb8f144615@kernel.org> Add support for the srmcfg CSR defined in the Ssqosid ISA extension (Supervisor-mode Quality of Service ID).
The CSR contains two fields: - Resource Control ID (RCID) used to determine resource allocation - Monitoring Counter ID (MCID) used to track resource usage Requests from a hart to shared resources like cache will be tagged with these IDs. This allows the usage of shared resources to be associated with the task currently running on the hart. A srmcfg field is added to thread_struct and has the same format as the srmcfg CSR. This allows the scheduler to set the hart's srmcfg CSR to contain the RCID and MCID for the task that is being scheduled in. The srmcfg CSR is only written to if the thread_struct.srmcfg is different from the current value of the CSR. A per-cpu variable cpu_srmcfg is used to mirror the state of the CSR. This is because access to L1D hot memory should be several times faster than a CSR read. Also, in the case of virtualization, accesses to this CSR are trapped in the hypervisor. Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0 Co-developed-by: Kornel Dulęba Signed-off-by: Kornel Dulęba [fustini: rename csr to srmcfg, refactor switch_to, rebase on v6.17-rc5] Signed-off-by: Drew Fustini --- MAINTAINERS | 6 ++++++ arch/riscv/Kconfig | 17 ++++++++++++++++ arch/riscv/include/asm/csr.h | 8 ++++++++ arch/riscv/include/asm/processor.h | 3 +++ arch/riscv/include/asm/qos.h | 41 ++++++++++++++++++++++++++++++++++++++ arch/riscv/include/asm/switch_to.h | 3 +++ 6 files changed, 78 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index cd7ff55b5d321752ac44c91d2d7e74de28e08960..02a71e4b4a8f045be03f9c77a5d2314ee61d29f0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21729,6 +21729,12 @@ F: drivers/perf/riscv_pmu.c F: drivers/perf/riscv_pmu_legacy.c F: drivers/perf/riscv_pmu_sbi.c +RISC-V QOS RESCTRL SUPPORT +M: Drew Fustini +L: linux-riscv at lists.infradead.org +S: Supported +F: arch/riscv/include/asm/qos.h + RISC-V SPACEMIT SoC Support M: Yixun Lan L: linux-riscv at lists.infradead.org diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 
51dcd8eaa24356d947ebe0f1c4a701a3cfc6b757..9b09a7aad29621d99f14d414751e67a43cbdad3a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -605,6 +605,23 @@ config RISCV_ISA_SVNAPOT If you don't know what to do here, say Y. +config RISCV_ISA_SSQOSID + bool "Ssqosid extension support for supervisor mode Quality of Service ID" + default y + help + Adds support for the Ssqosid ISA extension (Supervisor-mode + Quality of Service ID). + + Ssqosid defines the srmcfg CSR which allows the system to tag the + running process with an RCID (Resource Control ID) and MCID + (Monitoring Counter ID). The RCID is used to determine resource + allocation. The MCID is used to track resource usage in event + counters. + + For example, a cache controller may use the RCID to apply a + cache partitioning scheme and use the MCID to track how much + cache a process, or a group of processes, is using. + config RISCV_ISA_SVPBMT bool "Svpbmt extension support for supervisor mode page-based memory types" depends on 64BIT && MMU diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 6fed42e377059c7004ecd3c29eb36d5c0e36a656..ecc57492264c2a2616e1e147796157512da70e87 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -75,6 +75,13 @@ #define SATP_ASID_MASK _AC(0xFFFF, UL) #endif +/* SRMCFG fields */ +#define SRMCFG_RCID_MASK _AC(0x00000FFF, UL) +#define SRMCFG_MCID_MASK SRMCFG_RCID_MASK +#define SRMCFG_MCID_SHIFT 16 +#define SRMCFG_MASK ((SRMCFG_MCID_MASK << SRMCFG_MCID_SHIFT) | \ + SRMCFG_RCID_MASK) + /* Exception cause high bit - is an interrupt if set */ #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) @@ -317,6 +324,7 @@ #define CSR_STVAL 0x143 #define CSR_SIP 0x144 #define CSR_SATP 0x180 +#define CSR_SRMCFG 0x181 #define CSR_STIMECMP 0x14D #define CSR_STIMECMPH 0x15D diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index 
24d3af4d3807e37396744ef26533ac4661abcb4f..cc9548b85d363ecbc3c416e52906107a73e6053d 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -122,6 +122,9 @@ struct thread_struct { /* A forced icache flush is not needed if migrating to the previous cpu. */ unsigned int prev_cpu; #endif +#ifdef CONFIG_RISCV_ISA_SSQOSID + u32 srmcfg; +#endif }; /* Whitelist the fstate from the task_struct for hardened usercopy */ diff --git a/arch/riscv/include/asm/qos.h b/arch/riscv/include/asm/qos.h new file mode 100644 index 0000000000000000000000000000000000000000..418ac8383fb7808c6e3f421a8d4e9389b702a264 --- /dev/null +++ b/arch/riscv/include/asm/qos.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_RISCV_QOS_H +#define _ASM_RISCV_QOS_H + +#ifdef CONFIG_RISCV_ISA_SSQOSID + +#include +#include + +#include +#include +#include + +/* cached value of srmcfg csr for each cpu */ +static DEFINE_PER_CPU(u32, cpu_srmcfg); + +static inline void __switch_to_srmcfg(struct task_struct *next) +{ + u32 *cpu_srmcfg_ptr = this_cpu_ptr(&cpu_srmcfg); + u32 thread_srmcfg; + + thread_srmcfg = READ_ONCE(next->thread.srmcfg); + + if (thread_srmcfg != *cpu_srmcfg_ptr) { + *cpu_srmcfg_ptr = thread_srmcfg; + csr_write(CSR_SRMCFG, thread_srmcfg); + } +} + +static __always_inline bool has_srmcfg(void) +{ + return riscv_has_extension_likely(RISCV_ISA_EXT_SSQOSID); +} + +#else /* ! 
CONFIG_RISCV_ISA_SSQOSID */ + +static __always_inline bool has_srmcfg(void) { return false; } +#define __switch_to_srmcfg(__next) do { } while (0) + +#endif /* CONFIG_RISCV_ISA_SSQOSID */ +#endif /* _ASM_RISCV_QOS_H */ diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h index 0e71eb82f920cac2f14bb626879bb219a2f247cc..a684a3795d3d7f5e027ec0a83c30afd1a18d7228 100644 --- a/arch/riscv/include/asm/switch_to.h +++ b/arch/riscv/include/asm/switch_to.h @@ -14,6 +14,7 @@ #include #include #include +#include #ifdef CONFIG_FPU extern void __fstate_save(struct task_struct *save_to); @@ -119,6 +120,8 @@ do { \ __switch_to_fpu(__prev, __next); \ if (has_vector() || has_xtheadvector()) \ __switch_to_vector(__prev, __next); \ + if (has_srmcfg()) \ + __switch_to_srmcfg(__next); \ if (switch_to_should_flush_icache(__next)) \ local_flush_icache_all(); \ __switch_to_envcfg(__next); \ -- 2.34.1 From fustini at kernel.org Wed Sep 10 23:15:28 2025 From: fustini at kernel.org (Drew Fustini) Date: Wed, 10 Sep 2025 23:15:28 -0700 Subject: [PATCH 0/2] RISC-V: Detect Ssqosid extension and handle srmcfg CSR Message-ID: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> This series adds support for the RISC-V Quality-of-Service Identifiers (Ssqosid) extension [1] which adds the srmcfg register. This CSR configures a hart with two identifiers: a Resource Control ID (RCID) and a Monitoring Counter ID (MCID). These identifiers accompany each request issued by the hart to shared resource controllers. Background on RISC-V QoS: The Ssqosid extension is used by the RISC-V Capacity and Bandwidth Controller QoS Register Interface (CBQRI) specification [2]. QoS in this context is concerned with shared resources on an SoC such as cache capacity and memory bandwidth. Intel and AMD already have QoS features on x86 and ARM has MPAM. There is an existing user interface in Linux: the resctrl virtual filesystem [3].
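
For illustration only, here is a small user-space C sketch of how the two identifiers mentioned above (RCID and MCID) could be packed into a single srmcfg value. The mask and shift values mirror the SRMCFG_* definitions that patch 2/2 adds to csr.h; the helper names srmcfg_pack(), srmcfg_rcid() and srmcfg_mcid() are hypothetical and are not part of this series.

```c
#include <stdint.h>

/* Field layout as defined by the csr.h hunk in patch 2/2. */
#define SRMCFG_RCID_MASK  UINT32_C(0x00000FFF)
#define SRMCFG_MCID_MASK  SRMCFG_RCID_MASK
#define SRMCFG_MCID_SHIFT 16

/*
 * Hypothetical helper (not in the series): build a srmcfg value from an
 * RCID and an MCID. Bits outside each 12-bit field are silently dropped.
 */
static uint32_t srmcfg_pack(uint32_t rcid, uint32_t mcid)
{
	return ((mcid & SRMCFG_MCID_MASK) << SRMCFG_MCID_SHIFT) |
	       (rcid & SRMCFG_RCID_MASK);
}

/* Hypothetical helper: extract the RCID from the low bits. */
static uint32_t srmcfg_rcid(uint32_t srmcfg)
{
	return srmcfg & SRMCFG_RCID_MASK;
}

/* Hypothetical helper: extract the MCID, which starts at bit 16. */
static uint32_t srmcfg_mcid(uint32_t srmcfg)
{
	return (srmcfg >> SRMCFG_MCID_SHIFT) & SRMCFG_MCID_MASK;
}
```

With these definitions, srmcfg_pack(5, 3) gives 0x00030005: RCID 5 sits in the low bits and MCID 3 starts at bit 16. Whether a real RCID or MCID is valid on a given system depends on the resource controllers, which this sketch does not model.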
The srmcfg CSR provides a mechanism by which a software workload (e.g. a process or a set of processes) can be associated with an RCID and an MCID. CBQRI defines operations to configure resource usage limits, in the form of capacity or bandwidth. CBQRI also defines operations to configure counters to track the resource utilization. Goal for this series: These two patches are taken from the implementation of resctrl support for RISC-V CBQRI. Please refer to the proof-of-concept RFC [4] for details on the resctrl implementation. More recently, I have rebased the CBQRI support on mainline [5]. Big thanks to James Morse for the tireless work to extract resctrl from arch/x86 and make it available to all archs. I think it makes sense to first focus on the detection of Ssqosid and handling of srmcfg when switching tasks. It has been tested against a QEMU branch that implements Ssqosid and CBQRI [6]. A test driver [7] was used to set srmcfg for the current process. This allows switch_to to be tested without resctrl. Changes from RFC v2: - Rename all instances of the sqoscfg CSR to srmcfg to match the ratified Ssqosid spec - RFC v2: https://lore.kernel.org/linux-riscv/20230430-riscv-cbqri-rfc-v2-v2-0-8e3725c4a473 at baylibre.com/ Changes from RFC v1: - change DEFINE_PER_CPU to DECLARE_PER_CPU for cpu_sqoscfg in qos.h to prevent linking error about multiple definition. 
Move DEFINE_PER_CPU for cpu_sqoscfg into qos.c - renamed qos prefix in function names to sqoscfg to be less generic - handle sqoscfg the same way has_vector and has_fpu are handled in the vector patch series - RFC v1: https://lore.kernel.org/linux-riscv/20230410043646.3138446-1-dfustini at baylibre.com/ [1] https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0 [2] https://github.com/riscv-non-isa/riscv-cbqri/releases/tag/v1.0 [3] https://docs.kernel.org/filesystems/resctrl.html [4] https://lore.kernel.org/linux-riscv/20230419111111.477118-1-dfustini at baylibre.com/ [5] https://github.com/tt-fustini/linux/tree/b4/cbqri-v6-17-rc5 [6] https://github.com/tt-fustini/qemu/tree/riscv-cbqri-rqsc-pptt [7] https://github.com/tt-fustini/linux/tree/ssqosid-v6-17-rc5-debug Signed-off-by: Drew Fustini --- Drew Fustini (2): RISC-V: Detect the Ssqosid extension RISC-V: Add support for srmcfg CSR from Ssqosid ext MAINTAINERS | 6 ++++++ arch/riscv/Kconfig | 17 ++++++++++++++++ arch/riscv/include/asm/csr.h | 8 ++++++++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/processor.h | 3 +++ arch/riscv/include/asm/qos.h | 41 ++++++++++++++++++++++++++++++++++++++ arch/riscv/include/asm/switch_to.h | 3 +++ arch/riscv/kernel/cpufeature.c | 1 + 8 files changed, 80 insertions(+) --- base-commit: 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c change-id: 20250909-ssqosid-v6-17-rc5-fcc0b68a70a2 Best regards, -- Drew Fustini From jgross at suse.com Wed Sep 10 23:34:19 2025 From: jgross at suse.com (Juergen Gross) Date: Thu, 11 Sep 2025 08:34:19 +0200 Subject: [PATCH 00/14] paravirt: cleanup and reorg Message-ID: <20250911063433.13783-1-jgross@suse.com> Some cleanups and reorg of paravirt code and headers: - The first 2 patches should be not controversial at all, as they remove just some no longer needed #include and struct forward declarations. 
- The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has no real value, as it just changes a crash to a BUG() (the stack trace will basically be the same). As the maintainer of the main paravirt user (Xen) I have never seen this crash/BUG() to happen. - The 4th patch is just a movement of code. - I don't know for what reason asm/paravirt_api_clock.h was added, as all archs supporting it do it exactly in the same way. Patch 5 is removing it. - Patches 6-12 are streamlining the paravirt clock interfaces by using a common implementation across architectures where possible and by moving the related code into common sched code, as this is where it should live. - Patches 13+14 are more like RFC material: patch 13 is doing some preparation work to enable patch 14 to move all spinlock related paravirt functions into qspinlock.h. If this approach is accepted, I'd like to continue with this work by moving most (or all?) paravirt functions from paravirt.h into the headers where their native counterparts are defined. This is meant to keep the native and paravirt function definitions together in one place and hopefully to be able to reduce the include hell with paravirt. 
Juergen Gross (14): x86/paravirt: remove not needed includes of paravirt.h x86/paravirt: remove some unneeded struct declarations x86/paravirt: remove PARAVIRT_DEBUG config option x86/paravirt: move thunk macros to paravirt_types.h paravirt: remove asm/paravirt_api_clock.h sched: move clock related paravirt code to kernel/sched arm/paravirt: use common code for paravirt_steal_clock() arm64/paravirt: use common code for paravirt_steal_clock() loongarch/paravirt: use common code for paravirt_steal_clock() riscv/paravirt: use common code for paravirt_steal_clock() x86/paravirt: use common code for paravirt_steal_clock() x86/paravirt: move paravirt_sched_clock() related code into tsc.c x86/paravirt: allow pv-calls outside paravirt.h x86/pvlocks: move paravirt spinlock functions into qspinlock.h arch/Kconfig | 3 + arch/arm/Kconfig | 1 + arch/arm/include/asm/paravirt.h | 22 --- arch/arm/include/asm/paravirt_api_clock.h | 1 - arch/arm/kernel/Makefile | 1 - arch/arm/kernel/paravirt.c | 23 --- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/paravirt.h | 14 -- arch/arm64/include/asm/paravirt_api_clock.h | 1 - arch/arm64/kernel/paravirt.c | 11 +- arch/loongarch/Kconfig | 1 + arch/loongarch/include/asm/paravirt.h | 13 -- .../include/asm/paravirt_api_clock.h | 1 - arch/loongarch/kernel/paravirt.c | 10 +- arch/powerpc/include/asm/paravirt.h | 3 - arch/powerpc/include/asm/paravirt_api_clock.h | 2 - arch/powerpc/platforms/pseries/setup.c | 4 +- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/paravirt.h | 14 -- arch/riscv/include/asm/paravirt_api_clock.h | 1 - arch/riscv/kernel/paravirt.c | 11 +- arch/x86/Kconfig | 8 +- arch/x86/entry/entry_64.S | 1 - arch/x86/entry/vsyscall/vsyscall_64.c | 1 - arch/x86/hyperv/hv_spinlock.c | 1 - arch/x86/include/asm/apic.h | 4 - arch/x86/include/asm/highmem.h | 1 - arch/x86/include/asm/mmu_context.h | 1 - arch/x86/include/asm/mshyperv.h | 1 - arch/x86/include/asm/paravirt.h | 166 ------------------ arch/x86/include/asm/paravirt_api_clock.h | 1 - 
arch/x86/include/asm/paravirt_types.h | 82 +++++++-- arch/x86/include/asm/pgtable_32.h | 1 - arch/x86/include/asm/qspinlock.h | 49 +++++- arch/x86/include/asm/spinlock.h | 1 - arch/x86/include/asm/timer.h | 1 + arch/x86/include/asm/tlbflush.h | 4 - arch/x86/kernel/apm_32.c | 1 - arch/x86/kernel/callthunks.c | 1 - arch/x86/kernel/cpu/bugs.c | 1 - arch/x86/kernel/cpu/vmware.c | 1 + arch/x86/kernel/kvm.c | 1 + arch/x86/kernel/kvmclock.c | 1 + arch/x86/kernel/paravirt.c | 16 -- arch/x86/kernel/tsc.c | 10 +- arch/x86/kernel/vsmp_64.c | 1 - arch/x86/kernel/x86_init.c | 1 - arch/x86/lib/cache-smp.c | 1 - arch/x86/mm/init.c | 1 - arch/x86/xen/spinlock.c | 1 - arch/x86/xen/time.c | 2 + drivers/clocksource/hyperv_timer.c | 2 + drivers/xen/time.c | 2 +- include/linux/sched/cputime.h | 18 ++ kernel/sched/core.c | 5 + kernel/sched/cputime.c | 13 ++ kernel/sched/sched.h | 3 +- 57 files changed, 182 insertions(+), 362 deletions(-) delete mode 100644 arch/arm/include/asm/paravirt.h delete mode 100644 arch/arm/include/asm/paravirt_api_clock.h delete mode 100644 arch/arm/kernel/paravirt.c delete mode 100644 arch/arm64/include/asm/paravirt_api_clock.h delete mode 100644 arch/loongarch/include/asm/paravirt_api_clock.h delete mode 100644 arch/powerpc/include/asm/paravirt_api_clock.h delete mode 100644 arch/riscv/include/asm/paravirt_api_clock.h delete mode 100644 arch/x86/include/asm/paravirt_api_clock.h -- 2.51.0 From jgross at suse.com Wed Sep 10 23:34:24 2025 From: jgross at suse.com (Juergen Gross) Date: Thu, 11 Sep 2025 08:34:24 +0200 Subject: [PATCH 05/14] paravirt: remove asm/paravirt_api_clock.h In-Reply-To: <20250911063433.13783-1-jgross@suse.com> References: <20250911063433.13783-1-jgross@suse.com> Message-ID: <20250911063433.13783-6-jgross@suse.com> All architectures supporting CONFIG_PARAVIRT share the same contents of asm/paravirt_api_clock.h: #include So remove all incarnations of asm/paravirt_api_clock.h and remove the only place where it is included, as there 
asm/paravirt.h is included anyway. Signed-off-by: Juergen Gross --- arch/arm/include/asm/paravirt_api_clock.h | 1 - arch/arm64/include/asm/paravirt_api_clock.h | 1 - arch/loongarch/include/asm/paravirt_api_clock.h | 1 - arch/powerpc/include/asm/paravirt_api_clock.h | 2 -- arch/riscv/include/asm/paravirt_api_clock.h | 1 - arch/x86/include/asm/paravirt_api_clock.h | 1 - kernel/sched/sched.h | 1 - 7 files changed, 8 deletions(-) delete mode 100644 arch/arm/include/asm/paravirt_api_clock.h delete mode 100644 arch/arm64/include/asm/paravirt_api_clock.h delete mode 100644 arch/loongarch/include/asm/paravirt_api_clock.h delete mode 100644 arch/powerpc/include/asm/paravirt_api_clock.h delete mode 100644 arch/riscv/include/asm/paravirt_api_clock.h delete mode 100644 arch/x86/include/asm/paravirt_api_clock.h diff --git a/arch/arm/include/asm/paravirt_api_clock.h b/arch/arm/include/asm/paravirt_api_clock.h deleted file mode 100644 index 65ac7cee0dad..000000000000 --- a/arch/arm/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1 +0,0 @@ -#include diff --git a/arch/arm64/include/asm/paravirt_api_clock.h b/arch/arm64/include/asm/paravirt_api_clock.h deleted file mode 100644 index 65ac7cee0dad..000000000000 --- a/arch/arm64/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1 +0,0 @@ -#include diff --git a/arch/loongarch/include/asm/paravirt_api_clock.h b/arch/loongarch/include/asm/paravirt_api_clock.h deleted file mode 100644 index 65ac7cee0dad..000000000000 --- a/arch/loongarch/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1 +0,0 @@ -#include diff --git a/arch/powerpc/include/asm/paravirt_api_clock.h b/arch/powerpc/include/asm/paravirt_api_clock.h deleted file mode 100644 index d25ca7ac57c7..000000000000 --- a/arch/powerpc/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1,2 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#include diff --git a/arch/riscv/include/asm/paravirt_api_clock.h b/arch/riscv/include/asm/paravirt_api_clock.h deleted file mode 100644 index 
65ac7cee0dad..000000000000 --- a/arch/riscv/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1 +0,0 @@ -#include diff --git a/arch/x86/include/asm/paravirt_api_clock.h b/arch/x86/include/asm/paravirt_api_clock.h deleted file mode 100644 index 65ac7cee0dad..000000000000 --- a/arch/x86/include/asm/paravirt_api_clock.h +++ /dev/null @@ -1 +0,0 @@ -#include diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index be9745d104f7..6442441b46d7 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -84,7 +84,6 @@ struct cpuidle_state; #ifdef CONFIG_PARAVIRT # include -# include #endif #include -- 2.51.0 From jgross at suse.com Wed Sep 10 23:34:25 2025 From: jgross at suse.com (Juergen Gross) Date: Thu, 11 Sep 2025 08:34:25 +0200 Subject: [PATCH 06/14] sched: move clock related paravirt code to kernel/sched In-Reply-To: <20250911063433.13783-1-jgross@suse.com> References: <20250911063433.13783-1-jgross@suse.com> Message-ID: <20250911063433.13783-7-jgross@suse.com> Paravirt clock related functions are available in multiple archs. In order to share the common parts, move the common static keys to kernel/sched/ and remove them from the arch specific files. Make a common paravirt_steal_clock() implementation available in kernel/sched/cputime.c, guarding it with a new config option CONFIG_HAVE_PV_STEAL_CLOCK_GEN, which can be selected by an arch in case it wants to use that common variant.
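The common paravirt_steal_clock() described above can be pictured with a small userspace model. This is only a sketch: a plain function pointer stands in for the pv_steal_clock static call, pv_steal_clock_update() stands in for static_call_update(), and demo_hv_steal_clock() is a hypothetical hypervisor backend invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*steal_clock_fn)(int cpu);

/* Default used when no hypervisor backend registers: nothing is stolen. */
static uint64_t native_steal_clock(int cpu)
{
	(void)cpu;
	return 0;
}

/* Plain function pointer standing in for the pv_steal_clock static call. */
static steal_clock_fn pv_steal_clock = native_steal_clock;

/* Stand-in for static_call_update(pv_steal_clock, fn) in a backend. */
static void pv_steal_clock_update(steal_clock_fn fn)
{
	assert(fn != NULL);
	pv_steal_clock = fn;
}

/* The one shared accessor that every arch would use. */
static uint64_t paravirt_steal_clock(int cpu)
{
	return pv_steal_clock(cpu);
}

/* Hypothetical backend: reports (cpu + 1) * 1000 ns of stolen time. */
static uint64_t demo_hv_steal_clock(int cpu)
{
	return (uint64_t)(cpu + 1) * 1000;
}
```

Before any backend registers, paravirt_steal_clock() returns 0 for every CPU. After pv_steal_clock_update(demo_hv_steal_clock), the same call goes to the backend, which is roughly the split between the shared default and the arch or hypervisor specific part that the patch aims for.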
Signed-off-by: Juergen Gross --- arch/Kconfig | 3 +++ arch/arm/include/asm/paravirt.h | 4 ---- arch/arm/kernel/paravirt.c | 3 --- arch/arm64/include/asm/paravirt.h | 4 ---- arch/arm64/kernel/paravirt.c | 4 +--- arch/loongarch/include/asm/paravirt.h | 3 --- arch/loongarch/kernel/paravirt.c | 3 +-- arch/powerpc/include/asm/paravirt.h | 3 --- arch/powerpc/platforms/pseries/setup.c | 4 +--- arch/riscv/include/asm/paravirt.h | 4 ---- arch/riscv/kernel/paravirt.c | 4 +--- arch/x86/include/asm/paravirt.h | 4 ---- arch/x86/kernel/cpu/vmware.c | 1 + arch/x86/kernel/kvm.c | 1 + arch/x86/kernel/paravirt.c | 3 --- drivers/xen/time.c | 1 + include/linux/sched/cputime.h | 18 ++++++++++++++++++ kernel/sched/core.c | 5 +++++ kernel/sched/cputime.c | 13 +++++++++++++ kernel/sched/sched.h | 2 +- 20 files changed, 47 insertions(+), 40 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index d1b4ffd6e085..7921be052472 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -1003,6 +1003,9 @@ config HAVE_IRQ_TIME_ACCOUNTING Archs need to ensure they use a high enough resolution clock to support irq time accounting and then call enable_sched_clock_irqtime(). 
+config HAVE_PV_STEAL_CLOCK_GEN + bool + config HAVE_MOVE_PUD bool help diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h index 95d5b0d625cd..69da4bdcf856 100644 --- a/arch/arm/include/asm/paravirt.h +++ b/arch/arm/include/asm/paravirt.h @@ -5,10 +5,6 @@ #ifdef CONFIG_PARAVIRT #include -struct static_key; -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; - u64 dummy_steal_clock(int cpu); DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c index 7dd9806369fb..3895a5578852 100644 --- a/arch/arm/kernel/paravirt.c +++ b/arch/arm/kernel/paravirt.c @@ -12,9 +12,6 @@ #include #include -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; - static u64 native_steal_clock(int cpu) { return 0; diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h index 9aa193e0e8f2..c9f7590baacb 100644 --- a/arch/arm64/include/asm/paravirt.h +++ b/arch/arm64/include/asm/paravirt.h @@ -5,10 +5,6 @@ #ifdef CONFIG_PARAVIRT #include -struct static_key; -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; - u64 dummy_steal_clock(int cpu); DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index aa718d6a9274..943b60ce12f4 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -19,14 +19,12 @@ #include #include #include +#include #include #include #include -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; - static u64 native_steal_clock(int cpu) { return 0; diff --git a/arch/loongarch/include/asm/paravirt.h b/arch/loongarch/include/asm/paravirt.h index 3f4323603e6a..d219ea0d98ac 100644 --- a/arch/loongarch/include/asm/paravirt.h +++ b/arch/loongarch/include/asm/paravirt.h @@ -5,9 +5,6 @@ #ifdef 
CONFIG_PARAVIRT #include -struct static_key; -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; u64 dummy_steal_clock(int cpu); DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); diff --git a/arch/loongarch/kernel/paravirt.c b/arch/loongarch/kernel/paravirt.c index b1b51f920b23..8caaa94fed1a 100644 --- a/arch/loongarch/kernel/paravirt.c +++ b/arch/loongarch/kernel/paravirt.c @@ -6,11 +6,10 @@ #include #include #include +#include #include static int has_steal_clock; -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64); DEFINE_STATIC_KEY_FALSE(virt_spin_lock_key); diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index b78b82d66057..92343a23ad15 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -23,9 +23,6 @@ static inline bool is_shared_processor(void) } #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; - u64 pseries_paravirt_steal_clock(int cpu); static inline u64 paravirt_steal_clock(int cpu) diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index b10a25325238..50b26ed8432d 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -83,9 +84,6 @@ DEFINE_STATIC_KEY_FALSE(shared_processor); EXPORT_SYMBOL(shared_processor); #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; - static bool steal_acc = true; static int __init parse_no_stealacc(char *arg) { diff --git a/arch/riscv/include/asm/paravirt.h b/arch/riscv/include/asm/paravirt.h index c0abde70fc2c..17e5e39c72c0 100644 --- 
a/arch/riscv/include/asm/paravirt.h +++ b/arch/riscv/include/asm/paravirt.h @@ -5,10 +5,6 @@ #ifdef CONFIG_PARAVIRT #include -struct static_key; -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; - u64 dummy_steal_clock(int cpu); DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c index fa6b0339a65d..d3c334f16172 100644 --- a/arch/riscv/kernel/paravirt.c +++ b/arch/riscv/kernel/paravirt.c @@ -16,15 +16,13 @@ #include #include #include +#include #include #include #include #include -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; - static u64 native_steal_clock(int cpu) { return 0; diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 0d1e611f619c..491cb7e037bf 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -34,10 +34,6 @@ static __always_inline u64 paravirt_sched_clock(void) return static_call(pv_sched_clock)(); } -struct static_key; -extern struct static_key paravirt_steal_enabled; -extern struct static_key paravirt_steal_rq_enabled; - __visible void __native_queued_spin_unlock(struct qspinlock *lock); bool pv_is_native_spin_unlock(void); __visible bool __native_vcpu_is_preempted(long cpu); diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c index cb3f900c46fc..a3e6936839b1 100644 --- a/arch/x86/kernel/cpu/vmware.c +++ b/arch/x86/kernel/cpu/vmware.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 8ae750cde0c6..a23211eaaeed 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index ab3e172dcc69..a3ba4747be1c 100644 --- a/arch/x86/kernel/paravirt.c +++ 
b/arch/x86/kernel/paravirt.c @@ -60,9 +60,6 @@ void __init native_pv_lock_init(void) static_branch_enable(&virt_spin_lock_key); } -struct static_key paravirt_steal_enabled; -struct static_key paravirt_steal_rq_enabled; - static u64 native_steal_clock(int cpu) { return 0; diff --git a/drivers/xen/time.c b/drivers/xen/time.c index 5683383d2305..d360ded2ef39 100644 --- a/drivers/xen/time.c +++ b/drivers/xen/time.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h index 5f8fd5b24a2e..e90efaf6d26e 100644 --- a/include/linux/sched/cputime.h +++ b/include/linux/sched/cputime.h @@ -2,6 +2,7 @@ #ifndef _LINUX_SCHED_CPUTIME_H #define _LINUX_SCHED_CPUTIME_H +#include #include /* @@ -180,4 +181,21 @@ static inline void prev_cputime_init(struct prev_cputime *prev) extern unsigned long long task_sched_runtime(struct task_struct *task); +#ifdef CONFIG_PARAVIRT +struct static_key; +extern struct static_key paravirt_steal_enabled; +extern struct static_key paravirt_steal_rq_enabled; + +#ifdef CONFIG_HAVE_PV_STEAL_CLOCK_GEN +u64 dummy_steal_clock(int cpu); + +DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); + +static inline u64 paravirt_steal_clock(int cpu) +{ + return static_call(pv_steal_clock)(cpu); +} +#endif +#endif + #endif /* _LINUX_SCHED_CPUTIME_H */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index be00629f0ba4..e723226e4e11 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -767,6 +767,11 @@ struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf) * RQ-clock updating methods: */ +/* Use CONFIG_PARAVIRT as this will avoid more #ifdef in arch code. 
*/ +#ifdef CONFIG_PARAVIRT +struct static_key paravirt_steal_rq_enabled; +#endif + static void update_rq_clock_task(struct rq *rq, s64 delta) { /* diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 7097de2c8cda..ed8f71e08047 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -251,6 +251,19 @@ void __account_forceidle_time(struct task_struct *p, u64 delta) * ticks are not redelivered later. Due to that, this function may on * occasion account more time than the calling functions think elapsed. */ +#ifdef CONFIG_PARAVIRT +struct static_key paravirt_steal_enabled; + +#ifdef CONFIG_HAVE_PV_STEAL_CLOCK_GEN +static u64 native_steal_clock(int cpu) +{ + return 0; +} + +DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock); +#endif +#endif + static __always_inline u64 steal_account_process_time(u64 maxtime) { #ifdef CONFIG_PARAVIRT diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 6442441b46d7..fdf3021bdf7d 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -82,7 +82,7 @@ struct rt_rq; struct sched_group; struct cpuidle_state; -#ifdef CONFIG_PARAVIRT +#if defined(CONFIG_PARAVIRT) && !defined(CONFIG_HAVE_PV_STEAL_CLOCK_GEN) # include #endif -- 2.51.0 From jgross at suse.com Wed Sep 10 23:34:29 2025 From: jgross at suse.com (Juergen Gross) Date: Thu, 11 Sep 2025 08:34:29 +0200 Subject: [PATCH 10/14] riscv/paravirt: use common code for paravirt_steal_clock() In-Reply-To: <20250911063433.13783-1-jgross@suse.com> References: <20250911063433.13783-1-jgross@suse.com> Message-ID: <20250911063433.13783-11-jgross@suse.com> Remove the arch specific variant of paravirt_steal_clock() and use the common one instead. 
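For context on what callers do with the common paravirt_steal_clock(), the following is a simplified userspace model of delta accounting against a cumulative steal counter. It is not the scheduler's code: account_steal() and prev_steal_time are names chosen for this sketch, a single CPU is assumed, and the cap by maxtime is modelled on the maxtime parameter of steal_account_process_time().

```c
#include <assert.h>
#include <stdint.h>

/* Steal time already accounted on the single simulated CPU, in ns. */
static uint64_t prev_steal_time;

/*
 * Take the current cumulative steal clock reading, account at most
 * maxtime ns of it, and return the amount accounted. Anything above
 * the cap stays pending and is picked up by a later call.
 */
static uint64_t account_steal(uint64_t steal_clock_now, uint64_t maxtime)
{
	uint64_t delta;

	/* The steal clock is assumed to never run backwards. */
	assert(steal_clock_now >= prev_steal_time);

	delta = steal_clock_now - prev_steal_time;
	if (delta > maxtime)
		delta = maxtime;

	prev_steal_time += delta;
	return delta;
}
```

Since only the cumulative reading is needed, the accounting side does not care which architecture or hypervisor supplied the value, which is why a single shared accessor is sufficient.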
Signed-off-by: Juergen Gross --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/paravirt.h | 10 ---------- arch/riscv/kernel/paravirt.c | 7 ------- 3 files changed, 1 insertion(+), 17 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 51dcd8eaa243..8a7573aebaca 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -1108,6 +1108,7 @@ config COMPAT config PARAVIRT bool "Enable paravirtualization code" depends on RISCV_SBI + select HAVE_PV_STEAL_CLOCK_GEN help This changes the kernel so it can modify itself when it is run under a hypervisor, potentially improving performance significantly diff --git a/arch/riscv/include/asm/paravirt.h b/arch/riscv/include/asm/paravirt.h index 17e5e39c72c0..c49c55b266f3 100644 --- a/arch/riscv/include/asm/paravirt.h +++ b/arch/riscv/include/asm/paravirt.h @@ -3,16 +3,6 @@ #define _ASM_RISCV_PARAVIRT_H #ifdef CONFIG_PARAVIRT -#include - -u64 dummy_steal_clock(int cpu); - -DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock); - -static inline u64 paravirt_steal_clock(int cpu) -{ - return static_call(pv_steal_clock)(cpu); -} int __init pv_time_init(void); diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c index d3c334f16172..5f56be79cd06 100644 --- a/arch/riscv/kernel/paravirt.c +++ b/arch/riscv/kernel/paravirt.c @@ -23,13 +23,6 @@ #include #include -static u64 native_steal_clock(int cpu) -{ - return 0; -} - -DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock); - static bool steal_acc = true; static int __init parse_no_stealacc(char *arg) { -- 2.51.0 From krzk at kernel.org Thu Sep 11 00:26:09 2025 From: krzk at kernel.org (Krzysztof Kozlowski) Date: Thu, 11 Sep 2025 09:26:09 +0200 Subject: [PATCH v3 1/2] ASoC: dt-bindings: Add bindings for SpacemiT K1 In-Reply-To: <20250911-k1-i2s-v3-1-57f173732f9c@linux.spacemit.com> References: <20250911-k1-i2s-v3-0-57f173732f9c@linux.spacemit.com> <20250911-k1-i2s-v3-1-57f173732f9c@linux.spacemit.com> Message-ID: 
<20250911-thundering-wildebeest-of-awe-eddf22@kuoka> On Thu, Sep 11, 2025 at 01:47:10PM +0800, Troy Mitchell wrote: > Add dt-binding for the i2s driver of SpacemiT's K1 SoC. > > Signed-off-by: Troy Mitchell > --- > .../devicetree/bindings/sound/spacemit,k1-i2s.yaml | 87 ++++++++++++++++++++++ > 1 file changed, 87 insertions(+) Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof From brgl at bgdev.pl Thu Sep 11 00:38:12 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Thu, 11 Sep 2025 09:38:12 +0200 Subject: [PATCH v2 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4 In-Reply-To: References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: On Wed, Sep 10, 2025 at 11:32 PM Linus Walleij wrote: > > On Wed, Sep 10, 2025 at 9:12 AM Bartosz Golaszewski wrote: > > > Here's the final part of the generic GPIO chip conversions. Once all the > > existing users are switched to the new API, the final patch in the > > series removes bgpio_init(), moves the gpio-mmio fields out of struct > > gpio_chip and into struct gpio_generic_chip and adjusts gpio-mmio.c to > > the new situation. > > > > Down the line we could probably improve gpio-mmio.c by using lock guards > > and replacing the - now obsolete - "bgpio" prefix with "gpio_generic" or > > something similar but this series is already big as is so I'm leaving > > that for the future. > > > > Tested in qemu on vexpress-a9. > > > > Signed-off-by: Bartosz Golaszewski > > The patch set is a beauty, hands down. > Reviewed-by: Linus Walleij > > I especially like where you caught local spinlocks being > (ab)used instead of the generic irqchip ones. > > I don't know about merging patch 15/15 into just the GPIO > tree, that can make things fail in other subsystems depending > on merge order into Torvalds tree or linux-next if your tree is > merged first.
> > I would merge the first 14 and keep the last for the later part > of the merge window when all other trees with conversions > are merged. > > (You probably already thought of this.) > > Yours, > Linus Walleij I already have both pinctrl and mfd changes in my tree from Lee's and your immutable branches. I pushed this into gpio/devel and it built just fine. Bart From peterz at infradead.org Thu Sep 11 00:48:17 2025 From: peterz at infradead.org (Peter Zijlstra) Date: Thu, 11 Sep 2025 09:48:17 +0200 Subject: [PATCH 00/14] paravirt: cleanup and reorg In-Reply-To: <20250911063433.13783-1-jgross@suse.com> References: <20250911063433.13783-1-jgross@suse.com> Message-ID: <20250911074817.GX3245006@noisy.programming.kicks-ass.net> On Thu, Sep 11, 2025 at 08:34:19AM +0200, Juergen Gross wrote: > Some cleanups and reorg of paravirt code and headers: > > - The first 2 patches should be not controversial at all, as they > remove just some no longer needed #include and struct forward > declarations. > > - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has > no real value, as it just changes a crash to a BUG() (the stack > trace will basically be the same). As the maintainer of the main > paravirt user (Xen) I have never seen this crash/BUG() to happen. > > - The 4th patch is just a movement of code. > > - I don't know for what reason asm/paravirt_api_clock.h was added, > as all archs supporting it do it exactly in the same way. Patch > 5 is removing it. > > - Patches 6-12 are streamlining the paravirt clock interfaces by > using a common implementation across architectures where possible > and by moving the related code into common sched code, as this is > where it should live. > > - Patches 13+14 are more like RFC material: patch 13 is doing some > preparation work to enable patch 14 to move all spinlock related > paravirt functions into qspinlock.h. If this approach is accepted, > I'd like to continue with this work by moving most (or all?) 
> paravirt functions from paravirt.h into the headers where their > native counterparts are defined. This is meant to keep the native > and paravirt function definitions together in one place and > hopefully to be able to reduce the include hell with paravirt. > > Juergen Gross (14): > x86/paravirt: remove not needed includes of paravirt.h > x86/paravirt: remove some unneeded struct declarations > x86/paravirt: remove PARAVIRT_DEBUG config option > x86/paravirt: move thunk macros to paravirt_types.h > paravirt: remove asm/paravirt_api_clock.h > sched: move clock related paravirt code to kernel/sched > arm/paravirt: use common code for paravirt_steal_clock() > arm64/paravirt: use common code for paravirt_steal_clock() > loongarch/paravirt: use common code for paravirt_steal_clock() > riscv/paravirt: use common code for paravirt_steal_clock() > x86/paravirt: use common code for paravirt_steal_clock() > x86/paravirt: move paravirt_sched_clock() related code into tsc.c > x86/paravirt: allow pv-calls outside paravirt.h > x86/pvlocks: move paravirt spinlock functions into qspinlock.h With the note that tip typically likes a capital after the prefix, like: x86/paravirt: Remove unneeded includes of paravirt.h For 1-12: Acked-by: Peter Zijlstra (Intel) Now, as to the last two, I'm not sure. Leaking those macros out of PV isn't particularly nice, then again, not the end of the world either. Just not sure. 
From brgl at bgdev.pl Thu Sep 11 00:56:28 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Thu, 11 Sep 2025 09:56:28 +0200 Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API In-Reply-To: References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> Message-ID: On Thu, Sep 11, 2025 at 2:11 AM Doug Berger wrote: > > > > > @@ -700,7 +707,8 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) > > * be retained from S5 cold boot > > */ > > need_wakeup_event |= !!__brcmstb_gpio_get_active_irqs(bank); > > - gc->write_reg(reg_base + GIO_MASK(bank->id), 0); > > + gpio_generic_write_reg(&bank->chip, > > + reg_base + GIO_MASK(bank->id), 0); > > > > err = gpiochip_add_data(gc, bank); > > if (err) { > > > I suppose I'm OK with all of this, but I'm just curious about the longer > term plans for the member accesses. Is there an intent to have helpers > for things like?: > chip.gc.offset > chip.gc.ngpio I don't think so. It would require an enormous effort and these fields in struct gpio_chip are pretty stable so there's no real reason for it. Bart From brgl at bgdev.pl Thu Sep 11 00:58:00 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Thu, 11 Sep 2025 09:58:00 +0200 Subject: [PATCH v2 11/15] gpio: sifive: use new generic GPIO chip API In-Reply-To: <01a7cc78-fdae-4a1e-bf78-961e7ec214b2@sifive.com> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-11-f3d1a4c57124@linaro.org> <01a7cc78-fdae-4a1e-bf78-961e7ec214b2@sifive.com> Message-ID: On Thu, Sep 11, 2025 at 2:37 AM Samuel Holland wrote: > > Hi Bartosz, > > On 2025-09-10 2:12 AM, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski > > > > Convert the driver to using the new generic GPIO chip interfaces from > > linux/gpio/generic.h.
> > > > Signed-off-by: Bartosz Golaszewski > > --- > > drivers/gpio/gpio-sifive.c | 73 ++++++++++++++++++++++++---------------------- > > 1 file changed, 38 insertions(+), 35 deletions(-) > > > > diff --git a/drivers/gpio/gpio-sifive.c b/drivers/gpio/gpio-sifive.c > > index 98ef975c44d9a6c9238605cfd1d5820fd70a66ca..2ced87ffd3bbf219c11857391eb4ea808adc0527 100644 > > --- a/drivers/gpio/gpio-sifive.c > > +++ b/drivers/gpio/gpio-sifive.c > > @@ -7,6 +7,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -32,7 +33,7 @@ > > > > struct sifive_gpio { > > void __iomem *base; > > - struct gpio_chip gc; > > + struct gpio_generic_chip gen_gc; > > struct regmap *regs; > > unsigned long irq_state; > > unsigned int trigger[SIFIVE_GPIO_MAX]; > > @@ -41,10 +42,10 @@ struct sifive_gpio { > > > > static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) > > { > > - unsigned long flags; > > unsigned int trigger; > > > > - raw_spin_lock_irqsave(&chip->gc.bgpio_lock, flags); > > + guard(gpio_generic_lock_irqsave)(&chip->gen_gc); > > + > > trigger = (chip->irq_state & BIT(offset)) ? chip->trigger[offset] : 0; > > regmap_update_bits(chip->regs, SIFIVE_GPIO_RISE_IE, BIT(offset), > > (trigger & IRQ_TYPE_EDGE_RISING) ? BIT(offset) : 0); > > @@ -54,7 +55,6 @@ static void sifive_gpio_set_ie(struct sifive_gpio *chip, unsigned int offset) > > (trigger & IRQ_TYPE_LEVEL_HIGH) ? BIT(offset) : 0); > > regmap_update_bits(chip->regs, SIFIVE_GPIO_LOW_IE, BIT(offset), > > (trigger & IRQ_TYPE_LEVEL_LOW) ? BIT(offset) : 0); > > - raw_spin_unlock_irqrestore(&chip->gc.bgpio_lock, flags); > > } > > > > static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) > > @@ -72,13 +72,12 @@ static int sifive_gpio_irq_set_type(struct irq_data *d, unsigned int trigger) > > } > > > > static void sifive_gpio_irq_enable(struct irq_data *d) > > -{ > > + { > > This looks like an unintentional whitespace change. 
> Ah, thanks, checkpatch did not spot it. I'll fix it when applying. > > struct gpio_chip *gc = irq_data_get_irq_chip_data(d); > > struct sifive_gpio *chip = gpiochip_get_data(gc); > > irq_hw_number_t hwirq = irqd_to_hwirq(d); > > int offset = hwirq % SIFIVE_GPIO_MAX; > > u32 bit = BIT(offset); > > - unsigned long flags; > > > > gpiochip_enable_irq(gc, hwirq); > > irq_chip_enable_parent(d); > > @@ -86,13 +85,13 @@ static void sifive_gpio_irq_enable(struct irq_data *d) > > /* Switch to input */ > > gc->direction_input(gc, offset); > > > > - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); > > - /* Clear any sticky pending interrupts */ > > - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > > - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); > > + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { > > + /* Clear any sticky pending interrupts */ > > + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > > + } > > This block (and the copy below) don't actually need any locking, since these are > R/W1C bits. From the manual: "Once the interrupt is pending, it will remain set > until a 1 is written to the *_ip register at that bit." I can send this as a > follow-up improvement if you want to keep this limited to the API conversion. > Sure, please do. 
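A small model of the write-1-to-clear behaviour Samuel describes may help show why the *_ip writes do not depend on a read-modify-write cycle. This is a minimal user-space sketch, not driver code: struct w1c_reg and the w1c_*() helpers are hypothetical names invented for illustration, and the model says nothing about whatever locking regmap performs internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a write-1-to-clear (R/W1C) pending register. */
struct w1c_reg {
	uint32_t pending;
};

/* "Hardware" side of the model: an interrupt source latches its bit. */
static void w1c_raise(struct w1c_reg *reg, unsigned int offset)
{
	assert(offset < 32);
	reg->pending |= UINT32_C(1) << offset;
}

/*
 * "Software" side: a write clears exactly the bits that are set in val and
 * leaves every other bit untouched. The caller never has to read the
 * register first, so acknowledging one bit cannot clobber another.
 */
static void w1c_write(struct w1c_reg *reg, uint32_t val)
{
	reg->pending &= ~val;
}

/* Read back the currently pending bits. */
static uint32_t w1c_read(const struct w1c_reg *reg)
{
	return reg->pending;
}
```

In this model, acknowledging bit 3 while bit 7 is pending leaves bit 7 set, and writing 0 changes nothing. That is the property the review comment relies on when it suggests the lock can go in a follow-up.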
> So with the minor whitespace fix: > Reviewed-by: Samuel Holland > Thanks, Bart > Regards, > Samuel > > > > > /* Enable interrupts */ > > assign_bit(offset, &chip->irq_state, 1); > > @@ -118,15 +117,14 @@ static void sifive_gpio_irq_eoi(struct irq_data *d) > > struct sifive_gpio *chip = gpiochip_get_data(gc); > > int offset = irqd_to_hwirq(d) % SIFIVE_GPIO_MAX; > > u32 bit = BIT(offset); > > - unsigned long flags; > > > > - raw_spin_lock_irqsave(&gc->bgpio_lock, flags); > > - /* Clear all pending interrupts */ > > - regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > > - regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > > - raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags); > > + scoped_guard(gpio_generic_lock_irqsave, &chip->gen_gc) { > > + /* Clear all pending interrupts */ > > + regmap_write(chip->regs, SIFIVE_GPIO_RISE_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_FALL_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_HIGH_IP, bit); > > + regmap_write(chip->regs, SIFIVE_GPIO_LOW_IP, bit); > > + } > > > > irq_chip_eoi_parent(d); > > } > > @@ -179,6 +177,7 @@ static const struct regmap_config sifive_gpio_regmap_config = { > > > > static int sifive_gpio_probe(struct platform_device *pdev) > > { > > + struct gpio_generic_chip_config config; > > struct device *dev = &pdev->dev; > > struct irq_domain *parent; > > struct gpio_irq_chip *girq; > > @@ -217,13 +216,17 @@ static int sifive_gpio_probe(struct platform_device *pdev) > > */ > > parent = irq_get_irq_data(chip->irq_number[0])->domain; > > > > - ret = bgpio_init(&chip->gc, dev, 4, > > - chip->base + SIFIVE_GPIO_INPUT_VAL, > > - chip->base + SIFIVE_GPIO_OUTPUT_VAL, > > - NULL, > > - chip->base + SIFIVE_GPIO_OUTPUT_EN, > > - chip->base + SIFIVE_GPIO_INPUT_EN, > > - BGPIOF_READ_OUTPUT_REG_SET); > > + config = (struct gpio_generic_chip_config) { > > + .dev = dev, > > + .sz = 4, > > + 
.dat = chip->base + SIFIVE_GPIO_INPUT_VAL, > > + .set = chip->base + SIFIVE_GPIO_OUTPUT_VAL, > > + .dirout = chip->base + SIFIVE_GPIO_OUTPUT_EN, > > + .dirin = chip->base + SIFIVE_GPIO_INPUT_EN, > > + .flags = BGPIOF_READ_OUTPUT_REG_SET, > > + }; > > + > > + ret = gpio_generic_chip_init(&chip->gen_gc, &config); > > if (ret) { > > dev_err(dev, "unable to init generic GPIO\n"); > > return ret; > > @@ -236,12 +239,12 @@ static int sifive_gpio_probe(struct platform_device *pdev) > > regmap_write(chip->regs, SIFIVE_GPIO_LOW_IE, 0); > > chip->irq_state = 0; > > > > - chip->gc.base = -1; > > - chip->gc.ngpio = ngpio; > > - chip->gc.label = dev_name(dev); > > - chip->gc.parent = dev; > > - chip->gc.owner = THIS_MODULE; > > - girq = &chip->gc.irq; > > + chip->gen_gc.gc.base = -1; > > + chip->gen_gc.gc.ngpio = ngpio; > > + chip->gen_gc.gc.label = dev_name(dev); > > + chip->gen_gc.gc.parent = dev; > > + chip->gen_gc.gc.owner = THIS_MODULE; > > + girq = &chip->gen_gc.gc.irq; > > gpio_irq_chip_set_chip(girq, &sifive_gpio_irqchip); > > girq->fwnode = dev_fwnode(dev); > > girq->parent_domain = parent; > > @@ -249,7 +252,7 @@ static int sifive_gpio_probe(struct platform_device *pdev) > > girq->handler = handle_bad_irq; > > girq->default_type = IRQ_TYPE_NONE; > > > > - return gpiochip_add_data(&chip->gc, chip); > > + return gpiochip_add_data(&chip->gen_gc.gc, chip); > > } > > > > static const struct of_device_id sifive_gpio_match[] = { > > > From jgross at suse.com Thu Sep 11 01:00:04 2025 From: jgross at suse.com (=?UTF-8?B?SsO8cmdlbiBHcm/Dnw==?=) Date: Thu, 11 Sep 2025 10:00:04 +0200 Subject: [PATCH 00/14] paravirt: cleanup and reorg In-Reply-To: <20250911074817.GX3245006@noisy.programming.kicks-ass.net> References: <20250911063433.13783-1-jgross@suse.com> <20250911074817.GX3245006@noisy.programming.kicks-ass.net> Message-ID: On 11.09.25 09:48, Peter Zijlstra wrote: > On Thu, Sep 11, 2025 at 08:34:19AM +0200, Juergen Gross wrote: >> Some cleanups and reorg of paravirt code and 
headers: >> >> - The first 2 patches should be not controversial at all, as they >> remove just some no longer needed #include and struct forward >> declarations. >> >> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has >> no real value, as it just changes a crash to a BUG() (the stack >> trace will basically be the same). As the maintainer of the main >> paravirt user (Xen) I have never seen this crash/BUG() to happen. >> >> - The 4th patch is just a movement of code. >> >> - I don't know for what reason asm/paravirt_api_clock.h was added, >> as all archs supporting it do it exactly in the same way. Patch >> 5 is removing it. >> >> - Patches 6-12 are streamlining the paravirt clock interfaces by >> using a common implementation across architectures where possible >> and by moving the related code into common sched code, as this is >> where it should live. >> >> - Patches 13+14 are more like RFC material: patch 13 is doing some >> preparation work to enable patch 14 to move all spinlock related >> paravirt functions into qspinlock.h. If this approach is accepted, >> I'd like to continue with this work by moving most (or all?) >> paravirt functions from paravirt.h into the headers where their >> native counterparts are defined. This is meant to keep the native >> and paravirt function definitions together in one place and >> hopefully to be able to reduce the include hell with paravirt. 
>>
>> Juergen Gross (14):
>>   x86/paravirt: remove not needed includes of paravirt.h
>>   x86/paravirt: remove some unneeded struct declarations
>>   x86/paravirt: remove PARAVIRT_DEBUG config option
>>   x86/paravirt: move thunk macros to paravirt_types.h
>>   paravirt: remove asm/paravirt_api_clock.h
>>   sched: move clock related paravirt code to kernel/sched
>>   arm/paravirt: use common code for paravirt_steal_clock()
>>   arm64/paravirt: use common code for paravirt_steal_clock()
>>   loongarch/paravirt: use common code for paravirt_steal_clock()
>>   riscv/paravirt: use common code for paravirt_steal_clock()
>>   x86/paravirt: use common code for paravirt_steal_clock()
>>   x86/paravirt: move paravirt_sched_clock() related code into tsc.c
>>   x86/paravirt: allow pv-calls outside paravirt.h
>>   x86/pvlocks: move paravirt spinlock functions into qspinlock.h
>
> With the note that tip typically likes a capital after the prefix, like:
>
>   x86/paravirt: Remove unneeded includes of paravirt.h

Noted, thanks.

>
> For 1-12:
>
> Acked-by: Peter Zijlstra (Intel)
>
>
> Now, as to the last two, I'm not sure. Leaking those macros out of PV
> isn't particularly nice, then again, not the end of the world either.
> Just not sure.

Yes, that's why I didn't continue with all of the other potential
movement of paravirt functions. I want some feedback first. :-)

It's a tradeoff between having functions with / without paravirt in one
file against hiding the paravirt stuff from "normal" readers (not
writers, as those probably need to touch the paravirt variant, too).

BTW, I think the macro leaking isn't the main problem. There are other
macros leaking already.


Juergen

From andriy.shevchenko at intel.com Thu Sep 11 01:02:26 2025
From: andriy.shevchenko at intel.com (Andy Shevchenko)
Date: Thu, 11 Sep 2025 11:02:26 +0300
Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API
In-Reply-To:
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
 <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org>
Message-ID:

On Thu, Sep 11, 2025 at 09:56:28AM +0200, Bartosz Golaszewski wrote:
> On Thu, Sep 11, 2025 at 2:11 AM Doug Berger wrote:

...

> > I'm just curious about the longer term plans for the member accesses. Is
> > there an intent to have helpers for things like?:
> > chip.gc.offset
> > chip.gc.ngpio
>
> I don't think so. It would require an enormous effort and these fields
> in struct gpio_chip are pretty stable so there's no real reason for
> it.

What I would like to see in TODO is to "make struct gpio_chip const" when
passing to the gpiochip_add_*().

--
With Best Regards,
Andy Shevchenko

From horms at kernel.org Thu Sep 11 02:44:04 2025
From: horms at kernel.org (Simon Horman)
Date: Thu, 11 Sep 2025 10:44:04 +0100
Subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC
In-Reply-To: <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn>
References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn>
 <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn>
Message-ID: <20250911094404.GE30363@horms.kernel.org>

On Mon, Sep 08, 2025 at 08:34:26PM +0800, Vivian Wang wrote:
> The Ethernet MACs found on SpacemiT K1 appears to be a custom design
> that only superficially resembles some other embedded MACs. SpacemiT
> refers to them as "EMAC", so let's just call the driver "k1_emac".
>
> Supports RGMII and RMII interfaces. Includes support for MAC hardware
> statistics counters. PTP support is not implemented.
> > Signed-off-by: Vivian Wang > Reviewed-by: Maxime Chevallier > Reviewed-by: Vadim Fedorenko > Reviewed-by: Troy Mitchell > Tested-by: Junhui Liu > Tested-by: Troy Mitchell > --- > drivers/net/ethernet/Kconfig | 1 + > drivers/net/ethernet/Makefile | 1 + > drivers/net/ethernet/spacemit/Kconfig | 29 + > drivers/net/ethernet/spacemit/Makefile | 6 + > drivers/net/ethernet/spacemit/k1_emac.c | 2156 +++++++++++++++++++++++++++++++ This is a large patch, so I'm sure I've missed some things. But, overall, I think this is coming together. Thanks for your recent updates. As the Kernel Patch Robot noticed a problem, I've provided some minor feedback for your consideration. ... > +static void emac_wr(struct emac_priv *priv, u32 reg, u32 val) > +{ > + writel(val, priv->iobase + reg); > +} > + > +static int emac_rd(struct emac_priv *priv, u32 reg) nit: maybe u32 would be a more suitable return type. > +{ > + return readl(priv->iobase + reg); > +} ... > +static int emac_alloc_tx_resources(struct emac_priv *priv) > +{ > + struct emac_desc_ring *tx_ring = &priv->tx_ring; > + struct platform_device *pdev = priv->pdev; > + u32 size; > + > + size = sizeof(struct emac_tx_desc_buffer) * tx_ring->total_cnt; > + > + tx_ring->tx_desc_buf = kzalloc(size, GFP_KERNEL); nit: I think you can use kcalloc() here. > + if (!tx_ring->tx_desc_buf) > + return -ENOMEM; > + > + tx_ring->total_size = tx_ring->total_cnt * sizeof(struct emac_desc); > + tx_ring->total_size = ALIGN(tx_ring->total_size, PAGE_SIZE); > + > + tx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, tx_ring->total_size, > + &tx_ring->desc_dma_addr, > + GFP_KERNEL); > + if (!tx_ring->desc_addr) { > + kfree(tx_ring->tx_desc_buf); > + return -ENOMEM; > + } > + > + tx_ring->head = 0; > + tx_ring->tail = 0; > + > + return 0; > +} ... 
> +static int emac_alloc_rx_resources(struct emac_priv *priv) > +{ > + struct emac_desc_ring *rx_ring = &priv->rx_ring; > + struct platform_device *pdev = priv->pdev; > + u32 buf_len; > + > + buf_len = sizeof(struct emac_rx_desc_buffer) * rx_ring->total_cnt; > + > + rx_ring->rx_desc_buf = kzalloc(buf_len, GFP_KERNEL); Ditto. > + if (!rx_ring->rx_desc_buf) > + return -ENOMEM; > + > + rx_ring->total_size = rx_ring->total_cnt * sizeof(struct emac_desc); > + > + rx_ring->total_size = ALIGN(rx_ring->total_size, PAGE_SIZE); > + > + rx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, rx_ring->total_size, > + &rx_ring->desc_dma_addr, > + GFP_KERNEL); > + if (!rx_ring->desc_addr) { > + kfree(rx_ring->rx_desc_buf); > + return -ENOMEM; > + } > + > + rx_ring->head = 0; > + rx_ring->tail = 0; > + > + return 0; > +} ... > +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) > +{ > + struct emac_priv *priv = bus->priv; > + u32 cmd = 0, val; > + int ret; > + > + cmd |= phy_addr & 0x1F; > + cmd |= (regnum & 0x1F) << 5; nit: I think this could benefit from using FIELD_PREP Likewise for similar patterns in this patch. > + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; > + > + emac_wr(priv, MAC_MDIO_DATA, 0x0); > + emac_wr(priv, MAC_MDIO_CONTROL, cmd); > + > + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, > + !((val >> 15) & 0x1), 100, 10000); > + > + if (ret) > + return ret; > + > + val = emac_rd(priv, MAC_MDIO_DATA); > + return val; > +} ... > +/* > + * Even though this MAC supports gigabit operation, it only provides 32-bit > + * statistics counters. The most overflow-prone counters are the "bytes" ones, > + * which at gigabit overflow about twice a minute. > + * > + * Therefore, we maintain the high 32 bits of counters ourselves, incrementing > + * every time statistics seem to go backwards. Also, update periodically to > + * catch overflows when we are not otherwise checking the statistics often > + * enough. 
> + */ > + > +#define EMAC_STATS_TIMER_PERIOD 20 > + > +static int emac_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res, > + u32 control_reg, u32 high_reg, u32 low_reg) > +{ > + u32 val; > + int ret; > + > + /* The "read" bit is the same for TX and RX */ > + > + val = MREGBIT_START_TX_COUNTER_READ | cnt; > + emac_wr(priv, control_reg, val); > + val = emac_rd(priv, control_reg); > + > + ret = readl_poll_timeout_atomic(priv->iobase + control_reg, val, > + !(val & MREGBIT_START_TX_COUNTER_READ), > + 100, 10000); > + > + if (ret) { > + netdev_err(priv->ndev, "Read stat timeout\n"); > + return ret; > + } > + > + *res = emac_rd(priv, high_reg) << 16; > + *res |= (u16)emac_rd(priv, low_reg); nit: I think lower_16_bits() and lower_16_bits() would be appropriate here. > + > + return 0; > +} ... > +static void emac_update_counter(u64 *counter, u32 new_low) > +{ > + u32 old_low = (u32)*counter; > + u64 high = *counter >> 32; Similarly, lower_32_bits() and upper_32_bits here. > + > + if (old_low > new_low) { > + /* Overflowed, increment high 32 bits */ > + high++; > + } > + > + *counter = (high << 32) | new_low; > +} > + > +static void emac_stats_update(struct emac_priv *priv) > +{ > + u64 *tx_stats_off = (u64 *)&priv->tx_stats_off; > + u64 *rx_stats_off = (u64 *)&priv->rx_stats_off; > + u64 *tx_stats = (u64 *)&priv->tx_stats; > + u64 *rx_stats = (u64 *)&priv->rx_stats; nit: I think it would be interesting to use a union containing 1. the existing tx/rx stats struct and 2. an array of u64. This may allow avoiding this cast. Which seems nice to me. But YMMV. > + u32 i, res; > + > + assert_spin_locked(&priv->stats_lock); > + > + if (!netif_running(priv->ndev) || !netif_device_present(priv->ndev)) { > + /* Not up, don't try to update */ > + return; > + } > + > + for (i = 0; i < sizeof(priv->tx_stats) / sizeof(*tx_stats); i++) { > + /* > + * If reading stats times out, everything is broken and there's > + * nothing we can do. 
Reading statistics also can't return an
> +		 * error, so just return without updating and without
> +		 * rescheduling.
> +		 */
> +		if (emac_tx_read_stat_cnt(priv, i, &res))
> +			return;
> +
> +		/*
> +		 * Re-initializing while bringing interface up resets counters
> +		 * to zero, so to provide continuity, we add the values saved
> +		 * last time we did emac_down() to the new hardware-provided
> +		 * value.
> +		 */
> +		emac_update_counter(&tx_stats[i], res + (u32)tx_stats_off[i]);

nit: maybe lower_32_bits(tx_stats_off[i]) ?

> +	}
> +
> +	/* Similar remarks as TX stats */
> +	for (i = 0; i < sizeof(priv->rx_stats) / sizeof(*rx_stats); i++) {
> +		if (emac_rx_read_stat_cnt(priv, i, &res))
> +			return;
> +		emac_update_counter(&rx_stats[i], res + (u32)rx_stats_off[i]);

Likewise, here for rx_stats_off[i].

> +	}
> +
> +	mod_timer(&priv->stats_timer, jiffies + EMAC_STATS_TIMER_PERIOD * HZ);
> +}

...

> +static u64 emac_get_stat_tx_dropped(struct emac_priv *priv)
> +{
> +	u64 result;
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu));
> +	}

nit: no need for {} here ?

> +
> +	return result;
> +}

...

From zhangchunyan at iscas.ac.cn Thu Sep 11 02:55:58 2025
From: zhangchunyan at iscas.ac.cn (Chunyan Zhang)
Date: Thu, 11 Sep 2025 17:55:58 +0800
Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported()
In-Reply-To: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn>
References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn>
Message-ID: <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn>

Some platforms can customize the PTE/PMD entry soft-dirty bit, making it
unavailable even if the architecture provides the resource.

Add an API that architectures can implement to detect whether the
soft-dirty bit is available on the device on which the kernel is running.
Signed-off-by: Chunyan Zhang --- fs/proc/task_mmu.c | 17 ++++++++++++++++- include/linux/pgtable.h | 12 ++++++++++++ mm/debug_vm_pgtable.c | 10 +++++----- mm/huge_memory.c | 13 +++++++------ mm/internal.h | 2 +- mm/mremap.c | 13 +++++++------ mm/userfaultfd.c | 10 ++++------ 7 files changed, 52 insertions(+), 25 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 29cca0e6d0ff..9e8083b6d4cd 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) * -Werror=unterminated-string-initialization warning * with GCC 15 */ - static const char mnemonics[BITS_PER_LONG][3] = { + static char mnemonics[BITS_PER_LONG][3] = { /* * In case if we meet a flag we don't know about. */ @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_SEALED)] = "sl", #endif }; +/* + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is + * unavailable on which the kernel is running, even if the architecture + * provides the resource and soft-dirty is compiled in. + */ +#ifdef CONFIG_MEM_SOFT_DIRTY + if (!pgtable_soft_dirty_supported()) + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; +#endif + size_t i; seq_puts(m, "VmFlags: "); @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, static inline void clear_soft_dirty(struct vm_area_struct *vma, unsigned long addr, pte_t *pte) { + if (!pgtable_soft_dirty_supported()) + return; /* * The soft-dirty tracker uses #PF-s to catch writes * to pages, so write-protect the pte as well. 
See the @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, { pmd_t old, pmd = *pmdp; + if (!pgtable_soft_dirty_supported()) + return; + if (pmd_present(pmd)) { /* See comment in change_huge_pmd() */ old = pmdp_invalidate(vma, addr, pmdp); diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 4c035637eeb7..2a3578a4ae4c 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1537,6 +1537,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) #define arch_start_context_switch(prev) do {} while (0) #endif +/* + * Some platforms can customize the PTE soft-dirty bit making it unavailable + * even if the architecture provides the resource. + * Adding this API allows architectures to add their own checks for the + * devices on which the kernel is running. + * Note: When overiding it, please make sure the CONFIG_MEM_SOFT_DIRTY + * is part of this macro. + */ +#ifndef pgtable_soft_dirty_supported +#define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) +#endif + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 830107b6dd08..b32ce2b0b998 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!pgtable_soft_dirty_supported()) return; pr_debug("Validating PTE soft dirty\n"); @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte; - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!pgtable_soft_dirty_supported()) return; pr_debug("Validating PTE swap soft dirty\n"); @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) { pmd_t pmd; - 
if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!pgtable_soft_dirty_supported()) return; if (!has_transparent_hugepage()) @@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) { pmd_t pmd; - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) + if (!pgtable_soft_dirty_supported() || + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) return; if (!has_transparent_hugepage()) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9c38a95e9f09..218d430a2ec6 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2271,12 +2271,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, static pmd_t move_soft_dirty_pmd(pmd_t pmd) { -#ifdef CONFIG_MEM_SOFT_DIRTY - if (unlikely(is_pmd_migration_entry(pmd))) - pmd = pmd_swp_mksoft_dirty(pmd); - else if (pmd_present(pmd)) - pmd = pmd_mksoft_dirty(pmd); -#endif + if (pgtable_soft_dirty_supported()) { + if (unlikely(is_pmd_migration_entry(pmd))) + pmd = pmd_swp_mksoft_dirty(pmd); + else if (pmd_present(pmd)) + pmd = pmd_mksoft_dirty(pmd); + } + return pmd; } diff --git a/mm/internal.h b/mm/internal.h index 45b725c3dc03..c6ca62f8ecf3 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) * will be constantly true. */ - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) + if (!pgtable_soft_dirty_supported()) return false; /* diff --git a/mm/mremap.c b/mm/mremap.c index e618a706aff5..7beb3114dbf5 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -162,12 +162,13 @@ static pte_t move_soft_dirty_pte(pte_t pte) * Set soft dirty bit so we can notice * in userspace the ptes were moved. 
*/
-#ifdef CONFIG_MEM_SOFT_DIRTY
-	if (pte_present(pte))
-		pte = pte_mksoft_dirty(pte);
-	else if (is_swap_pte(pte))
-		pte = pte_swp_mksoft_dirty(pte);
-#endif
+	if (pgtable_soft_dirty_supported()) {
+		if (pte_present(pte))
+			pte = pte_mksoft_dirty(pte);
+		else if (is_swap_pte(pte))
+			pte = pte_swp_mksoft_dirty(pte);
+	}
+
 	return pte;
 }
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 45e6290e2e8b..85f43479b67a 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1065,9 +1065,8 @@ static int move_present_pte(struct mm_struct *mm,
 	orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot);
 	/* Set soft dirty bit so userspace can notice the pte was moved */
-#ifdef CONFIG_MEM_SOFT_DIRTY
-	orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
-#endif
+	if (pgtable_soft_dirty_supported())
+		orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
 	if (pte_dirty(orig_src_pte))
 		orig_dst_pte = pte_mkdirty(orig_dst_pte);
 	orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);
@@ -1134,9 +1133,8 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma,
 	}
 	orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
-#ifdef CONFIG_MEM_SOFT_DIRTY
-	orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
-#endif
+	if (pgtable_soft_dirty_supported())
+		orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
 	set_pte_at(mm, dst_addr, dst_pte, orig_src_pte);
 	double_pt_unlock(dst_ptl, src_ptl);
--
2.34.1

From zhangchunyan at iscas.ac.cn Thu Sep 11 02:55:57 2025
From: zhangchunyan at iscas.ac.cn (Chunyan Zhang)
Date: Thu, 11 Sep 2025 17:55:57 +0800
Subject: [PATCH v11 0/5] riscv: mm: Add soft-dirty and uffd-wp support
Message-ID: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn>

This patchset adds support for the Svrsw60t59b [1] extension, which is
now ratified, and adds soft-dirty and userfaultfd write-protect tracking
for RISC-V.
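The overridable-check pattern used by pgtable_soft_dirty_supported() in this series can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: DEMO_CONFIG_MEM_SOFT_DIRTY stands in for the Kconfig symbol, demo_cpu_has_ext stands in for an architecture's runtime extension probe, and demo_mark_soft_dirty() is a hypothetical caller. Only the pgtable_soft_dirty_supported() name and its #ifndef fallback are taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the Kconfig symbol; 1 models CONFIG_MEM_SOFT_DIRTY=y. */
#define DEMO_CONFIG_MEM_SOFT_DIRTY 1

/* Hypothetical runtime probe result, e.g. "the CPU reports the extension". */
static bool demo_cpu_has_ext;

/*
 * What an architecture header might define before the generic header is
 * seen. Keeping the config check inside the macro preserves the
 * compile-time gate, as the comment in the series asks.
 */
#define pgtable_soft_dirty_supported() \
	(DEMO_CONFIG_MEM_SOFT_DIRTY && demo_cpu_has_ext)

/* Generic fallback, used only when no architecture override exists. */
#ifndef pgtable_soft_dirty_supported
#define pgtable_soft_dirty_supported() DEMO_CONFIG_MEM_SOFT_DIRTY
#endif

/* Caller in the style of move_soft_dirty_pte(): bit 0 models soft-dirty. */
static unsigned long demo_mark_soft_dirty(unsigned long pte)
{
	if (pgtable_soft_dirty_supported())
		pte |= 1UL;
	return pte;
}
```

With the override in place, the same caller leaves the entry untouched when the probe reports the bit as unavailable and sets it otherwise, which is the behaviour the former #ifdef blocks could not express at run time.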

Patches 1 and 2 add macros that let architectures define their own checks
for whether the soft-dirty / uffd_wp PTE bits are available. For RISC-V,
that means checking whether the Svrsw60t59b extension is supported on the
device on which the kernel is running.

This patchset has been tested with the kselftest mm suite, in which
soft-dirty, madv_populate, test_unmerge_uffd_wp, and uffd-unit-tests run
and pass, and no regressions are observed in any of the other tests.

This patchset applies on top of v6.17-rc4.

[1] https://github.com/riscv-non-isa/riscv-iommu/pull/543

V11:
- Rename the macro API to pgtable_*_supported() since we also have PMD support;
- Change the default implementations of the two macros, making
  CONFIG_MEM_SOFT_DIRTY or CONFIG_HAVE_ARCH_USERFAULTFD_WP part of the macros;
- Correct the order of insertion of RISCV_ISA_EXT_SVRSW60T59B;
- Rephrase some comments.

V10: https://lore.kernel.org/all/20250909095611.803898-1-zhangchunyan@iscas.ac.cn/
- Fixed the issue reported by kernel test robot.

V9: https://lore.kernel.org/all/20250905103651.489197-1-zhangchunyan@iscas.ac.cn/
- Add pte_soft_dirty/uffd_wp_available() API to allow dynamically checking
  if the PTE bit is available for the platform on which the kernel is running.

V8: https://lore.kernel.org/all/20250619065232.1786470-1-zhangchunyan@iscas.ac.cn/
- Rebase on v6.16-rc1;
- Add dependencies to MMU && 64BIT for RISCV_ISA_SVRSW60T59B;
- Use 'Svrsw60t59b' instead of 'SVRSW60T59B' in Kconfig help paragraph;
- Add Alex's Reviewed-by tag in patch 1.

V7: https://lore.kernel.org/all/20250409095320.224100-1-zhangchunyan@iscas.ac.cn/
- Add Svrsw60t59b [1] extension support;
- Make soft-dirty and uffd-wp depend on the Svrsw60t59b extension to
  avoid crashes on hardware that doesn't have this extension.

V6: https://lore.kernel.org/all/20250408084301.68186-1-zhangchunyan@iscas.ac.cn/
- Changes to use bits 59-60 which are supported by extension Svrsw60t59b
  for soft dirty and userfaultfd write protect tracking.

V5: https://lore.kernel.org/all/20241113095833.1805746-1-zhangchunyan@iscas.ac.cn/
- Fixed typos and corrected some words in Kconfig and commit message;
- Removed pte_wrprotect() from pte_swp_mkuffd_wp(), which was a copy-paste error;
- Added Alex's Reviewed-by tag in patch 2.

V4: https://lore.kernel.org/all/20240830011101.3189522-1-zhangchunyan@iscas.ac.cn/
- Added bit(4) descriptions into "Format of swap PTE".

V3: https://lore.kernel.org/all/20240805095243.44809-1-zhangchunyan@iscas.ac.cn/
- Fixed the issue reported by kernel test robot.

V2: https://lore.kernel.org/all/20240731040444.3384790-1-zhangchunyan@iscas.ac.cn/
- Add uffd-wp support;
- Make soft-dirty, uffd-wp and devmap mutually exclusive, since they all use
  the same PTE bit;
- Add test results of CRIU in the cover-letter.

Chunyan Zhang (5):
  mm: softdirty: Add pgtable_soft_dirty_supported()
  mm: userfaultfd: Add pgtable_uffd_wp_supported()
  riscv: Add RISC-V Svrsw60t59b extension support
  riscv: mm: Add soft-dirty page tracking support
  riscv: mm: Add userfaultfd write-protect support

 arch/riscv/Kconfig                    |  16 +++
 arch/riscv/include/asm/hwcap.h        |   1 +
 arch/riscv/include/asm/pgtable-bits.h |  37 +++++++
 arch/riscv/include/asm/pgtable.h      | 143 +++++++++++++++++++++++++-
 arch/riscv/kernel/cpufeature.c        |   1 +
 fs/proc/task_mmu.c                    |  17 ++-
 fs/userfaultfd.c                      |  23 +++--
 include/asm-generic/pgtable_uffd.h    |  11 ++
 include/linux/mm_inline.h             |   7 ++
 include/linux/pgtable.h               |  12 +++
 include/linux/userfaultfd_k.h         |  44 +++++---
 mm/debug_vm_pgtable.c                 |  10 +-
 mm/huge_memory.c                      |  13 +--
 mm/internal.h                         |   2 +-
 mm/memory.c                           |   6 +-
 mm/mremap.c                           |  13 +--
 mm/userfaultfd.c                      |  10 +-
 17 files changed, 310 insertions(+), 56 deletions(-)

--
2.34.1

From zhangchunyan at iscas.ac.cn Thu Sep 11 02:55:59 2025
From: zhangchunyan at iscas.ac.cn (Chunyan Zhang)
Date: Thu, 11 Sep 2025 17:55:59 +0800
Subject: [PATCH v11 2/5] mm: userfaultfd: Add pgtable_uffd_wp_supported()
In-Reply-To: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn>
References:
<20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250911095602.1130290-3-zhangchunyan@iscas.ac.cn> Some platforms can customize the PTE/PMD entry uffd-wp bit making it unavailable even if the architecture provides the resource. This patch adds a macro API that allows architectures to define their specific implementations to check if the uffd-wp bit is available on which device the kernel is running. Signed-off-by: Chunyan Zhang --- fs/userfaultfd.c | 23 ++++++++-------- include/asm-generic/pgtable_uffd.h | 11 ++++++++ include/linux/mm_inline.h | 7 +++++ include/linux/userfaultfd_k.h | 44 +++++++++++++++++++----------- mm/memory.c | 6 ++-- 5 files changed, 62 insertions(+), 29 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 54c6cc7fe9c6..b549c327d7ad 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1270,9 +1270,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING) vm_flags |= VM_UFFD_MISSING; if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) { -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - goto out; -#endif + if (!pgtable_uffd_wp_supported()) + goto out; + vm_flags |= VM_UFFD_WP; } if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR) { @@ -1980,14 +1980,15 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, uffdio_api.features &= ~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM); #endif -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP - uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; -#endif -#ifndef CONFIG_PTE_MARKER_UFFD_WP - uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; - uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; - uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; -#endif + if (!pgtable_uffd_wp_supported()) + uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; + + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pgtable_uffd_wp_supported()) { + uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; + 
uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED; + uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC; + } ret = -EINVAL; if (features & ~uffdio_api.features) diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h index 828966d4c281..895d68ece0e7 100644 --- a/include/asm-generic/pgtable_uffd.h +++ b/include/asm-generic/pgtable_uffd.h @@ -1,6 +1,17 @@ #ifndef _ASM_GENERIC_PGTABLE_UFFD_H #define _ASM_GENERIC_PGTABLE_UFFD_H +/* + * Some platforms can customize the uffd-wp bit, making it unavailable + * even if the architecture provides the resource. + * Adding this API allows architectures to add their own checks for the + * devices on which the kernel is running. + * Note: When overiding it, please make sure the + * CONFIG_HAVE_ARCH_USERFAULTFD_WP is part of this macro. + */ +#ifndef pgtable_uffd_wp_supported +#define pgtable_uffd_wp_supported() IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) +#endif #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP static __always_inline int pte_uffd_wp(pte_t pte) { diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 89b518ff097e..38845b8b79ff 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -571,6 +571,13 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, pte_t *pte, pte_t pteval) { #ifdef CONFIG_PTE_MARKER_UFFD_WP + /* + * Some platforms can customize the PTE uffd-wp bit, making it unavailable + * even if the architecture allows providing the PTE resource. 
+ */ + if (!pgtable_uffd_wp_supported()) + return false; + bool arm_uffd_pte = false; /* The current status of the pte should be "cleared" before calling */ diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index c0e716aec26a..6264b56ae961 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -228,15 +228,15 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma, if (wp_async && (vm_flags == VM_UFFD_WP)) return true; -#ifndef CONFIG_PTE_MARKER_UFFD_WP /* * If user requested uffd-wp but not enabled pte markers for * uffd-wp, then shmem & hugetlbfs are not supported but only * anonymous. */ - if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) + if ((!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || + !pgtable_uffd_wp_supported()) && + (vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) return false; -#endif /* By default, allow any of anon|shmem|hugetlb */ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || @@ -437,8 +437,11 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma) static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - return is_pte_marker_entry(entry) && - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + if (pgtable_uffd_wp_supported()) + return is_pte_marker_entry(entry) && + (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); + else + return false; #else return false; #endif @@ -447,14 +450,19 @@ static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) static inline bool pte_marker_uffd_wp(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - swp_entry_t entry; + if (pgtable_uffd_wp_supported()) { + swp_entry_t entry; - if (!is_swap_pte(pte)) - return false; + if (!is_swap_pte(pte)) + return false; - entry = pte_to_swp_entry(pte); + entry = pte_to_swp_entry(pte); + + return pte_marker_entry_uffd_wp(entry); + } else { + return false; + } - return pte_marker_entry_uffd_wp(entry); #else return false; #endif @@ -467,14 +475,18 @@ 
static inline bool pte_marker_uffd_wp(pte_t pte) static inline bool pte_swp_uffd_wp_any(pte_t pte) { #ifdef CONFIG_PTE_MARKER_UFFD_WP - if (!is_swap_pte(pte)) - return false; + if (pgtable_uffd_wp_supported()) { + if (!is_swap_pte(pte)) + return false; - if (pte_swp_uffd_wp(pte)) - return true; + if (pte_swp_uffd_wp(pte)) + return true; - if (pte_marker_uffd_wp(pte)) - return true; + if (pte_marker_uffd_wp(pte)) + return true; + } else { + return false; + } #endif return false; } diff --git a/mm/memory.c b/mm/memory.c index 0ba4f6b71847..4eb05c5f487b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1465,7 +1465,9 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, { bool was_installed = false; -#ifdef CONFIG_PTE_MARKER_UFFD_WP + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pgtable_uffd_wp_supported()) + return false; + /* Zap on anonymous always means dropping everything */ if (vma_is_anonymous(vma)) return false; @@ -1482,7 +1484,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte++; addr += PAGE_SIZE; } -#endif + return was_installed; } -- 2.34.1 From zhangchunyan at iscas.ac.cn Thu Sep 11 02:56:00 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Thu, 11 Sep 2025 17:56:00 +0800 Subject: [PATCH v11 3/5] riscv: Add RISC-V Svrsw60t59b extension support In-Reply-To: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250911095602.1130290-4-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software to use. 
Reviewed-by: Alexandre Ghiti Reviewed-by: Andrew Jones Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 14 ++++++++++++++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 3 files changed, 16 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a4b233a0659e..d99df67cc7a4 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -862,6 +862,20 @@ config RISCV_ISA_ZICBOP If you don't know what to do here, say Y. +config RISCV_ISA_SVRSW60T59B + bool "Svrsw60t59b extension support for using PTE bits 60 and 59" + depends on MMU && 64BIT + depends on RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the Svrsw60t59b + extension and enable its usage. + + The Svrsw60t59b extension allows to free the PTE reserved bits 60 + and 59 for software to use. + + If you don't know what to do here, say Y. + config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI def_bool y # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a..f98fcb5c17d5 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_SVRSW60T59B 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 743d53415572..2ba71d2d3fa3 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -539,6 +539,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), + __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B), __RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC), }; -- 2.34.1 From 
zhangchunyan at iscas.ac.cn Thu Sep 11 02:56:01 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Thu, 11 Sep 2025 17:56:01 +0800 Subject: [PATCH v11 4/5] riscv: mm: Add soft-dirty page tracking support In-Reply-To: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250911095602.1130290-5-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software, this patch uses bit 59 for soft-dirty. To add swap PTE soft-dirty tracking, we borrow bit 3 which is available for swap PTEs on RISC-V systems. Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 19 +++++++ arch/riscv/include/asm/pgtable.h | 75 ++++++++++++++++++++++++++- 3 files changed, 93 insertions(+), 2 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index d99df67cc7a4..53b73e4bdf3f 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -141,6 +141,7 @@ config RISCV select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET select HAVE_ARCH_SECCOMP_FILTER + select HAVE_ARCH_SOFT_DIRTY if 64BIT && MMU && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 179bd4afece4..f3bac2bbc157 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -19,6 +19,25 @@ #define _PAGE_SOFT (3 << 8) /* Reserved for software */ #define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */ + +#ifdef CONFIG_MEM_SOFT_DIRTY + +/* ext_svrsw60t59b: bit 59 for soft-dirty tracking */ +#define _PAGE_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + (1UL << 59) : 0) +/* + * Bit 3 is always zero for swap entry computation, so we + * can borrow it for swap page soft-dirty tracking. + */ +#define _PAGE_SWP_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + _PAGE_EXEC : 0) +#else +#define _PAGE_SOFT_DIRTY 0 +#define _PAGE_SWP_SOFT_DIRTY 0 +#endif /* CONFIG_MEM_SOFT_DIRTY */ + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 91697fbf1f90..77344ff0298b 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -427,7 +427,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return __pte(pte_val(pte) | _PAGE_DIRTY); + return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkclean(pte_t pte) @@ -455,6 +455,42 @@ static inline pte_t pte_mkhuge(pte_t pte) return pte; } +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +#define pgtable_soft_dirty_supported() \ + (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && \ + riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) + +static inline bool pte_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY)); +} + +static inline bool pte_swp_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY)); +} +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + #ifdef CONFIG_RISCV_ISA_SVNAPOT #define pte_leaf_size(pte) (pte_napot(pte) ? 
\ napot_cont_size(napot_cont_order(pte)) :\ @@ -802,6 +838,40 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +static inline bool pmd_soft_dirty(pmd_t pmd) +{ + return pte_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd))); +} + +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION +static inline bool pmd_swp_soft_dirty(pmd_t pmd) +{ + return pte_swp_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd))); +} +#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { @@ -983,7 +1053,8 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * * Format of swap PTE: * bit 0: _PAGE_PRESENT (zero) - * bit 1 to 3: _PAGE_LEAF (zero) + * bit 1 to 2: (zero) + * bit 3: _PAGE_SWP_SOFT_DIRTY * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From zhangchunyan at iscas.ac.cn Thu Sep 11 02:56:02 2025 From: zhangchunyan at iscas.ac.cn (Chunyan Zhang) Date: Thu, 11 Sep 2025 17:56:02 +0800 Subject: [PATCH v11 5/5] riscv: mm: Add userfaultfd write-protect support In-Reply-To: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> Message-ID: <20250911095602.1130290-6-zhangchunyan@iscas.ac.cn> The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59 for software, this patch uses bit 60 for uffd-wp tracking Additionally for tracking the uffd-wp state as a PTE swap bit, we borrow bit 4 which is not involved 
into swap entry computation. Signed-off-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/pgtable-bits.h | 18 +++++++ arch/riscv/include/asm/pgtable.h | 68 +++++++++++++++++++++++++++ 3 files changed, 87 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 53b73e4bdf3f..f928768bb14a 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -147,6 +147,7 @@ config RISCV select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD + select HAVE_ARCH_USERFAULTFD_WP if 64BIT && MMU && USERFAULTFD && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_VMAP_STACK if MMU && 64BIT select HAVE_ASM_MODVERSIONS select HAVE_CONTEXT_TRACKING_USER diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index f3bac2bbc157..b422d9691e60 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -38,6 +38,24 @@ #define _PAGE_SWP_SOFT_DIRTY 0 #endif /* CONFIG_MEM_SOFT_DIRTY */ +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */ +#define _PAGE_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 60) : 0) +/* + * Bit 4 is not involved into swap entry computation, so we + * can borrow it for swap page uffd-wp tracking. + */ +#define _PAGE_SWP_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? 
\ + _PAGE_USER : 0) +#else +#define _PAGE_UFFD_WP 0 +#define _PAGE_SWP_UFFD_WP 0 +#endif + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 77344ff0298b..5d3f17e175e5 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -416,6 +416,41 @@ static inline pte_t pte_wrprotect(pte_t pte) return __pte(pte_val(pte) & ~(_PAGE_WRITE)); } +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +#define pgtable_uffd_wp_supported() \ + riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_UFFD_WP); +} + +static inline pte_t pte_mkuffd_wp(pte_t pte) +{ + return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP)); +} + +static inline pte_t pte_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP)); +} + +static inline bool pte_swp_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_mkuffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP)); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + /* static inline pte_t pte_mkread(pte_t pte) */ static inline pte_t pte_mkwrite_novma(pte_t pte) @@ -838,6 +873,38 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +static inline bool pmd_uffd_wp(pmd_t pmd) +{ + return pte_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd))); +} + +static inline bool pmd_swp_uffd_wp(pmd_t pmd) +{ + return pte_swp_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t 
pmd_swp_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd))); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline bool pmd_soft_dirty(pmd_t pmd) { @@ -1055,6 +1122,7 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * bit 0: _PAGE_PRESENT (zero) * bit 1 to 2: (zero) * bit 3: _PAGE_SWP_SOFT_DIRTY + * bit 4: _PAGE_SWP_UFFD_WP * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type -- 2.34.1 From wangruikang at iscas.ac.cn Thu Sep 11 03:23:12 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Thu, 11 Sep 2025 18:23:12 +0800 Subject: [PATCH net-next v10 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250911094404.GE30363@horms.kernel.org> References: <20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn> <20250908-net-k1-emac-v10-2-90d807ccd469@iscas.ac.cn> <20250911094404.GE30363@horms.kernel.org> Message-ID: <7895b23a-2b50-4f3e-bdef-f9b7397beef2@iscas.ac.cn> Hi Simon, On 9/11/25 17:44, Simon Horman wrote: > On Mon, Sep 08, 2025 at 08:34:26PM +0800, Vivian Wang wrote: >> The Ethernet MACs found on SpacemiT K1 appears to be a custom design >> that only superficially resembles some other embedded MACs. SpacemiT >> refers to them as "EMAC", so let's just call the driver "k1_emac". >> >> Supports RGMII and RMII interfaces. Includes support for MAC hardware >> statistics counters. PTP support is not implemented. >> >> Signed-off-by: Vivian Wang >> Reviewed-by: Maxime Chevallier >> Reviewed-by: Vadim Fedorenko >> Reviewed-by: Troy Mitchell >> Tested-by: Junhui Liu >> Tested-by: Troy Mitchell >> --- >> drivers/net/ethernet/Kconfig | 1 + >> drivers/net/ethernet/Makefile | 1 + >> drivers/net/ethernet/spacemit/Kconfig | 29 + >> drivers/net/ethernet/spacemit/Makefile | 6 + >> drivers/net/ethernet/spacemit/k1_emac.c | 2156 +++++++++++++++++++++++++++++++ > This is a large patch, so I'm sure I've missed some things. 
> But, overall, I think this is coming together. > Thanks for your recent updates. > > As the Kernel Patch Robot noticed a problem, > I've provided some minor feedback for your consideration. (That's the function at the end) > ... > >> +static void emac_wr(struct emac_priv *priv, u32 reg, u32 val) >> +{ >> + writel(val, priv->iobase + reg); >> +} >> + >> +static int emac_rd(struct emac_priv *priv, u32 reg) > nit: maybe u32 would be a more suitable return type. > Ah, right, will change to u32 in the next version. >> +{ >> + return readl(priv->iobase + reg); >> +} > ... > >> +static int emac_alloc_tx_resources(struct emac_priv *priv) >> +{ >> + struct emac_desc_ring *tx_ring = &priv->tx_ring; >> + struct platform_device *pdev = priv->pdev; >> + u32 size; >> + >> + size = sizeof(struct emac_tx_desc_buffer) * tx_ring->total_cnt; >> + >> + tx_ring->tx_desc_buf = kzalloc(size, GFP_KERNEL); > nit: I think you can use kcalloc() here. > >> + if (!tx_ring->tx_desc_buf) >> + return -ENOMEM; >> + >> + tx_ring->total_size = tx_ring->total_cnt * sizeof(struct emac_desc); >> + tx_ring->total_size = ALIGN(tx_ring->total_size, PAGE_SIZE); >> + >> + tx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, tx_ring->total_size, >> + &tx_ring->desc_dma_addr, >> + GFP_KERNEL); >> + if (!tx_ring->desc_addr) { >> + kfree(tx_ring->tx_desc_buf); >> + return -ENOMEM; >> + } >> + >> + tx_ring->head = 0; >> + tx_ring->tail = 0; >> + >> + return 0; >> +} > ... > >> +static int emac_alloc_rx_resources(struct emac_priv *priv) >> +{ >> + struct emac_desc_ring *rx_ring = &priv->rx_ring; >> + struct platform_device *pdev = priv->pdev; >> + u32 buf_len; >> + >> + buf_len = sizeof(struct emac_rx_desc_buffer) * rx_ring->total_cnt; >> + >> + rx_ring->rx_desc_buf = kzalloc(buf_len, GFP_KERNEL); > Ditto. Will change these array allocations to use kcalloc in the next version.
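As a rough userspace illustration of the kcalloc() suggestion above (this is not the driver code, and struct desc_buf and alloc_array_zeroed() are names made up for the sketch): an open-coded "sizeof(elem) * count" can wrap around before the allocator ever sees it, whereas an array-style allocator checks the product first and fails cleanly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-descriptor bookkeeping struct. */
struct desc_buf {
	void *skb;
	uint64_t dma_addr;
	uint32_t len;
};

/*
 * Allocate a zeroed array of 'count' elements of 'size' bytes each.
 * Returns NULL if count * size would overflow size_t, which is the
 * check an open-coded multiplication would skip.
 */
static void *alloc_array_zeroed(size_t count, size_t size)
{
	assert(size != 0);		/* element size must be nonzero */

	if (count > SIZE_MAX / size)
		return NULL;		/* product would wrap around */

	return calloc(count, size);
}
```

With ring sizes as small as the ones in this driver the overflow is unlikely to matter in practice; the array-style call mostly documents the intent.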
>> + if (!rx_ring->rx_desc_buf) >> + return -ENOMEM; >> + >> + rx_ring->total_size = rx_ring->total_cnt * sizeof(struct emac_desc); >> + >> + rx_ring->total_size = ALIGN(rx_ring->total_size, PAGE_SIZE); >> + >> + rx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, rx_ring->total_size, >> + &rx_ring->desc_dma_addr, >> + GFP_KERNEL); >> + if (!rx_ring->desc_addr) { >> + kfree(rx_ring->rx_desc_buf); >> + return -ENOMEM; >> + } >> + >> + rx_ring->head = 0; >> + rx_ring->tail = 0; >> + >> + return 0; >> +} > ... > >> +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) >> +{ >> + struct emac_priv *priv = bus->priv; >> + u32 cmd = 0, val; >> + int ret; >> + >> + cmd |= phy_addr & 0x1F; >> + cmd |= (regnum & 0x1F) << 5; > nit: I think this could benefit from using FIELD_PREP > Likewise for similar patterns in this patch. > Right... I'll take a look, thanks. >> + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; >> + >> + emac_wr(priv, MAC_MDIO_DATA, 0x0); >> + emac_wr(priv, MAC_MDIO_CONTROL, cmd); >> + >> + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, >> + !((val >> 15) & 0x1), 100, 10000); >> + >> + if (ret) >> + return ret; >> + >> + val = emac_rd(priv, MAC_MDIO_DATA); >> + return val; >> +} > ... > >> +/* >> + * Even though this MAC supports gigabit operation, it only provides 32-bit >> + * statistics counters. The most overflow-prone counters are the "bytes" ones, >> + * which at gigabit overflow about twice a minute. >> + * >> + * Therefore, we maintain the high 32 bits of counters ourselves, incrementing >> + * every time statistics seem to go backwards. Also, update periodically to >> + * catch overflows when we are not otherwise checking the statistics often >> + * enough. 
>> + */ >> + >> +#define EMAC_STATS_TIMER_PERIOD 20 >> + >> +static int emac_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res, >> + u32 control_reg, u32 high_reg, u32 low_reg) >> +{ >> + u32 val; >> + int ret; >> + >> + /* The "read" bit is the same for TX and RX */ >> + >> + val = MREGBIT_START_TX_COUNTER_READ | cnt; >> + emac_wr(priv, control_reg, val); >> + val = emac_rd(priv, control_reg); >> + >> + ret = readl_poll_timeout_atomic(priv->iobase + control_reg, val, >> + !(val & MREGBIT_START_TX_COUNTER_READ), >> + 100, 10000); >> + >> + if (ret) { >> + netdev_err(priv->ndev, "Read stat timeout\n"); >> + return ret; >> + } >> + >> + *res = emac_rd(priv, high_reg) << 16; >> + *res |= (u16)emac_rd(priv, low_reg); > nit: I think lower_16_bits() and lower_16_bits() would be appropriate here. This one is building up a 32-bit value instead of splitting a 32-bit value in two, and we don't have those macros in linux/wordpart.h. So I think I'll make a local helper: static u32 emac_make_stat(u16 high, u16 low) >> + >> + return 0; >> +} > ... > >> +static void emac_update_counter(u64 *counter, u32 new_low) >> +{ >> + u32 old_low = (u32)*counter; >> + u64 high = *counter >> 32; > Similarly, lower_32_bits() and upper_32_bits here. > Thanks, this one I'll change to {lower,upper}_32_bits. >> + >> + if (old_low > new_low) { >> + /* Overflowed, increment high 32 bits */ >> + high++; >> + } >> + >> + *counter = (high << 32) | new_low; >> +} >> + >> +static void emac_stats_update(struct emac_priv *priv) >> +{ >> + u64 *tx_stats_off = (u64 *)&priv->tx_stats_off; >> + u64 *rx_stats_off = (u64 *)&priv->rx_stats_off; >> + u64 *tx_stats = (u64 *)&priv->tx_stats; >> + u64 *rx_stats = (u64 *)&priv->rx_stats; > nit: I think it would be interesting to use a union containing > 1. the existing tx/rx stats struct and 2. an array of u64. > This may allow avoiding this cast. Which seems nice to me. > But YMMV. Looks like I can use a union with a DECLARE_FLEX_ARRAY for this. 
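A minimal userspace sketch of the two ideas discussed here, under stated assumptions: struct hw_stats is a made-up, shortened stats struct, and a fixed-size array stands in for the kernel's DECLARE_FLEX_ARRAY(). It shows the 32-to-64-bit counter extension from the quoted emac_update_counter() applied through a union view instead of a pointer cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened stats struct; the driver's has more fields. */
struct hw_stats {
	uint64_t frames;
	uint64_t bytes;
};

#define HW_STATS_COUNT (sizeof(struct hw_stats) / sizeof(uint64_t))

/* Named view and array view of the same storage, so no casts are needed. */
union hw_stats_view {
	struct hw_stats stats;
	uint64_t array[HW_STATS_COUNT];
};

/*
 * Extend a 32-bit hardware counter to 64 bits: if the low word went
 * backwards, assume exactly one wraparound and bump the high word.
 */
static void update_counter(uint64_t *counter, uint32_t new_low)
{
	uint32_t old_low;
	uint64_t high;

	assert(counter != NULL);

	old_low = (uint32_t)*counter;
	high = *counter >> 32;

	if (old_low > new_low)
		high++;

	*counter = (high << 32) | new_low;
}

/* Apply one fresh 32-bit hardware reading to each counter in the view. */
static void update_all(union hw_stats_view *view, const uint32_t *hw)
{
	for (size_t i = 0; i < HW_STATS_COUNT; i++)
		update_counter(&view->array[i], hw[i]);
}
```

The scheme only works if the counters are sampled more often than they can wrap, which is what the periodic timer in the patch is for.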
I'll change it in the next version. >> + u32 i, res; >> + >> + assert_spin_locked(&priv->stats_lock); >> + >> + if (!netif_running(priv->ndev) || !netif_device_present(priv->ndev)) { >> + /* Not up, don't try to update */ >> + return; >> + } >> + >> + for (i = 0; i < sizeof(priv->tx_stats) / sizeof(*tx_stats); i++) { >> + /* >> + * If reading stats times out, everything is broken and there's >> + * nothing we can do. Reading statistics also can't return an >> + * error, so just return without updating and without >> + * rescheduling. >> + */ >> + if (emac_tx_read_stat_cnt(priv, i, &res)) >> + return; >> + >> + /* >> + * Re-initializing while bringing interface up resets counters >> + * to zero, so to provide continuity, we add the values saved >> + * last time we did emac_down() to the new hardware-provided >> + * value. >> + */ >> + emac_update_counter(&tx_stats[i], res + (u32)tx_stats_off[i]); > nit: maybe lower_32_bits(tx_stats_off[i]) ? > >> + } >> + >> + /* Similar remarks as TX stats */ >> + for (i = 0; i < sizeof(priv->rx_stats) / sizeof(*rx_stats); i++) { >> + if (emac_rx_read_stat_cnt(priv, i, &res)) >> + return; >> + emac_update_counter(&rx_stats[i], res + (u32)rx_stats_off[i]); > Likewise, here for rx_stats_off[i]. > Thanks, these I will use lower_32_bits in these two places in the next version. >> + } >> + >> + mod_timer(&priv->stats_timer, jiffies + EMAC_STATS_TIMER_PERIOD * HZ); >> +} > ... > >> +static u64 emac_get_stat_tx_dropped(struct emac_priv *priv) >> +{ >> + u64 result; >> + int cpu; >> + >> + for_each_possible_cpu(cpu) { >> + result += READ_ONCE(per_cpu(*priv->stat_tx_dropped, cpu)); >> + } > nit: no need for {} here ? Thanks for the catch, but with regards to this entire function, I'm moving tx_dropped to dstats, so this would be moot. >> + >> + return result; >> +} > ... Thanks for your review. 
Vivian "dramforever" Wang From dlan at gentoo.org Thu Sep 11 04:22:51 2025 From: dlan at gentoo.org (Yixun Lan) Date: Thu, 11 Sep 2025 19:22:51 +0800 Subject: [PATCH] riscv: dts: spacemit: add UART pinctrl combinations In-Reply-To: <20250903145334.425633-1-hendrik.hamerlinck@hammernet.be> References: <20250903145334.425633-1-hendrik.hamerlinck@hammernet.be> Message-ID: <20250911112251-GYA1216475@gentoo.org> Hi Hendrik, On 16:53 Wed 03 Sep , Hendrik Hamerlinck wrote: > This adds UART pinctrl configurations based on the SoC datasheet and the > downstream Bianbu Linux tree. The drive strength values were taken from > the downstream implementation, which uses medium drive strength. > > For convenience, the board DTS files have been updated to include all > UART instances with their possible pinmux options in a disabled state. > > Tested this locally on both Orange Pi RV2 and Banana Pi BPI-F3 boards. > > Signed-off-by: Hendrik Hamerlinck > --- > .../boot/dts/spacemit/k1-bananapi-f3.dts | 18 ++ > .../boot/dts/spacemit/k1-orangepi-rv2.dts | 18 ++ > arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 276 +++++++++++++++++- > 3 files changed, 309 insertions(+), 3 deletions(-) > > diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts > index 6013be258542..661d47d1ce9e 100644 > --- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts > +++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts > @@ -49,3 +49,21 @@ &uart0 { > pinctrl-0 = <&uart0_2_cfg>; > status = "okay"; > }; > + > +&uart5 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart5_3_cfg>; > + status = "disabled"; > +}; > + > +&uart8 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart8_2_cfg>; > + status = "disabled"; > +}; > + > +&uart9 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart9_2_cfg>; > + status = "disabled"; > +}; all of uart5, 8, 9 come from the 26-pin port, so the functionality very likely depends on the final use case..
and I get your idea of adding those nodes but with "disabled" status.. my suggestion is to not add them, or leave to users to add separated dtbo (Device tree overlays) files in the future but I'm ok to complete uart pinctrl info in the dtsi file > diff --git a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts > index 337240ebb7b7..dc45b75b1ad4 100644 > --- a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts > +++ b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts > @@ -38,3 +38,21 @@ &uart0 { > pinctrl-0 = <&uart0_2_cfg>; > status = "okay"; > }; > + > +&uart5 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart5_3_cfg>; > + status = "disabled"; > +}; > + > +&uart8 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart8_2_cfg>; > + status = "disabled"; > +}; > + > +&uart9 { > + pinctrl-names = "default"; > + pinctrl-0 = <&uart9_2_cfg>; > + status = "disabled"; > +}; > diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi > index 381055737422..43425530b5bf 100644 > --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi > +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi > @@ -11,12 +11,282 @@ > #define K1_GPIO(x) (x / 32) (x % 32) > > &pinctrl { > + uart0_0_cfg: uart0-0-cfg { > + uart0-0-pins { > + pinmux = , /* uart0_txd */ > + ; /* uart0_rxd */ > + power-source = <3300>; > + bias-pull-up; > + drive-strength = <19>; > + }; > + }; > + > + uart0_1_cfg: uart0-1-cfg { > + uart0-1-pins { > + pinmux = , /* uart0_txd */ > + ; /* uart0_rxd */ > + power-source = <3300>; > + bias-pull-up; > + drive-strength = <19>; > + }; > + }; > + > uart0_2_cfg: uart0-2-cfg { > uart0-2-pins { > - pinmux = , > - ; > + pinmux = , /* uart0_txd */ > + ; /* uart0_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > > - bias-pull-up = <0>; > + uart2_0_cfg: uart2-0-cfg { > + uart2-0-pins { > + pinmux = , /* uart2_txd */ > + , /* uart2_rxd */ > + , /* uart2_cts */ > + ; /* uart2_rts 
*/ I think for groups that have cts and rts pins, it's more practical to have two separate cfgs, so the final application can choose to request two pins (tx, rx), or four pins (tx, rx, cts, rts).. (I believe the hardware should support this) something like this: uart2_0_cfg: uart2-0-cfg { uart2-0-pins { pinmux = , /* uart2_txd */ , /* uart2_rxd */ }; }; uart2_0_cts_rts_cfg: uart2-0-cts-rts-cfg { uart2-0-pins { pinmux = , /* uart2_cts */ , /* uart2_rts */ }; }; &uart2 { pinctrl-names = "default"; pinctrl-0 = <&uart2_0_cfg>, <&uart2_0_cts_rts_cfg>; }; > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart3_0_cfg: uart3-0-cfg { > + uart3-0-pins { > + pinmux = , /* uart3_txd */ > + , /* uart3_rxd */ > + , /* uart3_cts */ > + ; /* uart3_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart3_1_cfg: uart3-1-cfg { > + uart3-1-pins { > + pinmux = , /* uart3_txd */ > + , /* uart3_rxd */ > + , /* uart3_cts */ > + ; /* uart3_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart3_2_cfg: uart3-2-cfg { > + uart3-2-pins { > + pinmux = , /* uart3_txd */ > + , /* uart3_rxd */ > + , /* uart3_cts */ > + ; /* uart3_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart4_0_cfg: uart4-0-cfg { > + uart4-0-pins { > + pinmux = , /* uart4_txd */ > + ; /* uart4_rxd */ > + power-source = <3300>; > + bias-pull-up; > + drive-strength = <19>; > + }; > + }; > + > + uart4_1_cfg: uart4-1-cfg { > + uart4-1-pins { > + pinmux = , /* uart4_cts */ > + , /* uart4_rts */ > + , /* uart4_txd */ > + ; /* uart4_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart4_2_cfg: uart4-2-cfg { > + uart4-2-pins { > + pinmux = , /* uart4_txd */ > + ; /* uart4_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart4_3_cfg: uart4-3-cfg { > + uart4-3-pins { > + pinmux = , /* uart4_txd */ > + , /* uart4_rxd */ > + , /* uart4_cts */ > + ; /* uart4_rts */ > + bias-pull-up; > + drive-strength = 
<32>; > + }; > + }; > + > + uart4_4_cfg: uart4-4-cfg { > + uart4-4-pins { > + pinmux = , /* uart4_txd */ > + , /* uart4_rxd */ > + , /* uart4_cts */ > + ; /* uart4_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart5_0_cfg: uart5-0-cfg { > + uart5-0-pins { > + pinmux = , /* uart5_txd */ > + ; /* uart5_rxd */ > + power-source = <3300>; > + bias-pull-up; > + drive-strength = <19>; > + }; > + }; > + > + uart5_1_cfg: uart5-1-cfg { > + uart5-1-pins { > + pinmux = , /* uart5_txd */ > + , /* uart5_rxd */ > + , /* uart5_cts */ > + ; /* uart5_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart5_2_cfg: uart5-2-cfg { > + uart5-2-pins { > + pinmux = , /* uart5_txd */ > + , /* uart5_rxd */ > + , /* uart5_cts */ > + ; /* uart5_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart5_3_cfg: uart5-3-cfg { > + uart5-3-pins { > + pinmux = , /* uart5_txd */ > + , /* uart5_rxd */ > + , /* uart5_cts */ > + ; /* uart5_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart6_0_cfg: uart6-0-cfg { > + uart6-0-pins { > + pinmux = , /* uart6_cts */ > + , /* uart6_txd */ > + , /* uart6_rxd */ > + ; /* uart6_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart6_1_cfg: uart6-1-cfg { > + uart6-1-pins { > + pinmux = , /* uart6_txd */ > + , /* uart6_rxd */ > + , /* uart6_cts */ > + ; /* uart6_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart6_2_cfg: uart6-2-cfg { > + uart6-2-pins { > + pinmux = , /* uart6_txd */ > + ; /* uart6_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart7_0_cfg: uart7-0-cfg { > + uart7-0-pins { > + pinmux = , /* uart7_txd */ > + ; /* uart7_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart7_1_cfg: uart7-1-cfg { > + uart7-1-pins { > + pinmux = , /* uart7_txd */ > + , /* uart7_rxd */ > + , /* uart7_cts */ > + ; /* uart7_rts */ > + bias-pull-up; > + drive-strength = 
<32>; > + }; > + }; > + > + uart8_0_cfg: uart8-0-cfg { > + uart8-0-pins { > + pinmux = , /* uart8_txd */ > + ; /* uart8_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart8_1_cfg: uart8-1-cfg { > + uart8-1-pins { > + pinmux = , /* uart8_txd */ > + , /* uart8_rxd */ > + , /* uart8_cts */ > + ; /* uart8_rts */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart8_2_cfg: uart8-2-cfg { > + uart8-2-pins { > + pinmux = , /* uart8_txd */ > + , /* uart8_rxd */ > + , /* uart8_cts */ > + ; /* uart8_rts */ > + power-source = <3300>; > + bias-pull-up; > + drive-strength = <19>; > + }; > + }; > + > + uart9_0_cfg: uart9-0-cfg { > + uart9-0-pins { > + pinmux = , /* uart9_txd */ > + ; /* uart9_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart9_1_cfg: uart9-1-cfg { > + uart9-1-pins { > + pinmux = , /* uart9_cts */ > + , /* uart9_rts */ > + , /* uart9_txd */ > + ; /* uart9_rxd */ > + bias-pull-up; > + drive-strength = <32>; > + }; > + }; > + > + uart9_2_cfg: uart9-2-cfg { > + uart9-2-pins { > + pinmux = , /* uart9_txd */ > + ; /* uart9_rxd */ > + bias-pull-up; > drive-strength = <32>; > }; > }; > -- > 2.43.0 > -- Yixun Lan (dlan) From zihong.plct at isrc.iscas.ac.cn Thu Sep 11 05:12:07 2025 From: zihong.plct at isrc.iscas.ac.cn (Yao Zihong) Date: Thu, 11 Sep 2025 20:12:07 +0800 Subject: [PATCH v1 0/2] riscv: hwprobe: add Zicbop support Message-ID: <20250911121219.20243-1-zihong.plct@isrc.iscas.ac.cn> Add UAPI and kernel plumbing to expose the Zicbop extension presence and its block size through sys_hwprobe(). The interface mirrors Zicbom/Zicboz. This allows userspace to safely discover and optimize for Zicbop when available. 
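As a rough illustration of how userspace might consume the two new values, here is a minimal sketch in plain C. It does not issue the syscall; it assumes the caller has already obtained the extension bitmask and the block-size value from sys_hwprobe(). The two constants are copied from patch 1/2 of this series. The helper name zicbop_prefetch_stride() is hypothetical and not part of the series.

```c
#include <stdint.h>

/* Constants as proposed in patch 1/2 of this series. */
#define RISCV_HWPROBE_EXT_ZICBOP            (1ULL << 59)
#define RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE 14

/*
 * Hypothetical helper (not part of the series): given the extension
 * bitmask and the value reported for the Zicbop block-size key, return
 * the stride to use for prefetch hints, or 0 if Zicbop is absent.
 * This mirrors the kernel side, which reports 0 unless the extension
 * bit is set.
 */
static inline uint64_t zicbop_prefetch_stride(uint64_t ext_mask,
                                              uint64_t block_size)
{
    if (!(ext_mask & RISCV_HWPROBE_EXT_ZICBOP))
        return 0;
    return block_size;
}
```

A caller would fall back to a non-prefetching code path whenever the helper returns 0.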
Yao Zihong (2): uapi: riscv: hwprobe: add Zicbop extension bit and block-size key riscv: hwprobe: report Zicbop presence and block size arch/riscv/include/uapi/asm/hwprobe.h | 2 ++ arch/riscv/kernel/sys_hwprobe.c | 6 ++++++ 2 files changed, 8 insertions(+) -- 2.47.2 From zihong.plct at isrc.iscas.ac.cn Thu Sep 11 05:12:08 2025 From: zihong.plct at isrc.iscas.ac.cn (Yao Zihong) Date: Thu, 11 Sep 2025 20:12:08 +0800 Subject: [PATCH v1 1/2] uapi: riscv: hwprobe: add Zicbop extension bit and block-size key In-Reply-To: <20250911121219.20243-1-zihong.plct@isrc.iscas.ac.cn> References: <20250911121219.20243-1-zihong.plct@isrc.iscas.ac.cn> Message-ID: <20250911121219.20243-2-zihong.plct@isrc.iscas.ac.cn> Introduce RISCV_HWPROBE_EXT_ZICBOP to report presence of the Zicbop extension through sys_hwprobe(), and add RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE to expose the block size (in bytes) when Zicbop is supported. Signed-off-by: Yao Zihong --- arch/riscv/include/uapi/asm/hwprobe.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h index aaf6ad970499..c65c41a4d8ea 100644 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@ -82,6 +82,7 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_EXT_ZAAMO (1ULL << 56) #define RISCV_HWPROBE_EXT_ZALRSC (1ULL << 57) #define RISCV_HWPROBE_EXT_ZABHA (1ULL << 58) +#define RISCV_HWPROBE_EXT_ZICBOP (1ULL << 59) #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) @@ -106,6 +107,7 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 11 #define RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE 12 #define RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 13 +#define RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE 14 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. 
*/ /* Flags */ -- 2.47.2 From zihong.plct at isrc.iscas.ac.cn Thu Sep 11 05:12:09 2025 From: zihong.plct at isrc.iscas.ac.cn (Yao Zihong) Date: Thu, 11 Sep 2025 20:12:09 +0800 Subject: [PATCH v1 2/2] riscv: hwprobe: report Zicbop presence and block size In-Reply-To: <20250911121219.20243-1-zihong.plct@isrc.iscas.ac.cn> References: <20250911121219.20243-1-zihong.plct@isrc.iscas.ac.cn> Message-ID: <20250911121219.20243-3-zihong.plct@isrc.iscas.ac.cn> Plumb Zicbop into sys_hwprobe. Semantics mirror Zicbom/Zicboz to keep userspace expectations aligned. Signed-off-by: Yao Zihong --- arch/riscv/kernel/sys_hwprobe.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 0b170e18a2be..857d4e602e76 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -112,6 +112,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZCB); EXT_KEY(ZCMOP); EXT_KEY(ZICBOM); + EXT_KEY(ZICBOP); EXT_KEY(ZICBOZ); EXT_KEY(ZICNTR); EXT_KEY(ZICOND); @@ -294,6 +295,11 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair, if (hwprobe_ext0_has(cpus, RISCV_HWPROBE_EXT_ZICBOM)) pair->value = riscv_cbom_block_size; break; + case RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE: + pair->value = 0; + if (hwprobe_ext0_has(cpus, RISCV_HWPROBE_EXT_ZICBOP)) + pair->value = riscv_cbop_block_size; + break; case RISCV_HWPROBE_KEY_HIGHEST_VIRT_ADDRESS: pair->value = user_max_virt_addr(); break; -- 2.47.2 From cp0613 at linux.alibaba.com Thu Sep 11 05:44:44 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Thu, 11 Sep 2025 20:44:44 +0800 Subject: [RFC PATCH 0/4] riscv: trace: Implement riscv trace pmu driver and perf support Message-ID: <20250911124448.1771-1-cp0613@linux.alibaba.com> From: Chen Pei The RISC-V Trace Specification defines a standardized framework for capturing and analyzing the execution of RISC-V processors.
Its main uses include instruction and data tracing and real-time debugging, similar to Intel PT and Arm CoreSight.

According to the RISC-V Trace Control Interface specification [1], two standard RISC-V trace protocols use this interface:

- RISC-V N-Trace (Nexus-based Trace) Specification
- Efficient Trace for RISC-V Specification

The control interface is therefore a complete guideline for any standard RISC-V trace implementation.

This series is mainly intended to start the related work and discussion. It does the following:

1. Adds dt-bindings with a basic, still incomplete, definition of the RISC-V trace component properties.
2. Implements a basic RISC-V Trace PMU driver, including support for the aux buffer.
3. Implements basic AUXTRACE integration with perf tools.

More work remains, such as:

1. Completing the RISC-V Trace PMU implementation.
2. Generating and parsing perf.data, including AUXTRACE events.
3. Taking RISC-V N-Trace as an example, parsing the Nexus Trace data format, including support for the perf report and perf script commands.

We are still working through these items. Any comments or suggestions are welcome.
[1] https://github.com/riscv-non-isa/tg-nexus-trace.git Chen Pei (4): dt-bindings: riscv: Add trace components description riscv: event: Initial riscv trace driver support tools: perf: Support perf record with aux buffer for riscv trace riscv: trace: Support sink using dma buffer .../riscv/trace/riscv,trace,encoder.yaml | 41 +++ .../riscv/trace/riscv,trace,funnel.yaml | 46 ++++ .../riscv/trace/riscv,trace,sink.yaml | 37 +++ arch/riscv/Kbuild | 1 + arch/riscv/Kconfig | 2 + arch/riscv/events/Kconfig | 11 + arch/riscv/events/Makefile | 3 + arch/riscv/events/riscv_trace.c | 253 ++++++++++++++++++ arch/riscv/events/riscv_trace.h | 133 +++++++++ arch/riscv/events/riscv_trace_encoder.c | 109 ++++++++ arch/riscv/events/riscv_trace_funnel.c | 160 +++++++++++ arch/riscv/events/riscv_trace_sink.c | 100 +++++++ tools/perf/arch/riscv/util/Build | 3 + tools/perf/arch/riscv/util/auxtrace.c | 33 +++ tools/perf/arch/riscv/util/pmu.c | 18 ++ tools/perf/arch/riscv/util/riscv-trace.c | 183 +++++++++++++ tools/perf/arch/riscv/util/tsc.c | 15 ++ tools/perf/util/Build | 1 + tools/perf/util/auxtrace.c | 4 + tools/perf/util/auxtrace.h | 1 + tools/perf/util/riscv-trace.c | 162 +++++++++++ tools/perf/util/riscv-trace.h | 18 ++ 22 files changed, 1334 insertions(+) create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,encoder.yaml create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,funnel.yaml create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,sink.yaml create mode 100644 arch/riscv/events/Kconfig create mode 100644 arch/riscv/events/Makefile create mode 100644 arch/riscv/events/riscv_trace.c create mode 100644 arch/riscv/events/riscv_trace.h create mode 100644 arch/riscv/events/riscv_trace_encoder.c create mode 100644 arch/riscv/events/riscv_trace_funnel.c create mode 100644 arch/riscv/events/riscv_trace_sink.c create mode 100644 tools/perf/arch/riscv/util/auxtrace.c create mode 100644 tools/perf/arch/riscv/util/pmu.c 
create mode 100644 tools/perf/arch/riscv/util/riscv-trace.c
create mode 100644 tools/perf/arch/riscv/util/tsc.c
create mode 100644 tools/perf/util/riscv-trace.c
create mode 100644 tools/perf/util/riscv-trace.h

--
2.49.0

From cp0613 at linux.alibaba.com Thu Sep 11 05:44:45 2025
From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com)
Date: Thu, 11 Sep 2025 20:44:45 +0800
Subject: [RFC PATCH 1/4] dt-bindings: riscv: Add trace components description
In-Reply-To: <20250911124448.1771-1-cp0613@linux.alibaba.com>
References: <20250911124448.1771-1-cp0613@linux.alibaba.com>
Message-ID: <20250911124448.1771-2-cp0613@linux.alibaba.com>

From: Chen Pei

This patch adds property definitions for the RISC-V trace components, providing a foundation for subsequent driver implementations.

The RISC-V Trace Control Interface specification can be found in [1]. Some principles are as follows:

1. Trace has three types of components:
   1.1 Encoder: Collects CPU execution information through the Ingress Port and generates Trace Messages.
   1.2 Funnel: Used to integrate multiple trace sources.
   1.3 Sink: Used to store trace data.
2. Each hart requires one trace encoder.
3. When there are multiple trace sources, a trace funnel component is needed to integrate them. One trace funnel is required for each cluster.
4. When multiple trace funnels are fed into a single trace sink, multiple levels of trace funnels are required.
5. If there is only one cluster, the trace funnel (Level 0) can be connected directly to the trace sink.
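The principles above can be sketched as a small walk over the component graph. This is only an illustration in userspace C under simplifying assumptions: every component has a single downstream link, and all names (trace_node, funnels_to_sink, the two example functions) are hypothetical and do not come from the series.

```c
#include <stddef.h>

enum trace_type { TRACE_ENCODER, TRACE_FUNNEL, TRACE_SINK };

/* One trace component; 'next' is the index of the downstream component. */
struct trace_node {
    enum trace_type type;
    int next; /* -1 when nothing is connected downstream */
};

/*
 * Follow the path from an encoder to the sink and return the number of
 * funnels crossed on the way. Returns -1 if the path does not start at
 * an encoder, leaves the array, hits a non-funnel before the sink, or
 * loops without reaching a sink.
 */
static inline int funnels_to_sink(const struct trace_node *nodes, size_t n,
                                  size_t start)
{
    size_t cur = start;
    int funnels = 0;

    if (start >= n || nodes[start].type != TRACE_ENCODER)
        return -1;

    for (size_t hops = 0; hops < n; hops++) {
        int next = nodes[cur].next;

        if (next < 0 || (size_t)next >= n)
            return -1;
        cur = (size_t)next;
        if (nodes[cur].type == TRACE_SINK)
            return funnels;
        if (nodes[cur].type != TRACE_FUNNEL)
            return -1;
        funnels++;
    }
    return -1;
}

/* Single cluster: encoder -> funnel -> sink. */
static inline int example_single_cluster(void)
{
    const struct trace_node chain[] = {
        { TRACE_ENCODER, 1 },
        { TRACE_FUNNEL, 2 },
        { TRACE_SINK, -1 },
    };

    return funnels_to_sink(chain, 3, 0);
}

/* Two clusters: each encoder has its own funnel, merged by a second-level funnel. */
static inline int example_two_clusters(void)
{
    const struct trace_node graph[] = {
        { TRACE_ENCODER, 2 }, /* hart in cluster 0 */
        { TRACE_ENCODER, 3 }, /* hart in cluster 1 */
        { TRACE_FUNNEL, 4 },  /* cluster 0 funnel */
        { TRACE_FUNNEL, 4 },  /* cluster 1 funnel */
        { TRACE_FUNNEL, 5 },  /* second-level funnel */
        { TRACE_SINK, -1 },
    };

    return funnels_to_sink(graph, 6, 1);
}
```

In the single-cluster layout the walk crosses one funnel before the sink; with two clusters it crosses two, which matches the need for multiple funnel levels when several funnels feed one sink.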
Taking [cpu0]-->[encoder0]-->[funnel0]-->[sink0] as an example, the DTS configuration is as follows: encoder0: trace_encoder at 26001000 { compatible = "riscv_trace,encoder-controller"; reg = <0x0 0x26001000 0x0 0x1000>; cpu = <&cpu0>; output_port { port0 { endpoint = <&funnel0>; }; }; }; funnel0: trace_funnel at 26404000 { compatible = "riscv_trace,funnel-controller"; reg = <0x0 0x26404000 0x0 0x1000>; level = <1>; input_port { port0 { endpoint = <&encoder0>; }; }; output_port { port0 { endpoint = <&sink0>; }; }; }; sink0: trace_sink at 26401000 { compatible = "riscv_trace,sink-controller"; reg = <0x0 0x26401000 0x0 0x1000>; input_port { port0 { endpoint = <&funnel0>; }; }; }; Note: The detailed property definition of each component will be provided in the subsequent series of patches. [1] https://github.com/riscv-non-isa/tg-nexus-trace.git Signed-off-by: Chen Pei --- .../riscv/trace/riscv,trace,encoder.yaml | 41 +++++++++++++++++ .../riscv/trace/riscv,trace,funnel.yaml | 46 +++++++++++++++++++ .../riscv/trace/riscv,trace,sink.yaml | 37 +++++++++++++++ 3 files changed, 124 insertions(+) create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,encoder.yaml create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,funnel.yaml create mode 100644 Documentation/devicetree/bindings/riscv/trace/riscv,trace,sink.yaml diff --git a/Documentation/devicetree/bindings/riscv/trace/riscv,trace,encoder.yaml b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,encoder.yaml new file mode 100644 index 000000000000..e2ec3ce514b2 --- /dev/null +++ b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,encoder.yaml @@ -0,0 +1,41 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/riscv/trace/riscv,trace,encoder.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: RISC-V Trace Encoder Controller + +description: | + riscv trace encoder controller description. 
+ +maintainers: + - Chen Pei + +properties: + compatible: + items: + - const: riscv_trace,encoder-controller + reg: + description: A memory region containing registers for encoder controller + + cpu: + description: CPU identifier associated with this encoder + + ports: + description: Output port definitions + +additionalProperties: true + +examples: + - | + encoder0: trace_encoder at 26001000 { + compatible = "riscv_trace,encoder-controller"; + reg = <0x0 0x26001000 0x0 0x1000>; + cpu = <&cpu0>; + output_port { + port0 { + endpoint = <&funnel0>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/riscv/trace/riscv,trace,funnel.yaml b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,funnel.yaml new file mode 100644 index 000000000000..5da836997355 --- /dev/null +++ b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,funnel.yaml @@ -0,0 +1,46 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/riscv/trace/riscv,trace,funnel.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: RISC-V Trace Funnel Controller + +description: | + riscv trace funnel controller description. 
+ +maintainers: + - Chen Pei + +properties: + compatible: + items: + - const: riscv_trace,funnel-controller + reg: + description: A memory region containing registers for funnel controller + + ports: + description: Input/Output port definitions + + level: + description: Level of the funnel (e.g., 1 means close to the encoder) + +additionalProperties: true + +examples: + - | + funnel0: trace_funnel at 26404000 { + compatible = "riscv_trace,funnel-controller"; + reg = <0x0 0x26404000 0x0 0x1000>; + level = <1>; + input_port { + port0 { + endpoint = <&encoder0>; + }; + }; + output_port { + port0 { + endpoint = <&sink0>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/riscv/trace/riscv,trace,sink.yaml b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,sink.yaml new file mode 100644 index 000000000000..b42e65988f31 --- /dev/null +++ b/Documentation/devicetree/bindings/riscv/trace/riscv,trace,sink.yaml @@ -0,0 +1,37 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/riscv/trace/riscv,trace,sink.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: RISC-V Trace Sink Controller + +description: | + riscv trace sink controller description. 
+
+maintainers:
+  - Chen Pei
+
+properties:
+  compatible:
+    items:
+      - const: riscv_trace,sink-controller
+  reg:
+    description: A memory region containing registers for sink controller
+
+  ports:
+    description: Input port definitions
+
+additionalProperties: true
+
+examples:
+  - |
+    sink0: trace_sink@26401000 {
+        compatible = "riscv_trace,sink-controller";
+        reg = <0x0 0x26401000 0x0 0x1000>;
+        input_port {
+            port0 {
+                endpoint = <&funnel0>;
+            };
+        };
+    };
--
2.49.0

From cp0613 at linux.alibaba.com Thu Sep 11 05:44:48 2025
From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com)
Date: Thu, 11 Sep 2025 20:44:48 +0800
Subject: [RFC PATCH 4/4] riscv: trace: Support sink using dma buffer
In-Reply-To: <20250911124448.1771-1-cp0613@linux.alibaba.com>
References: <20250911124448.1771-1-cp0613@linux.alibaba.com>
Message-ID: <20250911124448.1771-5-cp0613@linux.alibaba.com>

From: Chen Pei

In common SoC systems, the trace data collected by the sink is usually written to memory, which needs to be a large block. There are two ways to provide it. The first is based on reserved memory; it requires setting memory aside in advance and is not flexible enough. We therefore chose the second, which relies on the IOMMU to map non-contiguous memory into a contiguous range. With this approach the driver only needs the DMA allocation APIs.
Signed-off-by: Chen Pei --- arch/riscv/events/riscv_trace.c | 49 ++++++++++++++++++++++++++++++++- arch/riscv/events/riscv_trace.h | 4 ++- 2 files changed, 51 insertions(+), 2 deletions(-) diff --git a/arch/riscv/events/riscv_trace.c b/arch/riscv/events/riscv_trace.c index 3ac4a3be5d3e..e8deaefa0180 100644 --- a/arch/riscv/events/riscv_trace.c +++ b/arch/riscv/events/riscv_trace.c @@ -9,6 +9,7 @@ #include #include #include +#include #include "riscv_trace.h" @@ -55,6 +56,44 @@ static void riscv_trace_init_filter_attrs(struct perf_event *event) riscv_trace_pmu.filter_attr.priv_mode); } +static int riscv_trace_sink_dma_alloc(unsigned long size) +{ + struct riscv_trace_component *component; + dma_addr_t dma_addr; + + list_for_each_entry(component, &riscv_trace_controllers, list) { + if (component->type == RISCV_TRACE_SINK) { + component->sink.vaddr = + dma_alloc_coherent(riscv_trace_pmu.pmu.dev, size, + &dma_addr, GFP_KERNEL); + if (component->sink.vaddr) { + component->sink.start_addr = dma_addr; + component->sink.limit_addr = dma_addr + size; + continue; + } else { + pr_err("dma_alloc_coherent failed\n"); + return -ENOMEM; + } + } + } + + return 0; +} + +static void riscv_trace_sink_dma_free(void) +{ + struct riscv_trace_component *component; + + list_for_each_entry(component, &riscv_trace_controllers, list) { + if (component->type == RISCV_TRACE_SINK) { + if (component->sink.vaddr) + dma_free_coherent(riscv_trace_pmu.pmu.dev, + component->sink.limit_addr - component->sink.start_addr, + component->sink.vaddr, component->sink.start_addr); + } + } +} + static int riscv_trace_event_init(struct perf_event *event) { if (event->attr.type != riscv_trace_pmu.pmu.type) @@ -105,7 +144,7 @@ static void *riscv_trace_buffer_setup_aux(struct perf_event *event, void **pages { struct riscv_trace_aux_buf *buf; struct page **pagelist; - int i; + int i, ret; if (overwrite) { pr_warn("Overwrite mode is not supported\n"); @@ -135,6 +174,12 @@ static void 
*riscv_trace_buffer_setup_aux(struct perf_event *event, void **pages pr_info("nr_pages=%d length=%d\n", buf->nr_pages, buf->length); + ret = riscv_trace_sink_dma_alloc(buf->length); + if (ret) { + kfree(pagelist); + goto err; + } + kfree(pagelist); return buf; err: @@ -148,6 +193,8 @@ static void riscv_trace_buffer_free_aux(void *aux) vunmap(buf->base); kfree(buf); + + riscv_trace_sink_dma_free(); } static int __init riscv_trace_init(void) diff --git a/arch/riscv/events/riscv_trace.h b/arch/riscv/events/riscv_trace.h index c28216227006..7819fbeace1f 100644 --- a/arch/riscv/events/riscv_trace.h +++ b/arch/riscv/events/riscv_trace.h @@ -49,7 +49,9 @@ struct riscv_trace_funnel { }; struct riscv_trace_sink { - ; + u64 start_addr; + u64 limit_addr; + void __iomem *vaddr; }; struct riscv_trace_component { -- 2.49.0 From cp0613 at linux.alibaba.com Thu Sep 11 05:44:46 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Thu, 11 Sep 2025 20:44:46 +0800 Subject: [RFC PATCH 2/4] riscv: event: Initial riscv trace driver support In-Reply-To: <20250911124448.1771-1-cp0613@linux.alibaba.com> References: <20250911124448.1771-1-cp0613@linux.alibaba.com> Message-ID: <20250911124448.1771-3-cp0613@linux.alibaba.com> From: Chen Pei This patch implements the riscv trace perf driver. It's suitable for RISC-V processors that implement the trace specification (N-Trace or E-Trace). Users can specify the riscv_trace driver through the perf program. The driver adds two format attributes, start_addr and stop_addr, to specify the valid instruction range for tracing. It also supports specifying the priv_mode (user and kernel) to specify the valid privilege mode for tracing. 
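To make the filter semantics concrete, here is a minimal userspace sketch of how the two format attributes and the privilege-mode selection fit together. It mirrors the decision order used by the driver in this patch (exclude_kernel is checked first, then exclude_user), but the type and function names are hypothetical stand-ins, not the driver's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the driver's privilege-mode values (names are illustrative). */
enum trace_priv_mode {
    TRACE_PRIV_EXCL_NONE = 0, /* trace both user and kernel */
    TRACE_PRIV_EXCL_KERN,     /* kernel excluded, user only */
    TRACE_PRIV_EXCL_USER,     /* user excluded, kernel only */
};

struct trace_filter {
    uint64_t start_addr; /* from perf attr.config  (config:0-63)  */
    uint64_t stop_addr;  /* from perf attr.config1 (config1:0-63) */
    enum trace_priv_mode priv_mode;
};

/* Build a filter from the raw perf attribute fields. */
static inline struct trace_filter
trace_filter_from_attr(uint64_t config, uint64_t config1,
                       bool exclude_kernel, bool exclude_user)
{
    struct trace_filter f = {
        .start_addr = config,
        .stop_addr = config1,
        .priv_mode = TRACE_PRIV_EXCL_NONE,
    };

    if (exclude_kernel)
        f.priv_mode = TRACE_PRIV_EXCL_KERN;
    else if (exclude_user)
        f.priv_mode = TRACE_PRIV_EXCL_USER;

    return f;
}
```

As far as I understand perf's event modifiers, a trailing k requests kernel-only sampling by setting exclude_user, so in this sketch it would select TRACE_PRIV_EXCL_USER.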
The reference commands are as follows: cat /sys/bus/event_source/devices/riscv_trace/format/* perf record -e riscv_trace// ls perf record -e riscv_trace/start_addr=0x1234,stop_addr=0x5678/ ls perf record -e riscv_trace/start_addr=0x1234,stop_addr=0x5678/k ls perf report -D Signed-off-by: Chen Pei --- arch/riscv/Kbuild | 1 + arch/riscv/Kconfig | 2 + arch/riscv/events/Kconfig | 11 ++ arch/riscv/events/Makefile | 3 + arch/riscv/events/riscv_trace.c | 145 +++++++++++++++++++++ arch/riscv/events/riscv_trace.h | 123 ++++++++++++++++++ arch/riscv/events/riscv_trace_encoder.c | 109 ++++++++++++++++ arch/riscv/events/riscv_trace_funnel.c | 160 ++++++++++++++++++++++++ arch/riscv/events/riscv_trace_sink.c | 100 +++++++++++++++ 9 files changed, 654 insertions(+) create mode 100644 arch/riscv/events/Kconfig create mode 100644 arch/riscv/events/Makefile create mode 100644 arch/riscv/events/riscv_trace.c create mode 100644 arch/riscv/events/riscv_trace.h create mode 100644 arch/riscv/events/riscv_trace_encoder.c create mode 100644 arch/riscv/events/riscv_trace_funnel.c create mode 100644 arch/riscv/events/riscv_trace_sink.c diff --git a/arch/riscv/Kbuild b/arch/riscv/Kbuild index 126fb738fc44..8107a614c428 100644 --- a/arch/riscv/Kbuild +++ b/arch/riscv/Kbuild @@ -4,6 +4,7 @@ obj-y += kernel/ mm/ net/ obj-$(CONFIG_CRYPTO) += crypto/ obj-y += errata/ obj-$(CONFIG_KVM) += kvm/ +obj-$(CONFIG_PERF_EVENTS) += events/ obj-$(CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY) += purgatory/ diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 51dcd8eaa243..145d3424651b 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -1374,3 +1374,5 @@ endmenu # "CPU Power Management" source "arch/riscv/kvm/Kconfig" source "drivers/acpi/Kconfig" + +source "arch/riscv/events/Kconfig" diff --git a/arch/riscv/events/Kconfig b/arch/riscv/events/Kconfig new file mode 100644 index 000000000000..c6fb073b29b1 --- /dev/null +++ b/arch/riscv/events/Kconfig @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: GPL-2.0 
+menu "Performance monitoring" + +config PERF_EVENTS_RISCV_TRACE + tristate "RISCV TRACE events" + depends on PERF_EVENTS && RISCV + default m + help + Include support for riscv trace events. + +endmenu diff --git a/arch/riscv/events/Makefile b/arch/riscv/events/Makefile new file mode 100644 index 000000000000..5014de2847df --- /dev/null +++ b/arch/riscv/events/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_PERF_EVENTS_RISCV_TRACE) += riscv_trace_pmu.o +riscv_trace_pmu-objs := riscv_trace.o riscv_trace_encoder.o riscv_trace_funnel.o riscv_trace_sink.o diff --git a/arch/riscv/events/riscv_trace.c b/arch/riscv/events/riscv_trace.c new file mode 100644 index 000000000000..e408d9a4034a --- /dev/null +++ b/arch/riscv/events/riscv_trace.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) KBUILD_BASENAME ": " fmt + +#include +#include +#include +#include +#include +#include + +#include "riscv_trace.h" + +LIST_HEAD(riscv_trace_controllers); +static struct riscv_trace_pmu riscv_trace_pmu; + +PMU_FORMAT_ATTR(start_addr, "config:0-63"); +PMU_FORMAT_ATTR(stop_addr, "config1:0-63"); + +static struct attribute *riscv_trace_filter_attrs[] = { + &format_attr_start_addr.attr, + &format_attr_stop_addr.attr, + NULL, +}; + +static struct attribute_group riscv_trace_filter_attr_group = { + .name = "format", + .attrs = riscv_trace_filter_attrs, +}; + +static const struct attribute_group *riscv_trace_attr_groups[] = { + &riscv_trace_filter_attr_group, + NULL +}; + +static void riscv_trace_init_filter_attrs(struct perf_event *event) +{ + riscv_trace_pmu.filter_attr.start_addr = event->attr.config; + riscv_trace_pmu.filter_attr.stop_addr = event->attr.config1; + + if (event->attr.exclude_kernel) + riscv_trace_pmu.filter_attr.priv_mode = + RISCV_TRACE_PRIV_MODE_EXCL_KERN; + else if (event->attr.exclude_user) + riscv_trace_pmu.filter_attr.priv_mode = + RISCV_TRACE_PRIV_MODE_EXCL_USER; + else + riscv_trace_pmu.filter_attr.priv_mode 
= + RISCV_TRACE_PRIV_MODE_EXCL_NONE; + + pr_info("start_addr=0x%llx stop_addr=0x%llx priv_mode=%d\n", + riscv_trace_pmu.filter_attr.start_addr, + riscv_trace_pmu.filter_attr.stop_addr, + riscv_trace_pmu.filter_attr.priv_mode); +} + +static int riscv_trace_event_init(struct perf_event *event) +{ + if (event->attr.type != riscv_trace_pmu.pmu.type) + return -ENOENT; + + riscv_trace_init_filter_attrs(event); + + return 0; +} + +static int riscv_trace_event_add(struct perf_event *event, int flags) +{ + pr_info("%s:%d\n", __func__, __LINE__); + // TODO: Configuring the trace component + return 0; +} + +static void riscv_trace_event_del(struct perf_event *event, int flags) +{ + // TODO: Reset the trace component + pr_info("%s:%d\n", __func__, __LINE__); +} + +static void riscv_trace_event_start(struct perf_event *event, int flags) +{ + pr_info("%s:%d on_cpu=%d cpu=%d\n", __func__, __LINE__, + event->oncpu, event->cpu); + // TODO: Enable the trace component +} + +static void riscv_trace_event_stop(struct perf_event *event, int flags) +{ + pr_info("%s:%d on_cpu=%d cpu=%d\n", __func__, __LINE__, + event->oncpu, event->cpu); + // TODO: Disable the trace component +} + +static int __init riscv_trace_init(void) +{ + struct riscv_trace_component *component; + + riscv_trace_encoder_init(); + riscv_trace_funnel_init(); + riscv_trace_sink_init(); + + if (get_list_count(&riscv_trace_controllers) == 0) + return -ENXIO; + + list_for_each_entry(component, &riscv_trace_controllers, list) { + pr_info("type=%s in_num=%d out_num=%d\n", + riscv_trace_type2str(component->type), + component->in_num, component->out_num); + for (int i = 0; i < component->in_num; i++) { + pr_info("\t in[%d] type=%s base_addr=0x%llx\n", i, + riscv_trace_type2str(component->in[i]->type), + component->in[i]->base_addr); + } + for (int j = 0; j < component->out_num; j++) { + pr_info("\t out[%d] type=%s base_addr=0x%llx\n", j, + riscv_trace_type2str(component->out[j]->type), + component->out[j]->base_addr); + } + } + 
+ riscv_trace_pmu.pmu.module = THIS_MODULE, + riscv_trace_pmu.pmu.name = "riscv_trace", + riscv_trace_pmu.pmu.capabilities = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE; + riscv_trace_pmu.pmu.attr_groups = riscv_trace_attr_groups; + riscv_trace_pmu.pmu.task_ctx_nr = perf_sw_context, + riscv_trace_pmu.pmu.event_init = riscv_trace_event_init; + riscv_trace_pmu.pmu.add = riscv_trace_event_add; + riscv_trace_pmu.pmu.del = riscv_trace_event_del; + riscv_trace_pmu.pmu.start = riscv_trace_event_start; + riscv_trace_pmu.pmu.stop = riscv_trace_event_stop; + + return perf_pmu_register(&riscv_trace_pmu.pmu, "riscv_trace", -1); +} + +static void __exit riscv_trace_exit(void) +{ + perf_pmu_unregister(&riscv_trace_pmu.pmu); +} + +module_init(riscv_trace_init); +module_exit(riscv_trace_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Chen Pei "); +MODULE_DESCRIPTION("Driver for RISC-V Trace Device"); diff --git a/arch/riscv/events/riscv_trace.h b/arch/riscv/events/riscv_trace.h new file mode 100644 index 000000000000..ef0af0d0b2ee --- /dev/null +++ b/arch/riscv/events/riscv_trace.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __RISCV_TRACE_H__ +#define __RISCV_TRACE_H__ + +#include +#include +#include +#include + +#define RISCV_TRACE_ADDR_MASK GENMASK(63, 0) + +enum RISCV_TRACE_COMPONENT_TYPE { + RISCV_TRACE_ENCODER = 0, + RISCV_TRACE_FUNNEL, + RISCV_TRACE_SINK, +}; + +enum RISCV_TRACE_FUNNEL_LEVEL { + LEVEL1_FUNNEL = 1, + LEVEL2_FUNNEL = 2, +}; + +enum RISCV_TRACE_PRIV_MODE_TYPE { + RISCV_TRACE_PRIV_MODE_EXCL_NONE = 0, + RISCV_TRACE_PRIV_MODE_EXCL_KERN, + RISCV_TRACE_PRIV_MODE_EXCL_USER, +}; + +struct riscv_trace_filter_attr { + u64 start_addr; + u64 stop_addr; + u32 priv_mode; // user&kernel +}; + +struct riscv_io_port { + bool is_input; // input=1, output=0 + u32 endpoint_num; + enum RISCV_TRACE_COMPONENT_TYPE type; + u64 base_addr; +}; + +struct riscv_trace_encoder { + u32 cpu; +}; + +struct riscv_trace_funnel { + enum RISCV_TRACE_FUNNEL_LEVEL 
level; +}; + +struct riscv_trace_sink { + ; +}; + +struct riscv_trace_component { + enum RISCV_TRACE_COMPONENT_TYPE type; + u64 reg_base; + u64 reg_size; + struct list_head list; + + union { + struct riscv_trace_encoder encoder; + struct riscv_trace_funnel funnel; + struct riscv_trace_sink sink; + }; + + u32 in_num; + u32 out_num; + struct riscv_io_port **in; + struct riscv_io_port **out; +}; + +extern struct list_head riscv_trace_controllers; + +struct riscv_trace_pmu { + struct pmu pmu; + struct riscv_trace_filter_attr filter_attr; +}; + +static inline const char *riscv_trace_type2str(enum RISCV_TRACE_COMPONENT_TYPE + type) +{ + switch (type) { + case RISCV_TRACE_ENCODER: + return "encoder"; + case RISCV_TRACE_FUNNEL: + return "funnel"; + case RISCV_TRACE_SINK: + return "sink"; + default: + return "none"; + } +} + +static inline int count_device_node_child(struct device_node *parent) +{ + struct device_node *child; + int count = 0; + + for_each_child_of_node(parent, child) { + count++; + } + + return count; +} + +static inline int get_list_count(struct list_head *head) +{ + u32 count = 0; + struct list_head *pos; + + list_for_each(pos, head) { + count++; + } + + return count; +} + +int riscv_trace_encoder_init(void); +int riscv_trace_funnel_init(void); +int riscv_trace_sink_init(void); + +#endif /* __RISCV_TRACE_H__ */ diff --git a/arch/riscv/events/riscv_trace_encoder.c b/arch/riscv/events/riscv_trace_encoder.c new file mode 100644 index 000000000000..fb2c37c3561a --- /dev/null +++ b/arch/riscv/events/riscv_trace_encoder.c @@ -0,0 +1,109 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) KBUILD_BASENAME ": " fmt + +#include +#include +#include +#include +#include + +#include "riscv_trace.h" + +static const struct of_device_id riscv_trace_encoder_of_match[] = { + {.compatible = "riscv_trace,encoder-controller", }, + { }, +}; + +int riscv_trace_encoder_init(void) +{ + struct riscv_trace_component *component; + struct device_node *node, 
*child_node, *port_node; + struct riscv_io_port *io_port; + resource_size_t base, size; + u32 reg[4]; + int port_nr; + int ret; + + for_each_matching_node(node, riscv_trace_encoder_of_match) { + if (!of_device_is_available(node)) { + of_node_put(node); + continue; + } + + component = kzalloc(sizeof(*component), GFP_KERNEL); + if (!component) + return -ENOMEM; + component->type = RISCV_TRACE_ENCODER; + + ret = of_property_read_u32_array(node, "reg", ®[0], 4); + if (ret) { + pr_err("Failed to read 'reg'\n"); + of_node_put(node); + return ret; + } + base = ((resource_size_t) reg[0] << 32) | reg[1]; + size = ((resource_size_t) reg[2] << 32) | reg[3]; + pr_info("base=0x%llx size=0x%llx\n", base, size); + component->reg_base = (u64)ioremap(base, size); + component->reg_size = size; + pr_info("reg_base=0x%llx reg_size=0x%llx\n", + component->reg_base, component->reg_size); + + ret = + of_property_read_u32(node, "cpu", &component->encoder.cpu); + if (ret) { + pr_err("Failed to read 'cpu'\n"); + of_node_put(node); + return ret; + } + pr_info("cpu=%d\n", component->encoder.cpu); + + child_node = of_get_child_by_name(node, "output_port"); + if (!child_node) { + pr_err("Failed to find 'output_port'\n"); + of_node_put(node); + return -ENODEV; + } + component->out_num = count_device_node_child(child_node); + if (component->out_num) { + component->out = + krealloc_array(component->out, component->out_num, + sizeof(*component->out), GFP_KERNEL); + if (!component->out) + return -ENOMEM; + port_nr = 0; + + for_each_child_of_node(child_node, port_node) { + if (!of_device_is_available(port_node)) { + of_node_put(child_node); + continue; + } + pr_info("Found output_port: %pOF\n", port_node); + const struct device_node *endpoint_node = + of_parse_phandle(port_node, "endpoint", 0); + pr_info("\t endpoint: %pOF\n", endpoint_node); + + of_property_read_u32_array((struct device_node + *)endpoint_node, + "reg", ®[0], 4); + + io_port = + kmalloc(sizeof(struct riscv_io_port), + GFP_KERNEL); + 
io_port->is_input = false; + io_port->endpoint_num = port_nr; + io_port->type = RISCV_TRACE_FUNNEL; + io_port->base_addr = + ((u64) reg[0] << 32) | reg[1]; + component->out[port_nr] = io_port; + port_nr++; + } + } + + INIT_LIST_HEAD(&component->list); + list_add_tail(&component->list, &riscv_trace_controllers); + } + + return ret; +} diff --git a/arch/riscv/events/riscv_trace_funnel.c b/arch/riscv/events/riscv_trace_funnel.c new file mode 100644 index 000000000000..c6d412fd1f90 --- /dev/null +++ b/arch/riscv/events/riscv_trace_funnel.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) KBUILD_BASENAME ": " fmt + +#include +#include +#include +#include +#include + +#include "riscv_trace.h" + +static const struct of_device_id riscv_trace_funnel_of_match[] = { + {.compatible = "riscv_trace,funnel-controller", }, + { }, +}; + +int riscv_trace_funnel_init(void) +{ + struct riscv_trace_component *component; + struct device_node *node, *child_node, *port_node; + struct riscv_io_port *io_port; + resource_size_t base, size; + u32 reg[4]; + int port_nr; + int ret; + + for_each_matching_node(node, riscv_trace_funnel_of_match) { + if (!of_device_is_available(node)) { + of_node_put(node); + continue; + } + + component = kzalloc(sizeof(*component), GFP_KERNEL); + if (!component) + return -ENOMEM; + component->type = RISCV_TRACE_FUNNEL; + + ret = of_property_read_u32_array(node, "reg", ®[0], 4); + if (ret) { + pr_err("Failed to read 'reg'\n"); + of_node_put(node); + return ret; + } + base = ((resource_size_t) reg[0] << 32) | reg[1]; + size = ((resource_size_t) reg[2] << 32) | reg[3]; + pr_info("base=0x%llx size=0x%llx\n", base, size); + component->reg_base = (u64)ioremap(base, size); + component->reg_size = size; + pr_info("reg_base=0x%llx reg_size=0x%llx\n", + component->reg_base, component->reg_size); + + ret = + of_property_read_u32(node, "level", + &component->funnel.level); + if (ret) + component->funnel.level = LEVEL1_FUNNEL; + 
pr_info("funnel level=%d\n", component->funnel.level); + + child_node = of_get_child_by_name(node, "input_port"); + if (!child_node) { + pr_err("Failed to find 'input_port'\n"); + of_node_put(node); + return -ENODEV; + } + component->in_num = count_device_node_child(child_node); + if (component->in_num) { + component->in = + krealloc_array(component->in, component->in_num, + sizeof(*component->in), GFP_KERNEL); + if (!component->in) + return -ENOMEM; + port_nr = 0; + + for_each_child_of_node(child_node, port_node) { + if (!of_device_is_available(port_node)) { + of_node_put(child_node); + continue; + } + pr_info("Found input_port: %pOF\n", port_node); + const struct device_node *endpoint_node = + of_parse_phandle(port_node, "endpoint", 0); + pr_info("\t endpoint: %pOF\n", endpoint_node); + + of_property_read_u32_array((struct device_node + *)endpoint_node, + "reg", &reg[0], 4); + + io_port = + kmalloc(sizeof(struct riscv_io_port), + GFP_KERNEL); + io_port->is_input = true; + io_port->endpoint_num = port_nr; + io_port->type = RISCV_TRACE_ENCODER; + io_port->base_addr = + ((u64) reg[0] << 32) | reg[1]; + component->in[port_nr] = io_port; + port_nr++; + } + } + + child_node = of_get_child_by_name(node, "output_port"); + if (!child_node) { + pr_err("Failed to find 'output_port'\n"); + of_node_put(node); + return -ENODEV; + } + component->out_num = count_device_node_child(child_node); + if (component->out_num) { + component->out = + krealloc_array(component->out, component->out_num, + sizeof(*component->out), GFP_KERNEL); + if (!component->out) + return -ENOMEM; + port_nr = 0; + + for_each_child_of_node(child_node, port_node) { + if (!of_device_is_available(port_node)) { + of_node_put(child_node); + continue; + } + pr_info("Found output_port: %pOF\n", port_node); + const struct device_node *endpoint_node = + of_parse_phandle(port_node, "endpoint", 0); + pr_info("\t endpoint: %pOF\n", endpoint_node); + + of_property_read_u32_array((struct device_node + *)endpoint_node, + 
"reg", &reg[0], 4); + + io_port = + kmalloc(sizeof(struct riscv_io_port), + GFP_KERNEL); + io_port->is_input = false; + io_port->endpoint_num = port_nr; + io_port->type = RISCV_TRACE_SINK; + io_port->base_addr = + ((u64) reg[0] << 32) | reg[1]; + component->out[port_nr] = io_port; + port_nr++; + } + } + + for (int i = 0; i < component->in_num; i++) { + pr_info("\t in[%d]is_input=%d endpoint_num=%d\n", i, + component->in[i]->is_input, + component->in[i]->endpoint_num); + } + for (int j = 0; j < component->out_num; j++) { + pr_info("\t out[%d]is_input=%d endpoint_num=%d\n", j, + component->out[j]->is_input, + component->out[j]->endpoint_num); + } + + INIT_LIST_HEAD(&component->list); + list_add_tail(&component->list, &riscv_trace_controllers); + } + + return ret; +} diff --git a/arch/riscv/events/riscv_trace_sink.c b/arch/riscv/events/riscv_trace_sink.c new file mode 100644 index 000000000000..dbdc153d798c --- /dev/null +++ b/arch/riscv/events/riscv_trace_sink.c @@ -0,0 +1,100 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) KBUILD_BASENAME ": " fmt + +#include +#include +#include +#include +#include + +#include "riscv_trace.h" + +static const struct of_device_id riscv_trace_sink_of_match[] = { + {.compatible = "riscv_trace,sink-controller", }, + { }, +}; + +int riscv_trace_sink_init(void) +{ + struct riscv_trace_component *component; + struct device_node *node, *child_node, *port_node; + struct riscv_io_port *io_port; + resource_size_t base, size; + u32 reg[4]; + int port_nr; + int ret; + + for_each_matching_node(node, riscv_trace_sink_of_match) { + if (!of_device_is_available(node)) { + of_node_put(node); + continue; + } + + component = kzalloc(sizeof(*component), GFP_KERNEL); + if (!component) + return -ENOMEM; + component->type = RISCV_TRACE_SINK; + + ret = of_property_read_u32_array(node, "reg", &reg[0], 4); + if (ret) { + pr_err("Failed to read 'reg'\n"); + of_node_put(node); + return ret; + } + base = ((resource_size_t) reg[0] << 32) | reg[1]; + 
size = ((resource_size_t) reg[2] << 32) | reg[3]; + pr_info("base=0x%llx size=0x%llx\n", base, size); + component->reg_base = (u64)ioremap(base, size); + component->reg_size = size; + pr_info("reg_base=0x%llx reg_size=0x%llx\n", + component->reg_base, component->reg_size); + + child_node = of_get_child_by_name(node, "input_port"); + if (!child_node) { + pr_err("Failed to find 'input_port'\n"); + of_node_put(node); + return -ENODEV; + } + component->in_num = count_device_node_child(child_node); + if (component->in_num) { + component->in = + krealloc_array(component->in, component->in_num, + sizeof(*component->in), GFP_KERNEL); + if (!component->in) + return -ENOMEM; + port_nr = 0; + + for_each_child_of_node(child_node, port_node) { + if (!of_device_is_available(port_node)) { + of_node_put(child_node); + continue; + } + pr_info("Found input_port: %pOF\n", port_node); + const struct device_node *endpoint_node = + of_parse_phandle(port_node, "endpoint", 0); + pr_info("\t endpoint: %pOF\n", endpoint_node); + + of_property_read_u32_array((struct device_node + *)endpoint_node, + "reg", &reg[0], 4); + + io_port = + kmalloc(sizeof(struct riscv_io_port), + GFP_KERNEL); + io_port->is_input = true; + io_port->endpoint_num = port_nr; + io_port->type = RISCV_TRACE_FUNNEL; + io_port->base_addr = + ((u64) reg[0] << 32) | reg[1]; + component->in[port_nr] = io_port; + port_nr++; + } + } + + INIT_LIST_HEAD(&component->list); + list_add_tail(&component->list, &riscv_trace_controllers); + } + + return ret; +} -- 2.49.0 From cp0613 at linux.alibaba.com Thu Sep 11 05:44:47 2025 From: cp0613 at linux.alibaba.com (cp0613 at linux.alibaba.com) Date: Thu, 11 Sep 2025 20:44:47 +0800 Subject: [RFC PATCH 3/4] tools: perf: Support perf record with aux buffer for riscv trace In-Reply-To: <20250911124448.1771-1-cp0613@linux.alibaba.com> References: <20250911124448.1771-1-cp0613@linux.alibaba.com> Message-ID: <20250911124448.1771-4-cp0613@linux.alibaba.com> From: Chen Pei This patch implements 
AUXTRACE support for RISC-V Trace. The corresponding driver needs to implement the setup_aux and free_aux PMU driver ops. The aux buffer is a type of ring buffer used in trace scenarios, and RISC-V Trace should also reuse this capability. Signed-off-by: Chen Pei --- arch/riscv/events/riscv_trace.c | 61 ++++++++ arch/riscv/events/riscv_trace.h | 8 + tools/perf/arch/riscv/util/Build | 3 + tools/perf/arch/riscv/util/auxtrace.c | 33 ++++ tools/perf/arch/riscv/util/pmu.c | 18 +++ tools/perf/arch/riscv/util/riscv-trace.c | 183 +++++++++++++++++++++++ tools/perf/arch/riscv/util/tsc.c | 15 ++ tools/perf/util/Build | 1 + tools/perf/util/auxtrace.c | 4 + tools/perf/util/auxtrace.h | 1 + tools/perf/util/riscv-trace.c | 162 ++++++++++++++++++++ tools/perf/util/riscv-trace.h | 18 +++ 12 files changed, 507 insertions(+) create mode 100644 tools/perf/arch/riscv/util/auxtrace.c create mode 100644 tools/perf/arch/riscv/util/pmu.c create mode 100644 tools/perf/arch/riscv/util/riscv-trace.c create mode 100644 tools/perf/arch/riscv/util/tsc.c create mode 100644 tools/perf/util/riscv-trace.c create mode 100644 tools/perf/util/riscv-trace.h diff --git a/arch/riscv/events/riscv_trace.c b/arch/riscv/events/riscv_trace.c index e408d9a4034a..3ac4a3be5d3e 100644 --- a/arch/riscv/events/riscv_trace.c +++ b/arch/riscv/events/riscv_trace.c @@ -8,6 +8,7 @@ #include #include #include +#include #include "riscv_trace.h" @@ -81,6 +82,9 @@ static void riscv_trace_event_start(struct perf_event *event, int flags) { pr_info("%s:%d on_cpu=%d cpu=%d\n", __func__, __LINE__, event->oncpu, event->cpu); + // TODO: Begin aux buffer + // struct xuantie_ntrace_aux_buf *buf; + // buf = perf_aux_output_begin(&riscv_trace_pmu.handle, event); // TODO: Enable the trace component } @@ -89,6 +93,61 @@ static void riscv_trace_event_stop(struct perf_event *event, int flags) pr_info("%s:%d on_cpu=%d cpu=%d\n", __func__, __LINE__, event->oncpu, event->cpu); // TODO: Disable the trace component + // TODO: End aux buffer + 
// struct xuantie_ntrace_aux_buf *buf; + // buf = perf_get_aux(&riscv_trace_pmu.handle); + // Fill aux buffer + // perf_aux_output_end(&riscv_trace_pmu.handle, size); +} + +static void *riscv_trace_buffer_setup_aux(struct perf_event *event, void **pages, + int nr_pages, bool overwrite) +{ + struct riscv_trace_aux_buf *buf; + struct page **pagelist; + int i; + + if (overwrite) { + pr_warn("Overwrite mode is not supported\n"); + return NULL; + } + + buf = kzalloc(sizeof(*buf), GFP_KERNEL); + if (!buf) + return NULL; + + pagelist = kcalloc(nr_pages, sizeof(*pagelist), GFP_KERNEL); + if (!pagelist) + goto err; + + for (i = 0; i < nr_pages; i++) + pagelist[i] = virt_to_page(pages[i]); + + buf->base = vmap(pagelist, nr_pages, VM_MAP, PAGE_KERNEL); + if (!buf->base) { + kfree(pagelist); + goto err; + } + + buf->nr_pages = nr_pages; + buf->length = nr_pages * PAGE_SIZE; + buf->pos = 0; + + pr_info("nr_pages=%d length=%d\n", buf->nr_pages, buf->length); + + kfree(pagelist); + return buf; +err: + kfree(buf); + return NULL; +} + +static void riscv_trace_buffer_free_aux(void *aux) +{ + struct riscv_trace_aux_buf *buf = aux; + + vunmap(buf->base); + kfree(buf); } static int __init riscv_trace_init(void) @@ -128,6 +187,8 @@ static int __init riscv_trace_init(void) riscv_trace_pmu.pmu.del = riscv_trace_event_del; riscv_trace_pmu.pmu.start = riscv_trace_event_start; riscv_trace_pmu.pmu.stop = riscv_trace_event_stop; + riscv_trace_pmu.pmu.setup_aux = riscv_trace_buffer_setup_aux; + riscv_trace_pmu.pmu.free_aux = riscv_trace_buffer_free_aux; return perf_pmu_register(&riscv_trace_pmu.pmu, "riscv_trace", -1); } diff --git a/arch/riscv/events/riscv_trace.h b/arch/riscv/events/riscv_trace.h index ef0af0d0b2ee..c28216227006 100644 --- a/arch/riscv/events/riscv_trace.h +++ b/arch/riscv/events/riscv_trace.h @@ -75,6 +75,14 @@ extern struct list_head riscv_trace_controllers; struct riscv_trace_pmu { struct pmu pmu; struct riscv_trace_filter_attr filter_attr; + struct perf_output_handle 
handle; +}; + +struct riscv_trace_aux_buf { + u32 length; + u32 nr_pages; + void *base; + u32 pos; }; static inline const char *riscv_trace_type2str(enum RISCV_TRACE_COMPONENT_TYPE diff --git a/tools/perf/arch/riscv/util/Build b/tools/perf/arch/riscv/util/Build index 58a672246024..d1599b70ef2f 100644 --- a/tools/perf/arch/riscv/util/Build +++ b/tools/perf/arch/riscv/util/Build @@ -1,5 +1,8 @@ perf-util-y += perf_regs.o perf-util-y += header.o +perf-util-y += tsc.o perf-util-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o perf-util-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o + +perf-util-$(CONFIG_AUXTRACE) += pmu.o auxtrace.o riscv-trace.o diff --git a/tools/perf/arch/riscv/util/auxtrace.c b/tools/perf/arch/riscv/util/auxtrace.c new file mode 100644 index 000000000000..51c8dac5ff61 --- /dev/null +++ b/tools/perf/arch/riscv/util/auxtrace.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include + +#include "../../../util/auxtrace.h" +#include "../../../util/debug.h" +#include "../../../util/evlist.h" +#include "../../../util/pmu.h" +#include "../../../util/pmus.h" +#include "riscv-trace.h" + +struct auxtrace_record +*auxtrace_record__init(struct evlist *evlist, int *err) +{ + struct perf_pmu *riscv_trace_pmu = NULL; + struct evsel *evsel; + bool found_riscv_trace = false; + + riscv_trace_pmu = perf_pmus__find(RISCV_TRACE_PMU_NAME); + + evlist__for_each_entry(evlist, evsel) { + if (riscv_trace_pmu && evsel->core.attr.type == riscv_trace_pmu->type) + found_riscv_trace = true; + } + + if (found_riscv_trace) + return riscv_trace_recording_init(err, riscv_trace_pmu); + + return NULL; +} diff --git a/tools/perf/arch/riscv/util/pmu.c b/tools/perf/arch/riscv/util/pmu.c new file mode 100644 index 000000000000..921b083c4f6b --- /dev/null +++ b/tools/perf/arch/riscv/util/pmu.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include + +#include "riscv-trace.h" +#include "../../../util/pmu.h" + +void 
perf_pmu__arch_init(struct perf_pmu *pmu) +{ +#ifdef HAVE_AUXTRACE_SUPPORT + if (!strcmp(pmu->name, RISCV_TRACE_PMU_NAME)) { + pmu->auxtrace = true; + pmu->selectable = true; + } +#endif +} diff --git a/tools/perf/arch/riscv/util/riscv-trace.c b/tools/perf/arch/riscv/util/riscv-trace.c new file mode 100644 index 000000000000..0632f1f43c15 --- /dev/null +++ b/tools/perf/arch/riscv/util/riscv-trace.c @@ -0,0 +1,183 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include + +#include // page_size +#include "../../../util/auxtrace.h" +#include "../../../util/cpumap.h" +#include "../../../util/debug.h" +#include "../../../util/event.h" +#include "../../../util/evlist.h" +#include "../../../util/evsel.h" +#include "../../../util/pmu.h" +#include "../../../util/record.h" +#include "../../../util/session.h" +#include "../../../util/tsc.h" +#include "../../../util/riscv-trace.h" + +#define KiB(x) ((x) * 1024) +#define MiB(x) ((x) * 1024 * 1024) + +struct riscv_trace_recording { + struct auxtrace_record itr; + struct perf_pmu *riscv_trace_pmu; + struct evlist *evlist; +}; + +static size_t +riscv_trace_info_priv_size(struct auxtrace_record *itr __maybe_unused, + struct evlist *evlist __maybe_unused) +{ + return RISCV_TRACE_AUXTRACE_PRIV_SIZE; +} + +static int riscv_trace_info_fill(struct auxtrace_record *itr, + struct perf_session *session, + struct perf_record_auxtrace_info *auxtrace_info, + size_t priv_size) +{ + struct riscv_trace_recording *pttr = + container_of(itr, struct riscv_trace_recording, itr); + struct perf_pmu *riscv_trace_pmu = pttr->riscv_trace_pmu; + + if (priv_size != RISCV_TRACE_AUXTRACE_PRIV_SIZE) + return -EINVAL; + + if (!session->evlist->core.nr_mmaps) + return -EINVAL; + + auxtrace_info->type = PERF_AUXTRACE_RISCV_TRACE; + auxtrace_info->priv[0] = riscv_trace_pmu->type; + + return 0; +} + +static int riscv_trace_set_auxtrace_mmap_page(struct record_opts *opts) +{ + bool privileged = 
perf_event_paranoid_check(-1); + + if (!opts->full_auxtrace) + return 0; + + if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) { + if (privileged) { + opts->auxtrace_mmap_pages = MiB(16) / page_size; + } else { + opts->auxtrace_mmap_pages = KiB(128) / page_size; + if (opts->mmap_pages == UINT_MAX) + opts->mmap_pages = KiB(256) / page_size; + } + } + + /* Validate auxtrace_mmap_pages */ + if (opts->auxtrace_mmap_pages) { + size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size; + size_t min_sz = KiB(8); + + if (sz < min_sz || !is_power_of_2(sz)) { + pr_err("Invalid mmap size for riscv trace: must be at least %zuKiB and a power of 2\n", + min_sz / 1024); + return -EINVAL; + } + } + + return 0; +} + +static int riscv_trace_recording_options(struct auxtrace_record *itr, + struct evlist *evlist, + struct record_opts *opts) +{ + struct riscv_trace_recording *pttr = + container_of(itr, struct riscv_trace_recording, itr); + struct perf_pmu *riscv_trace_pmu = pttr->riscv_trace_pmu; + struct evsel *evsel, *riscv_trace_evsel = NULL; + struct evsel *tracking_evsel; + int err; + + pttr->evlist = evlist; + evlist__for_each_entry(evlist, evsel) { + if (evsel->core.attr.type == riscv_trace_pmu->type) { + if (riscv_trace_evsel) { + pr_err("There may be only one " RISCV_TRACE_PMU_NAME "x event\n"); + return -EINVAL; + } + evsel->core.attr.freq = 0; + evsel->core.attr.sample_period = 1; + evsel->needs_auxtrace_mmap = true; + riscv_trace_evsel = evsel; + opts->full_auxtrace = true; + } + } + + err = riscv_trace_set_auxtrace_mmap_page(opts); + if (err) + return err; + /* + * To obtain the auxtrace buffer file descriptor, the auxtrace event + * must come first. 
+ */ + evlist__to_front(evlist, riscv_trace_evsel); + evsel__set_sample_bit(riscv_trace_evsel, TIME); + + /* Add dummy event to keep tracking */ + err = parse_event(evlist, "dummy:u"); + if (err) + return err; + + tracking_evsel = evlist__last(evlist); + evlist__set_tracking_event(evlist, tracking_evsel); + + tracking_evsel->core.attr.freq = 0; + tracking_evsel->core.attr.sample_period = 1; + evsel__set_sample_bit(tracking_evsel, TIME); + + return 0; +} + +static u64 riscv_trace_reference(struct auxtrace_record *itr __maybe_unused) +{ + return rdtsc(); +} + +static void riscv_trace_recording_free(struct auxtrace_record *itr) +{ + struct riscv_trace_recording *pttr = + container_of(itr, struct riscv_trace_recording, itr); + + free(pttr); +} + +struct auxtrace_record *riscv_trace_recording_init(int *err, + struct perf_pmu *riscv_trace_pmu) +{ + struct riscv_trace_recording *pttr; + + if (!riscv_trace_pmu) { + *err = -ENODEV; + return NULL; + } + + pttr = zalloc(sizeof(*pttr)); + if (!pttr) { + *err = -ENOMEM; + return NULL; + } + + pttr->riscv_trace_pmu = riscv_trace_pmu; + pttr->itr.recording_options = riscv_trace_recording_options; + pttr->itr.info_priv_size = riscv_trace_info_priv_size; + pttr->itr.info_fill = riscv_trace_info_fill; + pttr->itr.free = riscv_trace_recording_free; + pttr->itr.reference = riscv_trace_reference; + pttr->itr.read_finish = auxtrace_record__read_finish; + pttr->itr.alignment = 0; + + *err = 0; + return &pttr->itr; +} diff --git a/tools/perf/arch/riscv/util/tsc.c b/tools/perf/arch/riscv/util/tsc.c new file mode 100644 index 000000000000..cf021e423f79 --- /dev/null +++ b/tools/perf/arch/riscv/util/tsc.c @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include + +#include "../../../util/tsc.h" + +u64 rdtsc(void) +{ + u64 val; + + // https://lore.kernel.org/all/YxIzgYP3MujXdqwj at aurel32.net/T/ + asm volatile("rdtime %0" : "=r"(val)); + + return val; +} diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 
4959e7a990e4..4726a100a156 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -136,6 +136,7 @@ perf-util-$(CONFIG_AUXTRACE) += arm-spe-decoder/ perf-util-$(CONFIG_AUXTRACE) += hisi-ptt.o perf-util-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/ perf-util-$(CONFIG_AUXTRACE) += s390-cpumsf.o +perf-util-$(CONFIG_AUXTRACE) += riscv-trace.o ifdef CONFIG_LIBOPENCSD perf-util-$(CONFIG_AUXTRACE) += cs-etm.o diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index ebd32f1b8f12..b4d8f3c5ebb1 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -54,6 +54,7 @@ #include "arm-spe.h" #include "hisi-ptt.h" #include "s390-cpumsf.h" +#include "riscv-trace.h" #include "util/mmap.h" #include @@ -1393,6 +1394,9 @@ int perf_event__process_auxtrace_info(struct perf_session *session, case PERF_AUXTRACE_HISI_PTT: err = hisi_ptt_process_auxtrace_info(event, session); break; + case PERF_AUXTRACE_RISCV_TRACE: + err = riscv_trace_process_auxtrace_info(event, session); + break; case PERF_AUXTRACE_UNKNOWN: default: return -EINVAL; diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index f001cbb68f8e..5b7ce4a99709 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -50,6 +50,7 @@ enum auxtrace_type { PERF_AUXTRACE_ARM_SPE, PERF_AUXTRACE_S390_CPUMSF, PERF_AUXTRACE_HISI_PTT, + PERF_AUXTRACE_RISCV_TRACE, }; enum itrace_period_type { diff --git a/tools/perf/util/riscv-trace.c b/tools/perf/util/riscv-trace.c new file mode 100644 index 000000000000..c9bc3f6a7857 --- /dev/null +++ b/tools/perf/util/riscv-trace.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "auxtrace.h" +#include "color.h" +#include "debug.h" +#include "evsel.h" +#include "riscv-trace.h" +#include "machine.h" +#include "session.h" +#include "tool.h" +#include + +struct riscv_trace { + struct auxtrace auxtrace; + u32 
auxtrace_type; + struct perf_session *session; + struct machine *machine; + u32 pmu_type; +}; + +static void riscv_trace_dump(struct riscv_trace *trace __maybe_unused, + unsigned char *buf, size_t len) +{ + + const char *color = PERF_COLOR_BLUE; + + color_fprintf(stdout, color, ". ... %s: buf=%p len=%zubytes\n", __func__, buf, len); + for (size_t i = 0; i < len; i++) + printf("%02x ", buf[i]); +} + +static void riscv_trace_dump_event(struct riscv_trace *trace, unsigned char *buf, + size_t len) +{ + printf(".\n"); + + riscv_trace_dump(trace, buf, len); +} + +static int riscv_trace_process_event(struct perf_session *session __maybe_unused, + union perf_event *event __maybe_unused, + struct perf_sample *sample __maybe_unused, + const struct perf_tool *tool __maybe_unused) +{ + return 0; +} + +static int riscv_trace_process_auxtrace_event(struct perf_session *session, + union perf_event *event, + const struct perf_tool *tool __maybe_unused) +{ + struct riscv_trace *trace = container_of(session->auxtrace, struct riscv_trace, + auxtrace); + int fd = perf_data__fd(session->data); + int size = event->auxtrace.size; + void *data = malloc(size); + off_t data_offset; + int err; + + if (!data) + return -errno; + + if (perf_data__is_pipe(session->data)) { + data_offset = 0; + } else { + data_offset = lseek(fd, 0, SEEK_CUR); + if (data_offset == -1) { + free(data); + return -errno; + } + } + + err = readn(fd, data, size); + if (err != (ssize_t)size) { + free(data); + return -errno; + } + + if (dump_trace) + riscv_trace_dump_event(trace, data, size); + + free(data); + return 0; +} + +static int riscv_trace_flush(struct perf_session *session __maybe_unused, + const struct perf_tool *tool __maybe_unused) +{ + return 0; +} + +static void riscv_trace_free_events(struct perf_session *session __maybe_unused) +{ +} + +static void riscv_trace_free(struct perf_session *session) +{ + struct riscv_trace *trace = container_of(session->auxtrace, struct riscv_trace, + auxtrace); + + 
session->auxtrace = NULL; + free(trace); +} + +static bool riscv_trace_evsel_is_auxtrace(struct perf_session *session, + struct evsel *evsel) +{ + struct riscv_trace *trace = container_of(session->auxtrace, struct riscv_trace, auxtrace); + + return evsel->core.attr.type == trace->pmu_type; +} + +static void riscv_trace_print_info(__u64 type) +{ + if (!dump_trace) + return; + + fprintf(stdout, " PMU Type %" PRId64 "\n", (s64) type); +} + +int riscv_trace_process_auxtrace_info(union perf_event *event, + struct perf_session *session) +{ + struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info; + struct riscv_trace *trace; + + if (auxtrace_info->header.size < RISCV_TRACE_AUXTRACE_PRIV_SIZE + + sizeof(struct perf_record_auxtrace_info)) + return -EINVAL; + + trace = zalloc(sizeof(*trace)); + if (!trace) + return -ENOMEM; + + trace->session = session; + trace->machine = &session->machines.host; /* No kvm support */ + trace->auxtrace_type = auxtrace_info->type; + trace->pmu_type = auxtrace_info->priv[0]; + + trace->auxtrace.process_event = riscv_trace_process_event; + trace->auxtrace.process_auxtrace_event = riscv_trace_process_auxtrace_event; + trace->auxtrace.flush_events = riscv_trace_flush; + trace->auxtrace.free_events = riscv_trace_free_events; + trace->auxtrace.free = riscv_trace_free; + trace->auxtrace.evsel_is_auxtrace = riscv_trace_evsel_is_auxtrace; + session->auxtrace = &trace->auxtrace; + + riscv_trace_print_info(auxtrace_info->priv[0]); + + return 0; +} diff --git a/tools/perf/util/riscv-trace.h b/tools/perf/util/riscv-trace.h new file mode 100644 index 000000000000..4901ea323b77 --- /dev/null +++ b/tools/perf/util/riscv-trace.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef INCLUDE__PERF_RISCV_TRACE_H__ +#define INCLUDE__PERF_RISCV_TRACE_H__ + +#define RISCV_TRACE_PMU_NAME "riscv_trace" +#define RISCV_TRACE_AUXTRACE_PRIV_SIZE sizeof(u64) + +union perf_event; +struct perf_session; +struct perf_pmu; + +struct auxtrace_record 
*riscv_trace_recording_init(int *err, + struct perf_pmu *riscv_ntrace_pmu); + +int riscv_trace_process_auxtrace_info(union perf_event *event, + struct perf_session *session); + +#endif -- 2.49.0 From david at redhat.com Thu Sep 11 06:09:22 2025 From: david at redhat.com (David Hildenbrand) Date: Thu, 11 Sep 2025 15:09:22 +0200 Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported() In-Reply-To: <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn> Message-ID: <9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com> On 11.09.25 11:55, Chunyan Zhang wrote: > Some platforms can customize the PTE PMD entry soft-dirty bit making it > unavailable even if the architecture provides the resource. > > Add an API which architectures can define their specific implementations > to detect if soft-dirty bit is available on which device the kernel is > running. Thinking to myself: maybe pgtable_supports_soft_dirty() would read better. Whatever you prefer. > > Signed-off-by: Chunyan Zhang > --- > fs/proc/task_mmu.c | 17 ++++++++++++++++- > include/linux/pgtable.h | 12 ++++++++++++ > mm/debug_vm_pgtable.c | 10 +++++----- > mm/huge_memory.c | 13 +++++++------ > mm/internal.h | 2 +- > mm/mremap.c | 13 +++++++------ > mm/userfaultfd.c | 10 ++++------ > 7 files changed, 52 insertions(+), 25 deletions(-) > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > index 29cca0e6d0ff..9e8083b6d4cd 100644 > --- a/fs/proc/task_mmu.c > +++ b/fs/proc/task_mmu.c > @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > * -Werror=unterminated-string-initialization warning > * with GCC 15 > */ > - static const char mnemonics[BITS_PER_LONG][3] = { > + static char mnemonics[BITS_PER_LONG][3] = { > /* > * In case if we meet a flag we don't know about. 
> */ > @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > [ilog2(VM_SEALED)] = "sl", > #endif > }; > +/* > + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is > + * unavailable on which the kernel is running, even if the architecture > + * provides the resource and soft-dirty is compiled in. > + */ > +#ifdef CONFIG_MEM_SOFT_DIRTY > + if (!pgtable_soft_dirty_supported()) > + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > +#endif You can now drop the ifdef. But I wonder if we could instead just stop setting the flag. Then we don't have to worry about any VM_SOFTDIRTY checks. Something like the following: diff --git a/include/linux/mm.h b/include/linux/mm.h index 892fe5dbf9de0..8b8bf63a32ef7 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -783,6 +783,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) static inline void vm_flags_init(struct vm_area_struct *vma, vm_flags_t flags) { + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); ACCESS_PRIVATE(vma, __vm_flags) = flags; } @@ -801,6 +802,7 @@ static inline void vm_flags_reset(struct vm_area_struct *vma, static inline void vm_flags_reset_once(struct vm_area_struct *vma, vm_flags_t flags) { + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); vma_assert_write_locked(vma); WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags); } @@ -808,6 +810,7 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma, static inline void vm_flags_set(struct vm_area_struct *vma, vm_flags_t flags) { + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); vma_start_write(vma); ACCESS_PRIVATE(vma, __vm_flags) |= flags; } diff --git a/mm/mmap.c b/mm/mmap.c index 5fd3b80fda1d5..40cb3fbf9a247 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1451,8 +1451,10 @@ static struct vm_area_struct *__install_special_mapping( return ERR_PTR(-ENOMEM); vma_set_range(vma, addr, 
addr + len, 0); - vm_flags_init(vma, (vm_flags | mm->def_flags | - VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK); + vm_flags |= mm->def_flags | VM_DONTEXPAND; + if (pgtable_soft_dirty_supported()) + vm_flags |= VM_SOFTDIRTY; + vm_flags_init(vma, vm_flags & ~VM_LOCKED_MASK); vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); vma->vm_ops = ops; diff --git a/mm/vma.c b/mm/vma.c index abe0da33c8446..16a1ed2a6199c 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -2551,7 +2551,8 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma) * then new mapped in-place (which must be aimed as * a completely new data area). */ - vm_flags_set(vma, VM_SOFTDIRTY); + if (pgtable_soft_dirty_supported()) + vm_flags_set(vma, VM_SOFTDIRTY); vma_set_page_prot(vma); } @@ -2819,7 +2820,8 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma, mm->data_vm += len >> PAGE_SHIFT; if (vm_flags & VM_LOCKED) mm->locked_vm += (len >> PAGE_SHIFT); - vm_flags_set(vma, VM_SOFTDIRTY); + if (pgtable_soft_dirty_supported()) + vm_flags_set(vma, VM_SOFTDIRTY); return 0; mas_store_fail: diff --git a/mm/vma_exec.c b/mm/vma_exec.c index 922ee51747a68..c06732a5a620a 100644 --- a/mm/vma_exec.c +++ b/mm/vma_exec.c @@ -107,6 +107,7 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift) int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap, unsigned long *top_mem_p) { + unsigned long flags = VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP; int err; struct vm_area_struct *vma = vm_area_alloc(mm); @@ -137,7 +138,9 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap, BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP); vma->vm_end = STACK_TOP_MAX; vma->vm_start = vma->vm_end - PAGE_SIZE; - vm_flags_init(vma, VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP); + if (pgtable_soft_dirty_supported()) + flags |= VM_SOFTDIRTY; + vm_flags_init(vma, flags); vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); 
err = insert_vm_struct(mm, vma); > + > size_t i; > > seq_puts(m, "VmFlags: "); > @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, > static inline void clear_soft_dirty(struct vm_area_struct *vma, > unsigned long addr, pte_t *pte) > { > + if (!pgtable_soft_dirty_supported()) > + return; > /* > * The soft-dirty tracker uses #PF-s to catch writes > * to pages, so write-protect the pte as well. See the > @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, > { > pmd_t old, pmd = *pmdp; > > + if (!pgtable_soft_dirty_supported()) > + return; > + > if (pmd_present(pmd)) { > /* See comment in change_huge_pmd() */ > old = pmdp_invalidate(vma, addr, pmdp); That would all be handled with the above never-set-VM_SOFTDIRTY. > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > index 4c035637eeb7..2a3578a4ae4c 100644 > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -1537,6 +1537,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > #define arch_start_context_switch(prev) do {} while (0) > #endif > > +/* > + * Some platforms can customize the PTE soft-dirty bit making it unavailable > + * even if the architecture provides the resource. > + * Adding this API allows architectures to add their own checks for the > + * devices on which the kernel is running. > + * Note: When overiding it, please make sure the CONFIG_MEM_SOFT_DIRTY > + * is part of this macro. 
> + */ > +#ifndef pgtable_soft_dirty_supported > +#define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > +#endif > + > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > index 830107b6dd08..b32ce2b0b998 100644 > --- a/mm/debug_vm_pgtable.c > +++ b/mm/debug_vm_pgtable.c > @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > { > pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > + if (!pgtable_soft_dirty_supported()) > return; > > pr_debug("Validating PTE soft dirty\n"); > @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) > { > pte_t pte; > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > + if (!pgtable_soft_dirty_supported()) > return; > > pr_debug("Validating PTE swap soft dirty\n"); > @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) > { > pmd_t pmd; > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > + if (!pgtable_soft_dirty_supported()) > return; > > if (!has_transparent_hugepage()) > @@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) > { > pmd_t pmd; > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || > - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > + if (!pgtable_soft_dirty_supported() || > + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > return; > > if (!has_transparent_hugepage()) > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 9c38a95e9f09..218d430a2ec6 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2271,12 +2271,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, > > static pmd_t move_soft_dirty_pmd(pmd_t pmd) > { > -#ifdef CONFIG_MEM_SOFT_DIRTY > - if (unlikely(is_pmd_migration_entry(pmd))) > - pmd = 
pmd_swp_mksoft_dirty(pmd);
> -	else if (pmd_present(pmd))
> -		pmd = pmd_mksoft_dirty(pmd);
> -#endif
> +	if (pgtable_soft_dirty_supported()) {
> +		if (unlikely(is_pmd_migration_entry(pmd)))
> +			pmd = pmd_swp_mksoft_dirty(pmd);
> +		else if (pmd_present(pmd))
> +			pmd = pmd_mksoft_dirty(pmd);
> +	}
> +

Wondering, should the arch simply take care of that, so that we can just
call pmd_swp_mksoft_dirty / pmd_mksoft_dirty?

>  	return pmd;
>  }
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 45b725c3dc03..c6ca62f8ecf3 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)
>  	 * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY)
>  	 * will be constantly true.
>  	 */
> -	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
> +	if (!pgtable_soft_dirty_supported())
>  		return false;
>

That should be handled with the above never-set-VM_SOFTDIRTY.

>  	/*
> diff --git a/mm/mremap.c b/mm/mremap.c
> index e618a706aff5..7beb3114dbf5 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -162,12 +162,13 @@ static pte_t move_soft_dirty_pte(pte_t pte)
>  	 * Set soft dirty bit so we can notice
>  	 * in userspace the ptes were moved.
> */ > -#ifdef CONFIG_MEM_SOFT_DIRTY > - if (pte_present(pte)) > - pte = pte_mksoft_dirty(pte); > - else if (is_swap_pte(pte)) > - pte = pte_swp_mksoft_dirty(pte); > -#endif > + if (pgtable_soft_dirty_supported()) { > + if (pte_present(pte)) > + pte = pte_mksoft_dirty(pte); > + else if (is_swap_pte(pte)) > + pte = pte_swp_mksoft_dirty(pte); > + } > + > return pte; > } > -- Cheers David / dhildenb From conor at kernel.org Thu Sep 11 06:14:33 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 11 Sep 2025 14:14:33 +0100 Subject: [PATCH v2] RISC-V: re-enable gcc + rust builds In-Reply-To: <6bceca9d-44cd-4373-a456-7c2129b418e3@gmail.com> References: <20250909-gcc-rust-v2-v2-1-35e086b1b255@gmail.com> <20250910-harmless-bamboo-ebc94758fdad@spud> <6bceca9d-44cd-4373-a456-7c2129b418e3@gmail.com> Message-ID: <20250911-reprogram-conductor-f02af5f6d03e@spud> On Thu, Sep 11, 2025 at 12:46:01PM +0800, Asuna wrote: > On 9/10/25 10:27 PM, Conor Dooley wrote: > > FWIW, this --- breaks git, and anything after this line (including your > > signoff) is lost when the patch is applied. > > I used b4 command to prepare and send the cover letter and patch for v2, not > sure what happened. Dunno. Maybe while editing your commit message you omitted the signoff somehow? I don't use b4-submit, so I don't know how it formats stuff. If it inserted the --- and what was below it was your intended cover letter, your patch itself might be missing the signoff? > > I see that other people's patches have a [PATCH 0/n] email as a start that > describes their patch series, this is called a cover-letter in b4 and > git-send-email right? Yes it is. Not really needed if you only have one patch though. 
> > The riscv patchwork CI stuff is really unhappy with this change:
> > init/Kconfig:87: syntax error
> > init/Kconfig:87: invalid statement
> > init/Kconfig:88: invalid statement
> > init/Kconfig:89:warning: ignoring unsupported character '`'
> > init/Kconfig:89:warning: ignoring unsupported character '`'
> > init/Kconfig:89:warning: ignoring unsupported character '.'
> > init/Kconfig:89: unknown statement "This"
> >
> > Is this bogus, or can rustc-bindgen-libclang-version return nothing
> > under some conditions where rust is not available?
> > Should this have 2 default lines like some other options in the file?
>
> This is because rustc-bindgen-libclang-version can't find the bindgen and
> returns nothing. Sorry I forgot to mention this, it's another reason why I
> wanted to separate the script, in a separate script we can easily fallback
> to return 0 when an error is encountered.
>
> Adding a second line `default 0` doesn't work, I'll try to fix it. BTW, when
> I fix it, if the diff isn't too large, do I need to open a v3 patch, or
> simply replying to the thread just fine?

Feel free to reply with the diff if you're looking to discuss the
implementation, but for the sake of the various bits of automation
(patchwork, ci bots etc) please submit a v3 when you're happy with what
you've produced.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: not available URL: From cuiyunhui at bytedance.com Thu Sep 11 06:24:04 2025 From: cuiyunhui at bytedance.com (yunhui cui) Date: Thu, 11 Sep 2025 21:24:04 +0800 Subject: [External] [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events In-Reply-To: <20250327-counter_delegation-v5-18-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> <20250327-counter_delegation-v5-18-1ee538468d1b@rivosinc.com> Message-ID: Hi Atish, On Fri, Mar 28, 2025 at 3:46?AM Atish Patra wrote: > > Qemu virt machine supports a very minimal set of legacy perf events. > Add them to the vendor table so that users can use them when > counter delegation is enabled. > > Signed-off-by: Atish Patra > --- > arch/riscv/include/asm/vendorid_list.h | 4 ++++ > drivers/perf/riscv_pmu_dev.c | 36 ++++++++++++++++++++++++++++++++++ > 2 files changed, 40 insertions(+) > > diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h > index a5150cdf34d8..0eefc844923e 100644 > --- a/arch/riscv/include/asm/vendorid_list.h > +++ b/arch/riscv/include/asm/vendorid_list.h > @@ -10,4 +10,8 @@ > #define SIFIVE_VENDOR_ID 0x489 > #define THEAD_VENDOR_ID 0x5b7 > > +#define QEMU_VIRT_VENDOR_ID 0x000 > +#define QEMU_VIRT_IMPL_ID 0x000 > +#define QEMU_VIRT_ARCH_ID 0x000 > + > #endif > diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c > index 8a079949e3a4..cd2ac4cf34f1 100644 > --- a/drivers/perf/riscv_pmu_dev.c > +++ b/drivers/perf/riscv_pmu_dev.c > @@ -26,6 +26,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -391,7 +392,42 @@ struct riscv_vendor_pmu_events { > .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \ > .attrs_events = _attrs }, > > +/* QEMU virt PMU events */ > +static const struct riscv_pmu_event qemu_virt_hw_event_map[PERF_COUNT_HW_MAX] = { > + PERF_MAP_ALL_UNSUPPORTED, > + 
[PERF_COUNT_HW_CPU_CYCLES] = {0x01, 0xFFFFFFF8}, > + [PERF_COUNT_HW_INSTRUCTIONS] = {0x02, 0xFFFFFFF8} > +}; > + > +static const struct riscv_pmu_event qemu_virt_cache_event_map[PERF_COUNT_HW_CACHE_MAX] > + [PERF_COUNT_HW_CACHE_OP_MAX] > + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { > + PERF_CACHE_MAP_ALL_UNSUPPORTED, > + [C(DTLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10019, 0xFFFFFFF8}, > + [C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)] = {0x1001B, 0xFFFFFFF8}, > + > + [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10021, 0xFFFFFFF8}, > +}; > + > +RVPMU_EVENT_CMASK_ATTR(cycles, cycles, 0x01, 0xFFFFFFF8); > +RVPMU_EVENT_CMASK_ATTR(instructions, instructions, 0x02, 0xFFFFFFF8); > +RVPMU_EVENT_CMASK_ATTR(dTLB-load-misses, dTLB_load_miss, 0x10019, 0xFFFFFFF8); > +RVPMU_EVENT_CMASK_ATTR(dTLB-store-misses, dTLB_store_miss, 0x1001B, 0xFFFFFFF8); > +RVPMU_EVENT_CMASK_ATTR(iTLB-load-misses, iTLB_load_miss, 0x10021, 0xFFFFFFF8); If other vendors intend to define it, would that throw a duplicate definition error? > + > +static struct attribute *qemu_virt_event_group[] = { > + RVPMU_EVENT_ATTR_PTR(cycles), > + RVPMU_EVENT_ATTR_PTR(instructions), > + RVPMU_EVENT_ATTR_PTR(dTLB_load_miss), > + RVPMU_EVENT_ATTR_PTR(dTLB_store_miss), > + RVPMU_EVENT_ATTR_PTR(iTLB_load_miss), > + NULL, > +}; > + > static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = { > + RISCV_VENDOR_PMU_EVENTS(QEMU_VIRT_VENDOR_ID, QEMU_VIRT_ARCH_ID, QEMU_VIRT_IMPL_ID, > + qemu_virt_hw_event_map, qemu_virt_cache_event_map, > + qemu_virt_event_group) > }; > > const struct riscv_pmu_event *current_pmu_hw_event_map; > > -- > 2.43.0 > > Thanks, Yunhui From pinkesh.vaghela at einfochips.com Thu Sep 11 07:13:09 2025 From: pinkesh.vaghela at einfochips.com (Pinkesh Vaghela) Date: Thu, 11 Sep 2025 14:13:09 +0000 Subject: [PATCH v5 0/6] Basic device tree support for ESWIN EIC7700 RISC-V SoC In-Reply-To: References: <20250825132427.1618089-1-pinkesh.vaghela@einfochips.com> Message-ID: Hello Arnd, Gentle reminder. 
Please let me know if I need to follow any other steps.

Regards,
Pinkesh

On Mon, Aug 25, 2025 at 06:59 PM, Pinkesh Vaghela wrote:
> Hello Arnd,
>
> Can you please consider this patch series for RISC-V/Eswin EIC7700 SOC.
>
> Regards,
> Pinkesh
>
> On Mon, Aug 25, 2025 at 06:54 PM, Pinkesh Vaghela wrote:
> > Add support for ESWIN EIC7700 SoC consisting of SiFive Quad-Core
> > P550 CPU cluster and the first development board that uses it, the
> > SiFive HiFive Premier P550.
> >
> > This patch series adds initial device tree and also adds ESWIN
> > architecture support.
> >
> > Boot-tested using initramfs with Linux v6.17-rc3 on HiFive Premier
> > P550 board using U-Boot 2024.01 and OpenSBI 1.4.
> >
> > Changes in v5:
> > - Rebased the patches to kernel v6.17-rc3
> > - Drop "dt-bindings: vendor-prefixes: add eswin" patch (Patch #3 in v4)
> >   as it is already applied by Rob Herring [1].
> > - Link to v4:
> >   https://lore.kernel.org/lkml/20250616112316.3833343-1-pinkesh.vaghela@einfochips.com/
> >
> > [1]:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250825&id=ac29e4487aa20a21b7c3facbd1f14f5093835dc9
> >
> > Changes in v4:
> > - Rebased the patches to kernel v6.16-rc1
> > - Drop patches that are already merged
> > - Added "Acked-by" tag of "Min Lin" for Patch 4
> > - Corrected the commit message of Patch 7 (Patch #10 in v3)
> > - Added "Tested-by" tag of "Ariel D'Alessandro" for Patch 7
> > - Link to v3:
> >   https://lore.kernel.org/lkml/20250410152519.1358964-1-pinkesh.vaghela@einfochips.com/
> >
> > Changes in v3:
> > - Rebased the patches to kernel 6.15.0-rc1
> > - Added "Reviewed-by" tag of "Rob Herring" for Patch 4
> > - Updated MAINTAINERS file
> >   - Add GIT tree URL
> > - Updated DTSI file
> >   - Added "dma-noncoherent" property to soc node
> >   - Updated GPIO node labels in DTSI file
> > - Link to v2:
> >   https://lore.kernel.org/lkml/20250320105449.2094192-1-pinkesh.vaghela@einfochips.com/
> >
> > Changes in v2:
> > - Added "Acked-by" tag of "Conor Dooley" for Patches 1, 2, 3, 7 and 8
> > - Added "Reviewed-by" tag of "Matthias Brugger" for Patch 4
> > - Updated MAINTAINERS file
> >   - Add the path for the eswin binding file
> > - Updated sifive,ccache0.yaml
> >   - Add restrictions for "cache-size" property based on the
> >     compatible string
> > - Link to v1:
> >   https://lore.kernel.org/lkml/20250311073432.4068512-1-pinkesh.vaghela@einfochips.com/
> >
> > Darshan Prajapati (2):
> >   dt-bindings: riscv: Add SiFive P550 CPU compatible
> >   dt-bindings: interrupt-controller: Add ESWIN EIC7700 PLIC
> >
> > Min Lin (2):
> >   riscv: dts: add initial support for EIC7700 SoC
> >   riscv: dts: eswin: add HiFive Premier P550 board device tree
> >
> > Pinkesh Vaghela (1):
> >   riscv: Add Kconfig option for ESWIN platforms
> >
> > Pritesh Patel (1):
> >   dt-bindings: riscv: Add SiFive HiFive Premier P550 board
> >
> >  .../sifive,plic-1.0.0.yaml                    |   1 +
> >  .../devicetree/bindings/riscv/cpus.yaml       |   1 +
> >  .../devicetree/bindings/riscv/eswin.yaml      |  29 ++
> >  MAINTAINERS                                   |   9 +
> >  arch/riscv/Kconfig.socs                       |   6 +
> >  arch/riscv/boot/dts/Makefile                  |   1 +
> >  arch/riscv/boot/dts/eswin/Makefile            |   2 +
> >  .../dts/eswin/eic7700-hifive-premier-p550.dts |  29 ++
> >  arch/riscv/boot/dts/eswin/eic7700.dtsi        | 345 ++++++++++++++++++
> >  9 files changed, 423 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/riscv/eswin.yaml
> >  create mode 100644 arch/riscv/boot/dts/eswin/Makefile
> >  create mode 100644 arch/riscv/boot/dts/eswin/eic7700-hifive-premier-p550.dts
> >  create mode 100644 arch/riscv/boot/dts/eswin/eic7700.dtsi
> >
> > --
> > 2.25.1

From jeffbai at aosc.io Thu Sep 11 09:17:30 2025
From: jeffbai at aosc.io (Mingcong Bai)
Date: Fri, 12 Sep 2025 00:17:30 +0800
Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver
In-Reply-To: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com>
References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com>
Message-ID: <35447ba0-21c2-4a12-9d27-033a7be0af3e@aosc.io>

Hi Chen,

On 2025/9/10 10:08, Chen Wang wrote:
> +config PCIE_SG2042_HOST
> +	tristate "Sophgo SG2042 PCIe controller (host mode)"
> +	depends on OF && (ARCH_SOPHGO || COMPILE_TEST)
> +	select PCIE_CADENCE_HOST
> +	help
> +	  Say Y here if you want to support the Sophgo SG2042 PCIe platform
> +	  controller in host mode. Sophgo SG2042 PCIe controller uses Cadence
> +	  PCIe core.
> +

While build-testing this patch against v6.16.6 with PCIE_SG2042_HOST set
to "M", the kernel fails to build during MODPOST:

ERROR: modpost: "cdns_pcie_pm_ops" [drivers/pci/controller/cadence/pcie-sg2042.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1 make[1]: *** [[...]/linux-6.16.6/Makefile:1953: modpost] Error 2 make: *** [Makefile:248: __sub-make] Error 2 Best Regards, Mingcong Bai From conor at kernel.org Thu Sep 11 09:23:30 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 11 Sep 2025 17:23:30 +0100 Subject: [PATCH 0/2] RISC-V: Detect Ssqosid extension and handle srmcfg CSR In-Reply-To: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> References: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> Message-ID: <20250911-chaste-rare-fbc3b48a341a@spud> On Wed, Sep 10, 2025 at 11:15:28PM -0700, Drew Fustini wrote: > This series adds support for the RISC-V Quality-of-Service Identifiers > (Ssqosid) extension [1] which adds the srmcfg register. This CSR > configures a hart with two identifiers: a Resource Control ID (RCID) > and a Monitoring Counter ID (MCID). These identifiers accompany each > request issued by the hart to shared resource controllers. > > Background on RISC-V QoS: > > The Ssqosid extension is used by the RISC-V Capacity and Bandwidth > Controller QoS Register Interface (CBQRI) specification [2]. QoS in > this context is concerned with shared resources on an SoC such as cache > capacity and memory bandwidth. Intel and AMD already have QoS features > on x86 and ARM has MPAM. There is an existing user interface in Linux: > the resctrl virtual filesystem [3]. > > The srmcfg CSR provides a mechanism by which a software workload (e.g. > a process or a set of processes) can be associated with an RCID and an > MCID. CBQRI defines operations to configure resource usage limits, in > the form of capacity or bandwidth. CBQRI also defines operations to > configure counters to track the resource utilization. > > Goal for this series: > > These two patches are taken from the implementation of resctrl support > for RISC-V CBQRI. Please refer to the proof-of-concept RFC [4] for > details on the resctrl implementation. 
More recently, I have rebased > the CBQRI support on mainline [5]. Big thanks to James Morse for the > tireless work to extract resctrl from arch/x86 and make it available > to all archs. > > I think it makes sense to first focus on the detection of Ssqosid and > handling of srmcfg when switching tasks. It has been tested against a > QEMU branch that implements Ssqosid and CBQRI [6]. A test driver [7] > was used to set srmcfg for the current process. This allows switch_to > to be tested without resctrl. > > Changes from RFC v2: > - Rename all instances of the sqoscfg CSR to srmcfg to match the > ratified Ssqosid spec > - RFC v2: https://lore.kernel.org/linux-riscv/20230430-riscv-cbqri-rfc-v2-v2-0-8e3725c4a473 at baylibre.com/ > > Changes from RFC v1: > - change DEFINE_PER_CPU to DECLARE_PER_CPU for cpu_sqoscfg in qos.h to > prevent linking error about multiple definition. Move DEFINE_PER_CPU > for cpu_sqoscfg into qos.c > - renamed qos prefix in function names to sqoscfg to be less generic > - handle sqoscfg the same way has_vector and has_fpu are handled in the > vector patch series > - RFC v1: https://lore.kernel.org/linux-riscv/20230410043646.3138446-1-dfustini at baylibre.com/ > > [1] https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0 > [2] https://github.com/riscv-non-isa/riscv-cbqri/releases/tag/v1.0 > [3] https://docs.kernel.org/filesystems/resctrl.html > [4] https://lore.kernel.org/linux-riscv/20230419111111.477118-1-dfustini at baylibre.com/ > [5] https://github.com/tt-fustini/linux/tree/b4/cbqri-v6-17-rc5 > [6] https://github.com/tt-fustini/qemu/tree/riscv-cbqri-rqsc-pptt > [7] https://github.com/tt-fustini/linux/tree/ssqosid-v6-17-rc5-debug > > Signed-off-by: Drew Fustini > --- > Drew Fustini (2): > RISC-V: Detect the Ssqosid extension > RISC-V: Add support for srmcfg CSR from Ssqosid ext > > MAINTAINERS | 6 ++++++ > arch/riscv/Kconfig | 17 ++++++++++++++++ > arch/riscv/include/asm/csr.h | 8 ++++++++ > arch/riscv/include/asm/hwcap.h | 1 + > 
arch/riscv/include/asm/processor.h | 3 +++
> arch/riscv/include/asm/qos.h | 41 ++++++++++++++++++++++++++++++++++++++
> arch/riscv/include/asm/switch_to.h | 3 +++
> arch/riscv/kernel/cpufeature.c | 1 +

Why is there no binding change here? Is it not possible to use the
extension on DT systems, or is this an oversight?

From elder at riscstar.com Thu Sep 11 09:36:41 2025
From: elder at riscstar.com (Alex Elder)
Date: Thu, 11 Sep 2025 11:36:41 -0500
Subject: (subset) [PATCH v13 0/7] spacemit: introduce P1 PMIC support
In-Reply-To: <175690199980.2656286.5459018179105557107.b4-ty@kernel.org>
References: <20250825172057.163883-1-elder@riscstar.com> <175690199980.2656286.5459018179105557107.b4-ty@kernel.org>
Message-ID:

On 9/3/25 7:19 AM, Lee Jones wrote:
> On Mon, 25 Aug 2025 12:20:49 -0500, Alex Elder wrote:
>> The SpacemiT P1 is an I2C-controlled PMIC that implements 6 buck
>> converters and 12 LDOs. It contains a load switch, ADC channels,
>> GPIOs, a real-time clock, and a watchdog timer.
>>
>> This series introduces a multifunction driver for the P1 PMIC as
>> well as drivers for its regulators and RTC.
>>
>> [...]
>
> Applied, thanks!
>
> [1/7] dt-bindings: mfd: add support the SpacemiT P1 PMIC
>       commit: baac6755d3e8ddf47eee2be3ca72fe14ebae2143
> [2/7] mfd: simple-mfd-i2c: add SpacemiT P1 support
>       commit: 49833495c85f26d070e70148fd9607c6fbf927fd
>
> --
> Lee Jones [李琼斯]
>

Yixun Lan plans to merge patches 5-7 of this series.

That leaves patch 3, which enables regulator support, and patch 4, which
adds RTC support. How should these two patches be merged?

Mark has reviewed the regulator patch 3 and Alexandre has acked the RTC
patch 4.

Thank you.
-Alex From mani at kernel.org Thu Sep 11 10:03:18 2025 From: mani at kernel.org (Manivannan Sadhasivam) Date: Thu, 11 Sep 2025 22:33:18 +0530 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> Message-ID: On Wed, Sep 10, 2025 at 10:56:23AM GMT, Inochi Amaoto wrote: > On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: > > From: Chen Wang > > > > Add support for PCIe controller in SG2042 SoC. The controller > > uses the Cadence PCIe core programmed by pcie-cadence*.c. The > > PCIe controller will work in host mode only, supporting data > > rate(gen4) and lanes(x16 or x8). > > > > Signed-off-by: Chen Wang > > --- > > drivers/pci/controller/cadence/Kconfig | 10 ++ > > drivers/pci/controller/cadence/Makefile | 1 + > > drivers/pci/controller/cadence/pcie-sg2042.c | 104 +++++++++++++++++++ > > 3 files changed, 115 insertions(+) > > create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c > > > > diff --git a/drivers/pci/controller/cadence/Kconfig b/drivers/pci/controller/cadence/Kconfig > > index 666e16b6367f..02a639e55fd8 100644 > > --- a/drivers/pci/controller/cadence/Kconfig > > +++ b/drivers/pci/controller/cadence/Kconfig > > @@ -42,6 +42,15 @@ config PCIE_CADENCE_PLAT_EP > > endpoint mode. This PCIe controller may be embedded into many > > different vendors SoCs. > > > > +config PCIE_SG2042_HOST > > + tristate "Sophgo SG2042 PCIe controller (host mode)" > > + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) > > + select PCIE_CADENCE_HOST > > + help > > + Say Y here if you want to support the Sophgo SG2042 PCIe platform > > + controller in host mode. Sophgo SG2042 PCIe controller uses Cadence > > + PCIe core. 
> > + > > config PCI_J721E > > tristate > > select PCIE_CADENCE_HOST if PCI_J721E_HOST != n > > @@ -67,4 +76,5 @@ config PCI_J721E_EP > > Say Y here if you want to support the TI J721E PCIe platform > > controller in endpoint mode. TI J721E PCIe controller uses Cadence PCIe > > core. > > + > > endmenu > > diff --git a/drivers/pci/controller/cadence/Makefile b/drivers/pci/controller/cadence/Makefile > > index 9bac5fb2f13d..5e23f8539ecc 100644 > > --- a/drivers/pci/controller/cadence/Makefile > > +++ b/drivers/pci/controller/cadence/Makefile > > @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o > > obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o > > obj-$(CONFIG_PCIE_CADENCE_PLAT) += pcie-cadence-plat.o > > obj-$(CONFIG_PCI_J721E) += pci-j721e.o > > +obj-$(CONFIG_PCIE_SG2042_HOST) += pcie-sg2042.o > > diff --git a/drivers/pci/controller/cadence/pcie-sg2042.c b/drivers/pci/controller/cadence/pcie-sg2042.c > > new file mode 100644 > > index 000000000000..c026e1ca5d6e > > --- /dev/null > > +++ b/drivers/pci/controller/cadence/pcie-sg2042.c > > @@ -0,0 +1,104 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * pcie-sg2042 - PCIe controller driver for Sophgo SG2042 SoC > > + * > > + * Copyright (C) 2025 Sophgo Technology Inc. > > + * Copyright (C) 2025 Chen Wang > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > + > > +#include "pcie-cadence.h" > > + > > +/* > > + * SG2042 only supports 4-byte aligned access, so for the rootbus (i.e. to > > + * read/write the Root Port itself, read32/write32 is required. For > > + * non-rootbus (i.e. to read/write the PCIe peripheral registers, supports > > + * 1/2/4 byte aligned access, so directly using read/write should be fine. 
> > + */ > > + > > +static struct pci_ops sg2042_pcie_root_ops = { > > + .map_bus = cdns_pci_map_bus, > > + .read = pci_generic_config_read32, > > + .write = pci_generic_config_write32, > > +}; > > + > > +static struct pci_ops sg2042_pcie_child_ops = { > > + .map_bus = cdns_pci_map_bus, > > + .read = pci_generic_config_read, > > + .write = pci_generic_config_write, > > +}; > > + > > +static int sg2042_pcie_probe(struct platform_device *pdev) > > +{ > > + struct device *dev = &pdev->dev; > > + struct pci_host_bridge *bridge; > > + struct cdns_pcie *pcie; > > + struct cdns_pcie_rc *rc; > > + int ret; > > + > > + bridge = devm_pci_alloc_host_bridge(dev, sizeof(*rc)); > > + if (!bridge) { > > + dev_err_probe(dev, -ENOMEM, "Failed to alloc host bridge!\n"); > > + return -ENOMEM; > > + } > > + > > + bridge->ops = &sg2042_pcie_root_ops; > > + bridge->child_ops = &sg2042_pcie_child_ops; > > + > > + rc = pci_host_bridge_priv(bridge); > > + pcie = &rc->pcie; > > + pcie->dev = dev; > > + > > + platform_set_drvdata(pdev, pcie); > > + > > + pm_runtime_set_active(dev); > > + pm_runtime_no_callbacks(dev); > > + devm_pm_runtime_enable(dev); > > + > > + ret = cdns_pcie_init_phy(dev, pcie); > > + if (ret) { > > + dev_err_probe(dev, ret, "Failed to init phy!\n"); > > + return ret; > > + } > > + > > + ret = cdns_pcie_host_setup(rc); > > + if (ret) { > > + dev_err_probe(dev, ret, "Failed to setup host!\n"); > > + cdns_pcie_disable_phy(pcie); > > + return ret; > > + } > > + > > + return 0; > > +} > > + > > > +static void sg2042_pcie_remove(struct platform_device *pdev) > > +{ > > + struct cdns_pcie *pcie = platform_get_drvdata(pdev); > > + > > + cdns_pcie_disable_phy(pcie); > > +} > > + > > I think this remove is useless, as it is almost impossible to > remove a pcie at runtime. > Why impossible? We only have concerns with removing PCIe controllers implementing irqchip, but this driver is not implementing it and using an external irqchip controller. 
So it is safe and possible to remove this driver during runtime.

- Mani

--
மணிவண்ணன் சதாசிவம்

From krzk at kernel.org Thu Sep 11 10:24:39 2025
From: krzk at kernel.org (Krzysztof Kozlowski)
Date: Thu, 11 Sep 2025 19:24:39 +0200
Subject: [RFC PATCH 1/4] dt-bindings: riscv: Add trace components description
In-Reply-To: <20250911124448.1771-2-cp0613@linux.alibaba.com>
References: <20250911124448.1771-1-cp0613@linux.alibaba.com> <20250911124448.1771-2-cp0613@linux.alibaba.com>
Message-ID:

On 11/09/2025 14:44, cp0613 at linux.alibaba.com wrote:
> From: Chen Pei
>
> This patch has added property definitions related to the riscv

Please do not use "This commit/patch/change", but imperative mood. See
longer explanation here:
https://elixir.bootlin.com/linux/v6.16/source/Documentation/process/submitting-patches.rst#L94
Please use scripts/get_maintainer.pl to get a list of necessary people
and lists to CC. It might happen that the command, when run on an older
kernel, gives you outdated entries. Therefore please be sure you base
your patches on a recent Linux kernel.

Tools like b4 or scripts/get_maintainer.pl provide you with the proper
list of people, so fix your workflow. Tools might also fail if you work
on some ancient tree (don't, instead use mainline) or work on a fork of
the kernel (don't, instead use mainline). Just use b4 and everything
should be fine, although remember about `b4 prep --auto-to-cc` if you
added new patches to the patchset.

You missed at least the devicetree list (maybe more), so this won't be
tested by automated tooling. Performing review on untested code might be
a waste of time.

Please kindly resend and include all necessary To/Cc entries.
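
For illustration only, the recipient lookup described above could be wrapped roughly as follows. This is a minimal sketch and not part of the review itself: the helper name `list_recipients` and the `outgoing/` directory are hypothetical, and it assumes it is run from the top of a recent mainline checkout with `perl` installed.

```shell
#!/bin/sh
# Sketch only: list_recipients and outgoing/ are hypothetical names.

# list_recipients PATCH...
# Runs scripts/get_maintainer.pl on the given patch files and prints the
# suggested To/Cc entries. Fails early when the current directory is not
# the top of a kernel tree, because the script expects to run from there.
list_recipients() {
    if [ ! -f scripts/get_maintainer.pl ]; then
        echo "run this from the top of a kernel tree" >&2
        return 1
    fi
    # --norolestats drops the "(maintainer:...)" annotations from the output
    perl scripts/get_maintainer.pl --norolestats "$@"
}

# Example, from the top of a mainline tree (not executed here):
#   list_recipients outgoing/*.patch
```

With b4, the equivalent step is the `b4 prep --auto-to-cc` command mentioned in the message, which collects the recipients for the tracked series itself.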
> trace component, providing a foundation for subsequent driver > implementations. > ... > +$id: http://devicetree.org/schemas/riscv/trace/riscv,trace,funnel.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: RISC-V Trace Funnel Controller > + > +description: | > + riscv trace funnel controller description. > + > +maintainers: > + - Chen Pei > + > +properties: > + compatible: > + items: > + - const: riscv_trace,funnel-controller You need to start following DTS coding style. > + reg: > + description: A memory region containing registers for funnel controller > + > + ports: > + description: Input/Output port definitions > + > + level: > + description: Level of the funnel (e.g., 1 means close to the encoder) > + > +additionalProperties: true No clue from where you got this, but that's not how DT bindings are written. Maybe you used some AI tools for that - in that case, it would be strong grumpy NAK. :( You just waste community time with such approach. Please start from scratch from example-schema or known good bindings. Best regards, Krzysztof From conor at kernel.org Thu Sep 11 11:07:10 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 11 Sep 2025 19:07:10 +0100 Subject: [PATCH v3 0/5] riscv: dts: starfive: Add Milk-V Mars CM (Lite) SoM In-Reply-To: <20250905144011.928332-1-e@freeshell.de> References: <20250905144011.928332-1-e@freeshell.de> Message-ID: <20250911-smoked-aviation-b514261e547e@spud> Emil, This look okay to take? On Fri, Sep 05, 2025 at 07:39:38AM -0700, E Shattow wrote: > Milk-V Mars CM and Mars CM Lite System-on-Module both are based on the > StarFive JH7110 SoC and compatible with the Raspberry Pi CM4IO Classic IO > Board carrier. Mars CM Lite is Mars CM without the eMMC storage component > on mmc0 and the mmc0 interface configured instead for SD Card use. 
The > optional WiFi+BT chipset is connected via SDIO on the mmc1 interface that > would otherwise be connected to an SD Card slot on the StarFive > VisionFive2 reference design. > > Add the related devicetree files for both Milk-V Mars CM and Milk-V Mars > CM Lite describing the currently supported features, namely PMIC, UART, > I2C, GPIO, eMMC or SD Card, WiFi+BT, QSPI Flash, and Ethernet. > > Caveat with vendor AP6256 firmware files present the firmware loading is > successful but subsequently fails IRQ wake initialization. Common GPIO > conflicts for "WiFi" optioned boards having this module: > > pwmdac_pins: > - AP6256: WL_REG_ON>>WIFI_REG_ON_H_GPIO33 > - AP6256: WL_HOST_WAKE>>WIFI_WAKE_HOST_H_GPIO34 > > i2c2_pins: > - AP6256: UART_CTS_N< - AP6256: UART_RTS_N>>UART1_CTSn_GPIO3 > > i2c6_pins: > - AP6256: UART_RXD< - AP6256: UART_TXD>>UART_RX_GPIO17 > > Tested successfully for basic mmc0 storage, USB, and network functionality on: > - Milk-V Mars CM 8GB > - Milk-V Mars CM Lite 4GB > - Milk-V Mars CM Lite WiFi 8GB > > Changes since v2: > - PATCH 3/5 delete newline at end of file > - PATCH 5/5 delete newline at end of file > > Link to v2: > https://lore.kernel.org/lkml/20250831225959.531393-1-e at freeshell.de/ > > E Shattow (5): > riscv: dts: starfive: add common board dtsi for Milk-V Mars CM > variants > dt-bindings: riscv: starfive: add milkv,marscm-emmc > riscv: dts: starfive: add Milk-V Mars CM system-on-module > dt-bindings: riscv: starfive: add milkv,marscm-lite > riscv: dts: starfive: add Milk-V Mars CM Lite system-on-module > > .../devicetree/bindings/riscv/starfive.yaml | 2 + > arch/riscv/boot/dts/starfive/Makefile | 2 + > .../dts/starfive/jh7110-milkv-marscm-emmc.dts | 12 ++ > .../dts/starfive/jh7110-milkv-marscm-lite.dts | 25 +++ > .../dts/starfive/jh7110-milkv-marscm.dtsi | 159 ++++++++++++++++++ > 5 files changed, 200 insertions(+) > create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-emmc.dts > create mode 100644 
arch/riscv/boot/dts/starfive/jh7110-milkv-marscm-lite.dts > create mode 100644 arch/riscv/boot/dts/starfive/jh7110-milkv-marscm.dtsi > > > base-commit: 8181cc2f3f21657392da912eb20ee17514c87828 > -- > 2.50.0 > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: not available URL: From wangruikang at iscas.ac.cn Thu Sep 11 11:13:57 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:57 +0800 Subject: [PATCH net-next v11 5/5] riscv: dts: spacemit: Add Ethernet support for Jupiter In-Reply-To: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> Message-ID: <20250912-net-k1-emac-v11-5-aa3e84f8043b@iscas.ac.cn> Milk-V Jupiter uses an RGMII PHY for each port and uses GPIO for PHY reset. Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts | 46 +++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts index 4483192141049caa201c093fb206b6134a064f42..c5933555c06b66f40e61fe2b9c159ba0770c2fa1 100644 --- a/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts +++ b/arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts @@ -20,6 +20,52 @@ chosen { }; }; +&eth0 { + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; +}; + +&eth1 { + phy-handle = <&rgmii1>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac1_cfg>; + rx-internal-delay-ps = <0>; +
tx-internal-delay-ps = <250>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii1: phy@1 { + reg = <0x1>; + }; + }; +}; + &uart0 { pinctrl-names = "default"; pinctrl-0 = <&uart0_2_cfg>; -- 2.50.1 From wangruikang at iscas.ac.cn Thu Sep 11 11:13:53 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:53 +0800 Subject: [PATCH net-next v11 1/5] dt-bindings: net: Add support for SpacemiT K1 In-Reply-To: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> Message-ID: <20250912-net-k1-emac-v11-1-aa3e84f8043b@iscas.ac.cn> The Ethernet MACs on SpacemiT K1 appears to be a custom design. SpacemiT refers to them as "EMAC", so let's just call them "spacemit,k1-emac". Signed-off-by: Vivian Wang Reviewed-by: Conor Dooley --- .../devicetree/bindings/net/spacemit,k1-emac.yaml | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml new file mode 100644 index 0000000000000000000000000000000000000000..500a3e1daa230ea3a1fad30d8ea56a7822fccb3d --- /dev/null +++ b/Documentation/devicetree/bindings/net/spacemit,k1-emac.yaml @@ -0,0 +1,81 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/spacemit,k1-emac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: SpacemiT K1 Ethernet MAC + +allOf: + - $ref: ethernet-controller.yaml# + +maintainers: + - Vivian Wang + +properties: + compatible: + const: spacemit,k1-emac + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + interrupts: + maxItems: 1 + + mdio-bus: + $ref: mdio.yaml# + unevaluatedProperties: false + + resets: + maxItems: 1 + + spacemit,apmu: + $ref:
/schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: phandle to syscon that controls this MAC + - description: offset of control registers + description: + A phandle to syscon with byte offset to control registers for this MAC + +required: + - compatible + - reg + - clocks + - interrupts + - resets + - spacemit,apmu + +unevaluatedProperties: false + +examples: + - | + #include + + ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0xcac80000 0x00000420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + resets = <&syscon_apmu RESET_EMAC0>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; + }; -- 2.50.1 From wangruikang at iscas.ac.cn Thu Sep 11 11:13:55 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:55 +0800 Subject: [PATCH net-next v11 3/5] riscv: dts: spacemit: Add Ethernet support for K1 In-Reply-To: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> Message-ID: <20250912-net-k1-emac-v11-3-aa3e84f8043b@iscas.ac.cn> Add nodes for each of the two Ethernet MACs on K1 with generic properties. Also add "gmac" pins to pinctrl config.
Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 48 ++++++++++++++++++++++++++++ arch/riscv/boot/dts/spacemit/k1.dtsi | 22 +++++++++++++ 2 files changed, 70 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi index 3810557374228100be7adab58cd785c72e6d4aed..aff19c86d5ff381881016eaa87fc4809da65b50e 100644 --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi @@ -11,6 +11,54 @@ #define K1_GPIO(x) (x / 32) (x % 32) &pinctrl { + gmac0_cfg: gmac0-cfg { + gmac0-pins { + pinmux = , /* gmac0_rxdv */ + , /* gmac0_rx_d0 */ + , /* gmac0_rx_d1 */ + , /* gmac0_rx_clk */ + , /* gmac0_rx_d2 */ + , /* gmac0_rx_d3 */ + , /* gmac0_tx_d0 */ + , /* gmac0_tx_d1 */ + , /* gmac0_tx */ + , /* gmac0_tx_d2 */ + , /* gmac0_tx_d3 */ + , /* gmac0_tx_en */ + , /* gmac0_mdc */ + , /* gmac0_mdio */ + , /* gmac0_int_n */ + ; /* gmac0_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + + gmac1_cfg: gmac1-cfg { + gmac1-pins { + pinmux = , /* gmac1_rxdv */ + , /* gmac1_rx_d0 */ + , /* gmac1_rx_d1 */ + , /* gmac1_rx_clk */ + , /* gmac1_rx_d2 */ + , /* gmac1_rx_d3 */ + , /* gmac1_tx_d0 */ + , /* gmac1_tx_d1 */ + , /* gmac1_tx */ + , /* gmac1_tx_d2 */ + , /* gmac1_tx_d3 */ + , /* gmac1_tx_en */ + , /* gmac1_mdc */ + , /* gmac1_mdio */ + , /* gmac1_int_n */ + ; /* gmac1_clk_ref */ + + bias-pull-up = <0>; + drive-strength = <21>; + }; + }; + uart0_2_cfg: uart0-2-cfg { uart0-2-pins { pinmux = , diff --git a/arch/riscv/boot/dts/spacemit/k1.dtsi b/arch/riscv/boot/dts/spacemit/k1.dtsi index abde8bb07c95c5a745736a2dd6f0c0e0d7c696e4..7b2ac3637d6d9fa1929418cc68aa25c57850ac7f 100644 --- a/arch/riscv/boot/dts/spacemit/k1.dtsi +++ b/arch/riscv/boot/dts/spacemit/k1.dtsi @@ -805,6 +805,28 @@ network-bus { #size-cells = <2>; dma-ranges = <0x0 0x00000000 0x0 0x00000000 0x0 0x80000000>, <0x0 0x80000000 0x1 0x00000000 0x0 0x80000000>; + + 
eth0: ethernet@cac80000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac80000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC0_BUS>; + interrupts = <131>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC0>; + spacemit,apmu = <&syscon_apmu 0x3e4>; + status = "disabled"; + }; + + eth1: ethernet@cac81000 { + compatible = "spacemit,k1-emac"; + reg = <0x0 0xcac81000 0x0 0x420>; + clocks = <&syscon_apmu CLK_EMAC1_BUS>; + interrupts = <133>; + mac-address = [ 00 00 00 00 00 00 ]; + resets = <&syscon_apmu RESET_EMAC1>; + spacemit,apmu = <&syscon_apmu 0x3ec>; + status = "disabled"; + }; }; pcie-bus { -- 2.50.1 From wangruikang at iscas.ac.cn Thu Sep 11 11:13:54 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:54 +0800 Subject: [PATCH net-next v11 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> Message-ID: <20250912-net-k1-emac-v11-2-aa3e84f8043b@iscas.ac.cn> The Ethernet MACs found on SpacemiT K1 appears to be a custom design that only superficially resembles some other embedded MACs. SpacemiT refers to them as "EMAC", so let's just call the driver "k1_emac". Supports RGMII and RMII interfaces. Includes support for MAC hardware statistics counters. PTP support is not implemented.
Signed-off-by: Vivian Wang Reviewed-by: Maxime Chevallier Reviewed-by: Vadim Fedorenko Reviewed-by: Troy Mitchell Tested-by: Junhui Liu Tested-by: Troy Mitchell --- drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/spacemit/Kconfig | 29 + drivers/net/ethernet/spacemit/Makefile | 6 + drivers/net/ethernet/spacemit/k1_emac.c | 2161 +++++++++++++++++++++++++++++++ drivers/net/ethernet/spacemit/k1_emac.h | 416 ++++++ 6 files changed, 2614 insertions(+) diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index f86d4557d8d7756a5e27bc17578353b5c19ca108..aead145dd91d129b7bb410f2d4d754c744dddbf4 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -188,6 +188,7 @@ source "drivers/net/ethernet/sis/Kconfig" source "drivers/net/ethernet/sfc/Kconfig" source "drivers/net/ethernet/smsc/Kconfig" source "drivers/net/ethernet/socionext/Kconfig" +source "drivers/net/ethernet/spacemit/Kconfig" source "drivers/net/ethernet/stmicro/Kconfig" source "drivers/net/ethernet/sun/Kconfig" source "drivers/net/ethernet/sunplus/Kconfig" diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile index 67182339469a0d8337cc4e92aa51e498c615156d..998dd628b202ced212748450753fe180f0440c74 100644 --- a/drivers/net/ethernet/Makefile +++ b/drivers/net/ethernet/Makefile @@ -91,6 +91,7 @@ obj-$(CONFIG_NET_VENDOR_SOLARFLARE) += sfc/ obj-$(CONFIG_NET_VENDOR_SGI) += sgi/ obj-$(CONFIG_NET_VENDOR_SMSC) += smsc/ obj-$(CONFIG_NET_VENDOR_SOCIONEXT) += socionext/ +obj-$(CONFIG_NET_VENDOR_SPACEMIT) += spacemit/ obj-$(CONFIG_NET_VENDOR_STMICRO) += stmicro/ obj-$(CONFIG_NET_VENDOR_SUN) += sun/ obj-$(CONFIG_NET_VENDOR_SUNPLUS) += sunplus/ diff --git a/drivers/net/ethernet/spacemit/Kconfig b/drivers/net/ethernet/spacemit/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..85ef61a9b4eff4249ad2d32a6e7dbf283b0c180f --- /dev/null +++ b/drivers/net/ethernet/spacemit/Kconfig @@ -0,0 +1,29 @@ +config 
NET_VENDOR_SPACEMIT + bool "SpacemiT devices" + default y + depends on ARCH_SPACEMIT || COMPILE_TEST + help + If you have a network (Ethernet) device belonging to this class, + say Y. + + Note that the answer to this question does not directly affect + the kernel: saying N will just cause the configurator to skip all + the questions regarding SpacemiT devices. If you say Y, you will + be asked for your specific chipset/driver in the following questions. + +if NET_VENDOR_SPACEMIT + +config SPACEMIT_K1_EMAC + tristate "SpacemiT K1 Ethernet MAC driver" + depends on ARCH_SPACEMIT || COMPILE_TEST + depends on MFD_SYSCON + depends on OF + default m if ARCH_SPACEMIT + select PHYLIB + help + This driver supports the Ethernet MAC in the SpacemiT K1 SoC. + + To compile this driver as a module, choose M here: the module + will be called k1_emac. + +endif # NET_VENDOR_SPACEMIT diff --git a/drivers/net/ethernet/spacemit/Makefile b/drivers/net/ethernet/spacemit/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..d29efd997a4ff5dcb50986e439997df7e3650570 --- /dev/null +++ b/drivers/net/ethernet/spacemit/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the SpacemiT network device drivers. +# + +obj-$(CONFIG_SPACEMIT_K1_EMAC) += k1_emac.o diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c new file mode 100644 index 0000000000000000000000000000000000000000..c9723cea98aeb97260af0dbbda13be6927872d21 --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.c @@ -0,0 +1,2161 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * SpacemiT K1 Ethernet driver + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. 
Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "k1_emac.h" + +#define DRIVER_NAME "k1_emac" + +#define EMAC_DEFAULT_BUFSIZE 1536 +#define EMAC_RX_BUF_2K 2048 +#define EMAC_RX_BUF_4K 4096 + +/* Tuning parameters from SpacemiT */ +#define EMAC_TX_FRAMES 64 +#define EMAC_TX_COAL_TIMEOUT 40000 +#define EMAC_RX_FRAMES 64 +#define EMAC_RX_COAL_TIMEOUT (600 * 312) + +#define DEFAULT_FC_PAUSE_TIME 0xffff +#define DEFAULT_FC_FIFO_HIGH 1600 +#define DEFAULT_TX_ALMOST_FULL 0x1f8 +#define DEFAULT_TX_THRESHOLD 1518 +#define DEFAULT_RX_THRESHOLD 12 +#define DEFAULT_TX_RING_NUM 1024 +#define DEFAULT_RX_RING_NUM 1024 +#define DEFAULT_DMA_BURST MREGBIT_BURST_16WORD +#define HASH_TABLE_SIZE 64 + +struct desc_buf { + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +struct emac_tx_desc_buffer { + struct sk_buff *skb; + struct desc_buf buf[2]; +}; + +struct emac_rx_desc_buffer { + struct sk_buff *skb; + u64 dma_addr; + void *buff_addr; + u16 dma_len; + u8 map_as_page; +}; + +/** + * struct emac_desc_ring - Software-side information for one descriptor ring + * Same structure used for both RX and TX + * @desc_addr: Virtual address to the descriptor ring memory + * @desc_dma_addr: DMA address of the descriptor ring + * @total_size: Size of ring in bytes + * @total_cnt: Number of descriptors + * @head: Next descriptor to associate a buffer with + * @tail: Next descriptor to check status bit + * @rx_desc_buf: Array of descriptors for RX + * @tx_desc_buf: Array of descriptors for TX, with max of two buffers each + */ +struct emac_desc_ring { + void *desc_addr; + dma_addr_t desc_dma_addr; + u32 total_size; + u32 total_cnt; + u32 head; + u32 tail; + union { + struct emac_rx_desc_buffer *rx_desc_buf; + struct 
emac_tx_desc_buffer *tx_desc_buf; + }; +}; + +struct emac_priv { + void __iomem *iobase; + u32 dma_buf_sz; + struct emac_desc_ring tx_ring; + struct emac_desc_ring rx_ring; + + struct net_device *ndev; + struct napi_struct napi; + struct platform_device *pdev; + struct clk *bus_clk; + struct clk *ref_clk; + struct regmap *regmap_apmu; + u32 regmap_apmu_offset; + int irq; + + phy_interface_t phy_interface; + + union emac_hw_tx_stats tx_stats, tx_stats_off; + union emac_hw_rx_stats rx_stats, rx_stats_off; + + u32 tx_count_frames; + u32 tx_coal_frames; + u32 tx_coal_timeout; + struct work_struct tx_timeout_task; + + struct timer_list txtimer; + struct timer_list stats_timer; + + u32 tx_delay; + u32 rx_delay; + + bool flow_control_autoneg; + u8 flow_control; + + /* Hold while touching hardware statistics */ + spinlock_t stats_lock; +}; + +static void emac_wr(struct emac_priv *priv, u32 reg, u32 val) +{ + writel(val, priv->iobase + reg); +} + +static u32 emac_rd(struct emac_priv *priv, u32 reg) +{ + return readl(priv->iobase + reg); +} + +static int emac_phy_interface_config(struct emac_priv *priv) +{ + u32 val = 0, mask = REF_CLK_SEL | RGMII_TX_CLK_SEL | PHY_INTF_RGMII; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) + val |= PHY_INTF_RGMII; + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + mask, val); + + return 0; +} + +/* + * Where the hardware expects a MAC address, it is laid out in this high, med, + * low order in three consecutive registers and in this format. 
+ */ + +static void emac_set_mac_addr_reg(struct emac_priv *priv, + const unsigned char *addr, + u32 reg) +{ + emac_wr(priv, reg + sizeof(u32) * 0, addr[1] << 8 | addr[0]); + emac_wr(priv, reg + sizeof(u32) * 1, addr[3] << 8 | addr[2]); + emac_wr(priv, reg + sizeof(u32) * 2, addr[5] << 8 | addr[4]); +} + +static void emac_set_mac_addr(struct emac_priv *priv, const unsigned char *addr) +{ + /* We use only one address, so set the same for flow control as well */ + emac_set_mac_addr_reg(priv, addr, MAC_ADDRESS1_HIGH); + emac_set_mac_addr_reg(priv, addr, MAC_FC_SOURCE_ADDRESS_HIGH); +} + +static void emac_reset_hw(struct emac_priv *priv) +{ + /* Disable all interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Disable DMA */ + emac_wr(priv, DMA_CONTROL, 0x0); +} + +static void emac_init_hw(struct emac_priv *priv) +{ + /* Destination address for 802.3x Ethernet flow control */ + u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; + + u32 rxirq = 0, dma = 0; + + regmap_set_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, + AXI_SINGLE_ID); + + /* Disable transmit and receive units */ + emac_wr(priv, MAC_RECEIVE_CONTROL, 0x0); + emac_wr(priv, MAC_TRANSMIT_CONTROL, 0x0); + + /* Enable MAC address 1 filtering */ + emac_wr(priv, MAC_ADDRESS_CONTROL, MREGBIT_MAC_ADDRESS1_ENABLE); + + /* Zero initialize the multicast hash table */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); + + /* Configure thresholds */ + emac_wr(priv, MAC_TRANSMIT_FIFO_ALMOST_FULL, DEFAULT_TX_ALMOST_FULL); + emac_wr(priv, MAC_TRANSMIT_PACKET_START_THRESHOLD, + DEFAULT_TX_THRESHOLD); + emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, 
DEFAULT_RX_THRESHOLD); + + /* Configure flow control (enabled in emac_adjust_link() later) */ + emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); + emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); + emac_wr(priv, MAC_FC_HIGH_PAUSE_TIME, DEFAULT_FC_PAUSE_TIME); + emac_wr(priv, MAC_FC_PAUSE_LOW_THRESHOLD, 0); + + /* RX IRQ mitigation */ + rxirq = FIELD_PREP(MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK, + EMAC_RX_FRAMES); + rxirq |= FIELD_PREP(MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK, + EMAC_RX_COAL_TIMEOUT); + rxirq |= MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE; + emac_wr(priv, DMA_RECEIVE_IRQ_MITIGATION_CTRL, rxirq); + + /* Disable and set DMA config */ + emac_wr(priv, DMA_CONTROL, 0x0); + + emac_wr(priv, DMA_CONFIGURATION, MREGBIT_SOFTWARE_RESET); + usleep_range(9000, 10000); + emac_wr(priv, DMA_CONFIGURATION, 0x0); + usleep_range(9000, 10000); + + dma |= MREGBIT_STRICT_BURST; + dma |= MREGBIT_DMA_64BIT_MODE; + dma |= DEFAULT_DMA_BURST; + + emac_wr(priv, DMA_CONFIGURATION, dma); +} + +static void emac_dma_start_transmit(struct emac_priv *priv) +{ + /* The actual value written does not matter */ + emac_wr(priv, DMA_TRANSMIT_POLL_DEMAND, 1); +} + +static void emac_enable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val |= MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val |= MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static void emac_disable_interrupt(struct emac_priv *priv) +{ + u32 val; + + val = emac_rd(priv, DMA_INTERRUPT_ENABLE); + val &= ~MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE; + val &= ~MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE; + emac_wr(priv, DMA_INTERRUPT_ENABLE, val); +} + +static u32 emac_tx_avail(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 avail; + + if (tx_ring->tail > tx_ring->head) + avail = tx_ring->tail - tx_ring->head - 1; + else + avail = tx_ring->total_cnt - tx_ring->head + 
tx_ring->tail - 1; + + return avail; +} + +static void emac_tx_coal_timer_resched(struct emac_priv *priv) +{ + mod_timer(&priv->txtimer, + jiffies + usecs_to_jiffies(priv->tx_coal_timeout)); +} + +static void emac_tx_coal_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, txtimer); + + napi_schedule(&priv->napi); +} + +static bool emac_tx_should_interrupt(struct emac_priv *priv, u32 pkt_num) +{ + priv->tx_count_frames += pkt_num; + if (likely(priv->tx_coal_frames > priv->tx_count_frames)) { + emac_tx_coal_timer_resched(priv); + return false; + } + + priv->tx_count_frames = 0; + return true; +} + +static void emac_free_tx_buf(struct emac_priv *priv, int i) +{ + struct emac_tx_desc_buffer *tx_buf; + struct emac_desc_ring *tx_ring; + struct desc_buf *buf; + int j; + + tx_ring = &priv->tx_ring; + tx_buf = &tx_ring->tx_desc_buf[i]; + + for (j = 0; j < 2; j++) { + buf = &tx_buf->buf[j]; + if (!buf->dma_addr) + continue; + + if (buf->map_as_page) + dma_unmap_page(&priv->pdev->dev, buf->dma_addr, + buf->dma_len, DMA_TO_DEVICE); + else + dma_unmap_single(&priv->pdev->dev, + buf->dma_addr, buf->dma_len, + DMA_TO_DEVICE); + + buf->dma_addr = 0; + buf->map_as_page = false; + buf->buff_addr = NULL; + } + + if (tx_buf->skb) { + dev_kfree_skb_any(tx_buf->skb); + tx_buf->skb = NULL; + } +} + +static void emac_clean_tx_desc_ring(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + u32 i; + + for (i = 0; i < tx_ring->total_cnt; i++) + emac_free_tx_buf(priv, i); + + tx_ring->head = 0; + tx_ring->tail = 0; +} + +static void emac_clean_rx_desc_ring(struct emac_priv *priv) +{ + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + u32 i; + + rx_ring = &priv->rx_ring; + + for (i = 0; i < rx_ring->total_cnt; i++) { + rx_buf = &rx_ring->rx_desc_buf[i]; + + if (!rx_buf->skb) + continue; + + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + rx_buf->dma_len, DMA_FROM_DEVICE); + + dev_kfree_skb(rx_buf->skb); + 
rx_buf->skb = NULL; + } + + rx_ring->tail = 0; + rx_ring->head = 0; +} + +static int emac_alloc_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct platform_device *pdev = priv->pdev; + + tx_ring->tx_desc_buf = kcalloc(tx_ring->total_cnt, + sizeof(*tx_ring->tx_desc_buf), + GFP_KERNEL); + + if (!tx_ring->tx_desc_buf) + return -ENOMEM; + + tx_ring->total_size = tx_ring->total_cnt * sizeof(struct emac_desc); + tx_ring->total_size = ALIGN(tx_ring->total_size, PAGE_SIZE); + + tx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, tx_ring->total_size, + &tx_ring->desc_dma_addr, + GFP_KERNEL); + if (!tx_ring->desc_addr) { + kfree(tx_ring->tx_desc_buf); + return -ENOMEM; + } + + tx_ring->head = 0; + tx_ring->tail = 0; + + return 0; +} + +static int emac_alloc_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = &priv->rx_ring; + struct platform_device *pdev = priv->pdev; + + rx_ring->rx_desc_buf = kcalloc(rx_ring->total_cnt, + sizeof(*rx_ring->rx_desc_buf), + GFP_KERNEL); + if (!rx_ring->rx_desc_buf) + return -ENOMEM; + + rx_ring->total_size = rx_ring->total_cnt * sizeof(struct emac_desc); + + rx_ring->total_size = ALIGN(rx_ring->total_size, PAGE_SIZE); + + rx_ring->desc_addr = dma_alloc_coherent(&pdev->dev, rx_ring->total_size, + &rx_ring->desc_dma_addr, + GFP_KERNEL); + if (!rx_ring->desc_addr) { + kfree(rx_ring->rx_desc_buf); + return -ENOMEM; + } + + rx_ring->head = 0; + rx_ring->tail = 0; + + return 0; +} + +static void emac_free_tx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *tr = &priv->tx_ring; + struct device *dev = &priv->pdev->dev; + + emac_clean_tx_desc_ring(priv); + + kfree(tr->tx_desc_buf); + tr->tx_desc_buf = NULL; + + dma_free_coherent(dev, tr->total_size, tr->desc_addr, + tr->desc_dma_addr); + tr->desc_addr = NULL; +} + +static void emac_free_rx_resources(struct emac_priv *priv) +{ + struct emac_desc_ring *rr = &priv->rx_ring; + struct device *dev = &priv->pdev->dev; + + 
emac_clean_rx_desc_ring(priv); + + kfree(rr->rx_desc_buf); + rr->rx_desc_buf = NULL; + + dma_free_coherent(dev, rr->total_size, rr->desc_addr, + rr->desc_dma_addr); + rr->desc_addr = NULL; +} + +static int emac_tx_clean_desc(struct emac_priv *priv) +{ + struct net_device *ndev = priv->ndev; + struct emac_desc_ring *tx_ring; + struct emac_desc *tx_desc; + u32 i; + + netif_tx_lock(ndev); + + tx_ring = &priv->tx_ring; + + i = tx_ring->tail; + + while (i != tx_ring->head) { + tx_desc = &((struct emac_desc *)tx_ring->desc_addr)[i]; + + /* Stop checking if desc still own by DMA */ + if (READ_ONCE(tx_desc->desc0) & TX_DESC_0_OWN) + break; + + emac_free_tx_buf(priv, i); + memset(tx_desc, 0, sizeof(struct emac_desc)); + + if (++i == tx_ring->total_cnt) + i = 0; + } + + tx_ring->tail = i; + + if (unlikely(netif_queue_stopped(ndev) && + emac_tx_avail(priv) > tx_ring->total_cnt / 4)) + netif_wake_queue(ndev); + + netif_tx_unlock(ndev); + + return 0; +} + +static bool emac_rx_frame_good(struct emac_priv *priv, struct emac_desc *desc) +{ + const char *msg; + u32 len; + + len = FIELD_GET(RX_DESC_0_FRAME_PACKET_LENGTH_MASK, desc->desc0); + + if (WARN_ON_ONCE(!(desc->desc0 & RX_DESC_0_LAST_DESCRIPTOR))) + msg = "Not last descriptor"; /* This would be a bug */ + else if (desc->desc0 & RX_DESC_0_FRAME_RUNT) + msg = "Runt frame"; + else if (desc->desc0 & RX_DESC_0_FRAME_CRC_ERR) + msg = "Frame CRC error"; + else if (desc->desc0 & RX_DESC_0_FRAME_MAX_LEN_ERR) + msg = "Frame exceeds max length"; + else if (desc->desc0 & RX_DESC_0_FRAME_JABBER_ERR) + msg = "Frame jabber error"; + else if (desc->desc0 & RX_DESC_0_FRAME_LENGTH_ERR) + msg = "Frame length error"; + else if (len <= ETH_FCS_LEN || len > priv->dma_buf_sz) + msg = "Frame length unacceptable"; + else + return true; /* All good */ + + dev_dbg_ratelimited(&priv->ndev->dev, "RX error: %s", msg); + + return false; +} + +static void emac_alloc_rx_desc_buffers(struct emac_priv *priv) +{ + struct emac_desc_ring *rx_ring = 
&priv->rx_ring; + struct emac_desc rx_desc, *rx_desc_addr; + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct sk_buff *skb; + u32 i; + + i = rx_ring->head; + rx_buf = &rx_ring->rx_desc_buf[i]; + + while (!rx_buf->skb) { + skb = netdev_alloc_skb_ip_align(ndev, priv->dma_buf_sz); + if (!skb) + break; + + skb->dev = ndev; + + rx_buf->skb = skb; + rx_buf->dma_len = priv->dma_buf_sz; + rx_buf->dma_addr = dma_map_single(&priv->pdev->dev, skb->data, + priv->dma_buf_sz, + DMA_FROM_DEVICE); + if (dma_mapping_error(&priv->pdev->dev, rx_buf->dma_addr)) { + dev_err_ratelimited(&ndev->dev, "Mapping skb failed\n"); + goto err_free_skb; + } + + rx_desc_addr = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + memset(&rx_desc, 0, sizeof(rx_desc)); + + rx_desc.buffer_addr_1 = rx_buf->dma_addr; + rx_desc.desc1 = FIELD_PREP(RX_DESC_1_BUFFER_SIZE_1_MASK, + rx_buf->dma_len); + + if (++i == rx_ring->total_cnt) { + rx_desc.desc1 |= RX_DESC_1_END_RING; + i = 0; + } + + *rx_desc_addr = rx_desc; + dma_wmb(); + WRITE_ONCE(rx_desc_addr->desc0, rx_desc.desc0 | RX_DESC_0_OWN); + + rx_buf = &rx_ring->rx_desc_buf[i]; + } + + rx_ring->head = i; + return; + +err_free_skb: + dev_kfree_skb_any(skb); + rx_buf->skb = NULL; +} + +/* Returns number of packets received */ +static int emac_rx_clean_desc(struct emac_priv *priv, int budget) +{ + struct net_device *ndev = priv->ndev; + struct emac_rx_desc_buffer *rx_buf; + struct emac_desc_ring *rx_ring; + struct sk_buff *skb = NULL; + struct emac_desc *rx_desc; + u32 got = 0, skb_len, i; + + rx_ring = &priv->rx_ring; + + i = rx_ring->tail; + + while (budget--) { + rx_desc = &((struct emac_desc *)rx_ring->desc_addr)[i]; + + /* Stop checking if rx_desc still owned by DMA */ + if (READ_ONCE(rx_desc->desc0) & RX_DESC_0_OWN) + break; + + dma_rmb(); + + rx_buf = &rx_ring->rx_desc_buf[i]; + + if (!rx_buf->skb) + break; + + got++; + + dma_unmap_single(&priv->pdev->dev, rx_buf->dma_addr, + rx_buf->dma_len, DMA_FROM_DEVICE); + + if 
(likely(emac_rx_frame_good(priv, rx_desc))) { + skb = rx_buf->skb; + + skb_len = FIELD_GET(RX_DESC_0_FRAME_PACKET_LENGTH_MASK, + rx_desc->desc0); + skb_len -= ETH_FCS_LEN; + + skb_put(skb, skb_len); + skb->dev = ndev; + ndev->hard_header_len = ETH_HLEN; + + skb->protocol = eth_type_trans(skb, ndev); + + skb->ip_summed = CHECKSUM_NONE; + + napi_gro_receive(&priv->napi, skb); + + memset(rx_desc, 0, sizeof(struct emac_desc)); + rx_buf->skb = NULL; + } else { + dev_kfree_skb_irq(rx_buf->skb); + rx_buf->skb = NULL; + } + + if (++i == rx_ring->total_cnt) + i = 0; + } + + rx_ring->tail = i; + + emac_alloc_rx_desc_buffers(priv); + + return got; +} + +static int emac_rx_poll(struct napi_struct *napi, int budget) +{ + struct emac_priv *priv = container_of(napi, struct emac_priv, napi); + int work_done; + + emac_tx_clean_desc(priv); + + work_done = emac_rx_clean_desc(priv, budget); + if (work_done < budget && napi_complete_done(napi, work_done)) + emac_enable_interrupt(priv); + + return work_done; +} + +/* + * For convenience, skb->data is fragment 0, frags[0] is fragment 1, etc. + * + * Each descriptor can hold up to two fragments, called buffer 1 and 2. For each + * fragment f, if f % 2 == 0, it uses buffer 1, otherwise it uses buffer 2. 
+ */ + +static int emac_tx_map_frag(struct device *dev, struct emac_desc *tx_desc, + struct emac_tx_desc_buffer *tx_buf, + struct sk_buff *skb, u32 frag_idx) +{ + bool map_as_page, buf_idx; + const skb_frag_t *frag; + phys_addr_t addr; + u32 len; + int ret; + + buf_idx = frag_idx % 2; + + if (frag_idx == 0) { + /* Non-fragmented part */ + len = skb_headlen(skb); + addr = dma_map_single(dev, skb->data, len, DMA_TO_DEVICE); + map_as_page = false; + } else { + /* Fragment */ + frag = &skb_shinfo(skb)->frags[frag_idx - 1]; + len = skb_frag_size(frag); + addr = skb_frag_dma_map(dev, frag, 0, len, DMA_TO_DEVICE); + map_as_page = true; + } + + ret = dma_mapping_error(dev, addr); + if (ret) + return ret; + + tx_buf->buf[buf_idx].dma_addr = addr; + tx_buf->buf[buf_idx].dma_len = len; + tx_buf->buf[buf_idx].map_as_page = map_as_page; + + if (buf_idx == 0) { + tx_desc->buffer_addr_1 = addr; + tx_desc->desc1 |= FIELD_PREP(TX_DESC_1_BUFFER_SIZE_1_MASK, len); + } else { + tx_desc->buffer_addr_2 = addr; + tx_desc->desc1 |= FIELD_PREP(TX_DESC_1_BUFFER_SIZE_2_MASK, len); + } + + return 0; +} + +static void emac_tx_mem_map(struct emac_priv *priv, struct sk_buff *skb) +{ + struct emac_desc_ring *tx_ring = &priv->tx_ring; + struct emac_desc tx_desc, *tx_desc_addr; + struct device *dev = &priv->pdev->dev; + struct emac_tx_desc_buffer *tx_buf; + u32 head, old_head, frag_num, f; + bool buf_idx; + + frag_num = skb_shinfo(skb)->nr_frags; + head = tx_ring->head; + old_head = head; + + for (f = 0; f < frag_num + 1; f++) { + buf_idx = f % 2; + + /* + * If using buffer 1, initialize a new desc. Otherwise, use + * buffer 2 of previous fragment's desc. + */ + if (!buf_idx) { + tx_buf = &tx_ring->tx_desc_buf[head]; + tx_desc_addr = + &((struct emac_desc *)tx_ring->desc_addr)[head]; + memset(&tx_desc, 0, sizeof(tx_desc)); + + /* + * Give ownership for all but first desc initially. For + * first desc, give at the end so DMA cannot start + * reading uninitialized descs. 
+ */ + if (head != old_head) + tx_desc.desc0 |= TX_DESC_0_OWN; + + if (++head == tx_ring->total_cnt) { + /* Just used last desc in ring */ + tx_desc.desc1 |= TX_DESC_1_END_RING; + head = 0; + } + } + + if (emac_tx_map_frag(dev, &tx_desc, tx_buf, skb, f)) { + dev_err_ratelimited(&priv->ndev->dev, + "Map TX frag %d failed\n", f); + goto err_free_skb; + } + + if (f == 0) + tx_desc.desc1 |= TX_DESC_1_FIRST_SEGMENT; + + if (f == frag_num) { + tx_desc.desc1 |= TX_DESC_1_LAST_SEGMENT; + tx_buf->skb = skb; + if (emac_tx_should_interrupt(priv, frag_num + 1)) + tx_desc.desc1 |= + TX_DESC_1_INTERRUPT_ON_COMPLETION; + } + + *tx_desc_addr = tx_desc; + } + + /* All descriptors are ready, give ownership for first desc */ + tx_desc_addr = &((struct emac_desc *)tx_ring->desc_addr)[old_head]; + dma_wmb(); + WRITE_ONCE(tx_desc_addr->desc0, tx_desc_addr->desc0 | TX_DESC_0_OWN); + + emac_dma_start_transmit(priv); + + tx_ring->head = head; + + return; + +err_free_skb: + dev_dstats_tx_dropped(priv->ndev); + dev_kfree_skb_any(skb); +} + +static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + int nfrags = skb_shinfo(skb)->nr_frags; + struct device *dev = &priv->pdev->dev; + + if (unlikely(emac_tx_avail(priv) < nfrags + 1)) { + if (!netif_queue_stopped(ndev)) { + netif_stop_queue(ndev); + dev_err_ratelimited(dev, "TX ring full, stop TX queue\n"); + } + return NETDEV_TX_BUSY; + } + + emac_tx_mem_map(priv, skb); + + /* Make sure there is space in the ring for the next TX. */ + if (unlikely(emac_tx_avail(priv) <= MAX_SKB_FRAGS + 2)) + netif_stop_queue(ndev); + + return NETDEV_TX_OK; +} + +static int emac_set_mac_address(struct net_device *ndev, void *addr) +{ + struct emac_priv *priv = netdev_priv(ndev); + int ret = eth_mac_addr(ndev, addr); + + if (ret) + return ret; + + /* If running, set now; if not running it will be set in emac_up. 
*/ + if (netif_running(ndev)) + emac_set_mac_addr(priv, ndev->dev_addr); + + return 0; +} + +static void emac_mac_multicast_filter_clear(struct emac_priv *priv) +{ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0x0); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0x0); +} + +/* + * The upper 6 bits of the Ethernet CRC of the MAC address is used as the hash + * when matching multicast addresses. + */ +static u32 emac_ether_addr_hash(u8 addr[ETH_ALEN]) +{ + u32 crc32 = ether_crc(ETH_ALEN, addr); + + return crc32 >> 26; +} + +/* Configure Multicast and Promiscuous modes */ +static void emac_set_rx_mode(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct netdev_hw_addr *ha; + u32 mc_filter[4] = { 0 }; + u32 hash, reg, bit, val; + + val = emac_rd(priv, MAC_ADDRESS_CONTROL); + + val &= ~MREGBIT_PROMISCUOUS_MODE; + + if (ndev->flags & IFF_PROMISC) { + /* Enable promisc mode */ + val |= MREGBIT_PROMISCUOUS_MODE; + } else if ((ndev->flags & IFF_ALLMULTI) || + (netdev_mc_count(ndev) > HASH_TABLE_SIZE)) { + /* Accept all multicast frames by setting every bit */ + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, 0xffff); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, 0xffff); + } else if (!netdev_mc_empty(ndev)) { + emac_mac_multicast_filter_clear(priv); + netdev_for_each_mc_addr(ha, ndev) { + /* + * The hash table is an array of 4 16-bit registers. It + * is treated like an array of 64 bits (bits[hash]). 
+ */ + hash = emac_ether_addr_hash(ha->addr); + reg = hash / 16; + bit = hash % 16; + mc_filter[reg] |= BIT(bit); + } + emac_wr(priv, MAC_MULTICAST_HASH_TABLE1, mc_filter[0]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE2, mc_filter[1]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE3, mc_filter[2]); + emac_wr(priv, MAC_MULTICAST_HASH_TABLE4, mc_filter[3]); + } + + emac_wr(priv, MAC_ADDRESS_CONTROL, val); +} + +static int emac_change_mtu(struct net_device *ndev, int mtu) +{ + struct emac_priv *priv = netdev_priv(ndev); + u32 frame_len; + + if (netif_running(ndev)) { + netdev_err(ndev, "must be stopped to change MTU\n"); + return -EBUSY; + } + + frame_len = mtu + ETH_HLEN + ETH_FCS_LEN; + + if (frame_len <= EMAC_DEFAULT_BUFSIZE) + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + else if (frame_len <= EMAC_RX_BUF_2K) + priv->dma_buf_sz = EMAC_RX_BUF_2K; + else + priv->dma_buf_sz = EMAC_RX_BUF_4K; + + ndev->mtu = mtu; + + return 0; +} + +static void emac_tx_timeout(struct net_device *ndev, unsigned int txqueue) +{ + struct emac_priv *priv = netdev_priv(ndev); + + schedule_work(&priv->tx_timeout_task); +} + +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + cmd |= FIELD_PREP(MREGBIT_PHY_ADDRESS, phy_addr); + cmd |= FIELD_PREP(MREGBIT_REGISTER_ADDRESS, regnum); + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; + + emac_wr(priv, MAC_MDIO_DATA, 0x0); + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !(val & MREGBIT_START_MDIO_TRANS), 100, 10000); + + if (ret) + return ret; + + val = emac_rd(priv, MAC_MDIO_DATA); + return val; +} + +static int emac_mii_write(struct mii_bus *bus, int phy_addr, int regnum, + u16 value) +{ + struct emac_priv *priv = bus->priv; + u32 cmd = 0, val; + int ret; + + emac_wr(priv, MAC_MDIO_DATA, value); + + cmd |= FIELD_PREP(MREGBIT_PHY_ADDRESS, phy_addr); + cmd |= 
FIELD_PREP(MREGBIT_REGISTER_ADDRESS, regnum); + cmd |= MREGBIT_START_MDIO_TRANS; + + emac_wr(priv, MAC_MDIO_CONTROL, cmd); + + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, + !(val & MREGBIT_START_MDIO_TRANS), 100, 10000); + + return ret; +} + +static int emac_mdio_init(struct emac_priv *priv) +{ + struct device *dev = &priv->pdev->dev; + struct device_node *mii_np; + struct mii_bus *mii; + int ret; + + mii = devm_mdiobus_alloc(dev); + if (!mii) + return -ENOMEM; + + mii->priv = priv; + mii->name = "k1_emac_mii"; + mii->read = emac_mii_read; + mii->write = emac_mii_write; + mii->parent = dev; + mii->phy_mask = ~0; + snprintf(mii->id, MII_BUS_ID_SIZE, "%s", priv->pdev->name); + + mii_np = of_get_available_child_by_name(dev->of_node, "mdio-bus"); + + ret = devm_of_mdiobus_register(dev, mii, mii_np); + if (ret) + dev_err_probe(dev, ret, "Failed to register mdio bus\n"); + + of_node_put(mii_np); + return ret; +} + +static void emac_set_tx_fc(struct emac_priv *priv, bool enable) +{ + u32 val; + + val = emac_rd(priv, MAC_FC_CONTROL); + + FIELD_MODIFY(MREGBIT_FC_GENERATION_ENABLE, &val, enable); + FIELD_MODIFY(MREGBIT_AUTO_FC_GENERATION_ENABLE, &val, enable); + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_rx_fc(struct emac_priv *priv, bool enable) +{ + u32 val = emac_rd(priv, MAC_FC_CONTROL); + + FIELD_MODIFY(MREGBIT_FC_DECODE_ENABLE, &val, enable); + + emac_wr(priv, MAC_FC_CONTROL, val); +} + +static void emac_set_fc(struct emac_priv *priv, u8 fc) +{ + emac_set_tx_fc(priv, fc & FLOW_CTRL_TX); + emac_set_rx_fc(priv, fc & FLOW_CTRL_RX); + priv->flow_control = fc; +} + +static void emac_set_fc_autoneg(struct emac_priv *priv) +{ + struct phy_device *phydev = priv->ndev->phydev; + u32 local_adv, remote_adv; + u8 fc; + + local_adv = linkmode_adv_to_lcl_adv_t(phydev->advertising); + + remote_adv = 0; + + if (phydev->pause) + remote_adv |= LPA_PAUSE_CAP; + + if (phydev->asym_pause) + remote_adv |= LPA_PAUSE_ASYM; + + fc = 
mii_resolve_flowctrl_fdx(local_adv, remote_adv); + + priv->flow_control_autoneg = true; + + emac_set_fc(priv, fc); +} + +/* + * Even though this MAC supports gigabit operation, it only provides 32-bit + * statistics counters. The most overflow-prone counters are the "bytes" ones, + * which at gigabit overflow about twice a minute. + * + * Therefore, we maintain the high 32 bits of counters ourselves, incrementing + * every time statistics seem to go backwards. Also, update periodically to + * catch overflows when we are not otherwise checking the statistics often + * enough. + */ + +#define EMAC_STATS_TIMER_PERIOD 20 + +static int emac_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res, + u32 control_reg, u32 high_reg, u32 low_reg) +{ + u32 val, high, low; + int ret; + + /* The "read" bit is the same for TX and RX */ + + val = MREGBIT_START_TX_COUNTER_READ | cnt; + emac_wr(priv, control_reg, val); + val = emac_rd(priv, control_reg); + + ret = readl_poll_timeout_atomic(priv->iobase + control_reg, val, + !(val & MREGBIT_START_TX_COUNTER_READ), + 100, 10000); + + if (ret) { + netdev_err(priv->ndev, "Read stat timeout\n"); + return ret; + } + + high = emac_rd(priv, high_reg); + low = emac_rd(priv, low_reg); + *res = high << 16 | lower_16_bits(low); + + return 0; +} + +static int emac_tx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_TX_STATCTR_CONTROL, + MAC_TX_STATCTR_DATA_HIGH, + MAC_TX_STATCTR_DATA_LOW); +} + +static int emac_rx_read_stat_cnt(struct emac_priv *priv, u8 cnt, u32 *res) +{ + return emac_read_stat_cnt(priv, cnt, res, MAC_RX_STATCTR_CONTROL, + MAC_RX_STATCTR_DATA_HIGH, + MAC_RX_STATCTR_DATA_LOW); +} + +static void emac_update_counter(u64 *counter, u32 new_low) +{ + u32 old_low = lower_32_bits(*counter); + u64 high = upper_32_bits(*counter); + + if (old_low > new_low) { + /* Overflowed, increment high 32 bits */ + high++; + } + + *counter = (high << 32) | new_low; +} + +static void 
emac_stats_update(struct emac_priv *priv) +{ + u64 *tx_stats_off = priv->tx_stats_off.array; + u64 *rx_stats_off = priv->rx_stats_off.array; + u64 *tx_stats = priv->tx_stats.array; + u64 *rx_stats = priv->rx_stats.array; + u32 i, res, offset; + + assert_spin_locked(&priv->stats_lock); + + if (!netif_running(priv->ndev) || !netif_device_present(priv->ndev)) { + /* Not up, don't try to update */ + return; + } + + for (i = 0; i < sizeof(priv->tx_stats) / sizeof(*tx_stats); i++) { + /* + * If reading stats times out, everything is broken and there's + * nothing we can do. Reading statistics also can't return an + * error, so just return without updating and without + * rescheduling. + */ + if (emac_tx_read_stat_cnt(priv, i, &res)) + return; + + /* + * Re-initializing while bringing interface up resets counters + * to zero, so to provide continuity, we add the values saved + * last time we did emac_down() to the new hardware-provided + * value. + */ + offset = lower_32_bits(tx_stats_off[i]); + emac_update_counter(&tx_stats[i], res + offset); + } + + /* Similar remarks as TX stats */ + for (i = 0; i < sizeof(priv->rx_stats) / sizeof(*rx_stats); i++) { + if (emac_rx_read_stat_cnt(priv, i, &res)) + return; + offset = lower_32_bits(rx_stats_off[i]); + emac_update_counter(&rx_stats[i], res + offset); + } + + mod_timer(&priv->stats_timer, jiffies + EMAC_STATS_TIMER_PERIOD * HZ); +} + +static void emac_stats_timer(struct timer_list *t) +{ + struct emac_priv *priv = timer_container_of(priv, t, stats_timer); + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + spin_unlock(&priv->stats_lock); +} + +static const struct ethtool_rmon_hist_range emac_rmon_hist_ranges[] = { + { 64, 64 }, + { 65, 127 }, + { 128, 255 }, + { 256, 511 }, + { 512, 1023 }, + { 1024, 1518 }, + { 1519, 4096 }, + { /* sentinel */ }, +}; + +/* Like dev_fetch_dstats(), but we only use tx_drops */ +static u64 emac_get_stat_tx_drops(struct emac_priv *priv) +{ + const struct pcpu_dstats *stats; + u64 
tx_drops, total = 0; + unsigned int start; + int cpu; + + for_each_possible_cpu(cpu) { + stats = per_cpu_ptr(priv->ndev->dstats, cpu); + do { + start = u64_stats_fetch_begin(&stats->syncp); + tx_drops = u64_stats_read(&stats->tx_drops); + } while (u64_stats_fetch_retry(&stats->syncp, start)); + + total += tx_drops; + } + + return total; +} + +static void emac_get_stats64(struct net_device *dev, + struct rtnl_link_stats64 *storage) +{ + struct emac_priv *priv = netdev_priv(dev); + union emac_hw_tx_stats *tx_stats; + union emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + /* This is the only software counter */ + storage->tx_dropped = emac_get_stat_tx_drops(priv); + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + storage->tx_packets = tx_stats->stats.tx_ok_pkts; + storage->tx_bytes = tx_stats->stats.tx_ok_bytes; + storage->tx_errors = tx_stats->stats.tx_err_pkts; + + storage->rx_packets = rx_stats->stats.rx_ok_pkts; + storage->rx_bytes = rx_stats->stats.rx_ok_bytes; + storage->rx_errors = rx_stats->stats.rx_err_total_pkts; + storage->rx_crc_errors = rx_stats->stats.rx_crc_err_pkts; + storage->rx_frame_errors = rx_stats->stats.rx_align_err_pkts; + storage->rx_length_errors = rx_stats->stats.rx_len_err_pkts; + + storage->collisions = tx_stats->stats.tx_singleclsn_pkts; + storage->collisions += tx_stats->stats.tx_multiclsn_pkts; + storage->collisions += tx_stats->stats.tx_excessclsn_pkts; + + storage->rx_missed_errors = rx_stats->stats.rx_drp_fifo_full_pkts; + storage->rx_missed_errors += rx_stats->stats.rx_truncate_fifo_full_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_rmon_stats(struct net_device *dev, + struct ethtool_rmon_stats *rmon_stats, + const struct ethtool_rmon_hist_range **ranges) +{ + struct emac_priv *priv = netdev_priv(dev); + union emac_hw_rx_stats *rx_stats; + + rx_stats = &priv->rx_stats; + + *ranges = emac_rmon_hist_ranges; + + spin_lock(&priv->stats_lock); + + 
emac_stats_update(priv); + + rmon_stats->undersize_pkts = rx_stats->stats.rx_len_undersize_pkts; + rmon_stats->oversize_pkts = rx_stats->stats.rx_len_oversize_pkts; + rmon_stats->fragments = rx_stats->stats.rx_len_fragment_pkts; + rmon_stats->jabbers = rx_stats->stats.rx_len_jabber_pkts; + + /* Only RX has histogram stats */ + + rmon_stats->hist[0] = rx_stats->stats.rx_64_pkts; + rmon_stats->hist[1] = rx_stats->stats.rx_65_127_pkts; + rmon_stats->hist[2] = rx_stats->stats.rx_128_255_pkts; + rmon_stats->hist[3] = rx_stats->stats.rx_256_511_pkts; + rmon_stats->hist[4] = rx_stats->stats.rx_512_1023_pkts; + rmon_stats->hist[5] = rx_stats->stats.rx_1024_1518_pkts; + rmon_stats->hist[6] = rx_stats->stats.rx_1519_plus_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_eth_mac_stats(struct net_device *dev, + struct ethtool_eth_mac_stats *mac_stats) +{ + struct emac_priv *priv = netdev_priv(dev); + union emac_hw_tx_stats *tx_stats; + union emac_hw_rx_stats *rx_stats; + + tx_stats = &priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + mac_stats->MulticastFramesXmittedOK = tx_stats->stats.tx_multicast_pkts; + mac_stats->BroadcastFramesXmittedOK = tx_stats->stats.tx_broadcast_pkts; + + mac_stats->MulticastFramesReceivedOK = + rx_stats->stats.rx_multicast_pkts; + mac_stats->BroadcastFramesReceivedOK = + rx_stats->stats.rx_broadcast_pkts; + + mac_stats->SingleCollisionFrames = tx_stats->stats.tx_singleclsn_pkts; + mac_stats->MultipleCollisionFrames = tx_stats->stats.tx_multiclsn_pkts; + mac_stats->LateCollisions = tx_stats->stats.tx_lateclsn_pkts; + mac_stats->FramesAbortedDueToXSColls = + tx_stats->stats.tx_excessclsn_pkts; + + spin_unlock(&priv->stats_lock); +} + +static void emac_get_pause_stats(struct net_device *dev, + struct ethtool_pause_stats *pause_stats) +{ + struct emac_priv *priv = netdev_priv(dev); + union emac_hw_tx_stats *tx_stats; + union emac_hw_rx_stats *rx_stats; + + tx_stats = 
&priv->tx_stats; + rx_stats = &priv->rx_stats; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + pause_stats->tx_pause_frames = tx_stats->stats.tx_pause_pkts; + pause_stats->rx_pause_frames = rx_stats->stats.rx_pause_pkts; + + spin_unlock(&priv->stats_lock); +} + +/* Other statistics that are not derivable from standard statistics */ + +#define EMAC_ETHTOOL_STAT(type, name) \ + { offsetof(type, stats.name) / sizeof(u64), #name } + +static const struct emac_ethtool_stats { + size_t offset; + char str[ETH_GSTRING_LEN]; +} emac_ethtool_rx_stats[] = { + EMAC_ETHTOOL_STAT(union emac_hw_rx_stats, rx_drp_fifo_full_pkts), + EMAC_ETHTOOL_STAT(union emac_hw_rx_stats, rx_truncate_fifo_full_pkts), +}; + +static int emac_get_sset_count(struct net_device *dev, int sset) +{ + switch (sset) { + case ETH_SS_STATS: + return ARRAY_SIZE(emac_ethtool_rx_stats); + default: + return -EOPNOTSUPP; + } +} + +static void emac_get_strings(struct net_device *dev, u32 stringset, u8 *data) +{ + int i; + + switch (stringset) { + case ETH_SS_STATS: + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) { + memcpy(data, emac_ethtool_rx_stats[i].str, + ETH_GSTRING_LEN); + data += ETH_GSTRING_LEN; + } + break; + } +} + +static void emac_get_ethtool_stats(struct net_device *dev, + struct ethtool_stats *stats, u64 *data) +{ + struct emac_priv *priv = netdev_priv(dev); + u64 *rx_stats = (u64 *)&priv->rx_stats; + int i; + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + for (i = 0; i < ARRAY_SIZE(emac_ethtool_rx_stats); i++) + data[i] = rx_stats[emac_ethtool_rx_stats[i].offset]; + + spin_unlock(&priv->stats_lock); +} + +static int emac_ethtool_get_regs_len(struct net_device *dev) +{ + return (EMAC_DMA_REG_CNT + EMAC_MAC_REG_CNT) * sizeof(u32); +} + +static void emac_ethtool_get_regs(struct net_device *dev, + struct ethtool_regs *regs, void *space) +{ + struct emac_priv *priv = netdev_priv(dev); + u32 *reg_space = space; + int i; + + regs->version = 1; + + for (i = 0; i 
< EMAC_DMA_REG_CNT; i++) + reg_space[i] = emac_rd(priv, DMA_CONFIGURATION + i * 4); + + for (i = 0; i < EMAC_MAC_REG_CNT; i++) + reg_space[i + EMAC_DMA_REG_CNT] = + emac_rd(priv, MAC_GLOBAL_CONTROL + i * 4); +} + +static void emac_get_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + + pause->autoneg = priv->flow_control_autoneg; + pause->tx_pause = !!(priv->flow_control & FLOW_CTRL_TX); + pause->rx_pause = !!(priv->flow_control & FLOW_CTRL_RX); +} + +static int emac_set_pauseparam(struct net_device *dev, + struct ethtool_pauseparam *pause) +{ + struct emac_priv *priv = netdev_priv(dev); + u8 fc = 0; + + priv->flow_control_autoneg = pause->autoneg; + + if (pause->autoneg) { + emac_set_fc_autoneg(priv); + } else { + if (pause->tx_pause) + fc |= FLOW_CTRL_TX; + + if (pause->rx_pause) + fc |= FLOW_CTRL_RX; + + emac_set_fc(priv, fc); + } + + return 0; +} + +static void emac_get_drvinfo(struct net_device *dev, + struct ethtool_drvinfo *info) +{ + strscpy(info->driver, DRIVER_NAME, sizeof(info->driver)); + info->n_stats = ARRAY_SIZE(emac_ethtool_rx_stats); +} + +static void emac_tx_timeout_task(struct work_struct *work) +{ + struct net_device *ndev; + struct emac_priv *priv; + + priv = container_of(work, struct emac_priv, tx_timeout_task); + ndev = priv->ndev; + + rtnl_lock(); + + /* No need to reset if already down */ + if (!netif_running(ndev)) { + rtnl_unlock(); + return; + } + + netdev_err(ndev, "MAC reset due to TX timeout\n"); + + netif_trans_update(ndev); /* prevent tx timeout */ + dev_close(ndev); + dev_open(ndev, NULL); + + rtnl_unlock(); +} + +static void emac_sw_init(struct emac_priv *priv) +{ + priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; + + priv->tx_ring.total_cnt = DEFAULT_TX_RING_NUM; + priv->rx_ring.total_cnt = DEFAULT_RX_RING_NUM; + + spin_lock_init(&priv->stats_lock); + + INIT_WORK(&priv->tx_timeout_task, emac_tx_timeout_task); + + priv->tx_coal_frames = EMAC_TX_FRAMES; + 
priv->tx_coal_timeout = EMAC_TX_COAL_TIMEOUT; + + timer_setup(&priv->txtimer, emac_tx_coal_timer, 0); + timer_setup(&priv->stats_timer, emac_stats_timer, 0); +} + +static irqreturn_t emac_interrupt_handler(int irq, void *dev_id) +{ + struct net_device *ndev = (struct net_device *)dev_id; + struct emac_priv *priv = netdev_priv(ndev); + bool should_schedule = false; + u32 clr = 0; + u32 status; + + status = emac_rd(priv, DMA_STATUS_IRQ); + + if (status & MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ) + clr |= MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_TRANSMIT_DMA_STOPPED_IRQ) + clr |= MREGBIT_TRANSMIT_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_TRANSFER_DONE_IRQ) { + clr |= MREGBIT_RECEIVE_TRANSFER_DONE_IRQ; + should_schedule = true; + } + + if (status & MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ) + clr |= MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ; + + if (status & MREGBIT_RECEIVE_DMA_STOPPED_IRQ) + clr |= MREGBIT_RECEIVE_DMA_STOPPED_IRQ; + + if (status & MREGBIT_RECEIVE_MISSED_FRAME_IRQ) + clr |= MREGBIT_RECEIVE_MISSED_FRAME_IRQ; + + if (should_schedule) { + if (napi_schedule_prep(&priv->napi)) { + emac_disable_interrupt(priv); + __napi_schedule_irqoff(&priv->napi); + } + } + + emac_wr(priv, DMA_STATUS_IRQ, clr); + + return IRQ_HANDLED; +} + +static void emac_configure_tx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->tx_ring.desc_dma_addr; + emac_wr(priv, DMA_TRANSMIT_BASE_ADDRESS, val); + + /* Set TX inter-frame gap value, enable transmit */ + val = emac_rd(priv, MAC_TRANSMIT_CONTROL); + val &= ~MREGBIT_IFG_LEN; + val |= MREGBIT_TRANSMIT_ENABLE; + val |= MREGBIT_TRANSMIT_AUTO_RETRY; + emac_wr(priv, MAC_TRANSMIT_CONTROL, val); + + emac_wr(priv, DMA_TRANSMIT_AUTO_POLL_COUNTER, 0x0); + + /* Start TX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_TRANSMIT_DMA; + emac_wr(priv, 
DMA_CONTROL, val); +} + +static void emac_configure_rx(struct emac_priv *priv) +{ + u32 val; + + /* Set base address */ + val = (u32)priv->rx_ring.desc_dma_addr; + emac_wr(priv, DMA_RECEIVE_BASE_ADDRESS, val); + + /* Enable receive */ + val = emac_rd(priv, MAC_RECEIVE_CONTROL); + val |= MREGBIT_RECEIVE_ENABLE; + val |= MREGBIT_STORE_FORWARD; + emac_wr(priv, MAC_RECEIVE_CONTROL, val); + + /* Start RX DMA */ + val = emac_rd(priv, DMA_CONTROL); + val |= MREGBIT_START_STOP_RECEIVE_DMA; + emac_wr(priv, DMA_CONTROL, val); +} + +static void emac_adjust_link(struct net_device *dev) +{ + struct emac_priv *priv = netdev_priv(dev); + struct phy_device *phydev = dev->phydev; + u32 ctrl; + + if (phydev->link) { + ctrl = emac_rd(priv, MAC_GLOBAL_CONTROL); + + /* Update duplex and speed from PHY */ + + if (!phydev->duplex) + ctrl &= ~MREGBIT_FULL_DUPLEX_MODE; + else + ctrl |= MREGBIT_FULL_DUPLEX_MODE; + + ctrl &= ~MREGBIT_SPEED; + + switch (phydev->speed) { + case SPEED_1000: + ctrl |= MREGBIT_SPEED_1000M; + break; + case SPEED_100: + ctrl |= MREGBIT_SPEED_100M; + break; + case SPEED_10: + ctrl |= MREGBIT_SPEED_10M; + break; + default: + netdev_err(dev, "Unknown speed: %d\n", phydev->speed); + phydev->speed = SPEED_UNKNOWN; + break; + } + + emac_wr(priv, MAC_GLOBAL_CONTROL, ctrl); + + emac_set_fc_autoneg(priv); + } + + phy_print_status(phydev); +} + +static void emac_update_delay_line(struct emac_priv *priv) +{ + u32 mask = 0, val = 0; + + mask |= EMAC_RX_DLINE_EN; + mask |= EMAC_RX_DLINE_STEP_MASK | EMAC_RX_DLINE_CODE_MASK; + mask |= EMAC_TX_DLINE_EN; + mask |= EMAC_TX_DLINE_STEP_MASK | EMAC_TX_DLINE_CODE_MASK; + + if (phy_interface_mode_is_rgmii(priv->phy_interface)) { + val |= EMAC_RX_DLINE_EN; + val |= FIELD_PREP(EMAC_RX_DLINE_STEP_MASK, + EMAC_DLINE_STEP_15P6); + val |= FIELD_PREP(EMAC_RX_DLINE_CODE_MASK, priv->rx_delay); + + val |= EMAC_TX_DLINE_EN; + val |= FIELD_PREP(EMAC_TX_DLINE_STEP_MASK, + EMAC_DLINE_STEP_15P6); + val |= FIELD_PREP(EMAC_TX_DLINE_CODE_MASK, 
priv->tx_delay); + } + + regmap_update_bits(priv->regmap_apmu, + priv->regmap_apmu_offset + APMU_EMAC_DLINE_REG, + mask, val); +} + +static int emac_phy_connect(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + struct phy_device *phydev; + struct device_node *np; + int ret; + + ret = of_get_phy_mode(dev->of_node, &priv->phy_interface); + if (ret) { + netdev_err(ndev, "No phy-mode found\n"); + return ret; + } + + switch (priv->phy_interface) { + case PHY_INTERFACE_MODE_RMII: + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_RGMII_ID: + case PHY_INTERFACE_MODE_RGMII_RXID: + case PHY_INTERFACE_MODE_RGMII_TXID: + break; + default: + netdev_err(ndev, "Unsupported PHY interface %s\n", + phy_modes(priv->phy_interface)); + return -EINVAL; + } + + np = of_parse_phandle(dev->of_node, "phy-handle", 0); + if (!np && of_phy_is_fixed_link(dev->of_node)) + np = of_node_get(dev->of_node); + + if (!np) { + netdev_err(ndev, "No PHY specified\n"); + return -ENODEV; + } + + ret = emac_phy_interface_config(priv); + if (ret) + goto err_node_put; + + phydev = of_phy_connect(ndev, np, &emac_adjust_link, 0, + priv->phy_interface); + if (!phydev) { + netdev_err(ndev, "Could not attach to PHY\n"); + ret = -ENODEV; + goto err_node_put; + } + + phy_support_asym_pause(phydev); + + phydev->mac_managed_pm = true; + + emac_update_delay_line(priv); + +err_node_put: + of_node_put(np); + return ret; +} + +static int emac_up(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + int ret; + + pm_runtime_get_sync(&pdev->dev); + + ret = emac_phy_connect(ndev); + if (ret) { + dev_err(&pdev->dev, "emac_phy_connect failed\n"); + goto err_pm_put; + } + + emac_init_hw(priv); + + emac_set_mac_addr(priv, ndev->dev_addr); + emac_configure_tx(priv); + emac_configure_rx(priv); + + emac_alloc_rx_desc_buffers(priv); + + phy_start(ndev->phydev); + + ret = request_irq(priv->irq, 
emac_interrupt_handler, IRQF_SHARED, + ndev->name, ndev); + if (ret) { + dev_err(&pdev->dev, "request_irq failed\n"); + goto err_reset_disconnect_phy; + } + + /* Don't enable MAC interrupts */ + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + + /* Enable DMA interrupts */ + emac_wr(priv, DMA_INTERRUPT_ENABLE, + MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE | + MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE | + MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE); + + napi_enable(&priv->napi); + + netif_start_queue(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; + +err_reset_disconnect_phy: + emac_reset_hw(priv); + phy_disconnect(ndev->phydev); + +err_pm_put: + pm_runtime_put_sync(&pdev->dev); + return ret; +} + +static int emac_down(struct emac_priv *priv) +{ + struct platform_device *pdev = priv->pdev; + struct net_device *ndev = priv->ndev; + + netif_stop_queue(ndev); + + phy_disconnect(ndev->phydev); + + emac_wr(priv, MAC_INTERRUPT_ENABLE, 0x0); + emac_wr(priv, DMA_INTERRUPT_ENABLE, 0x0); + + free_irq(priv->irq, ndev); + + napi_disable(&priv->napi); + + timer_delete_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + timer_delete_sync(&priv->stats_timer); + + emac_reset_hw(priv); + + /* Update and save current stats, see emac_stats_update() for usage */ + + spin_lock(&priv->stats_lock); + + emac_stats_update(priv); + + priv->tx_stats_off = priv->tx_stats; + priv->rx_stats_off = priv->rx_stats; + + spin_unlock(&priv->stats_lock); + + pm_runtime_put_sync(&pdev->dev); + return 0; +} + +/* Called when net interface is brought up. 
*/ +static int emac_open(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + struct device *dev = &priv->pdev->dev; + int ret; + + ret = emac_alloc_tx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate TX resources\n"); + return ret; + } + + ret = emac_alloc_rx_resources(priv); + if (ret) { + dev_err(dev, "Cannot allocate RX resources\n"); + goto err_free_tx; + } + + ret = emac_up(priv); + if (ret) { + dev_err(dev, "Error when bringing interface up\n"); + goto err_free_rx; + } + return 0; + +err_free_rx: + emac_free_rx_resources(priv); +err_free_tx: + emac_free_tx_resources(priv); + + return ret; +} + +/* Called when interface is brought down. */ +static int emac_stop(struct net_device *ndev) +{ + struct emac_priv *priv = netdev_priv(ndev); + + emac_down(priv); + emac_free_tx_resources(priv); + emac_free_rx_resources(priv); + + return 0; +} + +static const struct ethtool_ops emac_ethtool_ops = { + .get_link_ksettings = phy_ethtool_get_link_ksettings, + .set_link_ksettings = phy_ethtool_set_link_ksettings, + .nway_reset = phy_ethtool_nway_reset, + .get_drvinfo = emac_get_drvinfo, + .get_link = ethtool_op_get_link, + + .get_regs = emac_ethtool_get_regs, + .get_regs_len = emac_ethtool_get_regs_len, + + .get_rmon_stats = emac_get_rmon_stats, + .get_pause_stats = emac_get_pause_stats, + .get_eth_mac_stats = emac_get_eth_mac_stats, + + .get_sset_count = emac_get_sset_count, + .get_strings = emac_get_strings, + .get_ethtool_stats = emac_get_ethtool_stats, + + .get_pauseparam = emac_get_pauseparam, + .set_pauseparam = emac_set_pauseparam, +}; + +static const struct net_device_ops emac_netdev_ops = { + .ndo_open = emac_open, + .ndo_stop = emac_stop, + .ndo_start_xmit = emac_start_xmit, + .ndo_validate_addr = eth_validate_addr, + .ndo_set_mac_address = emac_set_mac_address, + .ndo_eth_ioctl = phy_do_ioctl_running, + .ndo_change_mtu = emac_change_mtu, + .ndo_tx_timeout = emac_tx_timeout, + .ndo_set_rx_mode = emac_set_rx_mode, + 
.ndo_get_stats64 = emac_get_stats64, +}; + +/* Currently we always use 15.6 ps/step for the delay line */ + +static u32 delay_ps_to_unit(u32 ps) +{ + return DIV_ROUND_CLOSEST(ps * 10, 156); +} + +static u32 delay_unit_to_ps(u32 unit) +{ + return DIV_ROUND_CLOSEST(unit * 156, 10); +} + +#define EMAC_MAX_DELAY_UNIT FIELD_MAX(EMAC_TX_DLINE_CODE_MASK) + +/* Minus one just to be safe from rounding errors */ +#define EMAC_MAX_DELAY_PS (delay_unit_to_ps(EMAC_MAX_DELAY_UNIT - 1)) + +static int emac_config_dt(struct platform_device *pdev, struct emac_priv *priv) +{ + struct device_node *np = pdev->dev.of_node; + struct device *dev = &pdev->dev; + u8 mac_addr[ETH_ALEN] = { 0 }; + int ret; + + priv->iobase = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(priv->iobase)) + return dev_err_probe(dev, PTR_ERR(priv->iobase), + "ioremap failed\n"); + + priv->regmap_apmu = + syscon_regmap_lookup_by_phandle_args(np, "spacemit,apmu", 1, + &priv->regmap_apmu_offset); + + if (IS_ERR(priv->regmap_apmu)) + return dev_err_probe(dev, PTR_ERR(priv->regmap_apmu), + "failed to get syscon\n"); + + priv->irq = platform_get_irq(pdev, 0); + if (priv->irq < 0) + return priv->irq; + + ret = of_get_mac_address(np, mac_addr); + if (ret) { + if (ret == -EPROBE_DEFER) + return dev_err_probe(dev, ret, + "Can't get MAC address\n"); + + dev_info(&pdev->dev, "Using random MAC address\n"); + eth_hw_addr_random(priv->ndev); + } else { + eth_hw_addr_set(priv->ndev, mac_addr); + } + + priv->tx_delay = 0; + priv->rx_delay = 0; + + of_property_read_u32(np, "tx-internal-delay-ps", &priv->tx_delay); + of_property_read_u32(np, "rx-internal-delay-ps", &priv->rx_delay); + + if (priv->tx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "tx-internal-delay-ps too large: max %u, got %u\n", + EMAC_MAX_DELAY_PS, priv->tx_delay); + return -EINVAL; + } + + if (priv->rx_delay > EMAC_MAX_DELAY_PS) { + dev_err(&pdev->dev, + "rx-internal-delay-ps too large: max %u, got %u\n", + EMAC_MAX_DELAY_PS, priv->rx_delay); + return 
-EINVAL; + } + + priv->tx_delay = delay_ps_to_unit(priv->tx_delay); + priv->rx_delay = delay_ps_to_unit(priv->rx_delay); + + return 0; +} + +static void emac_phy_deregister_fixed_link(void *data) +{ + struct device_node *of_node = data; + + of_phy_deregister_fixed_link(of_node); +} + +static int emac_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct reset_control *reset; + struct net_device *ndev; + struct emac_priv *priv; + int ret; + + ndev = devm_alloc_etherdev(dev, sizeof(struct emac_priv)); + if (!ndev) + return -ENOMEM; + + ndev->hw_features = NETIF_F_SG; + ndev->features |= ndev->hw_features; + + ndev->max_mtu = EMAC_RX_BUF_4K - (ETH_HLEN + ETH_FCS_LEN); + ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS; + + priv = netdev_priv(ndev); + priv->ndev = ndev; + priv->pdev = pdev; + platform_set_drvdata(pdev, priv); + + ret = emac_config_dt(pdev, priv); + if (ret < 0) + return dev_err_probe(dev, ret, "Configuration failed\n"); + + ndev->watchdog_timeo = 5 * HZ; + ndev->base_addr = (unsigned long)priv->iobase; + ndev->irq = priv->irq; + + ndev->ethtool_ops = &emac_ethtool_ops; + ndev->netdev_ops = &emac_netdev_ops; + + devm_pm_runtime_enable(&pdev->dev); + + priv->bus_clk = devm_clk_get_enabled(&pdev->dev, NULL); + if (IS_ERR(priv->bus_clk)) + return dev_err_probe(dev, PTR_ERR(priv->bus_clk), + "Failed to get clock\n"); + + reset = devm_reset_control_get_optional_exclusive_deasserted(&pdev->dev, + NULL); + if (IS_ERR(reset)) + return dev_err_probe(dev, PTR_ERR(reset), + "Failed to get reset\n"); + + if (of_phy_is_fixed_link(dev->of_node)) { + ret = of_phy_register_fixed_link(dev->of_node); + if (ret) + return dev_err_probe(dev, ret, + "Failed to register fixed-link\n"); + + ret = devm_add_action_or_reset(dev, + emac_phy_deregister_fixed_link, + dev->of_node); + + if (ret) { + dev_err(dev, "devm_add_action_or_reset failed\n"); + return ret; + } + } + + emac_sw_init(priv); + + ret = emac_mdio_init(priv); + if (ret) + goto 
err_timer_delete; + + SET_NETDEV_DEV(ndev, &pdev->dev); + + ret = devm_register_netdev(dev, ndev); + if (ret) { + dev_err(dev, "devm_register_netdev failed\n"); + goto err_timer_delete; + } + + netif_napi_add(ndev, &priv->napi, emac_rx_poll); + netif_carrier_off(ndev); + + return 0; + +err_timer_delete: + timer_delete_sync(&priv->txtimer); + timer_delete_sync(&priv->stats_timer); + + return ret; +} + +static void emac_remove(struct platform_device *pdev) +{ + struct emac_priv *priv = platform_get_drvdata(pdev); + + timer_shutdown_sync(&priv->txtimer); + cancel_work_sync(&priv->tx_timeout_task); + + timer_shutdown_sync(&priv->stats_timer); + + emac_reset_hw(priv); +} + +static int emac_resume(struct device *dev) +{ + struct emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + int ret; + + ret = clk_prepare_enable(priv->bus_clk); + if (ret < 0) { + dev_err(dev, "Failed to enable bus clock: %d\n", ret); + return ret; + } + + if (!netif_running(ndev)) + return 0; + + ret = emac_open(ndev); + if (ret) { + clk_disable_unprepare(priv->bus_clk); + return ret; + } + + netif_device_attach(ndev); + + emac_stats_timer(&priv->stats_timer); + + return 0; +} + +static int emac_suspend(struct device *dev) +{ + struct emac_priv *priv = dev_get_drvdata(dev); + struct net_device *ndev = priv->ndev; + + if (!ndev || !netif_running(ndev)) { + clk_disable_unprepare(priv->bus_clk); + return 0; + } + + emac_stop(ndev); + + clk_disable_unprepare(priv->bus_clk); + netif_device_detach(ndev); + return 0; +} + +static const struct dev_pm_ops emac_pm_ops = { + SYSTEM_SLEEP_PM_OPS(emac_suspend, emac_resume) +}; + +static const struct of_device_id emac_of_match[] = { + { .compatible = "spacemit,k1-emac" }, + { /* sentinel */ }, +}; +MODULE_DEVICE_TABLE(of, emac_of_match); + +static struct platform_driver emac_driver = { + .probe = emac_probe, + .remove = emac_remove, + .driver = { + .name = DRIVER_NAME, + .of_match_table = of_match_ptr(emac_of_match), + .pm = 
&emac_pm_ops, + }, +}; +module_platform_driver(emac_driver); + +MODULE_DESCRIPTION("SpacemiT K1 Ethernet driver"); +MODULE_AUTHOR("Vivian Wang "); +MODULE_LICENSE("GPL"); diff --git a/drivers/net/ethernet/spacemit/k1_emac.h b/drivers/net/ethernet/spacemit/k1_emac.h new file mode 100644 index 0000000000000000000000000000000000000000..5a09e946a276f5a3fe36ba191e77e963671eb9bd --- /dev/null +++ b/drivers/net/ethernet/spacemit/k1_emac.h @@ -0,0 +1,416 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SpacemiT K1 Ethernet hardware definitions + * + * Copyright (C) 2023-2025 SpacemiT (Hangzhou) Technology Co. Ltd + * Copyright (C) 2025 Vivian Wang + */ + +#ifndef _K1_EMAC_H_ +#define _K1_EMAC_H_ + +#include + +/* APMU syscon registers */ + +#define APMU_EMAC_CTRL_REG 0x0 + +#define PHY_INTF_RGMII BIT(2) + +/* + * Only valid for RMII mode + * 0: Ref clock from External PHY + * 1: Ref clock from SoC + */ +#define REF_CLK_SEL BIT(3) + +/* + * Function clock select + * 0: 208 MHz + * 1: 312 MHz + */ +#define FUNC_CLK_SEL BIT(4) + +/* Only valid for RMII, invert TX clk */ +#define RMII_TX_CLK_SEL BIT(6) + +/* Only valid for RMII, invert RX clk */ +#define RMII_RX_CLK_SEL BIT(7) + +/* + * Only valid for RGMII + * 0: TX clk from RX clk + * 1: TX clk from SoC + */ +#define RGMII_TX_CLK_SEL BIT(8) + +#define PHY_IRQ_EN BIT(12) +#define AXI_SINGLE_ID BIT(13) + +#define APMU_EMAC_DLINE_REG 0x4 + +#define EMAC_RX_DLINE_EN BIT(0) +#define EMAC_RX_DLINE_STEP_MASK GENMASK(5, 4) +#define EMAC_RX_DLINE_CODE_MASK GENMASK(15, 8) + +#define EMAC_TX_DLINE_EN BIT(16) +#define EMAC_TX_DLINE_STEP_MASK GENMASK(21, 20) +#define EMAC_TX_DLINE_CODE_MASK GENMASK(31, 24) + +#define EMAC_DLINE_STEP_15P6 0 /* 15.6 ps/step */ +#define EMAC_DLINE_STEP_24P4 1 /* 24.4 ps/step */ +#define EMAC_DLINE_STEP_29P7 2 /* 29.7 ps/step */ +#define EMAC_DLINE_STEP_35P1 3 /* 35.1 ps/step */ + +/* DMA register set */ +#define DMA_CONFIGURATION 0x0000 +#define DMA_CONTROL 0x0004 +#define DMA_STATUS_IRQ 0x0008 +#define 
DMA_INTERRUPT_ENABLE 0x000c + +#define DMA_TRANSMIT_AUTO_POLL_COUNTER 0x0010 +#define DMA_TRANSMIT_POLL_DEMAND 0x0014 +#define DMA_RECEIVE_POLL_DEMAND 0x0018 + +#define DMA_TRANSMIT_BASE_ADDRESS 0x001c +#define DMA_RECEIVE_BASE_ADDRESS 0x0020 +#define DMA_MISSED_FRAME_COUNTER 0x0024 +#define DMA_STOP_FLUSH_COUNTER 0x0028 + +#define DMA_RECEIVE_IRQ_MITIGATION_CTRL 0x002c + +#define DMA_CURRENT_TRANSMIT_DESCRIPTOR_POINTER 0x0030 +#define DMA_CURRENT_TRANSMIT_BUFFER_POINTER 0x0034 +#define DMA_CURRENT_RECEIVE_DESCRIPTOR_POINTER 0x0038 +#define DMA_CURRENT_RECEIVE_BUFFER_POINTER 0x003c + +/* MAC Register set */ +#define MAC_GLOBAL_CONTROL 0x0100 +#define MAC_TRANSMIT_CONTROL 0x0104 +#define MAC_RECEIVE_CONTROL 0x0108 +#define MAC_MAXIMUM_FRAME_SIZE 0x010c +#define MAC_TRANSMIT_JABBER_SIZE 0x0110 +#define MAC_RECEIVE_JABBER_SIZE 0x0114 +#define MAC_ADDRESS_CONTROL 0x0118 +#define MAC_MDIO_CLK_DIV 0x011c +#define MAC_ADDRESS1_HIGH 0x0120 +#define MAC_ADDRESS1_MED 0x0124 +#define MAC_ADDRESS1_LOW 0x0128 +#define MAC_ADDRESS2_HIGH 0x012c +#define MAC_ADDRESS2_MED 0x0130 +#define MAC_ADDRESS2_LOW 0x0134 +#define MAC_ADDRESS3_HIGH 0x0138 +#define MAC_ADDRESS3_MED 0x013c +#define MAC_ADDRESS3_LOW 0x0140 +#define MAC_ADDRESS4_HIGH 0x0144 +#define MAC_ADDRESS4_MED 0x0148 +#define MAC_ADDRESS4_LOW 0x014c +#define MAC_MULTICAST_HASH_TABLE1 0x0150 +#define MAC_MULTICAST_HASH_TABLE2 0x0154 +#define MAC_MULTICAST_HASH_TABLE3 0x0158 +#define MAC_MULTICAST_HASH_TABLE4 0x015c +#define MAC_FC_CONTROL 0x0160 +#define MAC_FC_PAUSE_FRAME_GENERATE 0x0164 +#define MAC_FC_SOURCE_ADDRESS_HIGH 0x0168 +#define MAC_FC_SOURCE_ADDRESS_MED 0x016c +#define MAC_FC_SOURCE_ADDRESS_LOW 0x0170 +#define MAC_FC_DESTINATION_ADDRESS_HIGH 0x0174 +#define MAC_FC_DESTINATION_ADDRESS_MED 0x0178 +#define MAC_FC_DESTINATION_ADDRESS_LOW 0x017c +#define MAC_FC_PAUSE_TIME_VALUE 0x0180 +#define MAC_FC_HIGH_PAUSE_TIME 0x0184 +#define MAC_FC_LOW_PAUSE_TIME 0x0188 +#define MAC_FC_PAUSE_HIGH_THRESHOLD 0x018c +#define 
MAC_FC_PAUSE_LOW_THRESHOLD 0x0190 +#define MAC_MDIO_CONTROL 0x01a0 +#define MAC_MDIO_DATA 0x01a4 +#define MAC_RX_STATCTR_CONTROL 0x01a8 +#define MAC_RX_STATCTR_DATA_HIGH 0x01ac +#define MAC_RX_STATCTR_DATA_LOW 0x01b0 +#define MAC_TX_STATCTR_CONTROL 0x01b4 +#define MAC_TX_STATCTR_DATA_HIGH 0x01b8 +#define MAC_TX_STATCTR_DATA_LOW 0x01bc +#define MAC_TRANSMIT_FIFO_ALMOST_FULL 0x01c0 +#define MAC_TRANSMIT_PACKET_START_THRESHOLD 0x01c4 +#define MAC_RECEIVE_PACKET_START_THRESHOLD 0x01c8 +#define MAC_STATUS_IRQ 0x01e0 +#define MAC_INTERRUPT_ENABLE 0x01e4 + +/* Used for register dump */ +#define EMAC_DMA_REG_CNT 16 +#define EMAC_MAC_REG_CNT 124 + +/* DMA_CONFIGURATION (0x0000) */ + +/* + * 0-DMA controller in normal operation mode, + * 1-DMA controller reset to default state, + * clearing all internal state information + */ +#define MREGBIT_SOFTWARE_RESET BIT(0) + +#define MREGBIT_BURST_1WORD BIT(1) +#define MREGBIT_BURST_2WORD BIT(2) +#define MREGBIT_BURST_4WORD BIT(3) +#define MREGBIT_BURST_8WORD BIT(4) +#define MREGBIT_BURST_16WORD BIT(5) +#define MREGBIT_BURST_32WORD BIT(6) +#define MREGBIT_BURST_64WORD BIT(7) +#define MREGBIT_BURST_LENGTH GENMASK(7, 1) +#define MREGBIT_DESCRIPTOR_SKIP_LENGTH GENMASK(12, 8) + +/* For Receive and Transmit DMA operate in Big-Endian mode for Descriptors. 
*/ +#define MREGBIT_DESCRIPTOR_BYTE_ORDERING BIT(13) + +#define MREGBIT_BIG_LITLE_ENDIAN BIT(14) +#define MREGBIT_TX_RX_ARBITRATION BIT(15) +#define MREGBIT_WAIT_FOR_DONE BIT(16) +#define MREGBIT_STRICT_BURST BIT(17) +#define MREGBIT_DMA_64BIT_MODE BIT(18) + +/* DMA_CONTROL (0x0004) */ +#define MREGBIT_START_STOP_TRANSMIT_DMA BIT(0) +#define MREGBIT_START_STOP_RECEIVE_DMA BIT(1) + +/* DMA_STATUS_IRQ (0x0008) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_IRQ BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_IRQ BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_IRQ BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_IRQ BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_IRQ BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_IRQ BIT(6) +#define MREGBIT_RECEIVE_MISSED_FRAME_IRQ BIT(7) +#define MREGBIT_MAC_IRQ BIT(8) +#define MREGBIT_TRANSMIT_DMA_STATE GENMASK(18, 16) +#define MREGBIT_RECEIVE_DMA_STATE GENMASK(23, 20) + +/* DMA_INTERRUPT_ENABLE (0x000c) */ +#define MREGBIT_TRANSMIT_TRANSFER_DONE_INTR_ENABLE BIT(0) +#define MREGBIT_TRANSMIT_DES_UNAVAILABLE_INTR_ENABLE BIT(1) +#define MREGBIT_TRANSMIT_DMA_STOPPED_INTR_ENABLE BIT(2) +#define MREGBIT_RECEIVE_TRANSFER_DONE_INTR_ENABLE BIT(4) +#define MREGBIT_RECEIVE_DES_UNAVAILABLE_INTR_ENABLE BIT(5) +#define MREGBIT_RECEIVE_DMA_STOPPED_INTR_ENABLE BIT(6) +#define MREGBIT_RECEIVE_MISSED_FRAME_INTR_ENABLE BIT(7) +#define MREGBIT_MAC_INTR_ENABLE BIT(8) + +/* DMA_RECEIVE_IRQ_MITIGATION_CTRL (0x002c) */ +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MASK GENMASK(7, 0) +#define MREGBIT_RECEIVE_IRQ_TIMEOUT_COUNTER_MASK GENMASK(27, 8) +#define MREGBIT_RECEIVE_IRQ_FRAME_COUNTER_MODE BIT(30) +#define MREGBIT_RECEIVE_IRQ_MITIGATION_ENABLE BIT(31) + +/* MAC_GLOBAL_CONTROL (0x0100) */ +#define MREGBIT_SPEED GENMASK(1, 0) +#define MREGBIT_SPEED_10M 0x0 +#define MREGBIT_SPEED_100M BIT(0) +#define MREGBIT_SPEED_1000M BIT(1) +#define MREGBIT_FULL_DUPLEX_MODE BIT(2) +#define MREGBIT_RESET_RX_STAT_COUNTERS BIT(3) +#define MREGBIT_RESET_TX_STAT_COUNTERS BIT(4) +#define 
MREGBIT_UNICAST_WAKEUP_MODE BIT(8) +#define MREGBIT_MAGIC_PACKET_WAKEUP_MODE BIT(9) + +/* MAC_TRANSMIT_CONTROL (0x0104) */ +#define MREGBIT_TRANSMIT_ENABLE BIT(0) +#define MREGBIT_INVERT_FCS BIT(1) +#define MREGBIT_DISABLE_FCS_INSERT BIT(2) +#define MREGBIT_TRANSMIT_AUTO_RETRY BIT(3) +#define MREGBIT_IFG_LEN GENMASK(6, 4) +#define MREGBIT_PREAMBLE_LENGTH GENMASK(9, 7) + +/* MAC_RECEIVE_CONTROL (0x0108) */ +#define MREGBIT_RECEIVE_ENABLE BIT(0) +#define MREGBIT_DISABLE_FCS_CHECK BIT(1) +#define MREGBIT_STRIP_FCS BIT(2) +#define MREGBIT_STORE_FORWARD BIT(3) +#define MREGBIT_STATUS_FIRST BIT(4) +#define MREGBIT_PASS_BAD_FRAMES BIT(5) +#define MREGBIT_ACOOUNT_VLAN BIT(6) + +/* MAC_MAXIMUM_FRAME_SIZE (0x010c) */ +#define MREGBIT_MAX_FRAME_SIZE GENMASK(13, 0) + +/* MAC_TRANSMIT_JABBER_SIZE (0x0110) */ +#define MREGBIT_TRANSMIT_JABBER_SIZE GENMASK(15, 0) + +/* MAC_RECEIVE_JABBER_SIZE (0x0114) */ +#define MREGBIT_RECEIVE_JABBER_SIZE GENMASK(15, 0) + +/* MAC_ADDRESS_CONTROL (0x0118) */ +#define MREGBIT_MAC_ADDRESS1_ENABLE BIT(0) +#define MREGBIT_MAC_ADDRESS2_ENABLE BIT(1) +#define MREGBIT_MAC_ADDRESS3_ENABLE BIT(2) +#define MREGBIT_MAC_ADDRESS4_ENABLE BIT(3) +#define MREGBIT_INVERSE_MAC_ADDRESS1_ENABLE BIT(4) +#define MREGBIT_INVERSE_MAC_ADDRESS2_ENABLE BIT(5) +#define MREGBIT_INVERSE_MAC_ADDRESS3_ENABLE BIT(6) +#define MREGBIT_INVERSE_MAC_ADDRESS4_ENABLE BIT(7) +#define MREGBIT_PROMISCUOUS_MODE BIT(8) + +/* MAC_FC_CONTROL (0x0160) */ +#define MREGBIT_FC_DECODE_ENABLE BIT(0) +#define MREGBIT_FC_GENERATION_ENABLE BIT(1) +#define MREGBIT_AUTO_FC_GENERATION_ENABLE BIT(2) +#define MREGBIT_MULTICAST_MODE BIT(3) +#define MREGBIT_BLOCK_PAUSE_FRAMES BIT(4) + +/* MAC_FC_PAUSE_FRAME_GENERATE (0x0164) */ +#define MREGBIT_GENERATE_PAUSE_FRAME BIT(0) + +/* MAC_FC_PAUSE_TIME_VALUE (0x0180) */ +#define MREGBIT_MAC_FC_PAUSE_TIME GENMASK(15, 0) + +/* MAC_MDIO_CONTROL (0x01a0) */ +#define MREGBIT_PHY_ADDRESS GENMASK(4, 0) +#define MREGBIT_REGISTER_ADDRESS GENMASK(9, 5) +#define 
MREGBIT_MDIO_READ_WRITE BIT(10) +#define MREGBIT_START_MDIO_TRANS BIT(15) + +/* MAC_MDIO_DATA (0x01a4) */ +#define MREGBIT_MDIO_DATA GENMASK(15, 0) + +/* MAC_RX_STATCTR_CONTROL (0x01a8) */ +#define MREGBIT_RX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_RX_COUNTER_READ BIT(15) + +/* MAC_RX_STATCTR_DATA_HIGH (0x01ac) */ +#define MREGBIT_RX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_RX_STATCTR_DATA_LOW (0x01b0) */ +#define MREGBIT_RX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TX_STATCTR_CONTROL (0x01b4) */ +#define MREGBIT_TX_COUNTER_NUMBER GENMASK(4, 0) +#define MREGBIT_START_TX_COUNTER_READ BIT(15) + +/* MAC_TX_STATCTR_DATA_HIGH (0x01b8) */ +#define MREGBIT_TX_STATCTR_DATA_HIGH GENMASK(15, 0) +/* MAC_TX_STATCTR_DATA_LOW (0x01bc) */ +#define MREGBIT_TX_STATCTR_DATA_LOW GENMASK(15, 0) + +/* MAC_TRANSMIT_FIFO_ALMOST_FULL (0x01c0) */ +#define MREGBIT_TX_FIFO_AF GENMASK(13, 0) + +/* MAC_TRANSMIT_PACKET_START_THRESHOLD (0x01c4) */ +#define MREGBIT_TX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_RECEIVE_PACKET_START_THRESHOLD (0x01c8) */ +#define MREGBIT_RX_PACKET_START_THRESHOLD GENMASK(13, 0) + +/* MAC_STATUS_IRQ (0x01e0) */ +#define MREGBIT_MAC_UNDERRUN_IRQ BIT(0) +#define MREGBIT_MAC_JABBER_IRQ BIT(1) + +/* MAC_INTERRUPT_ENABLE (0x01e4) */ +#define MREGBIT_MAC_UNDERRUN_INTERRUPT_ENABLE BIT(0) +#define MREGBIT_JABBER_INTERRUPT_ENABLE BIT(1) + +/* RX DMA descriptor */ + +#define RX_DESC_0_FRAME_PACKET_LENGTH_MASK GENMASK(13, 0) +#define RX_DESC_0_FRAME_ALIGN_ERR BIT(14) +#define RX_DESC_0_FRAME_RUNT BIT(15) +#define RX_DESC_0_FRAME_ETHERNET_TYPE BIT(16) +#define RX_DESC_0_FRAME_VLAN BIT(17) +#define RX_DESC_0_FRAME_MULTICAST BIT(18) +#define RX_DESC_0_FRAME_BROADCAST BIT(19) +#define RX_DESC_0_FRAME_CRC_ERR BIT(20) +#define RX_DESC_0_FRAME_MAX_LEN_ERR BIT(21) +#define RX_DESC_0_FRAME_JABBER_ERR BIT(22) +#define RX_DESC_0_FRAME_LENGTH_ERR BIT(23) +#define RX_DESC_0_FRAME_MAC_ADDR1_MATCH BIT(24) +#define RX_DESC_0_FRAME_MAC_ADDR2_MATCH BIT(25) +#define 
RX_DESC_0_FRAME_MAC_ADDR3_MATCH BIT(26) +#define RX_DESC_0_FRAME_MAC_ADDR4_MATCH BIT(27) +#define RX_DESC_0_FRAME_PAUSE_CTRL BIT(28) +#define RX_DESC_0_LAST_DESCRIPTOR BIT(29) +#define RX_DESC_0_FIRST_DESCRIPTOR BIT(30) +#define RX_DESC_0_OWN BIT(31) + +#define RX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define RX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) + /* [24] reserved */ +#define RX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define RX_DESC_1_END_RING BIT(26) + /* [29:27] reserved */ +#define RX_DESC_1_RX_TIMESTAMP BIT(30) +#define RX_DESC_1_PTP_PKT BIT(31) + +/* TX DMA descriptor */ + + /* [29:0] unused */ +#define TX_DESC_0_TX_TIMESTAMP BIT(30) +#define TX_DESC_0_OWN BIT(31) + +#define TX_DESC_1_BUFFER_SIZE_1_MASK GENMASK(11, 0) +#define TX_DESC_1_BUFFER_SIZE_2_MASK GENMASK(23, 12) +#define TX_DESC_1_FORCE_EOP_ERROR BIT(24) +#define TX_DESC_1_SECOND_ADDRESS_CHAINED BIT(25) +#define TX_DESC_1_END_RING BIT(26) +#define TX_DESC_1_DISABLE_PADDING BIT(27) +#define TX_DESC_1_ADD_CRC_DISABLE BIT(28) +#define TX_DESC_1_FIRST_SEGMENT BIT(29) +#define TX_DESC_1_LAST_SEGMENT BIT(30) +#define TX_DESC_1_INTERRUPT_ON_COMPLETION BIT(31) + +struct emac_desc { + u32 desc0; + u32 desc1; + u32 buffer_addr_1; + u32 buffer_addr_2; +}; + +/* Keep stats in this order, index used for accessing hardware */ + +union emac_hw_tx_stats { + struct { + u64 tx_ok_pkts; + u64 tx_total_pkts; + u64 tx_ok_bytes; + u64 tx_err_pkts; + u64 tx_singleclsn_pkts; + u64 tx_multiclsn_pkts; + u64 tx_lateclsn_pkts; + u64 tx_excessclsn_pkts; + u64 tx_unicast_pkts; + u64 tx_multicast_pkts; + u64 tx_broadcast_pkts; + u64 tx_pause_pkts; + } stats; + + DECLARE_FLEX_ARRAY(u64, array); +}; + +union emac_hw_rx_stats { + struct { + u64 rx_ok_pkts; + u64 rx_total_pkts; + u64 rx_crc_err_pkts; + u64 rx_align_err_pkts; + u64 rx_err_total_pkts; + u64 rx_ok_bytes; + u64 rx_total_bytes; + u64 rx_unicast_pkts; + u64 rx_multicast_pkts; + u64 rx_broadcast_pkts; + u64 rx_pause_pkts; + u64 rx_len_err_pkts; + u64 
rx_len_undersize_pkts; + u64 rx_len_oversize_pkts; + u64 rx_len_fragment_pkts; + u64 rx_len_jabber_pkts; + u64 rx_64_pkts; + u64 rx_65_127_pkts; + u64 rx_128_255_pkts; + u64 rx_256_511_pkts; + u64 rx_512_1023_pkts; + u64 rx_1024_1518_pkts; + u64 rx_1519_plus_pkts; + u64 rx_drp_fifo_full_pkts; + u64 rx_truncate_fifo_full_pkts; + } stats; + + DECLARE_FLEX_ARRAY(u64, array); +}; + +#endif /* _K1_EMAC_H_ */ -- 2.50.1 From wangruikang at iscas.ac.cn Thu Sep 11 11:13:56 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:56 +0800 Subject: [PATCH net-next v11 4/5] riscv: dts: spacemit: Add Ethernet support for BPI-F3 In-Reply-To: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> Message-ID: <20250912-net-k1-emac-v11-4-aa3e84f8043b@iscas.ac.cn> Banana Pi BPI-F3 uses an RGMII PHY for each port and uses GPIO for PHY reset. Tested-by: Hendrik Hamerlinck Signed-off-by: Vivian Wang Reviewed-by: Yixun Lan --- arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts | 46 +++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts index fe22c747c5012fe56d42ac8a7efdbbdb694f31b6..15fa4a5ebd043f3fbb115d37e5a980c9b773a228 100644 --- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts +++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts @@ -40,6 +40,52 @@ &emmc { status = "okay"; }; +&eth0 { + phy-handle = <&rgmii0>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + pinctrl-0 = <&gmac0_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <0>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(110) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii0: phy@1 { + reg = <0x1>; + }; + }; +}; + +&eth1 { + phy-handle = <&rgmii1>; + phy-mode = "rgmii-id"; + pinctrl-names = "default"; + 
pinctrl-0 = <&gmac1_cfg>; + rx-internal-delay-ps = <0>; + tx-internal-delay-ps = <250>; + status = "okay"; + + mdio-bus { + #address-cells = <0x1>; + #size-cells = <0x0>; + + reset-gpios = <&gpio K1_GPIO(115) GPIO_ACTIVE_LOW>; + reset-delay-us = <10000>; + reset-post-delay-us = <100000>; + + rgmii1: phy@1 { + reg = <0x1>; + }; + }; +}; + &uart0 { pinctrl-names = "default"; pinctrl-0 = <&uart0_2_cfg>; -- 2.50.1 From wangruikang at iscas.ac.cn Thu Sep 11 11:13:52 2025 From: wangruikang at iscas.ac.cn (Vivian Wang) Date: Fri, 12 Sep 2025 02:13:52 +0800 Subject: [PATCH net-next v11 0/5] Add Ethernet MAC support for SpacemiT K1 Message-ID: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> SpacemiT K1 has two gigabit Ethernet MACs with RGMII and RMII support. Add devicetree bindings, driver, and DTS for it. Tested primarily on BananaPi BPI-F3. Basic TX/RX functionality also tested on Milk-V Jupiter. I would like to note that even though some bit field names superficially resemble that of DesignWare MAC, all other differences point to it in fact being a custom design. Based on SpacemiT drivers [1]. 
These patches are also available at: https://github.com/dramforever/linux/tree/k1/ethernet/v11 [1]: https://github.com/spacemit-com/linux-k1x --- Changes in v11: - Use NETDEV_PCPU_STAT_DSTATS for tx_dropped - Use DECLARE_FLEX_ARRAY for emac_hw_{tx,rx}_stats instead of cast - More bitfields stuff to simplify code: - Define EMAC_MAX_DELAY_UNIT with FIELD_MAX - Use FIELD_{PREP,GET} in emac_mii_{read,write}() - Use FIELD_MODIFY in emac_set_{tx,rx}_fc() - Minor changes: - Use lower_32_bits and such instead of casts and shifts - Extract emac_ether_addr_hash() helper - In emac_mdio_init(), 0xffffffff -> ~0 - Minor comment changes - Link to v10: https://lore.kernel.org/r/20250908-net-k1-emac-v10-0-90d807ccd469@iscas.ac.cn Changes in v10: - Use FIELD_GET and FIELD_PREP, remove some unused constants - Remove redundant software statistics - In particular, rx_dropped should have been and is already tracked in rx_errors. - Track tx_dropped with a percpu field - Minor changes - Simplified int emac_rx_frame_status() -> bool emac_rx_frame_good() - Link to v9: https://lore.kernel.org/r/20250905-net-k1-emac-v9-0-f1649b98a19c@iscas.ac.cn Changes in v9: - Refactor to use phy_interface_mode_is_rgmii - Minor changes - Use netdev_err in more places - Print phy-mode by name on unsupported phy-mode - Link to v8: https://lore.kernel.org/r/20250828-net-k1-emac-v8-0-e9075dd2ca90@iscas.ac.cn Changes in v8: - Use devres to do of_phy_deregister_fixed_link on probe failure or remove - Simplified control flow in a few places with early return or continue - Minor changes - Removed some unneeded parens in emac_configure_{tx,rx} - Link to v7: https://lore.kernel.org/r/20250826-net-k1-emac-v7-0-5bc158d086ae@iscas.ac.cn Changes in v7: - Removed scoped_guard usage - Renamed error handling path labels after destinations - Fix skb free error handling path in emac_start_xmit and emac_tx_mem_map - Cancel tx_timeout_task to prevent schedule_work lifetime problems - Minor changes: - Remove 
unnecessary timer_delete_sync in emac_down - Use dev_err_ratelimited in a few more places - Cosmetic fixes in error messages - Link to v6: https://lore.kernel.org/r/20250820-net-k1-emac-v6-0-c1e28f2b8be5@iscas.ac.cn Changes in v6: - Implement pause frame support - Minor changes: - Convert comment for emac_stats_update() into assert_spin_locked() - Cosmetic fixes for some comments and whitespace - emac_set_mac_addr() is now refactored - Link to v5: https://lore.kernel.org/r/20250812-net-k1-emac-v5-0-dd17c4905f49@iscas.ac.cn Changes in v5: - Rebased on v6.17-rc1, add back DTS now that they apply cleanly - Use standard statistics interface, handle 32-bit statistics overflow - Minor changes: - Fix clock resource handling in emac_resume - Ratelimit the message in emac_rx_frame_status - Add ndo_validate_addr = eth_validate_addr - Remove unnecessary parens in emac_set_mac_addr - Change some functions that never fail to return void instead of int - Minor rewording - Link to v4: https://lore.kernel.org/r/20250703-net-k1-emac-v4-0-686d09c4cfa8@iscas.ac.cn Changes in v4: - Resource handling on probe and remove: timer_delete_sync and of_phy_deregister_fixed_link - Drop DTS changes and dependencies (will send through SpacemiT tree) - Minor changes: - Remove redundant phy_stop() and setting of ndev->phydev - Fix error checking for emac_open in emac_resume - Fix one missed dev_err -> dev_err_probe - Fix type of emac_start_xmit - Fix one missed reverse xmas tree formatting - Rename some functions for consistency between emac_* and ndo_* - Link to v3: https://lore.kernel.org/r/20250702-net-k1-emac-v3-0-882dc55404f3@iscas.ac.cn Changes in v3: - Refactored and simplified emac_tx_mem_map - Addressed other minor v2 review comments - Removed what was patch 3 in v2, depend on DMA buses instead - DT nodes in alphabetical order where appropriate - Link to v2: https://lore.kernel.org/r/20250618-net-k1-emac-v2-0-94f5f07227a8@iscas.ac.cn Changes in v2: - dts: Put eth0 and eth1 
nodes under a bus with dma-ranges - dts: Added Milk-V Jupiter - Fix typo in emac_init_hw() that broke the driver (Oops!) - Reformatted line lengths to under 80 - Addressed other v1 review comments - Link to v1: https://lore.kernel.org/r/20250613-net-k1-emac-v1-0-cc6f9e510667@iscas.ac.cn --- Vivian Wang (5): dt-bindings: net: Add support for SpacemiT K1 net: spacemit: Add K1 Ethernet MAC riscv: dts: spacemit: Add Ethernet support for K1 riscv: dts: spacemit: Add Ethernet support for BPI-F3 riscv: dts: spacemit: Add Ethernet support for Jupiter .../devicetree/bindings/net/spacemit,k1-emac.yaml | 81 + arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts | 46 + arch/riscv/boot/dts/spacemit/k1-milkv-jupiter.dts | 46 + arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 48 + arch/riscv/boot/dts/spacemit/k1.dtsi | 22 + drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/spacemit/Kconfig | 29 + drivers/net/ethernet/spacemit/Makefile | 6 + drivers/net/ethernet/spacemit/k1_emac.c | 2161 ++++++++++++++++++++ drivers/net/ethernet/spacemit/k1_emac.h | 416 ++++ 11 files changed, 2857 insertions(+) --- base-commit: 062b3e4a1f880f104a8d4b90b767788786aa7b78 change-id: 20250606-net-k1-emac-3e181508ea64 Best regards, -- Vivian "dramforever" Wang From conor at kernel.org Thu Sep 11 11:24:06 2025 From: conor at kernel.org (Conor Dooley) Date: Thu, 11 Sep 2025 19:24:06 +0100 Subject: [PATCH] cache: sifive_ccache: Optimize cache flushes In-Reply-To: <20250909224131.276800-1-samuel.holland@sifive.com> References: <20250909224131.276800-1-samuel.holland@sifive.com> Message-ID: <20250911-moonscape-bulk-cd1312329fa1@spud> From: Conor Dooley On Tue, 09 Sep 2025 15:41:27 -0700, Samuel Holland wrote: > Fence instructions are required only at the beginning and the end of > a flush operation, not separately for each cache line being flushed. > Speed up cache flushes by about 15% by removing the extra fences. > > Applied to riscv-cache-for-next, thanks! 
[1/1] cache: sifive_ccache: Optimize cache flushes https://git.kernel.org/conor/c/941327ca5ddd Thanks, Conor. From rabenda.cn at gmail.com Thu Sep 11 11:45:25 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Fri, 12 Sep 2025 02:45:25 +0800 Subject: [PATCH 0/3] riscv: dts: thead: add more th1520 isa extension support Message-ID: <20250911184528.1512543-1-rabenda.cn@gmail.com> Add xtheadvector & ziccrse & zfh for th1520 Thanks, Han Han Gao (3): riscv: dts: thead: add xtheadvector to the th1520 devicetree riscv: dts: thead: add ziccrse for th1520 riscv: dts: thead: add zfh for th1520 arch/riscv/boot/dts/thead/th1520.dtsi | 28 +++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) -- 2.47.3 From rabenda.cn at gmail.com Thu Sep 11 11:45:26 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Fri, 12 Sep 2025 02:45:26 +0800 Subject: [PATCH 1/3] riscv: dts: thead: add xtheadvector to the th1520 devicetree In-Reply-To: <20250911184528.1512543-1-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> Message-ID: <20250911184528.1512543-2-rabenda.cn@gmail.com> The th1520 support xtheadvector [1] so it can be included in the devicetree. Also include vlenb for the cpu. And set vlenb=16 [2]. This can be tested by passing the "mitigations=off" kernel parameter. 
Link: https://lore.kernel.org/linux-riscv/20241113-xtheadvector-v11-4-236c22791ef9@rivosinc.com/ [1] Link: https://lore.kernel.org/linux-riscv/aCO44SAoS2kIP61r@ghost/ [2] Signed-off-by: Han Gao Signed-off-by: Han Gao --- arch/riscv/boot/dts/thead/th1520.dtsi | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi index 42724bf7e90e..59d1927764a6 100644 --- a/arch/riscv/boot/dts/thead/th1520.dtsi +++ b/arch/riscv/boot/dts/thead/th1520.dtsi @@ -25,7 +25,8 @@ c910_0: cpu@0 { riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm"; + "zifencei", "zihpm", "xtheadvector"; + thead,vlenb = <16>; reg = <0>; i-cache-block-size = <64>; i-cache-size = <65536>; @@ -49,7 +50,8 @@ c910_1: cpu@1 { riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm"; + "zifencei", "zihpm", "xtheadvector"; + thead,vlenb = <16>; reg = <1>; i-cache-block-size = <64>; i-cache-size = <65536>; @@ -73,7 +75,8 @@ c910_2: cpu@2 { riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm"; + "zifencei", "zihpm", "xtheadvector"; + thead,vlenb = <16>; reg = <2>; i-cache-block-size = <64>; i-cache-size = <65536>; @@ -97,7 +100,8 @@ c910_3: cpu@3 { riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm"; + "zifencei", "zihpm", "xtheadvector"; + thead,vlenb = <16>; reg = <3>; i-cache-block-size = <64>; i-cache-size = <65536>; -- 2.47.3 From rabenda.cn at gmail.com Thu Sep 11 11:45:27 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Fri, 12 Sep 2025 02:45:27 +0800 Subject: [PATCH 2/3] riscv: dts: thead: add ziccrse for th1520 In-Reply-To: 
<20250911184528.1512543-1-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> Message-ID: <20250911184528.1512543-3-rabenda.cn@gmail.com> th1520 support Ziccrse ISA extension [1]. Link: https://lore.kernel.org/all/20241103145153.105097-12-alexghiti@rivosinc.com/ [1] Signed-off-by: Han Gao Signed-off-by: Han Gao --- arch/riscv/boot/dts/thead/th1520.dtsi | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi index 59d1927764a6..7f07688aa964 100644 --- a/arch/riscv/boot/dts/thead/th1520.dtsi +++ b/arch/riscv/boot/dts/thead/th1520.dtsi @@ -24,8 +24,10 @@ c910_0: cpu@0 { device_type = "cpu"; riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm", "xtheadvector"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", + "ziccrse", "zicntr", "zicsr", + "zifencei", "zihpm", + "xtheadvector"; thead,vlenb = <16>; reg = <0>; i-cache-block-size = <64>; @@ -49,8 +51,10 @@ c910_1: cpu@1 { device_type = "cpu"; riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm", "xtheadvector"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", + "ziccrse", "zicntr", "zicsr", + "zifencei", "zihpm", + "xtheadvector"; thead,vlenb = <16>; reg = <1>; i-cache-block-size = <64>; @@ -74,8 +78,10 @@ c910_2: cpu@2 { device_type = "cpu"; riscv,isa = "rv64imafdc"; riscv,isa-base = "rv64i"; - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm", "xtheadvector"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", + "ziccrse", "zicntr", "zicsr", + "zifencei", "zihpm", + "xtheadvector"; thead,vlenb = <16>; reg = <2>; i-cache-block-size = <64>; @@ -99,8 +105,10 @@ c910_3: cpu@3 { device_type = "cpu"; riscv,isa = 
"rv64imafdc"; riscv,isa-base = "rv64i"; - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", - "zifencei", "zihpm", "xtheadvector"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", + "ziccrse", "zicntr", "zicsr", + "zifencei", "zihpm", + "xtheadvector"; thead,vlenb = <16>; reg = <3>; i-cache-block-size = <64>; -- 2.47.3 From rabenda.cn at gmail.com Thu Sep 11 11:45:28 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Fri, 12 Sep 2025 02:45:28 +0800 Subject: [PATCH 3/3] riscv: dts: thead: add zfh for th1520 In-Reply-To: <20250911184528.1512543-1-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> Message-ID: <20250911184528.1512543-4-rabenda.cn@gmail.com> th1520 support Zfh ISA extension [1]. Link: https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1737721869472/%E7%8E%84%E9%93%81C910%E4%B8%8EC920R1S6%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C%28xrvm%29_20250124.pdf [1] Signed-off-by: Han Gao Signed-off-by: Han Gao --- arch/riscv/boot/dts/thead/th1520.dtsi | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi index 7f07688aa964..2075bb969c2f 100644 --- a/arch/riscv/boot/dts/thead/th1520.dtsi +++ b/arch/riscv/boot/dts/thead/th1520.dtsi @@ -26,7 +26,7 @@ c910_0: cpu@0 { riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "ziccrse", "zicntr", "zicsr", - "zifencei", "zihpm", + "zifencei", "zihpm", "zfh", "xtheadvector"; thead,vlenb = <16>; reg = <0>; @@ -53,7 +53,7 @@ c910_1: cpu@1 { riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "ziccrse", "zicntr", "zicsr", - "zifencei", "zihpm", + "zifencei", "zihpm", "zfh", "xtheadvector"; thead,vlenb = <16>; reg = <1>; @@ -80,7 +80,7 @@ c910_2: cpu@2 { riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "ziccrse", "zicntr", "zicsr", - "zifencei", "zihpm", + "zifencei", "zihpm", 
"zfh", "xtheadvector"; thead,vlenb = <16>; reg = <2>; @@ -107,7 +107,7 @@ c910_3: cpu@3 { riscv,isa-base = "rv64i"; riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "ziccrse", "zicntr", "zicsr", - "zifencei", "zihpm", + "zifencei", "zihpm", "zfh", "xtheadvector"; thead,vlenb = <16>; reg = <3>; -- 2.47.3 From broonie at kernel.org Thu Sep 11 11:57:18 2025 From: broonie at kernel.org (Mark Brown) Date: Thu, 11 Sep 2025 19:57:18 +0100 Subject: (subset) [PATCH v13 0/7] spacemit: introduce P1 PMIC support In-Reply-To: References: <20250825172057.163883-1-elder@riscstar.com> <175690199980.2656286.5459018179105557107.b4-ty@kernel.org> Message-ID: <7aba368e-709b-49b0-b62c-f2f8250c8628@sirena.org.uk> On Thu, Sep 11, 2025 at 11:36:41AM -0500, Alex Elder wrote: > That leaves patch 3, which enables regulator support, and patch > 4, which adds RTC support. > How should these two patches be merged? Mark has reviewed the > regulator patch 3 and Alexandre has acked the RTC patch 4. We'd both have been expecting them to go via MFD. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From opendmb at gmail.com Thu Sep 11 12:50:47 2025 From: opendmb at gmail.com (Doug Berger) Date: Thu, 11 Sep 2025 12:50:47 -0700 Subject: [PATCH v2 07/15] gpio: brcmstb: use new generic GPIO chip API In-Reply-To: References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> <20250910-gpio-mmio-gpio-conv-part4-v2-7-f3d1a4c57124@linaro.org> Message-ID: On 9/11/2025 12:56 AM, Bartosz Golaszewski wrote: > On Thu, Sep 11, 2025 at 2:11 AM Doug Berger wrote: >> >>> >>> @@ -700,7 +707,8 @@ static int brcmstb_gpio_probe(struct platform_device *pdev) >>> * be retained from S5 cold boot >>> */ >>> need_wakeup_event |= !!__brcmstb_gpio_get_active_irqs(bank); >>> - gc->write_reg(reg_base + GIO_MASK(bank->id), 0); >>> + gpio_generic_write_reg(&bank->chip, >>> + reg_base + GIO_MASK(bank->id), 0); >>> >>> err = gpiochip_add_data(gc, bank); >>> if (err) { >>> >> I suppose I'm OK with all of this, but I'm just curious about the longer >> term plans for the member accesses. Is there an intent to have helpers >> for things like?: >> chip.gc.offset >> chip.gc.ngpio > > I don't think so. It would require an enormous effort and these fields > in struct gpio_chip are pretty stable so there's no real reason for > it. > > Bart Ok, so assuming struct gpio_chip is sticking around long term that makes sense to me. Thanks! Acked-by: Doug Berger From fustini at kernel.org Thu Sep 11 12:53:32 2025 From: fustini at kernel.org (Drew Fustini) Date: Thu, 11 Sep 2025 12:53:32 -0700 Subject: [PATCH 0/2] RISC-V: Detect Ssqosid extension and handle srmcfg CSR In-Reply-To: <20250911-chaste-rare-fbc3b48a341a@spud> References: <20250910-ssqosid-v6-17-rc5-v1-0-72cb8f144615@kernel.org> <20250911-chaste-rare-fbc3b48a341a@spud> Message-ID: On Thu, Sep 11, 2025 at 05:23:30PM +0100, Conor Dooley wrote: > Why is there no binding change here? 
Is it not possible to use the > extension on DT systems, or is this an oversight? Thanks for pointing this out. My intention is to support QoS on both DT and ACPI systems. I will add an entry after sstc in extensions.yaml. Thanks, Drew From dlan at gentoo.org Thu Sep 11 12:55:02 2025 From: dlan at gentoo.org (Yixun Lan) Date: Fri, 12 Sep 2025 03:55:02 +0800 Subject: (subset) [PATCH v13 0/7] spacemit: introduce P1 PMIC support In-Reply-To: References: <20250825172057.163883-1-elder@riscstar.com> <175690199980.2656286.5459018179105557107.b4-ty@kernel.org> Message-ID: <20250911195502-GYA1223946@gentoo.org> Hi Alex, On 11:36 Thu 11 Sep , Alex Elder wrote: > On 9/3/25 7:19 AM, Lee Jones wrote: > > On Mon, 25 Aug 2025 12:20:49 -0500, Alex Elder wrote: > >> The SpacemiT P1 is an I2C-controlled PMIC that implements 6 buck > >> converters and 12 LDOs. It contains a load switch, ADC channels, > >> GPIOs, a real-time clock, and a watchdog timer. > >> > >> This series introduces a multifunction driver for the P1 PMIC as > >> well as drivers for its regulators and RTC. > >> > >> [...] > > > > Applied, thanks! > > > > [1/7] dt-bindings: mfd: add support the SpacemiT P1 PMIC > > commit: baac6755d3e8ddf47eee2be3ca72fe14ebae2143 > > [2/7] mfd: simple-mfd-i2c: add SpacemiT P1 support > > commit: 49833495c85f26d070e70148fd9607c6fbf927fd > > > > -- > > Lee Jones [李琼斯] > > > > Yixun Lan plans to merge patches 5-7 of this series. > DT patches usually go in last; in this case, they effectively depend on patch 3, which adds the regulator support. As Mark suggests it go via MFD, that leaves it for Jones to handle. It's close to the release of rc6, so I hope it isn't too late. > That leaves patch 3, which enables regulator support, and patch > 4, which adds RTC support. > > How should these two patches be merged? 
Mark has reviewed the > regulator patch 3 and Alexandre has acked the RTC patch 4. > > Thank you. > > -Alex -- Yixun Lan (dlan) From unicorn_wang at outlook.com Thu Sep 11 17:52:30 2025 From: unicorn_wang at outlook.com (Chen Wang) Date: Fri, 12 Sep 2025 08:52:30 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: <35447ba0-21c2-4a12-9d27-033a7be0af3e@aosc.io> References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> <35447ba0-21c2-4a12-9d27-033a7be0af3e@aosc.io> Message-ID: On 9/12/2025 12:17 AM, Mingcong Bai wrote: > Hi Chen, > > On 2025/9/10 10:08, Chen Wang wrote: >> +config PCIE_SG2042_HOST >> + tristate "Sophgo SG2042 PCIe controller (host mode)" >> + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) >> + select PCIE_CADENCE_HOST >> + help >> + Say Y here if you want to support the Sophgo SG2042 PCIe platform >> + controller in host mode. Sophgo SG2042 PCIe controller uses >> Cadence >> + PCIe core. >> + > > While build testing this patch against v6.16.6, PCIE_SG2042_HOST is > set to "M", the kernel would fail to build during MODPOST: > > ERROR: modpost: "cdns_pcie_pm_ops" > [drivers/pci/controller/cadence/pcie-sg2042.ko] undefined! > make[2]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1 > make[1]: *** [[...]/linux-6.16.6/Makefile:1953: modpost] Error 2 > make: *** [Makefile:248: __sub-make] Error 2 > My fault, seems there were some problems when I submitted the driver code. I will correct them immediately and submit a new version. Thanks, Chen > Best Regards, > Mingcong Bai From unicornxw at gmail.com Thu Sep 11 19:35:10 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:35:10 +0800 Subject: [PATCH v3 0/7] Add PCIe support to Sophgo SG2042 SoC Message-ID: From: Chen Wang Sophgo's SG2042 SoC uses Cadence PCIe core to implement RC mode. This is a completely rewritten PCIe driver for SG2042. 
It reuses some code from previously submitted patches (never merged into the upstream mainline), but the biggest difference is that compatibility support for old 32-bit PCIe devices has been removed in this new version. After discussing with community users, we felt there was not much demand for supporting old devices, so we made a new, simplified design based on practical needs. If someone really needs to play with old devices, we can provide the necessary hack patches in the downstream repository. Since the new design is quite different from the old code, I am releasing it as a new patch series. The old patch series can be found here [old-series]. Note, regarding [2/7] of this patchset: this fix is introduced because the pcie->ops pointer is not filled in by the SG2042 PCIe driver. It is not a mandatory field, but using it without checking causes a NULL pointer dereference at runtime. Link: https://lore.kernel.org/linux-riscv/cover.1736923025.git.unicorn_wang at outlook.com/ [old-series] Thanks, Chen --- Changes in v3: This patchset is based on v6.17-rc1. Fixed the following issues in the driver code based on feedback from Bjorn Helgaas and Mingcong Bai, thanks. - Fixed the build failure when the driver is built as a module: define the driver's own pm_ops instead of using ops defined in other built-in drivers. - Improved the .remove() function to properly disable the host. Changes in v2: This patchset is based on v6.17-rc1. You can simply review or test the patches at the link [2]. Fixed the following issues based on feedback from Rob Herring, Manivannan Sadhasivam, Bjorn Helgaas, and ALOK TIWARI, thanks. - Driver binding: - Removed vendor-id/device-id from the "required" properties. - Improved driver code: - Use separate pci_ops for the root bus and child buses. - Make the driver tristate so it can be built as a module. - Change the configuration name from PCIE_SG2042 to PCIE_SG2042_HOST. 
- Removed "Fixes" tag from commit [2/7], since this is not for an existing bug fix. - Other code cleanups and optimizations - DT: - Add PCIe support for SG2042 EVB boards. Changes in v1: The patch series is based on v6.17-rc1. You can simply review or test the patches at the link [1]. Link: https://lore.kernel.org/linux-riscv/cover.1756344464.git.unicorn_wang at outlook.com/ [1] Link: https://lore.kernel.org/linux-riscv/cover.1757467895.git.unicorn_wang at outlook.com/ [2] --- Chen Wang (7): dt-bindings: pci: Add Sophgo SG2042 PCIe host PCI: cadence: Check pcie-ops before using it PCI: sg2042: Add Sophgo SG2042 PCIe driver riscv: sophgo: dts: add PCIe controllers for SG2042 riscv: sophgo: dts: enable PCIe for PioneerBox riscv: sophgo: dts: enable PCIe for SG2042_EVB_V1.X riscv: sophgo: dts: enable PCIe for SG2042_EVB_V2.0 .../bindings/pci/sophgo,sg2042-pcie-host.yaml | 64 ++++++++ arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts | 12 ++ arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts | 12 ++ .../boot/dts/sophgo/sg2042-milkv-pioneer.dts | 12 ++ arch/riscv/boot/dts/sophgo/sg2042.dtsi | 88 +++++++++++ drivers/pci/controller/cadence/Kconfig | 10 ++ drivers/pci/controller/cadence/Makefile | 1 + .../controller/cadence/pcie-cadence-host.c | 2 +- drivers/pci/controller/cadence/pcie-cadence.c | 4 +- drivers/pci/controller/cadence/pcie-cadence.h | 6 +- drivers/pci/controller/cadence/pcie-sg2042.c | 138 ++++++++++++++++++ 11 files changed, 343 insertions(+), 6 deletions(-) create mode 100644 Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:35:32 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:35:32 +0800 Subject: [PATCH v3 1/7] dt-bindings: pci: Add Sophgo SG2042 PCIe host In-Reply-To: References: Message-ID: 
<2755f145755b6096247c26852b63671a6fea4dbf.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang Add binding for Sophgo SG2042 PCIe host controller. Reviewed-by: Rob Herring (Arm) Signed-off-by: Chen Wang --- .../bindings/pci/sophgo,sg2042-pcie-host.yaml | 64 +++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml diff --git a/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml b/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml new file mode 100644 index 000000000000..f8b7ca57fff1 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/sophgo,sg2042-pcie-host.yaml @@ -0,0 +1,64 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/pci/sophgo,sg2042-pcie-host.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Sophgo SG2042 PCIe Host (Cadence PCIe Wrapper) + +description: + Sophgo SG2042 PCIe host controller is based on the Cadence PCIe core. 
+ +maintainers: + - Chen Wang + +properties: + compatible: + const: sophgo,sg2042-pcie-host + + reg: + maxItems: 2 + + reg-names: + items: + - const: reg + - const: cfg + + vendor-id: + const: 0x1f1c + + device-id: + const: 0x2042 + + msi-parent: true + +allOf: + - $ref: cdns-pcie-host.yaml# + +required: + - compatible + - reg + - reg-names + +unevaluatedProperties: false + +examples: + - | + #include + + pcie at 62000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x62000000 0x00800000>, + <0x48000000 0x00001000>; + reg-names = "reg", "cfg"; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x81000000 0 0x00000000 0xde000000 0 0x00010000>, + <0x82000000 0 0xd0400000 0xd0400000 0 0x0d000000>; + bus-range = <0x00 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + }; -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:36:01 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:36:01 +0800 Subject: [PATCH v3 2/7] PCI: cadence: Check pcie-ops before using it In-Reply-To: References: Message-ID: <35182ee1d972dfcd093a964e11205efcebbdc044.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang ops of struct cdns_pcie may be NULL, direct use will result in a null pointer error. Add checking of pcie->ops before using it for new driver that may not supply pcie->ops. 
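A minimal userspace sketch of the guard described above, for illustration only. The `demo_*` types and functions are hypothetical stand-ins for `struct cdns_pcie`, its ops table, and `cdns_pcie_link_up()`; they are not the kernel definitions. It assumes, as the commit message says, that a driver may leave `ops` unset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Cadence structures; not the kernel types. */
struct demo_pcie;

struct demo_pcie_ops {
	bool (*link_up)(struct demo_pcie *pcie);
};

struct demo_pcie {
	const struct demo_pcie_ops *ops; /* may be NULL if the driver sets none */
};

/*
 * Same shape as the patched helper: check the ops table itself before
 * checking the individual callback, and fall back to "link is up" when
 * no callback is available.
 */
static inline bool demo_pcie_link_up(struct demo_pcie *pcie)
{
	if (pcie->ops && pcie->ops->link_up)
		return pcie->ops->link_up(pcie);

	return true;
}

/* Example callback that always reports the link as down. */
static bool demo_link_down(struct demo_pcie *pcie)
{
	(void)pcie;
	return false;
}
```

With `ops` left NULL the helper returns its default instead of dereferencing a NULL pointer; with a callback installed, the callback's result is returned unchanged.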
Signed-off-by: Chen Wang --- drivers/pci/controller/cadence/pcie-cadence-host.c | 2 +- drivers/pci/controller/cadence/pcie-cadence.c | 4 ++-- drivers/pci/controller/cadence/pcie-cadence.h | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c index 59a4631de79f..fffd63d6665e 100644 --- a/drivers/pci/controller/cadence/pcie-cadence-host.c +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c @@ -531,7 +531,7 @@ static int cdns_pcie_host_init_address_translation(struct cdns_pcie_rc *rc) cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_PCI_ADDR1(0), addr1); cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(0), desc1); - if (pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(12) | diff --git a/drivers/pci/controller/cadence/pcie-cadence.c b/drivers/pci/controller/cadence/pcie-cadence.c index 70a19573440e..61806bbd8aa3 100644 --- a/drivers/pci/controller/cadence/pcie-cadence.c +++ b/drivers/pci/controller/cadence/pcie-cadence.c @@ -92,7 +92,7 @@ void cdns_pcie_set_outbound_region(struct cdns_pcie *pcie, u8 busnr, u8 fn, cdns_pcie_writel(pcie, CDNS_PCIE_AT_OB_REGION_DESC1(r), desc1); /* Set the CPU address */ - if (pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(nbits) | @@ -123,7 +123,7 @@ void cdns_pcie_set_outbound_region_for_normal_msg(struct cdns_pcie *pcie, } /* Set the CPU address */ - if (pcie->ops->cpu_addr_fixup) + if (pcie->ops && pcie->ops->cpu_addr_fixup) cpu_addr = pcie->ops->cpu_addr_fixup(pcie, cpu_addr); addr0 = CDNS_PCIE_AT_OB_REGION_CPU_ADDR0_NBITS(17) | diff --git a/drivers/pci/controller/cadence/pcie-cadence.h b/drivers/pci/controller/cadence/pcie-cadence.h index 
1d81c4bf6c6d..2f07ba661bda 100644 --- a/drivers/pci/controller/cadence/pcie-cadence.h +++ b/drivers/pci/controller/cadence/pcie-cadence.h @@ -468,7 +468,7 @@ static inline u32 cdns_pcie_ep_fn_readl(struct cdns_pcie *pcie, u8 fn, u32 reg) static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) { - if (pcie->ops->start_link) + if (pcie->ops && pcie->ops->start_link) return pcie->ops->start_link(pcie); return 0; @@ -476,13 +476,13 @@ static inline int cdns_pcie_start_link(struct cdns_pcie *pcie) static inline void cdns_pcie_stop_link(struct cdns_pcie *pcie) { - if (pcie->ops->stop_link) + if (pcie->ops && pcie->ops->stop_link) pcie->ops->stop_link(pcie); } static inline bool cdns_pcie_link_up(struct cdns_pcie *pcie) { - if (pcie->ops->link_up) + if (pcie->ops && pcie->ops->link_up) return pcie->ops->link_up(pcie); return true; -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:36:31 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:36:31 +0800 Subject: [PATCH v3 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: References: Message-ID: <01b0a57cd9dba8bed7c1f2d52997046c2c6f042b.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang Add support for PCIe controller in SG2042 SoC. The controller uses the Cadence PCIe core programmed by pcie-cadence*.c. The PCIe controller will work in host mode only, supporting data rate (16 GT/s) and lanes (x16 or x8). Signed-off-by: Chen Wang --- drivers/pci/controller/cadence/Kconfig | 10 ++ drivers/pci/controller/cadence/Makefile | 1 + drivers/pci/controller/cadence/pcie-sg2042.c | 138 +++++++++++++++++++ 3 files changed, 149 insertions(+) create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c diff --git a/drivers/pci/controller/cadence/Kconfig b/drivers/pci/controller/cadence/Kconfig index 666e16b6367f..02a639e55fd8 100644 --- a/drivers/pci/controller/cadence/Kconfig +++ b/drivers/pci/controller/cadence/Kconfig @@ -42,6 +42,15 @@ config PCIE_CADENCE_PLAT_EP endpoint mode. 
This PCIe controller may be embedded into many different vendors SoCs. +config PCIE_SG2042_HOST + tristate "Sophgo SG2042 PCIe controller (host mode)" + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) + select PCIE_CADENCE_HOST + help + Say Y here if you want to support the Sophgo SG2042 PCIe platform + controller in host mode. Sophgo SG2042 PCIe controller uses Cadence + PCIe core. + config PCI_J721E tristate select PCIE_CADENCE_HOST if PCI_J721E_HOST != n @@ -67,4 +76,5 @@ config PCI_J721E_EP Say Y here if you want to support the TI J721E PCIe platform controller in endpoint mode. TI J721E PCIe controller uses Cadence PCIe core. + endmenu diff --git a/drivers/pci/controller/cadence/Makefile b/drivers/pci/controller/cadence/Makefile index 9bac5fb2f13d..5e23f8539ecc 100644 --- a/drivers/pci/controller/cadence/Makefile +++ b/drivers/pci/controller/cadence/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o obj-$(CONFIG_PCIE_CADENCE_PLAT) += pcie-cadence-plat.o obj-$(CONFIG_PCI_J721E) += pci-j721e.o +obj-$(CONFIG_PCIE_SG2042_HOST) += pcie-sg2042.o diff --git a/drivers/pci/controller/cadence/pcie-sg2042.c b/drivers/pci/controller/cadence/pcie-sg2042.c new file mode 100644 index 000000000000..db91c37790b7 --- /dev/null +++ b/drivers/pci/controller/cadence/pcie-sg2042.c @@ -0,0 +1,138 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * pcie-sg2042 - PCIe controller driver for Sophgo SG2042 SoC + * + * Copyright (C) 2025 Sophgo Technology Inc. + * Copyright (C) 2025 Chen Wang + */ + +#include +#include +#include +#include + +#include "pcie-cadence.h" + +/* + * SG2042 only supports 4-byte aligned access, so for the rootbus (i.e. to + * read/write the Root Port itself, read32/write32 is required. For + * non-rootbus (i.e. to read/write the PCIe peripheral registers, supports + * 1/2/4 byte aligned access, so directly using read/write should be fine. 
+ */ + +static struct pci_ops sg2042_pcie_root_ops = { + .map_bus = cdns_pci_map_bus, + .read = pci_generic_config_read32, + .write = pci_generic_config_write32, +}; + +static struct pci_ops sg2042_pcie_child_ops = { + .map_bus = cdns_pci_map_bus, + .read = pci_generic_config_read, + .write = pci_generic_config_write, +}; + +static int sg2042_pcie_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct pci_host_bridge *bridge; + struct cdns_pcie *pcie; + struct cdns_pcie_rc *rc; + int ret; + + bridge = devm_pci_alloc_host_bridge(dev, sizeof(*rc)); + if (!bridge) { + dev_err_probe(dev, -ENOMEM, "Failed to alloc host bridge!\n"); + return -ENOMEM; + } + + bridge->ops = &sg2042_pcie_root_ops; + bridge->child_ops = &sg2042_pcie_child_ops; + + rc = pci_host_bridge_priv(bridge); + pcie = &rc->pcie; + pcie->dev = dev; + + platform_set_drvdata(pdev, pcie); + + pm_runtime_set_active(dev); + pm_runtime_no_callbacks(dev); + devm_pm_runtime_enable(dev); + + ret = cdns_pcie_init_phy(dev, pcie); + if (ret) { + dev_err_probe(dev, ret, "Failed to init phy!\n"); + return ret; + } + + ret = cdns_pcie_host_setup(rc); + if (ret) { + dev_err_probe(dev, ret, "Failed to setup host!\n"); + cdns_pcie_disable_phy(pcie); + return ret; + } + + return 0; +} + +static void sg2042_pcie_remove(struct platform_device *pdev) +{ + struct cdns_pcie *pcie = platform_get_drvdata(pdev); + struct device *dev = &pdev->dev; + struct cdns_pcie_rc *rc; + + rc = container_of(pcie, struct cdns_pcie_rc, pcie); + cdns_pcie_host_disable(rc); + + cdns_pcie_disable_phy(pcie); + + pm_runtime_disable(dev); +} + +static int sg2042_pcie_suspend_noirq(struct device *dev) +{ + struct cdns_pcie *pcie = dev_get_drvdata(dev); + + cdns_pcie_disable_phy(pcie); + + return 0; +} + +static int sg2042_pcie_resume_noirq(struct device *dev) +{ + struct cdns_pcie *pcie = dev_get_drvdata(dev); + int ret; + + ret = cdns_pcie_enable_phy(pcie); + if (ret) { + dev_err(dev, "failed to enable PHY\n"); + return 
ret; + } + + return 0; +} + +static DEFINE_NOIRQ_DEV_PM_OPS(sg2042_pcie_pm_ops, + sg2042_pcie_suspend_noirq, + sg2042_pcie_resume_noirq); + +static const struct of_device_id sg2042_pcie_of_match[] = { + { .compatible = "sophgo,sg2042-pcie-host" }, + {}, +}; +MODULE_DEVICE_TABLE(of, sg2042_pcie_of_match); + +static struct platform_driver sg2042_pcie_driver = { + .driver = { + .name = "sg2042-pcie", + .of_match_table = sg2042_pcie_of_match, + .pm = pm_sleep_ptr(&sg2042_pcie_pm_ops), + }, + .probe = sg2042_pcie_probe, + .remove = sg2042_pcie_remove, +}; +module_platform_driver(sg2042_pcie_driver); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("PCIe controller driver for SG2042 SoCs"); +MODULE_AUTHOR("Chen Wang "); -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:36:50 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:36:50 +0800 Subject: [PATCH v3 4/7] riscv: sophgo: dts: add PCIe controllers for SG2042 In-Reply-To: References: Message-ID: <828860951ec4973285fe92fceb4b6f0ecb365a2f.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang Add PCIe controller nodes in DTS for Sophgo SG2042. They are disabled by default. 
Signed-off-by: Inochi Amaoto Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042.dtsi | 88 ++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042.dtsi b/arch/riscv/boot/dts/sophgo/sg2042.dtsi index b3e4d3c18fdc..b521f674283e 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042.dtsi +++ b/arch/riscv/boot/dts/sophgo/sg2042.dtsi @@ -220,6 +220,94 @@ clkgen: clock-controller at 7030012000 { #clock-cells = <1>; }; + pcie_rc0: pcie at 7060000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x60000000 0x0 0x00800000>, + <0x40 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <0>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0000000 0x40 0xc0000000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x40 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x40 0xe0000000 0x0 0x20000000>, + <0x43000000 0x42 0x00000000 0x42 0x00000000 0x2 0x00000000>, + <0x03000000 0x41 0x00000000 0x41 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + + pcie_rc1: pcie at 7060800000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x60800000 0x0 0x00800000>, + <0x44 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <1>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0400000 0x44 0xc0400000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x44 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x44 0xe0000000 0x0 0x20000000>, + <0x43000000 0x46 0x00000000 0x46 0x00000000 0x2 0x00000000>, + <0x03000000 0x45 0x00000000 0x45 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; 
+ }; + + pcie_rc2: pcie at 7062000000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x62000000 0x0 0x00800000>, + <0x48 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <2>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0800000 0x48 0xc0800000 0x0 0x00400000>, + <0x42000000 0x0 0xd0000000 0x48 0xd0000000 0x0 0x10000000>, + <0x02000000 0x0 0xe0000000 0x48 0xe0000000 0x0 0x20000000>, + <0x03000000 0x49 0x00000000 0x49 0x00000000 0x1 0x00000000>, + <0x43000000 0x4a 0x00000000 0x4a 0x00000000 0x2 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + + pcie_rc3: pcie at 7062800000 { + compatible = "sophgo,sg2042-pcie-host"; + device_type = "pci"; + reg = <0x70 0x62800000 0x0 0x00800000>, + <0x4c 0x00000000 0x0 0x00001000>; + reg-names = "reg", "cfg"; + linux,pci-domain = <3>; + #address-cells = <3>; + #size-cells = <2>; + ranges = <0x01000000 0x0 0xc0c00000 0x4c 0xc0c00000 0x0 0x00400000>, + <0x42000000 0x0 0xf8000000 0x4c 0xf8000000 0x0 0x04000000>, + <0x02000000 0x0 0xfc000000 0x4c 0xfc000000 0x0 0x04000000>, + <0x43000000 0x4e 0x00000000 0x4e 0x00000000 0x2 0x00000000>, + <0x03000000 0x4d 0x00000000 0x4d 0x00000000 0x1 0x00000000>; + bus-range = <0x0 0xff>; + vendor-id = <0x1f1c>; + device-id = <0x2042>; + cdns,no-bar-match-nbits = <48>; + msi-parent = <&msi>; + status = "disabled"; + }; + clint_mswi: interrupt-controller at 7094000000 { compatible = "sophgo,sg2042-aclint-mswi", "thead,c900-aclint-mswi"; reg = <0x00000070 0x94000000 0x00000000 0x00004000>; -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:37:13 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:37:13 +0800 Subject: [PATCH v3 5/7] riscv: sophgo: dts: enable PCIe for PioneerBox In-Reply-To: References: Message-ID: From: Chen Wang Enable PCIe controllers for PioneerBox, 
which uses SG2042 SoC. Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts b/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts index ef3a602172b1..c4d5f8d7d4ad 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-milkv-pioneer.dts @@ -128,6 +128,18 @@ uart0-rx-pins { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + +&pcie_rc3 { + status = "okay"; +}; + &sd { pinctrl-0 = <&sd_cfg>; pinctrl-names = "default"; -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:37:35 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:37:35 +0800 Subject: [PATCH v3 6/7] riscv: sophgo: dts: enable PCIe for SG2042_EVB_V1.X In-Reply-To: References: Message-ID: <76d4012e515dc3c3d4e406a237eadc55203f77b6.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang Enable PCIe controllers for Sophgo SG2042_EVB_V1.X board, which uses SG2042 SoC. 
Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts b/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts index 3320bc1dd2c6..a186d036cf36 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-evb-v1.dts @@ -164,6 +164,18 @@ phy0: phy at 0 { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc1 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + &pinctrl { emmc_cfg: sdhci-emmc-cfg { sdhci-emmc-wp-pins { -- 2.34.1 From unicornxw at gmail.com Thu Sep 11 19:37:54 2025 From: unicornxw at gmail.com (Chen Wang) Date: Fri, 12 Sep 2025 10:37:54 +0800 Subject: [PATCH v3 7/7] riscv: sophgo: dts: enable PCIe for SG2042_EVB_V2.0 In-Reply-To: References: Message-ID: <16831a3277a6c8c19436a17ac199d2f9b80f9ce5.1757643388.git.unicorn_wang@outlook.com> From: Chen Wang Enable PCIe controllers for Sophgo SG2042_EVB_V2.0 board, which uses SG2042 SoC. 
Signed-off-by: Han Gao Signed-off-by: Chen Wang --- arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts b/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts index 46980e41b886..0cd0dc0f537c 100644 --- a/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts +++ b/arch/riscv/boot/dts/sophgo/sg2042-evb-v2.dts @@ -152,6 +152,18 @@ phy0: phy at 0 { }; }; +&pcie_rc0 { + status = "okay"; +}; + +&pcie_rc1 { + status = "okay"; +}; + +&pcie_rc2 { + status = "okay"; +}; + &pinctrl { emmc_cfg: sdhci-emmc-cfg { sdhci-emmc-wp-pins { -- 2.34.1 From inochiama at gmail.com Thu Sep 11 19:47:43 2025 From: inochiama at gmail.com (Inochi Amaoto) Date: Fri, 12 Sep 2025 10:47:43 +0800 Subject: [PATCH v2 3/7] PCI: sg2042: Add Sophgo SG2042 PCIe driver In-Reply-To: References: <162d064228261ccd0bf9313a20288e510912effd.1757467895.git.unicorn_wang@outlook.com> Message-ID: On Thu, Sep 11, 2025 at 10:33:18PM +0530, Manivannan Sadhasivam wrote: > On Wed, Sep 10, 2025 at 10:56:23AM GMT, Inochi Amaoto wrote: > > On Wed, Sep 10, 2025 at 10:08:39AM +0800, Chen Wang wrote: > > > From: Chen Wang > > > > > > Add support for PCIe controller in SG2042 SoC. The controller > > > uses the Cadence PCIe core programmed by pcie-cadence*.c. The > > > PCIe controller will work in host mode only, supporting data > > > rate(gen4) and lanes(x16 or x8). 
> > > > > > Signed-off-by: Chen Wang > > > --- > > > drivers/pci/controller/cadence/Kconfig | 10 ++ > > > drivers/pci/controller/cadence/Makefile | 1 + > > > drivers/pci/controller/cadence/pcie-sg2042.c | 104 +++++++++++++++++++ > > > 3 files changed, 115 insertions(+) > > > create mode 100644 drivers/pci/controller/cadence/pcie-sg2042.c > > > > > > diff --git a/drivers/pci/controller/cadence/Kconfig b/drivers/pci/controller/cadence/Kconfig > > > index 666e16b6367f..02a639e55fd8 100644 > > > --- a/drivers/pci/controller/cadence/Kconfig > > > +++ b/drivers/pci/controller/cadence/Kconfig > > > @@ -42,6 +42,15 @@ config PCIE_CADENCE_PLAT_EP > > > endpoint mode. This PCIe controller may be embedded into many > > > different vendors SoCs. > > > > > > +config PCIE_SG2042_HOST > > > + tristate "Sophgo SG2042 PCIe controller (host mode)" > > > + depends on OF && (ARCH_SOPHGO || COMPILE_TEST) > > > + select PCIE_CADENCE_HOST > > > + help > > > + Say Y here if you want to support the Sophgo SG2042 PCIe platform > > > + controller in host mode. Sophgo SG2042 PCIe controller uses Cadence > > > + PCIe core. > > > + > > > config PCI_J721E > > > tristate > > > select PCIE_CADENCE_HOST if PCI_J721E_HOST != n > > > @@ -67,4 +76,5 @@ config PCI_J721E_EP > > > Say Y here if you want to support the TI J721E PCIe platform > > > controller in endpoint mode. TI J721E PCIe controller uses Cadence PCIe > > > core. 
> > > + > > > endmenu > > > diff --git a/drivers/pci/controller/cadence/Makefile b/drivers/pci/controller/cadence/Makefile > > > index 9bac5fb2f13d..5e23f8539ecc 100644 > > > --- a/drivers/pci/controller/cadence/Makefile > > > +++ b/drivers/pci/controller/cadence/Makefile > > > @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o > > > obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o > > > obj-$(CONFIG_PCIE_CADENCE_PLAT) += pcie-cadence-plat.o > > > obj-$(CONFIG_PCI_J721E) += pci-j721e.o > > > +obj-$(CONFIG_PCIE_SG2042_HOST) += pcie-sg2042.o > > > diff --git a/drivers/pci/controller/cadence/pcie-sg2042.c b/drivers/pci/controller/cadence/pcie-sg2042.c > > > new file mode 100644 > > > index 000000000000..c026e1ca5d6e > > > --- /dev/null > > > +++ b/drivers/pci/controller/cadence/pcie-sg2042.c > > > @@ -0,0 +1,104 @@ > > > +// SPDX-License-Identifier: GPL-2.0 > > > +/* > > > + * pcie-sg2042 - PCIe controller driver for Sophgo SG2042 SoC > > > + * > > > + * Copyright (C) 2025 Sophgo Technology Inc. > > > + * Copyright (C) 2025 Chen Wang > > > + */ > > > + > > > +#include > > > +#include > > > +#include > > > +#include > > > + > > > +#include "pcie-cadence.h" > > > + > > > +/* > > > + * SG2042 only supports 4-byte aligned access, so for the rootbus (i.e. to > > > + * read/write the Root Port itself, read32/write32 is required. For > > > + * non-rootbus (i.e. to read/write the PCIe peripheral registers, supports > > > + * 1/2/4 byte aligned access, so directly using read/write should be fine. 
> > > + */ > > > + > > > +static struct pci_ops sg2042_pcie_root_ops = { > > > + .map_bus = cdns_pci_map_bus, > > > + .read = pci_generic_config_read32, > > > + .write = pci_generic_config_write32, > > > +}; > > > + > > > +static struct pci_ops sg2042_pcie_child_ops = { > > > + .map_bus = cdns_pci_map_bus, > > > + .read = pci_generic_config_read, > > > + .write = pci_generic_config_write, > > > +}; > > > + > > > +static int sg2042_pcie_probe(struct platform_device *pdev) > > > +{ > > > + struct device *dev = &pdev->dev; > > > + struct pci_host_bridge *bridge; > > > + struct cdns_pcie *pcie; > > > + struct cdns_pcie_rc *rc; > > > + int ret; > > > + > > > + bridge = devm_pci_alloc_host_bridge(dev, sizeof(*rc)); > > > + if (!bridge) { > > > + dev_err_probe(dev, -ENOMEM, "Failed to alloc host bridge!\n"); > > > + return -ENOMEM; > > > + } > > > + > > > + bridge->ops = &sg2042_pcie_root_ops; > > > + bridge->child_ops = &sg2042_pcie_child_ops; > > > + > > > + rc = pci_host_bridge_priv(bridge); > > > + pcie = &rc->pcie; > > > + pcie->dev = dev; > > > + > > > + platform_set_drvdata(pdev, pcie); > > > + > > > + pm_runtime_set_active(dev); > > > + pm_runtime_no_callbacks(dev); > > > + devm_pm_runtime_enable(dev); > > > + > > > + ret = cdns_pcie_init_phy(dev, pcie); > > > + if (ret) { > > > + dev_err_probe(dev, ret, "Failed to init phy!\n"); > > > + return ret; > > > + } > > > + > > > + ret = cdns_pcie_host_setup(rc); > > > + if (ret) { > > > + dev_err_probe(dev, ret, "Failed to setup host!\n"); > > > + cdns_pcie_disable_phy(pcie); > > > + return ret; > > > + } > > > + > > > + return 0; > > > +} > > > + > > > > > +static void sg2042_pcie_remove(struct platform_device *pdev) > > > +{ > > > + struct cdns_pcie *pcie = platform_get_drvdata(pdev); > > > + > > > + cdns_pcie_disable_phy(pcie); > > > +} > > > + > > > > I think this remove is useless, as it is almost impossible to > > remove a pcie at runtime. > > > > Why impossible? 
We only have concerns with removing PCIe controllers > implementing irqchip, but this driver is not implementing it and using an > external irqchip controller. > > So it is safe and possible to remove this driver during runtime. > Good to know this. It is the thing I did not know before. So it is OK for me to see this code. Thanks, Inochi From krzk at kernel.org Thu Sep 11 23:23:43 2025 From: krzk at kernel.org (Krzysztof Kozlowski) Date: Fri, 12 Sep 2025 08:23:43 +0200 Subject: [PATCH] defconfig: cleanup orphaned CONFIG_SCHED_DEBUG In-Reply-To: <20250828103828.33255-1-twoerner@gmail.com> References: <20250828103828.33255-1-twoerner@gmail.com> Message-ID: <295aa861-8e5c-4146-a137-20dcfc24e1c0@kernel.org> On 28/08/2025 12:38, Trevor Woerner wrote: > In commit b52173065e0a ("sched/debug: Remove CONFIG_SCHED_DEBUG") this > Kconfig option was removed since CONFIG_SCHED_DEBUG was made unconditional > by patches preceding it. > > Signed-off-by: Trevor Woerner I doubt anyone will pick up such patch touching all possible architectures. I would suggest to split it per arch. If you want to keep it like that, there is a chance soc@ would pick it up if you send it to them. Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof From brgl at bgdev.pl Fri Sep 12 00:26:56 2025 From: brgl at bgdev.pl (Bartosz Golaszewski) Date: Fri, 12 Sep 2025 09:26:56 +0200 Subject: [PATCH v2 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4 In-Reply-To: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org> Message-ID: <175766186360.9646.5204996164911945151.b4-ty@linaro.org> From: Bartosz Golaszewski On Wed, 10 Sep 2025 09:12:36 +0200, Bartosz Golaszewski wrote: > Here's the final part of the generic GPIO chip conversions. 
Once all the > existing users are switched to the new API, the final patch in the > series removes bgpio_init(), moves the gpio-mmio fields out of struct > gpio_chip and into struct gpio_generic_chip and adjusts gpio-mmio.c to > the new situation. > > Down the line we could probably improve gpio-mmio.c by using lock guards > and replacing the - now obsolete - "bgpio" prefix with "gpio_generic" or > something similar but this series is already big as is so I'm leaving > that for the future. > > [...] Let's allow it to cook in next for some time. [01/15] gpio: loongson1: allow building the module with COMPILE_TEST enabled https://git.kernel.org/brgl/linux/c/80d7319c7a2a9865dc730422ec7227bfcc92e6bb [02/15] gpio: loongson1: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/116eadc92b4c47277d660271eac1efd4afd33121 [03/15] gpio: hlwd: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/43dffacf6be98fb31aa7790d693adc29276461f0 [04/15] gpio: ath79: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/551a097118391018ddc4079cbcec6fe4e7d64bc5 [05/15] gpio: ath79: use the generic GPIO chip lock for IRQ handling https://git.kernel.org/brgl/linux/c/e7a3a1be11d7e786924ed7af3b3411def2e46f21 [06/15] gpio: xgene-sb: use generic GPIO chip register read and write APIs https://git.kernel.org/brgl/linux/c/36f30f7ffc4b98dbd49deec8599cf810e7006cdf [07/15] gpio: brcmstb: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/e8bd2a6a5059043a9f13a0723acd48c1291a55ff [08/15] gpio: mt7621: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/80fd7e96d669d729d9e01bfa3e2b60ea6b500e20 [09/15] gpio: mt7621: use the generic GPIO chip lock for IRQ handling https://git.kernel.org/brgl/linux/c/2c1f22fa54fcbf8fbd9c03f5d341c73ef36c6d27 [10/15] gpio: menz127: use new generic GPIO chip API https://git.kernel.org/brgl/linux/c/b24489af4500720d8ad57c55111d90e762133c50 [11/15] gpio: sifive: use new generic GPIO chip API 
      https://git.kernel.org/brgl/linux/c/8e1c8ccc1df8b802a7a1b4beadbd8b87fff1c3b3
[12/15] gpio: spacemit-k1: use new generic GPIO chip API
      https://git.kernel.org/brgl/linux/c/063411108de622a26b36487a711903443b0e864b
[13/15] gpio: sodaville: use new generic GPIO chip API
      https://git.kernel.org/brgl/linux/c/ae9a52990b2cd62e0555adad92d8fe9e431d1bac
[14/15] gpio: mmio: use new generic GPIO chip API
      https://git.kernel.org/brgl/linux/c/e43e94fa19cf058c4e465fcdbc2f521123058ea6
[15/15] gpio: move gpio-mmio-specific fields out of struct gpio_chip
      https://git.kernel.org/brgl/linux/c/9b90afa6d613b66ec4e74ae75f9bfa5baf386ecd

Best regards,
-- 
Bartosz Golaszewski

From tianruidong at linux.alibaba.com Fri Sep 12 00:30:41 2025
From: tianruidong at linux.alibaba.com (Ruidong Tian)
Date: Fri, 12 Sep 2025 15:30:41 +0800
Subject: [RFC PATCH v1 00/10] Add RAS support for RISC-V architecture
In-Reply-To: <20250227123628.2931490-1-hchauhan@ventanamicro.com>
References: <20250227123628.2931490-1-hchauhan@ventanamicro.com>
Message-ID: <72563756-a53a-4f50-9bf4-87f6b26af036@linux.alibaba.com>

On 2025/2/27 20:36, Himanshu Chauhan wrote:
> This series implements the RAS (Reliability, Availability and Serviceability)
> support for RISC-V architecture using RISC-V RERI specification. It is conformant
> to ACPI platform error interfaces (APEI). It uses the highest priority
> Supervisor Software Events (SSE)[2] to deliver the hardware error events to the kernel.
> The SSE implementation has already been merged in OpenSBI. Clement has sent a patch series for
> its implementation in Linux kernel.[5]
>
> The GHES driver framework is used as is with the following changes for RISC-V:
> 1. Register each ghes entry with SSE layer. Ghes notification vector is SSE event.
> 2. Add RISC-V specific entries for processor type and ISA string
> 3. Add fixmap indices GHES SSE Low and High Priority to help map and read from
> physical addresses present in GHES entry.
> 4.
Other changes to build/configure the RAS support > > How to Use: > ---------- > This RAS stack consists of Qemu[3], OpenSBI, EDK2[4], Linux kernel and devmem utility to inject and trigger > errors. Qemu [Ref.] has support to emulate RISC-V RERI. The RAS agent is implemented in OpenSBI which > creates CPER records. EDK2 generates HEST table and populates it with GHES entries with the help of > OpenSBI. > > Qemu Command: > ------------ > /build/qemu-system-riscv64 \ > -s -accel tcg -m 4096 -smp 2 \ > -cpu rv64,smepmp=false \ > -serial mon:stdio \ > -d guest_errors -D ./qemu.log \ > -bios /build/platform/generic/firmware/fw_dynamic.bin \ > -monitor telnet:127.0.0.1:55555,server,nowait \ > -device virtio-gpu-pci -full-screen \ > -device qemu-xhci \ > -device usb-kbd \ > -blockdev node-name=pflash0,driver=file,read-only=on,filename=/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_CODE.fd \ > -blockdev node-name=pflash1,driver=file,filename=/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_VARS.fd \ > -M virt,pflash0=pflash0,pflash1=pflash1,rpmi=true,reri=true,aia=aplic-imsic \ > -kernel \ > -initrd \ > -append "root=/dev/ram rw console=ttyS0 earlycon=uart8250,mmio,0x10000000" > > Error Injection & Triggering: > ---------------------------- > devmem 0x4010040 32 0x2a1 > devmem 0x4010048 32 0x9001404 > devmem 0x4010044 8 1 > > The above commands injects a TLB error on CPU 0. 
>
> Sample Output (CPU 0):
> ---------------------
> [ 34.370282] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> [ 34.371375] {1}[Hardware Error]: event severity: recoverable
> [ 34.372149] {1}[Hardware Error]: Error 0, type: recoverable
> [ 34.372756] {1}[Hardware Error]: section_type: general processor error
> [ 34.373357] {1}[Hardware Error]: processor_type: 3, RISCV
> [ 34.373806] {1}[Hardware Error]: processor_isa: 6, RISCV64
> [ 34.374294] {1}[Hardware Error]: error_type: 0x02
> [ 34.374845] {1}[Hardware Error]: TLB error
> [ 34.375448] {1}[Hardware Error]: operation: 1, data read
> [ 34.376100] {1}[Hardware Error]: target_address: 0x0000000000000000
>
> References:
> ----------
> [1] RERI Specification: https://github.com/riscv-non-isa/riscv-ras-eri/releases/download/v1.0/riscv-reri.pdf
> [2] SSE Section in OpenSBI v3.0: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v3.0-rc3/riscv-sbi.pdf
> [3] Qemu source (with RERI emulation support): https://github.com/ventanamicro/qemu.git (branch: dev-upstream)
> [4] EDK2: https://github.com/ventanamicro/edk2.git (branch: dev-upstream)
> [5] SSE Kernel Patches: https://lore.kernel.org/linux-riscv/649fdead-09b0-4f94-a6ff-099fc970d890 at rivosinc.com/T/

Hi,

Thanks for this series. I'm doing some work related to your patch.

Besides SSE, I'm working on support for another notification type for
synchronous hardware errors (e.g., on a poison read), which is called
Hardware Error Exception (HEE) in Dhaval Sharma's UEFI proposal[0] in
PRS-TG. I have a patch for HEE support which I've sent out separately[1].

Perhaps we could merge my work into your patchset to bring a complete
RAS solution to the RISC-V architecture? Or, I'm also happy to wait for
your patches to land and then continue my work on top.

Let me know what you think would be best.

Cheers,
Ruidong Tian

[0]: https://lists.riscv.org/g/tech-prs/topic/risc_v_ras_related_ecrs/113685653
[1]: https://lore.kernel.org/all/20250910093347.75822-6-tianruidong at linux.alibaba.com/

> Himanshu Chauhan (10):
>   riscv: Define ioremap_cache for RISC-V
>   riscv: Define arch_apei_get_mem_attribute for RISC-V
>   acpi: Introduce SSE in HEST notification types
>   riscv: Add fixmap indices for GHES IRQ and SSE contexts
>   riscv: conditionally compile GHES NMI spool function
>   riscv: Add functions to register ghes having SSE notification
>   riscv: Add RISC-V entries in processor type and ISA strings
>   riscv: Introduce HEST SSE notification handlers
>   riscv: Add config option to enable APEI SSE handler
>   riscv: Enable APEI and NMI safe cmpxchg options required for RAS
>
>  arch/riscv/Kconfig                 |   2 +
>  arch/riscv/include/asm/acpi.h      |  20 ++++
>  arch/riscv/include/asm/fixmap.h    |   8 ++
>  arch/riscv/include/asm/io.h        |   3 +
>  drivers/acpi/apei/Kconfig          |   5 +
>  drivers/acpi/apei/ghes.c           | 102 +++++++++++++++++---
>  drivers/firmware/efi/cper.c        |   3 +
>  drivers/firmware/riscv/riscv_sse.c | 147 +++++++++++++++++++++++++++++
>  include/acpi/actbl1.h              |   3 +-
>  include/linux/riscv_sse.h          |  15 +++
>  10 files changed, 296 insertions(+), 12 deletions(-)
>

From linus.walleij at linaro.org Fri Sep 12 00:33:12 2025
From: linus.walleij at linaro.org (Linus Walleij)
Date: Fri, 12 Sep 2025 09:33:12 +0200
Subject: [PATCH v2 00/15] gpio: replace legacy bgpio_init() with its modernized alternative - part 4
In-Reply-To:
References: <20250910-gpio-mmio-gpio-conv-part4-v2-0-f3d1a4c57124@linaro.org>
Message-ID:

On Thu, Sep 11, 2025 at 9:38 AM Bartosz Golaszewski wrote:
> On Wed, Sep 10, 2025 at 11:32 PM Linus Walleij wrote:
> > I would merge the first 14 and keep the last for the later part
> > of the merge window when all other trees with conversions
> > are merged.
> >
> > (You probably already thought of this.)
> > > > Yours, > > Linus Walleij > > I already have both pinctrl and mfd changes in my tree from Lee's and > your immutable branches. I pushed this into gpio/devel and it built > just fine. Ah, excellent planning. Smarter than anything I'd be able to logisticize in my head! Yours, Linus Walleij From zhang.lyra at gmail.com Fri Sep 12 01:22:22 2025 From: zhang.lyra at gmail.com (Chunyan Zhang) Date: Fri, 12 Sep 2025 16:22:22 +0800 Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported() In-Reply-To: <9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn> <9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com> Message-ID: Hi David, On Thu, 11 Sept 2025 at 21:09, David Hildenbrand wrote: > > On 11.09.25 11:55, Chunyan Zhang wrote: > > Some platforms can customize the PTE PMD entry soft-dirty bit making it > > unavailable even if the architecture provides the resource. > > > > Add an API which architectures can define their specific implementations > > to detect if soft-dirty bit is available on which device the kernel is > > running. > > Thinking to myself: maybe pgtable_supports_soft_dirty() would read better > Whatever you prefer. I will use pgtable_supports_* in the next version. 
> > > > Signed-off-by: Chunyan Zhang > > --- > > fs/proc/task_mmu.c | 17 ++++++++++++++++- > > include/linux/pgtable.h | 12 ++++++++++++ > > mm/debug_vm_pgtable.c | 10 +++++----- > > mm/huge_memory.c | 13 +++++++------ > > mm/internal.h | 2 +- > > mm/mremap.c | 13 +++++++------ > > mm/userfaultfd.c | 10 ++++------ > > 7 files changed, 52 insertions(+), 25 deletions(-) > > > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > > index 29cca0e6d0ff..9e8083b6d4cd 100644 > > --- a/fs/proc/task_mmu.c > > +++ b/fs/proc/task_mmu.c > > @@ -1058,7 +1058,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > > * -Werror=unterminated-string-initialization warning > > * with GCC 15 > > */ > > - static const char mnemonics[BITS_PER_LONG][3] = { > > + static char mnemonics[BITS_PER_LONG][3] = { > > /* > > * In case if we meet a flag we don't know about. > > */ > > @@ -1129,6 +1129,16 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) > > [ilog2(VM_SEALED)] = "sl", > > #endif > > }; > > +/* > > + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is > > + * unavailable on which the kernel is running, even if the architecture > > + * provides the resource and soft-dirty is compiled in. > > + */ > > +#ifdef CONFIG_MEM_SOFT_DIRTY > > + if (!pgtable_soft_dirty_supported()) > > + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > > +#endif > > You can now drop the ifdef. Ok, you mean define VM_SOFTDIRTY 0x08000000 no matter if MEM_SOFT_DIRTY is compiled in, right? Then I need memcpy() to set mnemonics[ilog2(VM_SOFTDIRTY)] here. > > But, I wonder if could we instead just stop setting the flag. Then we don't > have to worry about any VM_SOFTDIRTY checks. 
> > Something like the following > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 892fe5dbf9de0..8b8bf63a32ef7 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -783,6 +783,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) > static inline void vm_flags_init(struct vm_area_struct *vma, > vm_flags_t flags) > { > + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > ACCESS_PRIVATE(vma, __vm_flags) = flags; > } > > @@ -801,6 +802,7 @@ static inline void vm_flags_reset(struct vm_area_struct *vma, > static inline void vm_flags_reset_once(struct vm_area_struct *vma, > vm_flags_t flags) > { > + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > vma_assert_write_locked(vma); > WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags); > } > @@ -808,6 +810,7 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma, > static inline void vm_flags_set(struct vm_area_struct *vma, > vm_flags_t flags) > { > + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > vma_start_write(vma); > ACCESS_PRIVATE(vma, __vm_flags) |= flags; > } > diff --git a/mm/mmap.c b/mm/mmap.c > index 5fd3b80fda1d5..40cb3fbf9a247 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1451,8 +1451,10 @@ static struct vm_area_struct *__install_special_mapping( > return ERR_PTR(-ENOMEM); > > vma_set_range(vma, addr, addr + len, 0); > - vm_flags_init(vma, (vm_flags | mm->def_flags | > - VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK); > + vm_flags |= mm->def_flags | VM_DONTEXPAND; Why use '|=' rather than not directly setting vm_flags which is an uninitialized variable? 
> + if (pgtable_soft_dirty_supported()) > + vm_flags |= VM_SOFTDIRTY; > + vm_flags_init(vma, vm_flags & ~VM_LOCKED_MASK); > vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); > > vma->vm_ops = ops; > diff --git a/mm/vma.c b/mm/vma.c > index abe0da33c8446..16a1ed2a6199c 100644 > --- a/mm/vma.c > +++ b/mm/vma.c > @@ -2551,7 +2551,8 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma) > * then new mapped in-place (which must be aimed as > * a completely new data area). > */ > - vm_flags_set(vma, VM_SOFTDIRTY); > + if (pgtable_soft_dirty_supported()) > + vm_flags_set(vma, VM_SOFTDIRTY); > > vma_set_page_prot(vma); > } > @@ -2819,7 +2820,8 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma, > mm->data_vm += len >> PAGE_SHIFT; > if (vm_flags & VM_LOCKED) > mm->locked_vm += (len >> PAGE_SHIFT); > - vm_flags_set(vma, VM_SOFTDIRTY); > + if (pgtable_soft_dirty_supported()) > + vm_flags_set(vma, VM_SOFTDIRTY); > return 0; > > mas_store_fail: > diff --git a/mm/vma_exec.c b/mm/vma_exec.c > index 922ee51747a68..c06732a5a620a 100644 > --- a/mm/vma_exec.c > +++ b/mm/vma_exec.c > @@ -107,6 +107,7 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift) > int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap, > unsigned long *top_mem_p) > { > + unsigned long flags = VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP; > int err; > struct vm_area_struct *vma = vm_area_alloc(mm); > > @@ -137,7 +138,9 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap, > BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP); > vma->vm_end = STACK_TOP_MAX; > vma->vm_start = vma->vm_end - PAGE_SIZE; > - vm_flags_init(vma, VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP); > + if (pgtable_soft_dirty_supported()) > + flags |= VM_SOFTDIRTY; > + vm_flags_init(vma, flags); > vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); > > err = insert_vm_struct(mm, vma); > > > > + > > size_t 
i; > > > > seq_puts(m, "VmFlags: "); > > @@ -1531,6 +1541,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, > > static inline void clear_soft_dirty(struct vm_area_struct *vma, > > unsigned long addr, pte_t *pte) > > { > > + if (!pgtable_soft_dirty_supported()) > > + return; > > /* > > * The soft-dirty tracker uses #PF-s to catch writes > > * to pages, so write-protect the pte as well. See the > > @@ -1566,6 +1578,9 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma, > > { > > pmd_t old, pmd = *pmdp; > > > > + if (!pgtable_soft_dirty_supported()) > > + return; > > + > > if (pmd_present(pmd)) { > > /* See comment in change_huge_pmd() */ > > old = pmdp_invalidate(vma, addr, pmdp); > > That would all be handled with the above never-set-VM_SOFTDIRTY. Sorry I'm not sure I understand here, you mean no longer need #ifdef CONFIG_MEM_SOFT_DIRTY for these function definitions, right? > > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > > index 4c035637eeb7..2a3578a4ae4c 100644 > > --- a/include/linux/pgtable.h > > +++ b/include/linux/pgtable.h > > @@ -1537,6 +1537,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > > #define arch_start_context_switch(prev) do {} while (0) > > #endif > > > > +/* > > + * Some platforms can customize the PTE soft-dirty bit making it unavailable > > + * even if the architecture provides the resource. > > + * Adding this API allows architectures to add their own checks for the > > + * devices on which the kernel is running. > > + * Note: When overiding it, please make sure the CONFIG_MEM_SOFT_DIRTY > > + * is part of this macro. 
> > + */ > > +#ifndef pgtable_soft_dirty_supported > > +#define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > > +#endif > > + > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > > #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > > static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > > index 830107b6dd08..b32ce2b0b998 100644 > > --- a/mm/debug_vm_pgtable.c > > +++ b/mm/debug_vm_pgtable.c > > @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > > { > > pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > > > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > > + if (!pgtable_soft_dirty_supported()) > > return; > > > > pr_debug("Validating PTE soft dirty\n"); > > @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) > > { > > pte_t pte; > > > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > > + if (!pgtable_soft_dirty_supported()) > > return; > > > > pr_debug("Validating PTE swap soft dirty\n"); > > @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) > > { > > pmd_t pmd; > > > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > > + if (!pgtable_soft_dirty_supported()) > > return; > > > > if (!has_transparent_hugepage()) > > @@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) > > { > > pmd_t pmd; > > > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || > > - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > > + if (!pgtable_soft_dirty_supported() || > > + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > > return; > > > > if (!has_transparent_hugepage()) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index 9c38a95e9f09..218d430a2ec6 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -2271,12 +2271,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, > > > > static pmd_t move_soft_dirty_pmd(pmd_t pmd) 
> > { > > -#ifdef CONFIG_MEM_SOFT_DIRTY > > - if (unlikely(is_pmd_migration_entry(pmd))) > > - pmd = pmd_swp_mksoft_dirty(pmd); > > - else if (pmd_present(pmd)) > > - pmd = pmd_mksoft_dirty(pmd); > > -#endif > > + if (pgtable_soft_dirty_supported()) { > > + if (unlikely(is_pmd_migration_entry(pmd))) > > + pmd = pmd_swp_mksoft_dirty(pmd); > > + else if (pmd_present(pmd)) > > + pmd = pmd_mksoft_dirty(pmd); > > + } > > + > > Wondering, should simply the arch take care of that and we can just clal > pmd_swp_mksoft_dirty / pmd_mksoft_dirty? Ok, I think I can do that in another patchset. > > > return pmd; > > } > > > > diff --git a/mm/internal.h b/mm/internal.h > > index 45b725c3dc03..c6ca62f8ecf3 100644 > > --- a/mm/internal.h > > +++ b/mm/internal.h > > @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) > > * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) > > * will be constantly true. > > */ > > - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > > + if (!pgtable_soft_dirty_supported()) > > return false; > > > > That should be handled with the above never-set-VM_SOFTDIRTY. We don't need to check if (!pgtable_soft_dirty_supported()) if I understand correctly. Thanks for the review, Chunyan > > > /* > > diff --git a/mm/mremap.c b/mm/mremap.c > > index e618a706aff5..7beb3114dbf5 100644 > > --- a/mm/mremap.c > > +++ b/mm/mremap.c > > @@ -162,12 +162,13 @@ static pte_t move_soft_dirty_pte(pte_t pte) > > * Set soft dirty bit so we can notice > > * in userspace the ptes were moved. 
> > */ > > -#ifdef CONFIG_MEM_SOFT_DIRTY > > - if (pte_present(pte)) > > - pte = pte_mksoft_dirty(pte); > > - else if (is_swap_pte(pte)) > > - pte = pte_swp_mksoft_dirty(pte); > > -#endif > > + if (pgtable_soft_dirty_supported()) { > > + if (pte_present(pte)) > > + pte = pte_mksoft_dirty(pte); > > + else if (is_swap_pte(pte)) > > + pte = pte_swp_mksoft_dirty(pte); > > + } > > + > > return pte; > > } > > > -- > Cheers > > David / dhildenb > From david at redhat.com Fri Sep 12 01:41:10 2025 From: david at redhat.com (David Hildenbrand) Date: Fri, 12 Sep 2025 10:41:10 +0200 Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported() In-Reply-To: References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn> <9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com> Message-ID: <04d2d781-fd5e-4778-b042-d4dbeb8c5d49@redhat.com> [...] >>> +/* >>> + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is >>> + * unavailable on which the kernel is running, even if the architecture >>> + * provides the resource and soft-dirty is compiled in. >>> + */ >>> +#ifdef CONFIG_MEM_SOFT_DIRTY >>> + if (!pgtable_soft_dirty_supported()) >>> + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; >>> +#endif >> >> You can now drop the ifdef. > > Ok, you mean define VM_SOFTDIRTY 0x08000000 no matter if > MEM_SOFT_DIRTY is compiled in, right? > > Then I need memcpy() to set mnemonics[ilog2(VM_SOFTDIRTY)] here. The whole hunk will not be required when we make sure VM_SOFTDIRTY never gets set, correct? > >> >> But, I wonder if could we instead just stop setting the flag. Then we don't >> have to worry about any VM_SOFTDIRTY checks. 
>> >> Something like the following >> >> diff --git a/include/linux/mm.h b/include/linux/mm.h >> index 892fe5dbf9de0..8b8bf63a32ef7 100644 >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -783,6 +783,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) >> static inline void vm_flags_init(struct vm_area_struct *vma, >> vm_flags_t flags) >> { >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); >> ACCESS_PRIVATE(vma, __vm_flags) = flags; >> } >> >> @@ -801,6 +802,7 @@ static inline void vm_flags_reset(struct vm_area_struct *vma, >> static inline void vm_flags_reset_once(struct vm_area_struct *vma, >> vm_flags_t flags) >> { >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); >> vma_assert_write_locked(vma); >> WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags); >> } >> @@ -808,6 +810,7 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma, >> static inline void vm_flags_set(struct vm_area_struct *vma, >> vm_flags_t flags) >> { >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); >> vma_start_write(vma); >> ACCESS_PRIVATE(vma, __vm_flags) |= flags; >> } >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 5fd3b80fda1d5..40cb3fbf9a247 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -1451,8 +1451,10 @@ static struct vm_area_struct *__install_special_mapping( >> return ERR_PTR(-ENOMEM); >> >> vma_set_range(vma, addr, addr + len, 0); >> - vm_flags_init(vma, (vm_flags | mm->def_flags | >> - VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK); >> + vm_flags |= mm->def_flags | VM_DONTEXPAND; > > Why use '|=' rather than not directly setting vm_flags which is an > uninitialized variable? vm_flags is passed in by the caller? But just to clarify: this code was just a quick hack, adjust it as you need. [...] 
>>> >>> + if (!pgtable_soft_dirty_supported()) >>> + return; >>> + >>> if (pmd_present(pmd)) { >>> /* See comment in change_huge_pmd() */ >>> old = pmdp_invalidate(vma, addr, pmdp); >> >> That would all be handled with the above never-set-VM_SOFTDIRTY. I meant that there is no need to add the pgtable_soft_dirty_supported() check. > > Sorry I'm not sure I understand here, you mean no longer need #ifdef > CONFIG_MEM_SOFT_DIRTY for these function definitions, right? Likely we could drop them. VM_SOFTDIRTY will never be set so the code will not be invoked. And for architectures where VM_SOFTDIRTY is never even possible (!CONFIG_MEM_SOFT_DIRTY) we keep it as 0. That way, the compiler can even optimize out all of that code because "vma->vm_flags & VM_SOFTDIRTY" -> "vma->vm_flags & 0" will never be true. > >> >>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >>> index 4c035637eeb7..2a3578a4ae4c 100644 >>> --- a/include/linux/pgtable.h >>> +++ b/include/linux/pgtable.h >>> @@ -1537,6 +1537,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) >>> #define arch_start_context_switch(prev) do {} while (0) >>> #endif >>> >>> +/* >>> + * Some platforms can customize the PTE soft-dirty bit making it unavailable >>> + * even if the architecture provides the resource. >>> + * Adding this API allows architectures to add their own checks for the >>> + * devices on which the kernel is running. >>> + * Note: When overiding it, please make sure the CONFIG_MEM_SOFT_DIRTY >>> + * is part of this macro. 
>>> + */ >>> +#ifndef pgtable_soft_dirty_supported >>> +#define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) >>> +#endif >>> + >>> #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY >>> #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION >>> static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >>> index 830107b6dd08..b32ce2b0b998 100644 >>> --- a/mm/debug_vm_pgtable.c >>> +++ b/mm/debug_vm_pgtable.c >>> @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) >>> { >>> pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); >>> >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) >>> + if (!pgtable_soft_dirty_supported()) >>> return; >>> >>> pr_debug("Validating PTE soft dirty\n"); >>> @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) >>> { >>> pte_t pte; >>> >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) >>> + if (!pgtable_soft_dirty_supported()) >>> return; >>> >>> pr_debug("Validating PTE swap soft dirty\n"); >>> @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) >>> { >>> pmd_t pmd; >>> >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) >>> + if (!pgtable_soft_dirty_supported()) >>> return; >>> >>> if (!has_transparent_hugepage()) >>> @@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) >>> { >>> pmd_t pmd; >>> >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || >>> - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) >>> + if (!pgtable_soft_dirty_supported() || >>> + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) >>> return; >>> >>> if (!has_transparent_hugepage()) >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>> index 9c38a95e9f09..218d430a2ec6 100644 >>> --- a/mm/huge_memory.c >>> +++ b/mm/huge_memory.c >>> @@ -2271,12 +2271,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, >>> >>> static pmd_t move_soft_dirty_pmd(pmd_t pmd) 
>>> { >>> -#ifdef CONFIG_MEM_SOFT_DIRTY >>> - if (unlikely(is_pmd_migration_entry(pmd))) >>> - pmd = pmd_swp_mksoft_dirty(pmd); >>> - else if (pmd_present(pmd)) >>> - pmd = pmd_mksoft_dirty(pmd); >>> -#endif >>> + if (pgtable_soft_dirty_supported()) { >>> + if (unlikely(is_pmd_migration_entry(pmd))) >>> + pmd = pmd_swp_mksoft_dirty(pmd); >>> + else if (pmd_present(pmd)) >>> + pmd = pmd_mksoft_dirty(pmd); >>> + } >>> + >> >> Wondering, should simply the arch take care of that and we can just clal >> pmd_swp_mksoft_dirty / pmd_mksoft_dirty? > I think we have that already in include/linux/pgtable.h: We have stubs that just don't do anything. For riscv support you would handle runtime-enablement in these helpers. > >> >>> return pmd; >>> } >>> >>> diff --git a/mm/internal.h b/mm/internal.h >>> index 45b725c3dc03..c6ca62f8ecf3 100644 >>> --- a/mm/internal.h >>> +++ b/mm/internal.h >>> @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) >>> * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) >>> * will be constantly true. >>> */ >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) >>> + if (!pgtable_soft_dirty_supported()) >>> return false; >>> >> >> That should be handled with the above never-set-VM_SOFTDIRTY. > > We don't need to check if (!pgtable_soft_dirty_supported()) if I > understand correctly. Hm, let me think about that. No, I think this has to stay as the comment says, so this case here is special. 
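
For illustration only (this is not code from the series): a minimal
userspace sketch of the two points discussed above, assuming made-up flag
values and a hypothetical runtime switch, platform_has_soft_dirty, in place
of a real architecture check. It models (a) callers that never set
VM_SOFTDIRTY when the platform cannot track it and (b) why the explicit
check in vma_soft_dirty_enabled() has to stay, since tracking counts as
enabled while the flag is clear. The helper new_mapping_flags() is likewise
an assumption, and the assert() merely stands in for the VM_WARN_ON_ONCE()
idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
#define VM_LOCKED	0x00002000UL
#define VM_SOFTDIRTY	0x08000000UL

/* Hypothetical runtime switch: does this platform have the soft-dirty bit? */
static bool platform_has_soft_dirty = true;

/* An "architecture" override, defined before the generic fallback below. */
#define pgtable_soft_dirty_supported()	(platform_has_soft_dirty)

/* Generic fallback, used only when no override exists. */
#ifndef pgtable_soft_dirty_supported
#define pgtable_soft_dirty_supported()	false
#endif

/* Stand-in for vm_flags_init(), with the sanity check as a plain assert. */
static void vm_flags_init(unsigned long *vm_flags, unsigned long flags)
{
	assert(pgtable_soft_dirty_supported() || !(flags & VM_SOFTDIRTY));
	*vm_flags = flags;
}

/* Callers add VM_SOFTDIRTY only when the platform can track it. */
static unsigned long new_mapping_flags(unsigned long flags)
{
	unsigned long vm_flags;

	if (pgtable_soft_dirty_supported())
		flags |= VM_SOFTDIRTY;
	vm_flags_init(&vm_flags, flags);
	return vm_flags;
}

/*
 * Tracking counts as enabled while VM_SOFTDIRTY is clear, so a flag that
 * is never set would read as "enabled" without the explicit check.
 */
static bool vma_soft_dirty_enabled(unsigned long vm_flags)
{
	if (!pgtable_soft_dirty_supported())
		return false;
	return !(vm_flags & VM_SOFTDIRTY);
}
```

With the switch off, new_mapping_flags() leaves VM_SOFTDIRTY clear, and only
the early return keeps vma_soft_dirty_enabled() from reporting true for such
a mapping.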

-- 
Cheers

David / dhildenb

From david at redhat.com Fri Sep 12 01:54:19 2025
From: david at redhat.com (David Hildenbrand)
Date: Fri, 12 Sep 2025 10:54:19 +0200
Subject: [PATCH v11 2/5] mm: userfaultfd: Add pgtable_uffd_wp_supported()
In-Reply-To: <20250911095602.1130290-3-zhangchunyan@iscas.ac.cn>
References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> <20250911095602.1130290-3-zhangchunyan@iscas.ac.cn>
Message-ID: <009f31e4-aba8-4ab4-b6f3-09244ca03e1c@redhat.com>

On 11.09.25 11:55, Chunyan Zhang wrote:
> Some platforms can customize the PTE/PMD entry uffd-wp bit making
> it unavailable even if the architecture provides the resource.
> This patch adds a macro API that allows architectures to define their
> specific implementations to check if the uffd-wp bit is available
> on which device the kernel is running.

If you change the name of the softdirty thingy, adjust that one here as well.

>
> Signed-off-by: Chunyan Zhang
> ---
>  fs/userfaultfd.c                   | 23 ++++++++--------
>  include/asm-generic/pgtable_uffd.h | 11 ++++++++
>  include/linux/mm_inline.h          |  7 +++++
>  include/linux/userfaultfd_k.h      | 44 +++++++++++++++++++-----------
>  mm/memory.c                        |  6 ++--
>  5 files changed, 62 insertions(+), 29 deletions(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 54c6cc7fe9c6..b549c327d7ad 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1270,9 +1270,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
>  		vm_flags |= VM_UFFD_MISSING;
>  	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
> -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
> -		goto out;
> -#endif
> +		if (!pgtable_uffd_wp_supported())
> +			goto out;
> +
>  		vm_flags |= VM_UFFD_WP;

I like that, similar to the softdirty thing we will simply not set the flag.

>  	}
>  	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR) {
> @@ -1980,14 +1980,15 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
>  	uffdio_api.features &=
>  		~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
>  #endif
> -#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
> -	uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
> -#endif
> -#ifndef CONFIG_PTE_MARKER_UFFD_WP
> -	uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
> -	uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED;
> -	uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC;
> -#endif
> +	if (!pgtable_uffd_wp_supported())
> +		uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
> +
> +	if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) ||
> +	    !pgtable_uffd_wp_supported()) {

I wonder if we would want to have a helper for that like

static inline bool uffd_supports_wp_marker(void)
{
	return pgtable_uffd_wp_supported() && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP);
}

That should clean all of this up further.

> +		uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
> +		uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED;
> +		uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC;
> +	}
>
>  	ret = -EINVAL;
>  	if (features & ~uffdio_api.features)
> diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
> index 828966d4c281..895d68ece0e7 100644
> --- a/include/asm-generic/pgtable_uffd.h
> +++ b/include/asm-generic/pgtable_uffd.h
> @@ -1,6 +1,17 @@
>  #ifndef _ASM_GENERIC_PGTABLE_UFFD_H
>  #define _ASM_GENERIC_PGTABLE_UFFD_H
>
> +/*
> + * Some platforms can customize the uffd-wp bit, making it unavailable
> + * even if the architecture provides the resource.
> + * Adding this API allows architectures to add their own checks for the
> + * devices on which the kernel is running.
> + * Note: When overiding it, please make sure the

s/overiding/overriding/

> + * CONFIG_HAVE_ARCH_USERFAULTFD_WP is part of this macro.
> + */ > +#ifndef pgtable_uffd_wp_supported > +#define pgtable_uffd_wp_supported() IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) > +#endif > #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP > static __always_inline int pte_uffd_wp(pte_t pte) > { > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h > index 89b518ff097e..38845b8b79ff 100644 > --- a/include/linux/mm_inline.h > +++ b/include/linux/mm_inline.h > @@ -571,6 +571,13 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, > pte_t *pte, pte_t pteval) > { > #ifdef CONFIG_PTE_MARKER_UFFD_WP > + /* > + * Some platforms can customize the PTE uffd-wp bit, making it unavailable > + * even if the architecture allows providing the PTE resource. > + */ > + if (!pgtable_uffd_wp_supported()) > + return false; > + Likely we could use the uffd_supports_wp_marker() wrapper here instead and remove the #ifdef. > bool arm_uffd_pte = false; > > /* The current status of the pte should be "cleared" before calling */ > diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h > index c0e716aec26a..6264b56ae961 100644 > --- a/include/linux/userfaultfd_k.h > +++ b/include/linux/userfaultfd_k.h > @@ -228,15 +228,15 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma, > if (wp_async && (vm_flags == VM_UFFD_WP)) > return true; > > -#ifndef CONFIG_PTE_MARKER_UFFD_WP > /* > * If user requested uffd-wp but not enabled pte markers for > * uffd-wp, then shmem & hugetlbfs are not supported but only > * anonymous. > */ > - if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) > + if ((!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || > + !pgtable_uffd_wp_supported()) && This would also use the helper.
> + (vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) > return false; > -#endif > > /* By default, allow any of anon|shmem|hugetlb */ > return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || > @@ -437,8 +437,11 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma) > static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) > { > #ifdef CONFIG_PTE_MARKER_UFFD_WP > - return is_pte_marker_entry(entry) && > - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); > + if (pgtable_uffd_wp_supported()) > + return is_pte_marker_entry(entry) && > + (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); > + else > + return false; if (!uffd_supports_wp_marker()) return false; return is_pte_marker_entry(entry) && (pte_marker_get(entry) & PTE_MARKER_UFFD_WP); > #else > return false; > #endif > @@ -447,14 +450,19 @@ static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry) > static inline bool pte_marker_uffd_wp(pte_t pte) > { > #ifdef CONFIG_PTE_MARKER_UFFD_WP Similarly here, just do a if (!uffd_supports_wp_marker()) return false; and remove the ifdef > #endif > @@ -467,14 +475,18 @@ static inline bool pte_marker_uffd_wp(pte_t pte) > static inline bool pte_swp_uffd_wp_any(pte_t pte) > { > #ifdef CONFIG_PTE_MARKER_UFFD_WP Same here.
> - if (!is_swap_pte(pte)) > - return false; > + if (pgtable_uffd_wp_supported()) { > + if (!is_swap_pte(pte)) > + return false; > > - if (pte_swp_uffd_wp(pte)) > - return true; > + if (pte_swp_uffd_wp(pte)) > + return true; > > - if (pte_marker_uffd_wp(pte)) > - return true; > + if (pte_marker_uffd_wp(pte)) > + return true; > + } else { > + return false; > + } > #endif > return false; > } > diff --git a/mm/memory.c b/mm/memory.c > index 0ba4f6b71847..4eb05c5f487b 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -1465,7 +1465,9 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, > { > bool was_installed = false; > > -#ifdef CONFIG_PTE_MARKER_UFFD_WP > + if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) || !pgtable_uffd_wp_supported()) > + return false; > + Same here. -- Cheers David / dhildenb From roypat at amazon.co.uk Fri Sep 12 02:17:39 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:39 +0000 Subject: [PATCH v6 06/11] KVM: selftests: load elf via bounce buffer In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-7-roypat@amazon.co.uk> If guest memory is backed using a VMA that does not allow GUP (e.g. a userspace mapping of guest_memfd when the fd was allocated using KVM_GMEM_NO_DIRECT_MAP), then directly loading the test ELF binary into it via read(2) potentially does not work. To nevertheless support loading binaries in these cases, do the read(2) syscall using a bounce buffer, and then memcpy from the bounce buffer into guest memory.
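For illustration only, the bounce-buffer idea described above can be sketched as a standalone userspace helper. This is not the selftest code from the diff below: read_via_bounce() is a hypothetical name, and the EOF and EINTR handling are assumptions of the sketch rather than behaviour taken from test_read().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the selftest's test_read_bounce()): read up to
 * count bytes from fd into a heap-allocated bounce buffer, then memcpy the
 * result into dst. read(2) only ever touches the bounce buffer, so dst may
 * be memory that the kernel cannot GUP.
 *
 * Returns the number of bytes copied, or -1 on allocation or read failure.
 */
static ssize_t read_via_bounce(int fd, void *dst, size_t count)
{
	char *bounce;
	size_t done = 0;

	if (count == 0)
		return 0;

	bounce = malloc(count);
	if (!bounce)
		return -1;

	while (done < count) {
		ssize_t n = read(fd, bounce + done, count - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			free(bounce);
			return -1;
		}
		if (n == 0)
			break;			/* EOF: copy what we have */
		done += (size_t)n;
	}

	/* The only access to dst is a plain userspace memcpy. */
	memcpy(dst, bounce, done);
	free(bounce);

	return (ssize_t)done;
}
```

A caller would pass the host virtual address of guest memory as dst, in the same position where the destination buffer of a direct read(2) would otherwise go.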
Signed-off-by: Patrick Roy --- .../testing/selftests/kvm/include/test_util.h | 1 + tools/testing/selftests/kvm/lib/elf.c | 8 +++---- tools/testing/selftests/kvm/lib/io.c | 23 +++++++++++++++++++ 3 files changed, 28 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index c6ef895fbd9a..0409b7b96c94 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -46,6 +46,7 @@ do { \ ssize_t test_write(int fd, const void *buf, size_t count); ssize_t test_read(int fd, void *buf, size_t count); +ssize_t test_read_bounce(int fd, void *buf, size_t count); int test_seq_read(const char *path, char **bufp, size_t *sizep); void __printf(5, 6) test_assert(bool exp, const char *exp_str, diff --git a/tools/testing/selftests/kvm/lib/elf.c b/tools/testing/selftests/kvm/lib/elf.c index f34d926d9735..e829fbe0a11e 100644 --- a/tools/testing/selftests/kvm/lib/elf.c +++ b/tools/testing/selftests/kvm/lib/elf.c @@ -31,7 +31,7 @@ static void elfhdr_get(const char *filename, Elf64_Ehdr *hdrp) * the real size of the ELF header. */ unsigned char ident[EI_NIDENT]; - test_read(fd, ident, sizeof(ident)); + test_read_bounce(fd, ident, sizeof(ident)); TEST_ASSERT((ident[EI_MAG0] == ELFMAG0) && (ident[EI_MAG1] == ELFMAG1) && (ident[EI_MAG2] == ELFMAG2) && (ident[EI_MAG3] == ELFMAG3), "ELF MAGIC Mismatch,\n" @@ -79,7 +79,7 @@ static void elfhdr_get(const char *filename, Elf64_Ehdr *hdrp) offset_rv = lseek(fd, 0, SEEK_SET); TEST_ASSERT(offset_rv == 0, "Seek to ELF header failed,\n" " rv: %zi expected: %i", offset_rv, 0); - test_read(fd, hdrp, sizeof(*hdrp)); + test_read_bounce(fd, hdrp, sizeof(*hdrp)); TEST_ASSERT(hdrp->e_phentsize == sizeof(Elf64_Phdr), "Unexpected physical header size,\n" " hdrp->e_phentsize: %x\n" @@ -146,7 +146,7 @@ void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename) /* Read in the program header. 
*/ Elf64_Phdr phdr; - test_read(fd, &phdr, sizeof(phdr)); + test_read_bounce(fd, &phdr, sizeof(phdr)); /* Skip if this header doesn't describe a loadable segment. */ if (phdr.p_type != PT_LOAD) @@ -187,7 +187,7 @@ void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename) " expected: 0x%jx", n1, errno, (intmax_t) offset_rv, (intmax_t) phdr.p_offset); - test_read(fd, addr_gva2hva(vm, phdr.p_vaddr), + test_read_bounce(fd, addr_gva2hva(vm, phdr.p_vaddr), phdr.p_filesz); } } diff --git a/tools/testing/selftests/kvm/lib/io.c b/tools/testing/selftests/kvm/lib/io.c index fedb2a741f0b..74419becc8bc 100644 --- a/tools/testing/selftests/kvm/lib/io.c +++ b/tools/testing/selftests/kvm/lib/io.c @@ -155,3 +155,26 @@ ssize_t test_read(int fd, void *buf, size_t count) return num_read; } + +/* Test read via intermediary buffer + * + * Same as test_read, except read(2)s happen into a bounce buffer that is memcpy'd + * to buf. For use with buffers that cannot be GUP'd (e.g. guest_memfd VMAs if + * guest_memfd was created with GUEST_MEMFD_FLAG_NO_DIRECT_MAP). + */ +ssize_t test_read_bounce(int fd, void *buf, size_t count) +{ + void *bounce_buffer; + ssize_t num_read; + + TEST_ASSERT(count >= 0, "Unexpected count, count: %li", count); + + bounce_buffer = malloc(count); + TEST_ASSERT(bounce_buffer != NULL, "Failed to allocate bounce buffer"); + + num_read = test_read(fd, bounce_buffer, count); + memcpy(buf, bounce_buffer, num_read); + free(bounce_buffer); + + return num_read; +} -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:32 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:32 +0000 Subject: [PATCH v6 02/11] arch: export set_direct_map_valid_noflush to KVM module In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-3-roypat@amazon.co.uk> Use the new per-module export functionality to allow KVM (and only KVM) access to set_direct_map_valid_noflush(). 
This allows guest_memfd to remove its memory from the direct map, even if KVM is built as a module. Direct map removal gives guest_memfd the same protection that memfd_secret enjoys, such as hardening against Spectre-like attacks through in-kernel gadgets. Reviewed-by: Fuad Tabba Signed-off-by: Patrick Roy --- arch/arm64/mm/pageattr.c | 1 + arch/loongarch/mm/pageattr.c | 1 + arch/riscv/mm/pageattr.c | 1 + arch/s390/mm/pageattr.c | 1 + arch/x86/mm/pat/set_memory.c | 1 + 5 files changed, 5 insertions(+) diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index 04d4a8f676db..4f3cddfab9b0 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -291,6 +291,7 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid) return set_memory_valid(addr, nr, valid); } +EXPORT_SYMBOL_FOR_MODULES(set_direct_map_valid_noflush, "kvm"); #ifdef CONFIG_DEBUG_PAGEALLOC /* diff --git a/arch/loongarch/mm/pageattr.c b/arch/loongarch/mm/pageattr.c index f5e910b68229..458f5ae6a89b 100644 --- a/arch/loongarch/mm/pageattr.c +++ b/arch/loongarch/mm/pageattr.c @@ -236,3 +236,4 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid) return __set_memory(addr, 1, set, clear); } +EXPORT_SYMBOL_FOR_MODULES(set_direct_map_valid_noflush, "kvm"); diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c index 3f76db3d2769..6db31040cd66 100644 --- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -400,6 +400,7 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid) return __set_memory((unsigned long)page_address(page), nr, set, clear); } +EXPORT_SYMBOL_FOR_MODULES(set_direct_map_valid_noflush, "kvm"); #ifdef CONFIG_DEBUG_PAGEALLOC static int debug_pagealloc_set_page(pte_t *pte, unsigned long addr, void *data) diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c index 348e759840e7..8ffd9ef09bc6 100644 --- a/arch/s390/mm/pageattr.c +++ b/arch/s390/mm/pageattr.c @@ -413,6 +413,7 @@ 
int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid) return __set_memory((unsigned long)page_to_virt(page), nr, flags); } +EXPORT_SYMBOL_FOR_MODULES(set_direct_map_valid_noflush, "kvm"); bool kernel_page_present(struct page *page) { diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 8834c76f91c9..87e9c7d2dcdc 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -2661,6 +2661,7 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid) return __set_pages_np(page, nr); } +EXPORT_SYMBOL_FOR_MODULES(set_direct_map_valid_noflush, "kvm"); #ifdef CONFIG_DEBUG_PAGEALLOC void __kernel_map_pages(struct page *page, int numpages, int enable) -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:31 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:31 +0000 Subject: [PATCH v6 01/11] filemap: Pass address_space mapping to ->free_folio() In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-2-roypat@amazon.co.uk> From: Elliot Berman When guest_memfd removes memory from the host kernel's direct map, direct map entries must be restored before the memory is freed again. To do so, ->free_folio() needs to know whether a gmem folio was direct map removed in the first place though. While possible to keep track of this information on each individual folio (e.g. via page flags), direct map removal is an all-or-nothing property of the entire guest_memfd, so it is less error prone to just check the flag stored in the gmem inode's private data. However, by the time ->free_folio() is called, folio->mapping might be cleared. To still allow access to the address space from which the folio was just removed, pass it in as an additional argument to ->free_folio, as the mapping is well-known to all callers. 
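To make the reasoning above concrete, here is a minimal userspace sketch of the callback shape. All types are simplified stand-ins for the kernel structures, and the demo_* names and the two boolean fields are hypothetical; the point is only that the callback can still consult per-mapping state after folio->mapping has been cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; fields are hypothetical. */
struct address_space;

struct folio {
	struct address_space *mapping;	/* may already be NULL at free time */
	bool in_direct_map;
};

struct address_space_operations {
	/* New-style signature: the mapping is passed in explicitly. */
	void (*free_folio)(struct address_space *, struct folio *);
};

struct address_space {
	const struct address_space_operations *a_ops;
	bool no_direct_map;	/* all-or-nothing property of the mapping */
};

/* Callback: decide based on the mapping, never on folio->mapping. */
static void demo_free_folio(struct address_space *mapping, struct folio *folio)
{
	if (mapping->no_direct_map)
		folio->in_direct_map = true;	/* "restore" the entry */
}

static const struct address_space_operations demo_aops = {
	.free_folio = demo_free_folio,
};

/* Caller: knows the mapping even though the folio no longer points at it. */
static void demo_remove_and_free(struct address_space *mapping,
				 struct folio *folio)
{
	void (*free_folio)(struct address_space *, struct folio *);

	folio->mapping = NULL;	/* mimics removal from the page cache */

	free_folio = mapping->a_ops->free_folio;
	if (free_folio)
		free_folio(mapping, folio);
}
```

With the old single-argument signature, demo_free_folio() would have had to dereference folio->mapping, which is NULL by the time it runs in this sketch.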
Link: https://lore.kernel.org/all/15f665b4-2d33-41ca-ac50-fafe24ade32f at redhat.com/ Suggested-by: David Hildenbrand Acked-by: David Hildenbrand Signed-off-by: Elliot Berman [patrick: rewrite shortlog for new usecase] Signed-off-by: Patrick Roy --- Documentation/filesystems/locking.rst | 2 +- fs/nfs/dir.c | 11 ++++++----- fs/orangefs/inode.c | 3 ++- include/linux/fs.h | 2 +- mm/filemap.c | 9 +++++---- mm/secretmem.c | 3 ++- mm/vmscan.c | 4 ++-- virt/kvm/guest_memfd.c | 3 ++- 8 files changed, 21 insertions(+), 16 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index aa287ccdac2f..74c97287ec40 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -262,7 +262,7 @@ prototypes:: sector_t (*bmap)(struct address_space *, sector_t); void (*invalidate_folio) (struct folio *, size_t start, size_t len); bool (*release_folio)(struct folio *, gfp_t); - void (*free_folio)(struct folio *); + void (*free_folio)(struct address_space *, struct folio *); int (*direct_IO)(struct kiocb *, struct iov_iter *iter); int (*migrate_folio)(struct address_space *, struct folio *dst, struct folio *src, enum migrate_mode); diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index d81217923936..644bd54e052c 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -55,7 +55,7 @@ static int nfs_closedir(struct inode *, struct file *); static int nfs_readdir(struct file *, struct dir_context *); static int nfs_fsync_dir(struct file *, loff_t, loff_t, int); static loff_t nfs_llseek_dir(struct file *, loff_t, int); -static void nfs_readdir_clear_array(struct folio *); +static void nfs_readdir_clear_array(struct address_space *, struct folio *); static int nfs_do_create(struct inode *dir, struct dentry *dentry, umode_t mode, int open_flags); @@ -218,7 +218,8 @@ static void nfs_readdir_folio_init_array(struct folio *folio, u64 last_cookie, /* * we are freeing strings created by nfs_add_to_readdir_array() */ -static void 
nfs_readdir_clear_array(struct folio *folio) +static void nfs_readdir_clear_array(struct address_space *mapping, + struct folio *folio) { struct nfs_cache_array *array; unsigned int i; @@ -233,7 +234,7 @@ static void nfs_readdir_clear_array(struct folio *folio) static void nfs_readdir_folio_reinit_array(struct folio *folio, u64 last_cookie, u64 change_attr) { - nfs_readdir_clear_array(folio); + nfs_readdir_clear_array(folio->mapping, folio); nfs_readdir_folio_init_array(folio, last_cookie, change_attr); } @@ -249,7 +250,7 @@ nfs_readdir_folio_array_alloc(u64 last_cookie, gfp_t gfp_flags) static void nfs_readdir_folio_array_free(struct folio *folio) { if (folio) { - nfs_readdir_clear_array(folio); + nfs_readdir_clear_array(folio->mapping, folio); folio_put(folio); } } @@ -391,7 +392,7 @@ static void nfs_readdir_folio_init_and_validate(struct folio *folio, u64 cookie, if (folio_test_uptodate(folio)) { if (nfs_readdir_folio_validate(folio, cookie, change_attr)) return; - nfs_readdir_clear_array(folio); + nfs_readdir_clear_array(folio->mapping, folio); } nfs_readdir_folio_init_array(folio, cookie, change_attr); folio_mark_uptodate(folio); diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index a01400cd41fd..37227ba71593 100644 --- a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -452,7 +452,8 @@ static bool orangefs_release_folio(struct folio *folio, gfp_t foo) return !folio_test_private(folio); } -static void orangefs_free_folio(struct folio *folio) +static void orangefs_free_folio(struct address_space *mapping, + struct folio *folio) { kfree(folio_detach_private(folio)); } diff --git a/include/linux/fs.h b/include/linux/fs.h index d7ab4f96d705..afb0748ffda6 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -457,7 +457,7 @@ struct address_space_operations { sector_t (*bmap)(struct address_space *, sector_t); void (*invalidate_folio) (struct folio *, size_t offset, size_t len); bool (*release_folio)(struct folio *, gfp_t); - void (*free_folio)(struct 
folio *folio); + void (*free_folio)(struct address_space *, struct folio *folio); ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter); /* * migrate the contents of a folio to the specified target. If diff --git a/mm/filemap.c b/mm/filemap.c index 751838ef05e5..3dd8ad922d80 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -226,11 +226,11 @@ void __filemap_remove_folio(struct folio *folio, void *shadow) void filemap_free_folio(struct address_space *mapping, struct folio *folio) { - void (*free_folio)(struct folio *); + void (*free_folio)(struct address_space *, struct folio *); free_folio = mapping->a_ops->free_folio; if (free_folio) - free_folio(folio); + free_folio(mapping, folio); folio_put_refs(folio, folio_nr_pages(folio)); } @@ -820,7 +820,8 @@ EXPORT_SYMBOL(file_write_and_wait_range); void replace_page_cache_folio(struct folio *old, struct folio *new) { struct address_space *mapping = old->mapping; - void (*free_folio)(struct folio *) = mapping->a_ops->free_folio; + void (*free_folio)(struct address_space *, struct folio *) = + mapping->a_ops->free_folio; pgoff_t offset = old->index; XA_STATE(xas, &mapping->i_pages, offset); @@ -849,7 +850,7 @@ void replace_page_cache_folio(struct folio *old, struct folio *new) __lruvec_stat_add_folio(new, NR_SHMEM); xas_unlock_irq(&xas); if (free_folio) - free_folio(old); + free_folio(mapping, old); folio_put(old); } EXPORT_SYMBOL_GPL(replace_page_cache_folio); diff --git a/mm/secretmem.c b/mm/secretmem.c index 60137305bc20..422dcaa32506 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -150,7 +150,8 @@ static int secretmem_migrate_folio(struct address_space *mapping, return -EBUSY; } -static void secretmem_free_folio(struct folio *folio) +static void secretmem_free_folio(struct address_space *mapping, + struct folio *folio) { set_direct_map_default_noflush(folio_page(folio, 0)); folio_zero_segment(folio, 0, folio_size(folio)); diff --git a/mm/vmscan.c b/mm/vmscan.c index a48aec8bfd92..559bd6ac965c 100644 --- 
a/mm/vmscan.c +++ b/mm/vmscan.c @@ -788,7 +788,7 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio, xa_unlock_irq(&mapping->i_pages); put_swap_folio(folio, swap); } else { - void (*free_folio)(struct folio *); + void (*free_folio)(struct address_space *, struct folio *); free_folio = mapping->a_ops->free_folio; /* @@ -817,7 +817,7 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio, spin_unlock(&mapping->host->i_lock); if (free_folio) - free_folio(folio); + free_folio(mapping, folio); } return 1; diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 08a6bc7d25b6..9ec4c45e3cf2 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -430,7 +430,8 @@ static int kvm_gmem_error_folio(struct address_space *mapping, struct folio *fol } #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE -static void kvm_gmem_free_folio(struct folio *folio) +static void kvm_gmem_free_folio(struct address_space *mapping, + struct folio *folio) { struct page *page = folio_page(folio, 0); kvm_pfn_t pfn = page_to_pfn(page); -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:36 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:36 +0000 Subject: [PATCH v6 04/11] KVM: guest_memfd: Add stub for kvm_arch_gmem_invalidate In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-5-roypat@amazon.co.uk> Add a no-op stub for kvm_arch_gmem_invalidate if CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE=n. This allows defining kvm_gmem_free_folio without ifdef-ery, which allows more cleanly using guest_memfd's free_folio callback for non-arch-invalidation related code. 
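As a rough illustration of the stub pattern described above, a userspace sketch follows. DEMO_HAVE_ARCH_INVALIDATE is a hypothetical switch standing in for CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE, and the counters exist only so the effect is observable; none of the demo_* names come from the kernel.

```c
#include <assert.h>

/*
 * Hypothetical switch standing in for CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE.
 * Left undefined here, so the no-op stub below is the one that gets built.
 */
/* #define DEMO_HAVE_ARCH_INVALIDATE 1 */

static unsigned long demo_invalidated_pages;
static unsigned long demo_freed_folios;

#ifdef DEMO_HAVE_ARCH_INVALIDATE
static void demo_arch_invalidate(unsigned long start, unsigned long end)
{
	demo_invalidated_pages += end - start;
}
#else
/* No-op stub: callers need no #ifdef of their own. */
static inline void demo_arch_invalidate(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
}
#endif

/*
 * Defined unconditionally. Other per-folio work can be added here without
 * caring whether the arch hook exists.
 */
static void demo_free_folio(unsigned long pfn, unsigned int order)
{
	demo_freed_folios++;
	demo_arch_invalidate(pfn, pfn + (1ul << order));
}
```

The design choice is that the conditional lives in exactly one place (the header-style declaration), so the callback and the ops table can always reference it.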
Signed-off-by: Patrick Roy --- include/linux/kvm_host.h | 2 ++ virt/kvm/guest_memfd.c | 4 ---- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 8b47891adca1..1d0585616aa3 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2573,6 +2573,8 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end); +#else +static inline void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) { } #endif #ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 9ec4c45e3cf2..81028984ff89 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -429,7 +429,6 @@ static int kvm_gmem_error_folio(struct address_space *mapping, struct folio *fol return MF_DELAYED; } -#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE static void kvm_gmem_free_folio(struct address_space *mapping, struct folio *folio) { @@ -439,15 +438,12 @@ static void kvm_gmem_free_folio(struct address_space *mapping, kvm_arch_gmem_invalidate(pfn, pfn + (1ul << order)); } -#endif static const struct address_space_operations kvm_gmem_aops = { .dirty_folio = noop_dirty_folio, .migrate_folio = kvm_gmem_migrate_folio, .error_remove_folio = kvm_gmem_error_folio, -#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE .free_folio = kvm_gmem_free_folio, -#endif }; static int kvm_gmem_setattr(struct mnt_idmap *idmap, struct dentry *dentry, -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:34 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:34 +0000 Subject: [PATCH v6 03/11] mm: introduce AS_NO_DIRECT_MAP In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-4-roypat@amazon.co.uk> Add AS_NO_DIRECT_MAP for mappings where direct map entries 
of folios are set to not present. Currently, mappings that match this description are secretmem mappings (memfd_secret()). Later, some guest_memfd configurations will also fall into this category. Reject this new type of mappings in all locations that currently reject secretmem mappings, on the assumption that if secretmem mappings are rejected somewhere, it is precisely because of an inability to deal with folios without direct map entries, and then make memfd_secret() use AS_NO_DIRECT_MAP on its address_space to drop its special vma_is_secretmem()/secretmem_mapping() checks. This drops an optimization in gup_fast_folio_allowed() where secretmem_mapping() was only called if CONFIG_SECRETMEM=y. secretmem is enabled by default since commit b758fe6df50d ("mm/secretmem: make it on by default"), so the secretmem check did not actually end up elided in most cases anymore anyway. Use a new flag instead of overloading AS_INACCESSIBLE (which is already set by guest_memfd) because not all guest_memfd mappings will end up being direct map removed (e.g. in pKVM setups, parts of guest_memfd that can be mapped to userspace should also be GUP-able, and generally not have restrictions on who can access it).
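A small userspace sketch of the flag-based check described above, for illustration. The demo_* types are simplified stand-ins for the kernel structures, and the plain bit operations here are an assumption of the sketch (the kernel helpers in the diff below use set_bit()/test_bit()).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit numbers mirror the enum in the diff; everything else is a stand-in. */
enum demo_mapping_flags {
	DEMO_AS_INACCESSIBLE = 8,
	DEMO_AS_NO_DIRECT_MAP = 10,
};

struct demo_address_space { unsigned long flags; };
struct demo_file { struct demo_address_space *f_mapping; };
struct demo_vma { struct demo_file *vm_file; };

/* Non-atomic for brevity; the kernel version uses set_bit(). */
static void demo_mapping_set_no_direct_map(struct demo_address_space *mapping)
{
	mapping->flags |= 1UL << DEMO_AS_NO_DIRECT_MAP;
}

static bool demo_mapping_no_direct_map(const struct demo_address_space *mapping)
{
	return (mapping->flags & (1UL << DEMO_AS_NO_DIRECT_MAP)) != 0;
}

/* Anonymous VMAs have no file and therefore never match. */
static bool demo_vma_has_no_direct_map(const struct demo_vma *vma)
{
	return vma->vm_file &&
	       demo_mapping_no_direct_map(vma->vm_file->f_mapping);
}

/* GUP-style gate: 0 if the VMA may be pinned, -EFAULT otherwise. */
static int demo_check_vma(const struct demo_vma *vma)
{
	return demo_vma_has_no_direct_map(vma) ? -EFAULT : 0;
}
```

Note that a mapping with only the AS_INACCESSIBLE-style bit set still passes this check, which is the reason the commit message gives for using a separate flag.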
Signed-off-by: Patrick Roy --- include/linux/pagemap.h | 16 ++++++++++++++++ include/linux/secretmem.h | 18 ------------------ lib/buildid.c | 4 ++-- mm/gup.c | 19 +++++-------------- mm/mlock.c | 2 +- mm/secretmem.c | 8 ++------ 6 files changed, 26 insertions(+), 41 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 12a12dae727d..1f5739f6a9f5 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -211,6 +211,7 @@ enum mapping_flags { folio contents */ AS_INACCESSIBLE = 8, /* Do not attempt direct R/W access to the mapping */ AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9, + AS_NO_DIRECT_MAP = 10, /* Folios in the mapping are not in the direct map */ /* Bits 16-25 are used for FOLIO_ORDER */ AS_FOLIO_ORDER_BITS = 5, AS_FOLIO_ORDER_MIN = 16, @@ -346,6 +347,21 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(struct address_spac return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags); } +static inline void mapping_set_no_direct_map(struct address_space *mapping) +{ + set_bit(AS_NO_DIRECT_MAP, &mapping->flags); +} + +static inline bool mapping_no_direct_map(const struct address_space *mapping) +{ + return test_bit(AS_NO_DIRECT_MAP, &mapping->flags); +} + +static inline bool vma_has_no_direct_map(const struct vm_area_struct *vma) +{ + return vma->vm_file && mapping_no_direct_map(vma->vm_file->f_mapping); +} + static inline gfp_t mapping_gfp_mask(struct address_space * mapping) { return mapping->gfp_mask; diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h index e918f96881f5..0ae1fb057b3d 100644 --- a/include/linux/secretmem.h +++ b/include/linux/secretmem.h @@ -4,28 +4,10 @@ #ifdef CONFIG_SECRETMEM -extern const struct address_space_operations secretmem_aops; - -static inline bool secretmem_mapping(struct address_space *mapping) -{ - return mapping->a_ops == &secretmem_aops; -} - -bool vma_is_secretmem(struct vm_area_struct *vma); bool secretmem_active(void); #else -static inline bool 
vma_is_secretmem(struct vm_area_struct *vma) -{ - return false; -} - -static inline bool secretmem_mapping(struct address_space *mapping) -{ - return false; -} - static inline bool secretmem_active(void) { return false; diff --git a/lib/buildid.c b/lib/buildid.c index c4b0f376fb34..89e567954284 100644 --- a/lib/buildid.c +++ b/lib/buildid.c @@ -65,8 +65,8 @@ static int freader_get_folio(struct freader *r, loff_t file_off) freader_put_folio(r); - /* reject secretmem folios created with memfd_secret() */ - if (secretmem_mapping(r->file->f_mapping)) + /* reject folios without direct map entries (e.g. from memfd_secret() or guest_memfd()) */ + if (mapping_no_direct_map(r->file->f_mapping)) return -EFAULT; r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT); diff --git a/mm/gup.c b/mm/gup.c index adffe663594d..75a0cffdf37d 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -11,7 +11,6 @@ #include #include #include -#include #include #include @@ -1234,7 +1233,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) if ((gup_flags & FOLL_SPLIT_PMD) && is_vm_hugetlb_page(vma)) return -EOPNOTSUPP; - if (vma_is_secretmem(vma)) + if (vma_has_no_direct_map(vma)) return -EFAULT; if (write) { @@ -2736,7 +2735,7 @@ EXPORT_SYMBOL(get_user_pages_unlocked); * This call assumes the caller has pinned the folio, that the lowest page table * level still points to this folio, and that interrupts have been disabled. * - * GUP-fast must reject all secretmem folios. + * GUP-fast must reject all folios without direct map entries (such as secretmem). * * Writing to pinned file-backed dirty tracked folios is inherently problematic * (see comment describing the writable_file_mapping_allowed() function). 
We @@ -2751,7 +2750,6 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags) { bool reject_file_backed = false; struct address_space *mapping; - bool check_secretmem = false; unsigned long mapping_flags; /* @@ -2763,18 +2761,10 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags) reject_file_backed = true; /* We hold a folio reference, so we can safely access folio fields. */ - - /* secretmem folios are always order-0 folios. */ - if (IS_ENABLED(CONFIG_SECRETMEM) && !folio_test_large(folio)) - check_secretmem = true; - - if (!reject_file_backed && !check_secretmem) - return true; - if (WARN_ON_ONCE(folio_test_slab(folio))) return false; - /* hugetlb neither requires dirty-tracking nor can be secretmem. */ + /* hugetlb neither requires dirty-tracking nor can be without direct map. */ if (folio_test_hugetlb(folio)) return true; @@ -2812,8 +2802,9 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags) * At this point, we know the mapping is non-null and points to an * address_space object. */ - if (check_secretmem && secretmem_mapping(mapping)) + if (mapping_no_direct_map(mapping)) return false; + /* The only remaining allowed file system is shmem. 
*/ return !reject_file_backed || shmem_mapping(mapping); } diff --git a/mm/mlock.c b/mm/mlock.c index a1d93ad33c6d..36f5e70faeb0 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -474,7 +474,7 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, if (newflags == oldflags || (oldflags & VM_SPECIAL) || is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) || - vma_is_dax(vma) || vma_is_secretmem(vma) || (oldflags & VM_DROPPABLE)) + vma_is_dax(vma) || vma_has_no_direct_map(vma) || (oldflags & VM_DROPPABLE)) /* don't set VM_LOCKED or VM_LOCKONFAULT and don't count */ goto out; diff --git a/mm/secretmem.c b/mm/secretmem.c index 422dcaa32506..b5ce55079695 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -134,11 +134,6 @@ static int secretmem_mmap_prepare(struct vm_area_desc *desc) return 0; } -bool vma_is_secretmem(struct vm_area_struct *vma) -{ - return vma->vm_ops == &secretmem_vm_ops; -} - static const struct file_operations secretmem_fops = { .release = secretmem_release, .mmap_prepare = secretmem_mmap_prepare, @@ -157,7 +152,7 @@ static void secretmem_free_folio(struct address_space *mapping, folio_zero_segment(folio, 0, folio_size(folio)); } -const struct address_space_operations secretmem_aops = { +static const struct address_space_operations secretmem_aops = { .dirty_folio = noop_dirty_folio, .free_folio = secretmem_free_folio, .migrate_folio = secretmem_migrate_folio, @@ -206,6 +201,7 @@ static struct file *secretmem_file_create(unsigned long flags) mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER); mapping_set_unevictable(inode->i_mapping); + mapping_set_no_direct_map(inode->i_mapping); inode->i_op = &secretmem_iops; inode->i_mapping->a_ops = &secretmem_aops; -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:41 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:41 +0000 Subject: [PATCH v6 07/11] KVM: selftests: set KVM_MEM_GUEST_MEMFD in vm_mem_add() if guest_memfd != -1 In-Reply-To: 
<20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-8-roypat@amazon.co.uk> Have vm_mem_add() always set KVM_MEM_GUEST_MEMFD in the memslot flags if a guest_memfd is passed in as an argument. This eliminates the possibility where a guest_memfd instance is passed to vm_mem_add(), but it ends up being ignored because the flags argument does not specify KVM_MEM_GUEST_MEMFD at the same time. This makes it easy to support more scenarios in which vm_mem_add() is not passed a guest_memfd instance, but is expected to allocate one. Currently, this only happens if guest_memfd == -1 but flags & KVM_MEM_GUEST_MEMFD != 0, but later vm_mem_add() will gain support for loading the test code itself into guest_memfd (via GUEST_MEMFD_FLAG_MMAP) if requested via a special vm_mem_backing_src_type, at which point having to make sure the src_type and flags are in-sync becomes cumbersome. Signed-off-by: Patrick Roy --- tools/testing/selftests/kvm/lib/kvm_util.c | 26 +++++++++++++--------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index c3f5142b0a54..cc67dfecbf65 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1107,22 +1107,26 @@ void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, region->backing_src_type = src_type; - if (flags & KVM_MEM_GUEST_MEMFD) { - if (guest_memfd < 0) { + if (guest_memfd < 0) { + if (flags & KVM_MEM_GUEST_MEMFD) { uint32_t guest_memfd_flags = 0; TEST_ASSERT(!guest_memfd_offset, "Offset must be zero when creating new guest_memfd"); guest_memfd = vm_create_guest_memfd(vm, mem_size, guest_memfd_flags); - } else { - /* - * Install a unique fd for each memslot so that the fd - * can be closed when the region is deleted without - * needing to track if the fd is owned by the framework - * or by the
caller. - */ - guest_memfd = dup(guest_memfd); - TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); } + } else { + /* + * Install a unique fd for each memslot so that the fd + * can be closed when the region is deleted without + * needing to track if the fd is owned by the framework + * or by the caller. + */ + guest_memfd = dup(guest_memfd); + TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); + } + + if (guest_memfd > 0) { + flags |= KVM_MEM_GUEST_MEMFD; region->region.guest_memfd = guest_memfd; region->region.guest_memfd_offset = guest_memfd_offset; -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:29 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:29 +0000 Subject: [PATCH v6 00/11] Direct Map Removal Support for guest_memfd Message-ID: <20250912091708.17502-1-roypat@amazon.co.uk> [ based on kvm/next ] Unmapping virtual machine guest memory from the host kernel's direct map is a successful mitigation against Spectre-style transient execution issues: If the kernel page tables do not contain entries pointing to guest memory, then any attempted speculative read through the direct map will necessarily be blocked by the MMU before any observable microarchitectural side-effects happen. This means that Spectre-gadgets and similar cannot be used to target virtual machine memory. Roughly 60% of speculative execution issues fall into this category [1, Table 1]. This patch series extends guest_memfd with the ability to remove its memory from the host kernel's direct map, to be able to attain the above protection for KVM guests running inside guest_memfd. Additionally, a Firecracker branch with support for these VMs can be found on GitHub [2]. For more details, please refer to the v5 cover letter [v5]. No substantial changes in design have taken place since. 
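The userspace-visible contract described above can be modelled in a few lines of plain C. This is only a minimal sketch, not code from the series: the flag values are the ones this series adds to include/uapi/linux/kvm.h, while struct gmem_caps, gmem_pick_flags() and gmem_validate_flags() are hypothetical names introduced here for illustration. The validation helper mirrors the valid_flags check that kvm_gmem_create() performs, assuming the two capabilities map one-to-one onto the two flags.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined by this series in include/uapi/linux/kvm.h. */
#define GUEST_MEMFD_FLAG_MMAP           (1ULL << 0)
#define GUEST_MEMFD_FLAG_NO_DIRECT_MAP  (1ULL << 1)

/*
 * Hypothetical stand-in for the results of the two capability queries
 * a VMM would issue before creating a guest_memfd.
 */
struct gmem_caps {
	bool mmap;           /* KVM_CAP_GUEST_MEMFD_MMAP */
	bool no_direct_map;  /* KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP */
};

/* Userspace side: only request the flags that were advertised. */
uint64_t gmem_pick_flags(struct gmem_caps caps)
{
	uint64_t flags = 0;

	if (caps.mmap)
		flags |= GUEST_MEMFD_FLAG_MMAP;
	if (caps.no_direct_map)
		flags |= GUEST_MEMFD_FLAG_NO_DIRECT_MAP;

	return flags;
}

/*
 * Model of the flag validation in kvm_gmem_create(): any flag outside
 * the supported set is rejected with -EINVAL.
 */
int gmem_validate_flags(struct gmem_caps caps, uint64_t flags)
{
	uint64_t valid_flags = gmem_pick_flags(caps);

	if (flags & ~valid_flags)
		return -EINVAL;

	return 0;
}
```

A VMM that skips the capability check and passes GUEST_MEMFD_FLAG_NO_DIRECT_MAP on a host that cannot manipulate the direct map would, under this model, see the creation fail with -EINVAL rather than silently getting a guest_memfd that is still present in the direct map.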
=== Changes Since v5 === - Fix up error handling for set_direct_map_[in]valid_noflush() (Mike) - Fix capability check for KVM_GUEST_MEMFD_NO_DIRECT_MAP (Mike) - Make secretmem_aops static in mm/secretmem.c (Mike) - Fixup some more comments in gup.c that referred to secretmem specifically to instead point to AS_NO_DIRECT_MAP (Mike) - New patch (PATCH 4/11) to avoid ifdeffery in kvm_gmem_free_folio() (Mike) - vma_is_no_direct_map() -> vma_has_no_direct_map() rename (David) - Squash some patches (David) - Fix up const-ness of parameters to new functions in pagemap.h (Fuad) [1]: https://download.vusec.net/papers/quarantine_raid23.pdf [2]: https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding [RFCv1]: https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat at amazon.co.uk/ [RFCv2]: https://lore.kernel.org/kvm/20240910163038.1298452-1-roypat at amazon.co.uk/ [RFCv3]: https://lore.kernel.org/kvm/20241030134912.515725-1-roypat at amazon.co.uk/ [v4]: https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat at amazon.co.uk/ [v5]: https://lore.kernel.org/kvm/20250828093902.2719-1-roypat at amazon.co.uk/ Elliot Berman (1): filemap: Pass address_space mapping to ->free_folio() Patrick Roy (10): arch: export set_direct_map_valid_noflush to KVM module mm: introduce AS_NO_DIRECT_MAP KVM: guest_memfd: Add stub for kvm_arch_gmem_invalidate KVM: guest_memfd: Add flag to remove from direct map KVM: selftests: load elf via bounce buffer KVM: selftests: set KVM_MEM_GUEST_MEMFD in vm_mem_add() if guest_memfd != -1 KVM: selftests: Add guest_memfd based vm_mem_backing_src_types KVM: selftests: stuff vm_mem_backing_src_type into vm_shape KVM: selftests: cover GUEST_MEMFD_FLAG_NO_DIRECT_MAP in existing selftests KVM: selftests: Test guest execution from direct map removed gmem Documentation/filesystems/locking.rst | 2 +- Documentation/virt/kvm/api.rst | 5 ++ arch/arm64/include/asm/kvm_host.h | 12 ++++ arch/arm64/mm/pageattr.c | 1 + arch/loongarch/mm/pageattr.c 
| 1 + arch/riscv/mm/pageattr.c | 1 + arch/s390/mm/pageattr.c | 1 + arch/x86/mm/pat/set_memory.c | 1 + fs/nfs/dir.c | 11 ++-- fs/orangefs/inode.c | 3 +- include/linux/fs.h | 2 +- include/linux/kvm_host.h | 9 +++ include/linux/pagemap.h | 16 +++++ include/linux/secretmem.h | 18 ------ include/uapi/linux/kvm.h | 2 + lib/buildid.c | 4 +- mm/filemap.c | 9 +-- mm/gup.c | 19 ++---- mm/mlock.c | 2 +- mm/secretmem.c | 11 ++-- mm/vmscan.c | 4 +- .../testing/selftests/kvm/guest_memfd_test.c | 2 + .../testing/selftests/kvm/include/kvm_util.h | 37 ++++++++--- .../testing/selftests/kvm/include/test_util.h | 8 +++ tools/testing/selftests/kvm/lib/elf.c | 8 +-- tools/testing/selftests/kvm/lib/io.c | 23 +++++++ tools/testing/selftests/kvm/lib/kvm_util.c | 61 +++++++++++-------- tools/testing/selftests/kvm/lib/test_util.c | 8 +++ tools/testing/selftests/kvm/lib/x86/sev.c | 1 + .../selftests/kvm/pre_fault_memory_test.c | 1 + .../selftests/kvm/set_memory_region_test.c | 50 +++++++++++++-- .../kvm/x86/private_mem_conversions_test.c | 7 ++- virt/kvm/guest_memfd.c | 56 ++++++++++++++--- virt/kvm/kvm_main.c | 5 ++ 34 files changed, 288 insertions(+), 113 deletions(-) base-commit: a6ad54137af92535cfe32e19e5f3bc1bb7dbd383 -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:43 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:43 +0000 Subject: [PATCH v6 08/11] KVM: selftests: Add guest_memfd based vm_mem_backing_src_types In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-9-roypat@amazon.co.uk> Allow selftests to configure their memslots such that userspace_addr is set to a MAP_SHARED mapping of the guest_memfd that's associated with the memslot. 
This setup is the configuration for non-CoCo VMs, where all guest memory is backed by a guest_memfd whose folios are all marked shared, but KVM is still able to access guest memory to provide functionality such as MMIO emulation on x86. Add backing types for normal guest_memfd, as well as direct map removed guest_memfd. Signed-off-by: Patrick Roy --- .../testing/selftests/kvm/include/kvm_util.h | 18 ++++++ .../testing/selftests/kvm/include/test_util.h | 7 +++ tools/testing/selftests/kvm/lib/kvm_util.c | 63 ++++++++++--------- tools/testing/selftests/kvm/lib/test_util.c | 8 +++ 4 files changed, 66 insertions(+), 30 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 23a506d7eca3..5204a0a18a7f 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -635,6 +635,24 @@ static inline bool is_smt_on(void) void vm_create_irqchip(struct kvm_vm *vm); +static inline uint32_t backing_src_guest_memfd_flags(enum vm_mem_backing_src_type t) +{ + uint32_t flags = 0; + + switch (t) { + case VM_MEM_SRC_GUEST_MEMFD: + flags |= GUEST_MEMFD_FLAG_MMAP; + fallthrough; + case VM_MEM_SRC_GUEST_MEMFD_NO_DIRECT_MAP: + flags |= GUEST_MEMFD_FLAG_NO_DIRECT_MAP; + break; + default: + break; + } + + return flags; +} + static inline int __vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size, uint64_t flags) { diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 0409b7b96c94..a56e53fc7b39 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -133,6 +133,8 @@ enum vm_mem_backing_src_type { VM_MEM_SRC_ANONYMOUS_HUGETLB_16GB, VM_MEM_SRC_SHMEM, VM_MEM_SRC_SHARED_HUGETLB, + VM_MEM_SRC_GUEST_MEMFD, + VM_MEM_SRC_GUEST_MEMFD_NO_DIRECT_MAP, NUM_SRC_TYPES, }; @@ -165,6 +167,11 @@ static inline bool backing_src_is_shared(enum vm_mem_backing_src_type 
t) return vm_mem_backing_src_alias(t)->flag & MAP_SHARED; } +static inline bool backing_src_is_guest_memfd(enum vm_mem_backing_src_type t) +{ + return t == VM_MEM_SRC_GUEST_MEMFD || t == VM_MEM_SRC_GUEST_MEMFD_NO_DIRECT_MAP; +} + static inline bool backing_src_can_be_huge(enum vm_mem_backing_src_type t) { return t != VM_MEM_SRC_ANONYMOUS && t != VM_MEM_SRC_SHMEM; diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index cc67dfecbf65..a81089f7c83f 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1060,6 +1060,34 @@ void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, alignment = 1; #endif + if (guest_memfd < 0) { + if ((flags & KVM_MEM_GUEST_MEMFD) || backing_src_is_guest_memfd(src_type)) { + uint32_t guest_memfd_flags = backing_src_guest_memfd_flags(src_type); + + TEST_ASSERT(!guest_memfd_offset, + "Offset must be zero when creating new guest_memfd"); + guest_memfd = vm_create_guest_memfd(vm, mem_size, guest_memfd_flags); + } + } else { + /* + * Install a unique fd for each memslot so that the fd + * can be closed when the region is deleted without + * needing to track if the fd is owned by the framework + * or by the caller. + */ + guest_memfd = dup(guest_memfd); + TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); + } + + if (guest_memfd > 0) { + flags |= KVM_MEM_GUEST_MEMFD; + + region->region.guest_memfd = guest_memfd; + region->region.guest_memfd_offset = guest_memfd_offset; + } else { + region->region.guest_memfd = -1; + } + /* * When using THP mmap is not guaranteed to returned a hugepage aligned * address so we have to pad the mmap. 
Padding is not needed for HugeTLB @@ -1075,10 +1103,13 @@ void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, if (alignment > 1) region->mmap_size += alignment; - region->fd = -1; - if (backing_src_is_shared(src_type)) + if (backing_src_is_guest_memfd(src_type)) + region->fd = guest_memfd; + else if (backing_src_is_shared(src_type)) region->fd = kvm_memfd_alloc(region->mmap_size, src_type == VM_MEM_SRC_SHARED_HUGETLB); + else + region->fd = -1; region->mmap_start = mmap(NULL, region->mmap_size, PROT_READ | PROT_WRITE, @@ -1106,34 +1137,6 @@ void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, } region->backing_src_type = src_type; - - if (guest_memfd < 0) { - if (flags & KVM_MEM_GUEST_MEMFD) { - uint32_t guest_memfd_flags = 0; - TEST_ASSERT(!guest_memfd_offset, - "Offset must be zero when creating new guest_memfd"); - guest_memfd = vm_create_guest_memfd(vm, mem_size, guest_memfd_flags); - } - } else { - /* - * Install a unique fd for each memslot so that the fd - * can be closed when the region is deleted without - * needing to track if the fd is owned by the framework - * or by the caller. 
- */ - guest_memfd = dup(guest_memfd); - TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); - } - - if (guest_memfd > 0) { - flags |= KVM_MEM_GUEST_MEMFD; - - region->region.guest_memfd = guest_memfd; - region->region.guest_memfd_offset = guest_memfd_offset; - } else { - region->region.guest_memfd = -1; - } - region->unused_phy_pages = sparsebit_alloc(); if (vm_arch_has_protected_memory(vm)) region->protected_phy_pages = sparsebit_alloc(); diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index 03eb99af9b8d..b2baee680083 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -299,6 +299,14 @@ const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i) */ .flag = MAP_SHARED, }, + [VM_MEM_SRC_GUEST_MEMFD] = { + .name = "guest_memfd", + .flag = MAP_SHARED, + }, + [VM_MEM_SRC_GUEST_MEMFD_NO_DIRECT_MAP] = { + .name = "guest_memfd_no_direct_map", + .flag = MAP_SHARED, + } }; _Static_assert(ARRAY_SIZE(aliases) == NUM_SRC_TYPES, "Missing new backing src types?"); -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:37 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:37 +0000 Subject: [PATCH v6 05/11] KVM: guest_memfd: Add flag to remove from direct map In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-6-roypat@amazon.co.uk> Add GUEST_MEMFD_FLAG_NO_DIRECT_MAP flag for KVM_CREATE_GUEST_MEMFD() ioctl. When set, guest_memfd folios will be removed from the direct map after preparation, with direct map entries only restored when the folios are freed. To ensure these folios do not end up in places where the kernel cannot deal with them, set AS_NO_DIRECT_MAP on the guest_memfd's struct address_space if GUEST_MEMFD_FLAG_NO_DIRECT_MAP is requested. 
Add KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP to let userspace discover whether guest_memfd supports GUEST_MEMFD_FLAG_NO_DIRECT_MAP. Support depends on guest_memfd itself being supported, but also on whether Linux supports manipulating the direct map at page granularity at all. This is possible most of the time; the outliers are arm64, where it is impossible if the direct map has been set up using hugepages (arm64 cannot break these apart due to break-before-make semantics), and powerpc, which does not select ARCH_HAS_SET_DIRECT_MAP but also does not support guest_memfd anyway. Note that this flag causes removal of direct map entries for all guest_memfd folios independent of whether they are "shared" or "private" (although current guest_memfd only supports either all folios in the "shared" state, or all folios in the "private" state if GUEST_MEMFD_FLAG_MMAP is not set). The use case for also removing direct map entries of the shared parts of guest_memfd is a special type of non-CoCo VM where host userspace is trusted to have access to all of guest memory, but where Spectre-style transient execution attacks through the host kernel's direct map should still be mitigated. In this setup, KVM retains access to guest memory via userspace mappings of guest_memfd, which are reflected back into KVM's memslots via userspace_addr. This is needed for things like MMIO emulation on x86_64 to work. Do not perform TLB flushes after direct map manipulations. This is because TLB flushes resulted in an up to 40x elongation of page faults in guest_memfd (scaling with the number of CPU cores), or a 5x elongation of memory population. TLB flushes are not needed for functional correctness (the virt->phys mapping technically stays "correct", the kernel should simply not use it for a while).
On the other hand, it means that the desired protection from Spectre-style attacks is not perfect, as an attacker could try to prevent a stale TLB entry from getting evicted, keeping it alive until the page it refers to is used by the guest for some sensitive data, and then targeting it using a Spectre gadget. Signed-off-by: Patrick Roy --- Documentation/virt/kvm/api.rst | 5 ++++ arch/arm64/include/asm/kvm_host.h | 12 ++++++++ include/linux/kvm_host.h | 7 +++++ include/uapi/linux/kvm.h | 2 ++ virt/kvm/guest_memfd.c | 49 +++++++++++++++++++++++++++---- virt/kvm/kvm_main.c | 5 ++++ 6 files changed, 75 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index c17a87a0a5ac..b52c14d58798 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6418,6 +6418,11 @@ When the capability KVM_CAP_GUEST_MEMFD_MMAP is supported, the 'flags' field supports GUEST_MEMFD_FLAG_MMAP. Setting this flag on guest_memfd creation enables mmap() and faulting of guest_memfd memory to host userspace. +When the capability KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP is supported, the 'flags' field +supports GUEST_MEMFD_FLAG_NO_DIRECT_MAP. Setting this flag makes the guest_memfd +instance behave similarly to memfd_secret, and unmaps the memory backing it from +the kernel's address space after allocation.
+ When the KVM MMU performs a PFN lookup to service a guest fault and the backing guest_memfd has the GUEST_MEMFD_FLAG_MMAP set, then the fault will always be consumed from guest_memfd, regardless of whether it is a shared or a private diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 2f2394cce24e..0bfd8e5fd9de 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -1706,5 +1707,16 @@ void compute_fgu(struct kvm *kvm, enum fgt_group_id fgt); void get_reg_fixed_bits(struct kvm *kvm, enum vcpu_sysreg reg, u64 *res0, u64 *res1); void check_feature_map(void); +#ifdef CONFIG_KVM_GUEST_MEMFD +static inline bool kvm_arch_gmem_supports_no_direct_map(void) +{ + /* + * Without FWB, direct map access is needed in kvm_pgtable_stage2_map(), + * as it calls dcache_clean_inval_poc(). + */ + return can_set_direct_map() && cpus_have_final_cap(ARM64_HAS_STAGE2_FWB); +} +#define kvm_arch_gmem_supports_no_direct_map kvm_arch_gmem_supports_no_direct_map +#endif /* CONFIG_KVM_GUEST_MEMFD */ #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 1d0585616aa3..a9468bce55f2 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -36,6 +36,7 @@ #include #include #include +#include #include #include @@ -731,6 +732,12 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm) bool kvm_arch_supports_gmem_mmap(struct kvm *kvm); #endif +#ifdef CONFIG_KVM_GUEST_MEMFD +#ifndef kvm_arch_gmem_supports_no_direct_map +#define kvm_arch_gmem_supports_no_direct_map can_set_direct_map +#endif +#endif /* CONFIG_KVM_GUEST_MEMFD */ + #ifndef kvm_arch_has_readonly_mem static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm) { diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 6efa98a57ec1..33c8e8946019 100644 --- a/include/uapi/linux/kvm.h +++ 
b/include/uapi/linux/kvm.h @@ -963,6 +963,7 @@ struct kvm_enable_cap { #define KVM_CAP_RISCV_MP_STATE_RESET 242 #define KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED 243 #define KVM_CAP_GUEST_MEMFD_MMAP 244 +#define KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP 245 struct kvm_irq_routing_irqchip { __u32 irqchip; @@ -1600,6 +1601,7 @@ struct kvm_memory_attributes { #define KVM_CREATE_GUEST_MEMFD _IOWR(KVMIO, 0xd4, struct kvm_create_guest_memfd) #define GUEST_MEMFD_FLAG_MMAP (1ULL << 0) +#define GUEST_MEMFD_FLAG_NO_DIRECT_MAP (1ULL << 1) struct kvm_create_guest_memfd { __u64 size; diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 81028984ff89..3c64099fc98a 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -4,6 +4,7 @@ #include #include #include +#include #include "kvm_mm.h" @@ -42,9 +43,24 @@ static int __kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slo return 0; } -static inline void kvm_gmem_mark_prepared(struct folio *folio) +static bool kvm_gmem_test_no_direct_map(struct inode *inode) { - folio_mark_uptodate(folio); + return ((unsigned long) inode->i_private) & GUEST_MEMFD_FLAG_NO_DIRECT_MAP; +} + +static inline int kvm_gmem_mark_prepared(struct folio *folio) +{ + struct inode *inode = folio_inode(folio); + int r = 0; + + if (kvm_gmem_test_no_direct_map(inode)) + r = set_direct_map_valid_noflush(folio_page(folio, 0), folio_nr_pages(folio), + false); + + if (!r) + folio_mark_uptodate(folio); + + return r; } /* @@ -82,7 +98,7 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot, index = ALIGN_DOWN(index, 1 << folio_order(folio)); r = __kvm_gmem_prepare_folio(kvm, slot, index, folio); if (!r) - kvm_gmem_mark_prepared(folio); + r = kvm_gmem_mark_prepared(folio); return r; } @@ -344,8 +360,15 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf) } if (!folio_test_uptodate(folio)) { + int err = 0; + clear_highpage(folio_page(folio, 0)); - kvm_gmem_mark_prepared(folio); + err = 
kvm_gmem_mark_prepared(folio); + + if (err) { + ret = vmf_error(err); + goto out_folio; + } } vmf->page = folio_file_page(folio, vmf->pgoff); @@ -436,6 +459,16 @@ static void kvm_gmem_free_folio(struct address_space *mapping, kvm_pfn_t pfn = page_to_pfn(page); int order = folio_order(folio); + /* + * Direct map restoration cannot fail, as the only error condition + * for direct map manipulation is failure to allocate page tables + * when splitting huge pages, but this split would have already + * happened in set_direct_map_invalid_noflush() in kvm_gmem_mark_prepared(). + * Thus set_direct_map_valid_noflush() here only updates prot bits. + */ + if (kvm_gmem_test_no_direct_map(mapping->host)) + set_direct_map_valid_noflush(page, folio_nr_pages(folio), true); + kvm_arch_gmem_invalidate(pfn, pfn + (1ul << order)); } @@ -500,6 +533,9 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags) /* Unmovable mappings are supposed to be marked unevictable as well. */ WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping)); + if (flags & GUEST_MEMFD_FLAG_NO_DIRECT_MAP) + mapping_set_no_direct_map(inode->i_mapping); + kvm_get_kvm(kvm); gmem->kvm = kvm; xa_init(&gmem->bindings); @@ -524,6 +560,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args) if (kvm_arch_supports_gmem_mmap(kvm)) valid_flags |= GUEST_MEMFD_FLAG_MMAP; + if (kvm_arch_gmem_supports_no_direct_map()) + valid_flags |= GUEST_MEMFD_FLAG_NO_DIRECT_MAP; + if (flags & ~valid_flags) return -EINVAL; @@ -768,7 +807,7 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long p = src ? 
src + i * PAGE_SIZE : NULL; ret = post_populate(kvm, gfn, pfn, p, max_order, opaque); if (!ret) - kvm_gmem_mark_prepared(folio); + ret = kvm_gmem_mark_prepared(folio); put_folio_and_exit: folio_put(folio); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 18f29ef93543..b5e702d95230 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -65,6 +65,7 @@ #include #include +#include /* Worst case buffer size needed for holding an integer. */ @@ -4916,6 +4917,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) return kvm_supported_mem_attributes(kvm); #endif #ifdef CONFIG_KVM_GUEST_MEMFD + case KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP: + if (!kvm_arch_gmem_supports_no_direct_map()) + return 0; + fallthrough; case KVM_CAP_GUEST_MEMFD: return 1; case KVM_CAP_GUEST_MEMFD_MMAP: -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:46 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:46 +0000 Subject: [PATCH v6 10/11] KVM: selftests: cover GUEST_MEMFD_FLAG_NO_DIRECT_MAP in existing selftests In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-11-roypat@amazon.co.uk> Extend mem conversion selftests to cover the scenario that the guest can fault in and write gmem-backed guest memory even if its direct map entries have been removed. Also cover the new flag in guest_memfd_test.c tests.
Signed-off-by: Patrick Roy --- tools/testing/selftests/kvm/guest_memfd_test.c | 2 ++ .../selftests/kvm/x86/private_mem_conversions_test.c | 7 ++++--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c index b3ca6737f304..1187438b6831 100644 --- a/tools/testing/selftests/kvm/guest_memfd_test.c +++ b/tools/testing/selftests/kvm/guest_memfd_test.c @@ -275,6 +275,8 @@ static void test_guest_memfd(unsigned long vm_type) if (vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_MMAP)) flags |= GUEST_MEMFD_FLAG_MMAP; + if (vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP)) + flags |= GUEST_MEMFD_FLAG_NO_DIRECT_MAP; test_create_guest_memfd_multiple(vm); test_create_guest_memfd_invalid_sizes(vm, flags, page_size); diff --git a/tools/testing/selftests/kvm/x86/private_mem_conversions_test.c b/tools/testing/selftests/kvm/x86/private_mem_conversions_test.c index 82a8d88b5338..8427d9fbdb23 100644 --- a/tools/testing/selftests/kvm/x86/private_mem_conversions_test.c +++ b/tools/testing/selftests/kvm/x86/private_mem_conversions_test.c @@ -367,7 +367,7 @@ static void *__test_mem_conversions(void *__vcpu) } static void test_mem_conversions(enum vm_mem_backing_src_type src_type, uint32_t nr_vcpus, - uint32_t nr_memslots) + uint32_t nr_memslots, uint64_t gmem_flags) { /* * Allocate enough memory so that each vCPU's chunk of memory can be @@ -394,7 +394,7 @@ static void test_mem_conversions(enum vm_mem_backing_src_type src_type, uint32_t vm_enable_cap(vm, KVM_CAP_EXIT_HYPERCALL, (1 << KVM_HC_MAP_GPA_RANGE)); - memfd = vm_create_guest_memfd(vm, memfd_size, 0); + memfd = vm_create_guest_memfd(vm, memfd_size, gmem_flags); for (i = 0; i < nr_memslots; i++) vm_mem_add(vm, src_type, BASE_DATA_GPA + slot_size * i, @@ -477,7 +477,8 @@ int main(int argc, char *argv[]) } } - test_mem_conversions(src_type, nr_vcpus, nr_memslots); + test_mem_conversions(src_type, nr_vcpus, nr_memslots, 0); + 
test_mem_conversions(src_type, nr_vcpus, nr_memslots, GUEST_MEMFD_FLAG_NO_DIRECT_MAP); return 0; } -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:44 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:44 +0000 Subject: [PATCH v6 09/11] KVM: selftests: stuff vm_mem_backing_src_type into vm_shape In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: <20250912091708.17502-10-roypat@amazon.co.uk> Use one of the padding fields in struct vm_shape to carry an enum vm_mem_backing_src_type value, to give the option to overwrite the default of VM_MEM_SRC_ANONYMOUS in __vm_create(). Overwriting this default will allow tests to create VMs where the test code is backed by mmap'd guest_memfd instead of anonymous memory. Signed-off-by: Patrick Roy --- .../testing/selftests/kvm/include/kvm_util.h | 19 ++++++++++--------- tools/testing/selftests/kvm/lib/kvm_util.c | 2 +- tools/testing/selftests/kvm/lib/x86/sev.c | 1 + .../selftests/kvm/pre_fault_memory_test.c | 1 + 4 files changed, 13 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 5204a0a18a7f..8baa0bbacd09 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -188,7 +188,7 @@ enum vm_guest_mode { struct vm_shape { uint32_t type; uint8_t mode; - uint8_t pad0; + uint8_t src_type; uint16_t pad1; }; @@ -196,14 +196,15 @@ kvm_static_assert(sizeof(struct vm_shape) == sizeof(uint64_t)); #define VM_TYPE_DEFAULT 0 -#define VM_SHAPE(__mode) \ -({ \ - struct vm_shape shape = { \ - .mode = (__mode), \ - .type = VM_TYPE_DEFAULT \ - }; \ - \ - shape; \ +#define VM_SHAPE(__mode) \ +({ \ + struct vm_shape shape = { \ + .mode = (__mode), \ + .type = VM_TYPE_DEFAULT, \ + .src_type = VM_MEM_SRC_ANONYMOUS \ + }; \ + \ + shape; \ }) #if defined(__aarch64__) diff --git 
a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index a81089f7c83f..3a22794bd959 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -495,7 +495,7 @@ struct kvm_vm *__vm_create(struct vm_shape shape, uint32_t nr_runnable_vcpus, if (is_guest_memfd_required(shape)) flags |= KVM_MEM_GUEST_MEMFD; - vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, nr_pages, flags); + vm_userspace_mem_region_add(vm, shape.src_type, 0, 0, nr_pages, flags); for (i = 0; i < NR_MEM_REGIONS; i++) vm->memslots[i] = 0; diff --git a/tools/testing/selftests/kvm/lib/x86/sev.c b/tools/testing/selftests/kvm/lib/x86/sev.c index c3a9838f4806..d920880e4fc0 100644 --- a/tools/testing/selftests/kvm/lib/x86/sev.c +++ b/tools/testing/selftests/kvm/lib/x86/sev.c @@ -164,6 +164,7 @@ struct kvm_vm *vm_sev_create_with_one_vcpu(uint32_t type, void *guest_code, struct vm_shape shape = { .mode = VM_MODE_DEFAULT, .type = type, + .src_type = VM_MEM_SRC_ANONYMOUS, }; struct kvm_vm *vm; struct kvm_vcpu *cpus[1]; diff --git a/tools/testing/selftests/kvm/pre_fault_memory_test.c b/tools/testing/selftests/kvm/pre_fault_memory_test.c index 0350a8896a2f..d403f8d2f26f 100644 --- a/tools/testing/selftests/kvm/pre_fault_memory_test.c +++ b/tools/testing/selftests/kvm/pre_fault_memory_test.c @@ -68,6 +68,7 @@ static void __test_pre_fault_memory(unsigned long vm_type, bool private) const struct vm_shape shape = { .mode = VM_MODE_DEFAULT, .type = vm_type, + .src_type = VM_MEM_SRC_ANONYMOUS, }; struct kvm_vcpu *vcpu; struct kvm_run *run; -- 2.50.1 From roypat at amazon.co.uk Fri Sep 12 02:17:47 2025 From: roypat at amazon.co.uk (Roy, Patrick) Date: Fri, 12 Sep 2025 09:17:47 +0000 Subject: [PATCH v6 11/11] KVM: selftests: Test guest execution from direct map removed gmem In-Reply-To: <20250912091708.17502-1-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> Message-ID: 
<20250912091708.17502-12-roypat@amazon.co.uk> Add a selftest that loads itself into guest_memfd (via GUEST_MEMFD_FLAG_MMAP) and triggers an MMIO exit when executed. This exercises x86 MMIO emulation code inside KVM for guest_memfd-backed memslots where the guest_memfd folios are direct map removed. Particularly, it validates that x86 MMIO emulation code (guest page table walks + instruction fetch) correctly accesses gmem through the VMA that's been reflected into the memslot's userspace_addr field (instead of trying to do direct map accesses). Signed-off-by: Patrick Roy --- .../selftests/kvm/set_memory_region_test.c | 50 +++++++++++++++++-- 1 file changed, 46 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c index ce3ac0fd6dfb..cb3bc642d376 100644 --- a/tools/testing/selftests/kvm/set_memory_region_test.c +++ b/tools/testing/selftests/kvm/set_memory_region_test.c @@ -603,6 +603,41 @@ static void test_mmio_during_vectoring(void) kvm_vm_free(vm); } + +static void guest_code_trigger_mmio(void) +{ + /* + * Read some GPA that is not backed by a memslot. KVM consider this + * as MMIO and tell userspace to emulate the read. 
+ */ + READ_ONCE(*((uint64_t *)MEM_REGION_GPA)); + + GUEST_DONE(); +} + +static void test_guest_memfd_mmio(void) +{ + struct kvm_vm *vm; + struct kvm_vcpu *vcpu; + struct vm_shape shape = { + .mode = VM_MODE_DEFAULT, + .src_type = VM_MEM_SRC_GUEST_MEMFD_NO_DIRECT_MAP, + }; + pthread_t vcpu_thread; + + pr_info("Testing MMIO emulation for instructions in gmem\n"); + + vm = __vm_create_shape_with_one_vcpu(shape, &vcpu, 0, guest_code_trigger_mmio); + + virt_map(vm, MEM_REGION_GPA, MEM_REGION_GPA, 1); + + pthread_create(&vcpu_thread, NULL, vcpu_worker, vcpu); + + /* If the MMIO read was successfully emulated, the vcpu thread will exit */ + pthread_join(vcpu_thread, NULL); + + kvm_vm_free(vm); +} #endif int main(int argc, char *argv[]) @@ -626,10 +661,17 @@ int main(int argc, char *argv[]) test_add_max_memory_regions(); #ifdef __x86_64__ - if (kvm_has_cap(KVM_CAP_GUEST_MEMFD) && - (kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))) { - test_add_private_memory_region(); - test_add_overlapping_private_memory_regions(); + if (kvm_has_cap(KVM_CAP_GUEST_MEMFD)) { + if (kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM)) { + test_add_private_memory_region(); + test_add_overlapping_private_memory_regions(); + } + + if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MMAP) && + kvm_has_cap(KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP)) + test_guest_memfd_mmio(); + else + pr_info("Skipping tests requiring KVM_CAP_GUEST_MEMFD_MMAP | KVM_CAP_GUEST_MEMFD_NO_DIRECT_MAP"); } else { pr_info("Skipping tests for KVM_MEM_GUEST_MEMFD memory regions\n"); } -- 2.50.1 From zhang.lyra at gmail.com Fri Sep 12 02:21:11 2025 From: zhang.lyra at gmail.com (Chunyan Zhang) Date: Fri, 12 Sep 2025 17:21:11 +0800 Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported() In-Reply-To: <04d2d781-fd5e-4778-b042-d4dbeb8c5d49@redhat.com> References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn> <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn> 
<9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com> <04d2d781-fd5e-4778-b042-d4dbeb8c5d49@redhat.com> Message-ID: On Fri, 12 Sept 2025 at 16:41, David Hildenbrand wrote: > > [...] > > >>> +/* > >>> + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is > >>> + * unavailable on which the kernel is running, even if the architecture > >>> + * provides the resource and soft-dirty is compiled in. > >>> + */ > >>> +#ifdef CONFIG_MEM_SOFT_DIRTY > >>> + if (!pgtable_soft_dirty_supported()) > >>> + mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0; > >>> +#endif > >> > >> You can now drop the ifdef. > > > > Ok, you mean define VM_SOFTDIRTY 0x08000000 no matter if > > MEM_SOFT_DIRTY is compiled in, right? > > > > Then I need memcpy() to set mnemonics[ilog2(VM_SOFTDIRTY)] here. > > The whole hunk will not be required when we make sure VM_SOFTDIRTY never > gets set, correct? Oh no, this hunk code does not set vmflag. The mnemonics[ilog2(VM_SOFTDIRTY)] is for show_smap_vma_flags(), something like below: # cat /proc/1/smaps 5555605c7000-555560680000 r-xp 00000000 fe:00 19 /bin/busybox ... VmFlags: rd ex mr mw me sd 'sd' is for soft-dirty I think this is still needed, right? > > > > >> > >> But, I wonder if could we instead just stop setting the flag. Then we don't > >> have to worry about any VM_SOFTDIRTY checks. 
> >> > >> Something like the following > >> > >> diff --git a/include/linux/mm.h b/include/linux/mm.h > >> index 892fe5dbf9de0..8b8bf63a32ef7 100644 > >> --- a/include/linux/mm.h > >> +++ b/include/linux/mm.h > >> @@ -783,6 +783,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) > >> static inline void vm_flags_init(struct vm_area_struct *vma, > >> vm_flags_t flags) > >> { > >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > >> ACCESS_PRIVATE(vma, __vm_flags) = flags; > >> } > >> > >> @@ -801,6 +802,7 @@ static inline void vm_flags_reset(struct vm_area_struct *vma, > >> static inline void vm_flags_reset_once(struct vm_area_struct *vma, > >> vm_flags_t flags) > >> { > >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > >> vma_assert_write_locked(vma); > >> WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags); > >> } > >> @@ -808,6 +810,7 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma, > >> static inline void vm_flags_set(struct vm_area_struct *vma, > >> vm_flags_t flags) > >> { > >> + VM_WARN_ON_ONCE(!pgtable_soft_dirty_supported() && (flags & VM_SOFTDIRTY)); > >> vma_start_write(vma); > >> ACCESS_PRIVATE(vma, __vm_flags) |= flags; > >> } > >> diff --git a/mm/mmap.c b/mm/mmap.c > >> index 5fd3b80fda1d5..40cb3fbf9a247 100644 > >> --- a/mm/mmap.c > >> +++ b/mm/mmap.c > >> @@ -1451,8 +1451,10 @@ static struct vm_area_struct *__install_special_mapping( > >> return ERR_PTR(-ENOMEM); > >> > >> vma_set_range(vma, addr, addr + len, 0); > >> - vm_flags_init(vma, (vm_flags | mm->def_flags | > >> - VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK); > >> + vm_flags |= mm->def_flags | VM_DONTEXPAND; > > > > Why use '|=' rather than not directly setting vm_flags which is an > > uninitialized variable? > > vm_flags is passed in by the caller? > Then the original code seems wrong. > But just to clarify: this code was just a quick hack, adjust it as you need. Got it. 
> > [...] > > >>> > >>> + if (!pgtable_soft_dirty_supported()) > >>> + return; > >>> + > >>> if (pmd_present(pmd)) { > >>> /* See comment in change_huge_pmd() */ > >>> old = pmdp_invalidate(vma, addr, pmdp); > >> > >> That would all be handled with the above never-set-VM_SOFTDIRTY. > > I meant that there is no need to add the pgtable_soft_dirty_supported() > check. Ok I will take a look. > > > > > Sorry I'm not sure I understand here, you mean no longer need #ifdef > > CONFIG_MEM_SOFT_DIRTY for these function definitions, right? > > Likely we could drop them. VM_SOFTDIRTY will never be set so the code > will not be invoked. The relationship of VM_SOFTDIRTY and clear_soft_dirty_pmd() is not very direct from the first sight, let me take a further look. > > And for architectures where VM_SOFTDIRTY is never even possible > (!CONFIG_MEM_SOFT_DIRTY) we keep it as 0. Ok. > > That way, the compiler can even optimize out all of that code because > > "vma->vm_flags & VM_SOFTDIRTY" -> "vma->vm_flags & 0" > > will never be true. > > > > >> > >>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > >>> index 4c035637eeb7..2a3578a4ae4c 100644 > >>> --- a/include/linux/pgtable.h > >>> +++ b/include/linux/pgtable.h > >>> @@ -1537,6 +1537,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) > >>> #define arch_start_context_switch(prev) do {} while (0) > >>> #endif > >>> > >>> +/* > >>> + * Some platforms can customize the PTE soft-dirty bit making it unavailable > >>> + * even if the architecture provides the resource. > >>> + * Adding this API allows architectures to add their own checks for the > >>> + * devices on which the kernel is running. > >>> + * Note: When overiding it, please make sure the CONFIG_MEM_SOFT_DIRTY > >>> + * is part of this macro. 
> >>> + */ > >>> +#ifndef pgtable_soft_dirty_supported > >>> +#define pgtable_soft_dirty_supported() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) > >>> +#endif > >>> + > >>> #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > >>> #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION > >>> static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) > >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > >>> index 830107b6dd08..b32ce2b0b998 100644 > >>> --- a/mm/debug_vm_pgtable.c > >>> +++ b/mm/debug_vm_pgtable.c > >>> @@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) > >>> { > >>> pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); > >>> > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > >>> + if (!pgtable_soft_dirty_supported()) > >>> return; > >>> > >>> pr_debug("Validating PTE soft dirty\n"); > >>> @@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args) > >>> { > >>> pte_t pte; > >>> > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > >>> + if (!pgtable_soft_dirty_supported()) > >>> return; > >>> > >>> pr_debug("Validating PTE swap soft dirty\n"); > >>> @@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) > >>> { > >>> pmd_t pmd; > >>> > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > >>> + if (!pgtable_soft_dirty_supported()) > >>> return; > >>> > >>> if (!has_transparent_hugepage()) > >>> @@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) > >>> { > >>> pmd_t pmd; > >>> > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) || > >>> - !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > >>> + if (!pgtable_soft_dirty_supported() || > >>> + !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION)) > >>> return; > >>> > >>> if (!has_transparent_hugepage()) > >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c > >>> index 9c38a95e9f09..218d430a2ec6 100644 > >>> --- a/mm/huge_memory.c > >>> +++ b/mm/huge_memory.c > >>> @@ -2271,12 +2271,13 @@ static 
inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl, > >>> > >>> static pmd_t move_soft_dirty_pmd(pmd_t pmd) > >>> { > >>> -#ifdef CONFIG_MEM_SOFT_DIRTY > >>> - if (unlikely(is_pmd_migration_entry(pmd))) > >>> - pmd = pmd_swp_mksoft_dirty(pmd); > >>> - else if (pmd_present(pmd)) > >>> - pmd = pmd_mksoft_dirty(pmd); > >>> -#endif > >>> + if (pgtable_soft_dirty_supported()) { > >>> + if (unlikely(is_pmd_migration_entry(pmd))) > >>> + pmd = pmd_swp_mksoft_dirty(pmd); > >>> + else if (pmd_present(pmd)) > >>> + pmd = pmd_mksoft_dirty(pmd); > >>> + } > >>> + > >> > >> Wondering, should simply the arch take care of that and we can just call > >> pmd_swp_mksoft_dirty / pmd_mksoft_dirty? > > > > I think we have that already in include/linux/pgtable.h: > > We have stubs that just don't do anything. > > For riscv support you would handle runtime-enablement in these helpers. > > > > >> > >>> return pmd; > >>> } > >>> > >>> diff --git a/mm/internal.h b/mm/internal.h > >>> index 45b725c3dc03..c6ca62f8ecf3 100644 > >>> --- a/mm/internal.h > >>> +++ b/mm/internal.h > >>> @@ -1538,7 +1538,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma) > >>> * VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY) > >>> * will be constantly true. > >>> */ > >>> - if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)) > >>> + if (!pgtable_soft_dirty_supported()) > >>> return false; > >>> > >> > >> That should be handled with the above never-set-VM_SOFTDIRTY. > > > > We don't need to check if (!pgtable_soft_dirty_supported()) if I > > understand correctly. > Hm, let me think about that. No, I think this has to stay as the comment > says, so this case here is special. I will cook a new version and then we can discuss further based on the new patch.
Thanks for your review, Chunyan From kevin.tian at intel.com Fri Sep 12 02:33:06 2025 From: kevin.tian at intel.com (Tian, Kevin) Date: Fri, 12 Sep 2025 09:33:06 +0000 Subject: [PATCH v4 1/7] iommu/arm-smmu-v3: Add release_domain to attach prior to release_dev() In-Reply-To: References: Message-ID: > From: Nicolin Chen > Sent: Monday, September 1, 2025 7:32 AM > > +static int arm_smmu_attach_dev_release(struct iommu_domain *domain, > + struct device *dev) > +{ > + struct arm_smmu_master *master = dev_iommu_priv_get(dev); > + > + WARN_ON(master->iopf_refcount); > + > + /* Put the STE back to what arm_smmu_init_strtab() sets */ > + if (dev->iommu->require_direct) > + > arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, > dev); > + else > + > arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, > dev); it's a bit confusing that a BLOCKED domain type could turn to the identity mode, though this movement doesn't change the original behavior. > + > + return 0; > +} > + > +static const struct iommu_domain_ops arm_smmu_release_ops = { > + .attach_dev = arm_smmu_attach_dev_release, > +}; > + > +static struct iommu_domain arm_smmu_release_domain = { > + .type = IOMMU_DOMAIN_BLOCKED, > + .ops = &arm_smmu_release_ops, > +}; > + From kevin.tian at intel.com Fri Sep 12 02:34:11 2025 From: kevin.tian at intel.com (Tian, Kevin) Date: Fri, 12 Sep 2025 09:34:11 +0000 Subject: [PATCH v4 2/7] iommu: Lock group->mutex in iommu_deferred_attach() In-Reply-To: <9b9199e03c87c3cf8152cf93dc403a95c883811b.1756682135.git.nicolinc@nvidia.com> References: <9b9199e03c87c3cf8152cf93dc403a95c883811b.1756682135.git.nicolinc@nvidia.com> Message-ID: > From: Nicolin Chen > Sent: Monday, September 1, 2025 7:32 AM > > The iommu_deferred_attach() function invokes __iommu_attach_device() > while > not holding the group->mutex, like other __iommu_attach_device() callers. 
> > Though there is no practical bug being triggered so far, it would be better > to apply the same locking to this __iommu_attach_device(), since the > IOMMU > drivers nowadays are more aware of the group->mutex -- some of them use > the > iommu_group_mutex_assert() function that could be potentially in the path > of an attach_dev callback function invoked by the __iommu_attach_device(). > > The iommu_deferred_attach() will soon need to verify a new flag stored in > the struct group_device. To iterate the gdev list, the group->mutex should > be held for this matter too. > > So, grab the mutex to guard __iommu_attach_device() like other callers. > > Reviewed-by: Jason Gunthorpe > Signed-off-by: Nicolin Chen Reviewed-by: Kevin Tian From kevin.tian at intel.com Fri Sep 12 02:35:08 2025 From: kevin.tian at intel.com (Tian, Kevin) Date: Fri, 12 Sep 2025 09:35:08 +0000 Subject: [PATCH v4 4/7] iommu: Pass in old domain to attach_dev callback functions In-Reply-To: <19570f350d15528e13447168b7dcd95795afdbf3.1756682135.git.nicolinc@nvidia.com> References: <19570f350d15528e13447168b7dcd95795afdbf3.1756682135.git.nicolinc@nvidia.com> Message-ID: > From: Nicolin Chen > Sent: Monday, September 1, 2025 7:32 AM > > The IOMMU core attaches each device to a default domain on probe(). Then, > every new "attach" operation has a fundamental meaning of two-fold: > - detach from its currently attached (old) domain > - attach to a given new domain > > Modern IOMMU drivers following this pattern usually want to clean up the > things related to the old domain, so they call iommu_get_domain_for_dev() > to fetch the old domain. > > Pass in the old domain pointer from the core to drivers, aligning with the > set_dev_pasid op that passes in already. > > Ensure all low-level attach functions in the core can forward the correct > old domain pointer. Thus, rework those functions as well.
> > Suggested-by: Jason Gunthorpe > Signed-off-by: Nicolin Chen Reviewed-by: Kevin Tian From kevin.tian at intel.com Fri Sep 12 02:36:38 2025 From: kevin.tian at intel.com (Tian, Kevin) Date: Fri, 12 Sep 2025 09:36:38 +0000 Subject: [PATCH v4 5/7] iommu: Add iommu_get_domain_for_dev_locked() helper In-Reply-To: References: Message-ID: > From: Nicolin Chen > Sent: Monday, September 1, 2025 7:32 AM > > > +/* Caller must be a general/external function that isn't an IOMMU callback > */ > struct iommu_domain *iommu_get_domain_for_dev(struct device *dev) > { 'general function' is not easy to get its meaning. just keep 'external'? Reviewed-by: Kevin Tian From kevin.tian at intel.com Fri Sep 12 02:49:13 2025 From: kevin.tian at intel.com (Tian, Kevin) Date: Fri, 12 Sep 2025 09:49:13 +0000 Subject: [PATCH v4 6/7] iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done() In-Reply-To: <0f6021b500c74db33af8118210dd7a2b2fd31b3c.1756682135.git.nicolinc@nvidia.com> References: <0f6021b500c74db33af8118210dd7a2b2fd31b3c.1756682135.git.nicolinc@nvidia.com> Message-ID: > From: Nicolin Chen > Sent: Monday, September 1, 2025 7:32 AM > > PCIe permits a device to ignore ATS invalidation TLPs, while processing a > reset. This creates a problem visible to the OS where an ATS invalidation > command will time out. E.g. an SVA domain will have no coordination with a > reset event and can racily issue ATS invalidations to a resetting device. > > The OS should do something to mitigate this as we do not want production > systems to be reporting critical ATS failures, especially in a hypervisor > environment. Broadly, OS could arrange to ignore the timeouts, block page > table mutations to prevent invalidations, or disable and block ATS. > > The PCIe spec in sec 10.3.1 IMPLEMENTATION NOTE recommends to disable > and > block ATS before initiating a Function Level Reset. It also mentions that > other reset methods could have the same vulnerability as well. 
> > Provide a callback from the PCI subsystem that will enclose the reset and > have the iommu core temporarily change all the attached domain to > BLOCKED. > After attaching a BLOCKED domain, IOMMU hardware would fence any > incoming > ATS queries. And IOMMU drivers should also synchronously stop issuing new > ATS invalidations and wait for all ATS invalidations to complete. This can > avoid any ATS invalidation timeouts. > > However, if there is a domain attachment/replacement happening during an > ongoing reset, ATS routines may be re-activated between the two function > calls. So, introduce a new pending_reset flag in group_device, and reject > any concurrent attach_dev/set_dev_pasid call during a reset for a concern > of compatibility failure. > > There are two corner cases that won't work: > 1. Alias devices that share the same RID > Blocking one device also blocks the other alias devices that might not > want a reset. Given that it's very rare for an alias device to support > ATS, simply skip the blocking routine. it also applies to the devices in the same iommu group. While one device is being reset, all other devices in the group cannot change the domain. This needs to be documented in the attach uAPI. > > 2. SRIOV devices that its PF is resetting while its VF isn't. > Both PF and VF should block RID and PASIDs. But, since VF is not aware > of the reset, it is difficult to block it and reject concurrent attach > calls, because it's not logically reasonable to reject a VF attachment > due to a resetting PF unless the VF is resetting too. To address this, > we won't be able to reject any concurrent attachment as simple as this > patch does; instead we will need two new compatibility testing ops for > attach_dev/set_dev_pasid to allow caching a compatible attach. This > itself, however, would be a big series. So, for now, skip the blocking > routine for PF devices, and leave a note.
> given it impacts uAPI: - now attach/replace can be done anytime - with this series attach/replace is rejected when a device is being reset - later with compat testing ops attach/replace can be done again at any time we should be cautious here, especially if this series goes into 6.18 (likely the next LTS version) the interim behavior change may last long. yes we discussed that no known usage would want to do attach/replace while a device is being reset, but I wonder whether we should instead wait for a full solution to avoid unnecessary uAPI change back-and-forth... Thanks Kevin From pfalcato at suse.de Fri Sep 12 03:48:15 2025 From: pfalcato at suse.de (Pedro Falcato) Date: Fri, 12 Sep 2025 11:48:15 +0100 Subject: [PATCH v6 01/11] filemap: Pass address_space mapping to ->free_folio() In-Reply-To: <20250912091708.17502-2-roypat@amazon.co.uk> References: <20250912091708.17502-1-roypat@amazon.co.uk> <20250912091708.17502-2-roypat@amazon.co.uk> Message-ID: <2w22wsqar437lyp3w4bltyoql4ksn3exppkyaia5ogtnt2ttte@6nptj6ed4qnm> On Fri, Sep 12, 2025 at 09:17:31AM +0000, Roy, Patrick wrote: > From: Elliot Berman > > When guest_memfd removes memory from the host kernel's direct map, > direct map entries must be restored before the memory is freed again. To > do so, ->free_folio() needs to know whether a gmem folio was direct map > removed in the first place though. While possible to keep track of this > information on each individual folio (e.g. via page flags), direct map > removal is an all-or-nothing property of the entire guest_memfd, so it > is less error prone to just check the flag stored in the gmem inode's > private data. However, by the time ->free_folio() is called, > folio->mapping might be cleared. To still allow access to the address > space from which the folio was just removed, pass it in as an additional > argument to ->free_folio, as the mapping is well-known to all callers.
> > Link: https://lore.kernel.org/all/15f665b4-2d33-41ca-ac50-fafe24ade32f at redhat.com/ > Suggested-by: David Hildenbrand > Acked-by: David Hildenbrand > Signed-off-by: Elliot Berman > [patrick: rewrite shortlog for new usecase] > Signed-off-by: Patrick Roy Reviewed-by: Pedro Falcato -- Pedro From naresh.kamboju at linaro.org Fri Sep 12 04:32:36 2025 From: naresh.kamboju at linaro.org (Naresh Kamboju) Date: Fri, 12 Sep 2025 17:02:36 +0530 Subject: next-20250912: riscv: s390: mm/kasan/shadow.c 'kasan_populate_vmalloc_pte' pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] Message-ID: The following build warnings / errors noticed on the riscv and s390 with allyesconfig build on the Linux next-20250912 tag. Regression Analysis: - New regression? yes - Reproducibility? yes Build regression: next-20250912 mm/kasan/shadow.c 'kasan_populate_vmalloc_pte' pgtable.h error statement with no effect [-Werror=unused-value] Reported-by: Linux Kernel Functional Testing $ git log --oneline next-20250911..next-20250912 -- mm/kasan/shadow.c aed53ec0b797a mm: introduce local state for lazy_mmu sections 307f2dc9b308e kasan: introduce ARCH_DEFER_KASAN and unify static key across modes ## Test log In file included from include/linux/kasan.h:37, from mm/kasan/shadow.c:14: mm/kasan/shadow.c: In function 'kasan_populate_vmalloc_pte': include/linux/pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) | ^ mm/kasan/shadow.c:322:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' 322 | arch_enter_lazy_mmu_mode(); | ^~~~~~~~~~~~~~~~~~~~~~~~ mm/kasan/shadow.c: In function 'kasan_depopulate_vmalloc_pte': include/linux/pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) | ^ mm/kasan/shadow.c:497:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' 497 | arch_enter_lazy_mmu_mode(); | 
^~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors ## Source * Kernel version: 6.17.0-rc5 * Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git * Git describe: 6.17.0-rc5-next-20250912 * Git commit: 590b221ed4256fd6c34d3dea77aa5bd6e741bbc1 * Architectures: riscv, s390 * Toolchains: gcc (Debian 13.3.0-16) 13.3.0 * Kconfigs: allyesconfig ## Build * Build log: https://qa-reports.linaro.org/api/testruns/29863344/log_file/ * Build details: https://regressions.linaro.org/lkft/linux-next-master/next-20250912/log-parser-build-gcc/gcc-compiler-include_linux_pgtable_h-error-statement-with-no-effect/ * Build plan: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/builds/32aTGVWBLzkF7PsIq9FBtLK3T4W * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/32aTGVWBLzkF7PsIq9FBtLK3T4W/ * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/32aTGVWBLzkF7PsIq9FBtLK3T4W/config ## Steps to reproduce $ tuxmake --runtime podman --target-arch riscv --toolchain gcc-13 --kconfig allyesconfig -- Linaro LKFT From david at redhat.com Fri Sep 12 04:34:37 2025 From: david at redhat.com (David Hildenbrand) Date: Fri, 12 Sep 2025 13:34:37 +0200 Subject: next-20250912: riscv: s390: mm/kasan/shadow.c 'kasan_populate_vmalloc_pte' pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] In-Reply-To: References: Message-ID: On 12.09.25 13:32, Naresh Kamboju wrote: > The following build warnings / errors noticed on the riscv and s390 > with allyesconfig build on the Linux next-20250912 tag. > > Regression Analysis: > - New regression? yes > - Reproducibility? 
yes > > Build regression: next-20250912 mm/kasan/shadow.c > 'kasan_populate_vmalloc_pte' pgtable.h error statement with no effect > [-Werror=unused-value] > > Reported-by: Linux Kernel Functional Testing > > $ git log --oneline next-20250911..next-20250912 -- mm/kasan/shadow.c > aed53ec0b797a mm: introduce local state for lazy_mmu sections > 307f2dc9b308e kasan: introduce ARCH_DEFER_KASAN and unify static key > across modes > > ## Test log > In file included from include/linux/kasan.h:37, > from mm/kasan/shadow.c:14: > mm/kasan/shadow.c: In function 'kasan_populate_vmalloc_pte': > include/linux/pgtable.h:247:41: error: statement with no effect > [-Werror=unused-value] > 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) > | ^ > mm/kasan/shadow.c:322:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' > 322 | arch_enter_lazy_mmu_mode(); > | ^~~~~~~~~~~~~~~~~~~~~~~~ > mm/kasan/shadow.c: In function 'kasan_depopulate_vmalloc_pte': > include/linux/pgtable.h:247:41: error: statement with no effect > [-Werror=unused-value] > 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) > | ^ > mm/kasan/shadow.c:497:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' > 497 | arch_enter_lazy_mmu_mode(); > | ^~~~~~~~~~~~~~~~~~~~~~~~ > cc1: all warnings being treated as errors > I'm afraid these changes landed in -mm-unstable a bit too early. 
-- Cheers David / dhildenb From sergio.paracuellos at gmail.com Fri Sep 12 02:49:01 2025 From: sergio.paracuellos at gmail.com (Sergio Paracuellos) Date: Fri, 12 Sep 2025 11:49:01 +0200 Subject: [PATCH 08/15] gpio: mt7621: use new generic GPIO chip API In-Reply-To: <20250909-gpio-mmio-gpio-conv-part4-v1-8-9f723dc3524a@linaro.org> References: <20250909-gpio-mmio-gpio-conv-part4-v1-0-9f723dc3524a@linaro.org> <20250909-gpio-mmio-gpio-conv-part4-v1-8-9f723dc3524a@linaro.org> Message-ID: On Tue, Sep 9, 2025 at 11:50 AM Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski > > Convert the driver to using the new generic GPIO chip interfaces from > linux/gpio/generic.h. > > Signed-off-by: Bartosz Golaszewski > --- > drivers/gpio/gpio-mt7621.c | 51 +++++++++++++++++++++++++++++----------------- > 1 file changed, 32 insertions(+), 19 deletions(-) Reviewed-by: Sergio Paracuellos Best regards, Sergio Paracuellos From m.wilczynski at samsung.com Fri Sep 12 05:30:09 2025 From: m.wilczynski at samsung.com (Michal Wilczynski) Date: Fri, 12 Sep 2025 14:30:09 +0200 Subject: [PATCH v14 0/7] Rust Abstractions for PWM subsystem with TH1520 PWM driver In-Reply-To: <20250820-rust-next-pwm-working-fan-for-sending-v14-0-df2191621429@samsung.com> References: <20250820-rust-next-pwm-working-fan-for-sending-v14-0-df2191621429@samsung.com> Message-ID: On 8/20/25 10:35, Michal Wilczynski wrote: > This patch series introduces Rust support for the T-HEAD TH1520 PWM > controller and demonstrates its use for fan control on the Sipeed Lichee > Pi 4A board. > > The primary goal of this patch series is to introduce a basic set of > Rust abstractions for the Linux PWM subsystem. As a first user and > practical demonstration of these abstractions, the series also provides > a functional PWM driver for the T-HEAD TH1520 SoC. This allows control > of its PWM channels and ultimately enables temperature controlled fan > support for the Lichee Pi 4A board.
This work aims to explore the use of > Rust for PWM drivers and lay a foundation for potential future Rust > based PWM drivers. > > The core of this series is a new rust/kernel/pwm.rs module that provides > abstractions for writing PWM chip provider drivers in Rust. This has > been significantly reworked from v1 based on extensive feedback. The key > features of the new abstraction layer include: > > - Ownership and Lifetime Management: The pwm::Chip wrapper is managed > by ARef, correctly tying its lifetime to its embedded struct device > reference counter. Chip registration is handled by a pwm::Registration > RAII guard, which guarantees that pwmchip_add is always paired with > pwmchip_remove, preventing resource leaks. > > - Modern and Safe API: The PwmOps trait is now based on the modern > waveform API (round_waveform_tohw, write_waveform, etc.) as recommended > by the subsystem maintainer. It is generic over a driver's > hardware specific data structure, moving all unsafe serialization logic > into the abstraction layer and allowing drivers to be written in 100% > safe Rust. > > - Ergonomics: The API provides safe, idiomatic wrappers for other PWM > types (State, Args, Device, etc.) and uses standard kernel error > handling patterns. > > The series is structured as follows: > - Expose static function pwmchip_release. > - Rust PWM Abstractions: The new safe abstraction layer. > - TH1520 PWM Driver: A new Rust driver for the TH1520 SoC, built on > top of the new abstractions. > - Device Tree Bindings & Nodes: The remaining patches add the necessary > DT bindings and nodes for the TH1520 PWM controller, and the PWM fan > configuration for the Lichee Pi 4A board. > > Testing: > Tested on the TH1520 SoC. The fan works correctly. The duty/period > calculations are correct. Fan starts slow when the chip is not hot and > gradually increases the speed when PVT reports higher temperatures. 
> > The patches don't contain any dependencies that are not currently in > the mainline kernel anymore. > > --- > Changes in v14: > - Re-base on top of 6.17-rc1. > - Cosmetic change in label function. > - Link to v13: https://lore.kernel.org/r/20250806-rust-next-pwm-working-fan-for-sending-v13-0-690b669295b6 at samsung.com > > Changes in v13: > - Re-add the T-HEAD TH1520 PWM driver and its device tree bindings, as > Iomem series got merged into mainline kernel. > - Fix Args struct to be consistent with State - no Opaque needed for > copies. > - Replace tuple return type in the PwmOps trait with dedicated struct > for improved clarity. > - Use build_assert for WfHw size, as it doesn't have to be runtime > check. > - Various cosmetic changes. > - Link to v12: https://lore.kernel.org/r/20250717-rust-next-pwm-working-fan-for-sending-v12-0-40f73defae0c at samsung.com > > Changes in v12: > - Reworked the PWM abstractions to use the subclassing pattern as > suggested by reviewers. > - pwm::Chip and its driver data are now allocated in a single, contiguous > memory block via pwmchip_alloc() sizeof_priv argument. > - Chip::new() now uses the pin init API to construct the driver data > in place, removing the need for a separate allocation. > - The PwmOps trait is now implemented directly by the driver data struct > itself, removing the DrvData associated type and the ForeignOwnable > trait. > - The custom release handler has been updated to call drop_in_place on the driver > data, ensuring destructors are run correctly before the underlying > memory is freed. > - Moved the pwmchip_release prototype in the C header to a separate > section to clarify it is for FFI use only, as requested. > - Added a Prerequisite-patch-id trailer to the cover letter to declare > the dependency on the PWM_WFHWSIZE patch.
> > - Link to v11: https://lore.kernel.org/r/20250710-rust-next-pwm-working-fan-for-sending-v11-0-93824a16f9ec at samsung.com > > Changes in v11: > - Dropped driver and DT commits, as they don't compile based on publicly > known commit. > - Re-based on top of pwm/for-next. > - Reverted back to devres::Devres::new_foreign_owned, as pwm/for-next > doesn't contain 'register' re-factor, which is present in linux-next, > queued for the next merge window. The conflict is trivial, simply > change 'new_foreign_owned' -> 'register'. > - Added list to MAINTAINERS entry as requested. > - Link to v10: https://lore.kernel.org/r/20250707-rust-next-pwm-working-fan-for-sending-v10-0-d0c5cf342004 at samsung.com > > Changes in v10: > - Exported the C pwmchip_release function and called it from the custom > Rust release_callback to fix a memory leak of the pwm_chip struct. > - Removed the PwmOps::free callback, as it is not needed for idiomatic > Rust resource management. > - Removed the redundant is_null check for drvdata in the release handler, > as the Rust API guarantees a valid pointer is always provided. > > - Link to v9: https://lore.kernel.org/r/20250706-rust-next-pwm-working-fan-for-sending-v9-0-42b5ac2101c7 at samsung.com > > Changes in v9: > - Encapsulated vtable setup in Chip::new(): The Chip::new() function is > now generic over the PwmOps implementation. This allows it to create and > assign the vtable internally, which simplifies the public API by > removing the ops_vtable parameter from Registration::register(). > - Fixed memory leak with a release handler: A custom release_callback is > now assigned to the embedded struct device's release hook. This > guarantees that driver specific data is always freed when the chip is > destroyed, even if registration fails. > - The PwmOpsVTable is now defined as a const associated item to ensure > it has a 'static lifetime. 
> - Combined introductory commits: The Device, Chip, and PwmOps abstractions > are now introduced in a single commit. This was necessary to resolve the > circular dependencies between them and present a clean, compilable unit > for review. > > - Link to v8: https://lore.kernel.org/r/20250704-rust-next-pwm-working-fan-for-sending-v8-0-951e5482c9fd at samsung.com > > Changes in v8: > - Dropped already accepted commit, re-based on top of linux-next > - Reworked the Chip and PwmOps APIs to address the drvdata() type-safety > comment. Chip is now generic, and PwmOps uses an associated type > to provide compile-time guarantees. > - Added a parent device sanity check to Registration::register(). > - Updated drvdata() to return the idiomatic T::Borrowed<'_>. > - added temporary unsafe blocks in the driver, as the current > abstraction for Clk is neither Safe nor Sync. I think eventually > proper abstraction for Clk will be added as in a current state it's > not very useful. > > - Link to v7: https://lore.kernel.org/r/20250702-rust-next-pwm-working-fan-for-sending-v7-0-67ef39ff1d29 at samsung.com > > Changes in v7: > - Made parent_device function private and moved casts to Device > there as well. > - Link to v6: https://lore.kernel.org/r/20250701-rust-next-pwm-working-fan-for-sending-v6-0-2710932f6f6b at samsung.com > > Changes in v6: > - Re-based on top of linux-next, dropped two already accepted commits. > - After re-basing the IoMem dependent patchset stopped working, > reworked it to use similar API like the PCI subsystem (I think it > will end up the same). Re-worked the driver for it as well. > - Remove the apply and get_state callbacks, and most of the State as > well, as the old way of implementing drivers should not be possible > in Rust. Left only enabled(), since it's useful for my driver.
> - Removed the public set_drvdata() method from pwm::Chip > - Moved WFHWSIZE to the public include/linux/pwm.h header and renamed it > to PWM_WFHWSIZE, allowing bindgen to create safe FFI bindings. > - Corrected the ns_to_cycles integer calculation in the TH1520 driver to > handle overflow correctly. > - Updated the Kconfig entry for the TH1520 driver to select the Rust > abstractions for a better user experience. > > - Link to v5: https://lore.kernel.org/r/20250623-rust-next-pwm-working-fan-for-sending-v5-0-0ca23747c23e at samsung.com > > Changes in v5: > - Reworked `pwm::Chip` creation to take driver data directly, which > allowed making the `chip.drvdata()` accessor infallible > - added missing `pwm.c` file lost during the commit split (sorry !) > - Link to v4: https://lore.kernel.org/r/20250618-rust-next-pwm-working-fan-for-sending-v4-0-a6a28f2b6d8a at samsung.com > > Changes in v4: > - Reworked the pwm::Registration API to use the devres framework, > addressing lifetime issue. > - Corrected the PwmOps trait and its callbacks to use immutable references > (&Chip, &Device) for improved safety. > - Applied various code style and naming cleanups based on feedback > > - Link to v3: https://lore.kernel.org/r/20250617-rust-next-pwm-working-fan-for-sending-v3-0-1cca847c6f9f at samsung.com > > Changes in v3: > - Addressed feedback from Uwe by making multiple changes to the TH1520 > driver and the abstraction layer. > - Split the core PWM abstractions into three focused commits to ease > review per Benno request. > - Confirmed the driver now works correctly with CONFIG_PWM_DEBUG enabled > by implementing the full waveform API, which correctly reads the > hardware state. > - Refactored the Rust code to build cleanly with > CONFIG_RUST_BUILD_ASSERT_ALLOW=n, primarily by using the try_* family of > functions for IoMem access. > - Included several cosmetic changes and cleanups to the abstractions > per Miguel review. 
> > - Link to v2: https://lore.kernel.org/r/20250610-rust-next-pwm-working-fan-for-sending-v2-0-753e2955f110 at samsung.com > > Changes in v2: > - Reworked the PWM abstraction layer based on extensive feedback. > - Replaced initial devm allocation with a proper ARef lifetime model > using AlwaysRefCounted. > - Implemented a Registration RAII guard to ensure safe chip add/remove. > - Migrated the PwmOps trait from the legacy .apply callback to the modern > waveform API. > - Refactored the TH1520 driver to use the new, safer abstractions. > - Added a patch to mark essential bus clocks as CLK_IGNORE_UNUSED to fix > boot hangs when the PWM and thermal sensors are enabled. > - Link to v1: https://lore.kernel.org/r/20250524-rust-next-pwm-working-fan-for-sending-v1-0-bdd2d5094ff7 at samsung.com > > --- > Michal Wilczynski (7): > pwm: Export `pwmchip_release` for external use > rust: pwm: Add Kconfig and basic data structures > rust: pwm: Add complete abstraction layer > pwm: Add Rust driver for T-HEAD TH1520 SoC > dt-bindings: pwm: thead: Add T-HEAD TH1520 PWM controller > riscv: dts: thead: Add PWM controller node > riscv: dts: thead: Add PWM fan and thermal control > > .../devicetree/bindings/pwm/thead,th1520-pwm.yaml | 48 ++ > MAINTAINERS | 10 + > arch/riscv/boot/dts/thead/th1520-lichee-pi-4a.dts | 67 ++ > arch/riscv/boot/dts/thead/th1520.dtsi | 7 + > drivers/pwm/Kconfig | 24 + > drivers/pwm/Makefile | 1 + > drivers/pwm/core.c | 3 +- > drivers/pwm/pwm_th1520.rs | 355 +++++++++ > include/linux/pwm.h | 6 + > rust/bindings/bindings_helper.h | 1 + > rust/helpers/helpers.c | 1 + > rust/helpers/pwm.c | 20 + > rust/kernel/lib.rs | 2 + > rust/kernel/pwm.rs | 790 +++++++++++++++++++++ > 14 files changed, 1334 insertions(+), 1 deletion(-) > --- > base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 > change-id: 20250524-rust-next-pwm-working-fan-for-sending-552ad2d1b193 > > Best regards, Hi Uwe Pinging in case this thread got lost in your mailbox. 
Best regards,
-- 
Michal Wilczynski

From david at redhat.com Fri Sep 12 06:32:42 2025
From: david at redhat.com (David Hildenbrand)
Date: Fri, 12 Sep 2025 15:32:42 +0200
Subject: [PATCH v11 1/5] mm: softdirty: Add pgtable_soft_dirty_supported()
In-Reply-To:
References: <20250911095602.1130290-1-zhangchunyan@iscas.ac.cn>
 <20250911095602.1130290-2-zhangchunyan@iscas.ac.cn>
 <9bcaf3ec-c0a1-4ca5-87aa-f84e297d1e42@redhat.com>
 <04d2d781-fd5e-4778-b042-d4dbeb8c5d49@redhat.com>
Message-ID:

On 12.09.25 11:21, Chunyan Zhang wrote:
> On Fri, 12 Sept 2025 at 16:41, David Hildenbrand wrote:
>>
>> [...]
>>
>>>>> +/*
>>>>> + * We should remove the VM_SOFTDIRTY flag if the soft-dirty bit is
>>>>> + * unavailable on which the kernel is running, even if the architecture
>>>>> + * provides the resource and soft-dirty is compiled in.
>>>>> + */
>>>>> +#ifdef CONFIG_MEM_SOFT_DIRTY
>>>>> +	if (!pgtable_soft_dirty_supported())
>>>>> +		mnemonics[ilog2(VM_SOFTDIRTY)][0] = 0;
>>>>> +#endif
>>>>
>>>> You can now drop the ifdef.
>>>
>>> Ok, you mean define VM_SOFTDIRTY 0x08000000 no matter if
>>> MEM_SOFT_DIRTY is compiled in, right?
>>>
>>> Then I need memcpy() to set mnemonics[ilog2(VM_SOFTDIRTY)] here.
>>
>> The whole hunk will not be required when we make sure VM_SOFTDIRTY never
>> gets set, correct?
>
> Oh no, this hunk code does not set vmflag.
> The mnemonics[ilog2(VM_SOFTDIRTY)] is for show_smap_vma_flags(),
> something like below:
> # cat /proc/1/smaps
> 5555605c7000-555560680000 r-xp 00000000 fe:00 19
>   /bin/busybox
> ...
> VmFlags: rd ex mr mw me sd
>
> 'sd' is for soft-dirty
>
> I think this is still needed, right?

If nobody sets VM_SOFTDIRTY in vma->vm_flags, then we will never print it.

So you can just leave the "#ifdef CONFIG_MEM_SOFT_DIRTY" as is to handle
the VM_SOFTDIRTY=0 case. So you should not have to change anything in
show_smap_vma_flags().

[...]

>>>> That should be handled with the above never-set-VM_SOFTDIRTY.
>>>
>>> We don't need to check if (!pgtable_soft_dirty_supported()) if I
>>> understand correctly.
>> Hm, let me think about that. No, I think this has to stay as the comment
>> says, so this case here is special.
>
> I will cook a new version and then we can discuss further based on the
> new patch.

Sounds good!

-- 
Cheers

David / dhildenb

From fangyu.yu at linux.alibaba.com Fri Sep 12 06:43:32 2025
From: fangyu.yu at linux.alibaba.com (fangyu.yu at linux.alibaba.com)
Date: Fri, 12 Sep 2025 21:43:32 +0800
Subject: [PATCH] RISC-V: KVM: Fix guest page fault within HLV* instructions
Message-ID: <20250912134332.22053-1-fangyu.yu@linux.alibaba.com>

From: Fangyu Yu

When executing HLV* instructions in HS mode, a guest page fault may
occur if a g-stage page table migration happens between triggering the
virtual instruction exception and executing the HLV* instruction.

This may be a corner case, and one simpler way to handle this is to
re-execute the instruction where the virtual instruction exception
occurred, and the guest page fault will be automatically handled.
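
The handling described above can be sketched as a three-way decision on the trap cause reported by the unprivileged instruction read. This is only an illustrative model, not the kernel code: the names `sketch_action` and `sketch_classify` are hypothetical, and the constant is a local stand-in for the kernel's EXC_LOAD_GUEST_PAGE_FAULT (cause 21, the load guest-page fault in the RISC-V privileged specification).

```c
/* Local stand-in for the kernel's EXC_LOAD_GUEST_PAGE_FAULT (assumed 21). */
#define SKETCH_EXC_LOAD_GUEST_PAGE_FAULT 21UL

/* Hypothetical outcomes, named for illustration only. */
enum sketch_action {
	SKETCH_DECODE_INSN,     /* read succeeded: continue decoding */
	SKETCH_RETRY_SAME_SEPC, /* re-enter the guest at the same sepc */
	SKETCH_REDIRECT_TRAP,   /* forward the trap to the guest */
};

/* Map the scause left by the unprivileged read to the next step. */
static enum sketch_action sketch_classify(unsigned long scause)
{
	switch (scause) {
	case 0:
		return SKETCH_DECODE_INSN;
	case SKETCH_EXC_LOAD_GUEST_PAGE_FAULT:
		/* The g-stage mapping changed under us; retrying the
		 * instruction lets the normal fault path repair it. */
		return SKETCH_RETRY_SAME_SEPC;
	default:
		return SKETCH_REDIRECT_TRAP;
	}
}
```

In this model only the guest page fault is retried; any other non-zero cause is still redirected to the guest, as before the change.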
Fixes: 9f7013265112 ("RISC-V: KVM: Handle MMIO exits for VCPU") Signed-off-by: Fangyu Yu --- arch/riscv/kvm/vcpu_insn.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c index 97dec18e6989..a8b93aa4d8ec 100644 --- a/arch/riscv/kvm/vcpu_insn.c +++ b/arch/riscv/kvm/vcpu_insn.c @@ -448,7 +448,12 @@ int kvm_riscv_vcpu_virtual_insn(struct kvm_vcpu *vcpu, struct kvm_run *run, insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, &utrap); - if (utrap.scause) { + switch (utrap.scause) { + case 0: + break; + case EXC_LOAD_GUEST_PAGE_FAULT: + return KVM_INSN_CONTINUE_SAME_SEPC; + default: utrap.sepc = ct->sepc; kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); return 1; @@ -503,7 +508,12 @@ int kvm_riscv_vcpu_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run, */ insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, &utrap); - if (utrap.scause) { + switch (utrap.scause) { + case 0: + break; + case EXC_LOAD_GUEST_PAGE_FAULT: + return KVM_INSN_CONTINUE_SAME_SEPC; + default: /* Redirect trap if we failed to read instruction */ utrap.sepc = ct->sepc; kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); @@ -629,7 +639,12 @@ int kvm_riscv_vcpu_mmio_store(struct kvm_vcpu *vcpu, struct kvm_run *run, */ insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, &utrap); - if (utrap.scause) { + switch (utrap.scause) { + case 0: + break; + case EXC_LOAD_GUEST_PAGE_FAULT: + return KVM_INSN_CONTINUE_SAME_SEPC; + default: /* Redirect trap if we failed to read instruction */ utrap.sepc = ct->sepc; kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); -- 2.49.0 From hendrik.hamerlinck at hammernet.be Fri Sep 12 06:57:20 2025 From: hendrik.hamerlinck at hammernet.be (Hendrik Hamerlinck) Date: Fri, 12 Sep 2025 15:57:20 +0200 Subject: [PATCH] riscv: dts: spacemit: add UART pinctrl combinations In-Reply-To: <20250911112251-GYA1216475@gentoo.org> References: <20250903145334.425633-1-hendrik.hamerlinck@hammernet.be> 
<20250911112251-GYA1216475@gentoo.org> Message-ID: Hello Yixun, Thank you for reviewing. On 9/11/25 13:22, Yixun Lan wrote: > Hi Hendrik, > > On 16:53 Wed 03 Sep , Hendrik Hamerlinck wrote: >> This adds UART pinctrl configurations based on the SoC datasheet and the >> downstream Bianbu Linux tree. The drive strength values were taken from >> the downstream implementation, which uses medium drive strength. >> >> For convenience, the board DTS files have been updated to include all >> UART instances with their possible pinmux options in a disabled state. >> >> Tested this locally on both Orange Pi RV2 and Banana Pi BPI-F3 boards. >> >> Signed-off-by: Hendrik Hamerlinck >> --- >> .../boot/dts/spacemit/k1-bananapi-f3.dts | 18 ++ >> .../boot/dts/spacemit/k1-orangepi-rv2.dts | 18 ++ >> arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi | 276 +++++++++++++++++- >> 3 files changed, 309 insertions(+), 3 deletions(-) >> >> diff --git a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts >> index 6013be258542..661d47d1ce9e 100644 >> --- a/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts >> +++ b/arch/riscv/boot/dts/spacemit/k1-bananapi-f3.dts >> @@ -49,3 +49,21 @@ &uart0 { >> pinctrl-0 = <&uart0_2_cfg>; >> status = "okay"; >> }; >> + >> +&uart5 { >> + pinctrl-names = "default"; >> + pinctrl-0 = <&uart5_3_cfg>; >> + status = "disabled"; >> +}; >> + >> +&uart8 { >> + pinctrl-names = "default"; >> + pinctrl-0 = <&uart8_2_cfg>; >> + status = "disabled"; >> +}; >> + >> +&uart9 { >> + pinctrl-names = "default"; >> + pinctrl-0 = <&uart9_2_cfg>; >> + status = "disabled"; >> +}; > all of uart5, 8, 9 come from 26-pins port, the functionaly is > very likely depending on the final use case.. and I get your idea > of adding those nodes but with "disabled" status.. > > my suggestion is to not add them, or leave to users to add separated > dtbo (Device tree overlays) files in the future Fair enough, I was already doubting adding them. 
Most other .dts files don't include them either. I'll remove them in the
next version.

>
> but I'm ok to complete uart pinctrl info in the dtsi file
>
>> diff --git a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts
>> index 337240ebb7b7..dc45b75b1ad4 100644
>> --- a/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts
>> +++ b/arch/riscv/boot/dts/spacemit/k1-orangepi-rv2.dts
>> @@ -38,3 +38,21 @@ &uart0 {
>>  	pinctrl-0 = <&uart0_2_cfg>;
>>  	status = "okay";
>>  };
>> +
>> +&uart5 {
>> +	pinctrl-names = "default";
>> +	pinctrl-0 = <&uart5_3_cfg>;
>> +	status = "disabled";
>> +};
>> +
>> +&uart8 {
>> +	pinctrl-names = "default";
>> +	pinctrl-0 = <&uart8_2_cfg>;
>> +	status = "disabled";
>> +};
>> +
>> +&uart9 {
>> +	pinctrl-names = "default";
>> +	pinctrl-0 = <&uart9_2_cfg>;
>> +	status = "disabled";
>> +};
>> diff --git a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi
>> index 381055737422..43425530b5bf 100644
>> --- a/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi
>> +++ b/arch/riscv/boot/dts/spacemit/k1-pinctrl.dtsi
>> @@ -11,12 +11,282 @@
>>  #define K1_GPIO(x) (x / 32) (x % 32)
>>
>>  &pinctrl {
>> +	uart0_0_cfg: uart0-0-cfg {
>> +		uart0-0-pins {
>> +			pinmux = , /* uart0_txd */
>> +				 ; /* uart0_rxd */
>> +			power-source = <3300>;
>> +			bias-pull-up;
>> +			drive-strength = <19>;
>> +		};
>> +	};
>> +
>> +	uart0_1_cfg: uart0-1-cfg {
>> +		uart0-1-pins {
>> +			pinmux = , /* uart0_txd */
>> +				 ; /* uart0_rxd */
>> +			power-source = <3300>;
>> +			bias-pull-up;
>> +			drive-strength = <19>;
>> +		};
>> +	};
>> +
>>  	uart0_2_cfg: uart0-2-cfg {
>>  		uart0-2-pins {
>> -			pinmux = ,
>> -				 ;
>> +			pinmux = , /* uart0_txd */
>> +				 ; /* uart0_rxd */
>> +			bias-pull-up;
>> +			drive-strength = <32>;
>> +		};
>> +	};
>>
>> -			bias-pull-up = <0>;
>> +	uart2_0_cfg: uart2-0-cfg {
>> +		uart2-0-pins {
>> +			pinmux = , /* uart2_txd */
>> +				 , /* uart2_rxd */
>> +				 , /* uart2_cts */
>> +				 ; /* uart2_rts */
> I think for group
has cts, rts pins, it's more practical to > have two separated cfgs, so the final application can choose to > request two pins (tx, rx), or four pins (tx, tx, cts, rts).. > (I believe the hardware should support this) > > something like this: > > uart2_0_cfg: uart2-0-cfg { > uart2-0-pins { > pinmux = , /* uart2_txd */ > , /* uart2_rxd */ > }; > }; > > uart2_0_cts_rts_cfg: uart2-0-cts-rts-cfg { > uart2-0-pins { > pinmux = , /* uart2_cts */ > , /* uart2_rts */ > }; > }; > > &uart2 { > pinctrl-names = "default"; > pinctrl-0 = <&uart2_0_cfg>, <&uart2_0_cts_rts_cfg>; > }; This sounds like good idea. There were some weird pin sequences, listing them that way would result in a better structure (f.e. uart9_1_cfg). The hardware seems to deal with it just fine. I will update it this way in the next version. > >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart3_0_cfg: uart3-0-cfg { >> + uart3-0-pins { >> + pinmux = , /* uart3_txd */ >> + , /* uart3_rxd */ >> + , /* uart3_cts */ >> + ; /* uart3_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart3_1_cfg: uart3-1-cfg { >> + uart3-1-pins { >> + pinmux = , /* uart3_txd */ >> + , /* uart3_rxd */ >> + , /* uart3_cts */ >> + ; /* uart3_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart3_2_cfg: uart3-2-cfg { >> + uart3-2-pins { >> + pinmux = , /* uart3_txd */ >> + , /* uart3_rxd */ >> + , /* uart3_cts */ >> + ; /* uart3_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart4_0_cfg: uart4-0-cfg { >> + uart4-0-pins { >> + pinmux = , /* uart4_txd */ >> + ; /* uart4_rxd */ >> + power-source = <3300>; >> + bias-pull-up; >> + drive-strength = <19>; >> + }; >> + }; >> + >> + uart4_1_cfg: uart4-1-cfg { >> + uart4-1-pins { >> + pinmux = , /* uart4_cts */ >> + , /* uart4_rts */ >> + , /* uart4_txd */ >> + ; /* uart4_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart4_2_cfg: uart4-2-cfg { 
>> + uart4-2-pins { >> + pinmux = , /* uart4_txd */ >> + ; /* uart4_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart4_3_cfg: uart4-3-cfg { >> + uart4-3-pins { >> + pinmux = , /* uart4_txd */ >> + , /* uart4_rxd */ >> + , /* uart4_cts */ >> + ; /* uart4_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart4_4_cfg: uart4-4-cfg { >> + uart4-4-pins { >> + pinmux = , /* uart4_txd */ >> + , /* uart4_rxd */ >> + , /* uart4_cts */ >> + ; /* uart4_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart5_0_cfg: uart5-0-cfg { >> + uart5-0-pins { >> + pinmux = , /* uart5_txd */ >> + ; /* uart5_rxd */ >> + power-source = <3300>; >> + bias-pull-up; >> + drive-strength = <19>; >> + }; >> + }; >> + >> + uart5_1_cfg: uart5-1-cfg { >> + uart5-1-pins { >> + pinmux = , /* uart5_txd */ >> + , /* uart5_rxd */ >> + , /* uart5_cts */ >> + ; /* uart5_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart5_2_cfg: uart5-2-cfg { >> + uart5-2-pins { >> + pinmux = , /* uart5_txd */ >> + , /* uart5_rxd */ >> + , /* uart5_cts */ >> + ; /* uart5_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart5_3_cfg: uart5-3-cfg { >> + uart5-3-pins { >> + pinmux = , /* uart5_txd */ >> + , /* uart5_rxd */ >> + , /* uart5_cts */ >> + ; /* uart5_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart6_0_cfg: uart6-0-cfg { >> + uart6-0-pins { >> + pinmux = , /* uart6_cts */ >> + , /* uart6_txd */ >> + , /* uart6_rxd */ >> + ; /* uart6_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart6_1_cfg: uart6-1-cfg { >> + uart6-1-pins { >> + pinmux = , /* uart6_txd */ >> + , /* uart6_rxd */ >> + , /* uart6_cts */ >> + ; /* uart6_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart6_2_cfg: uart6-2-cfg { >> + uart6-2-pins { >> + pinmux = , /* uart6_txd */ >> + ; /* uart6_rxd */ 
>> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart7_0_cfg: uart7-0-cfg { >> + uart7-0-pins { >> + pinmux = , /* uart7_txd */ >> + ; /* uart7_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart7_1_cfg: uart7-1-cfg { >> + uart7-1-pins { >> + pinmux = , /* uart7_txd */ >> + , /* uart7_rxd */ >> + , /* uart7_cts */ >> + ; /* uart7_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart8_0_cfg: uart8-0-cfg { >> + uart8-0-pins { >> + pinmux = , /* uart8_txd */ >> + ; /* uart8_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart8_1_cfg: uart8-1-cfg { >> + uart8-1-pins { >> + pinmux = , /* uart8_txd */ >> + , /* uart8_rxd */ >> + , /* uart8_cts */ >> + ; /* uart8_rts */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart8_2_cfg: uart8-2-cfg { >> + uart8-2-pins { >> + pinmux = , /* uart8_txd */ >> + , /* uart8_rxd */ >> + , /* uart8_cts */ >> + ; /* uart8_rts */ >> + power-source = <3300>; >> + bias-pull-up; >> + drive-strength = <19>; >> + }; >> + }; >> + >> + uart9_0_cfg: uart9-0-cfg { >> + uart9-0-pins { >> + pinmux = , /* uart9_txd */ >> + ; /* uart9_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart9_1_cfg: uart9-1-cfg { >> + uart9-1-pins { >> + pinmux = , /* uart9_cts */ >> + , /* uart9_rts */ >> + , /* uart9_txd */ >> + ; /* uart9_rxd */ >> + bias-pull-up; >> + drive-strength = <32>; >> + }; >> + }; >> + >> + uart9_2_cfg: uart9-2-cfg { >> + uart9-2-pins { >> + pinmux = , /* uart9_txd */ >> + ; /* uart9_rxd */ >> + bias-pull-up; >> drive-strength = <32>; >> }; >> }; >> -- >> 2.43.0 >> Kind regards, Hendrik From fangyu.yu at linux.alibaba.com Fri Sep 12 07:01:42 2025 From: fangyu.yu at linux.alibaba.com (fangyu.yu at linux.alibaba.com) Date: Fri, 12 Sep 2025 22:01:42 +0800 Subject: [PATCH] RISC-V: KVM: Fix guest page fault within HLV* instructions In-Reply-To: 
<20250912134332.22053-1-fangyu.yu@linux.alibaba.com> References: <20250912134332.22053-1-fangyu.yu@linux.alibaba.com> Message-ID: <20250912140142.25147-1-fangyu.yu@linux.alibaba.com> >From: Fangyu Yu > >When executing HLV* instructions at the HS mode, a guest page fault >may occur when a g-stage page table migration between triggering the >virtual instruction exception and executing the HLV* instruction. > >This may be a corner case, and one simpler way to handle this is to >re-execute the instruction where the virtual instruction exception >occurred, and the guest page fault will be automatically handled. > >Fixes: 9f7013265112 ("RISC-V: KVM: Handle MMIO exits for VCPU") >Signed-off-by: Fangyu Yu >--- > arch/riscv/kvm/vcpu_insn.c | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > >diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c >index 97dec18e6989..a8b93aa4d8ec 100644 >--- a/arch/riscv/kvm/vcpu_insn.c >+++ b/arch/riscv/kvm/vcpu_insn.c >@@ -448,7 +448,12 @@ int kvm_riscv_vcpu_virtual_insn(struct kvm_vcpu *vcpu, struct kvm_run *run, > insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, > ct->sepc, > &utrap); >- if (utrap.scause) { >+ switch (utrap.scause) { >+ case 0: >+ break; >+ case EXC_LOAD_GUEST_PAGE_FAULT: >+ return KVM_INSN_CONTINUE_SAME_SEPC; >+ default: > utrap.sepc = ct->sepc; > kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); > return 1; >@@ -503,7 +508,12 @@ int kvm_riscv_vcpu_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run, > */ > insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, > &utrap); >- if (utrap.scause) { >+ switch (utrap.scause) { >+ case 0: >+ break; >+ case EXC_LOAD_GUEST_PAGE_FAULT: >+ return KVM_INSN_CONTINUE_SAME_SEPC; >+ default: > /* Redirect trap if we failed to read instruction */ > utrap.sepc = ct->sepc; > kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); >@@ -629,7 +639,12 @@ int kvm_riscv_vcpu_mmio_store(struct kvm_vcpu *vcpu, struct kvm_run *run, > */ > insn = 
kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc,
> 						  &utrap);
>-		if (utrap.scause) {
>+		switch (utrap.scause) {
>+		case 0:
>+			break;
>+		case EXC_LOAD_GUEST_PAGE_FAULT:

This should be EXC_STORE_GUEST_PAGE_FAULT; I will fix it in the next version.

>+			return KVM_INSN_CONTINUE_SAME_SEPC;
>+		default:
> 			/* Redirect trap if we failed to read instruction */
> 			utrap.sepc = ct->sepc;
> 			kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
>--
>2.49.0

From huang.ze at linux.dev Fri Sep 12 09:53:46 2025
From: huang.ze at linux.dev (Ze Huang)
Date: Sat, 13 Sep 2025 00:53:46 +0800
Subject: [PATCH v8 0/2] Add SpacemiT K1 USB3.0 host controller support
Message-ID: <20250913-dwc3_generic-v8-0-b50f81f05f95@linux.dev>

The USB 3.0 controller found in the SpacemiT K1 SoC[1] supports both USB3.0
Host and USB2.0 Dual-Role Device (DRD).

This controller is compatible with the DesignWare Core USB 3 (DWC3) driver.
However, constraints in the `snps,dwc3` bindings limit the ability to describe
hardware-specific features in a clean and maintainable way. While
`dwc3-of-simple` still serves as a glue layer for many platforms, it requires
a split device tree node structure, which is less desirable in newer platforms.

To promote a transition toward a flattened `dwc` node structure, this series
introduces `dwc3-generic-plat`, building upon prior efforts that exposed the
DWC3 core driver [2].

The device tree support for SpacemiT K1 will be submitted separately when the
associated PHY driver is ready.

This series is based on 6.17-rc1 and has been tested on BananaPi development
boards.

Link: https://developer.spacemit.com/documentation?token=AjHDwrW78igAAEkiHracBI9HnTb [1]
Link: https://lore.kernel.org/all/20250414-dwc3-refactor-v7-3-f015b358722d at oss.qualcomm.com [2]

Signed-off-by: Ze Huang
---
Changes in v8:
- dt-bindings: remove the PCIe reset which will be managed by combo PHY driver.
- driver: fix incorrect `dev_get_drvdata()` usage in PM ops, as
  `dwc3_core_probe()` overwrites the device's private data.
- Link to v7: https://lore.kernel.org/r/20250729-dwc3_generic-v7-0-5c791bba826f at linux.dev

Changes in v7:
- dt-bindings:
  - add reset-names and reset-delay properties
  - add pcie0 entry in resets for combphy init
- dwc3 generic plat driver:
  - drop clock cleanup and handle with devm_clk_bulk_get_all_enabled()
  - move devm reset action after successful de-assert
- Link to v6: https://lore.kernel.org/r/20250712-dwc3_generic-v6-0-cc87737cc936 at linux.dev

Changes in v6:
- replace SET_RUNTIME_PM_OPS/SET_SYSTEM_SLEEP_PM_OPS with
  RUNTIME_PM_OPS/SYSTEM_SLEEP_PM_OPS
- Link to v5: https://lore.kernel.org/r/20250705-dwc3_generic-v5-0-9dbc53ea53d2 at linux.dev

Changes in v5:
- drop DTS patch (will submit when PHY driver is ready)
- drop interconnects and update resets property in dt-bindings
- remove unnecessary __maybe_unused attribute and PM guards
- switch to devres APIs for reset and clock management
- Link to v4: https://lore.kernel.org/all/20250526-b4-k1-dwc3-v3-v4-0-63e4e525e5cb at whut.edu.cn/

Changes in v4:
- dt-bindings spacemit,k1-dwc:
  - reorder properties
  - add properties of phys & phy-names
  - add usb hub nodes in example dt
  - add support for spacemit,k1-mbus
- dwc3 generic plat driver:
  - rename dwc3-common.c to dwc3-generic-plat.c
  - use SYSTEM_SLEEP_PM_OPS macros and drop PM guards
- dts:
  - reorder dts properties of usb dwc3 node
  - move "dr_mode" of dwc3 from dtsi to dts
- Link to v3: https://lore.kernel.org/r/20250518-b4-k1-dwc3-v3-v3-0-7609c8baa2a6 at whut.edu.cn

Changes in v3:
- introduce dwc3-common for generic dwc3 hardware
- fix warnings in usb host dt-bindings
- fix errors in dts
- Link to v2: https://lore.kernel.org/r/20250428-b4-k1-dwc3-v2-v1-0-7cb061abd619 at whut.edu.cn

Changes in v2:
- dt-bindings:
  - add missing 'maxItems'
  - remove 'status' property in example
  - fold dwc3 node into parent
- drop dwc3 glue driver and use snps,dwc3 driver directly
- rename dts nodes and reorder properties to fit coding style
- Link to v1:
https://lore.kernel.org/all/20250407-b4-k1-usb3-v3-2-v1-0-bf0bcc41c9ba at whut.edu.cn --- Ze Huang (2): dt-bindings: usb: dwc3: add support for SpacemiT K1 usb: dwc3: add generic driver to support flattened .../devicetree/bindings/usb/spacemit,k1-dwc3.yaml | 121 +++++++++++++++ drivers/usb/dwc3/Kconfig | 11 ++ drivers/usb/dwc3/Makefile | 1 + drivers/usb/dwc3/dwc3-generic-plat.c | 166 +++++++++++++++++++++ 4 files changed, 299 insertions(+) --- base-commit: 062b3e4a1f880f104a8d4b90b767788786aa7b78 change-id: 20250705-dwc3_generic-8d02859722c3 Best regards, -- Ze Huang From huang.ze at linux.dev Fri Sep 12 09:53:47 2025 From: huang.ze at linux.dev (Ze Huang) Date: Sat, 13 Sep 2025 00:53:47 +0800 Subject: [PATCH v8 1/2] dt-bindings: usb: dwc3: add support for SpacemiT K1 In-Reply-To: <20250913-dwc3_generic-v8-0-b50f81f05f95@linux.dev> References: <20250913-dwc3_generic-v8-0-b50f81f05f95@linux.dev> Message-ID: <20250913-dwc3_generic-v8-1-b50f81f05f95@linux.dev> Add support for the USB 3.0 Dual-Role Device (DRD) controller embedded in the SpacemiT K1 SoC. The controller is based on the Synopsys DesignWare Core USB 3 (DWC3) IP, supporting USB3.0 host mode and USB 2.0 DRD mode. 
Reviewed-by: Krzysztof Kozlowski Signed-off-by: Ze Huang --- .../devicetree/bindings/usb/spacemit,k1-dwc3.yaml | 121 +++++++++++++++++++++ 1 file changed, 121 insertions(+) diff --git a/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml b/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml new file mode 100644 index 0000000000000000000000000000000000000000..0f0b5e061ca17cd02866a78d15d3808f933a76c2 --- /dev/null +++ b/Documentation/devicetree/bindings/usb/spacemit,k1-dwc3.yaml @@ -0,0 +1,121 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/usb/spacemit,k1-dwc3.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: SpacemiT K1 SuperSpeed DWC3 USB SoC Controller + +maintainers: + - Ze Huang + +description: | + The SpacemiT K1 embeds a DWC3 USB IP Core which supports Host functions + for USB 3.0 and DRD for USB 2.0. + + Key features: + - USB3.0 SuperSpeed and USB2.0 High/Full/Low-Speed support + - Supports low-power modes (USB2.0 suspend, USB3.0 U1/U2/U3) + - Internal DMA controller and flexible endpoint FIFO sizing + + Communication Interface: + - Use of PIPE3 (125MHz) interface for USB3.0 PHY + - Use of UTMI+ (30/60MHz) interface for USB2.0 PHY + +allOf: + - $ref: snps,dwc3-common.yaml# + +properties: + compatible: + const: spacemit,k1-dwc3 + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + clock-names: + const: usbdrd30 + + interrupts: + maxItems: 1 + + phys: + items: + - description: phandle to USB2/HS PHY + - description: phandle to USB3/SS PHY + + phy-names: + items: + - const: usb2-phy + - const: usb3-phy + + resets: + items: + - description: USB3.0 AHB reset + - description: USB3.0 VCC reset + - description: USB3.0 PHY reset + + reset-names: + items: + - const: ahb + - const: vcc + - const: phy + + reset-delay: + $ref: /schemas/types.yaml#/definitions/uint32 + default: 2 + description: delay after reset sequence [us] + + vbus-supply: + description: A phandle to the 
regulator supplying the VBUS voltage. + +required: + - compatible + - reg + - clocks + - clock-names + - interrupts + - phys + - phy-names + - resets + - reset-names + +unevaluatedProperties: false + +examples: + - | + usb at c0a00000 { + compatible = "spacemit,k1-dwc3"; + reg = <0xc0a00000 0x10000>; + clocks = <&syscon_apmu 16>; + clock-names = "usbdrd30"; + interrupts = <125>; + phys = <&usb2phy>, <&usb3phy>; + phy-names = "usb2-phy", "usb3-phy"; + resets = <&syscon_apmu 8>, + <&syscon_apmu 9>, + <&syscon_apmu 10>; + reset-names = "ahb", "vcc", "phy"; + reset-delay = <2>; + vbus-supply = <&usb3_vbus>; + #address-cells = <1>; + #size-cells = <0>; + + hub_2_0: hub at 1 { + compatible = "usb2109,2817"; + reg = <1>; + vdd-supply = <&usb3_vhub>; + peer-hub = <&hub_3_0>; + reset-gpios = <&gpio 3 28 1>; + }; + + hub_3_0: hub at 2 { + compatible = "usb2109,817"; + reg = <2>; + vdd-supply = <&usb3_vhub>; + peer-hub = <&hub_2_0>; + reset-gpios = <&gpio 3 28 1>; + }; + }; -- 2.34.1 From huang.ze at linux.dev Fri Sep 12 09:53:48 2025 From: huang.ze at linux.dev (Ze Huang) Date: Sat, 13 Sep 2025 00:53:48 +0800 Subject: [PATCH v8 2/2] usb: dwc3: add generic driver to support flattened In-Reply-To: <20250913-dwc3_generic-v8-0-b50f81f05f95@linux.dev> References: <20250913-dwc3_generic-v8-0-b50f81f05f95@linux.dev> Message-ID: <20250913-dwc3_generic-v8-2-b50f81f05f95@linux.dev> To support flattened dwc3 dt model and drop the glue layer, introduce the `dwc3-generic` driver. This enables direct binding of the DWC3 core driver and offers an alternative to the existing glue driver `dwc3-of-simple`. 
Acked-by: Thinh Nguyen Signed-off-by: Ze Huang --- drivers/usb/dwc3/Kconfig | 11 +++ drivers/usb/dwc3/Makefile | 1 + drivers/usb/dwc3/dwc3-generic-plat.c | 166 +++++++++++++++++++++++++++++++++++ 3 files changed, 178 insertions(+) diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig index 310d182e10b50b253d7e5a51674806e6ec442a2a..4925d15084f816d3ff92059b476ebcc799b56b51 100644 --- a/drivers/usb/dwc3/Kconfig +++ b/drivers/usb/dwc3/Kconfig @@ -189,4 +189,15 @@ config USB_DWC3_RTK or dual-role mode. Say 'Y' or 'M' if you have such device. +config USB_DWC3_GENERIC_PLAT + tristate "DWC3 Generic Platform Driver" + depends on OF && COMMON_CLK + default USB_DWC3 + help + Support USB3 functionality in simple SoC integrations. + Currently supports SpacemiT DWC USB3. Platforms using + dwc3-of-simple can easily switch to dwc3-generic by flattening + the dwc3 child node in the device tree. + Say 'Y' or 'M' here if your platform integrates DWC3 in a similar way. + endif diff --git a/drivers/usb/dwc3/Makefile b/drivers/usb/dwc3/Makefile index 830e6c9e5fe073c1f662ce34b6a4a2da34c407a2..96469e48ff9d189cc8d0b65e65424eae2158bcfe 100644 --- a/drivers/usb/dwc3/Makefile +++ b/drivers/usb/dwc3/Makefile @@ -57,3 +57,4 @@ obj-$(CONFIG_USB_DWC3_IMX8MP) += dwc3-imx8mp.o obj-$(CONFIG_USB_DWC3_XILINX) += dwc3-xilinx.o obj-$(CONFIG_USB_DWC3_OCTEON) += dwc3-octeon.o obj-$(CONFIG_USB_DWC3_RTK) += dwc3-rtk.o +obj-$(CONFIG_USB_DWC3_GENERIC_PLAT) += dwc3-generic-plat.o diff --git a/drivers/usb/dwc3/dwc3-generic-plat.c b/drivers/usb/dwc3/dwc3-generic-plat.c new file mode 100644 index 0000000000000000000000000000000000000000..d96b20570002dc619ea813f4d6a8013636a0f346 --- /dev/null +++ b/drivers/usb/dwc3/dwc3-generic-plat.c @@ -0,0 +1,166 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * dwc3-generic-plat.c - DesignWare USB3 generic platform driver + * + * Copyright (C) 2025 Ze Huang + * + * Inspired by dwc3-qcom.c and dwc3-of-simple.c + */ + +#include +#include +#include +#include 
"glue.h" + +struct dwc3_generic { + struct device *dev; + struct dwc3 dwc; + struct clk_bulk_data *clks; + int num_clocks; + struct reset_control *resets; +}; + +#define to_dwc3_generic(d) container_of((d), struct dwc3_generic, dwc) + +static void dwc3_generic_reset_control_assert(void *data) +{ + reset_control_assert(data); +} + +static int dwc3_generic_probe(struct platform_device *pdev) +{ + struct dwc3_probe_data probe_data = {}; + struct device *dev = &pdev->dev; + struct dwc3_generic *dwc3g; + struct resource *res; + int ret; + + dwc3g = devm_kzalloc(dev, sizeof(*dwc3g), GFP_KERNEL); + if (!dwc3g) + return -ENOMEM; + + dwc3g->dev = dev; + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!res) { + dev_err(&pdev->dev, "missing memory resource\n"); + return -ENODEV; + } + + dwc3g->resets = devm_reset_control_array_get_optional_exclusive(dev); + if (IS_ERR(dwc3g->resets)) + return dev_err_probe(dev, PTR_ERR(dwc3g->resets), "failed to get resets\n"); + + ret = reset_control_assert(dwc3g->resets); + if (ret) + return dev_err_probe(dev, ret, "failed to assert resets\n"); + + /* Not strict timing, just for safety */ + udelay(2); + + ret = reset_control_deassert(dwc3g->resets); + if (ret) + return dev_err_probe(dev, ret, "failed to deassert resets\n"); + + ret = devm_add_action_or_reset(dev, dwc3_generic_reset_control_assert, dwc3g->resets); + if (ret) + return ret; + + ret = devm_clk_bulk_get_all_enabled(dwc3g->dev, &dwc3g->clks); + if (ret < 0) + return dev_err_probe(dev, ret, "failed to get clocks\n"); + + dwc3g->num_clocks = ret; + dwc3g->dwc.dev = dev; + probe_data.dwc = &dwc3g->dwc; + probe_data.res = res; + probe_data.ignore_clocks_and_resets = true; + ret = dwc3_core_probe(&probe_data); + if (ret) + return dev_err_probe(dev, ret, "failed to register DWC3 Core\n"); + + return 0; +} + +static void dwc3_generic_remove(struct platform_device *pdev) +{ + struct dwc3 *dwc = platform_get_drvdata(pdev); + struct dwc3_generic *dwc3g = to_dwc3_generic(dwc); 
+ + dwc3_core_remove(dwc); + + clk_bulk_disable_unprepare(dwc3g->num_clocks, dwc3g->clks); +} + +static int dwc3_generic_suspend(struct device *dev) +{ + struct dwc3 *dwc = dev_get_drvdata(dev); + struct dwc3_generic *dwc3g = to_dwc3_generic(dwc); + int ret; + + ret = dwc3_pm_suspend(dwc); + if (ret) + return ret; + + clk_bulk_disable_unprepare(dwc3g->num_clocks, dwc3g->clks); + + return 0; +} + +static int dwc3_generic_resume(struct device *dev) +{ + struct dwc3 *dwc = dev_get_drvdata(dev); + struct dwc3_generic *dwc3g = to_dwc3_generic(dwc); + int ret; + + ret = clk_bulk_prepare_enable(dwc3g->num_clocks, dwc3g->clks); + if (ret) + return ret; + + ret = dwc3_pm_resume(dwc); + if (ret) + return ret; + + return 0; +} + +static int dwc3_generic_runtime_suspend(struct device *dev) +{ + return dwc3_runtime_suspend(dev_get_drvdata(dev)); +} + +static int dwc3_generic_runtime_resume(struct device *dev) +{ + return dwc3_runtime_resume(dev_get_drvdata(dev)); +} + +static int dwc3_generic_runtime_idle(struct device *dev) +{ + return dwc3_runtime_idle(dev_get_drvdata(dev)); +} + +static const struct dev_pm_ops dwc3_generic_dev_pm_ops = { + SYSTEM_SLEEP_PM_OPS(dwc3_generic_suspend, dwc3_generic_resume) + RUNTIME_PM_OPS(dwc3_generic_runtime_suspend, dwc3_generic_runtime_resume, + dwc3_generic_runtime_idle) +}; + +static const struct of_device_id dwc3_generic_of_match[] = { + { .compatible = "spacemit,k1-dwc3", }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, dwc3_generic_of_match); + +static struct platform_driver dwc3_generic_driver = { + .probe = dwc3_generic_probe, + .remove = dwc3_generic_remove, + .driver = { + .name = "dwc3-generic-plat", + .of_match_table = dwc3_generic_of_match, + .pm = pm_ptr(&dwc3_generic_dev_pm_ops), + }, +}; +module_platform_driver(dwc3_generic_driver); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("DesignWare USB3 generic platform driver"); -- 2.34.1 From conor at kernel.org Fri Sep 12 10:57:19 2025 From: conor at kernel.org (Conor Dooley) 
Date: Fri, 12 Sep 2025 18:57:19 +0100 Subject: [PATCH 2/3] riscv: dts: thead: add ziccrse for th1520 In-Reply-To: <20250911184528.1512543-3-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> <20250911184528.1512543-3-rabenda.cn@gmail.com> Message-ID: <20250912-gander-fox-d20c2e431816@spud> On Fri, Sep 12, 2025 at 02:45:27AM +0800, Han Gao wrote: > th1520 support Ziccrse ISA extension [1]. > > Link: https://lore.kernel.org/all/20241103145153.105097-12-alexghiti at rivosinc.com/ [1] I don't see what this link has to do with th1520 supporting the extension. The kernel supporting it has nothing to do with whether it should be in the dts or not. A useful link would substantiate your claim. > Signed-off-by: Han Gao > Signed-off-by: Han Gao You only need to sign this off once. Cheers, Conor. > --- > arch/riscv/boot/dts/thead/th1520.dtsi | 24 ++++++++++++++++-------- > 1 file changed, 16 insertions(+), 8 deletions(-) > > diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi > index 59d1927764a6..7f07688aa964 100644 > --- a/arch/riscv/boot/dts/thead/th1520.dtsi > +++ b/arch/riscv/boot/dts/thead/th1520.dtsi > @@ -24,8 +24,10 @@ c910_0: cpu at 0 { > device_type = "cpu"; > riscv,isa = "rv64imafdc"; > riscv,isa-base = "rv64i"; > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > - "zifencei", "zihpm", "xtheadvector"; > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > + "ziccrse", "zicntr", "zicsr", > + "zifencei", "zihpm", > + "xtheadvector"; > thead,vlenb = <16>; > reg = <0>; > i-cache-block-size = <64>; > @@ -49,8 +51,10 @@ c910_1: cpu at 1 { > device_type = "cpu"; > riscv,isa = "rv64imafdc"; > riscv,isa-base = "rv64i"; > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > - "zifencei", "zihpm", "xtheadvector"; > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > + "ziccrse", "zicntr", "zicsr", > + "zifencei", "zihpm", > + "xtheadvector"; > 
thead,vlenb = <16>; > reg = <1>; > i-cache-block-size = <64>; > @@ -74,8 +78,10 @@ c910_2: cpu at 2 { > device_type = "cpu"; > riscv,isa = "rv64imafdc"; > riscv,isa-base = "rv64i"; > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > - "zifencei", "zihpm", "xtheadvector"; > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > + "ziccrse", "zicntr", "zicsr", > + "zifencei", "zihpm", > + "xtheadvector"; > thead,vlenb = <16>; > reg = <2>; > i-cache-block-size = <64>; > @@ -99,8 +105,10 @@ c910_3: cpu at 3 { > device_type = "cpu"; > riscv,isa = "rv64imafdc"; > riscv,isa-base = "rv64i"; > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > - "zifencei", "zihpm", "xtheadvector"; > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > + "ziccrse", "zicntr", "zicsr", > + "zifencei", "zihpm", > + "xtheadvector"; > thead,vlenb = <16>; > reg = <3>; > i-cache-block-size = <64>; > -- > 2.47.3 > From conor at kernel.org Fri Sep 12 10:59:08 2025 From: conor at kernel.org (Conor Dooley) Date: Fri, 12 Sep 2025 18:59:08 +0100 Subject: [PATCH 3/3] riscv: dts: thead: add zfh for th1520 In-Reply-To: <20250911184528.1512543-4-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> <20250911184528.1512543-4-rabenda.cn@gmail.com> Message-ID: <20250912-verdict-croon-81ac20e5b621@spud> On Fri, Sep 12, 2025 at 02:45:28AM +0800, Han Gao wrote: > th1520 support Zfh ISA extension [1]. > > Link: https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1737721869472/%E7%8E%84%E9%93%81C910%E4%B8%8EC920R1S6%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C%28xrvm%29_20250124.pdf [1] Could you please cite the section that this is detailed in?
> > Signed-off-by: Han Gao > Signed-off-by: Han Gao > --- > arch/riscv/boot/dts/thead/th1520.dtsi | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi > index 7f07688aa964..2075bb969c2f 100644 > --- a/arch/riscv/boot/dts/thead/th1520.dtsi > +++ b/arch/riscv/boot/dts/thead/th1520.dtsi > @@ -26,7 +26,7 @@ c910_0: cpu at 0 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <0>; > @@ -53,7 +53,7 @@ c910_1: cpu at 1 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <1>; > @@ -80,7 +80,7 @@ c910_2: cpu at 2 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <2>; > @@ -107,7 +107,7 @@ c910_3: cpu at 3 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <3>; > -- > 2.47.3 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
From devnull+schuster.simon.siemens-energy.com at kernel.org Mon Sep 1 06:09:49 2025 From: devnull+schuster.simon.siemens-energy.com at kernel.org (Simon Schuster via B4 Relay) Date: Mon, 01 Sep 2025 15:09:49 +0200 Subject: [PATCH v2 0/4] nios2: Add architecture support for clone3 Message-ID: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> This series adds support for the clone3 system call to the nios2 architecture. This addresses the build-time warning "warning: clone3() entry point is missing, please fix" introduced in 505d66d1abfb9 ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro"). The implementation passes the relevant clone3 tests of kselftest when applied on top of next-20250815: ./run_kselftest.sh TAP version 13 1..4 # selftests: clone3: clone3 ok 1 selftests: clone3: clone3 # selftests: clone3: clone3_clear_sighand ok 2 selftests: clone3: clone3_clear_sighand # selftests: clone3: clone3_set_tid ok 3 selftests: clone3: clone3_set_tid # selftests: clone3: clone3_cap_checkpoint_restore ok 4 selftests: clone3: clone3_cap_checkpoint_restore The series also includes a small patch to kernel/fork.c that ensures that clone_flags are passed correctly on architectures where unsigned long is insufficient to store the u64 clone_flags. It is marked as a fix for stable backporting. As requested, in v2, this series now further tries to correct this type error throughout the whole code base. Thus, it now touches a larger number of subsystems and all architectures. Therefore, another test was performed for ARCH=x86_64 (as a representative for 64-bit architectures). Here, the series builds cleanly without warnings on defconfig with CONFIG_SECURITY_APPARMOR=y and CONFIG_SECURITY_TOMOYO=y (to compile-check the LSM-related changes).
The build further successfully passes testing/selftests/clone3 (with the patch from 20241105062948.1037011-1-zhouyuhang1010 at 163.com to prepare clone3_cap_checkpoint_restore for compatibility with the newer libcap version on my system). Is there any option to further preflight check this patch series via lkp/KernelCI/etc. for a broader test across architectures, or is this degree of testing sufficient to eventually get the series merged? N.B.: The series is not checkpatch clean right now: - include/linux/cred.h, include/linux/mnt_namespace.h: function definition arguments without identifier name - include/trace/events/task.h: space prohibited after that open parenthesis I did not fix these warnings to keep my changes minimal and reviewable, as the issues persist throughout the files and they were not introduced by me; I only followed the existing code style and just replaced the types. If desired, I'd be happy to make the changes in a potential v3, though. Signed-off-by: Simon Schuster --- Changes in v2: - Introduce "Fixes:" and "Cc: stable at vger.kernel.org" where necessary - Factor out "Fixes:" when adapting the datatype of clone_flags for easier backports - Fix additional instances where `unsigned long` clone_flags is used - Reword commit message to make it clearer that any 32-bit arch is affected by this bug - Link to v1: https://lore.kernel.org/r/20250821-nios2-implement-clone3-v1-0-1bb24017376a at siemens-energy.com --- Simon Schuster (4): copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) copy_process: pass clone_flags as u64 across calltree arch: copy_thread: pass clone_flags as u64 nios2: implement architecture-specific portion of sys_clone3 arch/alpha/kernel/process.c | 2 +- arch/arc/kernel/process.c | 2 +- arch/arm/kernel/process.c | 2 +- arch/arm64/kernel/process.c | 2 +- arch/csky/kernel/process.c | 2 +- arch/hexagon/kernel/process.c | 2 +- arch/loongarch/kernel/process.c | 2 +- arch/m68k/kernel/process.c | 2 +- 
arch/microblaze/kernel/process.c | 2 +- arch/mips/kernel/process.c | 2 +- arch/nios2/include/asm/syscalls.h | 1 + arch/nios2/include/asm/unistd.h | 2 -- arch/nios2/kernel/entry.S | 6 ++++++ arch/nios2/kernel/process.c | 2 +- arch/nios2/kernel/syscall_table.c | 1 + arch/openrisc/kernel/process.c | 2 +- arch/parisc/kernel/process.c | 2 +- arch/powerpc/kernel/process.c | 2 +- arch/riscv/kernel/process.c | 2 +- arch/s390/kernel/process.c | 2 +- arch/sh/kernel/process_32.c | 2 +- arch/sparc/kernel/process_32.c | 2 +- arch/sparc/kernel/process_64.c | 2 +- arch/um/kernel/process.c | 2 +- arch/x86/include/asm/fpu/sched.h | 2 +- arch/x86/include/asm/shstk.h | 4 ++-- arch/x86/kernel/fpu/core.c | 2 +- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 2 +- arch/xtensa/kernel/process.c | 2 +- block/blk-ioc.c | 2 +- fs/namespace.c | 2 +- include/linux/cgroup.h | 4 ++-- include/linux/cred.h | 2 +- include/linux/iocontext.h | 6 +++--- include/linux/ipc_namespace.h | 4 ++-- include/linux/lsm_hook_defs.h | 2 +- include/linux/mnt_namespace.h | 2 +- include/linux/nsproxy.h | 2 +- include/linux/pid_namespace.h | 4 ++-- include/linux/rseq.h | 4 ++-- include/linux/sched/task.h | 2 +- include/linux/security.h | 4 ++-- include/linux/sem.h | 4 ++-- include/linux/time_namespace.h | 4 ++-- include/linux/uprobes.h | 4 ++-- include/linux/user_events.h | 4 ++-- include/linux/utsname.h | 4 ++-- include/net/net_namespace.h | 4 ++-- include/trace/events/task.h | 6 +++--- ipc/namespace.c | 2 +- ipc/sem.c | 2 +- kernel/cgroup/namespace.c | 2 +- kernel/cred.c | 2 +- kernel/events/uprobes.c | 2 +- kernel/fork.c | 10 +++++----- kernel/nsproxy.c | 4 ++-- kernel/pid_namespace.c | 2 +- kernel/sched/core.c | 4 ++-- kernel/sched/fair.c | 2 +- kernel/sched/sched.h | 4 ++-- kernel/time/namespace.c | 2 +- kernel/utsname.c | 2 +- net/core/net_namespace.c | 2 +- security/apparmor/lsm.c | 2 +- security/security.c | 2 +- security/selinux/hooks.c | 2 +- security/tomoyo/tomoyo.c | 2 +- 68 files changed, 95 
insertions(+), 89 deletions(-) --- base-commit: 1357b2649c026b51353c84ddd32bc963e8999603 change-id: 20250818-nios2-implement-clone3-7f252c20860b Best regards, -- Simon Schuster From devnull+schuster.simon.siemens-energy.com at kernel.org Mon Sep 1 06:09:51 2025 From: devnull+schuster.simon.siemens-energy.com at kernel.org (Simon Schuster via B4 Relay) Date: Mon, 01 Sep 2025 15:09:51 +0200 Subject: [PATCH v2 2/4] copy_process: pass clone_flags as u64 across calltree In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> Message-ID: <20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com> From: Simon Schuster With the introduction of clone3 in commit 7f192e3cd316 ("fork: add clone3") the effective bit width of clone_flags on all architectures was increased from 32-bit to 64-bit, with a new type of u64 for the flags. However, for most consumers of clone_flags the interface was not changed from the previous type of unsigned long. While this works fine as long as none of the new 64-bit flag bits (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still undesirable in terms of the principle of least surprise. Thus, this commit fixes all relevant interfaces of callees to sys_clone3/copy_process (excluding the architecture-specific copy_thread) to consistently pass clone_flags as u64, so that no truncation to 32-bit integers occurs on 32-bit architectures. 
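As a rough illustration of the truncation described in the commit message above, here is a standalone userspace sketch (not kernel code). The flag values match include/uapi/linux/sched.h; ulong32_t and the two helper functions are hypothetical names, introduced here only to model a 32-bit unsigned long and callees with the old and the new prototype.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/sched.h. */
#define CLONE_VM            0x00000100ULL
#define CLONE_CLEAR_SIGHAND 0x100000000ULL
#define CLONE_INTO_CGROUP   0x200000000ULL

/* Hypothetical stand-in for "unsigned long" on a 32-bit architecture. */
typedef uint32_t ulong32_t;

/*
 * Models a callee that still takes "unsigned long clone_flags" on a
 * 32-bit architecture: the upper 32 bits are dropped before the test.
 */
bool flag_seen_after_narrowing(uint64_t clone_flags, uint64_t flag)
{
	ulong32_t narrowed = (ulong32_t)clone_flags;

	return (narrowed & flag) != 0;
}

/* Models a callee that takes "u64 clone_flags": no bits are lost. */
bool flag_seen_wide(uint64_t clone_flags, uint64_t flag)
{
	return (clone_flags & flag) != 0;
}
```

In this sketch, flag_seen_wide(CLONE_VM | CLONE_CLEAR_SIGHAND, CLONE_CLEAR_SIGHAND) is true, while flag_seen_after_narrowing() with the same arguments is false; flags in the low 32 bits such as CLONE_VM are seen by both. With a real unsigned long on a 64-bit architecture the narrowing step would be a no-op, which is why only 32-bit architectures are affected.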
Signed-off-by: Simon Schuster Reviewed-by: Lorenzo Stoakes --- block/blk-ioc.c | 2 +- fs/namespace.c | 2 +- include/linux/cgroup.h | 4 ++-- include/linux/cred.h | 2 +- include/linux/iocontext.h | 6 +++--- include/linux/ipc_namespace.h | 4 ++-- include/linux/lsm_hook_defs.h | 2 +- include/linux/mnt_namespace.h | 2 +- include/linux/nsproxy.h | 2 +- include/linux/pid_namespace.h | 4 ++-- include/linux/rseq.h | 4 ++-- include/linux/sched/task.h | 2 +- include/linux/security.h | 4 ++-- include/linux/sem.h | 4 ++-- include/linux/time_namespace.h | 4 ++-- include/linux/uprobes.h | 4 ++-- include/linux/user_events.h | 4 ++-- include/linux/utsname.h | 4 ++-- include/net/net_namespace.h | 4 ++-- include/trace/events/task.h | 6 +++--- ipc/namespace.c | 2 +- ipc/sem.c | 2 +- kernel/cgroup/namespace.c | 2 +- kernel/cred.c | 2 +- kernel/events/uprobes.c | 2 +- kernel/fork.c | 8 ++++---- kernel/nsproxy.c | 4 ++-- kernel/pid_namespace.c | 2 +- kernel/sched/core.c | 4 ++-- kernel/sched/fair.c | 2 +- kernel/sched/sched.h | 4 ++-- kernel/time/namespace.c | 2 +- kernel/utsname.c | 2 +- net/core/net_namespace.c | 2 +- security/apparmor/lsm.c | 2 +- security/security.c | 2 +- security/selinux/hooks.c | 2 +- security/tomoyo/tomoyo.c | 2 +- 38 files changed, 59 insertions(+), 59 deletions(-) diff --git a/block/blk-ioc.c b/block/blk-ioc.c index 9fda3906e5f5..d15918d7fabb 100644 --- a/block/blk-ioc.c +++ b/block/blk-ioc.c @@ -286,7 +286,7 @@ int set_task_ioprio(struct task_struct *task, int ioprio) } EXPORT_SYMBOL_GPL(set_task_ioprio); -int __copy_io(unsigned long clone_flags, struct task_struct *tsk) +int __copy_io(u64 clone_flags, struct task_struct *tsk) { struct io_context *ioc = current->io_context; diff --git a/fs/namespace.c b/fs/namespace.c index 4b352a44cb80..0cd875b38552 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -4202,7 +4202,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns, bool a } __latent_entropy -struct mnt_namespace 
*copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, +struct mnt_namespace *copy_mnt_ns(u64 flags, struct mnt_namespace *ns, struct user_namespace *user_ns, struct fs_struct *new_fs) { struct mnt_namespace *new_ns; diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index ae73dbb19165..15ed7a8f0abb 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -801,7 +801,7 @@ extern struct cgroup_namespace init_cgroup_ns; void free_cgroup_ns(struct cgroup_namespace *ns); -struct cgroup_namespace *copy_cgroup_ns(unsigned long flags, +struct cgroup_namespace *copy_cgroup_ns(u64 flags, struct user_namespace *user_ns, struct cgroup_namespace *old_ns); @@ -823,7 +823,7 @@ static inline void put_cgroup_ns(struct cgroup_namespace *ns) static inline void free_cgroup_ns(struct cgroup_namespace *ns) { } static inline struct cgroup_namespace * -copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, +copy_cgroup_ns(u64 flags, struct user_namespace *user_ns, struct cgroup_namespace *old_ns) { return old_ns; diff --git a/include/linux/cred.h b/include/linux/cred.h index a102a10f833f..89ae50ad2ace 100644 --- a/include/linux/cred.h +++ b/include/linux/cred.h @@ -148,7 +148,7 @@ struct cred { extern void __put_cred(struct cred *); extern void exit_creds(struct task_struct *); -extern int copy_creds(struct task_struct *, unsigned long); +extern int copy_creds(struct task_struct *, u64); extern const struct cred *get_task_cred(struct task_struct *); extern struct cred *cred_alloc_blank(void); extern struct cred *prepare_creds(void); diff --git a/include/linux/iocontext.h b/include/linux/iocontext.h index 14f7eaf1b443..079d8773790c 100644 --- a/include/linux/iocontext.h +++ b/include/linux/iocontext.h @@ -118,8 +118,8 @@ struct task_struct; #ifdef CONFIG_BLOCK void put_io_context(struct io_context *ioc); void exit_io_context(struct task_struct *task); -int __copy_io(unsigned long clone_flags, struct task_struct *tsk); -static inline int 
copy_io(unsigned long clone_flags, struct task_struct *tsk) +int __copy_io(u64 clone_flags, struct task_struct *tsk); +static inline int copy_io(u64 clone_flags, struct task_struct *tsk) { if (!current->io_context) return 0; @@ -129,7 +129,7 @@ static inline int copy_io(unsigned long clone_flags, struct task_struct *tsk) struct io_context; static inline void put_io_context(struct io_context *ioc) { } static inline void exit_io_context(struct task_struct *task) { } -static inline int copy_io(unsigned long clone_flags, struct task_struct *tsk) +static inline int copy_io(u64 clone_flags, struct task_struct *tsk) { return 0; } diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h index e8240cf2611a..4b399893e2b3 100644 --- a/include/linux/ipc_namespace.h +++ b/include/linux/ipc_namespace.h @@ -129,7 +129,7 @@ static inline int mq_init_ns(struct ipc_namespace *ns) { return 0; } #endif #if defined(CONFIG_IPC_NS) -extern struct ipc_namespace *copy_ipcs(unsigned long flags, +extern struct ipc_namespace *copy_ipcs(u64 flags, struct user_namespace *user_ns, struct ipc_namespace *ns); static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns) @@ -151,7 +151,7 @@ static inline struct ipc_namespace *get_ipc_ns_not_zero(struct ipc_namespace *ns extern void put_ipc_ns(struct ipc_namespace *ns); #else -static inline struct ipc_namespace *copy_ipcs(unsigned long flags, +static inline struct ipc_namespace *copy_ipcs(u64 flags, struct user_namespace *user_ns, struct ipc_namespace *ns) { if (flags & CLONE_NEWIPC) diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index fd11fffdd3c3..adbe234a6f6c 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -211,7 +211,7 @@ LSM_HOOK(int, 0, file_open, struct file *file) LSM_HOOK(int, 0, file_post_open, struct file *file, int mask) LSM_HOOK(int, 0, file_truncate, struct file *file) LSM_HOOK(int, 0, task_alloc, struct task_struct *task, - unsigned long 
clone_flags) + u64 clone_flags) LSM_HOOK(void, LSM_RET_VOID, task_free, struct task_struct *task) LSM_HOOK(int, 0, cred_alloc_blank, struct cred *cred, gfp_t gfp) LSM_HOOK(void, LSM_RET_VOID, cred_free, struct cred *cred) diff --git a/include/linux/mnt_namespace.h b/include/linux/mnt_namespace.h index 70b366b64816..ff290c87b2e7 100644 --- a/include/linux/mnt_namespace.h +++ b/include/linux/mnt_namespace.h @@ -11,7 +11,7 @@ struct fs_struct; struct user_namespace; struct ns_common; -extern struct mnt_namespace *copy_mnt_ns(unsigned long, struct mnt_namespace *, +extern struct mnt_namespace *copy_mnt_ns(u64, struct mnt_namespace *, struct user_namespace *, struct fs_struct *); extern void put_mnt_ns(struct mnt_namespace *ns); DEFINE_FREE(put_mnt_ns, struct mnt_namespace *, if (!IS_ERR_OR_NULL(_T)) put_mnt_ns(_T)) diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index dab6a1734a22..82533e899ff4 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -103,7 +103,7 @@ static inline struct cred *nsset_cred(struct nsset *set) * */ -int copy_namespaces(unsigned long flags, struct task_struct *tsk); +int copy_namespaces(u64 flags, struct task_struct *tsk); void exit_task_namespaces(struct task_struct *tsk); void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new); int exec_task_namespaces(void); diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h index 7c67a5811199..0620a3e08e83 100644 --- a/include/linux/pid_namespace.h +++ b/include/linux/pid_namespace.h @@ -78,7 +78,7 @@ static inline int pidns_memfd_noexec_scope(struct pid_namespace *ns) } #endif -extern struct pid_namespace *copy_pid_ns(unsigned long flags, +extern struct pid_namespace *copy_pid_ns(u64 flags, struct user_namespace *user_ns, struct pid_namespace *ns); extern void zap_pid_ns_processes(struct pid_namespace *pid_ns); extern int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd); @@ -97,7 +97,7 @@ static inline int 
pidns_memfd_noexec_scope(struct pid_namespace *ns) return 0; } -static inline struct pid_namespace *copy_pid_ns(unsigned long flags, +static inline struct pid_namespace *copy_pid_ns(u64 flags, struct user_namespace *user_ns, struct pid_namespace *ns) { if (flags & CLONE_NEWPID) diff --git a/include/linux/rseq.h b/include/linux/rseq.h index bc8af3eb5598..a96fd345aa38 100644 --- a/include/linux/rseq.h +++ b/include/linux/rseq.h @@ -65,7 +65,7 @@ static inline void rseq_migrate(struct task_struct *t) * If parent process has a registered restartable sequences area, the * child inherits. Unregister rseq for a clone with CLONE_VM set. */ -static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags) +static inline void rseq_fork(struct task_struct *t, u64 clone_flags) { if (clone_flags & CLONE_VM) { t->rseq = NULL; @@ -107,7 +107,7 @@ static inline void rseq_preempt(struct task_struct *t) static inline void rseq_migrate(struct task_struct *t) { } -static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags) +static inline void rseq_fork(struct task_struct *t, u64 clone_flags) { } static inline void rseq_execve(struct task_struct *t) diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h index ea41795a352b..34d6a0e108c3 100644 --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -63,7 +63,7 @@ extern int lockdep_tasklist_lock_is_held(void); extern asmlinkage void schedule_tail(struct task_struct *prev); extern void init_idle(struct task_struct *idle, int cpu); -extern int sched_fork(unsigned long clone_flags, struct task_struct *p); +extern int sched_fork(u64 clone_flags, struct task_struct *p); extern int sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs); extern void sched_cancel_fork(struct task_struct *p); extern void sched_post_fork(struct task_struct *p); diff --git a/include/linux/security.h b/include/linux/security.h index 521bcb5b9717..9a1d4a6c8673 100644 --- 
a/include/linux/security.h +++ b/include/linux/security.h @@ -489,7 +489,7 @@ int security_file_receive(struct file *file); int security_file_open(struct file *file); int security_file_post_open(struct file *file, int mask); int security_file_truncate(struct file *file); -int security_task_alloc(struct task_struct *task, unsigned long clone_flags); +int security_task_alloc(struct task_struct *task, u64 clone_flags); void security_task_free(struct task_struct *task); int security_cred_alloc_blank(struct cred *cred, gfp_t gfp); void security_cred_free(struct cred *cred); @@ -1215,7 +1215,7 @@ static inline int security_file_truncate(struct file *file) } static inline int security_task_alloc(struct task_struct *task, - unsigned long clone_flags) + u64 clone_flags) { return 0; } diff --git a/include/linux/sem.h b/include/linux/sem.h index c4deefe42aeb..275269ce2ec8 100644 --- a/include/linux/sem.h +++ b/include/linux/sem.h @@ -9,12 +9,12 @@ struct task_struct; #ifdef CONFIG_SYSVIPC -extern int copy_semundo(unsigned long clone_flags, struct task_struct *tsk); +extern int copy_semundo(u64 clone_flags, struct task_struct *tsk); extern void exit_sem(struct task_struct *tsk); #else -static inline int copy_semundo(unsigned long clone_flags, struct task_struct *tsk) +static inline int copy_semundo(u64 clone_flags, struct task_struct *tsk) { return 0; } diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h index bb2c52f4fc94..b6e36525e0be 100644 --- a/include/linux/time_namespace.h +++ b/include/linux/time_namespace.h @@ -43,7 +43,7 @@ static inline struct time_namespace *get_time_ns(struct time_namespace *ns) return ns; } -struct time_namespace *copy_time_ns(unsigned long flags, +struct time_namespace *copy_time_ns(u64 flags, struct user_namespace *user_ns, struct time_namespace *old_ns); void free_time_ns(struct time_namespace *ns); @@ -129,7 +129,7 @@ static inline void put_time_ns(struct time_namespace *ns) } static inline -struct time_namespace 
*copy_time_ns(unsigned long flags, +struct time_namespace *copy_time_ns(u64 flags, struct user_namespace *user_ns, struct time_namespace *old_ns) { diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 516217c39094..915303a82d84 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -205,7 +205,7 @@ extern void uprobe_start_dup_mmap(void); extern void uprobe_end_dup_mmap(void); extern void uprobe_dup_mmap(struct mm_struct *oldmm, struct mm_struct *newmm); extern void uprobe_free_utask(struct task_struct *t); -extern void uprobe_copy_process(struct task_struct *t, unsigned long flags); +extern void uprobe_copy_process(struct task_struct *t, u64 flags); extern int uprobe_post_sstep_notifier(struct pt_regs *regs); extern int uprobe_pre_sstep_notifier(struct pt_regs *regs); extern void uprobe_notify_resume(struct pt_regs *regs); @@ -281,7 +281,7 @@ static inline bool uprobe_deny_signal(void) static inline void uprobe_free_utask(struct task_struct *t) { } -static inline void uprobe_copy_process(struct task_struct *t, unsigned long flags) +static inline void uprobe_copy_process(struct task_struct *t, u64 flags) { } static inline void uprobe_clear_state(struct mm_struct *mm) diff --git a/include/linux/user_events.h b/include/linux/user_events.h index 8afa8c3a0973..57d1ff006090 100644 --- a/include/linux/user_events.h +++ b/include/linux/user_events.h @@ -33,7 +33,7 @@ extern void user_event_mm_dup(struct task_struct *t, extern void user_event_mm_remove(struct task_struct *t); static inline void user_events_fork(struct task_struct *t, - unsigned long clone_flags) + u64 clone_flags) { struct user_event_mm *old_mm; @@ -68,7 +68,7 @@ static inline void user_events_exit(struct task_struct *t) } #else static inline void user_events_fork(struct task_struct *t, - unsigned long clone_flags) + u64 clone_flags) { } diff --git a/include/linux/utsname.h b/include/linux/utsname.h index bf7613ba412b..ba34ec0e2f95 100644 --- a/include/linux/utsname.h +++ 
b/include/linux/utsname.h @@ -35,7 +35,7 @@ static inline void get_uts_ns(struct uts_namespace *ns) refcount_inc(&ns->ns.count); } -extern struct uts_namespace *copy_utsname(unsigned long flags, +extern struct uts_namespace *copy_utsname(u64 flags, struct user_namespace *user_ns, struct uts_namespace *old_ns); extern void free_uts_ns(struct uts_namespace *ns); @@ -55,7 +55,7 @@ static inline void put_uts_ns(struct uts_namespace *ns) { } -static inline struct uts_namespace *copy_utsname(unsigned long flags, +static inline struct uts_namespace *copy_utsname(u64 flags, struct user_namespace *user_ns, struct uts_namespace *old_ns) { if (flags & CLONE_NEWUTS) diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index 025a7574b275..0e008cfe159d 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -204,7 +204,7 @@ struct net { extern struct net init_net; #ifdef CONFIG_NET_NS -struct net *copy_net_ns(unsigned long flags, struct user_namespace *user_ns, +struct net *copy_net_ns(u64 flags, struct user_namespace *user_ns, struct net *old_net); void net_ns_get_ownership(const struct net *net, kuid_t *uid, kgid_t *gid); @@ -218,7 +218,7 @@ extern struct task_struct *cleanup_net_task; #else /* CONFIG_NET_NS */ #include #include -static inline struct net *copy_net_ns(unsigned long flags, +static inline struct net *copy_net_ns(u64 flags, struct user_namespace *user_ns, struct net *old_net) { if (flags & CLONE_NEWNET) diff --git a/include/trace/events/task.h b/include/trace/events/task.h index af535b053033..4f0759634306 100644 --- a/include/trace/events/task.h +++ b/include/trace/events/task.h @@ -8,14 +8,14 @@ TRACE_EVENT(task_newtask, - TP_PROTO(struct task_struct *task, unsigned long clone_flags), + TP_PROTO(struct task_struct *task, u64 clone_flags), TP_ARGS(task, clone_flags), TP_STRUCT__entry( __field( pid_t, pid) __array( char, comm, TASK_COMM_LEN) - __field( unsigned long, clone_flags) + __field( u64, clone_flags) __field( short, 
oom_score_adj) ), @@ -26,7 +26,7 @@ TRACE_EVENT(task_newtask, __entry->oom_score_adj = task->signal->oom_score_adj; ), - TP_printk("pid=%d comm=%s clone_flags=%lx oom_score_adj=%hd", + TP_printk("pid=%d comm=%s clone_flags=%llx oom_score_adj=%hd", __entry->pid, __entry->comm, __entry->clone_flags, __entry->oom_score_adj) ); diff --git a/ipc/namespace.c b/ipc/namespace.c index 4df91ceeeafe..a712ec27209c 100644 --- a/ipc/namespace.c +++ b/ipc/namespace.c @@ -106,7 +106,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns, return ERR_PTR(err); } -struct ipc_namespace *copy_ipcs(unsigned long flags, +struct ipc_namespace *copy_ipcs(u64 flags, struct user_namespace *user_ns, struct ipc_namespace *ns) { if (!(flags & CLONE_NEWIPC)) diff --git a/ipc/sem.c b/ipc/sem.c index a39cdc7bf88f..0f06e4bd4673 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -2303,7 +2303,7 @@ SYSCALL_DEFINE3(semop, int, semid, struct sembuf __user *, tsops, * parent and child tasks. */ -int copy_semundo(unsigned long clone_flags, struct task_struct *tsk) +int copy_semundo(u64 clone_flags, struct task_struct *tsk) { struct sem_undo_list *undo_list; int error; diff --git a/kernel/cgroup/namespace.c b/kernel/cgroup/namespace.c index 144a464e45c6..dedadb525880 100644 --- a/kernel/cgroup/namespace.c +++ b/kernel/cgroup/namespace.c @@ -47,7 +47,7 @@ void free_cgroup_ns(struct cgroup_namespace *ns) } EXPORT_SYMBOL(free_cgroup_ns); -struct cgroup_namespace *copy_cgroup_ns(unsigned long flags, +struct cgroup_namespace *copy_cgroup_ns(u64 flags, struct user_namespace *user_ns, struct cgroup_namespace *old_ns) { diff --git a/kernel/cred.c b/kernel/cred.c index 9676965c0981..dbf6b687dc5c 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -287,7 +287,7 @@ struct cred *prepare_exec_creds(void) * The new process gets the current process's subjective credentials as its * objective and subjective credentials */ -int copy_creds(struct task_struct *p, unsigned long clone_flags) +int 
copy_creds(struct task_struct *p, u64 clone_flags) { struct cred *new; int ret; diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 31a12b60055f..aa479d24ccaf 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -2160,7 +2160,7 @@ static void dup_xol_work(struct callback_head *work) /* * Called in context of a new clone/fork from copy_process. */ -void uprobe_copy_process(struct task_struct *t, unsigned long flags) +void uprobe_copy_process(struct task_struct *t, u64 flags) { struct uprobe_task *utask = current->utask; struct mm_struct *mm = current->mm; diff --git a/kernel/fork.c b/kernel/fork.c index 82f5d52fecf1..0e9b2dd6c365 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1510,7 +1510,7 @@ static struct mm_struct *dup_mm(struct task_struct *tsk, return NULL; } -static int copy_mm(unsigned long clone_flags, struct task_struct *tsk) +static int copy_mm(u64 clone_flags, struct task_struct *tsk) { struct mm_struct *mm, *oldmm; @@ -1548,7 +1548,7 @@ static int copy_mm(unsigned long clone_flags, struct task_struct *tsk) return 0; } -static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) +static int copy_fs(u64 clone_flags, struct task_struct *tsk) { struct fs_struct *fs = current->fs; if (clone_flags & CLONE_FS) { @@ -1569,7 +1569,7 @@ static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) return 0; } -static int copy_files(unsigned long clone_flags, struct task_struct *tsk, +static int copy_files(u64 clone_flags, struct task_struct *tsk, int no_files) { struct files_struct *oldf, *newf; @@ -1648,7 +1648,7 @@ static void posix_cpu_timers_init_group(struct signal_struct *sig) posix_cputimers_group_init(pct, cpu_limit); } -static int copy_signal(unsigned long clone_flags, struct task_struct *tsk) +static int copy_signal(u64 clone_flags, struct task_struct *tsk) { struct signal_struct *sig; diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index 5f31fdff8a38..8af3b9ec3aa8 100644 --- a/kernel/nsproxy.c 
+++ b/kernel/nsproxy.c @@ -64,7 +64,7 @@ static inline struct nsproxy *create_nsproxy(void) * Return the newly created nsproxy. Do not attach this to the task, * leave it to the caller to do proper locking and attach it to task. */ -static struct nsproxy *create_new_namespaces(unsigned long flags, +static struct nsproxy *create_new_namespaces(u64 flags, struct task_struct *tsk, struct user_namespace *user_ns, struct fs_struct *new_fs) { @@ -144,7 +144,7 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, * called from clone. This now handles copy for nsproxy and all * namespaces therein. */ -int copy_namespaces(unsigned long flags, struct task_struct *tsk) +int copy_namespaces(u64 flags, struct task_struct *tsk) { struct nsproxy *old_ns = tsk->nsproxy; struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns); diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index 7098ed44e717..06bc7c7f78e0 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -171,7 +171,7 @@ static void destroy_pid_namespace_work(struct work_struct *work) } while (ns != &init_pid_ns && refcount_dec_and_test(&ns->ns.count)); } -struct pid_namespace *copy_pid_ns(unsigned long flags, +struct pid_namespace *copy_pid_ns(u64 flags, struct user_namespace *user_ns, struct pid_namespace *old_ns) { if (!(flags & CLONE_NEWPID)) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index be00629f0ba4..6fa85d30d965 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4472,7 +4472,7 @@ int wake_up_state(struct task_struct *p, unsigned int state) * __sched_fork() is basic setup which is also used by sched_init() to * initialize the boot CPU's idle task. 
*/ -static void __sched_fork(unsigned long clone_flags, struct task_struct *p) +static void __sched_fork(u64 clone_flags, struct task_struct *p) { p->on_rq = 0; @@ -4707,7 +4707,7 @@ late_initcall(sched_core_sysctl_init); /* * fork()/clone()-time setup: */ -int sched_fork(unsigned long clone_flags, struct task_struct *p) +int sched_fork(u64 clone_flags, struct task_struct *p) { __sched_fork(clone_flags, p); /* diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e256793b9a08..06bcba61ca75 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3542,7 +3542,7 @@ static void task_numa_work(struct callback_head *work) } } -void init_numa_balancing(unsigned long clone_flags, struct task_struct *p) +void init_numa_balancing(u64 clone_flags, struct task_struct *p) { int mm_users = 0; struct mm_struct *mm = p->mm; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index be9745d104f7..f9adfc912ddc 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1935,12 +1935,12 @@ extern void sched_setnuma(struct task_struct *p, int node); extern int migrate_task_to(struct task_struct *p, int cpu); extern int migrate_swap(struct task_struct *p, struct task_struct *t, int cpu, int scpu); -extern void init_numa_balancing(unsigned long clone_flags, struct task_struct *p); +extern void init_numa_balancing(u64 clone_flags, struct task_struct *p); #else /* !CONFIG_NUMA_BALANCING: */ static inline void -init_numa_balancing(unsigned long clone_flags, struct task_struct *p) +init_numa_balancing(u64 clone_flags, struct task_struct *p) { } diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c index 667452768ed3..888872bcc5bb 100644 --- a/kernel/time/namespace.c +++ b/kernel/time/namespace.c @@ -130,7 +130,7 @@ static struct time_namespace *clone_time_ns(struct user_namespace *user_ns, * * Return: timens_for_children namespace or ERR_PTR. 
*/ -struct time_namespace *copy_time_ns(unsigned long flags, +struct time_namespace *copy_time_ns(u64 flags, struct user_namespace *user_ns, struct time_namespace *old_ns) { if (!(flags & CLONE_NEWTIME)) diff --git a/kernel/utsname.c b/kernel/utsname.c index b1ac3ca870f2..00d8d7922f86 100644 --- a/kernel/utsname.c +++ b/kernel/utsname.c @@ -86,7 +86,7 @@ static struct uts_namespace *clone_uts_ns(struct user_namespace *user_ns, * utsname of this process won't be seen by parent, and vice * versa. */ -struct uts_namespace *copy_utsname(unsigned long flags, +struct uts_namespace *copy_utsname(u64 flags, struct user_namespace *user_ns, struct uts_namespace *old_ns) { struct uts_namespace *new_ns; diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index 1b6f3826dd0e..8ec9d83475bf 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -539,7 +539,7 @@ void net_drop_ns(void *p) net_passive_dec(net); } -struct net *copy_net_ns(unsigned long flags, +struct net *copy_net_ns(u64 flags, struct user_namespace *user_ns, struct net *old_net) { struct ucounts *ucounts; diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c index 8e1cc229b41b..ba39cfe0cd08 100644 --- a/security/apparmor/lsm.c +++ b/security/apparmor/lsm.c @@ -112,7 +112,7 @@ static void apparmor_task_free(struct task_struct *task) } static int apparmor_task_alloc(struct task_struct *task, - unsigned long clone_flags) + u64 clone_flags) { struct aa_task_ctx *new = task_ctx(task); diff --git a/security/security.c b/security/security.c index ca126b02d2fe..d5fea03a741a 100644 --- a/security/security.c +++ b/security/security.c @@ -3224,7 +3224,7 @@ int security_file_truncate(struct file *file) * * Return: Returns a zero on success, negative values on failure. 
*/ -int security_task_alloc(struct task_struct *task, unsigned long clone_flags) +int security_task_alloc(struct task_struct *task, u64 clone_flags) { int rc = lsm_task_alloc(task); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index f94642ca34f2..9d3b5ebd7657 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -4144,7 +4144,7 @@ static int selinux_file_open(struct file *file) /* task security operations */ static int selinux_task_alloc(struct task_struct *task, - unsigned long clone_flags) + u64 clone_flags) { u32 sid = current_sid(); diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c index d6ebcd9db80a..48fc59d38ab2 100644 --- a/security/tomoyo/tomoyo.c +++ b/security/tomoyo/tomoyo.c @@ -514,7 +514,7 @@ struct lsm_blob_sizes tomoyo_blob_sizes __ro_after_init = { * Returns 0. */ static int tomoyo_task_alloc(struct task_struct *task, - unsigned long clone_flags) + u64 clone_flags) { struct tomoyo_task *old = tomoyo_task(current); struct tomoyo_task *new = tomoyo_task(task); -- 2.39.5 From devnull+schuster.simon.siemens-energy.com at kernel.org Mon Sep 1 06:09:50 2025 From: devnull+schuster.simon.siemens-energy.com at kernel.org (Simon Schuster via B4 Relay) Date: Mon, 01 Sep 2025 15:09:50 +0200 Subject: [PATCH v2 1/4] copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> Message-ID: <20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com> From: Simon Schuster With the introduction of clone3 in commit 7f192e3cd316 ("fork: add clone3") the effective bit width of clone_flags on all architectures was increased from 32-bit to 64-bit. However, the signature of the copy_* helper functions (e.g., copy_sighand) used by copy_process was not adapted. 
As such, they truncate the flags on any 32-bit architecture that supports clone3 (arc, arm, csky, m68k, microblaze, mips32, openrisc, parisc32, powerpc32, riscv32, x86-32 and xtensa). For copy_sighand with CLONE_CLEAR_SIGHAND being an actual u64 constant, this triggers an observable bug in kernel selftest clone3_clear_sighand: if (clone_flags & CLONE_CLEAR_SIGHAND) in function copy_sighand within fork.c will always evaluate to false given: unsigned long /* == uint32_t */ clone_flags #define CLONE_CLEAR_SIGHAND 0x100000000ULL This commit fixes the bug by always passing clone_flags to copy_sighand via its declared u64 type, independent of architecture-dependent integer sizes. Fixes: b612e5df4587 ("clone3: add CLONE_CLEAR_SIGHAND") Cc: stable at vger.kernel.org # linux-5.5+ Signed-off-by: Simon Schuster Reviewed-by: Lorenzo Stoakes --- kernel/fork.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/fork.c b/kernel/fork.c index 5115be549234..82f5d52fecf1 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1599,7 +1599,7 @@ static int copy_files(unsigned long clone_flags, struct task_struct *tsk, return 0; } -static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk) +static int copy_sighand(u64 clone_flags, struct task_struct *tsk) { struct sighand_struct *sig; -- 2.39.5 From devnull+schuster.simon.siemens-energy.com at kernel.org Mon Sep 1 06:09:52 2025 From: devnull+schuster.simon.siemens-energy.com at kernel.org (Simon Schuster via B4 Relay) Date: Mon, 01 Sep 2025 15:09:52 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> Message-ID: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> From: Simon Schuster With the introduction of clone3 in commit 7f192e3cd316 ("fork: add clone3") the effective bit width of clone_flags on all
architectures was increased from 32-bit to 64-bit, with a new type of u64 for the flags. However, for most consumers of clone_flags the interface was not changed from the previous type of unsigned long. While this works fine as long as none of the new 64-bit flag bits (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still undesirable in terms of the principle of least surprise. Thus, this commit fixes all relevant interfaces of the copy_thread function that is called from copy_process to consistently pass clone_flags as u64, so that no truncation to 32-bit integers occurs on 32-bit architectures. Signed-off-by: Simon Schuster --- arch/alpha/kernel/process.c | 2 +- arch/arc/kernel/process.c | 2 +- arch/arm/kernel/process.c | 2 +- arch/arm64/kernel/process.c | 2 +- arch/csky/kernel/process.c | 2 +- arch/hexagon/kernel/process.c | 2 +- arch/loongarch/kernel/process.c | 2 +- arch/m68k/kernel/process.c | 2 +- arch/microblaze/kernel/process.c | 2 +- arch/mips/kernel/process.c | 2 +- arch/nios2/kernel/process.c | 2 +- arch/openrisc/kernel/process.c | 2 +- arch/parisc/kernel/process.c | 2 +- arch/powerpc/kernel/process.c | 2 +- arch/riscv/kernel/process.c | 2 +- arch/s390/kernel/process.c | 2 +- arch/sh/kernel/process_32.c | 2 +- arch/sparc/kernel/process_32.c | 2 +- arch/sparc/kernel/process_64.c | 2 +- arch/um/kernel/process.c | 2 +- arch/x86/include/asm/fpu/sched.h | 2 +- arch/x86/include/asm/shstk.h | 4 ++-- arch/x86/kernel/fpu/core.c | 2 +- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 2 +- arch/xtensa/kernel/process.c | 2 +- 26 files changed, 27 insertions(+), 27 deletions(-) diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c index 582d96548385..06522451f018 100644 --- a/arch/alpha/kernel/process.c +++ b/arch/alpha/kernel/process.c @@ -231,7 +231,7 @@ flush_thread(void) */ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = 
args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; extern void ret_from_fork(void); diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c index 186ceab661eb..8166d0908713 100644 --- a/arch/arc/kernel/process.c +++ b/arch/arc/kernel/process.c @@ -166,7 +166,7 @@ asmlinkage void ret_from_fork(void); */ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *c_regs; /* child's pt_regs */ diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index e16ed102960c..d7aa95225c70 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -234,7 +234,7 @@ asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long stack_start = args->stack; unsigned long tls = args->tls; struct thread_info *thread = task_thread_info(p); diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 96482a1412c6..fba7ca102a8c 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -409,7 +409,7 @@ asmlinkage void ret_from_fork(void) asm("ret_from_fork"); int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long stack_start = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c index 0c6e4b17fe00..a7a90340042a 100644 --- a/arch/csky/kernel/process.c +++ b/arch/csky/kernel/process.c @@ -32,7 +32,7 @@ void flush_thread(void){} int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = 
args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct switch_stack *childstack; diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c index 2a77bfd75694..15b4992bfa29 100644 --- a/arch/hexagon/kernel/process.c +++ b/arch/hexagon/kernel/process.c @@ -52,7 +52,7 @@ void arch_cpu_idle(void) */ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct thread_info *ti = task_thread_info(p); diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c index 3582f591bab2..efd9edf65603 100644 --- a/arch/loongarch/kernel/process.c +++ b/arch/loongarch/kernel/process.c @@ -167,7 +167,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) unsigned long childksp; unsigned long tls = args->tls; unsigned long usp = args->stack; - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; struct pt_regs *childregs, *regs = current_pt_regs(); childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE; diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c index fda7eac23f87..f5a07a70e938 100644 --- a/arch/m68k/kernel/process.c +++ b/arch/m68k/kernel/process.c @@ -141,7 +141,7 @@ asmlinkage int m68k_clone3(struct pt_regs *regs) int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct fork_frame { diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c index 56342e11442d..6cbf642d7b80 100644 --- a/arch/microblaze/kernel/process.c +++ b/arch/microblaze/kernel/process.c @@ -54,7 +54,7 @@ void flush_thread(void) int copy_thread(struct task_struct *p, const struct 
kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c index 02aa6a04a21d..29191fa1801e 100644 --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -107,7 +107,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) */ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct thread_info *ti = task_thread_info(p); diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c index f84021303f6a..151404139085 100644 --- a/arch/nios2/kernel/process.c +++ b/arch/nios2/kernel/process.c @@ -101,7 +101,7 @@ void flush_thread(void) int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c index eef99fee2110..73ffb9fa3118 100644 --- a/arch/openrisc/kernel/process.c +++ b/arch/openrisc/kernel/process.c @@ -165,7 +165,7 @@ extern asmlinkage void ret_from_fork(void); int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *userregs; diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c index ed93bd8c1545..e64ab5d2a40d 100644 --- a/arch/parisc/kernel/process.c +++ b/arch/parisc/kernel/process.c @@ -201,7 +201,7 @@ arch_initcall(parisc_idle_init); int copy_thread(struct 
task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *cregs = &(p->thread.regs); diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 855e09886503..eb23966ac0a9 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1805,7 +1805,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) f = ret_from_kernel_user_thread; } else { struct pt_regs *regs = current_pt_regs(); - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; /* Copy registers */ diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index a0a40889d79a..31a392993cb4 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -223,7 +223,7 @@ asmlinkage void ret_from_fork_user(struct pt_regs *regs) int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c index f55f09cda6f8..b107dbca4ed7 100644 --- a/arch/s390/kernel/process.c +++ b/arch/s390/kernel/process.c @@ -106,7 +106,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long new_stackp = args->stack; unsigned long tls = args->tls; struct fake_frame diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c index 92b6649d4929..62f753a85b89 100644 --- a/arch/sh/kernel/process_32.c +++ b/arch/sh/kernel/process_32.c @@ -89,7 +89,7 @@ asmlinkage void 
ret_from_kernel_thread(void); int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; struct thread_info *ti = task_thread_info(p); diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c index 9c7c662cb565..5a28c0e91bf1 100644 --- a/arch/sparc/kernel/process_32.c +++ b/arch/sparc/kernel/process_32.c @@ -260,7 +260,7 @@ extern void ret_from_kernel_thread(void); int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long sp = args->stack; unsigned long tls = args->tls; struct thread_info *ti = task_thread_info(p); diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c index 529adfecd58c..25781923788a 100644 --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -567,7 +567,7 @@ void fault_in_user_windows(struct pt_regs *regs) */ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long sp = args->stack; unsigned long tls = args->tls; struct thread_info *t = task_thread_info(p); diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c index 1be644de9e41..9c9c66dc45f0 100644 --- a/arch/um/kernel/process.c +++ b/arch/um/kernel/process.c @@ -143,7 +143,7 @@ static void fork_handler(void) int copy_thread(struct task_struct * p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long sp = args->stack; unsigned long tls = args->tls; void (*handler)(void); diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h index c060549c6c94..89004f4ca208 100644 --- a/arch/x86/include/asm/fpu/sched.h +++ b/arch/x86/include/asm/fpu/sched.h @@ 
-11,7 +11,7 @@ extern void save_fpregs_to_fpstate(struct fpu *fpu); extern void fpu__drop(struct task_struct *tsk); -extern int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal, +extern int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal, unsigned long shstk_addr); extern void fpu_flush_thread(void); diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h index ba6f2fe43848..0f50e0125943 100644 --- a/arch/x86/include/asm/shstk.h +++ b/arch/x86/include/asm/shstk.h @@ -16,7 +16,7 @@ struct thread_shstk { long shstk_prctl(struct task_struct *task, int option, unsigned long arg2); void reset_thread_features(void); -unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags, +unsigned long shstk_alloc_thread_stack(struct task_struct *p, u64 clone_flags, unsigned long stack_size); void shstk_free(struct task_struct *p); int setup_signal_shadow_stack(struct ksignal *ksig); @@ -28,7 +28,7 @@ static inline long shstk_prctl(struct task_struct *task, int option, unsigned long arg2) { return -EINVAL; } static inline void reset_thread_features(void) {} static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p, - unsigned long clone_flags, + u64 clone_flags, unsigned long stack_size) { return 0; } static inline void shstk_free(struct task_struct *p) {} static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; } diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index aefd412a23dc..1f71cc135e9a 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -631,7 +631,7 @@ static int update_fpu_shstk(struct task_struct *dst, unsigned long ssp) } /* Clone current's FPU state on fork */ -int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal, +int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal, unsigned long ssp) { /* diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c 
index 1b7960cf6eb0..e3a3987b0c4f 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -159,7 +159,7 @@ __visible void ret_from_fork(struct task_struct *prev, struct pt_regs *regs, int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long sp = args->stack; unsigned long tls = args->tls; struct inactive_task_frame *frame; diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index 2ddf23387c7e..5eba6c5a6775 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -191,7 +191,7 @@ void reset_thread_features(void) current->thread.features_locked = 0; } -unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags, +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, u64 clone_flags, unsigned long stack_size) { struct thread_shstk *shstk = &tsk->thread.shstk; diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c index 7bd66677f7b6..94d43f44be13 100644 --- a/arch/xtensa/kernel/process.c +++ b/arch/xtensa/kernel/process.c @@ -267,7 +267,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { - unsigned long clone_flags = args->flags; + u64 clone_flags = args->flags; unsigned long usp_thread_fn = args->stack; unsigned long tls = args->tls; struct pt_regs *childregs = task_pt_regs(p); -- 2.39.5 From devnull+schuster.simon.siemens-energy.com at kernel.org Mon Sep 1 06:09:53 2025 From: devnull+schuster.simon.siemens-energy.com at kernel.org (Simon Schuster via B4 Relay) Date: Mon, 01 Sep 2025 15:09:53 +0200 Subject: [PATCH v2 4/4] nios2: implement architecture-specific portion of sys_clone3 In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> 
Message-ID: <20250901-nios2-implement-clone3-v2-4-53fcf5577d57@siemens-energy.com> From: Simon Schuster This commit adds the sys_clone3 entry point for nios2. An architecture-specific wrapper (__sys_clone3) is required to save and restore additional registers to the kernel stack via SAVE_SWITCH_STACK and RESTORE_SWITCH_STACK. Signed-off-by: Simon Schuster --- arch/nios2/include/asm/syscalls.h | 1 + arch/nios2/include/asm/unistd.h | 2 -- arch/nios2/kernel/entry.S | 6 ++++++ arch/nios2/kernel/syscall_table.c | 1 + 4 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/nios2/include/asm/syscalls.h b/arch/nios2/include/asm/syscalls.h index b4d4ed3bf9c8..0e214b0a0ac8 100644 --- a/arch/nios2/include/asm/syscalls.h +++ b/arch/nios2/include/asm/syscalls.h @@ -7,6 +7,7 @@ int sys_cacheflush(unsigned long addr, unsigned long len, unsigned int op); +asmlinkage long __sys_clone3(struct clone_args __user *uargs, size_t size); #include diff --git a/arch/nios2/include/asm/unistd.h b/arch/nios2/include/asm/unistd.h index 1146e56473c5..213f6de3cf7b 100644 --- a/arch/nios2/include/asm/unistd.h +++ b/arch/nios2/include/asm/unistd.h @@ -7,6 +7,4 @@ #define __ARCH_WANT_STAT64 #define __ARCH_WANT_SET_GET_RLIMIT -#define __ARCH_BROKEN_SYS_CLONE3 - #endif diff --git a/arch/nios2/kernel/entry.S b/arch/nios2/kernel/entry.S index 99f0a65e6234..dd40dfd908e5 100644 --- a/arch/nios2/kernel/entry.S +++ b/arch/nios2/kernel/entry.S @@ -403,6 +403,12 @@ ENTRY(sys_clone) addi sp, sp, 4 RESTORE_SWITCH_STACK ret +/* long syscall(SYS_clone3, struct clone_args *cl_args, size_t size); */ +ENTRY(__sys_clone3) + SAVE_SWITCH_STACK + call sys_clone3 + RESTORE_SWITCH_STACK + ret ENTRY(sys_rt_sigreturn) SAVE_SWITCH_STACK diff --git a/arch/nios2/kernel/syscall_table.c b/arch/nios2/kernel/syscall_table.c index 434694067d8f..c99818aac9e1 100644 --- a/arch/nios2/kernel/syscall_table.c +++ b/arch/nios2/kernel/syscall_table.c @@ -13,6 +13,7 @@ #define __SYSCALL_WITH_COMPAT(nr, native, compat) 
__SYSCALL(nr, native) #define sys_mmap2 sys_mmap_pgoff +#define sys_clone3 __sys_clone3 void *sys_call_table[__NR_syscalls] = { [0 ... __NR_syscalls-1] = sys_ni_syscall, -- 2.39.5 From arnd at arndb.de Mon Sep 1 06:19:31 2025 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 01 Sep 2025 15:19:31 +0200 Subject: [PATCH v2 1/4] copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) In-Reply-To: <20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com> Message-ID: <13f8ca46-92f0-47bb-a046-165402122a44@app.fastmail.com> On Mon, Sep 1, 2025, at 15:09, Simon Schuster via B4 Relay wrote: > This commit fixes the bug by always passing clone_flags to copy_sighand > via their declared u64 type, invariant of architecture-dependent integer > sizes. > > Fixes: b612e5df4587 ("clone3: add CLONE_CLEAR_SIGHAND") > Cc: stable at vger.kernel.org # linux-5.5+ > Signed-off-by: Simon Schuster > Reviewed-by: Lorenzo Stoakes Reviewed-by: Arnd Bergmann From arnd at arndb.de Mon Sep 1 06:30:20 2025 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 01 Sep 2025 15:30:20 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: <78ca3a1f-2ed9-4450-95ef-d690cf4aace1@app.fastmail.com> On Mon, Sep 1, 2025, at 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. 
> However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of the copy_thread > function that is called from copy_process to consistently pass > clone_flags as u64, so that no truncation to 32-bit integers occurs on > 32-bit architectures. > > Signed-off-by: Simon Schuster Reviewed-by: Arnd Bergmann From arnd at arndb.de Mon Sep 1 06:35:06 2025 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 01 Sep 2025 15:35:06 +0200 Subject: [PATCH v2 2/4] copy_process: pass clone_flags as u64 across calltree In-Reply-To: <20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com> Message-ID: On Mon, Sep 1, 2025, at 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of callees to > sys_clone3/copy_process (excluding the architecture-specific > copy_thread) to consistently pass clone_flags as u64, so that > no truncation to 32-bit integers occurs on 32-bit architectures. 
> > Signed-off-by: Simon Schuster > Reviewed-by: Lorenzo Stoakes Reviewed-by: Arnd Bergmann From arnd at arndb.de Mon Sep 1 06:35:42 2025 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 01 Sep 2025 15:35:42 +0200 Subject: [PATCH v2 4/4] nios2: implement architecture-specific portion of sys_clone3 In-Reply-To: <20250901-nios2-implement-clone3-v2-4-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-4-53fcf5577d57@siemens-energy.com> Message-ID: <35893d46-6caf-49ea-bbae-6e1cab6b2914@app.fastmail.com> On Mon, Sep 1, 2025, at 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > This commit adds the sys_clone3 entry point for nios2. An > architecture-specific wrapper (__sys_clone3) is required to save and > restore additional registers to the kernel stack via SAVE_SWITCH_STACK > and RESTORE_SWITCH_STACK. > > Signed-off-by: Simon Schuster Reviewed-by: Arnd Bergmann From brauner at kernel.org Mon Sep 1 06:40:26 2025 From: brauner at kernel.org (Christian Brauner) Date: Mon, 1 Sep 2025 15:40:26 +0200 Subject: [PATCH v2 0/4] nios2: Add architecture support for clone3 In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> Message-ID: <20250901-lammfell-kaninchen-c160a69e6b36@brauner> On Mon, Sep 01, 2025 at 03:09:49PM +0200, Simon Schuster via B4 Relay wrote: > This series adds support for the clone3 system call to the nios2 > architecture. This addresses the build-time warning "warning: clone3() I did not expect that to happen or matter but fine. > entry point is missing, please fix" introduced in 505d66d1abfb9 > ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro"). 
The implementation passes > the relevant clone3 tests of kselftest when applied on top of > next-20250815: > > ./run_kselftest.sh > TAP version 13 > 1..4 > # selftests: clone3: clone3 > ok 1 selftests: clone3: clone3 > # selftests: clone3: clone3_clear_sighand > ok 2 selftests: clone3: clone3_clear_sighand > # selftests: clone3: clone3_set_tid > ok 3 selftests: clone3: clone3_set_tid > # selftests: clone3: clone3_cap_checkpoint_restore > ok 4 selftests: clone3: clone3_cap_checkpoint_restore > > The series also includes a small patch to kernel/fork.c that ensures > that clone_flags are passed correctly on architectures where unsigned > long is insufficient to store the u64 clone_flags. It is marked as a fix > for stable backporting. > > As requested, in v2, this series now further tries to correct this type > error throughout the whole code base. Thus, it now touches a larger > number of subsystems and all architectures. I've reworked copy_thread()/copy_thread_tls() a few years ago but I don't remember why I didn't switch to a u64 for them. Probably because only CLONE_VM and CLONE_SETTLS mattered. Thanks for doing that. > Therefore, another test was performed for ARCH=x86_64 (as a > representative for 64-bit architectures). Here, the series builds cleanly > without warnings on defconfig with CONFIG_SECURITY_APPARMOR=y and > CONFIG_SECURITY_TOMOYO=y (to compile-check the LSM-related changes). > The build further successfully passes testing/selftests/clone3 (with the > patch from 20241105062948.1037011-1-zhouyuhang1010 at 163.com to prepare > clone3_cap_checkpoint_restore for compatibility with the newer libcap > version on my system). > > Is there any option to further preflight check this patch series via > lkp/KernelCI/etc. for a broader test across architectures, or is this > degree of testing sufficient to eventually get the series merged? 
> > N.B.: The series is not checkpatch clean right now: > - include/linux/cred.h, include/linux/mnt_namespace.h: > function definition arguments without identifier name > - include/trace/events/task.h: > space prohibited after that open parenthesis > > I did not fix these warnings to keep my changes minimal and reviewable, > as the issues persist throughout the files and they were not introduced > by me; I only followed the existing code style and just replaced the > types. If desired, I'd be happy to make the changes in a potential v3, > though. > > Signed-off-by: Simon Schuster > --- > Changes in v2: > - Introduce "Fixes:" and "Cc: stable at vger.kernel.org" where necessary > - Factor out "Fixes:" when adapting the datatype of clone_flags for > easier backports > - Fix additional instances where `unsigned long` clone_flags is used > - Reword commit message to make it clearer that any 32-bit arch is > affected by this bug > - Link to v1: https://lore.kernel.org/r/20250821-nios2-implement-clone3-v1-0-1bb24017376a at siemens-energy.com > > --- > Simon Schuster (4): > copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) > copy_process: pass clone_flags as u64 across calltree > arch: copy_thread: pass clone_flags as u64 > nios2: implement architecture-specific portion of sys_clone3 > > arch/alpha/kernel/process.c | 2 +- > arch/arc/kernel/process.c | 2 +- > arch/arm/kernel/process.c | 2 +- > arch/arm64/kernel/process.c | 2 +- > arch/csky/kernel/process.c | 2 +- > arch/hexagon/kernel/process.c | 2 +- > arch/loongarch/kernel/process.c | 2 +- > arch/m68k/kernel/process.c | 2 +- > arch/microblaze/kernel/process.c | 2 +- > arch/mips/kernel/process.c | 2 +- > arch/nios2/include/asm/syscalls.h | 1 + > arch/nios2/include/asm/unistd.h | 2 -- > arch/nios2/kernel/entry.S | 6 ++++++ > arch/nios2/kernel/process.c | 2 +- > arch/nios2/kernel/syscall_table.c | 1 + > arch/openrisc/kernel/process.c | 2 +- > arch/parisc/kernel/process.c | 2 +- > 
arch/powerpc/kernel/process.c | 2 +- > arch/riscv/kernel/process.c | 2 +- > arch/s390/kernel/process.c | 2 +- > arch/sh/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_64.c | 2 +- > arch/um/kernel/process.c | 2 +- > arch/x86/include/asm/fpu/sched.h | 2 +- > arch/x86/include/asm/shstk.h | 4 ++-- > arch/x86/kernel/fpu/core.c | 2 +- > arch/x86/kernel/process.c | 2 +- > arch/x86/kernel/shstk.c | 2 +- > arch/xtensa/kernel/process.c | 2 +- > block/blk-ioc.c | 2 +- > fs/namespace.c | 2 +- > include/linux/cgroup.h | 4 ++-- > include/linux/cred.h | 2 +- > include/linux/iocontext.h | 6 +++--- > include/linux/ipc_namespace.h | 4 ++-- > include/linux/lsm_hook_defs.h | 2 +- > include/linux/mnt_namespace.h | 2 +- > include/linux/nsproxy.h | 2 +- > include/linux/pid_namespace.h | 4 ++-- > include/linux/rseq.h | 4 ++-- > include/linux/sched/task.h | 2 +- > include/linux/security.h | 4 ++-- > include/linux/sem.h | 4 ++-- > include/linux/time_namespace.h | 4 ++-- > include/linux/uprobes.h | 4 ++-- > include/linux/user_events.h | 4 ++-- > include/linux/utsname.h | 4 ++-- > include/net/net_namespace.h | 4 ++-- > include/trace/events/task.h | 6 +++--- > ipc/namespace.c | 2 +- > ipc/sem.c | 2 +- > kernel/cgroup/namespace.c | 2 +- > kernel/cred.c | 2 +- > kernel/events/uprobes.c | 2 +- > kernel/fork.c | 10 +++++----- > kernel/nsproxy.c | 4 ++-- > kernel/pid_namespace.c | 2 +- > kernel/sched/core.c | 4 ++-- > kernel/sched/fair.c | 2 +- > kernel/sched/sched.h | 4 ++-- > kernel/time/namespace.c | 2 +- > kernel/utsname.c | 2 +- > net/core/net_namespace.c | 2 +- > security/apparmor/lsm.c | 2 +- > security/security.c | 2 +- > security/selinux/hooks.c | 2 +- > security/tomoyo/tomoyo.c | 2 +- > 68 files changed, 95 insertions(+), 89 deletions(-) > --- > base-commit: 1357b2649c026b51353c84ddd32bc963e8999603 > change-id: 20250818-nios2-implement-clone3-7f252c20860b > > Best regards, > -- > Simon Schuster > > From linux at armlinux.org.uk Mon Sep 1 
06:39:07 2025 From: linux at armlinux.org.uk (Russell King (Oracle)) Date: Mon, 1 Sep 2025 14:39:07 +0100 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: On Mon, Sep 01, 2025 at 03:09:52PM +0200, Simon Schuster via B4 Relay wrote: > diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c > index e16ed102960c..d7aa95225c70 100644 > --- a/arch/arm/kernel/process.c > +++ b/arch/arm/kernel/process.c > @@ -234,7 +234,7 @@ asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long stack_start = args->stack; > unsigned long tls = args->tls; > struct thread_info *thread = task_thread_info(p); We only have one user of clone_flags in this function, which is: if (clone_flags & CLONE_SETTLS) I would much rather clone_flags was removed, and this changed to: if (args->flags & CLONE_SETTLS) Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! From brauner at kernel.org Mon Sep 1 06:41:21 2025 From: brauner at kernel.org (Christian Brauner) Date: Mon, 1 Sep 2025 15:41:21 +0200 Subject: [PATCH v2 0/4] nios2: Add architecture support for clone3 In-Reply-To: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> Message-ID: <20250901-sauer-stunk-49def0170f7d@brauner> On Mon, 01 Sep 2025 15:09:49 +0200, Simon Schuster wrote: > This series adds support for the clone3 system call to the nios2 > architecture. 
This addresses the build-time warning "warning: clone3() > entry point is missing, please fix" introduced in 505d66d1abfb9 > ("clone3: drop __ARCH_WANT_SYS_CLONE3 macro"). The implementation passes > the relevant clone3 tests of kselftest when applied on top of > next-20250815: > > [...] Seems fine to me. Thanks for fixing this. --- Applied to the kernel-6.18.clone3 branch of the vfs/vfs.git tree. Patches in the kernel-6.18.clone3 branch should appear in linux-next soon. Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it. It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated. Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch. tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git branch: kernel-6.18.clone3 [1/4] copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64) https://git.kernel.org/vfs/vfs/c/04ff48239f46 [2/4] copy_process: pass clone_flags as u64 across calltree https://git.kernel.org/vfs/vfs/c/5b38576cb8d3 [3/4] arch: copy_thread: pass clone_flags as u64 https://git.kernel.org/vfs/vfs/c/04e760acd97f [4/4] nios2: implement architecture-specific portion of sys_clone3 https://git.kernel.org/vfs/vfs/c/d7109d2a2358 From geert at linux-m68k.org Mon Sep 1 06:51:38 2025 From: geert at linux-m68k.org (Geert Uytterhoeven) Date: Mon, 1 Sep 2025 15:51:38 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: On Mon, 1 Sept 2025 at 15:10, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the 
introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of the copy_thread > function that is called from copy_process to consistently pass > clone_flags as u64, so that no truncation to 32-bit integers occurs on > 32-bit architectures. > > Signed-off-by: Simon Schuster Fixes: c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread") > arch/m68k/kernel/process.c | 2 +- Acked-by: Geert Uytterhoeven # m68k Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds From david at redhat.com Mon Sep 1 08:03:21 2025 From: david at redhat.com (David Hildenbrand) Date: Mon, 1 Sep 2025 17:03:21 +0200 Subject: [PATCH v2 00/37] mm: remove nth_page() Message-ID: <20250901150359.867252-1-david@redhat.com> This is based on mm-unstable. I will only CC non-MM folks on the cover letter and the respective patch to not flood too many inboxes (the lists receive all patches). -- As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. 
While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, and then from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it might be dropped.

We refer to such problematic PFN ranges as "non-contiguous pages". If we only deal with "contiguous pages", there is no need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13 : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22 : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity-check + remove nth_page() usage within SG entry
Patch #34 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock()
Patch #35 : remove nth_page() in kfence
Patch #36 : adjust stale comment regarding nth_page
Patch #37 : mm: remove nth_page()

A lot of this is inspired by the discussion at [1] between Linus, Jason and me, so kudos to them.

[1] https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A at mail.gmail.com/T/#u

v1 -> v2:
* "fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()"
  -> Add comment for loop and remove comment of function regarding copy_page_to_iter().
* Various smaller patch description tweaks I am not going to list for my sanity
* "mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages()"
  -> Fix flush_dcache_page()
  -> Drop "extern"
* "mm/gup: remove record_subpages()"
  -> Added
* "mm/hugetlb: check for unreasonable folio sizes when registering hstate"
  -> Refine comment
* "mm/cma: refuse handing out non-contiguous page ranges"
  -> Add comment above loop
* "mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof()"
  -> Added comment above check
* "mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()"
  -> Refined comment

RFC -> v1:
* "wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config"
  -> Mention that it was never really relevant for the test
* "mm/mm_init: make memmap_init_compound() look more like prep_compound_page()"
  -> Mention the setup of page links
* "mm: limit folio/compound page sizes in problematic kernel configs"
  -> Improve comment for PUD handling, mentioning hugetlb and dax
* "mm: simplify folio_page() and folio_page_idx()"
  -> Call variable "n"
* "mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()"
  -> Keep __init_single_page() and refer to the usage of memblock_reserved_mark_noinit()
* "fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()"
* "fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison()"
  -> Separate nth_page() removal from cleanups
  -> Further improve cleanups
* "io_uring/zcrx: remove nth_page() usage within folio"
  -> Keep the io_copy_cache for now and limit to nth_page() removal
* "mm/gup: drop nth_page() usage within folio when recording subpages"
  -> Cleanup record_subpages a bit
* "mm/cma: refuse handing out non-contiguous page ranges"
  -> Replace another instance of "pfn_to_page(pfn)" where we already have the page
* "scatterlist: disallow non-contigous page ranges in a single SG entry"
  -> We have to EXPORT the symbol.
I thought about moving it to mm_inline.h, but I really don't want to include that in include/linux/scatterlist.h * "ata: libata-eh: drop nth_page() usage within SG entry" * "mspro_block: drop nth_page() usage within SG entry" * "memstick: drop nth_page() usage within SG entry" * "mmc: drop nth_page() usage within SG entry" -> Keep PAGE_SHIFT * "scsi: scsi_lib: drop nth_page() usage within SG entry" * "scsi: sg: drop nth_page() usage within SG entry" -> Split patches, Keep PAGE_SHIFT * "crypto: remove nth_page() usage within SG entry" -> Keep PAGE_SHIFT * "kfence: drop nth_page() usage" -> Keep modifying i and use "start_pfn" only instead Cc: Andrew Morton Cc: Linus Torvalds Cc: Jason Gunthorpe Cc: Lorenzo Stoakes Cc: "Liam R. Howlett" Cc: Vlastimil Babka Cc: Mike Rapoport Cc: Suren Baghdasaryan Cc: Michal Hocko Cc: Jens Axboe Cc: Marek Szyprowski Cc: Robin Murphy Cc: John Hubbard Cc: Peter Xu Cc: Alexander Potapenko Cc: Marco Elver Cc: Dmitry Vyukov Cc: Brendan Jackman Cc: Johannes Weiner Cc: Zi Yan Cc: Dennis Zhou Cc: Tejun Heo Cc: Christoph Lameter Cc: Muchun Song Cc: Oscar Salvador Cc: x86 at kernel.org Cc: linux-arm-kernel at lists.infradead.org Cc: linux-mips at vger.kernel.org Cc: linux-s390 at vger.kernel.org Cc: linux-crypto at vger.kernel.org Cc: linux-ide at vger.kernel.org Cc: intel-gfx at lists.freedesktop.org Cc: dri-devel at lists.freedesktop.org Cc: linux-mmc at vger.kernel.org Cc: linux-arm-kernel at axis.com Cc: linux-scsi at vger.kernel.org Cc: kvm at vger.kernel.org Cc: virtualization at lists.linux.dev Cc: linux-mm at kvack.org Cc: io-uring at vger.kernel.org Cc: iommu at lists.linux.dev Cc: kasan-dev at googlegroups.com Cc: wireguard at lists.zx2c4.com Cc: netdev at vger.kernel.org Cc: linux-kselftest at vger.kernel.org Cc: linux-riscv at lists.infradead.org David Hildenbrand (37): mm: stop making SPARSEMEM_VMEMMAP user-selectable arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" s390/Kconfig: drop superfluous "select 
SPARSEMEM_VMEMMAP" x86/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() mm/hugetlb: check for unreasonable folio sizes when registering hstate mm/mm_init: make memmap_init_compound() look more like prep_compound_page() mm: sanity-check maximum folio size in folio_set_order() mm: limit folio/compound page sizes in problematic kernel configs mm: simplify folio_page() and folio_page_idx() mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() mm/mm/percpu-km: drop nth_page() usage within single allocation fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() fs: hugetlbfs: cleanup folio in adjust_range_hwpoison() mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() mm/gup: drop nth_page() usage within folio when recording subpages mm/gup: remove record_subpages() io_uring/zcrx: remove nth_page() usage within folio mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() mm/cma: refuse handing out non-contiguous page ranges dma-remap: drop nth_page() in dma_common_contiguous_remap() scatterlist: disallow non-contigous page ranges in a single SG entry ata: libata-sff: drop nth_page() usage within SG entry drm/i915/gem: drop nth_page() usage within SG entry mspro_block: drop nth_page() usage within SG entry memstick: drop nth_page() usage within SG entry mmc: drop nth_page() usage within SG entry scsi: scsi_lib: drop nth_page() usage within SG entry scsi: sg: drop nth_page() usage within SG entry vfio/pci: drop nth_page() usage within SG entry crypto: remove nth_page() usage within SG entry mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() kfence: drop nth_page() usage block: update comment of "struct bio_vec" regarding nth_page() mm: remove 
nth_page() arch/arm64/Kconfig | 1 - arch/mips/include/asm/cacheflush.h | 11 +++-- arch/mips/mm/cache.c | 8 ++-- arch/s390/Kconfig | 1 - arch/x86/Kconfig | 1 - crypto/ahash.c | 4 +- crypto/scompress.c | 8 ++-- drivers/ata/libata-sff.c | 6 +-- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- drivers/memstick/core/mspro_block.c | 3 +- drivers/memstick/host/jmb38x_ms.c | 3 +- drivers/memstick/host/tifm_ms.c | 3 +- drivers/mmc/host/tifm_sd.c | 4 +- drivers/mmc/host/usdhi6rol0.c | 4 +- drivers/scsi/scsi_lib.c | 3 +- drivers/scsi/sg.c | 3 +- drivers/vfio/pci/pds/lm.c | 3 +- drivers/vfio/pci/virtio/migrate.c | 3 +- fs/hugetlbfs/inode.c | 36 +++++--------- include/crypto/scatterwalk.h | 4 +- include/linux/bvec.h | 7 +-- include/linux/mm.h | 48 +++++++++++++++---- include/linux/page-flags.h | 5 +- include/linux/scatterlist.h | 3 +- io_uring/zcrx.c | 4 +- kernel/dma/remap.c | 2 +- mm/Kconfig | 3 +- mm/cma.c | 39 +++++++++------ mm/gup.c | 36 +++++++------- mm/hugetlb.c | 22 +++++---- mm/internal.h | 1 + mm/kfence/core.c | 12 +++-- mm/memremap.c | 3 ++ mm/mm_init.c | 15 +++--- mm/page_alloc.c | 10 +++- mm/pagewalk.c | 2 +- mm/percpu-km.c | 2 +- mm/util.c | 36 ++++++++++++++ tools/testing/scatterlist/linux/mm.h | 1 - .../selftests/wireguard/qemu/kernel.config | 1 - 40 files changed, 217 insertions(+), 146 deletions(-) base-commit: b73c6f2b5712809f5f386780ac46d1d78c31b2e6 -- 2.50.1 From andreas at gaisler.com Tue Sep 2 00:02:41 2025 From: andreas at gaisler.com (Andreas Larsson) Date: Tue, 2 Sep 2025 09:02:41 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: On 2025-09-01 15:09, Simon Schuster via B4 Relay wrote: > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > 
clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise. > > Thus, this commit fixes all relevant interfaces of the copy_thread > function that is called from copy_process to consistently pass > clone_flags as u64, so that no truncation to 32-bit integers occurs on > 32-bit architectures. > > Signed-off-by: Simon Schuster > --- Thanks for this and for the whole series! Needed foundation for a sparc32 clone3 implementation as well. > arch/sparc/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_64.c | 2 +- Acked-by: Andreas Larsson # sparc Cheers, Andreas From glaubitz at physik.fu-berlin.de Tue Sep 2 00:15:08 2025 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Tue, 02 Sep 2025 09:15:08 +0200 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: <11a4d0a953e3a9405177d67f287c69379a2b2f8f.camel@physik.fu-berlin.de> Hi Andreas, On Tue, 2025-09-02 at 09:02 +0200, Andreas Larsson wrote: > On 2025-09-01 15:09, Simon Schuster via B4 Relay wrote: > > From: Simon Schuster > > > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > > clone3") the effective bit width of clone_flags on all architectures was > > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > > However, for most consumers of clone_flags the interface was not > > changed from the previous type of unsigned long. 
> > > > While this works fine as long as none of the new 64-bit flag bits > > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > > undesirable in terms of the principle of least surprise. > > > > Thus, this commit fixes all relevant interfaces of the copy_thread > > function that is called from copy_process to consistently pass > > clone_flags as u64, so that no truncation to 32-bit integers occurs on > > 32-bit architectures. > > > > Signed-off-by: Simon Schuster > > --- > > Thanks for this and for the whole series! Needed foundation for a > sparc32 clone3 implementation as well. Can you implement clone3 for sparc64 as well? Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer `. `' Physicist `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From guoren at kernel.org Tue Sep 2 02:48:10 2025 From: guoren at kernel.org (Guo Ren) Date: Tue, 2 Sep 2025 17:48:10 +0800 Subject: [PATCH v2 3/4] arch: copy_thread: pass clone_flags as u64 In-Reply-To: <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> References: <20250901-nios2-implement-clone3-v2-0-53fcf5577d57@siemens-energy.com> <20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com> Message-ID: On Mon, Sep 1, 2025 at 9:10 PM Simon Schuster via B4 Relay wrote: > > From: Simon Schuster > > With the introduction of clone3 in commit 7f192e3cd316 ("fork: add > clone3") the effective bit width of clone_flags on all architectures was > increased from 32-bit to 64-bit, with a new type of u64 for the flags. > However, for most consumers of clone_flags the interface was not > changed from the previous type of unsigned long. > > While this works fine as long as none of the new 64-bit flag bits > (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still > undesirable in terms of the principle of least surprise.
> > Thus, this commit fixes all relevant interfaces of the copy_thread > function that is called from copy_process to consistently pass > clone_flags as u64, so that no truncation to 32-bit integers occurs on > 32-bit architectures. > > Signed-off-by: Simon Schuster > --- > arch/alpha/kernel/process.c | 2 +- > arch/arc/kernel/process.c | 2 +- > arch/arm/kernel/process.c | 2 +- > arch/arm64/kernel/process.c | 2 +- > arch/csky/kernel/process.c | 2 +- > arch/hexagon/kernel/process.c | 2 +- > arch/loongarch/kernel/process.c | 2 +- > arch/m68k/kernel/process.c | 2 +- > arch/microblaze/kernel/process.c | 2 +- > arch/mips/kernel/process.c | 2 +- > arch/nios2/kernel/process.c | 2 +- > arch/openrisc/kernel/process.c | 2 +- > arch/parisc/kernel/process.c | 2 +- > arch/powerpc/kernel/process.c | 2 +- > arch/riscv/kernel/process.c | 2 +- > arch/s390/kernel/process.c | 2 +- > arch/sh/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_32.c | 2 +- > arch/sparc/kernel/process_64.c | 2 +- > arch/um/kernel/process.c | 2 +- > arch/x86/include/asm/fpu/sched.h | 2 +- > arch/x86/include/asm/shstk.h | 4 ++-- > arch/x86/kernel/fpu/core.c | 2 +- > arch/x86/kernel/process.c | 2 +- > arch/x86/kernel/shstk.c | 2 +- > arch/xtensa/kernel/process.c | 2 +- > 26 files changed, 27 insertions(+), 27 deletions(-) > > diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c > index 582d96548385..06522451f018 100644 > --- a/arch/alpha/kernel/process.c > +++ b/arch/alpha/kernel/process.c > @@ -231,7 +231,7 @@ flush_thread(void) > */ > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > extern void ret_from_fork(void); > diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c > index 186ceab661eb..8166d0908713 100644 > --- a/arch/arc/kernel/process.c > +++ b/arch/arc/kernel/process.c > @@ -166,7 
+166,7 @@ asmlinkage void ret_from_fork(void); > */ > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *c_regs; /* child's pt_regs */ > diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c > index e16ed102960c..d7aa95225c70 100644 > --- a/arch/arm/kernel/process.c > +++ b/arch/arm/kernel/process.c > @@ -234,7 +234,7 @@ asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long stack_start = args->stack; > unsigned long tls = args->tls; > struct thread_info *thread = task_thread_info(p); > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > index 96482a1412c6..fba7ca102a8c 100644 > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -409,7 +409,7 @@ asmlinkage void ret_from_fork(void) asm("ret_from_fork"); > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long stack_start = args->stack; > unsigned long tls = args->tls; > struct pt_regs *childregs = task_pt_regs(p); > diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c > index 0c6e4b17fe00..a7a90340042a 100644 > --- a/arch/csky/kernel/process.c > +++ b/arch/csky/kernel/process.c > @@ -32,7 +32,7 @@ void flush_thread(void){} > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; Acked-by: Guo Ren (Alibaba Damo Academy) > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct switch_stack *childstack; > diff --git 
a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c > index 2a77bfd75694..15b4992bfa29 100644 > --- a/arch/hexagon/kernel/process.c > +++ b/arch/hexagon/kernel/process.c > @@ -52,7 +52,7 @@ void arch_cpu_idle(void) > */ > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct thread_info *ti = task_thread_info(p); > diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c > index 3582f591bab2..efd9edf65603 100644 > --- a/arch/loongarch/kernel/process.c > +++ b/arch/loongarch/kernel/process.c > @@ -167,7 +167,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > unsigned long childksp; > unsigned long tls = args->tls; > unsigned long usp = args->stack; > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > struct pt_regs *childregs, *regs = current_pt_regs(); > > childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE; > diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c > index fda7eac23f87..f5a07a70e938 100644 > --- a/arch/m68k/kernel/process.c > +++ b/arch/m68k/kernel/process.c > @@ -141,7 +141,7 @@ asmlinkage int m68k_clone3(struct pt_regs *regs) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct fork_frame { > diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c > index 56342e11442d..6cbf642d7b80 100644 > --- a/arch/microblaze/kernel/process.c > +++ b/arch/microblaze/kernel/process.c > @@ -54,7 +54,7 @@ void flush_thread(void) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > 
+ u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *childregs = task_pt_regs(p); > diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c > index 02aa6a04a21d..29191fa1801e 100644 > --- a/arch/mips/kernel/process.c > +++ b/arch/mips/kernel/process.c > @@ -107,7 +107,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > */ > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct thread_info *ti = task_thread_info(p); > diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c > index f84021303f6a..151404139085 100644 > --- a/arch/nios2/kernel/process.c > +++ b/arch/nios2/kernel/process.c > @@ -101,7 +101,7 @@ void flush_thread(void) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *childregs = task_pt_regs(p); > diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c > index eef99fee2110..73ffb9fa3118 100644 > --- a/arch/openrisc/kernel/process.c > +++ b/arch/openrisc/kernel/process.c > @@ -165,7 +165,7 @@ extern asmlinkage void ret_from_fork(void); > int > copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *userregs; > diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c > index ed93bd8c1545..e64ab5d2a40d 100644 > --- a/arch/parisc/kernel/process.c > +++ b/arch/parisc/kernel/process.c > @@ -201,7 +201,7 @@ arch_initcall(parisc_idle_init); > 
int > copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *cregs = &(p->thread.regs); > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c > index 855e09886503..eb23966ac0a9 100644 > --- a/arch/powerpc/kernel/process.c > +++ b/arch/powerpc/kernel/process.c > @@ -1805,7 +1805,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > f = ret_from_kernel_user_thread; > } else { > struct pt_regs *regs = current_pt_regs(); > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > > /* Copy registers */ > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c > index a0a40889d79a..31a392993cb4 100644 > --- a/arch/riscv/kernel/process.c > +++ b/arch/riscv/kernel/process.c > @@ -223,7 +223,7 @@ asmlinkage void ret_from_fork_user(struct pt_regs *regs) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct pt_regs *childregs = task_pt_regs(p); > diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c > index f55f09cda6f8..b107dbca4ed7 100644 > --- a/arch/s390/kernel/process.c > +++ b/arch/s390/kernel/process.c > @@ -106,7 +106,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long new_stackp = args->stack; > unsigned long tls = args->tls; > struct fake_frame > diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c > index 92b6649d4929..62f753a85b89 
100644 > --- a/arch/sh/kernel/process_32.c > +++ b/arch/sh/kernel/process_32.c > @@ -89,7 +89,7 @@ asmlinkage void ret_from_kernel_thread(void); > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp = args->stack; > unsigned long tls = args->tls; > struct thread_info *ti = task_thread_info(p); > diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c > index 9c7c662cb565..5a28c0e91bf1 100644 > --- a/arch/sparc/kernel/process_32.c > +++ b/arch/sparc/kernel/process_32.c > @@ -260,7 +260,7 @@ extern void ret_from_kernel_thread(void); > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long sp = args->stack; > unsigned long tls = args->tls; > struct thread_info *ti = task_thread_info(p); > diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c > index 529adfecd58c..25781923788a 100644 > --- a/arch/sparc/kernel/process_64.c > +++ b/arch/sparc/kernel/process_64.c > @@ -567,7 +567,7 @@ void fault_in_user_windows(struct pt_regs *regs) > */ > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long sp = args->stack; > unsigned long tls = args->tls; > struct thread_info *t = task_thread_info(p); > diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c > index 1be644de9e41..9c9c66dc45f0 100644 > --- a/arch/um/kernel/process.c > +++ b/arch/um/kernel/process.c > @@ -143,7 +143,7 @@ static void fork_handler(void) > > int copy_thread(struct task_struct * p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long sp = args->stack; > unsigned long tls = args->tls; > void 
(*handler)(void); > diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h > index c060549c6c94..89004f4ca208 100644 > --- a/arch/x86/include/asm/fpu/sched.h > +++ b/arch/x86/include/asm/fpu/sched.h > @@ -11,7 +11,7 @@ > > extern void save_fpregs_to_fpstate(struct fpu *fpu); > extern void fpu__drop(struct task_struct *tsk); > -extern int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal, > +extern int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal, > unsigned long shstk_addr); > extern void fpu_flush_thread(void); > > diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h > index ba6f2fe43848..0f50e0125943 100644 > --- a/arch/x86/include/asm/shstk.h > +++ b/arch/x86/include/asm/shstk.h > @@ -16,7 +16,7 @@ struct thread_shstk { > > long shstk_prctl(struct task_struct *task, int option, unsigned long arg2); > void reset_thread_features(void); > -unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags, > +unsigned long shstk_alloc_thread_stack(struct task_struct *p, u64 clone_flags, > unsigned long stack_size); > void shstk_free(struct task_struct *p); > int setup_signal_shadow_stack(struct ksignal *ksig); > @@ -28,7 +28,7 @@ static inline long shstk_prctl(struct task_struct *task, int option, > unsigned long arg2) { return -EINVAL; } > static inline void reset_thread_features(void) {} > static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p, > - unsigned long clone_flags, > + u64 clone_flags, > unsigned long stack_size) { return 0; } > static inline void shstk_free(struct task_struct *p) {} > static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; } > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c > index aefd412a23dc..1f71cc135e9a 100644 > --- a/arch/x86/kernel/fpu/core.c > +++ b/arch/x86/kernel/fpu/core.c > @@ -631,7 +631,7 @@ static int update_fpu_shstk(struct task_struct *dst, 
unsigned long ssp) > } > > /* Clone current's FPU state on fork */ > -int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal, > +int fpu_clone(struct task_struct *dst, u64 clone_flags, bool minimal, > unsigned long ssp) > { > /* > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c > index 1b7960cf6eb0..e3a3987b0c4f 100644 > --- a/arch/x86/kernel/process.c > +++ b/arch/x86/kernel/process.c > @@ -159,7 +159,7 @@ __visible void ret_from_fork(struct task_struct *prev, struct pt_regs *regs, > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long sp = args->stack; > unsigned long tls = args->tls; > struct inactive_task_frame *frame; > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c > index 2ddf23387c7e..5eba6c5a6775 100644 > --- a/arch/x86/kernel/shstk.c > +++ b/arch/x86/kernel/shstk.c > @@ -191,7 +191,7 @@ void reset_thread_features(void) > current->thread.features_locked = 0; > } > > -unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags, > +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, u64 clone_flags, > unsigned long stack_size) > { > struct thread_shstk *shstk = &tsk->thread.shstk; > diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c > index 7bd66677f7b6..94d43f44be13 100644 > --- a/arch/xtensa/kernel/process.c > +++ b/arch/xtensa/kernel/process.c > @@ -267,7 +267,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > > int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > { > - unsigned long clone_flags = args->flags; > + u64 clone_flags = args->flags; > unsigned long usp_thread_fn = args->stack; > unsigned long tls = args->tls; > struct pt_regs *childregs = task_pt_regs(p); > > -- > 2.39.5 > > -- Best Regards Guo Ren From yana2bsh at gmail.com Fri Sep 12 
12:53:23 2025 From: yana2bsh at gmail.com (Yana Bashlykova) Date: Fri, 12 Sep 2025 22:53:23 +0300 Subject: [PATCH 6.1 00/15] genetlink: Test Netlink subsystem of Linux v6.1 Message-ID: <20250912195339.20635-1-yana2bsh@gmail.com> This series adds comprehensive testing infrastructure for Netlink and Generic Netlink The implementation includes both kernel module and userspace tests to verify correct Generic Netlink and Netlink behaviors under various conditions. Yana Bashlykova (15): genetlink: add sysfs test module for Generic Netlink genetlink: add TEST_GENL family for netlink testing genetlink: add PARALLEL_GENL test family genetlink: add test case for duplicate genl family registration genetlink: add test case for family with invalid ops genetlink: add netlink notifier support genetlink: add THIRD_GENL family genetlink: verify unregister fails for non-registered family genetlink: add LARGE_GENL stress test family selftests: net: genetlink: add packet capture test infrastructure selftests: net: genetlink: add /proc/net/netlink test selftests: net: genetlink: add Generic Netlink controller tests selftests: net: genetlink: add large family ID resolution test selftests: net: genetlink: add Netlink and Generic Netlink test suite selftests: net: genetlink: fix expectation for large family resolution drivers/net/Kconfig | 2 + drivers/net/Makefile | 2 + drivers/net/genetlink/Kconfig | 8 + drivers/net/genetlink/Makefile | 3 + .../net-pf-16-proto-16-family-PARALLEL_GENL.c | 1921 ++++++ tools/testing/selftests/net/Makefile | 6 + tools/testing/selftests/net/genetlink.c | 5152 +++++++++++++++++ 7 files changed, 7094 insertions(+) create mode 100644 drivers/net/genetlink/Kconfig create mode 100644 drivers/net/genetlink/Makefile create mode 100644 drivers/net/genetlink/net-pf-16-proto-16-family-PARALLEL_GENL.c create mode 100644 tools/testing/selftests/net/genetlink.c -- 2.34.1 From yana2bsh at gmail.com Fri Sep 12 12:53:33 2025 From: yana2bsh at gmail.com (Yana Bashlykova) 
Date: Fri, 12 Sep 2025 22:53:33 +0300 Subject: [PATCH 6.1 10/15] selftests: net: genetlink: add packet capture test infrastructure In-Reply-To: <20250912195339.20635-1-yana2bsh@gmail.com> References: <20250912195339.20635-1-yana2bsh@gmail.com> Message-ID: <20250912195339.20635-11-yana2bsh@gmail.com> Add test cases for monitoring Netlink traffic during test execution Require CONFIG_NLMON. Signed-off-by: Yana Bashlykova --- tools/testing/selftests/net/Makefile | 6 + tools/testing/selftests/net/genetlink.c | 234 ++++++++++++++++++++++++ 2 files changed, 240 insertions(+) create mode 100644 tools/testing/selftests/net/genetlink.c diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 69c58362c0ed..0c325ccc5f03 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -71,6 +71,7 @@ TEST_GEN_FILES += bind_bhash TEST_GEN_PROGS += sk_bind_sendto_listen TEST_GEN_PROGS += sk_connect_zero_addr TEST_PROGS += test_ingress_egress_chaining.sh +TEST_GEN_PROGS += genetlink TEST_FILES := settings @@ -82,3 +83,8 @@ $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread $(OUTPUT)/tcp_inq: LDLIBS += -lpthread $(OUTPUT)/bind_bhash: LDLIBS += -lpthread + +$(OUTPUT)/genetlink: LDLIBS += -lnl-3 -lnl-genl-3 +$(OUTPUT)/genetlink: CFLAGS += $(shell pkg-config --cflags libnl-3.0 libnl-genl-3.0) + +EXTRA_CLEAN := $(SCRATCH_DIR) $(OUTPUT)/genetlink.pcap diff --git a/tools/testing/selftests/net/genetlink.c b/tools/testing/selftests/net/genetlink.c new file mode 100644 index 000000000000..5be9ca68accd --- /dev/null +++ b/tools/testing/selftests/net/genetlink.c @@ -0,0 +1,234 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Generic Netlink and Netlink test cases + * + * This test suite validates various aspects of Generic Netlink and Netlink communication + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include 
+#include +#include +#include +#include +#include + +#include "../kselftest_harness.h" + +#define MY_GENL_FAMILY_NAME "TEST_GENL" +#define MY_GENL_CMD_UNSPEC 0 +#define MY_GENL_CMD_ECHO 1 +#define MY_GENL_CMD_SET_VALUE 2 +#define MY_GENL_CMD_GET_VALUE 3 +#define MY_GENL_CMD_EVENT 4 +#define MY_GENL_CMD_NO_ATTRS 5 + +#define MY_GENL_SMALL_CMD_GET 0 + +#define MY_GENL_ATTR_UNSPEC 0 +#define MY_GENL_ATTR_DATA 1 +#define MY_GENL_ATTR_VALUE 2 +#define MY_GENL_ATTR_PATH 3 +#define MY_GENL_ATTR_NESTED 4 +#define MY_GENL_ATTR_MAX 4 + +#define THIRD_GENL_FAMILY_NAME "THIRD_GENL" + +#define THIRD_GENL_CMD_ECHO 1 + +#define THIRD_GENL_ATTR_UNSPEC 0 +#define THIRD_GENL_ATTR_DATA 1 +#define THIRD_GENL_ATTR_FLAG 2 +#define THIRD_GENL_ATTR_MAX 2 + +#define PATH_GENL_TEST_NUM "/sys/kernel/genl_test/value" +#define PATH_GENL_TEST_MES "/sys/kernel/genl_test/message" +#define PATH_GENL_TEST_DEV "/sys/kernel/genl_test/some_info" +#define PATH_PARALLEL_GENL_MES "/sys/kernel/parallel_genl/message" +#define PATH_THIRD_GENL_MES "/sys/kernel/third_genl/message" + +#define MY_MCGRP_NAME "MY_MCGRP_GENL" + +#define GENL_CTRL "nlctrl" +#define CTRL_ATTR_POLICY_MAX (__CTRL_ATTR_POLICY_DUMP_MAX - 1) + +#define PARALLEL_GENL_FAMILY_NAME "PARALLEL_GENL" +#define PARALLEL_GENL_ATTR_UNSPEC 0 +#define PARALLEL_GENL_CMD_SEND 1 +#define PARALLEL_GENL_CMD_DUMP_INFO 2 +#define PARALLEL_GENL_CMD_SET_VALUE 3 +#define PARALLEL_GENL_CMD_GET_VALUE 4 + +#define PARALLEL_GENL_ATTR_DATA 1 +#define PARALLEL_GENL_ATTR_BINARY 2 +#define PARALLEL_GENL_ATTR_NAME 3 +#define PARALLEL_GENL_ATTR_DESC 4 +#define PARALLEL_GENL_ATTR_FLAG_NONBLOCK 9 +#define PARALLEL_GENL_ATTR_FLAG_BLOCK 10 +#define PARALLEL_GENL_ATTR_PATH 12 +#define PARALLEL_GENL_ATTR_MAX 12 + +#define LARGE_GENL_FAMILY_NAME "LARGE_GENL" + +/* + * Test cases + */ + +/** + * TEST(capture_start) - Starts Netlink traffic capture using nlmon interface + * + * Creates a virtual nlmon interface, enables it and starts packet capture + * with tcpdump. 
Captured packets are saved to 'genetlink.pcap' file. + * + * Note: + * - Requires root privileges + * - Creates temporary interface 'nlmon0' + * - Runs tcpdump in background + * - Adds small delay to ensure capture starts + */ + +TEST(capture_start) +{ + printf("Running Test: starting Netlink traffic capture...\n"); + + // Only root can monitor Netlink traffic + if (geteuid()) { + SKIP(return, "test requires root"); + return; + } + + char command[256]; + int result; + + snprintf(command, sizeof(command), "ip link add nlmon0 type nlmon"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + snprintf(command, sizeof(command), "ip link set nlmon0 up"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + snprintf(command, sizeof(command), + "tcpdump -i nlmon0 -w genetlink.pcap &"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + printf("nlmon is up. Starting netlink process...\n"); + + sleep(2); + + printf("Starting Netlink tests...\n"); +} + +/** + * TEST(capture_end) - Terminates Netlink traffic monitoring session + * + * Performs controlled shutdown of nlmon capture interface by: + * 1. Stopping tcpdump capture process + * 2. Bringing down nlmon interface + * 3. Deleting nlmon interface + * + * Test Procedure: + * 1. Privilege Check: + * - Verifies root privileges (required for nlmon operations) + * - Gracefully skips if not root + * + * 2. Capture Termination: + * - Stops tcpdump process (2-second delay for cleanup) + * - Brings nlmon0 interface down + * - Deletes nlmon0 interface + * - Validates each operation succeeds + * + * 3. 
Cleanup Verification: + * - Checks system command exit statuses + * - Provides detailed error reporting + * + * Key Validations: + * - Proper termination of monitoring session + * - Correct interface teardown + * - Root privilege enforcement + * - System command error handling + * + * Expected Behavior: + * - tcpdump process should terminate successfully + * - nlmon0 interface should deactivate cleanly + * - Interface should be removable + * - Non-root execution should skip gracefully + * + * Security Considerations: + * - Requires root for network interface control + * - Ensures complete capture session cleanup + * - Verifies proper resource release + * + * Note: + * - Should be paired with capture_start test + * - Includes 2-second delay for process stabilization + * - Provides status feedback through printf + */ + +TEST(capture_end) +{ + printf("Running Test: stopping Netlink traffic capture...\n"); + + // Only root can monitor Netlink traffic + if (geteuid()) { + SKIP(return, "test requires root"); + return; + } + + char command[256]; + int result; + + sleep(2); + + snprintf(command, sizeof(command), "pkill tcpdump"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + snprintf(command, sizeof(command), "ip link set nlmon0 down"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + snprintf(command, sizeof(command), "ip link delete nlmon0 type nlmon"); + result = system(command); + ASSERT_EQ(WEXITSTATUS(result), 0); + if (result == -1) { + perror("system"); + return; + } + + printf("The capturing is over\n"); +} + +TEST_HARNESS_MAIN -- 2.34.1 From kuba at kernel.org Fri Sep 12 13:17:22 2025 From: kuba at kernel.org (Jakub Kicinski) Date: Fri, 12 Sep 2025 13:17:22 -0700 Subject: [PATCH 6.1 00/15] genetlink: Test Netlink subsystem of Linux v6.1 In-Reply-To: <20250912195339.20635-1-yana2bsh@gmail.com> References: 
<20250912195339.20635-1-yana2bsh@gmail.com> Message-ID: <20250912131722.74658ec0@kernel.org> On Fri, 12 Sep 2025 22:53:23 +0300 Yana Bashlykova wrote: > This series adds comprehensive testing infrastructure for Netlink > and Generic Netlink > > The implementation includes both kernel module and userspace tests to > verify correct Generic Netlink and Netlink behaviors under > various conditions. What is the motivation for this work? From andrew at lunn.ch Fri Sep 12 14:10:51 2025 From: andrew at lunn.ch (Andrew Lunn) Date: Fri, 12 Sep 2025 23:10:51 +0200 Subject: [PATCH net-next v11 2/5] net: spacemit: Add K1 Ethernet MAC In-Reply-To: <20250912-net-k1-emac-v11-2-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> <20250912-net-k1-emac-v11-2-aa3e84f8043b@iscas.ac.cn> Message-ID: <1f2887e4-2644-48a4-8171-98bd310d190f@lunn.ch> > +static u32 emac_rd(struct emac_priv *priv, u32 reg) > +{ > + return readl(priv->iobase + reg); > +} > +static int emac_mii_read(struct mii_bus *bus, int phy_addr, int regnum) > +{ > + struct emac_priv *priv = bus->priv; > + u32 cmd = 0, val; > + int ret; > + > + cmd |= FIELD_PREP(MREGBIT_PHY_ADDRESS, phy_addr); > + cmd |= FIELD_PREP(MREGBIT_REGISTER_ADDRESS, regnum); > + cmd |= MREGBIT_START_MDIO_TRANS | MREGBIT_MDIO_READ_WRITE; > + > + emac_wr(priv, MAC_MDIO_DATA, 0x0); > + emac_wr(priv, MAC_MDIO_CONTROL, cmd); > + > + ret = readl_poll_timeout(priv->iobase + MAC_MDIO_CONTROL, val, > + !(val & MREGBIT_START_MDIO_TRANS), 100, 10000); > + > + if (ret) > + return ret; > + > + val = emac_rd(priv, MAC_MDIO_DATA); > + return val; emac_rd() returns a u32. Is it guaranteed by the hardware that the upper word is 0? Maybe this needs to be masked? 
Andrew From andrew at lunn.ch Fri Sep 12 14:12:06 2025 From: andrew at lunn.ch (Andrew Lunn) Date: Fri, 12 Sep 2025 23:12:06 +0200 Subject: [PATCH net-next v11 4/5] riscv: dts: spacemit: Add Ethernet support for BPI-F3 In-Reply-To: <20250912-net-k1-emac-v11-4-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> <20250912-net-k1-emac-v11-4-aa3e84f8043b@iscas.ac.cn> Message-ID: <0df5d251-3c2e-4b5a-8fb8-b5a6d00383c2@lunn.ch> On Fri, Sep 12, 2025 at 02:13:56AM +0800, Vivian Wang wrote: > Banana Pi BPI-F3 uses an RGMII PHY for each port and uses GPIO for PHY > reset. > > Tested-by: Hendrik Hamerlinck > Signed-off-by: Vivian Wang > Reviewed-by: Yixun Lan Reviewed-by: Andrew Lunn Andrew From andrew at lunn.ch Fri Sep 12 14:12:30 2025 From: andrew at lunn.ch (Andrew Lunn) Date: Fri, 12 Sep 2025 23:12:30 +0200 Subject: [PATCH net-next v11 5/5] riscv: dts: spacemit: Add Ethernet support for Jupiter In-Reply-To: <20250912-net-k1-emac-v11-5-aa3e84f8043b@iscas.ac.cn> References: <20250912-net-k1-emac-v11-0-aa3e84f8043b@iscas.ac.cn> <20250912-net-k1-emac-v11-5-aa3e84f8043b@iscas.ac.cn> Message-ID: <12583aec-4499-4cc1-a487-9c7b8d8efb01@lunn.ch> On Fri, Sep 12, 2025 at 02:13:57AM +0800, Vivian Wang wrote: > Milk-V Jupiter uses an RGMII PHY for each port and uses GPIO for PHY > reset. > > Signed-off-by: Vivian Wang > Reviewed-by: Yixun Lan Reviewed-by: Andrew Lunn Andrew From safinaskar at zohomail.com Fri Sep 12 15:38:35 2025 From: safinaskar at zohomail.com (Askar Safin) Date: Fri, 12 Sep 2025 22:38:35 +0000 Subject: [PATCH 00/62] initrd: remove classic initrd support Message-ID: <20250912223937.3735076-1-safinaskar@zohomail.com> Intro ==== This patchset removes classic initrd (initial RAM disk) support, which was deprecated in 2020. Initramfs still stays, and RAM disk itself (brd) still stays, too. init/do_mounts* and init/*initramfs* are listed in VFS entry in MAINTAINERS, so I think this patchset should go through VFS tree. 
This patchset touches every subdirectory in arch/, so I tested it on 8 (!!!) archs in Qemu (see details below). Warning: this patchset renames CONFIG_BLK_DEV_INITRD (!!!) to CONFIG_INITRAMFS and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_* (for example, CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP). If you still use initrd, see below for a workaround. Details ==== I not only removed initrd, I also removed a lot of code that became dead, including a lot of code in arch/. Still, I think the only two architectures I touched in a non-trivial way are sh and 32-bit arm. I also renamed some files, functions and variables (which became misnomers) to proper names, moved some code around, and removed a lot of mentions of initrd in code and comments. I also cleaned up some docs. For example, I renamed the following global variables: __initramfs_start __initramfs_size phys_initrd_start phys_initrd_size initrd_start initrd_end to: __builtin_initramfs_start __builtin_initramfs_size phys_external_initramfs_start phys_external_initramfs_size virt_external_initramfs_start virt_external_initramfs_end The new names precisely capture the meaning of these variables. I also renamed CONFIG_BLK_DEV_INITRD (which became a total misnomer) to CONFIG_INITRAMFS, and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_*. This will break all configs out there (update your configs!). Still, I think this is okay, because config names were never part of the stable API. I don't have a strong opinion here, so I can drop these renamings if needed. Other user-visible changes: - Removed kernel command line parameters "load_ramdisk" and "prompt_ramdisk", which did nothing and were deprecated - Removed kernel command line parameter "ramdisk_start", which was used for initrd only (not for initramfs) - Removed kernel command line parameter "noinitrd", which was inconsistent: it controlled initrd only (not initramfs), except for EFI boot, where it controlled both initramfs and initrd. 
EFI users can still disable initramfs simply by not passing it - Removed kernel command line parameter "ramdisk_size", which was used for controlling ramdisk (brd), but only in non-modular mode. Use brd.rd_size instead, it always works - Removed /proc/sys/kernel/real-root-dev . It was used for initrd only This patchset is based on v6.17-rc5. Testing ==== I tested my patchset on many architectures in Qemu using my Rust program, heavily based on mkroot [1]. I used the following cross-compilers: aarch64-linux-musleabi armv4l-linux-musleabihf armv5l-linux-musleabihf armv7l-linux-musleabihf i486-linux-musl i686-linux-musl mips-linux-musl mips64-linux-musl mipsel-linux-musl powerpc-linux-musl powerpc64-linux-musl powerpc64le-linux-musl riscv32-linux-musl riscv64-linux-musl s390x-linux-musl sh4-linux-musl sh4eb-linux-musl x86_64-linux-musl taken from this directory [2]. So, as you can see, there are 18 triplets, which correspond to 8 subdirs in arch/. Note that this list contains the two archs (arm and sh) touched in a non-trivial way. For every triplet I tested that: - Initramfs still works (both builtin and external) - Direct boot from disk still works Workaround ==== If "retain_initrd" is passed to the kernel, then the initramfs/initrd passed by the bootloader is retained and becomes available after boot as the read-only magic file /sys/firmware/initrd [3]. No copies are involved, i.e. /sys/firmware/initrd is simply a reference to the original blob passed by the bootloader. This works even if the initrd/initramfs is not recognized by the kernel in any way, i.e. even if it is neither a valid cpio archive nor a fs image supported by classic initrd. This works both with my patchset and without it. This means that you can emulate classic initrd as follows: link a builtin initramfs into the kernel. In /init in this initramfs, copy /sys/firmware/initrd to some file in / and loop-mount it. 
This is even better than classic initrd, because: - You can use a filesystem not supported by classic initrd, for example erofs - One copy is involved (from /sys/firmware/initrd to some file in /) as opposed to two when using classic initrd Still, I don't recommend using this workaround, because I want everyone to migrate to proper modern initramfs. You can still use this workaround if you want. Also: it is not possible to directly loop-mount /sys/firmware/initrd . Theoretically, the kernel could be changed to allow this (and/or to make it writable), but I think nobody needs this, and I don't want to implement it. [1] https://github.com/landley/toybox/tree/master/mkroot [2] https://landley.net/toybox/downloads/binaries/toolchains/latest [3] https://lore.kernel.org/all/20231207235654.16622-1-graf@amazon.com/ Askar Safin (62): init: remove deprecated "load_ramdisk" command line parameter, which does nothing init: remove deprecated "prompt_ramdisk" command line parameter, which does nothing init: sh, sparc, x86: remove unused constants RAMDISK_PROMPT_FLAG and RAMDISK_LOAD_FLAG init: x86, arm, sh, sparc: remove variable rd_image_start, which controls starting block number of initrd init: remove "ramdisk_start" command line parameter, which controls starting block number of initrd arm: init: remove special logic for setting brd.rd_size arm: init: remove ATAG_RAMDISK arm: init: remove FLAG_RDLOAD and FLAG_RDPROMPT arm: init: document rd_start (in param_struct) as obsolete initrd: remove initrd (initial RAM disk) support init, efi: remove "noinitrd" command line parameter init: remove /proc/sys/kernel/real-root-dev ext2: remove ext2_image_size and associated code init: m68k, mips, powerpc, s390, sh: remove Root_RAM0 doc: modernize Documentation/admin-guide/blockdev/ramdisk.rst brd: remove "ramdisk_size" command line parameter doc: modernize Documentation/filesystems/ramfs-rootfs-initramfs.rst doc: modernize Documentation/driver-api/early-userspace/early_userspace_support.rst 
init: remove mentions of "ramdisk=" command line parameter doc: remove Documentation/power/swsusp-dmcrypt.rst init: remove all mentions of root=/dev/ram* doc: remove obsolete mentions of pivot_root init: rename __initramfs_{start,size} to __builtin_initramfs_{start,size} init: remove wrong comment init: rename phys_initrd_{start,size} to phys_external_initramfs_{start,size} init: move phys_external_initramfs_{start,size} to init/initramfs.c init: alpha: remove "extern unsigned long initrd_start, initrd_end" init: alpha, arc, arm, arm64, csky, m68k, microblaze, mips, nios2, openrisc, parisc, powerpc, s390, sh, sparc, um, x86, xtensa: rename initrd_{start,end} to virt_external_initramfs_{start,end} init: move virt_external_initramfs_{start,end} to init/initramfs.c doc: remove documentation for block device 4 0 init: rename initrd_below_start_ok to initramfs_below_start_ok init: move initramfs_below_start_ok to init/initramfs.c init: remove init/do_mounts_initrd.c init: inline create_dev into the only caller init: make mount_root_generic static init: make mount_root static init: remove root_mountflags from init/do_mounts.h init: remove most headers from init/do_mounts.h init: make console_on_rootfs static init: rename free_initrd_mem to free_initramfs_mem init: rename reserve_initrd_mem to reserve_initramfs_mem init: rename to setsid: inline ksys_setsid into the only caller doc: kernel-parameters: remove [RAM] from reserve_mem= doc: kernel-parameters: replace [RAM] with [INITRAMFS] init: edit docs for initramfs-related configs init: fix typo: virtul => virtual init: fix comment init: rename ramdisk_execute_command to initramfs_execute_command init: rename ramdisk_command_access to initramfs_command_access init: rename get_boot_config_from_initrd to get_boot_config_from_initramfs init: rename do_retain_initrd to retain_initramfs init: rename kexec_free_initrd to kexec_free_initramfs init: arm, x86: deal with some references to initrd init: rename CONFIG_BLK_DEV_INITRD 
to CONFIG_INITRAMFS init: rename CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP init: rename CONFIG_RD_BZIP2 to CONFIG_INITRAMFS_DECOMPRESS_BZIP2 init: rename CONFIG_RD_LZMA to CONFIG_INITRAMFS_DECOMPRESS_LZMA init: rename CONFIG_RD_XZ to CONFIG_INITRAMFS_DECOMPRESS_XZ init: rename CONFIG_RD_LZO to CONFIG_INITRAMFS_DECOMPRESS_LZO init: rename CONFIG_RD_LZ4 to CONFIG_INITRAMFS_DECOMPRESS_LZ4 init: rename CONFIG_RD_ZSTD to CONFIG_INITRAMFS_DECOMPRESS_ZSTD .../admin-guide/blockdev/ramdisk.rst | 104 +---- .../admin-guide/device-mapper/dm-init.rst | 4 +- Documentation/admin-guide/devices.txt | 12 - Documentation/admin-guide/index.rst | 1 - Documentation/admin-guide/initrd.rst | 383 ------------------ .../admin-guide/kernel-parameters.rst | 4 +- .../admin-guide/kernel-parameters.txt | 38 +- Documentation/admin-guide/nfs/nfsroot.rst | 4 +- Documentation/admin-guide/sysctl/kernel.rst | 6 - Documentation/arch/arm/ixp4xx.rst | 4 +- Documentation/arch/arm/setup.rst | 6 +- Documentation/arch/m68k/kernel-options.rst | 29 +- Documentation/arch/x86/boot.rst | 4 +- .../early_userspace_support.rst | 18 +- .../filesystems/ramfs-rootfs-initramfs.rst | 20 +- Documentation/power/index.rst | 1 - Documentation/power/swsusp-dmcrypt.rst | 140 ------- Documentation/security/ipe.rst | 2 +- .../translations/zh_CN/power/index.rst | 1 - arch/alpha/kernel/core_irongate.c | 12 +- arch/alpha/kernel/proto.h | 2 +- arch/alpha/kernel/setup.c | 32 +- arch/arc/configs/axs101_defconfig | 2 +- arch/arc/configs/axs103_defconfig | 2 +- arch/arc/configs/axs103_smp_defconfig | 2 +- arch/arc/configs/haps_hs_defconfig | 2 +- arch/arc/configs/haps_hs_smp_defconfig | 2 +- arch/arc/configs/hsdk_defconfig | 2 +- arch/arc/configs/nsim_700_defconfig | 2 +- arch/arc/configs/nsimosci_defconfig | 2 +- arch/arc/configs/nsimosci_hs_defconfig | 2 +- arch/arc/configs/nsimosci_hs_smp_defconfig | 2 +- arch/arc/configs/tb10x_defconfig | 4 +- arch/arc/configs/vdk_hs38_defconfig | 2 +- 
arch/arc/configs/vdk_hs38_smp_defconfig | 2 +- arch/arc/mm/init.c | 14 +- arch/arm/Kconfig | 2 +- arch/arm/boot/dts/arm/integratorap.dts | 2 +- arch/arm/boot/dts/arm/integratorcp.dts | 2 +- .../dts/aspeed/aspeed-bmc-facebook-cmm.dts | 2 +- .../aspeed/aspeed-bmc-facebook-galaxy100.dts | 2 +- .../aspeed/aspeed-bmc-facebook-minipack.dts | 2 +- .../aspeed/aspeed-bmc-facebook-wedge100.dts | 2 +- .../aspeed/aspeed-bmc-facebook-wedge40.dts | 2 +- .../dts/aspeed/aspeed-bmc-facebook-yamp.dts | 2 +- .../ast2600-facebook-netbmc-common.dtsi | 2 +- arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts | 2 +- .../ixp/intel-ixp42x-welltech-epbx100.dts | 2 +- arch/arm/boot/dts/nspire/nspire-classic.dtsi | 2 +- arch/arm/boot/dts/nspire/nspire-cx.dts | 2 +- .../boot/dts/samsung/exynos4210-origen.dts | 2 +- .../boot/dts/samsung/exynos4210-smdkv310.dts | 2 +- .../boot/dts/samsung/exynos4412-smdk4412.dts | 2 +- .../boot/dts/samsung/exynos5250-smdk5250.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-nhk15.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-s8815.dts | 2 +- arch/arm/boot/dts/st/stm32429i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32746g-eval.dts | 2 +- arch/arm/boot/dts/st/stm32f429-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f469-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f746-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f769-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32h747i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h750i-art-pi.dts | 2 +- arch/arm/configs/aspeed_g4_defconfig | 8 +- arch/arm/configs/aspeed_g5_defconfig | 8 +- arch/arm/configs/assabet_defconfig | 4 +- arch/arm/configs/at91_dt_defconfig | 4 +- arch/arm/configs/axm55xx_defconfig | 2 +- arch/arm/configs/bcm2835_defconfig | 2 +- arch/arm/configs/clps711x_defconfig | 4 +- arch/arm/configs/collie_defconfig | 4 +- arch/arm/configs/davinci_all_defconfig | 2 +- arch/arm/configs/exynos_defconfig | 4 +- arch/arm/configs/footbridge_defconfig | 2 
+- arch/arm/configs/gemini_defconfig | 2 +- arch/arm/configs/h3600_defconfig | 2 +- arch/arm/configs/hisi_defconfig | 4 +- arch/arm/configs/imx_v4_v5_defconfig | 2 +- arch/arm/configs/imx_v6_v7_defconfig | 4 +- arch/arm/configs/integrator_defconfig | 2 +- arch/arm/configs/ixp4xx_defconfig | 2 +- arch/arm/configs/keystone_defconfig | 2 +- arch/arm/configs/lpc18xx_defconfig | 12 +- arch/arm/configs/lpc32xx_defconfig | 4 +- arch/arm/configs/milbeaut_m10v_defconfig | 2 +- arch/arm/configs/multi_v4t_defconfig | 2 +- arch/arm/configs/multi_v5_defconfig | 2 +- arch/arm/configs/multi_v7_defconfig | 2 +- arch/arm/configs/mvebu_v7_defconfig | 2 +- arch/arm/configs/mxs_defconfig | 2 +- arch/arm/configs/neponset_defconfig | 4 +- arch/arm/configs/nhk8815_defconfig | 2 +- arch/arm/configs/omap1_defconfig | 2 +- arch/arm/configs/omap2plus_defconfig | 2 +- arch/arm/configs/pxa910_defconfig | 2 +- arch/arm/configs/pxa_defconfig | 4 +- arch/arm/configs/qcom_defconfig | 2 +- arch/arm/configs/rpc_defconfig | 2 +- arch/arm/configs/s3c6400_defconfig | 4 +- arch/arm/configs/s5pv210_defconfig | 4 +- arch/arm/configs/sama5_defconfig | 4 +- arch/arm/configs/sama7_defconfig | 2 +- arch/arm/configs/shmobile_defconfig | 2 +- arch/arm/configs/socfpga_defconfig | 2 +- arch/arm/configs/sp7021_defconfig | 12 +- arch/arm/configs/spear13xx_defconfig | 2 +- arch/arm/configs/spear3xx_defconfig | 2 +- arch/arm/configs/spear6xx_defconfig | 2 +- arch/arm/configs/spitz_defconfig | 2 +- arch/arm/configs/stm32_defconfig | 2 +- arch/arm/configs/sunxi_defconfig | 2 +- arch/arm/configs/tegra_defconfig | 2 +- arch/arm/configs/u8500_defconfig | 4 +- arch/arm/configs/versatile_defconfig | 2 +- arch/arm/configs/vexpress_defconfig | 2 +- arch/arm/configs/vf610m4_defconfig | 10 +- arch/arm/configs/vt8500_v6_v7_defconfig | 2 +- arch/arm/configs/wpcm450_defconfig | 2 +- arch/arm/include/uapi/asm/setup.h | 10 - arch/arm/kernel/atags_compat.c | 10 - arch/arm/kernel/atags_parse.c | 16 +- arch/arm/kernel/setup.c | 2 +- 
arch/arm/mm/init.c | 24 +- arch/arm64/configs/defconfig | 2 +- arch/arm64/kernel/setup.c | 2 +- arch/arm64/mm/init.c | 17 +- arch/csky/kernel/setup.c | 24 +- arch/csky/mm/init.c | 2 +- arch/hexagon/configs/comet_defconfig | 2 +- arch/loongarch/configs/loongson3_defconfig | 2 +- arch/loongarch/kernel/mem.c | 2 +- arch/loongarch/kernel/setup.c | 4 +- arch/m68k/configs/amiga_defconfig | 2 +- arch/m68k/configs/apollo_defconfig | 2 +- arch/m68k/configs/atari_defconfig | 2 +- arch/m68k/configs/bvme6000_defconfig | 2 +- arch/m68k/configs/hp300_defconfig | 2 +- arch/m68k/configs/mac_defconfig | 2 +- arch/m68k/configs/multi_defconfig | 2 +- arch/m68k/configs/mvme147_defconfig | 2 +- arch/m68k/configs/mvme16x_defconfig | 2 +- arch/m68k/configs/q40_defconfig | 2 +- arch/m68k/configs/stmark2_defconfig | 2 +- arch/m68k/configs/sun3_defconfig | 2 +- arch/m68k/configs/sun3x_defconfig | 2 +- arch/m68k/kernel/setup_mm.c | 12 +- arch/m68k/kernel/setup_no.c | 12 +- arch/m68k/kernel/uboot.c | 17 +- arch/microblaze/kernel/cpu/mb.c | 2 +- arch/microblaze/kernel/setup.c | 2 +- arch/microblaze/mm/init.c | 12 +- arch/mips/ath79/prom.c | 12 +- arch/mips/configs/ath25_defconfig | 12 +- arch/mips/configs/ath79_defconfig | 4 +- arch/mips/configs/bcm47xx_defconfig | 2 +- arch/mips/configs/bigsur_defconfig | 2 +- arch/mips/configs/bmips_be_defconfig | 2 +- arch/mips/configs/bmips_stb_defconfig | 14 +- arch/mips/configs/cavium_octeon_defconfig | 2 +- arch/mips/configs/eyeq5_defconfig | 2 +- arch/mips/configs/eyeq6_defconfig | 2 +- arch/mips/configs/generic_defconfig | 2 +- arch/mips/configs/gpr_defconfig | 2 +- arch/mips/configs/lemote2f_defconfig | 2 +- arch/mips/configs/loongson2k_defconfig | 2 +- arch/mips/configs/loongson3_defconfig | 2 +- arch/mips/configs/malta_defconfig | 2 +- arch/mips/configs/mtx1_defconfig | 2 +- arch/mips/configs/rb532_defconfig | 2 +- arch/mips/configs/rbtx49xx_defconfig | 2 +- arch/mips/configs/rt305x_defconfig | 4 +- arch/mips/configs/sb1250_swarm_defconfig | 2 +- 
arch/mips/configs/xway_defconfig | 4 +- arch/mips/kernel/setup.c | 53 ++- arch/mips/mm/init.c | 2 +- arch/mips/sibyte/common/cfe.c | 36 +- arch/mips/sibyte/swarm/setup.c | 2 +- arch/nios2/kernel/setup.c | 20 +- arch/openrisc/configs/or1klitex_defconfig | 2 +- arch/openrisc/configs/or1ksim_defconfig | 4 +- arch/openrisc/configs/simple_smp_defconfig | 14 +- arch/openrisc/configs/virt_defconfig | 2 +- arch/openrisc/kernel/setup.c | 24 +- arch/openrisc/kernel/vmlinux.h | 2 +- arch/parisc/boot/compressed/misc.c | 2 +- arch/parisc/configs/generic-32bit_defconfig | 2 +- arch/parisc/configs/generic-64bit_defconfig | 2 +- arch/parisc/defpalo.conf | 2 +- arch/parisc/kernel/pdt.c | 6 +- arch/parisc/kernel/setup.c | 8 +- arch/parisc/mm/init.c | 32 +- arch/powerpc/configs/44x/akebono_defconfig | 2 +- arch/powerpc/configs/44x/arches_defconfig | 2 +- arch/powerpc/configs/44x/bamboo_defconfig | 2 +- arch/powerpc/configs/44x/bluestone_defconfig | 2 +- .../powerpc/configs/44x/canyonlands_defconfig | 2 +- arch/powerpc/configs/44x/ebony_defconfig | 2 +- arch/powerpc/configs/44x/eiger_defconfig | 2 +- arch/powerpc/configs/44x/fsp2_defconfig | 10 +- arch/powerpc/configs/44x/icon_defconfig | 2 +- arch/powerpc/configs/44x/iss476-smp_defconfig | 2 +- arch/powerpc/configs/44x/katmai_defconfig | 2 +- arch/powerpc/configs/44x/rainier_defconfig | 2 +- arch/powerpc/configs/44x/redwood_defconfig | 2 +- arch/powerpc/configs/44x/sam440ep_defconfig | 2 +- arch/powerpc/configs/44x/sequoia_defconfig | 2 +- arch/powerpc/configs/44x/taishan_defconfig | 2 +- arch/powerpc/configs/44x/warp_defconfig | 2 +- arch/powerpc/configs/52xx/cm5200_defconfig | 2 +- arch/powerpc/configs/52xx/lite5200b_defconfig | 2 +- arch/powerpc/configs/52xx/motionpro_defconfig | 2 +- arch/powerpc/configs/52xx/tqm5200_defconfig | 2 +- arch/powerpc/configs/83xx/asp8347_defconfig | 2 +- .../configs/83xx/mpc8313_rdb_defconfig | 2 +- .../configs/83xx/mpc8315_rdb_defconfig | 2 +- .../configs/83xx/mpc832x_rdb_defconfig | 2 +- 
.../configs/83xx/mpc834x_itx_defconfig | 2 +- .../configs/83xx/mpc834x_itxgp_defconfig | 2 +- .../configs/83xx/mpc836x_rdk_defconfig | 2 +- .../configs/83xx/mpc837x_rdb_defconfig | 2 +- arch/powerpc/configs/85xx/ge_imp3a_defconfig | 2 +- arch/powerpc/configs/85xx/ksi8560_defconfig | 2 +- arch/powerpc/configs/85xx/socrates_defconfig | 2 +- arch/powerpc/configs/85xx/stx_gp3_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8540_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8541_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8548_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8555_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8560_defconfig | 2 +- .../configs/85xx/xes_mpc85xx_defconfig | 2 +- arch/powerpc/configs/amigaone_defconfig | 2 +- arch/powerpc/configs/cell_defconfig | 2 +- arch/powerpc/configs/chrp32_defconfig | 2 +- arch/powerpc/configs/fsl-emb-nonhw.config | 2 +- arch/powerpc/configs/g5_defconfig | 2 +- arch/powerpc/configs/gamecube_defconfig | 2 +- arch/powerpc/configs/holly_defconfig | 2 +- arch/powerpc/configs/linkstation_defconfig | 2 +- arch/powerpc/configs/mgcoge_defconfig | 4 +- arch/powerpc/configs/microwatt_defconfig | 2 +- arch/powerpc/configs/mpc512x_defconfig | 2 +- arch/powerpc/configs/mpc5200_defconfig | 2 +- arch/powerpc/configs/mpc83xx_defconfig | 2 +- arch/powerpc/configs/pasemi_defconfig | 2 +- arch/powerpc/configs/pmac32_defconfig | 2 +- arch/powerpc/configs/powernv_defconfig | 2 +- arch/powerpc/configs/ppc44x_defconfig | 2 +- arch/powerpc/configs/ppc64_defconfig | 2 +- arch/powerpc/configs/ppc64e_defconfig | 2 +- arch/powerpc/configs/ppc6xx_defconfig | 2 +- arch/powerpc/configs/ps3_defconfig | 2 +- arch/powerpc/configs/skiroot_defconfig | 12 +- arch/powerpc/configs/wii_defconfig | 2 +- arch/powerpc/kernel/prom.c | 22 +- arch/powerpc/kernel/prom_init.c | 6 +- arch/powerpc/kernel/setup-common.c | 25 +- arch/powerpc/kernel/setup_32.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- arch/powerpc/mm/init_32.c | 2 +- 
arch/powerpc/platforms/52xx/lite5200.c | 2 +- arch/powerpc/platforms/83xx/km83xx.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 2 +- arch/powerpc/platforms/chrp/setup.c | 2 +- .../platforms/embedded6xx/linkstation.c | 2 +- .../platforms/embedded6xx/storcenter.c | 2 +- arch/powerpc/platforms/powermac/setup.c | 8 +- arch/riscv/configs/defconfig | 2 +- arch/riscv/configs/nommu_k210_defconfig | 16 +- arch/riscv/configs/nommu_virt_defconfig | 12 +- arch/riscv/mm/init.c | 4 +- arch/s390/boot/ipl_parm.c | 2 +- arch/s390/boot/startup.c | 4 +- arch/s390/configs/zfcpdump_defconfig | 2 +- arch/s390/kernel/setup.c | 10 +- arch/s390/mm/init.c | 2 +- arch/sh/configs/apsh4a3a_defconfig | 2 +- arch/sh/configs/apsh4ad0a_defconfig | 2 +- arch/sh/configs/ecovec24-romimage_defconfig | 2 +- arch/sh/configs/edosk7760_defconfig | 2 +- arch/sh/configs/kfr2r09-romimage_defconfig | 2 +- arch/sh/configs/kfr2r09_defconfig | 2 +- arch/sh/configs/magicpanelr2_defconfig | 2 +- arch/sh/configs/migor_defconfig | 2 +- arch/sh/configs/rsk7201_defconfig | 2 +- arch/sh/configs/rsk7203_defconfig | 2 +- arch/sh/configs/sdk7786_defconfig | 8 +- arch/sh/configs/se7206_defconfig | 2 +- arch/sh/configs/se7705_defconfig | 2 +- arch/sh/configs/se7722_defconfig | 2 +- arch/sh/configs/se7751_defconfig | 2 +- arch/sh/configs/secureedge5410_defconfig | 2 +- arch/sh/configs/sh03_defconfig | 2 +- arch/sh/configs/sh7757lcr_defconfig | 2 +- arch/sh/configs/titan_defconfig | 2 +- arch/sh/configs/ul2_defconfig | 2 +- arch/sh/configs/urquell_defconfig | 2 +- arch/sh/include/asm/setup.h | 1 - arch/sh/kernel/head_32.S | 2 +- arch/sh/kernel/setup.c | 27 +- arch/sparc/boot/piggyback.c | 4 +- arch/sparc/configs/sparc32_defconfig | 2 +- arch/sparc/configs/sparc64_defconfig | 2 +- arch/sparc/kernel/head_32.S | 4 +- arch/sparc/kernel/head_64.S | 6 +- arch/sparc/kernel/setup_32.c | 9 +- arch/sparc/kernel/setup_64.c | 9 +- arch/sparc/mm/init_32.c | 22 +- arch/sparc/mm/init_64.c | 20 +- arch/um/kernel/Makefile | 2 +- 
arch/um/kernel/initrd.c | 6 +- arch/x86/Kconfig | 2 +- arch/x86/boot/header.S | 2 +- arch/x86/boot/startup/sme.c | 2 +- arch/x86/configs/i386_defconfig | 2 +- arch/x86/configs/x86_64_defconfig | 2 +- arch/x86/include/uapi/asm/bootparam.h | 7 +- arch/x86/kernel/cpu/microcode/amd.c | 2 +- arch/x86/kernel/cpu/microcode/core.c | 12 +- arch/x86/kernel/cpu/microcode/intel.c | 2 +- arch/x86/kernel/cpu/microcode/internal.h | 2 +- arch/x86/kernel/devicetree.c | 2 +- arch/x86/kernel/setup.c | 39 +- arch/x86/mm/init.c | 8 +- arch/x86/mm/init_32.c | 2 +- arch/x86/mm/init_64.c | 2 +- arch/x86/tools/relocs.c | 2 +- arch/xtensa/Kconfig | 2 +- arch/xtensa/boot/dts/csp.dts | 2 +- arch/xtensa/configs/audio_kc705_defconfig | 2 +- arch/xtensa/configs/cadence_csp_defconfig | 12 +- arch/xtensa/configs/generic_kc705_defconfig | 2 +- arch/xtensa/configs/nommu_kc705_defconfig | 12 +- arch/xtensa/configs/smp_lx200_defconfig | 2 +- arch/xtensa/configs/virt_defconfig | 2 +- arch/xtensa/configs/xip_kc705_defconfig | 2 +- arch/xtensa/kernel/setup.c | 26 +- drivers/acpi/Kconfig | 2 +- drivers/acpi/tables.c | 10 +- drivers/base/firmware_loader/main.c | 2 +- drivers/block/Kconfig | 8 +- drivers/block/brd.c | 20 +- drivers/firmware/efi/efi.c | 10 +- .../firmware/efi/libstub/efi-stub-helper.c | 5 +- drivers/gpu/drm/ci/arm.config | 2 +- drivers/gpu/drm/ci/arm64.config | 2 +- drivers/gpu/drm/ci/x86_64.config | 2 +- drivers/of/fdt.c | 18 +- fs/ext2/ext2.h | 9 - fs/init.c | 14 - include/asm-generic/vmlinux.lds.h | 8 +- include/linux/ext2_fs.h | 13 - include/linux/init_syscalls.h | 1 - include/linux/initramfs.h | 26 ++ include/linux/initrd.h | 37 -- include/linux/root_dev.h | 1 - include/linux/syscalls.h | 1 - include/uapi/linux/sysctl.h | 1 - init/.kunitconfig | 2 +- init/Kconfig | 28 +- init/Makefile | 6 +- init/do_mounts.c | 28 +- init/do_mounts.h | 42 -- init/do_mounts_initrd.c | 154 ------- init/do_mounts_rd.c | 334 --------------- init/initramfs.c | 152 ++++--- init/main.c | 66 +-- kernel/sys.c | 7 
+- kernel/sysctl.c | 2 +- kernel/umh.c | 2 +- scripts/package/builddeb | 2 +- .../ktest/examples/bootconfigs/tracing.bconf | 3 - tools/testing/selftests/bpf/config.aarch64 | 2 +- tools/testing/selftests/bpf/config.ppc64el | 2 +- tools/testing/selftests/bpf/config.riscv64 | 2 +- tools/testing/selftests/bpf/config.s390x | 2 +- tools/testing/selftests/kho/vmtest.sh | 2 +- .../testing/selftests/nolibc/Makefile.nolibc | 4 +- tools/testing/selftests/vsock/config | 2 +- .../selftests/wireguard/qemu/kernel.config | 2 +- usr/Kconfig | 70 ++-- usr/Makefile | 2 +- usr/initramfs_data.S | 4 +- 385 files changed, 969 insertions(+), 2346 deletions(-) delete mode 100644 Documentation/admin-guide/initrd.rst delete mode 100644 Documentation/power/swsusp-dmcrypt.rst create mode 100644 include/linux/initramfs.h delete mode 100644 include/linux/initrd.h delete mode 100644 init/do_mounts_initrd.c delete mode 100644 init/do_mounts_rd.c base-commit: 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c -- 2.47.2 From safinaskar at zohomail.com Fri Sep 12 15:38:36 2025 From: safinaskar at zohomail.com (Askar Safin) Date: Fri, 12 Sep 2025 22:38:36 +0000 Subject: [PATCH 01/62] init: remove deprecated "load_ramdisk" command line parameter, which does nothing In-Reply-To: <20250912223937.3735076-1-safinaskar@zohomail.com> References: <20250912223937.3735076-1-safinaskar@zohomail.com> Message-ID: <20250912223937.3735076-2-safinaskar@zohomail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 2 -- arch/arm/configs/neponset_defconfig | 2 +- init/do_mounts.c | 7 ------- 3 files changed, 1 insertion(+), 10 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 747a55abf494..d3b05ce249ff 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3275,8 +3275,6 @@ If there are multiple matching 
configurations changing the same attribute, the last one is used. - load_ramdisk= [RAM] [Deprecated] - lockd.nlm_grace_period=P [NFS] Assign grace period. Format: diff --git a/arch/arm/configs/neponset_defconfig b/arch/arm/configs/neponset_defconfig index 2227f86100ad..16f7300239da 100644 --- a/arch/arm/configs/neponset_defconfig +++ b/arch/arm/configs/neponset_defconfig @@ -9,7 +9,7 @@ CONFIG_ASSABET_NEPONSET=y CONFIG_ZBOOT_ROM_TEXT=0x80000 CONFIG_ZBOOT_ROM_BSS=0xc1000000 CONFIG_ZBOOT_ROM=y -CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) load_ramdisk=1 prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" +CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y CONFIG_PM=y CONFIG_MODULES=y diff --git a/init/do_mounts.c b/init/do_mounts.c index 6af29da8889e..0f2f44e6250c 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -34,13 +34,6 @@ static int root_wait; dev_t ROOT_DEV; -static int __init load_ramdisk(char *str) -{ - pr_warn("ignoring the deprecated load_ramdisk= option\n"); - return 1; -} -__setup("load_ramdisk=", load_ramdisk); - static int __init readonly(char *str) { if (*str) -- 2.47.2 From safinaskar at zohomail.com Fri Sep 12 15:38:37 2025 From: safinaskar at zohomail.com (Askar Safin) Date: Fri, 12 Sep 2025 22:38:37 +0000 Subject: [PATCH 02/62] init: remove deprecated "prompt_ramdisk" command line parameter, which does nothing In-Reply-To: <20250912223937.3735076-1-safinaskar@zohomail.com> References: <20250912223937.3735076-1-safinaskar@zohomail.com> Message-ID: <20250912223937.3735076-3-safinaskar@zohomail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 2 -- arch/arm/configs/neponset_defconfig | 2 +- init/do_mounts_rd.c | 7 
------- 3 files changed, 1 insertion(+), 10 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index d3b05ce249ff..f940c1184912 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5229,8 +5229,6 @@ Param: - step/bucket size as a power of 2 for statistical time based profiling. - prompt_ramdisk= [RAM] [Deprecated] - prot_virt= [S390] enable hosting protected virtual machines isolated from the hypervisor (if hardware supports that). If enabled, the default kernel base address diff --git a/arch/arm/configs/neponset_defconfig b/arch/arm/configs/neponset_defconfig index 16f7300239da..4d720001c12e 100644 --- a/arch/arm/configs/neponset_defconfig +++ b/arch/arm/configs/neponset_defconfig @@ -9,7 +9,7 @@ CONFIG_ASSABET_NEPONSET=y CONFIG_ZBOOT_ROM_TEXT=0x80000 CONFIG_ZBOOT_ROM_BSS=0xc1000000 CONFIG_ZBOOT_ROM=y -CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" +CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) mem=32M noinitrd initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y CONFIG_PM=y CONFIG_MODULES=y diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c index ac021ae6e6fa..f7d53bc21e41 100644 --- a/init/do_mounts_rd.c +++ b/init/do_mounts_rd.c @@ -17,13 +17,6 @@ static struct file *in_file, *out_file; static loff_t in_pos, out_pos; -static int __init prompt_ramdisk(char *str) -{ - pr_warn("ignoring the deprecated prompt_ramdisk= option\n"); - return 1; -} -__setup("prompt_ramdisk=", prompt_ramdisk); - int __initdata rd_image_start; /* starting block # of image */ static int __init ramdisk_start_setup(char *str) -- 2.47.2 From safinaskar at zohomail.com Fri Sep 12 15:38:38 2025 From: safinaskar at zohomail.com 
(Askar Safin) Date: Fri, 12 Sep 2025 22:38:38 +0000 Subject: [PATCH 03/62] init: sh, sparc, x86: remove unused constants RAMDISK_PROMPT_FLAG and RAMDISK_LOAD_FLAG In-Reply-To: <20250912223937.3735076-1-safinaskar@zohomail.com> References: <20250912223937.3735076-1-safinaskar@zohomail.com> Message-ID: <20250912223937.3735076-4-safinaskar@zohomail.com> They were used for initrd before c8376994c86. c8376994c86c made them unused and forgot to remove them Fixes: c8376994c86c ("initrd: remove support for multiple floppies") Cc: # because changes uapi headers Signed-off-by: Askar Safin --- arch/sh/kernel/setup.c | 2 -- arch/sparc/kernel/setup_32.c | 2 -- arch/sparc/kernel/setup_64.c | 2 -- arch/x86/include/uapi/asm/bootparam.h | 2 -- arch/x86/kernel/setup.c | 2 -- 5 files changed, 10 deletions(-) diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index 039a51291002..d66f098e9e9f 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -71,8 +71,6 @@ EXPORT_SYMBOL(sh_mv); extern int root_mountflags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 static char __initdata command_line[COMMAND_LINE_SIZE] = { 0, }; diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index 704375c061e7..eb60be31127f 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -172,8 +172,6 @@ extern unsigned short root_flags; extern unsigned short root_dev; extern unsigned short ram_flags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 extern int root_mountflags; diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c index 63615f5c99b4..f728f1b00aca 100644 --- a/arch/sparc/kernel/setup_64.c +++ b/arch/sparc/kernel/setup_64.c @@ -145,8 +145,6 @@ extern unsigned short root_flags; extern unsigned short root_dev; extern unsigned short ram_flags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define 
RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 extern int root_mountflags; diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index dafbf581c515..f53dd3f319ba 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -6,8 +6,6 @@ /* ram_size flags */ #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 /* loadflags */ #define LOADED_HIGH (1<<0) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1b2edd07a3e1..6409e766fb17 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -223,8 +223,6 @@ extern int root_mountflags; unsigned long saved_video_mode; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 static char __initdata command_line[COMMAND_LINE_SIZE]; #ifdef CONFIG_CMDLINE_BOOL -- 2.47.2 From akpm at linux-foundation.org Fri Sep 12 16:12:40 2025 From: akpm at linux-foundation.org (Andrew Morton) Date: Fri, 12 Sep 2025 16:12:40 -0700 Subject: next-20250912: riscv: s390: mm/kasan/shadow.c 'kasan_populate_vmalloc_pte' pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] In-Reply-To: References: Message-ID: <20250912161240.0a5fac78fed5ed8ddc32450a@linux-foundation.org> On Fri, 12 Sep 2025 13:34:37 +0200 David Hildenbrand wrote: > > [-Werror=unused-value] > > 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) > > | ^ > > mm/kasan/shadow.c:322:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' > > 322 | arch_enter_lazy_mmu_mode(); > > | ^~~~~~~~~~~~~~~~~~~~~~~~ > > mm/kasan/shadow.c: In function 'kasan_depopulate_vmalloc_pte': > > include/linux/pgtable.h:247:41: error: statement with no effect > > [-Werror=unused-value] > > 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) > > | ^ > > mm/kasan/shadow.c:497:9: note: in expansion of macro 'arch_enter_lazy_mmu_mode' > > 497 | 
arch_enter_lazy_mmu_mode(); > > | ^~~~~~~~~~~~~~~~~~~~~~~~ > > cc1: all warnings being treated as errors > > > > > I'm afraid these changes landed in -mm-unstable a bit too early. > OK, I'll drop Patch series "Nesting support for lazy MMU mode", v2.

From safinaskar at gmail.com Fri Sep 12 17:37:39 2025
From: safinaskar at gmail.com (Askar Safin)
Date: Sat, 13 Sep 2025 00:37:39 +0000
Subject: [PATCH RESEND 00/62] initrd: remove classic initrd support
Message-ID: <20250913003842.41944-1-safinaskar@gmail.com>

Intro
====
This patchset removes classic initrd (initial RAM disk) support, which was deprecated in 2020. Initramfs stays, and the RAM disk itself (brd) stays, too.

init/do_mounts* and init/*initramfs* are listed in the VFS entry in MAINTAINERS, so I think this patchset should go through the VFS tree.

This patchset touches every subdirectory in arch/, so I tested it on 8 (!!!) archs in Qemu (see details below).

Warning: this patchset renames CONFIG_BLK_DEV_INITRD (!!!) to CONFIG_INITRAMFS and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_* (for example, CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP).

If you still use initrd, see below for a workaround.

Details
====
I not only removed initrd, I also removed a lot of code that became dead as a result, including a lot of code in arch/. Still, I think the only two architectures I touched in a non-trivial way are sh and 32-bit arm.

Also I renamed some files, functions and variables (which became misnomers) to proper names, moved some code around, and removed a lot of mentions of initrd in code and comments. Also I cleaned up some docs.

For example, I renamed the following global variables:

__initramfs_start
__initramfs_size
phys_initrd_start
phys_initrd_size
initrd_start
initrd_end

to:

__builtin_initramfs_start
__builtin_initramfs_size
phys_external_initramfs_start
phys_external_initramfs_size
virt_external_initramfs_start
virt_external_initramfs_end

The new names precisely capture the meaning of these variables.
Also I renamed CONFIG_BLK_DEV_INITRD (which became a total misnomer) to CONFIG_INITRAMFS, and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_*. This will break all configs out there (update your configs!). Still, I think this is okay, because config names were never part of a stable API. I don't have a strong opinion here, so I can drop these renamings if needed.

Other user-visible changes:

- Removed kernel command line parameters "load_ramdisk" and "prompt_ramdisk", which did nothing and were deprecated
- Removed kernel command line parameter "ramdisk_start", which was used for initrd only (not for initramfs)
- Removed kernel command line parameter "noinitrd", which was inconsistent: it controlled initrd only (not initramfs), except for EFI boot, where it controlled both initramfs and initrd. EFI users can still disable initramfs simply by not passing it
- Removed kernel command line parameter "ramdisk_size", which was used for controlling ramdisk (brd), but only in non-modular mode. Use brd.rd_size instead, it always works
- Removed /proc/sys/kernel/real-root-dev. It was used for initrd only

This patchset is based on v6.17-rc5.

Testing
====
I tested my patchset on many architectures in Qemu using my Rust program, heavily based on mkroot [1].

I used the following cross-compilers:

aarch64-linux-musleabi
armv4l-linux-musleabihf
armv5l-linux-musleabihf
armv7l-linux-musleabihf
i486-linux-musl
i686-linux-musl
mips-linux-musl
mips64-linux-musl
mipsel-linux-musl
powerpc-linux-musl
powerpc64-linux-musl
powerpc64le-linux-musl
riscv32-linux-musl
riscv64-linux-musl
s390x-linux-musl
sh4-linux-musl
sh4eb-linux-musl
x86_64-linux-musl

taken from this directory [2].

So, as you can see, there are 18 triplets, which correspond to 8 subdirs in arch/. And note that this list contains two archs (arm and sh) touched in a non-trivial way.
For every triplet I tested that:

- Initramfs still works (both builtin and external)
- Direct boot from disk still works

Workaround
====
If "retain_initrd" is passed to the kernel, then the initramfs/initrd passed by the bootloader is retained and becomes available after boot as the read-only magic file /sys/firmware/initrd [3].

No copies are involved. I.e. /sys/firmware/initrd is simply a reference to the original blob passed by the bootloader.

This works even if the initrd/initramfs is not recognized by the kernel in any way, i.e. even if it is neither a valid cpio archive nor a fs image supported by classic initrd.

This works both with my patchset and without it.

This means that you can emulate classic initrd as follows: link a builtin initramfs into the kernel. In /init in this initramfs, copy /sys/firmware/initrd to some file in / and loop-mount it.

This is even better than classic initrd, because:

- You can use a fs not supported by classic initrd, for example erofs
- One copy is involved (from /sys/firmware/initrd to some file in /) as opposed to two when using classic initrd

Still, I don't recommend using this workaround, because I want everyone to migrate to proper modern initramfs. But you can still use this workaround if you want.

Also: it is not possible to directly loop-mount /sys/firmware/initrd. Theoretically the kernel can be changed to allow this (and/or to make it writable), but I think nobody needs this. And I don't want to implement this.

P.S. When I sent this patchset the first time, zoho mail banned me for too much email. So I resend this using gmail.
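
A note on the workaround described above: the copy-and-loop-mount step might look roughly like the sketch below when written as helper functions for /init in a builtin initramfs. This is only an illustration under stated assumptions, not something taken from the patchset: the paths /initrd.img and /mnt/initrd and both function names are hypothetical, the kernel is assumed to have been booted with "retain_initrd", the blob is assumed to be a filesystem image, and cp, mkdir and mount are assumed to come from busybox or a similar toolset with loop device support.

```shell
#!/bin/sh
# Hypothetical helpers for /init in a builtin initramfs.
# Assumptions: booted with "retain_initrd"; the bootloader passed a
# filesystem image; loop devices are available; the mount tool can
# detect the filesystem type (pass -t <fstype> otherwise).

# Copy the retained blob to a regular file. The sysfs file itself
# cannot be loop-mounted directly, so one copy is needed.
copy_retained_blob() {
    src=$1
    dst=$2
    if [ ! -r "$src" ]; then
        echo "no readable blob at $src (was retain_initrd passed?)" >&2
        return 1
    fi
    cp "$src" "$dst"
}

# Mount sysfs, copy the blob into the rootfs, then loop-mount the copy.
emulate_classic_initrd() {
    img=/initrd.img      # hypothetical destination file in /
    mnt=/mnt/initrd      # hypothetical mount point
    mkdir -p /sys "$mnt" || return 1
    mount -t sysfs sysfs /sys || return 1
    copy_retained_blob /sys/firmware/initrd "$img" || return 1
    mount -o loop "$img" "$mnt"
}
```

An /init script would call emulate_classic_initrd once and then continue with whatever it wants to do with the mounted image; what happens after the mount is outside the scope of this sketch.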
The only change is the email address; there are no other changes

[1] https://github.com/landley/toybox/tree/master/mkroot
[2] https://landley.net/toybox/downloads/binaries/toolchains/latest
[3] https://lore.kernel.org/all/20231207235654.16622-1-graf@amazon.com/

Askar Safin (62):
  init: remove deprecated "load_ramdisk" command line parameter, which does nothing
  init: remove deprecated "prompt_ramdisk" command line parameter, which does nothing
  init: sh, sparc, x86: remove unused constants RAMDISK_PROMPT_FLAG and RAMDISK_LOAD_FLAG
  init: x86, arm, sh, sparc: remove variable rd_image_start, which controls starting block number of initrd
  init: remove "ramdisk_start" command line parameter, which controls starting block number of initrd
  arm: init: remove special logic for setting brd.rd_size
  arm: init: remove ATAG_RAMDISK
  arm: init: remove FLAG_RDLOAD and FLAG_RDPROMPT
  arm: init: document rd_start (in param_struct) as obsolete
  initrd: remove initrd (initial RAM disk) support
  init, efi: remove "noinitrd" command line parameter
  init: remove /proc/sys/kernel/real-root-dev
  ext2: remove ext2_image_size and associated code
  init: m68k, mips, powerpc, s390, sh: remove Root_RAM0
  doc: modernize Documentation/admin-guide/blockdev/ramdisk.rst
  brd: remove "ramdisk_size" command line parameter
  doc: modernize Documentation/filesystems/ramfs-rootfs-initramfs.rst
  doc: modernize Documentation/driver-api/early-userspace/early_userspace_support.rst
  init: remove mentions of "ramdisk=" command line parameter
  doc: remove Documentation/power/swsusp-dmcrypt.rst
  init: remove all mentions of root=/dev/ram*
  doc: remove obsolete mentions of pivot_root
  init: rename __initramfs_{start,size} to __builtin_initramfs_{start,size}
  init: remove wrong comment
  init: rename phys_initrd_{start,size} to phys_external_initramfs_{start,size}
  init: move phys_external_initramfs_{start,size} to init/initramfs.c
  init: alpha: remove "extern unsigned long initrd_start, initrd_end"
  init: alpha, arc, arm, arm64, csky, m68k,
    microblaze, mips, nios2, openrisc, parisc, powerpc, s390, sh, sparc, um, x86, xtensa: rename initrd_{start,end} to virt_external_initramfs_{start,end}
  init: move virt_external_initramfs_{start,end} to init/initramfs.c
  doc: remove documentation for block device 4 0
  init: rename initrd_below_start_ok to initramfs_below_start_ok
  init: move initramfs_below_start_ok to init/initramfs.c
  init: remove init/do_mounts_initrd.c
  init: inline create_dev into the only caller
  init: make mount_root_generic static
  init: make mount_root static
  init: remove root_mountflags from init/do_mounts.h
  init: remove most headers from init/do_mounts.h
  init: make console_on_rootfs static
  init: rename free_initrd_mem to free_initramfs_mem
  init: rename reserve_initrd_mem to reserve_initramfs_mem
  init: rename to
  setsid: inline ksys_setsid into the only caller
  doc: kernel-parameters: remove [RAM] from reserve_mem=
  doc: kernel-parameters: replace [RAM] with [INITRAMFS]
  init: edit docs for initramfs-related configs
  init: fix typo: virtul => virtual
  init: fix comment
  init: rename ramdisk_execute_command to initramfs_execute_command
  init: rename ramdisk_command_access to initramfs_command_access
  init: rename get_boot_config_from_initrd to get_boot_config_from_initramfs
  init: rename do_retain_initrd to retain_initramfs
  init: rename kexec_free_initrd to kexec_free_initramfs
  init: arm, x86: deal with some references to initrd
  init: rename CONFIG_BLK_DEV_INITRD to CONFIG_INITRAMFS
  init: rename CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP
  init: rename CONFIG_RD_BZIP2 to CONFIG_INITRAMFS_DECOMPRESS_BZIP2
  init: rename CONFIG_RD_LZMA to CONFIG_INITRAMFS_DECOMPRESS_LZMA
  init: rename CONFIG_RD_XZ to CONFIG_INITRAMFS_DECOMPRESS_XZ
  init: rename CONFIG_RD_LZO to CONFIG_INITRAMFS_DECOMPRESS_LZO
  init: rename CONFIG_RD_LZ4 to CONFIG_INITRAMFS_DECOMPRESS_LZ4
  init: rename CONFIG_RD_ZSTD to CONFIG_INITRAMFS_DECOMPRESS_ZSTD

 .../admin-guide/blockdev/ramdisk.rst | 104 +----
 .../admin-guide/device-mapper/dm-init.rst |
4 +- Documentation/admin-guide/devices.txt | 12 - Documentation/admin-guide/index.rst | 1 - Documentation/admin-guide/initrd.rst | 383 ------------------ .../admin-guide/kernel-parameters.rst | 4 +- .../admin-guide/kernel-parameters.txt | 38 +- Documentation/admin-guide/nfs/nfsroot.rst | 4 +- Documentation/admin-guide/sysctl/kernel.rst | 6 - Documentation/arch/arm/ixp4xx.rst | 4 +- Documentation/arch/arm/setup.rst | 6 +- Documentation/arch/m68k/kernel-options.rst | 29 +- Documentation/arch/x86/boot.rst | 4 +- .../early_userspace_support.rst | 18 +- .../filesystems/ramfs-rootfs-initramfs.rst | 20 +- Documentation/power/index.rst | 1 - Documentation/power/swsusp-dmcrypt.rst | 140 ------- Documentation/security/ipe.rst | 2 +- .../translations/zh_CN/power/index.rst | 1 - arch/alpha/kernel/core_irongate.c | 12 +- arch/alpha/kernel/proto.h | 2 +- arch/alpha/kernel/setup.c | 32 +- arch/arc/configs/axs101_defconfig | 2 +- arch/arc/configs/axs103_defconfig | 2 +- arch/arc/configs/axs103_smp_defconfig | 2 +- arch/arc/configs/haps_hs_defconfig | 2 +- arch/arc/configs/haps_hs_smp_defconfig | 2 +- arch/arc/configs/hsdk_defconfig | 2 +- arch/arc/configs/nsim_700_defconfig | 2 +- arch/arc/configs/nsimosci_defconfig | 2 +- arch/arc/configs/nsimosci_hs_defconfig | 2 +- arch/arc/configs/nsimosci_hs_smp_defconfig | 2 +- arch/arc/configs/tb10x_defconfig | 4 +- arch/arc/configs/vdk_hs38_defconfig | 2 +- arch/arc/configs/vdk_hs38_smp_defconfig | 2 +- arch/arc/mm/init.c | 14 +- arch/arm/Kconfig | 2 +- arch/arm/boot/dts/arm/integratorap.dts | 2 +- arch/arm/boot/dts/arm/integratorcp.dts | 2 +- .../dts/aspeed/aspeed-bmc-facebook-cmm.dts | 2 +- .../aspeed/aspeed-bmc-facebook-galaxy100.dts | 2 +- .../aspeed/aspeed-bmc-facebook-minipack.dts | 2 +- .../aspeed/aspeed-bmc-facebook-wedge100.dts | 2 +- .../aspeed/aspeed-bmc-facebook-wedge40.dts | 2 +- .../dts/aspeed/aspeed-bmc-facebook-yamp.dts | 2 +- .../ast2600-facebook-netbmc-common.dtsi | 2 +- arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts | 2 
+- .../ixp/intel-ixp42x-welltech-epbx100.dts | 2 +- arch/arm/boot/dts/nspire/nspire-classic.dtsi | 2 +- arch/arm/boot/dts/nspire/nspire-cx.dts | 2 +- .../boot/dts/samsung/exynos4210-origen.dts | 2 +- .../boot/dts/samsung/exynos4210-smdkv310.dts | 2 +- .../boot/dts/samsung/exynos4412-smdk4412.dts | 2 +- .../boot/dts/samsung/exynos5250-smdk5250.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-nhk15.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-s8815.dts | 2 +- arch/arm/boot/dts/st/stm32429i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32746g-eval.dts | 2 +- arch/arm/boot/dts/st/stm32f429-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f469-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f746-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f769-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32h747i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h750i-art-pi.dts | 2 +- arch/arm/configs/aspeed_g4_defconfig | 8 +- arch/arm/configs/aspeed_g5_defconfig | 8 +- arch/arm/configs/assabet_defconfig | 4 +- arch/arm/configs/at91_dt_defconfig | 4 +- arch/arm/configs/axm55xx_defconfig | 2 +- arch/arm/configs/bcm2835_defconfig | 2 +- arch/arm/configs/clps711x_defconfig | 4 +- arch/arm/configs/collie_defconfig | 4 +- arch/arm/configs/davinci_all_defconfig | 2 +- arch/arm/configs/exynos_defconfig | 4 +- arch/arm/configs/footbridge_defconfig | 2 +- arch/arm/configs/gemini_defconfig | 2 +- arch/arm/configs/h3600_defconfig | 2 +- arch/arm/configs/hisi_defconfig | 4 +- arch/arm/configs/imx_v4_v5_defconfig | 2 +- arch/arm/configs/imx_v6_v7_defconfig | 4 +- arch/arm/configs/integrator_defconfig | 2 +- arch/arm/configs/ixp4xx_defconfig | 2 +- arch/arm/configs/keystone_defconfig | 2 +- arch/arm/configs/lpc18xx_defconfig | 12 +- arch/arm/configs/lpc32xx_defconfig | 4 +- arch/arm/configs/milbeaut_m10v_defconfig | 2 +- arch/arm/configs/multi_v4t_defconfig | 2 +- arch/arm/configs/multi_v5_defconfig | 2 +- 
arch/arm/configs/multi_v7_defconfig | 2 +- arch/arm/configs/mvebu_v7_defconfig | 2 +- arch/arm/configs/mxs_defconfig | 2 +- arch/arm/configs/neponset_defconfig | 4 +- arch/arm/configs/nhk8815_defconfig | 2 +- arch/arm/configs/omap1_defconfig | 2 +- arch/arm/configs/omap2plus_defconfig | 2 +- arch/arm/configs/pxa910_defconfig | 2 +- arch/arm/configs/pxa_defconfig | 4 +- arch/arm/configs/qcom_defconfig | 2 +- arch/arm/configs/rpc_defconfig | 2 +- arch/arm/configs/s3c6400_defconfig | 4 +- arch/arm/configs/s5pv210_defconfig | 4 +- arch/arm/configs/sama5_defconfig | 4 +- arch/arm/configs/sama7_defconfig | 2 +- arch/arm/configs/shmobile_defconfig | 2 +- arch/arm/configs/socfpga_defconfig | 2 +- arch/arm/configs/sp7021_defconfig | 12 +- arch/arm/configs/spear13xx_defconfig | 2 +- arch/arm/configs/spear3xx_defconfig | 2 +- arch/arm/configs/spear6xx_defconfig | 2 +- arch/arm/configs/spitz_defconfig | 2 +- arch/arm/configs/stm32_defconfig | 2 +- arch/arm/configs/sunxi_defconfig | 2 +- arch/arm/configs/tegra_defconfig | 2 +- arch/arm/configs/u8500_defconfig | 4 +- arch/arm/configs/versatile_defconfig | 2 +- arch/arm/configs/vexpress_defconfig | 2 +- arch/arm/configs/vf610m4_defconfig | 10 +- arch/arm/configs/vt8500_v6_v7_defconfig | 2 +- arch/arm/configs/wpcm450_defconfig | 2 +- arch/arm/include/uapi/asm/setup.h | 10 - arch/arm/kernel/atags_compat.c | 10 - arch/arm/kernel/atags_parse.c | 16 +- arch/arm/kernel/setup.c | 2 +- arch/arm/mm/init.c | 24 +- arch/arm64/configs/defconfig | 2 +- arch/arm64/kernel/setup.c | 2 +- arch/arm64/mm/init.c | 17 +- arch/csky/kernel/setup.c | 24 +- arch/csky/mm/init.c | 2 +- arch/hexagon/configs/comet_defconfig | 2 +- arch/loongarch/configs/loongson3_defconfig | 2 +- arch/loongarch/kernel/mem.c | 2 +- arch/loongarch/kernel/setup.c | 4 +- arch/m68k/configs/amiga_defconfig | 2 +- arch/m68k/configs/apollo_defconfig | 2 +- arch/m68k/configs/atari_defconfig | 2 +- arch/m68k/configs/bvme6000_defconfig | 2 +- arch/m68k/configs/hp300_defconfig | 2 +- 
arch/m68k/configs/mac_defconfig | 2 +- arch/m68k/configs/multi_defconfig | 2 +- arch/m68k/configs/mvme147_defconfig | 2 +- arch/m68k/configs/mvme16x_defconfig | 2 +- arch/m68k/configs/q40_defconfig | 2 +- arch/m68k/configs/stmark2_defconfig | 2 +- arch/m68k/configs/sun3_defconfig | 2 +- arch/m68k/configs/sun3x_defconfig | 2 +- arch/m68k/kernel/setup_mm.c | 12 +- arch/m68k/kernel/setup_no.c | 12 +- arch/m68k/kernel/uboot.c | 17 +- arch/microblaze/kernel/cpu/mb.c | 2 +- arch/microblaze/kernel/setup.c | 2 +- arch/microblaze/mm/init.c | 12 +- arch/mips/ath79/prom.c | 12 +- arch/mips/configs/ath25_defconfig | 12 +- arch/mips/configs/ath79_defconfig | 4 +- arch/mips/configs/bcm47xx_defconfig | 2 +- arch/mips/configs/bigsur_defconfig | 2 +- arch/mips/configs/bmips_be_defconfig | 2 +- arch/mips/configs/bmips_stb_defconfig | 14 +- arch/mips/configs/cavium_octeon_defconfig | 2 +- arch/mips/configs/eyeq5_defconfig | 2 +- arch/mips/configs/eyeq6_defconfig | 2 +- arch/mips/configs/generic_defconfig | 2 +- arch/mips/configs/gpr_defconfig | 2 +- arch/mips/configs/lemote2f_defconfig | 2 +- arch/mips/configs/loongson2k_defconfig | 2 +- arch/mips/configs/loongson3_defconfig | 2 +- arch/mips/configs/malta_defconfig | 2 +- arch/mips/configs/mtx1_defconfig | 2 +- arch/mips/configs/rb532_defconfig | 2 +- arch/mips/configs/rbtx49xx_defconfig | 2 +- arch/mips/configs/rt305x_defconfig | 4 +- arch/mips/configs/sb1250_swarm_defconfig | 2 +- arch/mips/configs/xway_defconfig | 4 +- arch/mips/kernel/setup.c | 53 ++- arch/mips/mm/init.c | 2 +- arch/mips/sibyte/common/cfe.c | 36 +- arch/mips/sibyte/swarm/setup.c | 2 +- arch/nios2/kernel/setup.c | 20 +- arch/openrisc/configs/or1klitex_defconfig | 2 +- arch/openrisc/configs/or1ksim_defconfig | 4 +- arch/openrisc/configs/simple_smp_defconfig | 14 +- arch/openrisc/configs/virt_defconfig | 2 +- arch/openrisc/kernel/setup.c | 24 +- arch/openrisc/kernel/vmlinux.h | 2 +- arch/parisc/boot/compressed/misc.c | 2 +- 
arch/parisc/configs/generic-32bit_defconfig | 2 +- arch/parisc/configs/generic-64bit_defconfig | 2 +- arch/parisc/defpalo.conf | 2 +- arch/parisc/kernel/pdt.c | 6 +- arch/parisc/kernel/setup.c | 8 +- arch/parisc/mm/init.c | 32 +- arch/powerpc/configs/44x/akebono_defconfig | 2 +- arch/powerpc/configs/44x/arches_defconfig | 2 +- arch/powerpc/configs/44x/bamboo_defconfig | 2 +- arch/powerpc/configs/44x/bluestone_defconfig | 2 +- .../powerpc/configs/44x/canyonlands_defconfig | 2 +- arch/powerpc/configs/44x/ebony_defconfig | 2 +- arch/powerpc/configs/44x/eiger_defconfig | 2 +- arch/powerpc/configs/44x/fsp2_defconfig | 10 +- arch/powerpc/configs/44x/icon_defconfig | 2 +- arch/powerpc/configs/44x/iss476-smp_defconfig | 2 +- arch/powerpc/configs/44x/katmai_defconfig | 2 +- arch/powerpc/configs/44x/rainier_defconfig | 2 +- arch/powerpc/configs/44x/redwood_defconfig | 2 +- arch/powerpc/configs/44x/sam440ep_defconfig | 2 +- arch/powerpc/configs/44x/sequoia_defconfig | 2 +- arch/powerpc/configs/44x/taishan_defconfig | 2 +- arch/powerpc/configs/44x/warp_defconfig | 2 +- arch/powerpc/configs/52xx/cm5200_defconfig | 2 +- arch/powerpc/configs/52xx/lite5200b_defconfig | 2 +- arch/powerpc/configs/52xx/motionpro_defconfig | 2 +- arch/powerpc/configs/52xx/tqm5200_defconfig | 2 +- arch/powerpc/configs/83xx/asp8347_defconfig | 2 +- .../configs/83xx/mpc8313_rdb_defconfig | 2 +- .../configs/83xx/mpc8315_rdb_defconfig | 2 +- .../configs/83xx/mpc832x_rdb_defconfig | 2 +- .../configs/83xx/mpc834x_itx_defconfig | 2 +- .../configs/83xx/mpc834x_itxgp_defconfig | 2 +- .../configs/83xx/mpc836x_rdk_defconfig | 2 +- .../configs/83xx/mpc837x_rdb_defconfig | 2 +- arch/powerpc/configs/85xx/ge_imp3a_defconfig | 2 +- arch/powerpc/configs/85xx/ksi8560_defconfig | 2 +- arch/powerpc/configs/85xx/socrates_defconfig | 2 +- arch/powerpc/configs/85xx/stx_gp3_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8540_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8541_defconfig | 2 +- 
arch/powerpc/configs/85xx/tqm8548_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8555_defconfig | 2 +- arch/powerpc/configs/85xx/tqm8560_defconfig | 2 +- .../configs/85xx/xes_mpc85xx_defconfig | 2 +- arch/powerpc/configs/amigaone_defconfig | 2 +- arch/powerpc/configs/cell_defconfig | 2 +- arch/powerpc/configs/chrp32_defconfig | 2 +- arch/powerpc/configs/fsl-emb-nonhw.config | 2 +- arch/powerpc/configs/g5_defconfig | 2 +- arch/powerpc/configs/gamecube_defconfig | 2 +- arch/powerpc/configs/holly_defconfig | 2 +- arch/powerpc/configs/linkstation_defconfig | 2 +- arch/powerpc/configs/mgcoge_defconfig | 4 +- arch/powerpc/configs/microwatt_defconfig | 2 +- arch/powerpc/configs/mpc512x_defconfig | 2 +- arch/powerpc/configs/mpc5200_defconfig | 2 +- arch/powerpc/configs/mpc83xx_defconfig | 2 +- arch/powerpc/configs/pasemi_defconfig | 2 +- arch/powerpc/configs/pmac32_defconfig | 2 +- arch/powerpc/configs/powernv_defconfig | 2 +- arch/powerpc/configs/ppc44x_defconfig | 2 +- arch/powerpc/configs/ppc64_defconfig | 2 +- arch/powerpc/configs/ppc64e_defconfig | 2 +- arch/powerpc/configs/ppc6xx_defconfig | 2 +- arch/powerpc/configs/ps3_defconfig | 2 +- arch/powerpc/configs/skiroot_defconfig | 12 +- arch/powerpc/configs/wii_defconfig | 2 +- arch/powerpc/kernel/prom.c | 22 +- arch/powerpc/kernel/prom_init.c | 6 +- arch/powerpc/kernel/setup-common.c | 25 +- arch/powerpc/kernel/setup_32.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- arch/powerpc/mm/init_32.c | 2 +- arch/powerpc/platforms/52xx/lite5200.c | 2 +- arch/powerpc/platforms/83xx/km83xx.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 2 +- arch/powerpc/platforms/chrp/setup.c | 2 +- .../platforms/embedded6xx/linkstation.c | 2 +- .../platforms/embedded6xx/storcenter.c | 2 +- arch/powerpc/platforms/powermac/setup.c | 8 +- arch/riscv/configs/defconfig | 2 +- arch/riscv/configs/nommu_k210_defconfig | 16 +- arch/riscv/configs/nommu_virt_defconfig | 12 +- arch/riscv/mm/init.c | 4 +- arch/s390/boot/ipl_parm.c | 2 +- 
arch/s390/boot/startup.c | 4 +- arch/s390/configs/zfcpdump_defconfig | 2 +- arch/s390/kernel/setup.c | 10 +- arch/s390/mm/init.c | 2 +- arch/sh/configs/apsh4a3a_defconfig | 2 +- arch/sh/configs/apsh4ad0a_defconfig | 2 +- arch/sh/configs/ecovec24-romimage_defconfig | 2 +- arch/sh/configs/edosk7760_defconfig | 2 +- arch/sh/configs/kfr2r09-romimage_defconfig | 2 +- arch/sh/configs/kfr2r09_defconfig | 2 +- arch/sh/configs/magicpanelr2_defconfig | 2 +- arch/sh/configs/migor_defconfig | 2 +- arch/sh/configs/rsk7201_defconfig | 2 +- arch/sh/configs/rsk7203_defconfig | 2 +- arch/sh/configs/sdk7786_defconfig | 8 +- arch/sh/configs/se7206_defconfig | 2 +- arch/sh/configs/se7705_defconfig | 2 +- arch/sh/configs/se7722_defconfig | 2 +- arch/sh/configs/se7751_defconfig | 2 +- arch/sh/configs/secureedge5410_defconfig | 2 +- arch/sh/configs/sh03_defconfig | 2 +- arch/sh/configs/sh7757lcr_defconfig | 2 +- arch/sh/configs/titan_defconfig | 2 +- arch/sh/configs/ul2_defconfig | 2 +- arch/sh/configs/urquell_defconfig | 2 +- arch/sh/include/asm/setup.h | 1 - arch/sh/kernel/head_32.S | 2 +- arch/sh/kernel/setup.c | 27 +- arch/sparc/boot/piggyback.c | 4 +- arch/sparc/configs/sparc32_defconfig | 2 +- arch/sparc/configs/sparc64_defconfig | 2 +- arch/sparc/kernel/head_32.S | 4 +- arch/sparc/kernel/head_64.S | 6 +- arch/sparc/kernel/setup_32.c | 9 +- arch/sparc/kernel/setup_64.c | 9 +- arch/sparc/mm/init_32.c | 22 +- arch/sparc/mm/init_64.c | 20 +- arch/um/kernel/Makefile | 2 +- arch/um/kernel/initrd.c | 6 +- arch/x86/Kconfig | 2 +- arch/x86/boot/header.S | 2 +- arch/x86/boot/startup/sme.c | 2 +- arch/x86/configs/i386_defconfig | 2 +- arch/x86/configs/x86_64_defconfig | 2 +- arch/x86/include/uapi/asm/bootparam.h | 7 +- arch/x86/kernel/cpu/microcode/amd.c | 2 +- arch/x86/kernel/cpu/microcode/core.c | 12 +- arch/x86/kernel/cpu/microcode/intel.c | 2 +- arch/x86/kernel/cpu/microcode/internal.h | 2 +- arch/x86/kernel/devicetree.c | 2 +- arch/x86/kernel/setup.c | 39 +- arch/x86/mm/init.c | 8 +- 
arch/x86/mm/init_32.c | 2 +- arch/x86/mm/init_64.c | 2 +- arch/x86/tools/relocs.c | 2 +- arch/xtensa/Kconfig | 2 +- arch/xtensa/boot/dts/csp.dts | 2 +- arch/xtensa/configs/audio_kc705_defconfig | 2 +- arch/xtensa/configs/cadence_csp_defconfig | 12 +- arch/xtensa/configs/generic_kc705_defconfig | 2 +- arch/xtensa/configs/nommu_kc705_defconfig | 12 +- arch/xtensa/configs/smp_lx200_defconfig | 2 +- arch/xtensa/configs/virt_defconfig | 2 +- arch/xtensa/configs/xip_kc705_defconfig | 2 +- arch/xtensa/kernel/setup.c | 26 +- drivers/acpi/Kconfig | 2 +- drivers/acpi/tables.c | 10 +- drivers/base/firmware_loader/main.c | 2 +- drivers/block/Kconfig | 8 +- drivers/block/brd.c | 20 +- drivers/firmware/efi/efi.c | 10 +- .../firmware/efi/libstub/efi-stub-helper.c | 5 +- drivers/gpu/drm/ci/arm.config | 2 +- drivers/gpu/drm/ci/arm64.config | 2 +- drivers/gpu/drm/ci/x86_64.config | 2 +- drivers/of/fdt.c | 18 +- fs/ext2/ext2.h | 9 - fs/init.c | 14 - include/asm-generic/vmlinux.lds.h | 8 +- include/linux/ext2_fs.h | 13 - include/linux/init_syscalls.h | 1 - include/linux/initramfs.h | 26 ++ include/linux/initrd.h | 37 -- include/linux/root_dev.h | 1 - include/linux/syscalls.h | 1 - include/uapi/linux/sysctl.h | 1 - init/.kunitconfig | 2 +- init/Kconfig | 28 +- init/Makefile | 6 +- init/do_mounts.c | 28 +- init/do_mounts.h | 42 -- init/do_mounts_initrd.c | 154 ------- init/do_mounts_rd.c | 334 --------------- init/initramfs.c | 152 ++++--- init/main.c | 66 +-- kernel/sys.c | 7 +- kernel/sysctl.c | 2 +- kernel/umh.c | 2 +- scripts/package/builddeb | 2 +- .../ktest/examples/bootconfigs/tracing.bconf | 3 - tools/testing/selftests/bpf/config.aarch64 | 2 +- tools/testing/selftests/bpf/config.ppc64el | 2 +- tools/testing/selftests/bpf/config.riscv64 | 2 +- tools/testing/selftests/bpf/config.s390x | 2 +- tools/testing/selftests/kho/vmtest.sh | 2 +- .../testing/selftests/nolibc/Makefile.nolibc | 4 +- tools/testing/selftests/vsock/config | 2 +- .../selftests/wireguard/qemu/kernel.config | 2 +- 
usr/Kconfig | 70 ++-- usr/Makefile | 2 +- usr/initramfs_data.S | 4 +- 385 files changed, 969 insertions(+), 2346 deletions(-) delete mode 100644 Documentation/admin-guide/initrd.rst delete mode 100644 Documentation/power/swsusp-dmcrypt.rst create mode 100644 include/linux/initramfs.h delete mode 100644 include/linux/initrd.h delete mode 100644 init/do_mounts_initrd.c delete mode 100644 init/do_mounts_rd.c base-commit: 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:40 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:40 +0000 Subject: [PATCH RESEND 01/62] init: remove deprecated "load_ramdisk" command line parameter, which does nothing In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-2-safinaskar@gmail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 2 -- arch/arm/configs/neponset_defconfig | 2 +- init/do_mounts.c | 7 ------- 3 files changed, 1 insertion(+), 10 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 747a55abf494..d3b05ce249ff 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3275,8 +3275,6 @@ If there are multiple matching configurations changing the same attribute, the last one is used. - load_ramdisk= [RAM] [Deprecated] - lockd.nlm_grace_period=P [NFS] Assign grace period. 
Format: diff --git a/arch/arm/configs/neponset_defconfig b/arch/arm/configs/neponset_defconfig index 2227f86100ad..16f7300239da 100644 --- a/arch/arm/configs/neponset_defconfig +++ b/arch/arm/configs/neponset_defconfig @@ -9,7 +9,7 @@ CONFIG_ASSABET_NEPONSET=y CONFIG_ZBOOT_ROM_TEXT=0x80000 CONFIG_ZBOOT_ROM_BSS=0xc1000000 CONFIG_ZBOOT_ROM=y -CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) load_ramdisk=1 prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" +CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y CONFIG_PM=y CONFIG_MODULES=y diff --git a/init/do_mounts.c b/init/do_mounts.c index 6af29da8889e..0f2f44e6250c 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -34,13 +34,6 @@ static int root_wait; dev_t ROOT_DEV; -static int __init load_ramdisk(char *str) -{ - pr_warn("ignoring the deprecated load_ramdisk= option\n"); - return 1; -} -__setup("load_ramdisk=", load_ramdisk); - static int __init readonly(char *str) { if (*str) -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:41 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:41 +0000 Subject: [PATCH RESEND 02/62] init: remove deprecated "prompt_ramdisk" command line parameter, which does nothing In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-3-safinaskar@gmail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 2 -- arch/arm/configs/neponset_defconfig | 2 +- init/do_mounts_rd.c | 7 ------- 3 files changed, 1 insertion(+), 10 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt 
index d3b05ce249ff..f940c1184912 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5229,8 +5229,6 @@ Param: - step/bucket size as a power of 2 for statistical time based profiling. - prompt_ramdisk= [RAM] [Deprecated] - prot_virt= [S390] enable hosting protected virtual machines isolated from the hypervisor (if hardware supports that). If enabled, the default kernel base address diff --git a/arch/arm/configs/neponset_defconfig b/arch/arm/configs/neponset_defconfig index 16f7300239da..4d720001c12e 100644 --- a/arch/arm/configs/neponset_defconfig +++ b/arch/arm/configs/neponset_defconfig @@ -9,7 +9,7 @@ CONFIG_ASSABET_NEPONSET=y CONFIG_ZBOOT_ROM_TEXT=0x80000 CONFIG_ZBOOT_ROM_BSS=0xc1000000 CONFIG_ZBOOT_ROM=y -CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) prompt_ramdisk=0 mem=32M noinitrd initrd=0xc0800000,3M" +CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) mem=32M noinitrd initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y CONFIG_PM=y CONFIG_MODULES=y diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c index ac021ae6e6fa..f7d53bc21e41 100644 --- a/init/do_mounts_rd.c +++ b/init/do_mounts_rd.c @@ -17,13 +17,6 @@ static struct file *in_file, *out_file; static loff_t in_pos, out_pos; -static int __init prompt_ramdisk(char *str) -{ - pr_warn("ignoring the deprecated prompt_ramdisk= option\n"); - return 1; -} -__setup("prompt_ramdisk=", prompt_ramdisk); - int __initdata rd_image_start; /* starting block # of image */ static int __init ramdisk_start_setup(char *str) -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:42 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:42 +0000 Subject: [PATCH RESEND 03/62] init: sh, sparc, x86: remove unused constants RAMDISK_PROMPT_FLAG and RAMDISK_LOAD_FLAG 
In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-4-safinaskar@gmail.com> They were used for initrd before c8376994c86c, which made them unused but did not remove them. Fixes: c8376994c86c ("initrd: remove support for multiple floppies") Cc: # because changes uapi headers Signed-off-by: Askar Safin --- arch/sh/kernel/setup.c | 2 -- arch/sparc/kernel/setup_32.c | 2 -- arch/sparc/kernel/setup_64.c | 2 -- arch/x86/include/uapi/asm/bootparam.h | 2 -- arch/x86/kernel/setup.c | 2 -- 5 files changed, 10 deletions(-) diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index 039a51291002..d66f098e9e9f 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -71,8 +71,6 @@ EXPORT_SYMBOL(sh_mv); extern int root_mountflags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 static char __initdata command_line[COMMAND_LINE_SIZE] = { 0, }; diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index 704375c061e7..eb60be31127f 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -172,8 +172,6 @@ extern unsigned short root_flags; extern unsigned short root_dev; extern unsigned short ram_flags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 extern int root_mountflags; diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c index 63615f5c99b4..f728f1b00aca 100644 --- a/arch/sparc/kernel/setup_64.c +++ b/arch/sparc/kernel/setup_64.c @@ -145,8 +145,6 @@ extern unsigned short root_flags; extern unsigned short root_dev; extern unsigned short ram_flags; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 extern int root_mountflags; diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index dafbf581c515..f53dd3f319ba 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -6,8 +6,6 @@ /* ram_size flags */ #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 /* loadflags */ #define LOADED_HIGH (1<<0) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1b2edd07a3e1..6409e766fb17 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -223,8 +223,6 @@ extern int root_mountflags; unsigned long saved_video_mode; #define RAMDISK_IMAGE_START_MASK 0x07FF -#define RAMDISK_PROMPT_FLAG 0x8000 -#define RAMDISK_LOAD_FLAG 0x4000 static char __initdata command_line[COMMAND_LINE_SIZE]; #ifdef CONFIG_CMDLINE_BOOL -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:43 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:43 +0000 Subject: [PATCH RESEND 04/62] init: x86, arm, sh, sparc: remove variable rd_image_start, which controls starting block number of initrd In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-5-safinaskar@gmail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/arch/x86/boot.rst | 4 ++-- arch/arm/kernel/atags_parse.c | 2 -- arch/sh/include/asm/setup.h | 1 - arch/sh/kernel/head_32.S | 2 +- arch/sh/kernel/setup.c | 9 +-------- arch/sparc/boot/piggyback.c | 4 ++-- arch/sparc/kernel/head_32.S | 4 ++-- arch/sparc/kernel/head_64.S | 6 ++++-- arch/sparc/kernel/setup_32.c | 5 ----- arch/sparc/kernel/setup_64.c | 5 ----- arch/x86/boot/header.S | 2 +- arch/x86/include/uapi/asm/bootparam.h | 5 +---- arch/x86/kernel/setup.c | 5 ----- include/linux/initrd.h | 3 --- init/do_mounts_rd.c | 8 +++----- 15 files changed, 17 insertions(+), 48 deletions(-) diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst index 77e6163288db..118aa7b69667 100644 
--- a/Documentation/arch/x86/boot.rst +++ b/Documentation/arch/x86/boot.rst @@ -189,7 +189,7 @@ Offset/Size Proto Name Meaning 01F1/1 ALL(1) setup_sects The size of the setup in sectors 01F2/2 ALL root_flags If set, the root is mounted readonly 01F4/4 2.04+(2) syssize The size of the 32-bit code in 16-byte paras -01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only +01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only - used to control initrd, which was removed from Linux in 2025 01FA/2 ALL vid_mode Video mode control 01FC/2 ALL root_dev Default root device number 01FE/2 ALL boot_flag 0xAA55 magic number @@ -308,7 +308,7 @@ Offset/size: 0x1f8/2 Protocol: ALL ============ =============== - This field is obsolete. + This field is obsolete. Used to control initrd, which was removed from Linux in 2025. ============ =================== Field name: vid_mode diff --git a/arch/arm/kernel/atags_parse.c b/arch/arm/kernel/atags_parse.c index 4ec591bde3df..a3f0a4f84e04 100644 --- a/arch/arm/kernel/atags_parse.c +++ b/arch/arm/kernel/atags_parse.c @@ -90,8 +90,6 @@ __tagtable(ATAG_VIDEOTEXT, parse_tag_videotext); #ifdef CONFIG_BLK_DEV_RAM static int __init parse_tag_ramdisk(const struct tag *tag) { - rd_image_start = tag->u.ramdisk.start; - if (tag->u.ramdisk.size) rd_size = tag->u.ramdisk.size; diff --git a/arch/sh/include/asm/setup.h b/arch/sh/include/asm/setup.h index 84bb23a771f3..d1b97c5726e4 100644 --- a/arch/sh/include/asm/setup.h +++ b/arch/sh/include/asm/setup.h @@ -10,7 +10,6 @@ #define PARAM ((unsigned char *)empty_zero_page) #define MOUNT_ROOT_RDONLY (*(unsigned long *) (PARAM+0x000)) -#define RAMDISK_FLAGS (*(unsigned long *) (PARAM+0x004)) #define ORIG_ROOT_DEV (*(unsigned long *) (PARAM+0x008)) #define LOADER_TYPE (*(unsigned long *) (PARAM+0x00c)) #define INITRD_START (*(unsigned long *) (PARAM+0x010)) diff --git a/arch/sh/kernel/head_32.S b/arch/sh/kernel/head_32.S index b603b7968b38..4382c0f058c8 100644 --- a/arch/sh/kernel/head_32.S +++ 
b/arch/sh/kernel/head_32.S @@ -28,7 +28,7 @@ .section .empty_zero_page, "aw" ENTRY(empty_zero_page) .long 1 /* MOUNT_ROOT_RDONLY */ - .long 0 /* RAMDISK_FLAGS */ + .long 0 /* RAMDISK_FLAGS - used to control initrd, which was removed from Linux in 2025 */ .long 0x0200 /* ORIG_ROOT_DEV */ .long 1 /* LOADER_TYPE */ .long 0x00000000 /* INITRD_START */ diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index d66f098e9e9f..50f1d39fe34f 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -70,8 +70,6 @@ EXPORT_SYMBOL(sh_mv); extern int root_mountflags; -#define RAMDISK_IMAGE_START_MASK 0x07FF - static char __initdata command_line[COMMAND_LINE_SIZE] = { 0, }; static struct resource code_resource = { @@ -273,19 +271,14 @@ void __init setup_arch(char **cmdline_p) printk(KERN_NOTICE "Boot params:\n" "... MOUNT_ROOT_RDONLY - %08lx\n" - "... RAMDISK_FLAGS - %08lx\n" "... ORIG_ROOT_DEV - %08lx\n" "... LOADER_TYPE - %08lx\n" "... INITRD_START - %08lx\n" "... INITRD_SIZE - %08lx\n", - MOUNT_ROOT_RDONLY, RAMDISK_FLAGS, + MOUNT_ROOT_RDONLY, ORIG_ROOT_DEV, LOADER_TYPE, INITRD_START, INITRD_SIZE); -#ifdef CONFIG_BLK_DEV_RAM - rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK; -#endif - if (!MOUNT_ROOT_RDONLY) root_mountflags &= ~MS_RDONLY; setup_initial_init_mm(_text, _etext, _edata, _end); diff --git a/arch/sparc/boot/piggyback.c b/arch/sparc/boot/piggyback.c index 6d74064add0a..a9cc55254ff8 100644 --- a/arch/sparc/boot/piggyback.c +++ b/arch/sparc/boot/piggyback.c @@ -220,8 +220,8 @@ int main(int argc,char **argv) /* * root_flags = 0 - * root_dev = 1 (RAMDISK_MAJOR) - * ram_flags = 0 + * root_dev = 1 (1 used to mean RAMDISK_MAJOR, i. e. 
initrd, which was removed from Linux) + * ram_flags = 0 (used to control initrd, which was removed from Linux in 2025) * sparc_ramdisk_image = "PAGE aligned address after _end") * sparc_ramdisk_size = size of image */ diff --git a/arch/sparc/kernel/head_32.S b/arch/sparc/kernel/head_32.S index 38345460d542..46f0e39b9037 100644 --- a/arch/sparc/kernel/head_32.S +++ b/arch/sparc/kernel/head_32.S @@ -65,7 +65,7 @@ empty_zero_page: .skip PAGE_SIZE EXPORT_SYMBOL(empty_zero_page) .global root_flags - .global ram_flags + .global ram_flags /* used to control initrd, which was removed from Linux in 2025 */ .global root_dev .global sparc_ramdisk_image .global sparc_ramdisk_size @@ -81,7 +81,7 @@ root_flags: .half 1 root_dev: .half 0 -ram_flags: +ram_flags: /* used to control initrd, which was removed from Linux in 2025 */ .half 0 sparc_ramdisk_image: .word 0 diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S index cf0549134234..4480c0532fe9 100644 --- a/arch/sparc/kernel/head_64.S +++ b/arch/sparc/kernel/head_64.S @@ -52,7 +52,9 @@ stext: * Fields should be kept upward compatible and whenever any change is made, * HdrS version should be incremented. 
*/ - .global root_flags, ram_flags, root_dev + .global root_flags + .global ram_flags /* used to control initrd, which was removed from Linux in 2025 */ + .global root_dev .global sparc_ramdisk_image, sparc_ramdisk_size .global sparc_ramdisk_image64 @@ -71,7 +73,7 @@ root_flags: .half 1 root_dev: .half 0 -ram_flags: +ram_flags: /* used to control initrd, which was removed from Linux in 2025 */ .half 0 sparc_ramdisk_image: .word 0 diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index eb60be31127f..fb46fb3acf54 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -170,8 +170,6 @@ static void __init boot_flags_init(char *commands) extern unsigned short root_flags; extern unsigned short root_dev; -extern unsigned short ram_flags; -#define RAMDISK_IMAGE_START_MASK 0x07FF extern int root_mountflags; @@ -335,9 +333,6 @@ void __init setup_arch(char **cmdline_p) if (!root_flags) root_mountflags &= ~MS_RDONLY; ROOT_DEV = old_decode_dev(root_dev); -#ifdef CONFIG_BLK_DEV_RAM - rd_image_start = ram_flags & RAMDISK_IMAGE_START_MASK; -#endif prom_setsync(prom_sync_me); diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c index f728f1b00aca..79b56613c6d8 100644 --- a/arch/sparc/kernel/setup_64.c +++ b/arch/sparc/kernel/setup_64.c @@ -143,8 +143,6 @@ static void __init boot_flags_init(char *commands) extern unsigned short root_flags; extern unsigned short root_dev; -extern unsigned short ram_flags; -#define RAMDISK_IMAGE_START_MASK 0x07FF extern int root_mountflags; @@ -640,9 +638,6 @@ void __init setup_arch(char **cmdline_p) if (!root_flags) root_mountflags &= ~MS_RDONLY; ROOT_DEV = old_decode_dev(root_dev); -#ifdef CONFIG_BLK_DEV_RAM - rd_image_start = ram_flags & RAMDISK_IMAGE_START_MASK; -#endif #ifdef CONFIG_IP_PNP if (!ic_set_manually) { diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 9bea5a1e2c52..0ced2e9f100e 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -235,7 
+235,7 @@ hdr: .byte setup_sects - 1 root_flags: .word ROOT_RDONLY syssize: .long ZO__edata / 16 -ram_size: .word 0 /* Obsolete */ +ram_size: .word 0 /* Used to control initrd, which was removed from Linux in 2025 */ vid_mode: .word SVGA_MODE root_dev: .word 0 /* Default to major/minor 0/0 */ boot_flag: .word 0xAA55 diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index f53dd3f319ba..bf56549f79bb 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -4,9 +4,6 @@ #include -/* ram_size flags */ -#define RAMDISK_IMAGE_START_MASK 0x07FF - /* loadflags */ #define LOADED_HIGH (1<<0) #define KASLR_FLAG (1<<1) @@ -37,7 +34,7 @@ struct setup_header { __u8 setup_sects; __u16 root_flags; __u32 syssize; - __u16 ram_size; + __u16 ram_size; /* used to control initrd, which was removed from Linux in 2025 */ __u16 vid_mode; __u16 root_dev; __u16 boot_flag; diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 6409e766fb17..797c3c9fc75e 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -222,8 +222,6 @@ extern int root_mountflags; unsigned long saved_video_mode; -#define RAMDISK_IMAGE_START_MASK 0x07FF - static char __initdata command_line[COMMAND_LINE_SIZE]; #ifdef CONFIG_CMDLINE_BOOL char builtin_cmdline[COMMAND_LINE_SIZE] = CONFIG_CMDLINE; @@ -541,9 +539,6 @@ static void __init parse_boot_params(void) bootloader_version = bootloader_type & 0xf; bootloader_version |= boot_params.hdr.ext_loader_ver << 4; -#ifdef CONFIG_BLK_DEV_RAM - rd_image_start = boot_params.hdr.ram_size & RAMDISK_IMAGE_START_MASK; -#endif #ifdef CONFIG_EFI if (!strncmp((char *)&boot_params.efi_info.efi_loader_signature, EFI32_LOADER_SIGNATURE, 4)) { diff --git a/include/linux/initrd.h b/include/linux/initrd.h index f1a1f4c92ded..6320a9cb6686 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -5,9 +5,6 @@ #define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon 
... */ -/* starting block # of image */ -extern int rd_image_start; - /* size of a single RAM disk */ extern unsigned long rd_size; diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c index f7d53bc21e41..8e0a774a9c6f 100644 --- a/init/do_mounts_rd.c +++ b/init/do_mounts_rd.c @@ -17,11 +17,9 @@ static struct file *in_file, *out_file; static loff_t in_pos, out_pos; -int __initdata rd_image_start; /* starting block # of image */ - static int __init ramdisk_start_setup(char *str) { - rd_image_start = simple_strtol(str,NULL,0); + /* will be removed in next commit */ return 1; } __setup("ramdisk_start=", ramdisk_start_setup); @@ -60,7 +58,7 @@ identify_ramdisk_image(struct file *file, loff_t pos, unsigned char *buf; const char *compress_name; unsigned long n; - int start_block = rd_image_start; + int start_block = 0; buf = kmalloc(size, GFP_KERNEL); if (!buf) @@ -196,7 +194,7 @@ int __init rd_load_image(char *from) if (IS_ERR(in_file)) goto noclose_input; - in_pos = rd_image_start * BLOCK_SIZE; + in_pos = 0; nblocks = identify_ramdisk_image(in_file, in_pos, &decompressor); if (nblocks < 0) goto done; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:44 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:44 +0000 Subject: [PATCH RESEND 05/62] init: remove "ramdisk_start" command line parameter, which controls starting block number of initrd In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-6-safinaskar@gmail.com> This is preparation for initrd removal Signed-off-by: Askar Safin --- Documentation/admin-guide/blockdev/ramdisk.rst | 3 +-- Documentation/admin-guide/kernel-parameters.txt | 2 -- init/do_mounts_rd.c | 7 ------- 3 files changed, 1 insertion(+), 11 deletions(-) diff --git a/Documentation/admin-guide/blockdev/ramdisk.rst b/Documentation/admin-guide/blockdev/ramdisk.rst index 9ce6101e8dd9..e57c61108dbc 100644 --- 
a/Documentation/admin-guide/blockdev/ramdisk.rst +++ b/Documentation/admin-guide/blockdev/ramdisk.rst @@ -74,12 +74,11 @@ arch/x86/boot/Makefile. Some of the kernel command line boot options that may apply here are:: - ramdisk_start=N ramdisk_size=M If you make a boot disk that has LILO, then for the above, you would use:: - append = "ramdisk_start=N ramdisk_size=M" + append = "ramdisk_size=M" 4) An Example of Creating a Compressed RAM Disk ----------------------------------------------- diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index f940c1184912..07e8878f1e13 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5285,8 +5285,6 @@ ramdisk_size= [RAM] Sizes of RAM disks in kilobytes See Documentation/admin-guide/blockdev/ramdisk.rst. - ramdisk_start= [RAM] RAM disk image start address - random.trust_cpu=off [KNL,EARLY] Disable trusting the use of the CPU's random number generator (if available) to diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c index 8e0a774a9c6f..864fa88d9f89 100644 --- a/init/do_mounts_rd.c +++ b/init/do_mounts_rd.c @@ -17,13 +17,6 @@ static struct file *in_file, *out_file; static loff_t in_pos, out_pos; -static int __init ramdisk_start_setup(char *str) -{ - /* will be removed in next commit */ - return 1; -} -__setup("ramdisk_start=", ramdisk_start_setup); - static int __init crd_load(decompress_fn deco); /* -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:45 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:45 +0000 Subject: [PATCH RESEND 06/62] arm: init: remove special logic for setting brd.rd_size In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-7-safinaskar@gmail.com> There is no reason for having a special mechanism for setting the ramdisk size.
This also allows us to make the rd_size variable static. Signed-off-by: Askar Safin --- arch/arm/kernel/atags_parse.c | 12 ------------ drivers/block/brd.c | 8 ++++---- include/linux/initrd.h | 3 --- 3 files changed, 4 insertions(+), 19 deletions(-) diff --git a/arch/arm/kernel/atags_parse.c b/arch/arm/kernel/atags_parse.c index a3f0a4f84e04..615d9e83c9b5 100644 --- a/arch/arm/kernel/atags_parse.c +++ b/arch/arm/kernel/atags_parse.c @@ -87,18 +87,6 @@ static int __init parse_tag_videotext(const struct tag *tag) __tagtable(ATAG_VIDEOTEXT, parse_tag_videotext); #endif -#ifdef CONFIG_BLK_DEV_RAM -static int __init parse_tag_ramdisk(const struct tag *tag) -{ - if (tag->u.ramdisk.size) - rd_size = tag->u.ramdisk.size; - - return 0; -} - -__tagtable(ATAG_RAMDISK, parse_tag_ramdisk); -#endif - static int __init parse_tag_serialnr(const struct tag *tag) { system_serial_low = tag->u.serialnr.low; diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 0c2eabe14af3..72f02d2b8a99 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -27,6 +27,10 @@ #include +static unsigned long rd_size = CONFIG_BLK_DEV_RAM_SIZE; +module_param(rd_size, ulong, 0444); +MODULE_PARM_DESC(rd_size, "Size of each RAM disk in kbytes."); + /* * Each block ramdisk device has a xarray brd_pages of pages that stores * the pages containing the block device's contents.
@@ -209,10 +213,6 @@ static int rd_nr = CONFIG_BLK_DEV_RAM_COUNT; module_param(rd_nr, int, 0444); MODULE_PARM_DESC(rd_nr, "Maximum number of brd devices"); -unsigned long rd_size = CONFIG_BLK_DEV_RAM_SIZE; -module_param(rd_size, ulong, 0444); -MODULE_PARM_DESC(rd_size, "Size of each RAM disk in kbytes."); - static int max_part = 1; module_param(max_part, int, 0444); MODULE_PARM_DESC(max_part, "Num Minors to reserve between devices"); diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 6320a9cb6686..b42235c21444 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -5,9 +5,6 @@ #define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon ... */ -/* size of a single RAM disk */ -extern unsigned long rd_size; - /* 1 if it is not an error if initrd_start < memory_start */ extern int initrd_below_start_ok; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:46 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:46 +0000 Subject: [PATCH RESEND 07/62] arm: init: remove ATAG_RAMDISK In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-8-safinaskar@gmail.com> Previous commit removed last reference to ATAG_RAMDISK, so let's remove it Signed-off-by: Askar Safin --- arch/arm/Kconfig | 2 +- arch/arm/include/uapi/asm/setup.h | 10 ---------- arch/arm/kernel/atags_compat.c | 8 -------- 3 files changed, 1 insertion(+), 19 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index b1f3df39ed40..afc161d76c5f 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1479,7 +1479,7 @@ config ARM_ATAG_DTB_COMPAT depends on ARM_APPENDED_DTB help Some old bootloaders can't be updated to a DTB capable one, yet - they provide ATAGs with memory configuration, the ramdisk address, + they provide ATAGs with memory configuration, the kernel cmdline string, etc. 
Such information is dynamically provided by the bootloader and can't always be stored in a static DTB. To allow a device tree enabled kernel to be used with such diff --git a/arch/arm/include/uapi/asm/setup.h b/arch/arm/include/uapi/asm/setup.h index 8e50e034fec7..3a70890ce80f 100644 --- a/arch/arm/include/uapi/asm/setup.h +++ b/arch/arm/include/uapi/asm/setup.h @@ -59,15 +59,6 @@ struct tag_videotext { __u16 video_points; }; -/* describes how the ramdisk will be used in kernel */ -#define ATAG_RAMDISK 0x54410004 - -struct tag_ramdisk { - __u32 flags; /* bit 0 = load, bit 1 = prompt */ - __u32 size; /* decompressed ramdisk size in _kilo_ bytes */ - __u32 start; /* starting block of floppy-based RAM disk image */ -}; - /* describes where the compressed ramdisk image lives (virtual address) */ /* * this one accidentally used virtual addresses - as such, @@ -150,7 +141,6 @@ struct tag { struct tag_core core; struct tag_mem32 mem; struct tag_videotext videotext; - struct tag_ramdisk ramdisk; struct tag_initrd initrd; struct tag_serialnr serialnr; struct tag_revision revision; diff --git a/arch/arm/kernel/atags_compat.c b/arch/arm/kernel/atags_compat.c index 10da11c212cc..b9747061fa97 100644 --- a/arch/arm/kernel/atags_compat.c +++ b/arch/arm/kernel/atags_compat.c @@ -122,14 +122,6 @@ static void __init build_tag_list(struct param_struct *params, void *taglist) tag->u.core.pagesize = params->u1.s.page_size; tag->u.core.rootdev = params->u1.s.rootdev; - tag = tag_next(tag); - tag->hdr.tag = ATAG_RAMDISK; - tag->hdr.size = tag_size(tag_ramdisk); - tag->u.ramdisk.flags = (params->u1.s.flags & FLAG_RDLOAD ? 1 : 0) | - (params->u1.s.flags & FLAG_RDPROMPT ? 
2 : 0); - tag->u.ramdisk.size = params->u1.s.ramdisk_size; - tag->u.ramdisk.start = params->u1.s.rd_start; - tag = tag_next(tag); tag->hdr.tag = ATAG_INITRD; tag->hdr.size = tag_size(tag_initrd); -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:47 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:47 +0000 Subject: [PATCH RESEND 08/62] arm: init: remove FLAG_RDLOAD and FLAG_RDPROMPT In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-9-safinaskar@gmail.com> They are unused since previous commit Signed-off-by: Askar Safin --- Documentation/arch/arm/setup.rst | 4 ++-- arch/arm/kernel/atags_compat.c | 2 -- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/Documentation/arch/arm/setup.rst b/Documentation/arch/arm/setup.rst index 8e12ef3fb9a7..be77d4b2aac1 100644 --- a/Documentation/arch/arm/setup.rst +++ b/Documentation/arch/arm/setup.rst @@ -35,8 +35,8 @@ below: ===== ======================== bit 0 1 = mount root read only bit 1 unused - bit 2 0 = load ramdisk - bit 3 0 = prompt for ramdisk + bit 2 unused + bit 3 unused ===== ======================== rootdev diff --git a/arch/arm/kernel/atags_compat.c b/arch/arm/kernel/atags_compat.c index b9747061fa97..8d04edee3066 100644 --- a/arch/arm/kernel/atags_compat.c +++ b/arch/arm/kernel/atags_compat.c @@ -44,8 +44,6 @@ struct param_struct { unsigned long ramdisk_size; /* 8 */ unsigned long flags; /* 12 */ #define FLAG_READONLY 1 -#define FLAG_RDLOAD 4 -#define FLAG_RDPROMPT 8 unsigned long rootdev; /* 16 */ unsigned long video_num_cols; /* 20 */ unsigned long video_num_rows; /* 24 */ -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:48 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:48 +0000 Subject: [PATCH RESEND 09/62] arm: init: document rd_start (in param_struct) as obsolete In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> 
References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-10-safinaskar@gmail.com> It is unused now Signed-off-by: Askar Safin --- Documentation/arch/arm/setup.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/arch/arm/setup.rst b/Documentation/arch/arm/setup.rst index be77d4b2aac1..01257f30d489 100644 --- a/Documentation/arch/arm/setup.rst +++ b/Documentation/arch/arm/setup.rst @@ -86,7 +86,7 @@ below: initial ramdisk. rd_start - Start address in sectors of the ramdisk image on a floppy disk. + This is now obsolete, and should not be used. system_rev system revision number. -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:49 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:49 +0000 Subject: [PATCH RESEND 10/62] initrd: remove initrd (initial RAM disk) support In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-11-safinaskar@gmail.com> Initrd was deprecated in 2020. Initramfs and (non-initial) RAM disks still work. Both built-in and bootloader-supplied initramfs still work. Also remove Documentation/admin-guide/initrd.rst . 
It contains a paragraph about initramfs, but initramfs is already covered in Documentation/filesystems/ramfs-rootfs-initramfs.rst Signed-off-by: Askar Safin --- Documentation/admin-guide/devices.txt | 6 - Documentation/admin-guide/index.rst | 1 - Documentation/admin-guide/initrd.rst | 383 ------------------ Documentation/admin-guide/nfs/nfsroot.rst | 4 +- Documentation/power/swsusp-dmcrypt.rst | 2 +- fs/init.c | 14 - include/linux/init_syscalls.h | 1 - include/linux/initrd.h | 2 - init/Kconfig | 2 +- init/Makefile | 1 - init/do_mounts.c | 6 +- init/do_mounts.h | 22 - init/do_mounts_initrd.c | 83 ---- init/do_mounts_rd.c | 318 --------------- init/initramfs.c | 31 +- .../ktest/examples/bootconfigs/tracing.bconf | 3 - 16 files changed, 6 insertions(+), 873 deletions(-) delete mode 100644 Documentation/admin-guide/initrd.rst delete mode 100644 init/do_mounts_rd.c diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt index 94c98be1329a..27835389ca49 100644 --- a/Documentation/admin-guide/devices.txt +++ b/Documentation/admin-guide/devices.txt @@ -21,12 +21,6 @@ 0 = /dev/ram0 First RAM disk 1 = /dev/ram1 Second RAM disk ... - 250 = /dev/initrd Initial RAM disk - - Older kernels had /dev/ramdisk (1, 1) here. - /dev/initrd refers to a RAM disk which was preloaded - by the boot loader; newer kernels use /dev/ram0 for - the initrd.
2 char Pseudo-TTY masters 0 = /dev/ptyp0 First PTY master diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst index 259d79fbeb94..b3b2628ea515 100644 --- a/Documentation/admin-guide/index.rst +++ b/Documentation/admin-guide/index.rst @@ -51,7 +51,6 @@ Booting the kernel bootconfig kernel-parameters efi-stub - initrd Tracking down and identifying problems diff --git a/Documentation/admin-guide/initrd.rst b/Documentation/admin-guide/initrd.rst deleted file mode 100644 index 67bbad8806e8..000000000000 --- a/Documentation/admin-guide/initrd.rst +++ /dev/null @@ -1,383 +0,0 @@ -Using the initial RAM disk (initrd) -=================================== - -Written 1996,2000 by Werner Almesberger and -Hans Lermen - - -initrd provides the capability to load a RAM disk by the boot loader. -This RAM disk can then be mounted as the root file system and programs -can be run from it. Afterwards, a new root file system can be mounted -from a different device. The previous root (from initrd) is then moved -to a directory and can be subsequently unmounted. - -initrd is mainly designed to allow system startup to occur in two phases, -where the kernel comes up with a minimum set of compiled-in drivers, and -where additional modules are loaded from initrd. - -This document gives a brief overview of the use of initrd. A more detailed -discussion of the boot process can be found in [#f1]_. - - -Operation ---------- - -When using initrd, the system typically boots as follows: - - 1) the boot loader loads the kernel and the initial RAM disk - 2) the kernel converts initrd into a "normal" RAM disk and - frees the memory used by initrd - 3) if the root device is not ``/dev/ram0``, the old (deprecated) - change_root procedure is followed. see the "Obsolete root change - mechanism" section below. - 4) root device is mounted. 
if it is ``/dev/ram0``, the initrd image is - then mounted as root - 5) /sbin/init is executed (this can be any valid executable, including - shell scripts; it is run with uid 0 and can do basically everything - init can do). - 6) init mounts the "real" root file system - 7) init places the root file system at the root directory using the - pivot_root system call - 8) init execs the ``/sbin/init`` on the new root filesystem, performing - the usual boot sequence - 9) the initrd file system is removed - -Note that changing the root directory does not involve unmounting it. -It is therefore possible to leave processes running on initrd during that -procedure. Also note that file systems mounted under initrd continue to -be accessible. - - -Boot command-line options -------------------------- - -initrd adds the following new options:: - - initrd= (e.g. LOADLIN) - - Loads the specified file as the initial RAM disk. When using LILO, you - have to specify the RAM disk image file in /etc/lilo.conf, using the - INITRD configuration variable. - - noinitrd - - initrd data is preserved but it is not converted to a RAM disk and - the "normal" root file system is mounted. initrd data can be read - from /dev/initrd. Note that the data in initrd can have any structure - in this case and doesn't necessarily have to be a file system image. - This option is used mainly for debugging. - - Note: /dev/initrd is read-only and it can only be used once. As soon - as the last process has closed it, all data is freed and /dev/initrd - can't be opened anymore. - - root=/dev/ram0 - - initrd is mounted as root, and the normal boot procedure is followed, - with the RAM disk mounted as root. - -Compressed cpio images ----------------------- - -Recent kernels have support for populating a ramdisk from a compressed cpio -archive. 
On such systems, the creation of a ramdisk image doesn't need to -involve special block devices or loopbacks; you merely create a directory on -disk with the desired initrd content, cd to that directory, and run (as an -example):: - - find . | cpio --quiet -H newc -o | gzip -9 -n > /boot/imagefile.img - -Examining the contents of an existing image file is just as simple:: - - mkdir /tmp/imagefile - cd /tmp/imagefile - gzip -cd /boot/imagefile.img | cpio -imd --quiet - -Installation ------------- - -First, a directory for the initrd file system has to be created on the -"normal" root file system, e.g.:: - - # mkdir /initrd - -The name is not relevant. More details can be found on the -:manpage:`pivot_root(2)` man page. - -If the root file system is created during the boot procedure (i.e. if -you're building an install floppy), the root file system creation -procedure should create the ``/initrd`` directory. - -If initrd will not be mounted in some cases, its content is still -accessible if the following device has been created:: - - # mknod /dev/initrd b 1 250 - # chmod 400 /dev/initrd - -Second, the kernel has to be compiled with RAM disk support and with -support for the initial RAM disk enabled. Also, at least all components -needed to execute programs from initrd (e.g. executable format and file -system) must be compiled into the kernel. - -Third, you have to create the RAM disk image. This is done by creating a -file system on a block device, copying files to it as needed, and then -copying the content of the block device to the initrd file. 
With recent -kernels, at least three types of devices are suitable for that: - - - a floppy disk (works everywhere but it's painfully slow) - - a RAM disk (fast, but allocates physical memory) - - a loopback device (the most elegant solution) - -We'll describe the loopback device method: - - 1) make sure loopback block devices are configured into the kernel - 2) create an empty file system of the appropriate size, e.g.:: - - # dd if=/dev/zero of=initrd bs=300k count=1 - # mke2fs -F -m0 initrd - - (if space is critical, you may want to use the Minix FS instead of Ext2) - 3) mount the file system, e.g.:: - - # mount -t ext2 -o loop initrd /mnt - - 4) create the console device:: - - # mkdir /mnt/dev - # mknod /mnt/dev/console c 5 1 - - 5) copy all the files that are needed to properly use the initrd - environment. Don't forget the most important file, ``/sbin/init`` - - .. note:: ``/sbin/init`` permissions must include "x" (execute). - - 6) correct operation the initrd environment can frequently be tested - even without rebooting with the command:: - - # chroot /mnt /sbin/init - - This is of course limited to initrds that do not interfere with the - general system state (e.g. by reconfiguring network interfaces, - overwriting mounted devices, trying to start already running demons, - etc. Note however that it is usually possible to use pivot_root in - such a chroot'ed initrd environment.) - 7) unmount the file system:: - - # umount /mnt - - 8) the initrd is now in the file "initrd". Optionally, it can now be - compressed:: - - # gzip -9 initrd - -For experimenting with initrd, you may want to take a rescue floppy and -only add a symbolic link from ``/sbin/init`` to ``/bin/sh``. Alternatively, you -can try the experimental newlib environment [#f2]_ to create a small -initrd. - -Finally, you have to boot the kernel and load initrd. Almost all Linux -boot loaders support initrd. 
Since the boot process is still compatible -with an older mechanism, the following boot command line parameters -have to be given:: - - root=/dev/ram0 rw - -(rw is only necessary if writing to the initrd file system.) - -With LOADLIN, you simply execute:: - - LOADLIN initrd= - -e.g.:: - - LOADLIN C:\LINUX\BZIMAGE initrd=C:\LINUX\INITRD.GZ root=/dev/ram0 rw - -With LILO, you add the option ``INITRD=`` to either the global section -or to the section of the respective kernel in ``/etc/lilo.conf``, and pass -the options using APPEND, e.g.:: - - image = /bzImage - initrd = /boot/initrd.gz - append = "root=/dev/ram0 rw" - -and run ``/sbin/lilo`` - -For other boot loaders, please refer to the respective documentation. - -Now you can boot and enjoy using initrd. - - -Changing the root device ------------------------- - -When finished with its duties, init typically changes the root device -and proceeds with starting the Linux system on the "real" root device. - -The procedure involves the following steps: - - mounting the new root file system - - turning it into the root file system - - removing all accesses to the old (initrd) root file system - - unmounting the initrd file system and de-allocating the RAM disk - -Mounting the new root file system is easy: it just needs to be mounted on -a directory under the current root. Example:: - - # mkdir /new-root - # mount -o ro /dev/hda1 /new-root - -The root change is accomplished with the pivot_root system call, which -is also available via the ``pivot_root`` utility (see :manpage:`pivot_root(8)` -man page; ``pivot_root`` is distributed with util-linux version 2.10h or higher -[#f3]_). ``pivot_root`` moves the current root to a directory under the new -root, and puts the new root at its place. The directory for the old root -must exist before calling ``pivot_root``. Example:: - - # cd /new-root - # mkdir initrd - # pivot_root . 
initrd - -Now, the init process may still access the old root via its -executable, shared libraries, standard input/output/error, and its -current root directory. All these references are dropped by the -following command:: - - # exec chroot . what-follows dev/console 2>&1 - -Where what-follows is a program under the new root, e.g. ``/sbin/init`` -If the new root file system will be used with udev and has no valid -``/dev`` directory, udev must be initialized before invoking chroot in order -to provide ``/dev/console``. - -Note: implementation details of pivot_root may change with time. In order -to ensure compatibility, the following points should be observed: - - - before calling pivot_root, the current directory of the invoking - process should point to the new root directory - - use . as the first argument, and the _relative_ path of the directory - for the old root as the second argument - - a chroot program must be available under the old and the new root - - chroot to the new root afterwards - - use relative paths for dev/console in the exec command - -Now, the initrd can be unmounted and the memory allocated by the RAM -disk can be freed:: - - # umount /initrd - # blockdev --flushbufs /dev/ram0 - -It is also possible to use initrd with an NFS-mounted root, see the -:manpage:`pivot_root(8)` man page for details. - - -Usage scenarios ---------------- - -The main motivation for implementing initrd was to allow for modular -kernel configuration at system installation. The procedure would work -as follows: - - 1) system boots from floppy or other media with a minimal kernel - (e.g. support for RAM disks, initrd, a.out, and the Ext2 FS) and - loads initrd - 2) ``/sbin/init`` determines what is needed to (1) mount the "real" root FS - (i.e. device type, device drivers, file system) and (2) the - distribution media (e.g. CD-ROM, network, tape, ...). This can be - done by asking the user, by auto-probing, or by using a hybrid - approach. 
- 3) ``/sbin/init`` loads the necessary kernel modules - 4) ``/sbin/init`` creates and populates the root file system (this doesn't - have to be a very usable system yet) - 5) ``/sbin/init`` invokes ``pivot_root`` to change the root file system and - execs - via chroot - a program that continues the installation - 6) the boot loader is installed - 7) the boot loader is configured to load an initrd with the set of - modules that was used to bring up the system (e.g. ``/initrd`` can be - modified, then unmounted, and finally, the image is written from - ``/dev/ram0`` or ``/dev/rd/0`` to a file) - 8) now the system is bootable and additional installation tasks can be - performed - -The key role of initrd here is to re-use the configuration data during -normal system operation without requiring the use of a bloated "generic" -kernel or re-compiling or re-linking the kernel. - -A second scenario is for installations where Linux runs on systems with -different hardware configurations in a single administrative domain. In -such cases, it is desirable to generate only a small set of kernels -(ideally only one) and to keep the system-specific part of configuration -information as small as possible. In this case, a common initrd could be -generated with all the necessary modules. Then, only ``/sbin/init`` or a file -read by it would have to be different. - -A third scenario is more convenient recovery disks, because information -like the location of the root FS partition doesn't have to be provided at -boot time, but the system loaded from initrd can invoke a user-friendly -dialog and it can also perform some sanity checks (or even some form of -auto-detection). - -Last not least, CD-ROM distributors may use it for better installation -from CD, e.g. by using a boot floppy and bootstrapping a bigger RAM disk -via initrd from CD; or by booting via a loader like ``LOADLIN`` or directly -from the CD-ROM, and loading the RAM disk from CD without need of -floppies. 
- - -Obsolete root change mechanism ------------------------------- - -The following mechanism was used before the introduction of pivot_root. -Current kernels still support it, but you should _not_ rely on its -continued availability. - -It works by mounting the "real" root device (i.e. the one set with rdev -in the kernel image or with root=... at the boot command line) as the -root file system when linuxrc exits. The initrd file system is then -unmounted, or, if it is still busy, moved to a directory ``/initrd``, if -such a directory exists on the new root file system. - -In order to use this mechanism, you do not have to specify the boot -command options root, init, or rw. (If specified, they will affect -the real root file system, not the initrd environment.) - -If /proc is mounted, the "real" root device can be changed from within -linuxrc by writing the number of the new root FS device to the special -file /proc/sys/kernel/real-root-dev, e.g.:: - - # echo 0x301 >/proc/sys/kernel/real-root-dev - -Note that the mechanism is incompatible with NFS and similar file -systems. - -This old, deprecated mechanism is commonly called ``change_root``, while -the new, supported mechanism is called ``pivot_root``. - - -Mixed change_root and pivot_root mechanism ------------------------------------------- - -In case you did not want to use ``root=/dev/ram0`` to trigger the pivot_root -mechanism, you may create both ``/linuxrc`` and ``/sbin/init`` in your initrd -image. - -``/linuxrc`` would contain only the following:: - - #! /bin/sh - mount -n -t proc proc /proc - echo 0x0100 >/proc/sys/kernel/real-root-dev - umount -n /proc - -Once linuxrc exited, the kernel would mount again your initrd as root, -this time executing ``/sbin/init``. Again, it would be the duty of this init -to build the right environment (maybe using the ``root= device`` passed on -the cmdline) before the final execution of the real ``/sbin/init``. - - -Resources ---------- - -.. 
[#f1] Almesberger, Werner; "Booting Linux: The History and the Future" - https://www.almesberger.net/cv/papers/ols2k-9.ps.gz -.. [#f2] newlib package (experimental), with initrd example - https://www.sourceware.org/newlib/ -.. [#f3] util-linux: Miscellaneous utilities for Linux - https://www.kernel.org/pub/linux/utils/util-linux/ diff --git a/Documentation/admin-guide/nfs/nfsroot.rst b/Documentation/admin-guide/nfs/nfsroot.rst index 135218f33394..60452bdfd454 100644 --- a/Documentation/admin-guide/nfs/nfsroot.rst +++ b/Documentation/admin-guide/nfs/nfsroot.rst @@ -18,8 +18,8 @@ Mounting the root filesystem via NFS (nfsroot) In order to use a diskless system, such as an X-terminal or printer server for example, it is necessary for the root filesystem to be present on a non-disk device. This may be an initramfs (see -Documentation/filesystems/ramfs-rootfs-initramfs.rst), a ramdisk (see -Documentation/admin-guide/initrd.rst) or a filesystem mounted via NFS. The +Documentation/filesystems/ramfs-rootfs-initramfs.rst) +or a filesystem mounted via NFS. The following text describes on how to use NFS for the root filesystem. For the rest of this text 'client' means the diskless system, and 'server' means the NFS server. diff --git a/Documentation/power/swsusp-dmcrypt.rst b/Documentation/power/swsusp-dmcrypt.rst index 426df59172cd..afb29a58fdf8 100644 --- a/Documentation/power/swsusp-dmcrypt.rst +++ b/Documentation/power/swsusp-dmcrypt.rst @@ -10,7 +10,7 @@ Some prerequisites: You know how dm-crypt works. If not, visit the following web page: http://www.saout.de/misc/dm-crypt/ You have read Documentation/power/swsusp.rst and understand it. -You did read Documentation/admin-guide/initrd.rst and know how an initrd works. +You did read Documentation/filesystems/ramfs-rootfs-initramfs.rst and know how an initrd works. You know how to create or how to modify an initrd. 
Now your system is properly set up, your disk is encrypted except for diff --git a/fs/init.c b/fs/init.c index eef5124885e3..dfa50474647c 100644 --- a/fs/init.c +++ b/fs/init.c @@ -27,20 +27,6 @@ int __init init_mount(const char *dev_name, const char *dir_name, return ret; } -int __init init_umount(const char *name, int flags) -{ - int lookup_flags = LOOKUP_MOUNTPOINT; - struct path path; - int ret; - - if (!(flags & UMOUNT_NOFOLLOW)) - lookup_flags |= LOOKUP_FOLLOW; - ret = kern_path(name, lookup_flags, &path); - if (ret) - return ret; - return path_umount(&path, flags); -} - int __init init_chdir(const char *filename) { struct path path; diff --git a/include/linux/init_syscalls.h b/include/linux/init_syscalls.h index 92045d18cbfc..0bdbc458a881 100644 --- a/include/linux/init_syscalls.h +++ b/include/linux/init_syscalls.h @@ -2,7 +2,6 @@ int __init init_mount(const char *dev_name, const char *dir_name, const char *type_page, unsigned long flags, void *data_page); -int __init init_umount(const char *name, int flags); int __init init_chdir(const char *filename); int __init init_chroot(const char *filename); int __init init_chown(const char *filename, uid_t user, gid_t group, int flags); diff --git a/include/linux/initrd.h b/include/linux/initrd.h index b42235c21444..cc389ef1a738 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -3,8 +3,6 @@ #ifndef __LINUX_INITRD_H #define __LINUX_INITRD_H -#define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon ... */ - /* 1 if it is not an error if initrd_start < memory_start */ extern int initrd_below_start_ok; diff --git a/init/Kconfig b/init/Kconfig index e3eb63eadc87..0263c08960bc 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1441,7 +1441,7 @@ config BLK_DEV_INITRD boot loader (loadlin or lilo) and that is mounted as root before the normal boot procedure. It is typically used to load modules needed to mount the "real" root file system, - etc. See for details. + etc. See for details. 
If RAM disk support (BLK_DEV_RAM) is also included, this also enables initial RAM disk (initrd) support and adds diff --git a/init/Makefile b/init/Makefile index d6f75d8907e0..b020154b3d2a 100644 --- a/init/Makefile +++ b/init/Makefile @@ -17,7 +17,6 @@ obj-$(CONFIG_INITRAMFS_TEST) += initramfs_test.o obj-y += init_task.o mounts-y := do_mounts.o -mounts-$(CONFIG_BLK_DEV_RAM) += do_mounts_rd.o mounts-$(CONFIG_BLK_DEV_INITRD) += do_mounts_initrd.o # diff --git a/init/do_mounts.c b/init/do_mounts.c index 0f2f44e6250c..f0b1a83dbda4 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -452,7 +452,7 @@ static dev_t __init parse_root_device(char *root_device_name) } /* - * Prepare the namespace - decide what/where to mount, load ramdisks, etc. + * Prepare the namespace - decide what/where to mount, etc. */ void __init prepare_namespace(void) { @@ -476,13 +476,9 @@ void __init prepare_namespace(void) if (saved_root_name[0]) ROOT_DEV = parse_root_device(saved_root_name); - if (initrd_load(saved_root_name)) - goto out; - if (root_wait) wait_for_root(saved_root_name); mount_root(saved_root_name); -out: devtmpfs_mount(); init_mount(".", "/", NULL, MS_MOVE, NULL); init_chroot("."); diff --git a/init/do_mounts.h b/init/do_mounts.h index 6069ea3eb80d..6c7a535e71ce 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -22,28 +22,6 @@ static inline __init int create_dev(char *name, dev_t dev) return init_mknod(name, S_IFBLK | 0600, new_encode_dev(dev)); } -#ifdef CONFIG_BLK_DEV_RAM - -int __init rd_load_disk(int n); -int __init rd_load_image(char *from); - -#else - -static inline int rd_load_disk(int n) { return 0; } -static inline int rd_load_image(char *from) { return 0; } - -#endif - -#ifdef CONFIG_BLK_DEV_INITRD -bool __init initrd_load(char *root_device_name); -#else -static inline bool initrd_load(char *root_device_name) -{ - return false; - } - -#endif - /* Ensure that async file closing finished to prevent spurious errors. 
*/ static inline void init_flush_fput(void) { diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index f6867bad0d78..308744254c08 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -69,86 +69,3 @@ static int __init early_initrd(char *p) return early_initrdmem(p); } early_param("initrd", early_initrd); - -static int __init init_linuxrc(struct subprocess_info *info, struct cred *new) -{ - ksys_unshare(CLONE_FS | CLONE_FILES); - console_on_rootfs(); - /* move initrd over / and chdir/chroot in initrd root */ - init_chdir("/root"); - init_mount(".", "/", NULL, MS_MOVE, NULL); - init_chroot("."); - ksys_setsid(); - return 0; -} - -static void __init handle_initrd(char *root_device_name) -{ - struct subprocess_info *info; - static char *argv[] = { "linuxrc", NULL, }; - extern char *envp_init[]; - int error; - - pr_warn("using deprecated initrd support, will be removed soon.\n"); - - real_root_dev = new_encode_dev(ROOT_DEV); - create_dev("/dev/root.old", Root_RAM0); - /* mount initrd on rootfs' /root */ - mount_root_generic("/dev/root.old", root_device_name, - root_mountflags & ~MS_RDONLY); - init_mkdir("/old", 0700); - init_chdir("/old"); - - info = call_usermodehelper_setup("/linuxrc", argv, envp_init, - GFP_KERNEL, init_linuxrc, NULL, NULL); - if (!info) - return; - call_usermodehelper_exec(info, UMH_WAIT_PROC|UMH_FREEZABLE); - - /* move initrd to rootfs' /old */ - init_mount("..", ".", NULL, MS_MOVE, NULL); - /* switch root and cwd back to / of rootfs */ - init_chroot(".."); - - if (new_decode_dev(real_root_dev) == Root_RAM0) { - init_chdir("/old"); - return; - } - - init_chdir("/"); - ROOT_DEV = new_decode_dev(real_root_dev); - mount_root(root_device_name); - - printk(KERN_NOTICE "Trying to move old root to /initrd ... "); - error = init_mount("/old", "/root/initrd", NULL, MS_MOVE, NULL); - if (!error) - printk("okay\n"); - else { - if (error == -ENOENT) - printk("/initrd does not exist. 
Ignored.\n"); - else - printk("failed\n"); - printk(KERN_NOTICE "Unmounting old root\n"); - init_umount("/old", MNT_DETACH); - } -} - -bool __init initrd_load(char *root_device_name) -{ - if (mount_initrd) { - create_dev("/dev/ram", Root_RAM0); - /* - * Load the initrd data into /dev/ram0. Execute it as initrd - * unless /dev/ram0 is supposed to be our actual root device, - * in that case the ram disk is just set up here, and gets - * mounted in the normal path. - */ - if (rd_load_image("/initrd.image") && ROOT_DEV != Root_RAM0) { - init_unlink("/initrd.image"); - handle_initrd(root_device_name); - return true; - } - } - init_unlink("/initrd.image"); - return false; -} diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c deleted file mode 100644 index 864fa88d9f89..000000000000 --- a/init/do_mounts_rd.c +++ /dev/null @@ -1,318 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "do_mounts.h" -#include "../fs/squashfs/squashfs_fs.h" - -#include - -static struct file *in_file, *out_file; -static loff_t in_pos, out_pos; - -static int __init crd_load(decompress_fn deco); - -/* - * This routine tries to find a RAM disk image to load, and returns the - * number of blocks to read for a non-compressed image, 0 if the image - * is a compressed image, and -1 if an image with the right magic - * numbers could not be found. 
- * - * We currently check for the following magic numbers: - * minix - * ext2 - * romfs - * cramfs - * squashfs - * gzip - * bzip2 - * lzma - * xz - * lzo - * lz4 - */ -static int __init -identify_ramdisk_image(struct file *file, loff_t pos, - decompress_fn *decompressor) -{ - const int size = 512; - struct minix_super_block *minixsb; - struct romfs_super_block *romfsb; - struct cramfs_super *cramfsb; - struct squashfs_super_block *squashfsb; - int nblocks = -1; - unsigned char *buf; - const char *compress_name; - unsigned long n; - int start_block = 0; - - buf = kmalloc(size, GFP_KERNEL); - if (!buf) - return -ENOMEM; - - minixsb = (struct minix_super_block *) buf; - romfsb = (struct romfs_super_block *) buf; - cramfsb = (struct cramfs_super *) buf; - squashfsb = (struct squashfs_super_block *) buf; - memset(buf, 0xe5, size); - - /* - * Read block 0 to test for compressed kernel - */ - pos = start_block * BLOCK_SIZE; - kernel_read(file, buf, size, &pos); - - *decompressor = decompress_method(buf, size, &compress_name); - if (compress_name) { - printk(KERN_NOTICE "RAMDISK: %s image found at block %d\n", - compress_name, start_block); - if (!*decompressor) - printk(KERN_EMERG - "RAMDISK: %s decompressor not configured!\n", - compress_name); - nblocks = 0; - goto done; - } - - /* romfs is at block zero too */ - if (romfsb->word0 == ROMSB_WORD0 && - romfsb->word1 == ROMSB_WORD1) { - printk(KERN_NOTICE - "RAMDISK: romfs filesystem found at block %d\n", - start_block); - nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS; - goto done; - } - - if (cramfsb->magic == CRAMFS_MAGIC) { - printk(KERN_NOTICE - "RAMDISK: cramfs filesystem found at block %d\n", - start_block); - nblocks = (cramfsb->size + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS; - goto done; - } - - /* squashfs is at block zero too */ - if (le32_to_cpu(squashfsb->s_magic) == SQUASHFS_MAGIC) { - printk(KERN_NOTICE - "RAMDISK: squashfs filesystem found at block %d\n", - start_block); - nblocks = 
(le64_to_cpu(squashfsb->bytes_used) + BLOCK_SIZE - 1) - >> BLOCK_SIZE_BITS; - goto done; - } - - /* - * Read 512 bytes further to check if cramfs is padded - */ - pos = start_block * BLOCK_SIZE + 0x200; - kernel_read(file, buf, size, &pos); - - if (cramfsb->magic == CRAMFS_MAGIC) { - printk(KERN_NOTICE - "RAMDISK: cramfs filesystem found at block %d\n", - start_block); - nblocks = (cramfsb->size + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS; - goto done; - } - - /* - * Read block 1 to test for minix and ext2 superblock - */ - pos = (start_block + 1) * BLOCK_SIZE; - kernel_read(file, buf, size, &pos); - - /* Try minix */ - if (minixsb->s_magic == MINIX_SUPER_MAGIC || - minixsb->s_magic == MINIX_SUPER_MAGIC2) { - printk(KERN_NOTICE - "RAMDISK: Minix filesystem found at block %d\n", - start_block); - nblocks = minixsb->s_nzones << minixsb->s_log_zone_size; - goto done; - } - - /* Try ext2 */ - n = ext2_image_size(buf); - if (n) { - printk(KERN_NOTICE - "RAMDISK: ext2 filesystem found at block %d\n", - start_block); - nblocks = n; - goto done; - } - - printk(KERN_NOTICE - "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n", - start_block); - -done: - kfree(buf); - return nblocks; -} - -static unsigned long nr_blocks(struct file *file) -{ - struct inode *inode = file->f_mapping->host; - - if (!S_ISBLK(inode->i_mode)) - return 0; - return i_size_read(inode) >> 10; -} - -int __init rd_load_image(char *from) -{ - int res = 0; - unsigned long rd_blocks, devblocks; - int nblocks, i; - char *buf = NULL; - unsigned short rotate = 0; - decompress_fn decompressor = NULL; -#if !defined(CONFIG_S390) - char rotator[4] = { '|' , '/' , '-' , '\\' }; -#endif - - out_file = filp_open("/dev/ram", O_RDWR, 0); - if (IS_ERR(out_file)) - goto out; - - in_file = filp_open(from, O_RDONLY, 0); - if (IS_ERR(in_file)) - goto noclose_input; - - in_pos = 0; - nblocks = identify_ramdisk_image(in_file, in_pos, &decompressor); - if (nblocks < 0) - goto done; - - if (nblocks == 0) { - if 
(crd_load(decompressor) == 0) - goto successful_load; - goto done; - } - - /* - * NOTE NOTE: nblocks is not actually blocks but - * the number of kibibytes of data to load into a ramdisk. - */ - rd_blocks = nr_blocks(out_file); - if (nblocks > rd_blocks) { - printk("RAMDISK: image too big! (%dKiB/%ldKiB)\n", - nblocks, rd_blocks); - goto done; - } - - /* - * OK, time to copy in the data - */ - if (strcmp(from, "/initrd.image") == 0) - devblocks = nblocks; - else - devblocks = nr_blocks(in_file); - - if (devblocks == 0) { - printk(KERN_ERR "RAMDISK: could not determine device size\n"); - goto done; - } - - buf = kmalloc(BLOCK_SIZE, GFP_KERNEL); - if (!buf) { - printk(KERN_ERR "RAMDISK: could not allocate buffer\n"); - goto done; - } - - printk(KERN_NOTICE "RAMDISK: Loading %dKiB [%ld disk%s] into ram disk... ", - nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : ""); - for (i = 0; i < nblocks; i++) { - if (i && (i % devblocks == 0)) { - pr_cont("done disk #1.\n"); - rotate = 0; - fput(in_file); - break; - } - kernel_read(in_file, buf, BLOCK_SIZE, &in_pos); - kernel_write(out_file, buf, BLOCK_SIZE, &out_pos); -#if !defined(CONFIG_S390) - if (!(i % 16)) { - pr_cont("%c\b", rotator[rotate & 0x3]); - rotate++; - } -#endif - } - pr_cont("done.\n"); - -successful_load: - res = 1; -done: - fput(in_file); -noclose_input: - fput(out_file); -out: - kfree(buf); - init_unlink("/dev/ram"); - return res; -} - -int __init rd_load_disk(int n) -{ - create_dev("/dev/root", ROOT_DEV); - create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, n)); - return rd_load_image("/dev/root"); -} - -static int exit_code; -static int decompress_error; - -static long __init compr_fill(void *buf, unsigned long len) -{ - long r = kernel_read(in_file, buf, len, &in_pos); - if (r < 0) - printk(KERN_ERR "RAMDISK: error while reading compressed data"); - else if (r == 0) - printk(KERN_ERR "RAMDISK: EOF while reading compressed data"); - return r; -} - -static long __init compr_flush(void *window, 
unsigned long outcnt) -{ - long written = kernel_write(out_file, window, outcnt, &out_pos); - if (written != outcnt) { - if (decompress_error == 0) - printk(KERN_ERR - "RAMDISK: incomplete write (%ld != %ld)\n", - written, outcnt); - decompress_error = 1; - return -1; - } - return outcnt; -} - -static void __init error(char *x) -{ - printk(KERN_ERR "%s\n", x); - exit_code = 1; - decompress_error = 1; -} - -static int __init crd_load(decompress_fn deco) -{ - int result; - - if (!deco) { - pr_emerg("Invalid ramdisk decompression routine. " - "Select appropriate config option.\n"); - panic("Could not decompress initial ramdisk image."); - } - - result = deco(NULL, 0, compr_fill, compr_flush, NULL, NULL, error); - if (decompress_error) - result = 1; - return result; -} diff --git a/init/initramfs.c b/init/initramfs.c index 097673b97784..850cb0de873e 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -692,28 +692,6 @@ static inline bool kexec_free_initrd(void) } #endif /* CONFIG_KEXEC_CORE */ -#ifdef CONFIG_BLK_DEV_RAM -static void __init populate_initrd_image(char *err) -{ - ssize_t written; - struct file *file; - loff_t pos = 0; - - printk(KERN_INFO "rootfs image is not initramfs (%s); looks like an initrd\n", - err); - file = filp_open("/initrd.image", O_WRONLY|O_CREAT|O_LARGEFILE, 0700); - if (IS_ERR(file)) - return; - - written = xwrite(file, (char *)initrd_start, initrd_end - initrd_start, - &pos); - if (written != initrd_end - initrd_start) - pr_err("/initrd.image: incomplete write (%zd != %ld)\n", - written, initrd_end - initrd_start); - fput(file); -} -#endif /* CONFIG_BLK_DEV_RAM */ - static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) { /* Load the built in initramfs */ @@ -724,18 +702,11 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) if (!initrd_start || IS_ENABLED(CONFIG_INITRAMFS_FORCE)) goto done; - if (IS_ENABLED(CONFIG_BLK_DEV_RAM)) - printk(KERN_INFO "Trying to unpack rootfs image as 
initramfs...\n"); - else - printk(KERN_INFO "Unpacking initramfs...\n"); + printk(KERN_INFO "Unpacking initramfs...\n"); err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start); if (err) { -#ifdef CONFIG_BLK_DEV_RAM - populate_initrd_image(err); -#else printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err); -#endif } done: diff --git a/tools/testing/ktest/examples/bootconfigs/tracing.bconf b/tools/testing/ktest/examples/bootconfigs/tracing.bconf index bf117c78115a..c81ee5e30d2d 100644 --- a/tools/testing/ktest/examples/bootconfigs/tracing.bconf +++ b/tools/testing/ktest/examples/bootconfigs/tracing.bconf @@ -16,9 +16,6 @@ ftrace { myevent2 { probes = "vfs_write $arg2 +0($arg2):ustring $arg3"; } - myevent3 { - probes = "initrd_load"; - } enable } } -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:50 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:50 +0000 Subject: [PATCH RESEND 11/62] init, efi: remove "noinitrd" command line parameter In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-12-safinaskar@gmail.com> It was inconsistent before initrd removal: it mostly controlled initrd only, but in EFI stub boot mode it controlled both initrd and initramfs Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 3 --- arch/arm/configs/collie_defconfig | 2 +- arch/arm/configs/imx_v6_v7_defconfig | 2 +- arch/arm/configs/neponset_defconfig | 2 +- arch/arm/configs/spitz_defconfig | 2 +- drivers/firmware/efi/libstub/efi-stub-helper.c | 5 +---- init/do_mounts_initrd.c | 9 --------- 7 files changed, 5 insertions(+), 20 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 07e8878f1e13..ad52e3d26014 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4271,9 
+4271,6 @@ Note that this argument takes precedence over the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option. - noinitrd [RAM] Tells the kernel not to load any configured - initial RAM disk. - nointremap [X86-64,Intel-IOMMU,EARLY] Do not enable interrupt remapping. [Deprecated - use intremap=off] diff --git a/arch/arm/configs/collie_defconfig b/arch/arm/configs/collie_defconfig index 578c6a4af620..00dc8ae22824 100644 --- a/arch/arm/configs/collie_defconfig +++ b/arch/arm/configs/collie_defconfig @@ -9,7 +9,7 @@ CONFIG_ARCH_MULTI_V4=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_SA1100=y CONFIG_SA1100_COLLIE=y -CONFIG_CMDLINE="noinitrd root=/dev/mtdblock2 rootfstype=jffs2 fbcon=rotate:1" +CONFIG_CMDLINE="root=/dev/mtdblock2 rootfstype=jffs2 fbcon=rotate:1" CONFIG_FPE_NWFPE=y CONFIG_PM=y # CONFIG_SWAP is not set diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig index 9a57763a8d38..b53ae2c052fc 100644 --- a/arch/arm/configs/imx_v6_v7_defconfig +++ b/arch/arm/configs/imx_v6_v7_defconfig @@ -32,7 +32,7 @@ CONFIG_SMP=y CONFIG_ARM_PSCI=y CONFIG_HIGHMEM=y CONFIG_ARCH_FORCE_MAX_ORDER=13 -CONFIG_CMDLINE="noinitrd console=ttymxc0,115200" +CONFIG_CMDLINE="console=ttymxc0,115200" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/arm/configs/neponset_defconfig b/arch/arm/configs/neponset_defconfig index 4d720001c12e..a61eb27373a8 100644 --- a/arch/arm/configs/neponset_defconfig +++ b/arch/arm/configs/neponset_defconfig @@ -9,7 +9,7 @@ CONFIG_ASSABET_NEPONSET=y CONFIG_ZBOOT_ROM_TEXT=0x80000 CONFIG_ZBOOT_ROM_BSS=0xc1000000 CONFIG_ZBOOT_ROM=y -CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) mem=32M noinitrd initrd=0xc0800000,3M" +CONFIG_CMDLINE="console=ttySA0,38400n8 cpufreq=221200 rw root=/dev/mtdblock2 mtdparts=sa1100:512K(boot),1M(kernel),2560K(initrd),4M(root) mem=32M initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y 
CONFIG_PM=y CONFIG_MODULES=y diff --git a/arch/arm/configs/spitz_defconfig b/arch/arm/configs/spitz_defconfig index ac2a0f998c73..8582b6f2cf9d 100644 --- a/arch/arm/configs/spitz_defconfig +++ b/arch/arm/configs/spitz_defconfig @@ -10,7 +10,7 @@ CONFIG_ARCH_PXA=y CONFIG_PXA_SHARPSL=y CONFIG_MACH_AKITA=y CONFIG_MACH_BORZOI=y -CONFIG_CMDLINE="console=ttyS0,115200n8 console=tty1 noinitrd root=/dev/mtdblock2 rootfstype=jffs2 debug" +CONFIG_CMDLINE="console=ttyS0,115200n8 console=tty1 root=/dev/mtdblock2 rootfstype=jffs2 debug" CONFIG_FPE_NWFPE=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c index 7aa2f9ad2935..6d89bf941d57 100644 --- a/drivers/firmware/efi/libstub/efi-stub-helper.c +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c @@ -21,7 +21,6 @@ bool efi_nochunk; bool efi_nokaslr = !IS_ENABLED(CONFIG_RANDOMIZE_BASE); bool efi_novamap; -static bool efi_noinitrd; static bool efi_nosoftreserve; static bool efi_disable_pci_dma = IS_ENABLED(CONFIG_EFI_DISABLE_PCI_DMA); @@ -75,8 +74,6 @@ efi_status_t efi_parse_options(char const *cmdline) efi_nokaslr = true; } else if (!strcmp(param, "quiet")) { efi_loglevel = CONSOLE_LOGLEVEL_QUIET; - } else if (!strcmp(param, "noinitrd")) { - efi_noinitrd = true; } else if (IS_ENABLED(CONFIG_X86_64) && !strcmp(param, "no5lvl")) { efi_no5lvl = true; } else if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT) && @@ -614,7 +611,7 @@ efi_status_t efi_load_initrd(efi_loaded_image_t *image, efi_status_t status = EFI_SUCCESS; struct linux_efi_initrd initrd, *tbl; - if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD) || efi_noinitrd) + if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD)) return EFI_SUCCESS; status = efi_load_initrd_dev_path(&initrd, hard_limit); diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 308744254c08..bec1c5d684a3 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -15,7 +15,6 @@ unsigned long initrd_start, 
initrd_end; int initrd_below_start_ok; static unsigned int real_root_dev; /* do_proc_dointvec cannot handle kdev_t */ -static int __initdata mount_initrd = 1; phys_addr_t phys_initrd_start __initdata; unsigned long phys_initrd_size __initdata; @@ -39,14 +38,6 @@ static __init int kernel_do_mounts_initrd_sysctls_init(void) late_initcall(kernel_do_mounts_initrd_sysctls_init); #endif /* CONFIG_SYSCTL */ -static int __init no_initrd(char *str) -{ - mount_initrd = 0; - return 1; -} - -__setup("noinitrd", no_initrd); - static int __init early_initrdmem(char *p) { phys_addr_t start; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:51 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:51 +0000 Subject: [PATCH RESEND 12/62] init: remove /proc/sys/kernel/real-root-dev In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-13-safinaskar@gmail.com> It was used for initrd support, which was removed in previous commits Signed-off-by: Askar Safin --- Documentation/admin-guide/sysctl/kernel.rst | 6 ------ include/uapi/linux/sysctl.h | 1 - init/do_mounts_initrd.c | 20 -------------------- 3 files changed, 27 deletions(-) diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst index 8b49eab937d0..cc958c228bc2 100644 --- a/Documentation/admin-guide/sysctl/kernel.rst +++ b/Documentation/admin-guide/sysctl/kernel.rst @@ -1215,12 +1215,6 @@ that support this feature. == =========================================================================== -real-root-dev -============= - -See Documentation/admin-guide/initrd.rst. 
- - reboot-cmd (SPARC only) ======================= diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h index 63d1464cb71c..1c7fe0f4dca4 100644 --- a/include/uapi/linux/sysctl.h +++ b/include/uapi/linux/sysctl.h @@ -92,7 +92,6 @@ enum KERN_DOMAINNAME=8, /* string: domainname */ KERN_PANIC=15, /* int: panic timeout */ - KERN_REALROOTDEV=16, /* real root device to mount after initrd */ KERN_SPARC_REBOOT=21, /* reboot command on Sparc */ KERN_CTLALTDEL=22, /* int: allow ctl-alt-del to reboot */ diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index bec1c5d684a3..d5264e9a52e0 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -14,30 +14,10 @@ unsigned long initrd_start, initrd_end; int initrd_below_start_ok; -static unsigned int real_root_dev; /* do_proc_dointvec cannot handle kdev_t */ phys_addr_t phys_initrd_start __initdata; unsigned long phys_initrd_size __initdata; -#ifdef CONFIG_SYSCTL -static const struct ctl_table kern_do_mounts_initrd_table[] = { - { - .procname = "real-root-dev", - .data = &real_root_dev, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec, - }, -}; - -static __init int kernel_do_mounts_initrd_sysctls_init(void) -{ - register_sysctl_init("kernel", kern_do_mounts_initrd_table); - return 0; -} -late_initcall(kernel_do_mounts_initrd_sysctls_init); -#endif /* CONFIG_SYSCTL */ - static int __init early_initrdmem(char *p) { phys_addr_t start; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:52 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:52 +0000 Subject: [PATCH RESEND 13/62] ext2: remove ext2_image_size and associated code In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-14-safinaskar@gmail.com> It is not used anymore Signed-off-by: Askar Safin --- fs/ext2/ext2.h | 9 --------- include/linux/ext2_fs.h | 13 ------------- 2 files changed, 
22 deletions(-) diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h index cf97b76e9fd3..d623a14040d9 100644 --- a/fs/ext2/ext2.h +++ b/fs/ext2/ext2.h @@ -608,15 +608,6 @@ struct ext2_dir_entry_2 { ~EXT2_DIR_ROUND) #define EXT2_MAX_REC_LEN ((1<<16)-1) -static inline void verify_offsets(void) -{ -#define A(x,y) BUILD_BUG_ON(x != offsetof(struct ext2_super_block, y)); - A(EXT2_SB_MAGIC_OFFSET, s_magic); - A(EXT2_SB_BLOCKS_OFFSET, s_blocks_count); - A(EXT2_SB_BSIZE_OFFSET, s_log_block_size); -#undef A -} - /* * ext2 mount options */ diff --git a/include/linux/ext2_fs.h b/include/linux/ext2_fs.h index 1fef88569037..e5ebe6cdf06c 100644 --- a/include/linux/ext2_fs.h +++ b/include/linux/ext2_fs.h @@ -27,17 +27,4 @@ */ #define EXT2_LINK_MAX 32000 -#define EXT2_SB_MAGIC_OFFSET 0x38 -#define EXT2_SB_BLOCKS_OFFSET 0x04 -#define EXT2_SB_BSIZE_OFFSET 0x18 - -static inline u64 ext2_image_size(void *ext2_sb) -{ - __u8 *p = ext2_sb; - if (*(__le16 *)(p + EXT2_SB_MAGIC_OFFSET) != cpu_to_le16(EXT2_SUPER_MAGIC)) - return 0; - return (u64)le32_to_cpup((__le32 *)(p + EXT2_SB_BLOCKS_OFFSET)) << - le32_to_cpup((__le32 *)(p + EXT2_SB_BSIZE_OFFSET)); -} - #endif /* _LINUX_EXT2_FS_H */ -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:53 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:53 +0000 Subject: [PATCH RESEND 14/62] init: m68k, mips, powerpc, s390, sh: remove Root_RAM0 In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-15-safinaskar@gmail.com> Root_RAM0 used to specify ramdisk as root device. 
It means nothing now, so let's remove it Signed-off-by: Askar Safin --- arch/m68k/kernel/uboot.c | 1 - arch/mips/kernel/setup.c | 1 - arch/powerpc/kernel/setup-common.c | 11 ++++------- arch/powerpc/platforms/powermac/setup.c | 4 +--- arch/s390/kernel/setup.c | 2 -- arch/sh/kernel/setup.c | 4 +--- include/linux/root_dev.h | 1 - init/do_mounts.c | 2 -- 8 files changed, 6 insertions(+), 20 deletions(-) diff --git a/arch/m68k/kernel/uboot.c b/arch/m68k/kernel/uboot.c index fa7c279ead5d..d278060a250c 100644 --- a/arch/m68k/kernel/uboot.c +++ b/arch/m68k/kernel/uboot.c @@ -83,7 +83,6 @@ static void __init parse_uboot_commandline(char *commandp, int size) (uboot_initrd_end > uboot_initrd_start)) { initrd_start = uboot_initrd_start; initrd_end = uboot_initrd_end; - ROOT_DEV = Root_RAM0; pr_info("initrd at 0x%lx:0x%lx\n", initrd_start, initrd_end); } #endif /* if defined(CONFIG_BLK_DEV_INITRD) */ diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index 11b9b6b63e19..a78e24873231 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -173,7 +173,6 @@ static unsigned long __init init_initrd(void) goto disable; } - ROOT_DEV = Root_RAM0; return PFN_UP(end); disable: initrd_start = 0; diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 68d47c53876c..97d330f3b8f1 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -363,17 +363,14 @@ void __init check_for_initrd(void) DBG(" -> check_for_initrd() initrd_start=0x%lx initrd_end=0x%lx\n", initrd_start, initrd_end); - /* If we were passed an initrd, set the ROOT_DEV properly if the values - * look sensible. If not, clear initrd reference. + /* If we were not passed an sensible initramfs, clear initramfs reference. 
*/ - if (is_kernel_addr(initrd_start) && is_kernel_addr(initrd_end) && - initrd_end > initrd_start) - ROOT_DEV = Root_RAM0; - else + if (!(is_kernel_addr(initrd_start) && is_kernel_addr(initrd_end) && + initrd_end > initrd_start)) initrd_start = initrd_end = 0; if (initrd_start) - pr_info("Found initrd at 0x%lx:0x%lx\n", initrd_start, initrd_end); + pr_info("Found initramfs at 0x%lx:0x%lx\n", initrd_start, initrd_end); DBG(" <- check_for_initrd()\n"); #endif /* CONFIG_BLK_DEV_INITRD */ diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c index eb092f293113..237d8386a3f4 100644 --- a/arch/powerpc/platforms/powermac/setup.c +++ b/arch/powerpc/platforms/powermac/setup.c @@ -296,9 +296,7 @@ static void __init pmac_setup_arch(void) #endif #ifdef CONFIG_PPC32 #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start) - ROOT_DEV = Root_RAM0; - else + if (!initrd_start) #endif ROOT_DEV = DEFAULT_ROOT_DEVICE; #endif diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 7b529868789f..a4ce721b7fe8 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -923,8 +923,6 @@ void __init setup_arch(char **cmdline_p) /* boot_command_line has been already set up in early.c */ *cmdline_p = boot_command_line; - ROOT_DEV = Root_RAM0; - setup_initial_init_mm(_text, _etext, _edata, _end); if (IS_ENABLED(CONFIG_EXPOLINE_AUTO)) diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index 50f1d39fe34f..c4312ee13db9 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -147,10 +147,8 @@ void __init check_for_initrd(void) /* * If we got this far in spite of the boot loader's best efforts - * to the contrary, assume we actually have a valid initrd and - * fix up the root dev. + * to the contrary, assume we actually have a valid initramfs. 
*/ - ROOT_DEV = Root_RAM0; /* * Address sanitization diff --git a/include/linux/root_dev.h b/include/linux/root_dev.h index 847c9a06101b..e411533b90b7 100644 --- a/include/linux/root_dev.h +++ b/include/linux/root_dev.h @@ -10,7 +10,6 @@ enum { Root_NFS = MKDEV(UNNAMED_MAJOR, 255), Root_CIFS = MKDEV(UNNAMED_MAJOR, 254), Root_Generic = MKDEV(UNNAMED_MAJOR, 253), - Root_RAM0 = MKDEV(RAMDISK_MAJOR, 0), }; extern dev_t ROOT_DEV; diff --git a/init/do_mounts.c b/init/do_mounts.c index f0b1a83dbda4..5c407ca54063 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -437,8 +437,6 @@ static dev_t __init parse_root_device(char *root_device_name) return Root_NFS; if (strcmp(root_device_name, "/dev/cifs") == 0) return Root_CIFS; - if (strcmp(root_device_name, "/dev/ram") == 0) - return Root_RAM0; error = early_lookup_bdev(root_device_name, &dev); if (error) { -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:54 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:54 +0000 Subject: [PATCH RESEND 15/62] doc: modernize Documentation/admin-guide/blockdev/ramdisk.rst In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-16-safinaskar@gmail.com> Update it to reflect initrd removal Signed-off-by: Askar Safin --- .../admin-guide/blockdev/ramdisk.rst | 103 ++---------------- 1 file changed, 7 insertions(+), 96 deletions(-) diff --git a/Documentation/admin-guide/blockdev/ramdisk.rst b/Documentation/admin-guide/blockdev/ramdisk.rst index e57c61108dbc..6289e085f18f 100644 --- a/Documentation/admin-guide/blockdev/ramdisk.rst +++ b/Documentation/admin-guide/blockdev/ramdisk.rst @@ -5,18 +5,14 @@ Using the RAM disk block device with Linux .. 
Contents: 1) Overview - 2) Kernel Command Line Parameters - 3) Using "rdev" - 4) An Example of Creating a Compressed RAM Disk + 2) Module parameters 1) Overview ----------- -The RAM disk driver is a way to use main system memory as a block device. It -is required for initrd, an initial filesystem used if you need to load modules -in order to access the root filesystem (see Documentation/admin-guide/initrd.rst). It can -also be used for a temporary filesystem for crypto work, since the contents +The RAM disk driver is a way to use main system memory as a block device. +It can also be used for a temporary filesystem for crypto work, since the contents are erased on reboot. The RAM disk dynamically grows as more space is required. It does this by using @@ -30,109 +26,24 @@ and (re)build the kernel. To use RAM disk support with your system, run './MAKEDEV ram' from the /dev directory. RAM disks are all major number 1, and start with minor number 0 -for /dev/ram0, etc. If used, modern kernels use /dev/ram0 for an initrd. - -The new RAM disk also has the ability to load compressed RAM disk images, -allowing one to squeeze more programs onto an average installation or -rescue floppy disk. +for /dev/ram0, etc. -2) Parameters ---------------------------------- +2) Module parameters +-------------------- -2a) Kernel Command Line Parameters - - ramdisk_size=N + rd_size=N Size of the ramdisk. This parameter tells the RAM disk driver to set up RAM disks of N k size. The default is 4096 (4 MB). -2b) Module parameters - rd_nr /dev/ramX devices created. max_part Maximum partition number. - rd_size - See ramdisk_size. - -3) Using "rdev" ---------------- - -"rdev" is an obsolete, deprecated, antiquated utility that could be used -to set the boot device in a Linux kernel image. - -Instead of using rdev, just place the boot device information on the -kernel command line and pass it to the kernel from the bootloader. 
- -You can also pass arguments to the kernel by setting FDARGS in -arch/x86/boot/Makefile and specify in initrd image by setting FDINITRD in -arch/x86/boot/Makefile. - -Some of the kernel command line boot options that may apply here are:: - - ramdisk_size=M - -If you make a boot disk that has LILO, then for the above, you would use:: - - append = "ramdisk_size=M" - -4) An Example of Creating a Compressed RAM Disk ------------------------------------------------ - -To create a RAM disk image, you will need a spare block device to -construct it on. This can be the RAM disk device itself, or an -unused disk partition (such as an unmounted swap partition). For this -example, we will use the RAM disk device, "/dev/ram0". - -Note: This technique should not be done on a machine with less than 8 MB -of RAM. If using a spare disk partition instead of /dev/ram0, then this -restriction does not apply. - -a) Decide on the RAM disk size that you want. Say 2 MB for this example. - Create it by writing to the RAM disk device. (This step is not currently - required, but may be in the future.) It is wise to zero out the - area (esp. for disks) so that maximal compression is achieved for - the unused blocks of the image that you are about to create:: - - dd if=/dev/zero of=/dev/ram0 bs=1k count=2048 - -b) Make a filesystem on it. Say ext2fs for this example:: - - mke2fs -vm0 /dev/ram0 2048 - -c) Mount it, copy the files you want to it (eg: /etc/* /dev/* ...) - and unmount it again. - -d) Compress the contents of the RAM disk. The level of compression - will be approximately 50% of the space used by the files. Unused - space on the RAM disk will compress to almost nothing:: - - dd if=/dev/ram0 bs=1k count=2048 | gzip -v9 > /tmp/ram_image.gz - -e) Put the kernel onto the floppy:: - - dd if=zImage of=/dev/fd0 bs=1k - -f) Put the RAM disk image onto the floppy, after the kernel. 
Use an offset - that is slightly larger than the kernel, so that you can put another - (possibly larger) kernel onto the same floppy later without overlapping - the RAM disk image. An offset of 400 kB for kernels about 350 kB in - size would be reasonable. Make sure offset+size of ram_image.gz is - not larger than the total space on your floppy (usually 1440 kB):: - - dd if=/tmp/ram_image.gz of=/dev/fd0 bs=1k seek=400 - -g) Make sure that you have already specified the boot information in - FDARGS and FDINITRD or that you use a bootloader to pass kernel - command line boot options to the kernel. - -That is it. You now have your boot/root compressed RAM disk floppy. Some -users may wish to combine steps (d) and (f) by using a pipe. - Paul Gortmaker 12/95 -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:55 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:55 +0000 Subject: [PATCH RESEND 16/62] brd: remove "ramdisk_size" command line parameter In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-17-safinaskar@gmail.com> It was used mostly for initrd. It could be used only if brd is built-in. Use "brd.rd_size" instead Signed-off-by: Askar Safin --- .../admin-guide/kernel-parameters.txt | 3 --- Documentation/arch/m68k/kernel-options.rst | 20 ++----------------- arch/arm/configs/s3c6400_defconfig | 2 +- drivers/block/brd.c | 10 ---------- 4 files changed, 3 insertions(+), 32 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index ad52e3d26014..e862a7b1d2ec 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5279,9 +5279,6 @@ raid= [HW,RAID] See Documentation/admin-guide/md.rst. - ramdisk_size= [RAM] Sizes of RAM disks in kilobytes - See Documentation/admin-guide/blockdev/ramdisk.rst. 
- random.trust_cpu=off [KNL,EARLY] Disable trusting the use of the CPU's random number generator (if available) to diff --git a/Documentation/arch/m68k/kernel-options.rst b/Documentation/arch/m68k/kernel-options.rst index 2008a20b4329..f6469ebeb2c7 100644 --- a/Documentation/arch/m68k/kernel-options.rst +++ b/Documentation/arch/m68k/kernel-options.rst @@ -215,27 +215,11 @@ Devices possible for Atari: seconds. -2.6) ramdisk_size= ------------------- - -:Syntax: ramdisk_size= - -This option instructs the kernel to set up a ramdisk of the given -size in KBytes. Do not use this option if the ramdisk contents are -passed by bootstrap! In this case, the size is selected automatically -and should not be overwritten. - -The only application is for root filesystems on floppy disks, that -should be loaded into memory. To do that, select the corresponding -size of the disk as ramdisk size, and set the root device to the disk -drive (with "root="). - - -2.7) swap= +2.5) swap= I can't find any sign of this option in 2.2.6. -2.8) buff= +2.6) buff= ----------- I can't find any sign of this option in 2.2.6. 
diff --git a/arch/arm/configs/s3c6400_defconfig b/arch/arm/configs/s3c6400_defconfig index a37e6ac40825..23635d5b9322 100644 --- a/arch/arm/configs/s3c6400_defconfig +++ b/arch/arm/configs/s3c6400_defconfig @@ -4,7 +4,7 @@ CONFIG_ARCH_MULTI_V6=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_S3C64XX=y CONFIG_MACH_WLF_CRAGG_6410=y -CONFIG_CMDLINE="console=ttySAC0,115200 root=/dev/ram init=/linuxrc initrd=0x51000000,6M ramdisk_size=6144" +CONFIG_CMDLINE="console=ttySAC0,115200 root=/dev/ram init=/linuxrc initrd=0x51000000,6M" CONFIG_VFP=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 72f02d2b8a99..05c4325904d2 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -222,16 +222,6 @@ MODULE_LICENSE("GPL"); MODULE_ALIAS_BLOCKDEV_MAJOR(RAMDISK_MAJOR); MODULE_ALIAS("rd"); -#ifndef MODULE -/* Legacy boot options - nonmodular */ -static int __init ramdisk_size(char *str) -{ - rd_size = simple_strtol(str, NULL, 0); - return 1; -} -__setup("ramdisk_size=", ramdisk_size); -#endif - /* * The device scheme is derived from loop.c. Keep them in synch where possible * (should share code eventually). -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:56 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:56 +0000 Subject: [PATCH RESEND 17/62] doc: modernize Documentation/filesystems/ramfs-rootfs-initramfs.rst In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-18-safinaskar@gmail.com> Update it to reflect initrd removal. 
Also I specified that error reports should go to linux-doc@vger.kernel.org, because Rob Landley said that he keeps getting reports about this document and is unable to fix them. Signed-off-by: Askar Safin --- .../filesystems/ramfs-rootfs-initramfs.rst | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.rst b/Documentation/filesystems/ramfs-rootfs-initramfs.rst index fa4f81099cb4..38a9cf11f547 100644 --- a/Documentation/filesystems/ramfs-rootfs-initramfs.rst +++ b/Documentation/filesystems/ramfs-rootfs-initramfs.rst @@ -8,6 +8,8 @@ October 17, 2005 :Author: Rob Landley +Report errors in this document to + What is ramfs? -------------- @@ -101,9 +103,9 @@ archive is extracted into it, the kernel will fall through to the older code to locate and mount a root partition, then exec some variant of /sbin/init out of that. -All this differs from the old initrd in several ways: +All this differs from the old initrd (removed in 2025) in several ways: - - The old initrd was always a separate file, while the initramfs archive is + - The old initrd was always a separate file, while the initramfs archive can be linked into the linux kernel image. (The directory ``linux-*/usr`` is devoted to generating this archive during the build.) @@ -137,7 +139,7 @@ Populating initramfs: The 2.6 kernel build process always creates a gzipped cpio format initramfs archive and links it into the resulting kernel binary. By default, this -archive is empty (consuming 134 bytes on x86). +archive is nearly empty (consuming 134 bytes on x86).
The config option CONFIG_INITRAMFS_SOURCE (in General Setup in menuconfig, and living in usr/Kconfig) can be used to specify a source for the @@ -222,15 +224,13 @@ use in place of the above config file:: External initramfs images: -------------------------- -If the kernel has initrd support enabled, an external cpio.gz archive can also -be passed into a 2.6 kernel in place of an initrd. In this case, the kernel -will autodetect the type (initramfs, not initrd) and extract the external cpio +If the kernel has CONFIG_BLK_DEV_INITRD enabled, an external cpio.gz archive can also +be passed into a 2.6 kernel. In this case, the kernel will extract the external cpio archive into rootfs before trying to run /init. -This has the memory efficiency advantages of initramfs (no ramdisk block -device) but the separate packaging of initrd (which is nice if you have +This is nice if you have non-GPL code you'd like to run from initramfs, without conflating it with -the GPL licensed Linux kernel binary). +the GPL licensed Linux kernel binary. It can also be used to supplement the kernel's built-in initramfs image. The files in the external archive will overwrite any conflicting files in @@ -278,7 +278,7 @@ User Mode Linux, like so:: EOF gcc -static hello.c -o init echo init | cpio -o -H newc | gzip > test.cpio.gz - # Testing external initramfs using the initrd loading mechanism. + # Testing external initramfs. 
qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero When debugging a normal root filesystem, it's nice to be able to boot with -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:57 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:57 +0000 Subject: [PATCH RESEND 18/62] doc: modernize Documentation/driver-api/early-userspace/early_userspace_support.rst In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-19-safinaskar@gmail.com> Update it to reflect initrd removal Signed-off-by: Askar Safin --- .../early_userspace_support.rst | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/Documentation/driver-api/early-userspace/early_userspace_support.rst b/Documentation/driver-api/early-userspace/early_userspace_support.rst index 61bdeac1bae5..0ca923c1007b 100644 --- a/Documentation/driver-api/early-userspace/early_userspace_support.rst +++ b/Documentation/driver-api/early-userspace/early_userspace_support.rst @@ -127,28 +127,22 @@ mailing list at https://www.zytor.com/mailman/listinfo/klibc How does it work? ================= -The kernel has currently 3 ways to mount the root filesystem: +The kernel has currently 2 ways to mount the root filesystem: a) all required device and filesystem drivers compiled into the kernel, no - initrd. init/main.c:init() will call prepare_namespace() to mount the + initramfs. init/main.c:kernel_init_freeable() will call prepare_namespace() to mount the final root filesystem, based on the root= option and optional init= to run - some other init binary than listed at the end of init/main.c:init(). + some other init binary than listed at the end of init/main.c:kernel_init(). -b) some device and filesystem drivers built as modules and stored in an - initrd. The initrd must contain a binary '/linuxrc' which is supposed to - load these driver modules. 
It is also possible to mount the final root - filesystem via linuxrc and use the pivot_root syscall. The initrd is - mounted and executed via prepare_namespace(). - -c) using initramfs. The call to prepare_namespace() must be skipped. +b) using initramfs. The call to prepare_namespace() must be skipped. This means that a binary must do all the work. Said binary can be stored into initramfs either via modifying usr/gen_init_cpio.c or via the new - initrd format, an cpio archive. It must be called "/init". This binary + initramfs format, an cpio archive. It must be called "/init". This binary is responsible to do all the things prepare_namespace() would do. To maintain backwards compatibility, the /init binary will only run if it comes via an initramfs cpio archive. If this is not the case, - init/main.c:init() will run prepare_namespace() to mount the final root + init/main.c:kernel_init_freeable() will run prepare_namespace() to mount the final root and exec one of the predefined init binaries. 
Bryan O'Sullivan -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:58 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:58 +0000 Subject: [PATCH RESEND 19/62] init: remove mentions of "ramdisk=" command line parameter In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-20-safinaskar@gmail.com> It is already removed Signed-off-by: Askar Safin --- arch/arm/boot/dts/samsung/exynos4210-origen.dts | 2 +- arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts | 2 +- arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts | 2 +- arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts | 2 +- arch/arm/configs/exynos_defconfig | 2 +- arch/arm/configs/s5pv210_defconfig | 2 +- drivers/block/Kconfig | 1 - 7 files changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/arm/boot/dts/samsung/exynos4210-origen.dts b/arch/arm/boot/dts/samsung/exynos4210-origen.dts index f1927ca15e08..4dcf794bd18b 100644 --- a/arch/arm/boot/dts/samsung/exynos4210-origen.dts +++ b/arch/arm/boot/dts/samsung/exynos4210-origen.dts @@ -36,7 +36,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M init=/linuxrc"; + bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial2:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts index 18f4f494093b..4cdeddeff3fc 100644 --- a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts +++ b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts @@ -30,7 +30,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M init=/linuxrc"; + bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial1:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts b/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts index c83fb250e664..4b18cc55d6ca 
100644 --- a/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts +++ b/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts @@ -27,7 +27,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M init=/linuxrc"; + bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial1:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts b/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts index bb623726ef1e..4164c7c2a3eb 100644 --- a/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts +++ b/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts @@ -27,7 +27,7 @@ memory@40000000 { }; chosen { - bootargs = "root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M init=/linuxrc"; + bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial2:115200n8"; }; diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig index 6915c766923a..77d3521f55d4 100644 --- a/arch/arm/configs/exynos_defconfig +++ b/arch/arm/configs/exynos_defconfig @@ -15,7 +15,7 @@ CONFIG_HIGHMEM=y CONFIG_SECCOMP=y CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y -CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M" +CONFIG_CMDLINE="root=/dev/ram0 rw initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/arm/configs/s5pv210_defconfig b/arch/arm/configs/s5pv210_defconfig index 02121eec3658..8ec82d9b51e4 100644 --- a/arch/arm/configs/s5pv210_defconfig +++ b/arch/arm/configs/s5pv210_defconfig @@ -8,7 +8,7 @@ CONFIG_KALLSYMS_ALL=y CONFIG_ARCH_S5PV210=y CONFIG_VMSPLIT_2G=y CONFIG_ARM_APPENDED_DTB=y -CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x20800000,8M console=ttySAC1,115200 init=/linuxrc" +CONFIG_CMDLINE="root=/dev/ram0 rw initrd=0x20800000,8M console=ttySAC1,115200 init=/linuxrc" CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig index df38fb364904..8cf06e40f61c 100644 --- a/drivers/block/Kconfig +++ b/drivers/block/Kconfig @@ -229,7 +229,6 @@ config BLK_DEV_RAM store a copy of a minimal root file system off of a floppy into RAM during the initial install of Linux. - Note that the kernel command line option "ramdisk=XX" is now obsolete. For details, read . To compile this driver as a module, choose M here: the -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:37:59 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:37:59 +0000 Subject: [PATCH RESEND 20/62] doc: remove Documentation/power/swsusp-dmcrypt.rst In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-21-safinaskar@gmail.com> It contains obsolete initrd and lilo based instructions Signed-off-by: Askar Safin --- Documentation/power/index.rst | 1 - Documentation/power/swsusp-dmcrypt.rst | 140 ------------------ .../translations/zh_CN/power/index.rst | 1 - 3 files changed, 142 deletions(-) delete mode 100644 Documentation/power/swsusp-dmcrypt.rst diff --git a/Documentation/power/index.rst b/Documentation/power/index.rst index a0f5244fb427..9f1758c92e48 100644 --- a/Documentation/power/index.rst +++ b/Documentation/power/index.rst @@ -22,7 +22,6 @@ Power Management suspend-and-cpuhotplug suspend-and-interrupts swsusp-and-swap-files - swsusp-dmcrypt swsusp video tricks diff --git a/Documentation/power/swsusp-dmcrypt.rst b/Documentation/power/swsusp-dmcrypt.rst deleted file mode 100644 index afb29a58fdf8..000000000000 --- a/Documentation/power/swsusp-dmcrypt.rst +++ /dev/null @@ -1,140 +0,0 @@ -======================================= -How to use dm-crypt and swsusp together -======================================= - -Author: Andreas Steinmetz - - - -Some prerequisites: -You know how dm-crypt works. 
If not, visit the following web page: -http://www.saout.de/misc/dm-crypt/ -You have read Documentation/power/swsusp.rst and understand it. -You did read Documentation/filesystems/ramfs-rootfs-initramfs.rst and know how an initrd works. -You know how to create or how to modify an initrd. - -Now your system is properly set up, your disk is encrypted except for -the swap device(s) and the boot partition which may contain a mini -system for crypto setup and/or rescue purposes. You may even have -an initrd that does your current crypto setup already. - -At this point you want to encrypt your swap, too. Still you want to -be able to suspend using swsusp. This, however, means that you -have to be able to either enter a passphrase or that you read -the key(s) from an external device like a pcmcia flash disk -or an usb stick prior to resume. So you need an initrd, that sets -up dm-crypt and then asks swsusp to resume from the encrypted -swap device. - -The most important thing is that you set up dm-crypt in such -a way that the swap device you suspend to/resume from has -always the same major/minor within the initrd as well as -within your running system. The easiest way to achieve this is -to always set up this swap device first with dmsetup, so that -it will always look like the following:: - - brw------- 1 root root 254, 0 Jul 28 13:37 /dev/mapper/swap0 - -Now set up your kernel to use /dev/mapper/swap0 as the default -resume partition, so your kernel .config contains:: - - CONFIG_PM_STD_PARTITION="/dev/mapper/swap0" - -Prepare your boot loader to use the initrd you will create or -modify. For lilo the simplest setup looks like the following -lines:: - - image=/boot/vmlinuz - initrd=/boot/initrd.gz - label=linux - append="root=/dev/ram0 init=/linuxrc rw" - -Finally you need to create or modify your initrd. Lets assume -you create an initrd that reads the required dm-crypt setup -from a pcmcia flash disk card. 
The card is formatted with an ext2 -fs which resides on /dev/hde1 when the card is inserted. The -card contains at least the encrypted swap setup in a file -named "swapkey". /etc/fstab of your initrd contains something -like the following:: - - /dev/hda1 /mnt ext3 ro 0 0 - none /proc proc defaults,noatime,nodiratime 0 0 - none /sys sysfs defaults,noatime,nodiratime 0 0 - -/dev/hda1 contains an unencrypted mini system that sets up all -of your crypto devices, again by reading the setup from the -pcmcia flash disk. What follows now is a /linuxrc for your -initrd that allows you to resume from encrypted swap and that -continues boot with your mini system on /dev/hda1 if resume -does not happen:: - - #!/bin/sh - PATH=/sbin:/bin:/usr/sbin:/usr/bin - mount /proc - mount /sys - mapped=0 - noresume=`grep -c noresume /proc/cmdline` - if [ "$*" != "" ] - then - noresume=1 - fi - dmesg -n 1 - /sbin/cardmgr -q - for i in 1 2 3 4 5 6 7 8 9 0 - do - if [ -f /proc/ide/hde/media ] - then - usleep 500000 - mount -t ext2 -o ro /dev/hde1 /mnt - if [ -f /mnt/swapkey ] - then - dmsetup create swap0 /mnt/swapkey > /dev/null 2>&1 && mapped=1 - fi - umount /mnt - break - fi - usleep 500000 - done - killproc /sbin/cardmgr - dmesg -n 6 - if [ $mapped = 1 ] - then - if [ $noresume != 0 ] - then - mkswap /dev/mapper/swap0 > /dev/null 2>&1 - fi - echo 254:0 > /sys/power/resume - dmsetup remove swap0 - fi - umount /sys - mount /mnt - umount /proc - cd /mnt - pivot_root . mnt - mount /proc - umount -l /mnt - umount /proc - exec chroot . /sbin/init $* < dev/console > dev/console 2>&1 - -Please don't mind the weird loop above, busybox's msh doesn't know -the let statement. Now, what is happening in the script? -First we have to decide if we want to try to resume, or not. -We will not resume if booting with "noresume" or any parameters -for init like "single" or "emergency" as boot parameters. - -Then we need to set up dmcrypt with the setup data from the -pcmcia flash disk. 
If this succeeds we need to reset the swap -device if we don't want to resume. The line "echo 254:0 > /sys/power/resume" -then attempts to resume from the first device mapper device. -Note that it is important to set the device in /sys/power/resume, -regardless if resuming or not, otherwise later suspend will fail. -If resume starts, script execution terminates here. - -Otherwise we just remove the encrypted swap device and leave it to the -mini system on /dev/hda1 to set the whole crypto up (it is up to -you to modify this to your taste). - -What then follows is the well known process to change the root -file system and continue booting from there. I prefer to unmount -the initrd prior to continue booting but it is up to you to modify -this. diff --git a/Documentation/translations/zh_CN/power/index.rst b/Documentation/translations/zh_CN/power/index.rst index bc54983ba515..4ee880e65107 100644 --- a/Documentation/translations/zh_CN/power/index.rst +++ b/Documentation/translations/zh_CN/power/index.rst @@ -32,7 +32,6 @@ TODOList: * suspend-and-cpuhotplug * suspend-and-interrupts * swsusp-and-swap-files - * swsusp-dmcrypt * swsusp * video * tricks -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:00 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:00 +0000 Subject: [PATCH RESEND 21/62] init: remove all mentions of root=/dev/ram* In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-22-safinaskar@gmail.com> Initrd support is removed, so root=/dev/ram* is never correct Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 3 +-- Documentation/arch/m68k/kernel-options.rst | 9 ++------- arch/arm/boot/dts/arm/integratorap.dts | 2 +- arch/arm/boot/dts/arm/integratorcp.dts | 2 +- arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-cmm.dts | 2 +- .../boot/dts/aspeed/aspeed-bmc-facebook-galaxy100.dts | 2 +- 
.../arm/boot/dts/aspeed/aspeed-bmc-facebook-minipack.dts | 2 +- .../arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge100.dts | 2 +- arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge40.dts | 2 +- arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-yamp.dts | 2 +- .../boot/dts/aspeed/ast2600-facebook-netbmc-common.dtsi | 2 +- arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts | 2 +- .../boot/dts/intel/ixp/intel-ixp42x-welltech-epbx100.dts | 2 +- arch/arm/boot/dts/nspire/nspire-classic.dtsi | 2 +- arch/arm/boot/dts/nspire/nspire-cx.dts | 2 +- arch/arm/boot/dts/samsung/exynos4210-origen.dts | 2 +- arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts | 2 +- arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts | 2 +- arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-nhk15.dts | 2 +- arch/arm/boot/dts/st/ste-nomadik-s8815.dts | 2 +- arch/arm/boot/dts/st/stm32429i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32746g-eval.dts | 2 +- arch/arm/boot/dts/st/stm32f429-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f469-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f746-disco.dts | 2 +- arch/arm/boot/dts/st/stm32f769-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h743i-eval.dts | 2 +- arch/arm/boot/dts/st/stm32h747i-disco.dts | 2 +- arch/arm/boot/dts/st/stm32h750i-art-pi.dts | 2 +- arch/arm/configs/assabet_defconfig | 2 +- arch/arm/configs/at91_dt_defconfig | 2 +- arch/arm/configs/exynos_defconfig | 2 +- arch/arm/configs/lpc32xx_defconfig | 2 +- arch/arm/configs/pxa_defconfig | 2 +- arch/arm/configs/s3c6400_defconfig | 2 +- arch/arm/configs/s5pv210_defconfig | 2 +- arch/arm/configs/sama5_defconfig | 2 +- arch/arm/configs/u8500_defconfig | 2 +- arch/parisc/defpalo.conf | 2 +- arch/s390/boot/ipl_parm.c | 2 +- arch/xtensa/Kconfig | 2 +- arch/xtensa/boot/dts/csp.dts | 2 +- 44 files changed, 45 insertions(+), 51 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt 
index e862a7b1d2ec..a259f2bdba0f 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -6407,8 +6407,7 @@ Usually this is a block device specifier of some kind, see the early_lookup_bdev comment in block/early-lookup.c for details. - Alternatively this can be "ram" for the legacy initial - ramdisk, "nfs" and "cifs" for root on a network file + Alternatively this can be "nfs" and "cifs" for root on a network file system, or "mtd" and "ubi" for mounting from raw flash. rootdelay= [KNL] Delay (in seconds) to pause before attempting to diff --git a/Documentation/arch/m68k/kernel-options.rst b/Documentation/arch/m68k/kernel-options.rst index f6469ebeb2c7..a508ee8efa8b 100644 --- a/Documentation/arch/m68k/kernel-options.rst +++ b/Documentation/arch/m68k/kernel-options.rst @@ -73,7 +73,6 @@ hardcoded name to number mappings. The name must always be a combination of two or three letters, followed by a decimal number. Valid names are:: - /dev/ram: -> 0x0100 (initial ramdisk) /dev/hda: -> 0x0300 (first IDE disk) /dev/hdb: -> 0x0340 (second IDE disk) /dev/sda: -> 0x0800 (first SCSI disk) @@ -86,12 +85,8 @@ Valid names are:: The name must be followed by a decimal number, that stands for the partition number. Internally, the value of the number is just added to the device number mentioned in the table above. The -exceptions are /dev/ram and /dev/fd, where /dev/ram refers to an -initial ramdisk loaded by your bootstrap program (please consult the -instructions for your bootstrap program to find out how to load an -initial ramdisk). As of kernel version 2.0.18 you must specify -/dev/ram as the root device if you want to boot from an initial -ramdisk. For the floppy devices, /dev/fd, the number stands for the +exception is /dev/fd. +For the floppy devices, /dev/fd, the number stands for the floppy drive number (there are no partitions on floppy disks). 
I.e., /dev/fd0 stands for the first drive, /dev/fd1 for the second, and so on. Since the number is just added, you can also force the disk format diff --git a/arch/arm/boot/dts/arm/integratorap.dts b/arch/arm/boot/dts/arm/integratorap.dts index 9b6a1dbaf265..2e43a8291d40 100644 --- a/arch/arm/boot/dts/arm/integratorap.dts +++ b/arch/arm/boot/dts/arm/integratorap.dts @@ -53,7 +53,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 console=ttyAM0,38400n8 earlyprintk"; + bootargs = "console=ttyAM0,38400n8 earlyprintk"; }; /* 24 MHz chrystal on the Integrator/AP development board */ diff --git a/arch/arm/boot/dts/arm/integratorcp.dts b/arch/arm/boot/dts/arm/integratorcp.dts index 8ad1a8957ace..2ac140741752 100644 --- a/arch/arm/boot/dts/arm/integratorcp.dts +++ b/arch/arm/boot/dts/arm/integratorcp.dts @@ -11,7 +11,7 @@ / { compatible = "arm,integrator-cp"; chosen { - bootargs = "root=/dev/ram0 console=ttyAMA0,38400n8 earlyprintk"; + bootargs = "console=ttyAMA0,38400n8 earlyprintk"; }; cpus { diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-cmm.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-cmm.dts index 24153868cc00..f4ae167e89f0 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-cmm.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-cmm.dts @@ -280,7 +280,7 @@ aliases { chosen { stdout-path = &uart1; - bootargs = "console=ttyS1,9600n8 root=/dev/ram rw earlycon"; + bootargs = "console=ttyS1,9600n8 rw earlycon"; }; ast-adc-hwmon { diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-galaxy100.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-galaxy100.dts index 60e875ac2461..d51ee3aaa461 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-galaxy100.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-galaxy100.dts @@ -10,7 +10,7 @@ / { chosen { stdout-path = &uart5; - bootargs = "console=ttyS0,9600n8 root=/dev/ram rw"; + bootargs = "console=ttyS0,9600n8 rw"; }; ast-adc-hwmon { diff --git 
a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-minipack.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-minipack.dts index aafd1042b6e5..4233d0d857b8 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-minipack.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-minipack.dts @@ -230,7 +230,7 @@ aliases { chosen { stdout-path = &uart1; - bootargs = "debug console=ttyS1,9600n8 root=/dev/ram rw"; + bootargs = "debug console=ttyS1,9600n8 rw"; }; }; diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge100.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge100.dts index 97cd11c3d9a5..23f9d1c690f8 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge100.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge100.dts @@ -10,7 +10,7 @@ / { chosen { stdout-path = &uart3; - bootargs = "console=ttyS2,9600n8 root=/dev/ram rw"; + bootargs = "console=ttyS2,9600n8 rw"; }; ast-adc-hwmon { diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge40.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge40.dts index 6624855d8ebd..e9b1b51f9f7a 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge40.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-wedge40.dts @@ -10,7 +10,7 @@ / { chosen { stdout-path = &uart3; - bootargs = "console=ttyS2,9600n8 root=/dev/ram rw"; + bootargs = "console=ttyS2,9600n8 rw"; }; ast-adc-hwmon { diff --git a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-yamp.dts b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-yamp.dts index 98fe0d6c8188..578ca0dc9647 100644 --- a/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-yamp.dts +++ b/arch/arm/boot/dts/aspeed/aspeed-bmc-facebook-yamp.dts @@ -21,7 +21,7 @@ aliases { chosen { stdout-path = &uart5; - bootargs = "console=ttyS0,9600n8 root=/dev/ram rw"; + bootargs = "console=ttyS0,9600n8 rw"; }; }; diff --git a/arch/arm/boot/dts/aspeed/ast2600-facebook-netbmc-common.dtsi b/arch/arm/boot/dts/aspeed/ast2600-facebook-netbmc-common.dtsi index 00e5887c926f..3dbf0cc70f48 
100644 --- a/arch/arm/boot/dts/aspeed/ast2600-facebook-netbmc-common.dtsi +++ b/arch/arm/boot/dts/aspeed/ast2600-facebook-netbmc-common.dtsi @@ -12,7 +12,7 @@ aliases { }; chosen { - bootargs = "console=ttyS0,9600n8 root=/dev/ram rw vmalloc=640M"; + bootargs = "console=ttyS0,9600n8 rw vmalloc=640M"; }; memory@80000000 { diff --git a/arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts b/arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts index f1c816a1d7cf..bbd62c6ad280 100644 --- a/arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts +++ b/arch/arm/boot/dts/hisilicon/hi3620-hi4511.dts @@ -13,7 +13,7 @@ / { compatible = "hisilicon,hi3620-hi4511"; chosen { - bootargs = "root=/dev/ram0"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-welltech-epbx100.dts b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-welltech-epbx100.dts index c550c421b659..96105137a364 100644 --- a/arch/arm/boot/dts/intel/ixp/intel-ixp42x-welltech-epbx100.dts +++ b/arch/arm/boot/dts/intel/ixp/intel-ixp42x-welltech-epbx100.dts @@ -20,7 +20,7 @@ memory@0 { }; chosen { - bootargs = "console=ttyS0,115200n8 root=/dev/ram0 initrd=0x00800000,9M"; + bootargs = "console=ttyS0,115200n8 initrd=0x00800000,9M"; stdout-path = "uart0:115200n8"; }; diff --git a/arch/arm/boot/dts/nspire/nspire-classic.dtsi b/arch/arm/boot/dts/nspire/nspire-classic.dtsi index 0ee53d3ecd54..224cf5921e26 100644 --- a/arch/arm/boot/dts/nspire/nspire-classic.dtsi +++ b/arch/arm/boot/dts/nspire/nspire-classic.dtsi @@ -81,6 +81,6 @@ panel_in: endpoint { }; }; chosen { - bootargs = "debug earlyprintk console=tty0 console=ttyS0,115200n8 root=/dev/ram0"; + bootargs = "debug earlyprintk console=tty0 console=ttyS0,115200n8"; }; }; diff --git a/arch/arm/boot/dts/nspire/nspire-cx.dts b/arch/arm/boot/dts/nspire/nspire-cx.dts index debeff0ec010..08155d15cca9 100644 --- a/arch/arm/boot/dts/nspire/nspire-cx.dts +++ b/arch/arm/boot/dts/nspire/nspire-cx.dts @@ -165,6 +165,6 @@ panel_in: endpoint { }; }; chosen
{ - bootargs = "debug earlyprintk console=tty0 console=ttyAMA0,115200n8 root=/dev/ram0"; + bootargs = "debug earlyprintk console=tty0 console=ttyAMA0,115200n8"; }; }; diff --git a/arch/arm/boot/dts/samsung/exynos4210-origen.dts b/arch/arm/boot/dts/samsung/exynos4210-origen.dts index 4dcf794bd18b..b714073143e7 100644 --- a/arch/arm/boot/dts/samsung/exynos4210-origen.dts +++ b/arch/arm/boot/dts/samsung/exynos4210-origen.dts @@ -36,7 +36,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; + bootargs = "rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial2:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts index 4cdeddeff3fc..2a3c2a4c0e90 100644 --- a/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts +++ b/arch/arm/boot/dts/samsung/exynos4210-smdkv310.dts @@ -30,7 +30,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; + bootargs = "rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial1:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts b/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts index 4b18cc55d6ca..920af4f91c75 100644 --- a/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts +++ b/arch/arm/boot/dts/samsung/exynos4412-smdk4412.dts @@ -27,7 +27,7 @@ aliases { }; chosen { - bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; + bootargs = "rw initrd=0x41000000,8M init=/linuxrc"; stdout-path = "serial1:115200n8"; }; diff --git a/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts b/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts index 4164c7c2a3eb..e5cfff1ffad0 100644 --- a/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts +++ b/arch/arm/boot/dts/samsung/exynos5250-smdk5250.dts @@ -27,7 +27,7 @@ memory@40000000 { }; chosen { - bootargs = "root=/dev/ram0 rw initrd=0x41000000,8M init=/linuxrc"; + bootargs = "rw initrd=0x41000000,8M
init=/linuxrc"; stdout-path = "serial2:115200n8"; }; diff --git a/arch/arm/boot/dts/st/ste-nomadik-nhk15.dts b/arch/arm/boot/dts/st/ste-nomadik-nhk15.dts index cdff33063d6f..8a22425cdb78 100644 --- a/arch/arm/boot/dts/st/ste-nomadik-nhk15.dts +++ b/arch/arm/boot/dts/st/ste-nomadik-nhk15.dts @@ -13,7 +13,7 @@ / { compatible = "st,nomadik-nhk-15"; chosen { - bootargs = "root=/dev/ram0 console=ttyAMA1,115200n8 earlyprintk"; + bootargs = "console=ttyAMA1,115200n8 earlyprintk"; }; aliases { diff --git a/arch/arm/boot/dts/st/ste-nomadik-s8815.dts b/arch/arm/boot/dts/st/ste-nomadik-s8815.dts index c905c2643a12..7f418d8a2370 100644 --- a/arch/arm/boot/dts/st/ste-nomadik-s8815.dts +++ b/arch/arm/boot/dts/st/ste-nomadik-s8815.dts @@ -13,7 +13,7 @@ / { compatible = "calaosystems,usb-s8815"; chosen { - bootargs = "root=/dev/ram0 console=ttyAMA1,115200n8 earlyprintk"; + bootargs = "console=ttyAMA1,115200n8 earlyprintk"; }; aliases { diff --git a/arch/arm/boot/dts/st/stm32429i-eval.dts b/arch/arm/boot/dts/st/stm32429i-eval.dts index afa417b34b25..7e8834af20c6 100644 --- a/arch/arm/boot/dts/st/stm32429i-eval.dts +++ b/arch/arm/boot/dts/st/stm32429i-eval.dts @@ -57,7 +57,7 @@ / { compatible = "st,stm32429i-eval", "st,stm32f429"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32746g-eval.dts b/arch/arm/boot/dts/st/stm32746g-eval.dts index e9ac37b6eca0..43a52b26fdaa 100644 --- a/arch/arm/boot/dts/st/stm32746g-eval.dts +++ b/arch/arm/boot/dts/st/stm32746g-eval.dts @@ -51,7 +51,7 @@ / { compatible = "st,stm32746g-eval", "st,stm32f746"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32f429-disco.dts b/arch/arm/boot/dts/st/stm32f429-disco.dts index a3cb4aabdd5a..68d822d79988 100644 --- a/arch/arm/boot/dts/st/stm32f429-disco.dts +++ b/arch/arm/boot/dts/st/stm32f429-disco.dts @@ -57,7 +57,7 @@ / { compatible = 
"st,stm32f429i-disco", "st,stm32f429"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32f469-disco.dts b/arch/arm/boot/dts/st/stm32f469-disco.dts index 8a4f8ddd083d..31b4abbc608d 100644 --- a/arch/arm/boot/dts/st/stm32f469-disco.dts +++ b/arch/arm/boot/dts/st/stm32f469-disco.dts @@ -56,7 +56,7 @@ / { compatible = "st,stm32f469i-disco", "st,stm32f469"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32f746-disco.dts b/arch/arm/boot/dts/st/stm32f746-disco.dts index b57dbdce2f40..3cb04547228e 100644 --- a/arch/arm/boot/dts/st/stm32f746-disco.dts +++ b/arch/arm/boot/dts/st/stm32f746-disco.dts @@ -52,7 +52,7 @@ / { compatible = "st,stm32f746-disco", "st,stm32f746"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32f769-disco.dts b/arch/arm/boot/dts/st/stm32f769-disco.dts index 535cfdc4681c..13f96ee0b3de 100644 --- a/arch/arm/boot/dts/st/stm32f769-disco.dts +++ b/arch/arm/boot/dts/st/stm32f769-disco.dts @@ -51,7 +51,7 @@ / { compatible = "st,stm32f769-disco", "st,stm32f769"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32h743i-disco.dts b/arch/arm/boot/dts/st/stm32h743i-disco.dts index 8451a54a9a08..8bdb24fcf0c7 100644 --- a/arch/arm/boot/dts/st/stm32h743i-disco.dts +++ b/arch/arm/boot/dts/st/stm32h743i-disco.dts @@ -49,7 +49,7 @@ / { compatible = "st,stm32h743i-disco", "st,stm32h743"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32h743i-eval.dts b/arch/arm/boot/dts/st/stm32h743i-eval.dts index 4b0ced27b80e..c3de36d94acf 100644 --- a/arch/arm/boot/dts/st/stm32h743i-eval.dts +++ b/arch/arm/boot/dts/st/stm32h743i-eval.dts @@ -49,7 +49,7 @@ / { compatible = 
"st,stm32h743i-eval", "st,stm32h743"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32h747i-disco.dts b/arch/arm/boot/dts/st/stm32h747i-disco.dts index 99f0255dae8e..a57341e2d95c 100644 --- a/arch/arm/boot/dts/st/stm32h747i-disco.dts +++ b/arch/arm/boot/dts/st/stm32h747i-disco.dts @@ -14,7 +14,7 @@ / { compatible = "st,stm32h747i-disco", "st,stm32h747"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:115200n8"; }; diff --git a/arch/arm/boot/dts/st/stm32h750i-art-pi.dts b/arch/arm/boot/dts/st/stm32h750i-art-pi.dts index 56c53e262da7..b4bd8315464c 100644 --- a/arch/arm/boot/dts/st/stm32h750i-art-pi.dts +++ b/arch/arm/boot/dts/st/stm32h750i-art-pi.dts @@ -54,7 +54,7 @@ / { compatible = "st,stm32h750i-art-pi", "st,stm32h750"; chosen { - bootargs = "root=/dev/ram"; + bootargs = ""; stdout-path = "serial0:2000000n8"; }; diff --git a/arch/arm/configs/assabet_defconfig b/arch/arm/configs/assabet_defconfig index 07ab9eaac4af..56fce6c08945 100644 --- a/arch/arm/configs/assabet_defconfig +++ b/arch/arm/configs/assabet_defconfig @@ -5,7 +5,7 @@ CONFIG_ARCH_MULTI_V4=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_SA1100=y CONFIG_SA1100_ASSABET=y -CONFIG_CMDLINE="mem=32M console=ttySA0,38400n8 initrd=0xc0800000,3M root=/dev/ram" +CONFIG_CMDLINE="mem=32M console=ttySA0,38400n8 initrd=0xc0800000,3M" CONFIG_FPE_NWFPE=y CONFIG_PM=y CONFIG_MODULES=y diff --git a/arch/arm/configs/at91_dt_defconfig b/arch/arm/configs/at91_dt_defconfig index ff13e1ecf4bb..b53c7906d317 100644 --- a/arch/arm/configs/at91_dt_defconfig +++ b/arch/arm/configs/at91_dt_defconfig @@ -23,7 +23,7 @@ CONFIG_UACCESS_WITH_MEMCPY=y # CONFIG_ATAGS is not set CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y -CONFIG_CMDLINE="console=ttyS0,115200 initrd=0x21100000,25165824 root=/dev/ram0 rw" +CONFIG_CMDLINE="console=ttyS0,115200 initrd=0x21100000,25165824 rw" CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # 
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig index 77d3521f55d4..02a903816baa 100644 --- a/arch/arm/configs/exynos_defconfig +++ b/arch/arm/configs/exynos_defconfig @@ -15,7 +15,7 @@ CONFIG_HIGHMEM=y CONFIG_SECCOMP=y CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y -CONFIG_CMDLINE="root=/dev/ram0 rw initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M" +CONFIG_CMDLINE="rw initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/arm/configs/lpc32xx_defconfig b/arch/arm/configs/lpc32xx_defconfig index 9afccd76446b..a98d1125b9aa 100644 --- a/arch/arm/configs/lpc32xx_defconfig +++ b/arch/arm/configs/lpc32xx_defconfig @@ -13,7 +13,7 @@ CONFIG_ARCH_LPC32XX=y CONFIG_AEABI=y CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y -CONFIG_CMDLINE="console=ttyS0,115200n81 root=/dev/ram0" +CONFIG_CMDLINE="console=ttyS0,115200n81" CONFIG_CPU_IDLE=y CONFIG_VFP=y CONFIG_JUMP_LABEL=y diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig index 1a80602c1284..0c4b9389d4d6 100644 --- a/arch/arm/configs/pxa_defconfig +++ b/arch/arm/configs/pxa_defconfig @@ -22,7 +22,7 @@ CONFIG_MACH_AKITA=y CONFIG_MACH_BORZOI=y CONFIG_AEABI=y CONFIG_ARCH_FORCE_MAX_ORDER=8 -CONFIG_CMDLINE="root=/dev/ram0 ro" +CONFIG_CMDLINE="ro" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/arm/configs/s3c6400_defconfig b/arch/arm/configs/s3c6400_defconfig index 23635d5b9322..a5018ce274ec 100644 --- a/arch/arm/configs/s3c6400_defconfig +++ b/arch/arm/configs/s3c6400_defconfig @@ -4,7 +4,7 @@ CONFIG_ARCH_MULTI_V6=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_S3C64XX=y CONFIG_MACH_WLF_CRAGG_6410=y -CONFIG_CMDLINE="console=ttySAC0,115200 root=/dev/ram init=/linuxrc initrd=0x51000000,6M" +CONFIG_CMDLINE="console=ttySAC0,115200 
init=/linuxrc initrd=0x51000000,6M" CONFIG_VFP=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y diff --git a/arch/arm/configs/s5pv210_defconfig b/arch/arm/configs/s5pv210_defconfig index 8ec82d9b51e4..485dd5174c62 100644 --- a/arch/arm/configs/s5pv210_defconfig +++ b/arch/arm/configs/s5pv210_defconfig @@ -8,7 +8,7 @@ CONFIG_KALLSYMS_ALL=y CONFIG_ARCH_S5PV210=y CONFIG_VMSPLIT_2G=y CONFIG_ARM_APPENDED_DTB=y -CONFIG_CMDLINE="root=/dev/ram0 rw initrd=0x20800000,8M console=ttySAC1,115200 init=/linuxrc" +CONFIG_CMDLINE="rw initrd=0x20800000,8M console=ttySAC1,115200 init=/linuxrc" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_STAT=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/arm/configs/sama5_defconfig b/arch/arm/configs/sama5_defconfig index 2cad045e1d8d..0463ff84c06c 100644 --- a/arch/arm/configs/sama5_defconfig +++ b/arch/arm/configs/sama5_defconfig @@ -14,7 +14,7 @@ CONFIG_SOC_SAMA5D4=y # CONFIG_ATMEL_CLOCKSOURCE_PIT is not set CONFIG_UACCESS_WITH_MEMCPY=y # CONFIG_ATAGS is not set -CONFIG_CMDLINE="console=ttyS0,115200 initrd=0x21100000,25165824 root=/dev/ram0 rw" +CONFIG_CMDLINE="console=ttyS0,115200 initrd=0x21100000,25165824 rw" CONFIG_VFP=y CONFIG_NEON=y CONFIG_KERNEL_MODE_NEON=y diff --git a/arch/arm/configs/u8500_defconfig b/arch/arm/configs/u8500_defconfig index 0f55815eecb3..510c760b0bc7 100644 --- a/arch/arm/configs/u8500_defconfig +++ b/arch/arm/configs/u8500_defconfig @@ -9,7 +9,7 @@ CONFIG_NR_CPUS=2 CONFIG_HIGHMEM=y CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y -CONFIG_CMDLINE="root=/dev/ram0 console=ttyAMA2,115200n8" +CONFIG_CMDLINE="console=ttyAMA2,115200n8" CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_GOV_ONDEMAND=y CONFIG_CPUFREQ_DT=y diff --git a/arch/parisc/defpalo.conf b/arch/parisc/defpalo.conf index 208ff3b41487..86c9a132cb92 100644 --- a/arch/parisc/defpalo.conf +++ b/arch/parisc/defpalo.conf @@ -12,7 +12,7 @@ # If you want a root ramdisk, use the next 2 lines # (Edit the ramdisk image name!!!!) 
--ramdisk=ram-disk-image-file ---commandline=0/vmlinuz HOME=/ root=/dev/ram initrd=0/ramdisk panic_timeout=60 panic=-1 +--commandline=0/vmlinuz HOME=/ initrd=0/ramdisk panic_timeout=60 panic=-1 # If you want NFS root, use the following command line (Edit the HOSTNAME!!!) #--commandline=0/vmlinuz HOME=/ root=/dev/nfs nfsroot=HOSTNAME ip=bootp diff --git a/arch/s390/boot/ipl_parm.c b/arch/s390/boot/ipl_parm.c index f584d7da29cb..47fc2a7ed551 100644 --- a/arch/s390/boot/ipl_parm.c +++ b/arch/s390/boot/ipl_parm.c @@ -18,7 +18,7 @@ struct parmarea parmarea __section(".parmarea") = { .kernel_version = (unsigned long)kernel_version, .max_command_line_size = COMMAND_LINE_SIZE, - .command_line = "root=/dev/ram0 ro", + .command_line = "ro", }; char __bootdata(early_command_line)[COMMAND_LINE_SIZE]; diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig index f2f9cd9cde50..e8e579160c6b 100644 --- a/arch/xtensa/Kconfig +++ b/arch/xtensa/Kconfig @@ -448,7 +448,7 @@ config CMDLINE_BOOL config CMDLINE string "Initial kernel command string" depends on CMDLINE_BOOL - default "console=ttyS0,38400 root=/dev/ram" + default "console=ttyS0,38400" help On some architectures (EBSA110 and CATS), there is currently no way for the boot loader to pass arguments to the kernel. 
For these diff --git a/arch/xtensa/boot/dts/csp.dts b/arch/xtensa/boot/dts/csp.dts index 885495460f7e..c7e07dd0d7d0 100644 --- a/arch/xtensa/boot/dts/csp.dts +++ b/arch/xtensa/boot/dts/csp.dts @@ -8,7 +8,7 @@ / { interrupt-parent = <&pic>; chosen { - bootargs = "earlycon=cdns,0xfd000000,115200 console=tty0 console=ttyPS0,115200 root=/dev/ram0 rw earlyprintk xilinx_uartps.rx_trigger_level=32 loglevel=8 nohz=off ignore_loglevel"; + bootargs = "earlycon=cdns,0xfd000000,115200 console=tty0 console=ttyPS0,115200 rw earlyprintk xilinx_uartps.rx_trigger_level=32 loglevel=8 nohz=off ignore_loglevel"; }; memory@0 { -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:01 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:01 +0000 Subject: [PATCH RESEND 22/62] doc: remove obsolete mentions of pivot_root In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-23-safinaskar@gmail.com> They refer to initrd, which was removed in previous commits. Signed-off-by: Askar Safin --- Documentation/admin-guide/device-mapper/dm-init.rst | 4 ++-- Documentation/arch/arm/ixp4xx.rst | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Documentation/admin-guide/device-mapper/dm-init.rst b/Documentation/admin-guide/device-mapper/dm-init.rst index 981d6a907699..586bb38d716b 100644 --- a/Documentation/admin-guide/device-mapper/dm-init.rst +++ b/Documentation/admin-guide/device-mapper/dm-init.rst @@ -5,8 +5,8 @@ Early creation of mapped devices It is possible to configure a device-mapper device to act as the root device for your system in two ways. -The first is to build an initial ramdisk which boots to a minimal userspace -which configures the device, then pivot_root(8) in to it. +The first is to build initramfs which boots to a minimal userspace +which configures the device, then switches to it.
The second is to create one or more device-mappers using the module parameter "dm-mod.create=" through the kernel boot command line argument. diff --git a/Documentation/arch/arm/ixp4xx.rst b/Documentation/arch/arm/ixp4xx.rst index 17aafc610908..ac9cb28776c7 100644 --- a/Documentation/arch/arm/ixp4xx.rst +++ b/Documentation/arch/arm/ixp4xx.rst @@ -137,8 +137,8 @@ Intel IXDPG425 Development Platform added. One issue with this board is that the mini-PCI slots only have the 3.3v line connected, so you can't use a PCI to mini-PCI adapter with an E100 card. So to NFS root you need to use either - the CSR or a WiFi card and a ramdisk that BOOTPs and then does - a pivot_root to NFS. + the CSR or a WiFi card and initramfs that BOOTPs and then switches + to NFS. Motorola PrPMC1100 Processor Mezanine Card http://www.fountainsys.com -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:02 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:02 +0000 Subject: [PATCH RESEND 23/62] init: rename __initramfs_{start,size} to __builtin_initramfs_{start,size} In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-24-safinaskar@gmail.com> Rename __initramfs_start to __builtin_initramfs_start and __initramfs_size to __builtin_initramfs_size . 
This is clearer. Signed-off-by: Askar Safin --- arch/x86/tools/relocs.c | 2 +- drivers/acpi/tables.c | 4 ++-- include/asm-generic/vmlinux.lds.h | 6 +++--- include/linux/initrd.h | 4 ++-- init/initramfs.c | 4 +--- usr/initramfs_data.S | 4 ++-- 6 files changed, 11 insertions(+), 13 deletions(-) diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index 5778bc498415..4b4e556f1b52 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -87,7 +87,7 @@ static const char * const sym_regex_kernel[S_NSYMTYPES] = { "__(start|stop)_notes|" "__end_rodata|" "__end_rodata_aligned|" - "__initramfs_start|" + "__builtin_initramfs_start|" "(jiffies|jiffies_64)|" #if ELF_BITS == 64 "__end_rodata_hpage_align|" diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index fa9bb8c8ce95..3160cb7dca00 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -429,8 +429,8 @@ void __init acpi_table_upgrade(void) struct cpio_data file; if (IS_ENABLED(CONFIG_ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD)) { - data = __initramfs_start; - size = __initramfs_size; + data = __builtin_initramfs_start; + size = __builtin_initramfs_size; } else { data = (void *)initrd_start; size = initrd_end - initrd_start; diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index ae2d2359b79e..a6bd2ff46f7e 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -46,8 +46,8 @@ * [_sdata, _edata] is the data section * * Some of the included output section have their own set of constants.
- * Examples are: [__initramfs_start, __initramfs_end] for initramfs and - * [__nosave_begin, __nosave_end] for the nosave data + * Examples are: [__builtin_initramfs_start, __builtin_initramfs_start + __builtin_initramfs_size] + * for initramfs and [__nosave_begin, __nosave_end] for the nosave data */ #include @@ -969,7 +969,7 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG) #ifdef CONFIG_BLK_DEV_INITRD #define INIT_RAM_FS \ . = ALIGN(4); \ - __initramfs_start = .; \ + __builtin_initramfs_start = .; \ KEEP(*(.init.ramfs)) \ . = ALIGN(8); \ KEEP(*(.init.ramfs.info)) diff --git a/include/linux/initrd.h b/include/linux/initrd.h index cc389ef1a738..e49c7166dbb3 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -21,8 +21,8 @@ static inline void wait_for_initramfs(void) {} extern phys_addr_t phys_initrd_start; extern unsigned long phys_initrd_size; -extern char __initramfs_start[]; -extern unsigned long __initramfs_size; +extern char __builtin_initramfs_start[]; +extern unsigned long __builtin_initramfs_size; void console_on_rootfs(void); diff --git a/init/initramfs.c b/init/initramfs.c index 850cb0de873e..2866d7a0afd7 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -597,8 +597,6 @@ static int __init initramfs_async_setup(char *str) } __setup("initramfs_async=", initramfs_async_setup); -extern char __initramfs_start[]; -extern unsigned long __initramfs_size; #include #include @@ -695,7 +693,7 @@ static inline bool kexec_free_initrd(void) static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) { /* Load the built in initramfs */ - char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size); + char *err = unpack_to_rootfs(__builtin_initramfs_start, __builtin_initramfs_size); if (err) panic_show_mem("%s", err); /* Failed to decompress INTERNAL initramfs */ diff --git a/usr/initramfs_data.S b/usr/initramfs_data.S index cd67edc38797..64ca648a80e2 100644 --- a/usr/initramfs_data.S +++ b/usr/initramfs_data.S 
@@ -27,8 +27,8 @@ __irf_start: .incbin "usr/initramfs_inc_data" __irf_end: .section .init.ramfs.info,"a" -.globl __initramfs_size -__initramfs_size: +.globl __builtin_initramfs_size +__builtin_initramfs_size: #ifdef CONFIG_64BIT .quad __irf_end - __irf_start #else -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:03 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:03 +0000 Subject: [PATCH RESEND 24/62] init: remove wrong comment In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-25-safinaskar@gmail.com> This comment is wrong. free_initrd_mem may be called with crashk_end and initrd_end as arguments Signed-off-by: Askar Safin --- include/linux/initrd.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/initrd.h b/include/linux/initrd.h index e49c7166dbb3..4080ba82d4c9 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -6,7 +6,6 @@ /* 1 if it is not an error if initrd_start < memory_start */ extern int initrd_below_start_ok; -/* free_initrd_mem always gets called with the next two as arguments.. */ extern unsigned long initrd_start, initrd_end; extern void free_initrd_mem(unsigned long, unsigned long); -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:04 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:04 +0000 Subject: [PATCH RESEND 25/62] init: rename phys_initrd_{start,size} to phys_external_initramfs_{start,size} In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-26-safinaskar@gmail.com> Rename phys_initrd_start to phys_external_initramfs_start and phys_initrd_size to phys_external_initramfs_size. 
They refer to initramfs, not to initrd Signed-off-by: Askar Safin --- arch/arc/mm/init.c | 8 ++++---- arch/arm/mm/init.c | 8 ++++---- arch/arm64/mm/init.c | 15 ++++++++------- arch/x86/kernel/setup.c | 4 ++-- drivers/firmware/efi/efi.c | 6 +++--- drivers/of/fdt.c | 8 ++++---- include/linux/initrd.h | 4 ++-- init/do_mounts_initrd.c | 8 ++++---- init/initramfs.c | 10 +++++----- 9 files changed, 36 insertions(+), 35 deletions(-) diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c index a73cc94f806e..eb8a616a63c6 100644 --- a/arch/arc/mm/init.c +++ b/arch/arc/mm/init.c @@ -110,10 +110,10 @@ void __init setup_arch_memory(void) __pa(_end) - CONFIG_LINUX_LINK_BASE); #ifdef CONFIG_BLK_DEV_INITRD - if (phys_initrd_size) { - memblock_reserve(phys_initrd_start, phys_initrd_size); - initrd_start = (unsigned long)__va(phys_initrd_start); - initrd_end = initrd_start + phys_initrd_size; + if (phys_external_initramfs_size) { + memblock_reserve(phys_external_initramfs_start, phys_external_initramfs_size); + initrd_start = (unsigned long)__va(phys_external_initramfs_start); + initrd_end = initrd_start + phys_external_initramfs_size; } #endif diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 54bdca025c9f..93f8010b9115 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -55,8 +55,8 @@ static int __init parse_tag_initrd(const struct tag *tag) { pr_warn("ATAG_INITRD is deprecated; " "please update your bootloader.\n"); - phys_initrd_start = __virt_to_phys(tag->u.initrd.start); - phys_initrd_size = tag->u.initrd.size; + phys_external_initramfs_start = __virt_to_phys(tag->u.initrd.start); + phys_external_initramfs_size = tag->u.initrd.size; return 0; } @@ -64,8 +64,8 @@ __tagtable(ATAG_INITRD, parse_tag_initrd); static int __init parse_tag_initrd2(const struct tag *tag) { - phys_initrd_start = tag->u.initrd.start; - phys_initrd_size = tag->u.initrd.size; + phys_external_initramfs_start = tag->u.initrd.start; + phys_external_initramfs_size = tag->u.initrd.size; return 0; 
} diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index ea84a61ed508..da517edcf824 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -246,14 +246,15 @@ void __init arm64_memblock_init(void) memblock_add(__pa_symbol(_text), (u64)(_end - _text)); } - if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) { + if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_external_initramfs_size) { /* * Add back the memory we just removed if it results in the * initrd to become inaccessible via the linear mapping. * Otherwise, this is a no-op */ - u64 base = phys_initrd_start & PAGE_MASK; - u64 size = PAGE_ALIGN(phys_initrd_start + phys_initrd_size) - base; + u64 base = phys_external_initramfs_start & PAGE_MASK; + u64 size = PAGE_ALIGN(phys_external_initramfs_start + + phys_external_initramfs_size) - base; /* * We can only add back the initrd memory if we don't end up @@ -267,7 +268,7 @@ void __init arm64_memblock_init(void) base + size > memblock_start_of_DRAM() + linear_region_size, "initrd not fully accessible via the linear mapping -- please check your bootloader ...\n")) { - phys_initrd_size = 0; + phys_external_initramfs_size = 0; } else { memblock_add(base, size); memblock_clear_nomap(base, size); @@ -280,10 +281,10 @@ void __init arm64_memblock_init(void) * pagetables with memblock. 
*/ memblock_reserve(__pa_symbol(_stext), _end - _stext); - if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) { + if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_external_initramfs_size) { /* the generic initrd code expects virtual addresses */ - initrd_start = __phys_to_virt(phys_initrd_start); - initrd_end = initrd_start + phys_initrd_size; + initrd_start = __phys_to_virt(phys_external_initramfs_start); + initrd_end = initrd_start + phys_external_initramfs_size; } early_init_fdt_scan_reserved_mem(); diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 797c3c9fc75e..e727c7a7f648 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -297,7 +297,7 @@ static u64 __init get_ramdisk_image(void) ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; if (ramdisk_image == 0) - ramdisk_image = phys_initrd_start; + ramdisk_image = phys_external_initramfs_start; return ramdisk_image; } @@ -308,7 +308,7 @@ static u64 __init get_ramdisk_size(void) ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; if (ramdisk_size == 0) - ramdisk_size = phys_initrd_size; + ramdisk_size = phys_external_initramfs_size; return ramdisk_size; } diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 1ce428e2ac8a..7cab72da2ea9 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -808,13 +808,13 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables, } if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && - initrd != EFI_INVALID_TABLE_ADDR && phys_initrd_size == 0) { + initrd != EFI_INVALID_TABLE_ADDR && phys_external_initramfs_size == 0) { struct linux_efi_initrd *tbl; tbl = early_memremap(initrd, sizeof(*tbl)); if (tbl) { - phys_initrd_start = tbl->base; - phys_initrd_size = tbl->size; + phys_external_initramfs_start = tbl->base; + phys_external_initramfs_size = tbl->size; early_memunmap(tbl, sizeof(*tbl)); } } diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index 0edd639898a6..9c4c9be948c5 100644 
--- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -760,8 +760,8 @@ static void __early_init_dt_declare_initrd(unsigned long start, { /* * __va() is not yet available this early on some platforms. In that - * case, the platform uses phys_initrd_start/phys_initrd_size instead - * and does the VA conversion itself. + * case, the platform uses phys_external_initramfs_start/phys_external_initramfs_size + * instead and does the VA conversion itself. */ if (!IS_ENABLED(CONFIG_ARM64) && !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) { @@ -799,8 +799,8 @@ static void __init early_init_dt_check_for_initrd(unsigned long node) return; __early_init_dt_declare_initrd(start, end); - phys_initrd_start = start; - phys_initrd_size = end - start; + phys_external_initramfs_start = start; + phys_external_initramfs_size = end - start; pr_debug("initrd_start=0x%llx initrd_end=0x%llx\n", start, end); } diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 4080ba82d4c9..23c08e88234c 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -17,8 +17,8 @@ static inline void __init reserve_initrd_mem(void) {} static inline void wait_for_initramfs(void) {} #endif -extern phys_addr_t phys_initrd_start; -extern unsigned long phys_initrd_size; +extern phys_addr_t phys_external_initramfs_start; +extern unsigned long phys_external_initramfs_size; extern char __builtin_initramfs_start[]; extern unsigned long __builtin_initramfs_size; diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index d5264e9a52e0..444182a76999 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -15,8 +15,8 @@ unsigned long initrd_start, initrd_end; int initrd_below_start_ok; -phys_addr_t phys_initrd_start __initdata; -unsigned long phys_initrd_size __initdata; +phys_addr_t phys_external_initramfs_start __initdata; +unsigned long phys_external_initramfs_size __initdata; static int __init early_initrdmem(char *p) { @@ -28,8 +28,8 @@ static int __init 
early_initrdmem(char *p) if (*endp == ',') { size = memparse(endp + 1, NULL); - phys_initrd_start = start; - phys_initrd_size = size; + phys_external_initramfs_start = start; + phys_external_initramfs_size = size; } return 0; } diff --git a/init/initramfs.c b/init/initramfs.c index 2866d7a0afd7..6abe0a3ca4ce 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -610,7 +610,7 @@ void __init reserve_initrd_mem(void) /* Ignore the virtul address computed during device tree parsing */ initrd_start = initrd_end = 0; - if (!phys_initrd_size) + if (!phys_external_initramfs_size) return; /* * Round the memory region to page boundaries as per free_initrd_mem() @@ -618,8 +618,8 @@ void __init reserve_initrd_mem(void) * are in use, but more importantly, reserves the entire set of pages * as we don't want these pages allocated for other purposes. */ - start = round_down(phys_initrd_start, PAGE_SIZE); - size = phys_initrd_size + (phys_initrd_start - start); + start = round_down(phys_external_initramfs_start, PAGE_SIZE); + size = phys_external_initramfs_size + (phys_external_initramfs_start - start); size = round_up(size, PAGE_SIZE); if (!memblock_is_region_memory(start, size)) { @@ -636,8 +636,8 @@ void __init reserve_initrd_mem(void) memblock_reserve(start, size); /* Now convert initrd to virtual addresses */ - initrd_start = (unsigned long)__va(phys_initrd_start); - initrd_end = initrd_start + phys_initrd_size; + initrd_start = (unsigned long)__va(phys_external_initramfs_start); + initrd_end = initrd_start + phys_external_initramfs_size; initrd_below_start_ok = 1; return; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:05 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:05 +0000 Subject: [PATCH RESEND 26/62] init: move phys_external_initramfs_{start,size} to init/initramfs.c In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: 
<20250913003842.41944-27-safinaskar@gmail.com> Move definitions of phys_external_initramfs_start and phys_external_initramfs_size to init/initramfs.c Signed-off-by: Askar Safin --- init/do_mounts_initrd.c | 3 --- init/initramfs.c | 3 +++ 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 444182a76999..06be76aa602c 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -15,9 +15,6 @@ unsigned long initrd_start, initrd_end; int initrd_below_start_ok; -phys_addr_t phys_external_initramfs_start __initdata; -unsigned long phys_external_initramfs_size __initdata; - static int __init early_initrdmem(char *p) { phys_addr_t start; diff --git a/init/initramfs.c b/init/initramfs.c index 6abe0a3ca4ce..5242d851e839 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -600,6 +600,9 @@ __setup("initramfs_async=", initramfs_async_setup); #include #include +phys_addr_t phys_external_initramfs_start __initdata; +unsigned long phys_external_initramfs_size __initdata; + static BIN_ATTR(initrd, 0440, sysfs_bin_attr_simple_read, NULL, 0); void __init reserve_initrd_mem(void) -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:06 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:06 +0000 Subject: [PATCH RESEND 27/62] init: alpha: remove "extern unsigned long initrd_start, initrd_end" In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-28-safinaskar@gmail.com> These variables already declared in , which is included Signed-off-by: Askar Safin --- arch/alpha/kernel/core_irongate.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/alpha/kernel/core_irongate.c b/arch/alpha/kernel/core_irongate.c index 05dc4c1b9074..3411564144ae 100644 --- a/arch/alpha/kernel/core_irongate.c +++ b/arch/alpha/kernel/core_irongate.c @@ -225,8 +225,6 @@ albacore_init_arch(void) 
 	alpha_mv.min_mem_address = pci_mem;
 	if (memtop > pci_mem) {
 #ifdef CONFIG_BLK_DEV_INITRD
-		extern unsigned long initrd_start, initrd_end;
-
 		/* Move the initrd out of the way. */
 		if (initrd_end && __pa(initrd_end) > pci_mem) {
 			unsigned long size;
-- 
2.47.2

From safinaskar at gmail.com Fri Sep 12 17:38:07 2025
From: safinaskar at gmail.com (Askar Safin)
Date: Sat, 13 Sep 2025 00:38:07 +0000
Subject: [PATCH RESEND 28/62] init: alpha, arc, arm, arm64, csky, m68k, microblaze, mips, nios2, openrisc, parisc, powerpc, s390, sh, sparc, um, x86, xtensa: rename initrd_{start,end} to virt_external_initramfs_{start,end}
In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com>
References: <20250913003842.41944-1-safinaskar@gmail.com>
Message-ID: <20250913003842.41944-29-safinaskar@gmail.com>

Rename initrd_start to virt_external_initramfs_start and
initrd_end to virt_external_initramfs_end.
They refer to initramfs, not to initrd

Signed-off-by: Askar Safin
---
 arch/alpha/kernel/core_irongate.c       |  6 ++--
 arch/alpha/kernel/setup.c               | 24 +++++++-------
 arch/arc/mm/init.c                      |  4 +--
 arch/arm/mm/init.c                      |  4 +--
 arch/arm64/mm/init.c                    |  4 +--
 arch/csky/kernel/setup.c                | 16 ++++-----
 arch/m68k/kernel/setup_mm.c             |  6 ++--
 arch/m68k/kernel/setup_no.c             |  6 ++--
 arch/m68k/kernel/uboot.c                |  6 ++--
 arch/microblaze/mm/init.c               |  6 ++--
 arch/mips/ath79/prom.c                  |  8 ++---
 arch/mips/kernel/setup.c                | 44 ++++++++++++-------------
 arch/mips/sibyte/common/cfe.c           | 22 ++++++-------
 arch/nios2/kernel/setup.c               | 10 +++---
 arch/openrisc/kernel/setup.c            | 14 ++++----
 arch/parisc/kernel/pdt.c                |  2 +-
 arch/parisc/kernel/setup.c              |  4 +--
 arch/parisc/mm/init.c                   | 24 +++++++-------
 arch/powerpc/kernel/prom.c              | 14 ++++----
 arch/powerpc/kernel/setup-common.c      | 14 ++++----
 arch/powerpc/platforms/powermac/setup.c |  2 +-
 arch/s390/kernel/setup.c                |  4 +--
 arch/sh/kernel/setup.c                  |  8 ++---
 arch/sparc/mm/init_32.c                 | 18 +++++-----
 arch/sparc/mm/init_64.c                 | 14 ++++----
 arch/um/kernel/initrd.c                 |  4 +--
 arch/x86/kernel/cpu/microcode/core.c    |  8
++--- arch/x86/kernel/setup.c | 12 +++---- arch/xtensa/kernel/setup.c | 14 ++++---- drivers/acpi/tables.c | 4 +-- drivers/of/fdt.c | 4 +-- include/linux/initrd.h | 4 +-- init/do_mounts_initrd.c | 2 +- init/initramfs.c | 40 +++++++++++----------- init/main.c | 18 +++++----- 35 files changed, 197 insertions(+), 197 deletions(-) diff --git a/arch/alpha/kernel/core_irongate.c b/arch/alpha/kernel/core_irongate.c index 3411564144ae..5519bb8fc6f2 100644 --- a/arch/alpha/kernel/core_irongate.c +++ b/arch/alpha/kernel/core_irongate.c @@ -226,11 +226,11 @@ albacore_init_arch(void) if (memtop > pci_mem) { #ifdef CONFIG_BLK_DEV_INITRD /* Move the initrd out of the way. */ - if (initrd_end && __pa(initrd_end) > pci_mem) { + if (virt_external_initramfs_end && __pa(virt_external_initramfs_end) > pci_mem) { unsigned long size; - size = initrd_end - initrd_start; - memblock_free((void *)initrd_start, PAGE_ALIGN(size)); + size = virt_external_initramfs_end - virt_external_initramfs_start; + memblock_free((void *)virt_external_initramfs_start, PAGE_ALIGN(size)); if (!move_initrd(pci_mem)) printk("irongate_init_arch: initrd too big " "(%ldK)\ndisabling initrd\n", diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c index bebdffafaee8..a344e71b2d2a 100644 --- a/arch/alpha/kernel/setup.c +++ b/arch/alpha/kernel/setup.c @@ -268,15 +268,15 @@ move_initrd(unsigned long mem_limit) void *start; unsigned long size; - size = initrd_end - initrd_start; + size = virt_external_initramfs_end - virt_external_initramfs_start; start = memblock_alloc(PAGE_ALIGN(size), PAGE_SIZE); if (!start || __pa(start) + size > mem_limit) { - initrd_start = initrd_end = 0; + virt_external_initramfs_start = virt_external_initramfs_end = 0; return NULL; } - memmove(start, (void *)initrd_start, size); - initrd_start = (unsigned long)start; - initrd_end = initrd_start + size; + memmove(start, (void *)virt_external_initramfs_start, size); + virt_external_initramfs_start = (unsigned long)start; + 
virt_external_initramfs_end = virt_external_initramfs_start + size; printk("initrd moved to %p\n", start); return start; } @@ -347,20 +347,20 @@ setup_memory(void *kernel_end) memblock_reserve(KERNEL_START_PHYS, kernel_size); #ifdef CONFIG_BLK_DEV_INITRD - initrd_start = INITRD_START; - if (initrd_start) { - initrd_end = initrd_start+INITRD_SIZE; + virt_external_initramfs_start = INITRD_START; + if (virt_external_initramfs_start) { + virt_external_initramfs_end = virt_external_initramfs_start+INITRD_SIZE; printk("Initial ramdisk at: 0x%p (%lu bytes)\n", - (void *) initrd_start, INITRD_SIZE); + (void *) virt_external_initramfs_start, INITRD_SIZE); - if ((void *)initrd_end > phys_to_virt(PFN_PHYS(max_low_pfn))) { + if ((void *)virt_external_initramfs_end > phys_to_virt(PFN_PHYS(max_low_pfn))) { if (!move_initrd(PFN_PHYS(max_low_pfn))) printk("initrd extends beyond end of memory " "(0x%08lx > 0x%p)\ndisabling initrd\n", - initrd_end, + virt_external_initramfs_end, phys_to_virt(PFN_PHYS(max_low_pfn))); } else { - memblock_reserve(virt_to_phys((void *)initrd_start), + memblock_reserve(virt_to_phys((void *)virt_external_initramfs_start), INITRD_SIZE); } } diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c index eb8a616a63c6..1e098d7fc6af 100644 --- a/arch/arc/mm/init.c +++ b/arch/arc/mm/init.c @@ -112,8 +112,8 @@ void __init setup_arch_memory(void) #ifdef CONFIG_BLK_DEV_INITRD if (phys_external_initramfs_size) { memblock_reserve(phys_external_initramfs_start, phys_external_initramfs_size); - initrd_start = (unsigned long)__va(phys_external_initramfs_start); - initrd_end = initrd_start + phys_external_initramfs_size; + virt_external_initramfs_start = (unsigned long)__va(phys_external_initramfs_start); + virt_external_initramfs_end = virt_external_initramfs_start + phys_external_initramfs_size; } #endif diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 93f8010b9115..4faeec51c522 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -439,9 +439,9 @@ void 
free_initmem(void) #ifdef CONFIG_BLK_DEV_INITRD void free_initrd_mem(unsigned long start, unsigned long end) { - if (start == initrd_start) + if (start == virt_external_initramfs_start) start = round_down(start, PAGE_SIZE); - if (end == initrd_end) + if (end == virt_external_initramfs_end) end = round_up(end, PAGE_SIZE); poison_init_mem((void *)start, PAGE_ALIGN(end) - start); diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index da517edcf824..3414e48c8c82 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -283,8 +283,8 @@ void __init arm64_memblock_init(void) memblock_reserve(__pa_symbol(_stext), _end - _stext); if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_external_initramfs_size) { /* the generic initrd code expects virtual addresses */ - initrd_start = __phys_to_virt(phys_external_initramfs_start); - initrd_end = initrd_start + phys_external_initramfs_size; + virt_external_initramfs_start = __phys_to_virt(phys_external_initramfs_start); + virt_external_initramfs_end = virt_external_initramfs_start + phys_external_initramfs_size; } early_init_fdt_scan_reserved_mem(); diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c index e0d6ca86ea8c..ce128888462e 100644 --- a/arch/csky/kernel/setup.c +++ b/arch/csky/kernel/setup.c @@ -17,35 +17,35 @@ static void __init setup_initrd(void) { unsigned long size; - if (initrd_start >= initrd_end) { + if (virt_external_initramfs_start >= virt_external_initramfs_end) { pr_err("initrd not found or empty"); goto disable; } - if (__pa(initrd_end) > PFN_PHYS(max_low_pfn)) { + if (__pa(virt_external_initramfs_end) > PFN_PHYS(max_low_pfn)) { pr_err("initrd extends beyond end of memory"); goto disable; } - size = initrd_end - initrd_start; + size = virt_external_initramfs_end - virt_external_initramfs_start; - if (memblock_is_region_reserved(__pa(initrd_start), size)) { + if (memblock_is_region_reserved(__pa(virt_external_initramfs_start), size)) { pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory 
region", - __pa(initrd_start), size); + __pa(virt_external_initramfs_start), size); goto disable; } - memblock_reserve(__pa(initrd_start), size); + memblock_reserve(__pa(virt_external_initramfs_start), size); pr_info("Initial ramdisk at: 0x%p (%lu bytes)\n", - (void *)(initrd_start), size); + (void *)(virt_external_initramfs_start), size); initrd_below_start_ok = 1; return; disable: - initrd_start = initrd_end = 0; + virt_external_initramfs_start = virt_external_initramfs_end = 0; pr_err(" - disabling initrd\n"); } diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c index c7e8de0d34bb..80f0544c1041 100644 --- a/arch/m68k/kernel/setup_mm.c +++ b/arch/m68k/kernel/setup_mm.c @@ -333,9 +333,9 @@ void __init setup_arch(char **cmdline_p) paging_init(); if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && m68k_ramdisk.size) { - initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr); - initrd_end = initrd_start + m68k_ramdisk.size; - pr_info("initrd: %08lx - %08lx\n", initrd_start, initrd_end); + virt_external_initramfs_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr); + virt_external_initramfs_end = virt_external_initramfs_start + m68k_ramdisk.size; + pr_info("initrd: %08lx - %08lx\n", virt_external_initramfs_start, virt_external_initramfs_end); } #ifdef CONFIG_NATFEAT diff --git a/arch/m68k/kernel/setup_no.c b/arch/m68k/kernel/setup_no.c index f724875b15cc..4d98e0063725 100644 --- a/arch/m68k/kernel/setup_no.c +++ b/arch/m68k/kernel/setup_no.c @@ -155,9 +155,9 @@ void __init setup_arch(char **cmdline_p) max_pfn = max_low_pfn = PFN_DOWN(memory_end); #if defined(CONFIG_UBOOT) && defined(CONFIG_BLK_DEV_INITRD) - if ((initrd_start > 0) && (initrd_start < initrd_end) && - (initrd_end < memory_end)) - memblock_reserve(initrd_start, initrd_end - initrd_start); + if ((virt_external_initramfs_start > 0) && (virt_external_initramfs_start < virt_external_initramfs_end) && + (virt_external_initramfs_end < memory_end)) + 
memblock_reserve(virt_external_initramfs_start, virt_external_initramfs_end - virt_external_initramfs_start); #endif /* if defined(CONFIG_BLK_DEV_INITRD) */ /* diff --git a/arch/m68k/kernel/uboot.c b/arch/m68k/kernel/uboot.c index d278060a250c..5fc831a0794a 100644 --- a/arch/m68k/kernel/uboot.c +++ b/arch/m68k/kernel/uboot.c @@ -81,9 +81,9 @@ static void __init parse_uboot_commandline(char *commandp, int size) if (uboot_initrd_start && uboot_initrd_end && (uboot_initrd_end > uboot_initrd_start)) { - initrd_start = uboot_initrd_start; - initrd_end = uboot_initrd_end; - pr_info("initrd at 0x%lx:0x%lx\n", initrd_start, initrd_end); + virt_external_initramfs_start = uboot_initrd_start; + virt_external_initramfs_end = uboot_initrd_end; + pr_info("initrd at 0x%lx:0x%lx\n", virt_external_initramfs_start, virt_external_initramfs_end); } #endif /* if defined(CONFIG_BLK_DEV_INITRD) */ } diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c index 31d475cdb1c5..fabeca49c2c6 100644 --- a/arch/microblaze/mm/init.c +++ b/arch/microblaze/mm/init.c @@ -202,10 +202,10 @@ asmlinkage void __init mmu_init(void) #if defined(CONFIG_BLK_DEV_INITRD) /* Remove the init RAM disk from the available memory. 
*/ - if (initrd_start) { + if (virt_external_initramfs_start) { unsigned long size; - size = initrd_end - initrd_start; - memblock_reserve(__virt_to_phys(initrd_start), size); + size = virt_external_initramfs_end - virt_external_initramfs_start; + memblock_reserve(__virt_to_phys(virt_external_initramfs_start), size); } #endif /* CONFIG_BLK_DEV_INITRD */ diff --git a/arch/mips/ath79/prom.c b/arch/mips/ath79/prom.c index cc6dc5600677..506dcada711b 100644 --- a/arch/mips/ath79/prom.c +++ b/arch/mips/ath79/prom.c @@ -25,10 +25,10 @@ void __init prom_init(void) #ifdef CONFIG_BLK_DEV_INITRD /* Read the initrd address from the firmware environment */ - initrd_start = fw_getenvl("initrd_start"); - if (initrd_start) { - initrd_start = KSEG0ADDR(initrd_start); - initrd_end = initrd_start + fw_getenvl("initrd_size"); + virt_external_initramfs_start = fw_getenvl("initrd_start"); + if (virt_external_initramfs_start) { + virt_external_initramfs_start = KSEG0ADDR(virt_external_initramfs_start); + virt_external_initramfs_end = virt_external_initramfs_start + fw_getenvl("initrd_size"); } #endif } diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index a78e24873231..da11ae875539 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -126,15 +126,15 @@ static int __init rd_start_early(char *p) if (start < XKPHYS) start = (int)start; #endif - initrd_start = start; - initrd_end += start; + virt_external_initramfs_start = start; + virt_external_initramfs_end += start; return 0; } early_param("rd_start", rd_start_early); static int __init rd_size_early(char *p) { - initrd_end += memparse(p, &p); + virt_external_initramfs_end += memparse(p, &p); return 0; } early_param("rd_size", rd_size_early); @@ -146,13 +146,13 @@ static unsigned long __init init_initrd(void) /* * Board specific code or command line parser should have - * already set up initrd_start and initrd_end. 
In these cases + * already set up virt_external_initramfs_start and virt_external_initramfs_end. In these cases * perform sanity checks and use them if all looks good. */ - if (!initrd_start || initrd_end <= initrd_start) + if (!virt_external_initramfs_start || virt_external_initramfs_end <= virt_external_initramfs_start) goto disable; - if (initrd_start & ~PAGE_MASK) { + if (virt_external_initramfs_start & ~PAGE_MASK) { pr_err("initrd start must be page aligned\n"); goto disable; } @@ -164,19 +164,19 @@ static unsigned long __init init_initrd(void) * 32-bit. We need also to switch from KSEG0 to XKPHYS * addresses now, so the code can now safely use __pa(). */ - end = __pa(initrd_end); - initrd_end = (unsigned long)__va(end); - initrd_start = (unsigned long)__va(__pa(initrd_start)); + end = __pa(virt_external_initramfs_end); + virt_external_initramfs_end = (unsigned long)__va(end); + virt_external_initramfs_start = (unsigned long)__va(__pa(virt_external_initramfs_start)); - if (initrd_start < PAGE_OFFSET) { + if (virt_external_initramfs_start < PAGE_OFFSET) { pr_err("initrd start < PAGE_OFFSET\n"); goto disable; } return PFN_UP(end); disable: - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; return 0; } @@ -189,21 +189,21 @@ static void __init maybe_bswap_initrd(void) u64 buf; /* Check for CPIO signature */ - if (!memcmp((void *)initrd_start, "070701", 6)) + if (!memcmp((void *)virt_external_initramfs_start, "070701", 6)) return; /* Check for compressed initrd */ - if (decompress_method((unsigned char *)initrd_start, 8, NULL)) + if (decompress_method((unsigned char *)virt_external_initramfs_start, 8, NULL)) return; /* Try again with a byte swapped header */ - buf = swab64p((u64 *)initrd_start); + buf = swab64p((u64 *)virt_external_initramfs_start); if (!memcmp(&buf, "070701", 6) || decompress_method((unsigned char *)(&buf), 8, NULL)) { unsigned long i; pr_info("Byteswapped initrd detected\n"); - for (i = 
initrd_start; i < ALIGN(initrd_end, 8); i += 8) + for (i = virt_external_initramfs_start; i < ALIGN(virt_external_initramfs_end, 8); i += 8) swab64s((u64 *)i); } #endif @@ -211,29 +211,29 @@ static void __init maybe_bswap_initrd(void) static void __init finalize_initrd(void) { - unsigned long size = initrd_end - initrd_start; + unsigned long size = virt_external_initramfs_end - virt_external_initramfs_start; if (size == 0) { printk(KERN_INFO "Initrd not found or empty"); goto disable; } - if (__pa(initrd_end) > PFN_PHYS(max_low_pfn)) { + if (__pa(virt_external_initramfs_end) > PFN_PHYS(max_low_pfn)) { printk(KERN_ERR "Initrd extends beyond end of memory"); goto disable; } maybe_bswap_initrd(); - memblock_reserve(__pa(initrd_start), size); + memblock_reserve(__pa(virt_external_initramfs_start), size); initrd_below_start_ok = 1; pr_info("Initial ramdisk at: 0x%lx (%lu bytes)\n", - initrd_start, size); + virt_external_initramfs_start, size); return; disable: printk(KERN_CONT " - disabling initrd\n"); - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; } #else /* !CONFIG_BLK_DEV_INITRD */ diff --git a/arch/mips/sibyte/common/cfe.c b/arch/mips/sibyte/common/cfe.c index 2cb90dbbe843..642b7d615594 100644 --- a/arch/mips/sibyte/common/cfe.c +++ b/arch/mips/sibyte/common/cfe.c @@ -38,7 +38,7 @@ int cfe_cons_handle; #ifdef CONFIG_BLK_DEV_INITRD -extern unsigned long initrd_start, initrd_end; +extern unsigned long virt_external_initramfs_start, virt_external_initramfs_end; #endif static void __noreturn cfe_linux_exit(void *arg) @@ -86,9 +86,9 @@ static __init void prom_meminit(void) unsigned long initrd_pstart; unsigned long initrd_pend; - initrd_pstart = CPHYSADDR(initrd_start); - initrd_pend = CPHYSADDR(initrd_end); - if (initrd_start && + initrd_pstart = CPHYSADDR(virt_external_initramfs_start); + initrd_pend = CPHYSADDR(virt_external_initramfs_end); + if (virt_external_initramfs_start && ((initrd_pstart > 
MAX_RAM_SIZE) || (initrd_pend > MAX_RAM_SIZE))) { panic("initrd out of addressable memory"); @@ -105,7 +105,7 @@ static __init void prom_meminit(void) * ramdisk */ #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start) { + if (virt_external_initramfs_start) { if ((initrd_pstart > addr) && (initrd_pstart < (addr + size))) { memblock_add(addr, @@ -139,7 +139,7 @@ static __init void prom_meminit(void) } } #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start) { + if (virt_external_initramfs_start) { memblock_add(initrd_pstart, initrd_pend - initrd_pstart); memblock_reserve(initrd_pstart, initrd_pend - initrd_pstart); } @@ -183,17 +183,17 @@ static int __init initrd_setup(char *str) goto fail; } *(tmp-1) = '@'; - initrd_start = simple_strtoul(tmp, &endptr, 16); + virt_external_initramfs_start = simple_strtoul(tmp, &endptr, 16); if (*endptr) { goto fail; } - initrd_end = initrd_start + initrd_size; - printk("Found initrd of %lx@%lx\n", initrd_size, initrd_start); + virt_external_initramfs_end = virt_external_initramfs_start + initrd_size; + printk("Found initrd of %lx@%lx\n", initrd_size, virt_external_initramfs_start); return 1; fail: printk("Bad initrd argument. 
Disabling initrd\n"); - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; return 1; } diff --git a/arch/nios2/kernel/setup.c b/arch/nios2/kernel/setup.c index 2a40150142c3..3cc44fa4931c 100644 --- a/arch/nios2/kernel/setup.c +++ b/arch/nios2/kernel/setup.c @@ -109,8 +109,8 @@ asmlinkage void __init nios2_boot_init(unsigned r4, unsigned r5, unsigned r6, if (r4 == 0x534f494e) { /* r4 is magic NIOS */ #if defined(CONFIG_BLK_DEV_INITRD) if (r5) { /* initramfs */ - initrd_start = r5; - initrd_end = r6; + virt_external_initramfs_start = r5; + virt_external_initramfs_end = r6; } #endif /* CONFIG_BLK_DEV_INITRD */ dtb_passed = r6; @@ -161,9 +161,9 @@ void __init setup_arch(char **cmdline_p) memblock_reserve(__pa_symbol(_stext), _end - _stext); #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start) { - memblock_reserve(virt_to_phys((void *)initrd_start), - initrd_end - initrd_start); + if (virt_external_initramfs_start) { + memblock_reserve(virt_to_phys((void *)virt_external_initramfs_start), + virt_external_initramfs_end - virt_external_initramfs_start); } #endif /* CONFIG_BLK_DEV_INITRD */ diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c index a9fb9cc6779e..f387dc57ec35 100644 --- a/arch/openrisc/kernel/setup.c +++ b/arch/openrisc/kernel/setup.c @@ -77,9 +77,9 @@ static void __init setup_memory(void) #ifdef CONFIG_BLK_DEV_INITRD /* Then reserve the initrd, if any */ - if (initrd_start && (initrd_end > initrd_start)) { - unsigned long aligned_start = ALIGN_DOWN(initrd_start, PAGE_SIZE); - unsigned long aligned_end = ALIGN(initrd_end, PAGE_SIZE); + if (virt_external_initramfs_start && (virt_external_initramfs_end > virt_external_initramfs_start)) { + unsigned long aligned_start = ALIGN_DOWN(virt_external_initramfs_start, PAGE_SIZE); + unsigned long aligned_end = ALIGN(virt_external_initramfs_end, PAGE_SIZE); memblock_reserve(__pa(aligned_start), aligned_end - aligned_start); } @@ -239,13 +239,13 @@ 
void __init setup_arch(char **cmdline_p) setup_initial_init_mm(_stext, _etext, _edata, _end); #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start == initrd_end) { + if (virt_external_initramfs_start == virt_external_initramfs_end) { printk(KERN_INFO "Initial ramdisk not found\n"); - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; } else { printk(KERN_INFO "Initial ramdisk at: 0x%p (%lu bytes)\n", - (void *)(initrd_start), initrd_end - initrd_start); + (void *)(virt_external_initramfs_start), virt_external_initramfs_end - virt_external_initramfs_start); initrd_below_start_ok = 1; } #endif diff --git a/arch/parisc/kernel/pdt.c b/arch/parisc/kernel/pdt.c index b70b67adb855..3715a3b088a7 100644 --- a/arch/parisc/kernel/pdt.c +++ b/arch/parisc/kernel/pdt.c @@ -229,7 +229,7 @@ void __init pdc_pdt_init(void) addr = pdt_entry[i] & PDT_ADDR_PHYS_MASK; if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && - addr >= initrd_start && addr < initrd_end) + addr >= virt_external_initramfs_start && addr < virt_external_initramfs_end) pr_crit("CRITICAL: initrd possibly broken " "due to bad memory!\n"); diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c index ace483b6f19a..41f45fa177d0 100644 --- a/arch/parisc/kernel/setup.c +++ b/arch/parisc/kernel/setup.c @@ -71,8 +71,8 @@ static void __init setup_cmdline(char **cmdline_p) #ifdef CONFIG_BLK_DEV_INITRD /* did palo pass us a ramdisk? 
*/ if (boot_args[2] != 0) { - initrd_start = (unsigned long)__va(boot_args[2]); - initrd_end = (unsigned long)__va(boot_args[3]); + virt_external_initramfs_start = (unsigned long)__va(boot_args[2]); + virt_external_initramfs_end = (unsigned long)__va(boot_args[3]); } #endif diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c index 14270715d754..74bfe9797589 100644 --- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -298,20 +298,20 @@ static void __init setup_bootmem(void) #endif #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start) { - printk(KERN_INFO "initrd: %08lx-%08lx\n", initrd_start, initrd_end); - if (__pa(initrd_start) < mem_max) { + if (virt_external_initramfs_start) { + printk(KERN_INFO "initrd: %08lx-%08lx\n", virt_external_initramfs_start, virt_external_initramfs_end); + if (__pa(virt_external_initramfs_start) < mem_max) { unsigned long initrd_reserve; - if (__pa(initrd_end) > mem_max) { - initrd_reserve = mem_max - __pa(initrd_start); + if (__pa(virt_external_initramfs_end) > mem_max) { + initrd_reserve = mem_max - __pa(virt_external_initramfs_start); } else { - initrd_reserve = initrd_end - initrd_start; + initrd_reserve = virt_external_initramfs_end - virt_external_initramfs_start; } initrd_below_start_ok = 1; - printk(KERN_INFO "initrd: reserving %08lx-%08lx (mem_max %08lx)\n", __pa(initrd_start), __pa(initrd_start) + initrd_reserve, mem_max); + printk(KERN_INFO "initrd: reserving %08lx-%08lx (mem_max %08lx)\n", __pa(virt_external_initramfs_start), __pa(virt_external_initramfs_start) + initrd_reserve, mem_max); - memblock_reserve(__pa(initrd_start), initrd_reserve); + memblock_reserve(__pa(virt_external_initramfs_start), initrd_reserve); } } #endif @@ -633,10 +633,10 @@ static void __init pagetable_init(void) } #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_end && initrd_end > mem_limit) { - printk(KERN_INFO "initrd: mapping %08lx-%08lx\n", initrd_start, initrd_end); - map_pages(initrd_start, __pa(initrd_start), - initrd_end - initrd_start, 
PAGE_KERNEL, 0); + if (virt_external_initramfs_end && virt_external_initramfs_end > mem_limit) { + printk(KERN_INFO "initrd: mapping %08lx-%08lx\n", virt_external_initramfs_start, virt_external_initramfs_end); + map_pages(virt_external_initramfs_start, __pa(virt_external_initramfs_start), + virt_external_initramfs_end - virt_external_initramfs_start, PAGE_KERNEL, 0); } #endif diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 9ed9dde7d231..b7858b0bd697 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -97,11 +97,11 @@ early_param("mem", early_parse_mem); static inline int overlaps_initrd(unsigned long start, unsigned long size) { #ifdef CONFIG_BLK_DEV_INITRD - if (!initrd_start) + if (!virt_external_initramfs_start) return 0; - return (start + size) > ALIGN_DOWN(initrd_start, PAGE_SIZE) && - start <= ALIGN(initrd_end, PAGE_SIZE); + return (start + size) > ALIGN_DOWN(virt_external_initramfs_start, PAGE_SIZE) && + start <= ALIGN(virt_external_initramfs_end, PAGE_SIZE); #else return 0; #endif @@ -686,10 +686,10 @@ static void __init early_reserve_mem(void) #ifdef CONFIG_BLK_DEV_INITRD /* Then reserve the initrd, if any */ - if (initrd_start && (initrd_end > initrd_start)) { - memblock_reserve(ALIGN_DOWN(__pa(initrd_start), PAGE_SIZE), - ALIGN(initrd_end, PAGE_SIZE) - - ALIGN_DOWN(initrd_start, PAGE_SIZE)); + if (virt_external_initramfs_start && (virt_external_initramfs_end > virt_external_initramfs_start)) { + memblock_reserve(ALIGN_DOWN(__pa(virt_external_initramfs_start), PAGE_SIZE), + ALIGN(virt_external_initramfs_end, PAGE_SIZE) - + ALIGN_DOWN(virt_external_initramfs_start, PAGE_SIZE)); } #endif /* CONFIG_BLK_DEV_INITRD */ diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 97d330f3b8f1..eff369cba0e5 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -360,17 +360,17 @@ const struct seq_operations cpuinfo_op = { void __init 
check_for_initrd(void) { #ifdef CONFIG_BLK_DEV_INITRD - DBG(" -> check_for_initrd() initrd_start=0x%lx initrd_end=0x%lx\n", - initrd_start, initrd_end); + DBG(" -> check_for_initrd() virt_external_initramfs_start=0x%lx virt_external_initramfs_end=0x%lx\n", + virt_external_initramfs_start, virt_external_initramfs_end); /* If we were not passed an sensible initramfs, clear initramfs reference. */ - if (!(is_kernel_addr(initrd_start) && is_kernel_addr(initrd_end) && - initrd_end > initrd_start)) - initrd_start = initrd_end = 0; + if (!(is_kernel_addr(virt_external_initramfs_start) && is_kernel_addr(virt_external_initramfs_end) && + virt_external_initramfs_end > virt_external_initramfs_start)) + virt_external_initramfs_start = virt_external_initramfs_end = 0; - if (initrd_start) - pr_info("Found initramfs at 0x%lx:0x%lx\n", initrd_start, initrd_end); + if (virt_external_initramfs_start) + pr_info("Found initramfs at 0x%lx:0x%lx\n", virt_external_initramfs_start, virt_external_initramfs_end); DBG(" <- check_for_initrd()\n"); #endif /* CONFIG_BLK_DEV_INITRD */ diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c index 237d8386a3f4..4c3b9ed5428d 100644 --- a/arch/powerpc/platforms/powermac/setup.c +++ b/arch/powerpc/platforms/powermac/setup.c @@ -296,7 +296,7 @@ static void __init pmac_setup_arch(void) #endif #ifdef CONFIG_PPC32 #ifdef CONFIG_BLK_DEV_INITRD - if (!initrd_start) + if (!virt_external_initramfs_start) #endif ROOT_DEV = DEFAULT_ROOT_DEVICE; #endif diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index a4ce721b7fe8..9bdb6f6b893e 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -672,8 +672,8 @@ static void __init reserve_initrd(void) if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD) || !get_physmem_reserved(RR_INITRD, &addr, &size)) return; - initrd_start = (unsigned long)__va(addr); - initrd_end = initrd_start + size; + virt_external_initramfs_start = (unsigned long)__va(addr); + 
virt_external_initramfs_end = virt_external_initramfs_start + size; memblock_reserve(addr, size); } diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index c4312ee13db9..9ce9dc5b9e56 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -153,16 +153,16 @@ void __init check_for_initrd(void) /* * Address sanitization */ - initrd_start = (unsigned long)__va(start); - initrd_end = initrd_start + INITRD_SIZE; + virt_external_initramfs_start = (unsigned long)__va(start); + virt_external_initramfs_end = virt_external_initramfs_start + INITRD_SIZE; - memblock_reserve(__pa(initrd_start), INITRD_SIZE); + memblock_reserve(__pa(virt_external_initramfs_start), INITRD_SIZE); return; disable: pr_info("initrd disabled\n"); - initrd_start = initrd_end = 0; + virt_external_initramfs_start = virt_external_initramfs_end = 0; #endif } diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c index fdc93dd12c3e..7b7722ff5232 100644 --- a/arch/sparc/mm/init_32.c +++ b/arch/sparc/mm/init_32.c @@ -109,20 +109,20 @@ static void __init find_ramdisk(unsigned long end_of_phys_memory) if (sparc_ramdisk_image) { if (sparc_ramdisk_image >= (unsigned long)&_end - 2 * PAGE_SIZE) sparc_ramdisk_image -= KERNBASE; - initrd_start = sparc_ramdisk_image + phys_base; - initrd_end = initrd_start + sparc_ramdisk_size; - if (initrd_end > end_of_phys_memory) { + virt_external_initramfs_start = sparc_ramdisk_image + phys_base; + virt_external_initramfs_end = virt_external_initramfs_start + sparc_ramdisk_size; + if (virt_external_initramfs_end > end_of_phys_memory) { printk(KERN_CRIT "initrd extends beyond end of memory " "(0x%016lx > 0x%016lx)\ndisabling initrd\n", - initrd_end, end_of_phys_memory); - initrd_start = 0; + virt_external_initramfs_end, end_of_phys_memory); + virt_external_initramfs_start = 0; } else { /* Reserve the initrd image area. 
*/ - size = initrd_end - initrd_start; - memblock_reserve(initrd_start, size); + size = virt_external_initramfs_end - virt_external_initramfs_start; + memblock_reserve(virt_external_initramfs_start, size); - initrd_start = (initrd_start - phys_base) + PAGE_OFFSET; - initrd_end = (initrd_end - phys_base) + PAGE_OFFSET; + virt_external_initramfs_start = (virt_external_initramfs_start - phys_base) + PAGE_OFFSET; + virt_external_initramfs_end = (virt_external_initramfs_end - phys_base) + PAGE_OFFSET; } } #endif diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 7ed58bf3aaca..af249a654e79 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -901,13 +901,13 @@ static void __init find_ramdisk(unsigned long phys_base) numadbg("Found ramdisk at physical address 0x%lx, size %u\n", ramdisk_image, sparc_ramdisk_size); - initrd_start = ramdisk_image; - initrd_end = ramdisk_image + sparc_ramdisk_size; + virt_external_initramfs_start = ramdisk_image; + virt_external_initramfs_end = ramdisk_image + sparc_ramdisk_size; - memblock_reserve(initrd_start, sparc_ramdisk_size); + memblock_reserve(virt_external_initramfs_start, sparc_ramdisk_size); - initrd_start += PAGE_OFFSET; - initrd_end += PAGE_OFFSET; + virt_external_initramfs_start += PAGE_OFFSET; + virt_external_initramfs_end += PAGE_OFFSET; } #endif } @@ -2485,8 +2485,8 @@ int page_in_phys_avail(unsigned long paddr) if (paddr >= kern_base && paddr < (kern_base + kern_size)) return 1; #ifdef CONFIG_BLK_DEV_INITRD - if (paddr >= __pa(initrd_start) && - paddr < __pa(PAGE_ALIGN(initrd_end))) + if (paddr >= __pa(virt_external_initramfs_start) && + paddr < __pa(PAGE_ALIGN(virt_external_initramfs_end))) return 1; #endif diff --git a/arch/um/kernel/initrd.c b/arch/um/kernel/initrd.c index 99dba827461c..e6113192a6b6 100644 --- a/arch/um/kernel/initrd.c +++ b/arch/um/kernel/initrd.c @@ -27,8 +27,8 @@ int __init read_initrd(void) if (!area) return 0; - initrd_start = (unsigned long) area; - initrd_end = 
initrd_start + size; + virt_external_initramfs_start = (unsigned long) area; + virt_external_initramfs_end = virt_external_initramfs_start + size; return 0; } diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c index b92e09a87c69..b8169f14d175 100644 --- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -213,13 +213,13 @@ struct cpio_data __init find_microcode_in_initrd(const char *path) #endif /* - * Fixup the start address: after reserve_initrd() runs, initrd_start + * Fixup the start address: after reserve_initrd() runs, virt_external_initramfs_start * has the virtual address of the beginning of the initrd. It also - * possibly relocates the ramdisk. In either case, initrd_start contains + * possibly relocates the ramdisk. In either case, virt_external_initramfs_start contains * the updated address so use that instead. */ - if (initrd_start) - start = initrd_start; + if (virt_external_initramfs_start) + start = virt_external_initramfs_start; return find_cpio_data(path, (void *)start, size, NULL); #else /* !CONFIG_BLK_DEV_INITRD */ diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index e727c7a7f648..167b9ef12ebb 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -328,12 +328,12 @@ static void __init relocate_initrd(void) panic("Cannot find place for new RAMDISK of size %lld\n", ramdisk_size); - initrd_start = relocated_ramdisk + PAGE_OFFSET; - initrd_end = initrd_start + ramdisk_size; + virt_external_initramfs_start = relocated_ramdisk + PAGE_OFFSET; + virt_external_initramfs_end = virt_external_initramfs_start + ramdisk_size; printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n", relocated_ramdisk, relocated_ramdisk + ramdisk_size - 1); - ret = copy_from_early_mem((void *)initrd_start, ramdisk_image, ramdisk_size); + ret = copy_from_early_mem((void *)virt_external_initramfs_start, ramdisk_image, ramdisk_size); if (ret) panic("Copy RAMDISK 
failed\n"); @@ -368,7 +368,7 @@ static void __init reserve_initrd(void) !ramdisk_image || !ramdisk_size) return; /* No initrd provided by bootloader */ - initrd_start = 0; + virt_external_initramfs_start = 0; printk(KERN_INFO "RAMDISK: [mem %#010llx-%#010llx]\n", ramdisk_image, ramdisk_end - 1); @@ -376,8 +376,8 @@ static void __init reserve_initrd(void) if (pfn_range_is_mapped(PFN_DOWN(ramdisk_image), PFN_DOWN(ramdisk_end))) { /* All are mapped, easy case */ - initrd_start = ramdisk_image + PAGE_OFFSET; - initrd_end = initrd_start + ramdisk_size; + virt_external_initramfs_start = ramdisk_image + PAGE_OFFSET; + virt_external_initramfs_end = virt_external_initramfs_start + ramdisk_size; return; } diff --git a/arch/xtensa/kernel/setup.c b/arch/xtensa/kernel/setup.c index f72e280363be..2e9003be3e8c 100644 --- a/arch/xtensa/kernel/setup.c +++ b/arch/xtensa/kernel/setup.c @@ -49,8 +49,8 @@ #include #ifdef CONFIG_BLK_DEV_INITRD -extern unsigned long initrd_start; -extern unsigned long initrd_end; +extern unsigned long virt_external_initramfs_start; +extern unsigned long virt_external_initramfs_end; extern int initrd_below_start_ok; #endif @@ -106,8 +106,8 @@ static int __init parse_tag_initrd(const bp_tag_t* tag) { struct bp_meminfo *mi = (struct bp_meminfo *)(tag->data); - initrd_start = (unsigned long)__va(mi->start); - initrd_end = (unsigned long)__va(mi->end); + virt_external_initramfs_start = (unsigned long)__va(mi->start); + virt_external_initramfs_end = (unsigned long)__va(mi->end); return 0; } @@ -290,11 +290,11 @@ void __init setup_arch(char **cmdline_p) /* Reserve some memory regions */ #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start < initrd_end && - !mem_reserve(__pa(initrd_start), __pa(initrd_end))) + if (virt_external_initramfs_start < virt_external_initramfs_end && + !mem_reserve(__pa(virt_external_initramfs_start), __pa(virt_external_initramfs_end))) initrd_below_start_ok = 1; else - initrd_start = 0; + virt_external_initramfs_start = 0; #endif 
mem_reserve(__pa(_stext), __pa(_end)); diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index 3160cb7dca00..37ad99c10ac4 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -432,8 +432,8 @@ void __init acpi_table_upgrade(void) data = __builtin_initramfs_start; size = __builtin_initramfs_size; } else { - data = (void *)initrd_start; - size = initrd_end - initrd_start; + data = (void *)virt_external_initramfs_start; + size = virt_external_initramfs_end - virt_external_initramfs_start; } if (data == NULL || size == 0) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index 9c4c9be948c5..baf8347e0314 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -765,8 +765,8 @@ static void __early_init_dt_declare_initrd(unsigned long start, */ if (!IS_ENABLED(CONFIG_ARM64) && !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) { - initrd_start = (unsigned long)__va(start); - initrd_end = (unsigned long)__va(end); + virt_external_initramfs_start = (unsigned long)__va(start); + virt_external_initramfs_end = (unsigned long)__va(end); initrd_below_start_ok = 1; } } diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 23c08e88234c..f19efebe8221 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -3,10 +3,10 @@ #ifndef __LINUX_INITRD_H #define __LINUX_INITRD_H -/* 1 if it is not an error if initrd_start < memory_start */ +/* 1 if it is not an error if virt_external_initramfs_start < memory_start */ extern int initrd_below_start_ok; -extern unsigned long initrd_start, initrd_end; +extern unsigned long virt_external_initramfs_start, virt_external_initramfs_end; extern void free_initrd_mem(unsigned long, unsigned long); #ifdef CONFIG_BLK_DEV_INITRD diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 06be76aa602c..8bdeb205a0cd 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -12,7 +12,7 @@ #include "do_mounts.h" -unsigned long initrd_start, initrd_end; +unsigned long 
virt_external_initramfs_start, virt_external_initramfs_end; int initrd_below_start_ok; static int __init early_initrdmem(char *p) diff --git a/init/initramfs.c b/init/initramfs.c index 5242d851e839..9a221c713c60 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -611,7 +611,7 @@ void __init reserve_initrd_mem(void) unsigned long size; /* Ignore the virtul address computed during device tree parsing */ - initrd_start = initrd_end = 0; + virt_external_initramfs_start = virt_external_initramfs_end = 0; if (!phys_external_initramfs_size) return; @@ -639,15 +639,15 @@ void __init reserve_initrd_mem(void) memblock_reserve(start, size); /* Now convert initrd to virtual addresses */ - initrd_start = (unsigned long)__va(phys_external_initramfs_start); - initrd_end = initrd_start + phys_external_initramfs_size; + virt_external_initramfs_start = (unsigned long)__va(phys_external_initramfs_start); + virt_external_initramfs_end = virt_external_initramfs_start + phys_external_initramfs_size; initrd_below_start_ok = 1; return; disable: pr_cont(" - disabling initrd\n"); - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; } void __weak __init free_initrd_mem(unsigned long start, unsigned long end) @@ -673,17 +673,17 @@ static bool __init kexec_free_initrd(void) * If the initrd region is overlapped with crashkernel reserved region, * free only memory that is not part of crashkernel region. */ - if (initrd_start >= crashk_end || initrd_end <= crashk_start) + if (virt_external_initramfs_start >= crashk_end || virt_external_initramfs_end <= crashk_start) return false; /* * Initialize initrd memory region since the kexec boot does not do. 
*/ - memset((void *)initrd_start, 0, initrd_end - initrd_start); - if (initrd_start < crashk_start) - free_initrd_mem(initrd_start, crashk_start); - if (initrd_end > crashk_end) - free_initrd_mem(crashk_end, initrd_end); + memset((void *)virt_external_initramfs_start, 0, virt_external_initramfs_end - virt_external_initramfs_start); + if (virt_external_initramfs_start < crashk_start) + free_initrd_mem(virt_external_initramfs_start, crashk_start); + if (virt_external_initramfs_end > crashk_end) + free_initrd_mem(crashk_end, virt_external_initramfs_end); return true; } #else @@ -700,12 +700,12 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) if (err) panic_show_mem("%s", err); /* Failed to decompress INTERNAL initramfs */ - if (!initrd_start || IS_ENABLED(CONFIG_INITRAMFS_FORCE)) + if (!virt_external_initramfs_start || IS_ENABLED(CONFIG_INITRAMFS_FORCE)) goto done; printk(KERN_INFO "Unpacking initramfs...\n"); - err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start); + err = unpack_to_rootfs((char *)virt_external_initramfs_start, virt_external_initramfs_end - virt_external_initramfs_start); if (err) { printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err); } @@ -717,16 +717,16 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) * If the initrd region is overlapped with crashkernel reserved region, * free only memory that is not part of crashkernel region. 
*/ - if (!do_retain_initrd && initrd_start && !kexec_free_initrd()) { - free_initrd_mem(initrd_start, initrd_end); - } else if (do_retain_initrd && initrd_start) { - bin_attr_initrd.size = initrd_end - initrd_start; - bin_attr_initrd.private = (void *)initrd_start; + if (!do_retain_initrd && virt_external_initramfs_start && !kexec_free_initrd()) { + free_initrd_mem(virt_external_initramfs_start, virt_external_initramfs_end); + } else if (do_retain_initrd && virt_external_initramfs_start) { + bin_attr_initrd.size = virt_external_initramfs_end - virt_external_initramfs_start; + bin_attr_initrd.private = (void *)virt_external_initramfs_start; if (sysfs_create_bin_file(firmware_kobj, &bin_attr_initrd)) pr_err("Failed to create initrd sysfs file"); } - initrd_start = 0; - initrd_end = 0; + virt_external_initramfs_start = 0; + virt_external_initramfs_end = 0; init_flush_fput(); } diff --git a/init/main.c b/init/main.c index 0ee0ee7b7c2c..5f4d860ab72a 100644 --- a/init/main.c +++ b/init/main.c @@ -271,10 +271,10 @@ static void * __init get_boot_config_from_initrd(size_t *_size) u32 *hdr; int i; - if (!initrd_end) + if (!virt_external_initramfs_end) return NULL; - data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN; + data = (char *)virt_external_initramfs_end - BOOTCONFIG_MAGIC_LEN; /* * Since Grub may align the size of initrd to 4, we must * check the preceding 3 bytes as well. 
@@ -292,9 +292,9 @@ static void * __init get_boot_config_from_initrd(size_t *_size) csum = le32_to_cpu(hdr[1]); data = ((void *)hdr) - size; - if ((unsigned long)data < initrd_start) { + if ((unsigned long)data < virt_external_initramfs_start) { pr_err("bootconfig size %d is greater than initrd size %ld\n", - size, initrd_end - initrd_start); + size, virt_external_initramfs_end - virt_external_initramfs_start); return NULL; } @@ -304,7 +304,7 @@ static void * __init get_boot_config_from_initrd(size_t *_size) } /* Remove bootconfig from initramfs/initrd */ - initrd_end = (unsigned long)data; + virt_external_initramfs_end = (unsigned long)data; if (_size) *_size = size; @@ -1047,12 +1047,12 @@ void start_kernel(void) locking_selftest(); #ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start && !initrd_below_start_ok && - page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) { + if (virt_external_initramfs_start && !initrd_below_start_ok && + page_to_pfn(virt_to_page((void *)virt_external_initramfs_start)) < min_low_pfn) { pr_crit("initrd overwritten (0x%08lx < 0x%08lx) - disabling it.\n", - page_to_pfn(virt_to_page((void *)initrd_start)), + page_to_pfn(virt_to_page((void *)virt_external_initramfs_start)), min_low_pfn); - initrd_start = 0; + virt_external_initramfs_start = 0; } #endif setup_per_cpu_pageset(); -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:08 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:08 +0000 Subject: [PATCH RESEND 29/62] init: move virt_external_initramfs_{start,end} to init/initramfs.c In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-30-safinaskar@gmail.com> Move definitions of virt_external_initramfs_start and virt_external_initramfs_end to init/initramfs.c Signed-off-by: Askar Safin --- init/do_mounts_initrd.c | 1 - init/initramfs.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) diff --git 
a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 8bdeb205a0cd..535ce459ab94 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -12,7 +12,6 @@ #include "do_mounts.h" -unsigned long virt_external_initramfs_start, virt_external_initramfs_end; int initrd_below_start_ok; static int __init early_initrdmem(char *p) diff --git a/init/initramfs.c b/init/initramfs.c index 9a221c713c60..d2301cc6c470 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -600,6 +600,8 @@ __setup("initramfs_async=", initramfs_async_setup); #include #include +unsigned long virt_external_initramfs_start, virt_external_initramfs_end; + phys_addr_t phys_external_initramfs_start __initdata; unsigned long phys_external_initramfs_size __initdata; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:09 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:09 +0000 Subject: [PATCH RESEND 30/62] doc: remove documentation for block device 4 0 In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-31-safinaskar@gmail.com> It doesn't work. I tested this both in system booted using initramfs and in system booted from real root device directly Signed-off-by: Askar Safin --- Documentation/admin-guide/devices.txt | 6 ------ 1 file changed, 6 deletions(-) diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt index 27835389ca49..6ce0940233a8 100644 --- a/Documentation/admin-guide/devices.txt +++ b/Documentation/admin-guide/devices.txt @@ -138,12 +138,6 @@ number for BSD PTY devices. As of Linux 2.1.115, this is no longer supported. Use major numbers 2 and 3. - 4 block Aliases for dynamically allocated major devices to be used - when its not possible to create the real device nodes - because the root filesystem is mounted read-only. 
- - 0 = /dev/root - 5 char Alternate TTY devices 0 = /dev/tty Current TTY device 1 = /dev/console System console -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:10 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:10 +0000 Subject: [PATCH RESEND 31/62] init: rename initrd_below_start_ok to initramfs_below_start_ok In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-32-safinaskar@gmail.com> It refers to initramfs, not to initrd Signed-off-by: Askar Safin --- arch/csky/kernel/setup.c | 2 +- arch/mips/kernel/setup.c | 2 +- arch/openrisc/kernel/setup.c | 2 +- arch/parisc/mm/init.c | 2 +- arch/xtensa/kernel/setup.c | 4 ++-- drivers/of/fdt.c | 2 +- include/linux/initrd.h | 2 +- init/do_mounts_initrd.c | 2 +- init/initramfs.c | 2 +- init/main.c | 2 +- 10 files changed, 11 insertions(+), 11 deletions(-) diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c index ce128888462e..403a977b8c1f 100644 --- a/arch/csky/kernel/setup.c +++ b/arch/csky/kernel/setup.c @@ -40,7 +40,7 @@ static void __init setup_initrd(void) pr_info("Initial ramdisk at: 0x%p (%lu bytes)\n", (void *)(virt_external_initramfs_start), size); - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; return; diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index da11ae875539..aed454ebd751 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -225,7 +225,7 @@ static void __init finalize_initrd(void) maybe_bswap_initrd(); memblock_reserve(__pa(virt_external_initramfs_start), size); - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; pr_info("Initial ramdisk at: 0x%lx (%lu bytes)\n", virt_external_initramfs_start, size); diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c index f387dc57ec35..337a0381c452 100644 --- a/arch/openrisc/kernel/setup.c +++ b/arch/openrisc/kernel/setup.c @@ -246,7 +246,7 @@ 
void __init setup_arch(char **cmdline_p) } else { printk(KERN_INFO "Initial ramdisk at: 0x%p (%lu bytes)\n", (void *)(virt_external_initramfs_start), virt_external_initramfs_end - virt_external_initramfs_start); - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; } #endif diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c index 74bfe9797589..af7a33c8bd31 100644 --- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -308,7 +308,7 @@ static void __init setup_bootmem(void) } else { initrd_reserve = virt_external_initramfs_end - virt_external_initramfs_start; } - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; printk(KERN_INFO "initrd: reserving %08lx-%08lx (mem_max %08lx)\n", __pa(virt_external_initramfs_start), __pa(virt_external_initramfs_start) + initrd_reserve, mem_max); memblock_reserve(__pa(virt_external_initramfs_start), initrd_reserve); diff --git a/arch/xtensa/kernel/setup.c b/arch/xtensa/kernel/setup.c index 2e9003be3e8c..b86367178bce 100644 --- a/arch/xtensa/kernel/setup.c +++ b/arch/xtensa/kernel/setup.c @@ -51,7 +51,7 @@ #ifdef CONFIG_BLK_DEV_INITRD extern unsigned long virt_external_initramfs_start; extern unsigned long virt_external_initramfs_end; -extern int initrd_below_start_ok; +extern int initramfs_below_start_ok; #endif #ifdef CONFIG_USE_OF @@ -292,7 +292,7 @@ void __init setup_arch(char **cmdline_p) #ifdef CONFIG_BLK_DEV_INITRD if (virt_external_initramfs_start < virt_external_initramfs_end && !mem_reserve(__pa(virt_external_initramfs_start), __pa(virt_external_initramfs_end))) - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; else virt_external_initramfs_start = 0; #endif diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index baf8347e0314..127b37f211cb 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -767,7 +767,7 @@ static void __early_init_dt_declare_initrd(unsigned long start, !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) { virt_external_initramfs_start = (unsigned long)__va(start); 
virt_external_initramfs_end = (unsigned long)__va(end); - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; } } diff --git a/include/linux/initrd.h b/include/linux/initrd.h index f19efebe8221..364b603215ac 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -4,7 +4,7 @@ #define __LINUX_INITRD_H /* 1 if it is not an error if virt_external_initramfs_start < memory_start */ -extern int initrd_below_start_ok; +extern int initramfs_below_start_ok; extern unsigned long virt_external_initramfs_start, virt_external_initramfs_end; extern void free_initrd_mem(unsigned long, unsigned long); diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 535ce459ab94..d8b809ced11b 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -12,7 +12,7 @@ #include "do_mounts.h" -int initrd_below_start_ok; +int initramfs_below_start_ok; static int __init early_initrdmem(char *p) { diff --git a/init/initramfs.c b/init/initramfs.c index d2301cc6c470..a9c5d211665d 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -643,7 +643,7 @@ void __init reserve_initrd_mem(void) /* Now convert initrd to virtual addresses */ virt_external_initramfs_start = (unsigned long)__va(phys_external_initramfs_start); virt_external_initramfs_end = virt_external_initramfs_start + phys_external_initramfs_size; - initrd_below_start_ok = 1; + initramfs_below_start_ok = 1; return; disable: diff --git a/init/main.c b/init/main.c index 5f4d860ab72a..58a7199c81f7 100644 --- a/init/main.c +++ b/init/main.c @@ -1047,7 +1047,7 @@ void start_kernel(void) locking_selftest(); #ifdef CONFIG_BLK_DEV_INITRD - if (virt_external_initramfs_start && !initrd_below_start_ok && + if (virt_external_initramfs_start && !initramfs_below_start_ok && page_to_pfn(virt_to_page((void *)virt_external_initramfs_start)) < min_low_pfn) { pr_crit("initrd overwritten (0x%08lx < 0x%08lx) - disabling it.\n", page_to_pfn(virt_to_page((void *)virt_external_initramfs_start)), -- 2.47.2 From safinaskar at 
gmail.com Fri Sep 12 17:38:11 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:11 +0000 Subject: [PATCH RESEND 32/62] init: move initramfs_below_start_ok to init/initramfs.c In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-33-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/do_mounts_initrd.c | 2 -- init/initramfs.c | 1 + 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index d8b809ced11b..509f912c0fce 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -12,8 +12,6 @@ #include "do_mounts.h" -int initramfs_below_start_ok; - static int __init early_initrdmem(char *p) { phys_addr_t start; diff --git a/init/initramfs.c b/init/initramfs.c index a9c5d211665d..90096177a867 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -601,6 +601,7 @@ __setup("initramfs_async=", initramfs_async_setup); #include unsigned long virt_external_initramfs_start, virt_external_initramfs_end; +int initramfs_below_start_ok; phys_addr_t phys_external_initramfs_start __initdata; unsigned long phys_external_initramfs_size __initdata; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:12 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:12 +0000 Subject: [PATCH RESEND 33/62] init: remove init/do_mounts_initrd.c In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-34-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/Makefile | 1 - init/do_mounts_initrd.c | 36 ------------------------------------ init/initramfs.c | 23 +++++++++++++++++++++++ 3 files changed, 23 insertions(+), 37 deletions(-) delete mode 100644 init/do_mounts_initrd.c diff --git a/init/Makefile 
b/init/Makefile index b020154b3d2a..09657c0274eb 100644 --- a/init/Makefile +++ b/init/Makefile @@ -17,7 +17,6 @@ obj-$(CONFIG_INITRAMFS_TEST) += initramfs_test.o obj-y += init_task.o mounts-y := do_mounts.o -mounts-$(CONFIG_BLK_DEV_INITRD) += do_mounts_initrd.o # # UTS_VERSION diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c deleted file mode 100644 index 509f912c0fce..000000000000 --- a/init/do_mounts_initrd.c +++ /dev/null @@ -1,36 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "do_mounts.h" - -static int __init early_initrdmem(char *p) -{ - phys_addr_t start; - unsigned long size; - char *endp; - - start = memparse(p, &endp); - if (*endp == ',') { - size = memparse(endp + 1, NULL); - - phys_external_initramfs_start = start; - phys_external_initramfs_size = size; - } - return 0; -} -early_param("initrdmem", early_initrdmem); - -static int __init early_initrd(char *p) -{ - return early_initrdmem(p); -} -early_param("initrd", early_initrd); diff --git a/init/initramfs.c b/init/initramfs.c index 90096177a867..8ed352721a79 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -606,6 +606,29 @@ int initramfs_below_start_ok; phys_addr_t phys_external_initramfs_start __initdata; unsigned long phys_external_initramfs_size __initdata; +static int __init early_initrdmem(char *p) +{ + phys_addr_t start; + unsigned long size; + char *endp; + + start = memparse(p, &endp); + if (*endp == ',') { + size = memparse(endp + 1, NULL); + + phys_external_initramfs_start = start; + phys_external_initramfs_size = size; + } + return 0; +} +early_param("initrdmem", early_initrdmem); + +static int __init early_initrd(char *p) +{ + return early_initrdmem(p); +} +early_param("initrd", early_initrd); + static BIN_ATTR(initrd, 0440, sysfs_bin_attr_simple_read, NULL, 0); void __init reserve_initrd_mem(void) -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:13 2025 
From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:13 +0000 Subject: [PATCH RESEND 34/62] init: inline create_dev into the only caller In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-35-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/do_mounts.c | 5 ++++- init/do_mounts.h | 6 ------ 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/init/do_mounts.c b/init/do_mounts.c index 5c407ca54063..60ba8a633d32 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -366,7 +366,10 @@ static int __init mount_nodev_root(char *root_device_name) #ifdef CONFIG_BLOCK static void __init mount_block_root(char *root_device_name) { - int err = create_dev("/dev/root", ROOT_DEV); + int err; + + init_unlink("/dev/root"); + err = init_mknod("/dev/root", S_IFBLK | 0600, new_encode_dev(ROOT_DEV)); if (err < 0) pr_emerg("Failed to create /dev/root: %d\n", err); diff --git a/init/do_mounts.h b/init/do_mounts.h index 6c7a535e71ce..f3df9d697304 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -16,12 +16,6 @@ void mount_root_generic(char *name, char *pretty_name, int flags); void mount_root(char *root_device_name); extern int root_mountflags; -static inline __init int create_dev(char *name, dev_t dev) -{ - init_unlink(name); - return init_mknod(name, S_IFBLK | 0600, new_encode_dev(dev)); -} - /* Ensure that async file closing finished to prevent spurious errors. 
*/ static inline void init_flush_fput(void) { -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:14 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:14 +0000 Subject: [PATCH RESEND 35/62] init: make mount_root_generic static In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-36-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/do_mounts.c | 2 +- init/do_mounts.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/init/do_mounts.c b/init/do_mounts.c index 60ba8a633d32..c722351c991f 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -174,7 +174,7 @@ static int __init do_mount_root(const char *name, const char *fs, return ret; } -void __init mount_root_generic(char *name, char *pretty_name, int flags) +static void __init mount_root_generic(char *name, char *pretty_name, int flags) { struct page *page = alloc_page(GFP_KERNEL); char *fs_names = page_address(page); diff --git a/init/do_mounts.h b/init/do_mounts.h index f3df9d697304..f291c30f7407 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -12,7 +12,6 @@ #include #include -void mount_root_generic(char *name, char *pretty_name, int flags); void mount_root(char *root_device_name); extern int root_mountflags; -- 2.47.2 From safinaskar at gmail.com Fri Sep 12 17:38:15 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sat, 13 Sep 2025 00:38:15 +0000 Subject: [PATCH RESEND 36/62] init: make mount_root static In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250913003842.41944-37-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/do_mounts.c | 2 +- init/do_mounts.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/init/do_mounts.c b/init/do_mounts.c index 
c722351c991f..7ec5ee5a5c19 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -381,7 +381,7 @@ static inline void mount_block_root(char *root_device_name) } #endif /* CONFIG_BLOCK */ -void __init mount_root(char *root_device_name) +static void __init mount_root(char *root_device_name) { switch (ROOT_DEV) { case Root_NFS: diff --git a/init/do_mounts.h b/init/do_mounts.h index f291c30f7407..90422fb07c02 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -12,7 +12,6 @@ #include #include -void mount_root(char *root_device_name); extern int root_mountflags; /* Ensure that async file closing finished to prevent spurious errors. */ -- 2.47.2 From fangyu.yu at linux.alibaba.com Fri Sep 12 18:24:51 2025 From: fangyu.yu at linux.alibaba.com (fangyu.yu at linux.alibaba.com) Date: Sat, 13 Sep 2025 09:24:51 +0800 Subject: [PATCH] RISC-V: KVM: Fix guest page fault within HLV* instructions In-Reply-To: <20250912140142.25147-1-fangyu.yu@linux.alibaba.com> References: <20250912140142.25147-1-fangyu.yu@linux.alibaba.com> Message-ID: <20250913012451.33829-1-fangyu.yu@linux.alibaba.com> >>From: Fangyu Yu >> >>When executing HLV* instructions at the HS mode, a guest page fault >>may occur when a g-stage page table migration between triggering the >>virtual instruction exception and executing the HLV* instruction. >> >>This may be a corner case, and one simpler way to handle this is to >>re-execute the instruction where the virtual instruction exception >>occurred, and the guest page fault will be automatically handled. 
>> >>Fixes: 9f7013265112 ("RISC-V: KVM: Handle MMIO exits for VCPU") >>Signed-off-by: Fangyu Yu >>--- >> arch/riscv/kvm/vcpu_insn.c | 21 ++++++++++++++++++--- >> 1 file changed, 18 insertions(+), 3 deletions(-) >> >>diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c >>index 97dec18e6989..a8b93aa4d8ec 100644 >>--- a/arch/riscv/kvm/vcpu_insn.c >>+++ b/arch/riscv/kvm/vcpu_insn.c >>@@ -448,7 +448,12 @@ int kvm_riscv_vcpu_virtual_insn(struct kvm_vcpu *vcpu, struct kvm_run *run, >> insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, >> ct->sepc, >> &utrap); >>- if (utrap.scause) { >>+ switch (utrap.scause) { >>+ case 0: >>+ break; >>+ case EXC_LOAD_GUEST_PAGE_FAULT: >>+ return KVM_INSN_CONTINUE_SAME_SEPC; >>+ default: >> utrap.sepc = ct->sepc; >> kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); >> return 1; >>@@ -503,7 +508,12 @@ int kvm_riscv_vcpu_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run, >> */ >> insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, >> &utrap); >>- if (utrap.scause) { >>+ switch (utrap.scause) { >>+ case 0: >>+ break; >>+ case EXC_LOAD_GUEST_PAGE_FAULT: >>+ return KVM_INSN_CONTINUE_SAME_SEPC; >>+ default: >> /* Redirect trap if we failed to read instruction */ >> utrap.sepc = ct->sepc; >> kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); >>@@ -629,7 +639,12 @@ int kvm_riscv_vcpu_mmio_store(struct kvm_vcpu *vcpu, struct kvm_run *run, >> */ >> insn = kvm_riscv_vcpu_unpriv_read(vcpu, true, ct->sepc, >> &utrap); >>- if (utrap.scause) { >>+ switch (utrap.scause) { >>+ case 0: >>+ break; >>+ case EXC_LOAD_GUEST_PAGE_FAULT: > >Here should be EXC_STORE_GUEST_PAGE_FAULT, I will fix it next version. Please ignore this comment, EXC_LOAD_GUEST_PAGE_FAULT is correct. 
> >>+ return KVM_INSN_CONTINUE_SAME_SEPC; >>+ default: >> /* Redirect trap if we failed to read instruction */ >> utrap.sepc = ct->sepc; >> kvm_riscv_vcpu_trap_redirect(vcpu, &utrap); >>-- >>2.49.0 > From pjw at kernel.org Fri Sep 12 18:30:29 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 12 Sep 2025 19:30:29 -0600 (MDT) Subject: [PATCH v5 2/3] riscv: Strengthen duplicate and inconsistent definition of RV_X() In-Reply-To: <20250620-dev-alex-insn_duplicate_v5_manual-v5-2-d865dc9ad180@rivosinc.com> References: <20250620-dev-alex-insn_duplicate_v5_manual-v5-0-d865dc9ad180@rivosinc.com> <20250620-dev-alex-insn_duplicate_v5_manual-v5-2-d865dc9ad180@rivosinc.com> Message-ID: On Fri, 20 Jun 2025, Alexandre Ghiti wrote: > RV_X() macro is defined in two different ways which is error prone. > > So harmonize its first definition and add another macro RV_X_mask() for > the second one. > > Reviewed-by: Andrew Jones > Signed-off-by: Alexandre Ghiti Thanks. I updated this one to uppercase the name of the RV_X_MASK macro. That way it matches the naming of the rest of the macros in the file. Queued for v6.18. - Paul From pjw at kernel.org Fri Sep 12 18:33:00 2025 From: pjw at kernel.org (Paul Walmsley) Date: Fri, 12 Sep 2025 19:33:00 -0600 (MDT) Subject: [PATCH v2 0/2] riscv: Replace __ASSEMBLY__ with __ASSEMBLER__ in header files In-Reply-To: References: <20250606070952.498274-1-thuth@redhat.com> <175450055499.2863135.2738368758577957268.git-patchwork-notify@kernel.org> Message-ID: Hi Thomas, On Mon, 18 Aug 2025, Thomas Huth wrote: > On 06/08/2025 19.15, patchwork-bot+linux-riscv at kernel.org wrote: > > Hello: > > > > This series was applied to riscv/linux.git (for-next) > > by Alexandre Ghiti : > > Hi Alexandre, > > I can't see the patches in the for-next branch ... have they been dropped > again? Was there an issue with the patches? No issues with your patches; we just had some trouble getting the arch/riscv PR merged during the last merge window. 
I've queued both of your patches for v6.18. They should show up in for-next in a few days. Thanks, - Paul From guoren at kernel.org Fri Sep 12 22:04:17 2025 From: guoren at kernel.org (Guo Ren) Date: Sat, 13 Sep 2025 13:04:17 +0800 Subject: [PATCH 3/3] riscv: dts: thead: add zfh for th1520 In-Reply-To: <20250911184528.1512543-4-rabenda.cn@gmail.com> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> <20250911184528.1512543-4-rabenda.cn@gmail.com> Message-ID: On Fri, Sep 12, 2025 at 2:46 AM Han Gao wrote: > > th1520 support Zfh ISA extension [1]. > > Link: https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1737721869472/%E7%8E%84%E9%93%81C910%E4%B8%8EC920R1S6%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C%28xrvm%29_20250124.pdf [1] Agree with Conor's advice. Linus just had some comment about the Link tag usage: https://www.phoronix.com/news/Linus-Torvalds-No-Link-Tags We should be careful :-P > > Signed-off-by: Han Gao > Signed-off-by: Han Gao > --- > arch/riscv/boot/dts/thead/th1520.dtsi | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi > index 7f07688aa964..2075bb969c2f 100644 > --- a/arch/riscv/boot/dts/thead/th1520.dtsi > +++ b/arch/riscv/boot/dts/thead/th1520.dtsi > @@ -26,7 +26,7 @@ c910_0: cpu at 0 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <0>; > @@ -53,7 +53,7 @@ c910_1: cpu at 1 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <1>; > @@ -80,7 +80,7 @@ c910_2: cpu at 2 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", 
"zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <2>; > @@ -107,7 +107,7 @@ c910_3: cpu at 3 { > riscv,isa-base = "rv64i"; > riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > "ziccrse", "zicntr", "zicsr", > - "zifencei", "zihpm", > + "zifencei", "zihpm", "zfh", > "xtheadvector"; > thead,vlenb = <16>; > reg = <3>; > -- > 2.47.3 > -- Best Regards Guo Ren From bp at alien8.de Fri Sep 12 22:48:37 2025 From: bp at alien8.de (Borislav Petkov) Date: Sat, 13 Sep 2025 07:48:37 +0200 Subject: [PATCH RESEND 28/62] init: alpha, arc, arm, arm64, csky, m68k, microblaze, mips, nios2, openrisc, parisc, powerpc, s390, sh, sparc, um, x86, xtensa: rename initrd_{start,end} to virt_external_initramfs_{start,end} In-Reply-To: <20250913003842.41944-29-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> <20250913003842.41944-29-safinaskar@gmail.com> Message-ID: <20250913054837.GAaMUFtd4YlaPqL2Ov@fat_crate.local> On Sat, Sep 13, 2025 at 12:38:07AM +0000, Askar Safin wrote: > Rename initrd_start to virt_external_initramfs_start and > initrd_end to virt_external_initramfs_end. "virt" as in "virtualization"? That's not confusing at all... :-\ And "external" means what? > They refer to initramfs, not to initrd Why not simply initramfs_{start,end} if they belong to it? -- Regards/Gruss, Boris. 
https://people.kernel.org/tglx/notes-about-netiquette From bp at alien8.de Fri Sep 12 22:59:45 2025 From: bp at alien8.de (Borislav Petkov) Date: Sat, 13 Sep 2025 07:59:45 +0200 Subject: [PATCH RESEND 28/62] init: alpha, arc, arm, arm64, csky, m68k, microblaze, mips, nios2, openrisc, parisc, powerpc, s390, sh, sparc, um, x86, xtensa: rename initrd_{start,end} to virt_external_initramfs_{start,end} In-Reply-To: <20250913054837.GAaMUFtd4YlaPqL2Ov@fat_crate.local> References: <20250913003842.41944-1-safinaskar@gmail.com> <20250913003842.41944-29-safinaskar@gmail.com> <20250913054837.GAaMUFtd4YlaPqL2Ov@fat_crate.local> Message-ID: <20250913055851.GBaMUIGyF8VhpUsOZg@fat_crate.local> On Sat, Sep 13, 2025 at 07:48:37AM +0200, Borislav Petkov wrote: > On Sat, Sep 13, 2025 at 12:38:07AM +0000, Askar Safin wrote: > > Rename initrd_start to virt_external_initramfs_start and > > initrd_end to virt_external_initramfs_end. > > "virt" as in "virtualization"? Ooh, now I see it - you have virtual and physical initramfs address things. We usually call those "va" and "pa". So initramfs_{va,pa}_{start,end} perhaps... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette From joro at 8bytes.org Fri Sep 12 23:09:19 2025 From: joro at 8bytes.org (=?utf-8?B?SsO2cmcgUsO2ZGVs?=) Date: Sat, 13 Sep 2025 08:09:19 +0200 Subject: [PATCH v1] riscv: iommu: Fix irq failure due to idx mismatch in icvec In-Reply-To: <20250910095430.93868-1-guoyaxing@bosc.ac.cn> References: <20250910095430.93868-1-guoyaxing@bosc.ac.cn> Message-ID: On Wed, Sep 10, 2025 at 05:54:30PM +0800, Yaxing Guo wrote: > In icvec, the idx of civ, fiv, pmiv and piv are 0, 1, 2, 3 > (According to spec 5.27). And usually, the interrupt-names > property in dts riscv-iommu node also follows this (In qemu > virt machine follows this) which will cause hardware irq > number errors (Especially when using qemu virt machine to > start Linux). This sounds like the patch needs a Fixes tag. 
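For reference, the field layout the quoted commit message relies on can be sketched in C. This is only an illustration, not code from the patch: the enum, macro and helper names are hypothetical, and the assumption that each interrupt cause owns one 4-bit field in icvec, with civ lowest followed by fiv, pmiv and piv, is taken from the index order quoted above and from the spec section it cites.

```c
#include <stdint.h>

/*
 * Hypothetical names, for illustration only. The index order follows the
 * quoted commit message: civ = 0, fiv = 1, pmiv = 2, piv = 3.
 */
enum icvec_cause {
	ICVEC_CIV  = 0,	/* command queue interrupt vector */
	ICVEC_FIV  = 1,	/* fault queue interrupt vector */
	ICVEC_PMIV = 2,	/* performance monitoring interrupt vector */
	ICVEC_PIV  = 3,	/* page request queue interrupt vector */
};

/* Assumption: every cause owns one 4-bit field, packed from bit 0 upwards. */
#define ICVEC_FIELD_BITS 4u
#define ICVEC_FIELD_MASK 0xfull

/* Return icvec with the field for 'cause' replaced by 'vec' (low 4 bits). */
static inline uint64_t icvec_set(uint64_t icvec, enum icvec_cause cause,
				 unsigned int vec)
{
	unsigned int shift = ICVEC_FIELD_BITS * (unsigned int)cause;

	icvec &= ~(ICVEC_FIELD_MASK << shift);
	return icvec | (((uint64_t)vec & ICVEC_FIELD_MASK) << shift);
}

/* Extract the vector currently assigned to 'cause'. */
static inline unsigned int icvec_get(uint64_t icvec, enum icvec_cause cause)
{
	unsigned int shift = ICVEC_FIELD_BITS * (unsigned int)cause;

	return (unsigned int)((icvec >> shift) & ICVEC_FIELD_MASK);
}

/* Build the mapping in which each cause uses the vector equal to its index. */
static inline uint64_t icvec_identity(void)
{
	uint64_t icvec = 0;
	int cause;

	for (cause = ICVEC_CIV; cause <= ICVEC_PIV; cause++)
		icvec = icvec_set(icvec, (enum icvec_cause)cause,
				  (unsigned int)cause);
	return icvec;
}
```

Under these assumptions icvec_identity() yields 0x3210, i.e. civ=0, fiv=1, pmiv=2 and piv=3, and icvec_get() recovers the vector for a given cause from a register value.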
From julian.stecklina at cyberus-technology.de Sat Sep 13 01:58:34 2025 From: julian.stecklina at cyberus-technology.de (Julian Stecklina) Date: Sat, 13 Sep 2025 08:58:34 +0000 Subject: [PATCH RESEND 00/62] initrd: remove classic initrd support In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <1f9aee6090716db537e9911685904786b030111f.camel@cyberus-technology.de> On Sat, 2025-09-13 at 00:37 +0000, Askar Safin wrote: > Intro > ==== > This patchset removes classic initrd (initial RAM disk) support, > which was deprecated in 2020. > Initramfs still stays, and RAM disk itself (brd) still stays, too. > init/do_mounts* and init/*initramfs* are listed in VFS entry in > MAINTAINERS, so I think this patchset should go through VFS tree. > This patchset touchs every subdirectory in arch/, so I tested it > on 8 (!!!) archs in Qemu (see details below). > Warning: this patchset renames CONFIG_BLK_DEV_INITRD (!!!) to CONFIG_INITRAMFS > and CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_* (for example, > CONFIG_RD_GZIP to CONFIG_INITRAMFS_DECOMPRESS_GZIP). > If you still use initrd, see below for workaround. > As the person who kicked this off by trying to get erofs support for initrds: You have all my support for nuking so much legacy code! I'm all for removing historical baggage even if it comes with slight inconveniences for fringe usecase users (me!). If this series goes through, I'll drink a beer to you! Acked-by: Julian Stecklina > > Also I renamed CONFIG_BLK_DEV_INITRD (which became total misnomer) > to CONFIG_INITRAMFS. And CONFIG_RD_* to CONFIG_INITRAMFS_DECOMPRESS_*. > This will break all configs out there (update your configs!). This is beautiful. The original names were pretty misleading! > Workaround > ==== > If "retain_initrd" is passed to kernel, then initramfs/initrd, > passed by bootloader, is retained and becomes available after boot > as read-only magic file /sys/firmware/initrd [3]. 
This is pretty neat, because now you can use _all filesystems_ as initrds. :-D This solves my original problem, albeit with a tiny shim initramfs. Nice! Julian From wens at csie.org Sat Sep 13 02:02:44 2025 From: wens at csie.org (Chen-Yu Tsai) Date: Sat, 13 Sep 2025 17:02:44 +0800 Subject: [PATCH v8 3/5] ARM: dts: sunxi: add support for NetCube Systems Nagami SoM In-Reply-To: <20250831162536.2380589-4-lukas.schmid@netcube.li> References: <20250831162536.2380589-1-lukas.schmid@netcube.li> <20250831162536.2380589-4-lukas.schmid@netcube.li> Message-ID: Hi, On Mon, Sep 1, 2025 at 12:26 AM Lukas Schmid wrote: > > NetCube Systems Nagami SoM is a module based around the Allwinner T113s > SoC. It includes the following features and interfaces: > > - 128MB DDR3 included in SoC > - 10/100 Mbps Ethernet using LAN8720A phy > - One USB-OTG interface > - One USB-Host interface > - One I2S interface with in and output support > - Two CAN interfaces > - ESP32 over SDIO > - One SPI interface > - I2C EEPROM for MAC address > - One QWIIC I2C Interface with dedicated interrupt pin shared with EEPROM > - One external I2C interface > - SD interface for external SD-Card > > Signed-off-by: Lukas Schmid > --- > .../allwinner/sun8i-t113s-netcube-nagami.dtsi | 250 ++++++++++++++++++ > 1 file changed, 250 insertions(+) > create mode 100644 arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami.dtsi > > diff --git a/arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami.dtsi b/arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami.dtsi > new file mode 100644 > index 0000000000000..4d3627f6d58d2 > --- /dev/null > +++ b/arch/arm/boot/dts/allwinner/sun8i-t113s-netcube-nagami.dtsi > @@ -0,0 +1,250 @@ > +// SPDX-License-Identifier: (GPL-2.0+ OR MIT) > +/* > + * Copyright (C) 2025 Lukas Schmid > + */ > + > +/dts-v1/; > +#include "sun8i-t113s.dtsi" > + > +#include > +#include > + > +/ { > + model = "NetCube Systems Nagami SoM"; > + compatible = "netcube,nagami", "allwinner,sun8i-t113s"; > + > +
aliases { > + serial1 = &uart1; // ESP32 Bootloader UART > + serial3 = &uart3; // Console UART on Card Edge > + ethernet0 = &emac; > + }; > + > + chosen { > + stdout-path = "serial3:115200n8"; > + }; > + > + /* module wide 3.3V supply directly from the card edge */ > + reg_vcc3v3: regulator-3v3 { > + compatible = "regulator-fixed"; > + regulator-name = "vcc-3v3"; > + regulator-min-microvolt = <3300000>; > + regulator-max-microvolt = <3300000>; > + regulator-always-on; > + }; > + > + /* SY8008 DC/DC regulator on the board, also supplying VDD-SYS */ > + reg_vcc_core: regulator-core { > + compatible = "regulator-fixed"; > + regulator-name = "vcc-core"; > + regulator-min-microvolt = <880000>; > + regulator-max-microvolt = <880000>; > + vin-supply = <&reg_vcc3v3>; > + }; > + > + /* USB0 MUX to switch connect to Card-Edge only after BootROM */ > + usb0_sec_mux: mux-controller{ > + compatible = "gpio-mux"; > + #mux-control-cells = <0>; > + mux-gpios = <&pio 3 9 GPIO_ACTIVE_HIGH>; /* PD9 */ > + idle-state = <1>; /* USB connected to Card-Edge by default */ > + }; > + > + /* Reset of ESP32 */ > + wifi_pwrseq: wifi-pwrseq { > + compatible = "mmc-pwrseq-simple"; > + reset-gpios = <&pio 6 9 GPIO_ACTIVE_LOW>; /* PG9 */ > + post-power-on-delay-ms = <1500>; > + power-off-delay-us = <200>; > + }; > +}; > + > +&cpu0 { > + cpu-supply = <&reg_vcc_core>; > +}; > + > +&cpu1 { > + cpu-supply = <&reg_vcc_core>; > +}; > + > +&dcxo { > + clock-frequency = <24000000>; > +}; > + > +&emac { > + nvmem-cells = <&eth0_macaddress>; > + nvmem-cell-names = "mac-address"; > + phy-handle = <&lan8720a>; > + phy-mode = "rmii"; > + pinctrl-0 = <&rmii_pe_pins>; > + pinctrl-names = "default"; > + status = "okay"; > +}; > + > +/* Default I2C Interface on Card-Edge */ > +&i2c2 { > + pinctrl-0 = <&i2c2_pins>; > + pinctrl-names = "default"; > + status = "disabled"; > +}; > + > +/* Exposed as the QWIIC connector and used by the internal EEPROM */ > +&i2c3 { > + pinctrl-0 = <&i2c3_pins>; > + pinctrl-names = "default"; > +
status = "okay"; > + > + eeprom0: eeprom at 50 { > + compatible = "atmel,24c02"; /* actually it's a 24AA02E48 */ > + reg = <0x50>; > + pagesize = <16>; > + read-only; > + vcc-supply = <&reg_vcc3v3>; > + > + #address-cells = <1>; > + #size-cells = <1>; > + > + eth0_macaddress: macaddress at fa { > + reg = <0xfa 0x06>; > + }; > + }; > +}; > + > +/* Default I2S Interface on Card-Edge */ > +&i2s1 { > + pinctrl-0 = <&i2s1_pins>, <&i2s1_din_pins>, <&i2s1_dout_pins>; > + pinctrl-names = "default"; > + status = "disabled"; > +}; > + > +/* Phy is on SoM. MDI signals pre-magentics are on the card edge */ ^ pre-magnetics? Will fix up when applying if nothing else in the series is wrong. ChenYu > +&mdio { > + lan8720a: ethernet-phy at 0 { > + compatible = "ethernet-phy-ieee802.3-c22"; > + reg = <0>; > + }; > +}; > + > +/* Default SD Interface on Card-Edge */ > +&mmc0 { > + pinctrl-0 = <&mmc0_pins>; > + pinctrl-names = "default"; > + status = "disabled"; > +}; > + > +/* Connected to the on-board ESP32 */ > +&mmc1 { > + pinctrl-0 = <&mmc1_pins>; > + pinctrl-names = "default"; > + vmmc-supply = <&reg_vcc3v3>; > + bus-width = <4>; > + non-removable; > + mmc-pwrseq = <&wifi_pwrseq>; > + status = "okay"; > +}; > + > +/* Connected to the on-board eMMC */ > +&mmc2 { > + pinctrl-0 = <&mmc2_pins>; > + pinctrl-names = "default"; > + vmmc-supply = <&reg_vcc3v3>; > + vqmmc-supply = <&reg_vcc3v3>; > + bus-width = <4>; > + non-removable; > + status = "okay"; > +}; > + > +&pio { > + vcc-pb-supply = <&reg_vcc3v3>; > + vcc-pc-supply = <&reg_vcc3v3>; > + vcc-pd-supply = <&reg_vcc3v3>; > + vcc-pe-supply = <&reg_vcc3v3>; > + vcc-pf-supply = <&reg_vcc3v3>; > + vcc-pg-supply = <&reg_vcc3v3>; > + > + gpio-line-names = "", "", "", "", // PA > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "CAN0_TX", "CAN0_RX", // PB > + "CAN1_TX", "CAN1_RX", "UART3_TX", "UART3_RX", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > +
"", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "eMMC_CLK", "eMMC_CMD", // PC > + "eMMC_D2", "eMMC_D1", "eMMC_D0", "eMMC_D3", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", // PD > + "", "", "", "", > + "", "USB_SEC_EN", "SPI1_CS", "SPI1_CLK", > + "SPI1_MOSI", "SPI1_MISO", "SPI1_HOLD", "SPI1_WP", > + "PD16", "", "", "", > + "I2C2_SCL", "I2C2_SDA", "PD22", "", > + "", "", "", "", > + "", "", "", "", > + "ETH_CRSDV", "ETH_RXD0", "ETH_RXD1", "ETH_TXCK", // PE > + "ETH_TXD0", "ETH_TXD1", "ETH_TXEN", "", > + "ETH_MDC", "ETH_MDIO", "QWIIC_nINT", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "SD_D1", "SD_D0", "SD_CLK", "SD_CLK", // PF > + "SD_D3", "SD_D2", "PF6", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "ESP_CLK", "ESP_CMD", "ESP_D0", "ESP_D1", // PG > + "ESP_D2", "ESP_D3", "UART1_TXD", "UART1_RXD", > + "ESP_nBOOT", "ESP_nRST", "I2C3_SCL", "I2C3_SDA", > + "I2S1_WS", "I2S1_CLK", "I2S1_DIN0", "I2S1_DOUT0", > + "", "", "", "", > + "", "", "", "", > + "", "", "", "", > + "", "", "", ""; > +}; > + > +/* Remove the unused CK pin from the pinctl as it is unconnected */ > +&rmii_pe_pins { > + pins = "PE0", "PE1", "PE2", "PE3", "PE4", > + "PE5", "PE6", "PE8", "PE9"; > +}; > + > +/* Default SPI Interface on Card-Edge */ > +&spi1 { > + #address-cells = <1>; > + #size-cells = <0>; > + pinctrl-0 = <&spi1_pins>, <&spi1_hold_pin>, <&spi1_wp_pin>; > + pinctrl-names = "default"; > + cs-gpios = <0>; > + status = "disabled"; > +}; > + > +/* Connected to the Bootloader/Console of the ESP32 */ > +&uart1 { > + pinctrl-0 = <&uart1_pg6_pins>; > + pinctrl-names = "default"; > + status = "okay"; > +}; > + > +/* Console/Debug UART on Card-Edge */ > +&uart3 { > + pinctrl-0 = <&uart3_pb_pins>; > + pinctrl-names = "default"; > + status = 
"okay"; > +}; > -- > 2.39.5 > > From wens at csie.org Sat Sep 13 02:09:05 2025 From: wens at csie.org (Chen-Yu Tsai) Date: Sat, 13 Sep 2025 17:09:05 +0800 Subject: [PATCH v8 2/5] riscv: dts: allwinner: d1s-t113: Add pinctrl's required by NetCube Systems Nagami SoM In-Reply-To: <20250831162536.2380589-3-lukas.schmid@netcube.li> References: <20250831162536.2380589-1-lukas.schmid@netcube.li> <20250831162536.2380589-3-lukas.schmid@netcube.li> Message-ID: On Mon, Sep 1, 2025 at 12:26?AM Lukas Schmid wrote: > > Added the following pinctrl's used by the NetCube Systems Nagami SoM > * i2c2_pins > * i2c3_pins > * i2s1_pins, i2s1_din_pins, i2s1_dout_pins > * spi1_pins, spi1_hold_pin, spi1_wp_pin > > Signed-off-by: Lukas Schmid > --- > .../boot/dts/allwinner/sunxi-d1s-t113.dtsi | 48 +++++++++++++++++++ > 1 file changed, 48 insertions(+) > > diff --git a/arch/riscv/boot/dts/allwinner/sunxi-d1s-t113.dtsi b/arch/riscv/boot/dts/allwinner/sunxi-d1s-t113.dtsi > index e4175adb028da..c00996d6275c5 100644 > --- a/arch/riscv/boot/dts/allwinner/sunxi-d1s-t113.dtsi > +++ b/arch/riscv/boot/dts/allwinner/sunxi-d1s-t113.dtsi > @@ -78,6 +78,36 @@ dsi_4lane_pins: dsi-4lane-pins { > function = "dsi"; > }; > > + /omit-if-no-ref/ > + i2c2_pins: i2c2-pins { > + pins = "PD20", "PD21"; > + function = "i2c2"; > + }; > + > + /omit-if-no-ref/ > + i2c3_pins: i2c3-pins { > + pins = "PG10", "PG11"; > + function = "i2c3"; > + }; Because i2c2 and i2c3 have multiple options, they should be named appropriately, like i2c2-pd-pins and i2c3-pg-pins > + > + /omit-if-no-ref/ > + i2s1_pins: i2s1-pins { > + pins = "PG12", "PG13"; > + function = "i2s1"; > + }; > + > + /omit-if-no-ref/ > + i2s1_din_pins: i2s1-din-pins { > + pins = "PG14"; > + function = "i2s1_din"; > + }; > + > + /omit-if-no-ref/ > + i2s1_dout_pins: i2s1-dout-pins { > + pins = "PG15"; > + function = "i2s1_dout"; > + }; Should be *din0* and *dout0*, since you have din1 and dout1 on the same pins but swapped around. 
ChenYu > + > /omit-if-no-ref/ > lcd_rgb666_pins: lcd-rgb666-pins { > pins = "PD0", "PD1", "PD2", "PD3", "PD4", "PD5", > @@ -126,6 +156,24 @@ spi0_pins: spi0-pins { > function = "spi0"; > }; > > + /omit-if-no-ref/ > + spi1_pins: spi1-pins { > + pins = "PD10", "PD11", "PD12", "PD13"; > + function = "spi1"; > + }; > + > + /omit-if-no-ref/ > + spi1_hold_pin: spi1-hold-pin { > + pins = "PD14"; > + function = "spi1"; > + }; > + > + /omit-if-no-ref/ > + spi1_wp_pin: spi1-wp-pin { > + pins = "PD15"; > + function = "spi1"; > + }; > + > /omit-if-no-ref/ > uart1_pg6_pins: uart1-pg6-pins { > pins = "PG6", "PG7"; > -- > 2.39.5 > > From rabenda.cn at gmail.com Sat Sep 13 02:29:06 2025 From: rabenda.cn at gmail.com (Han Gao) Date: Sat, 13 Sep 2025 17:29:06 +0800 Subject: [PATCH 2/3] riscv: dts: thead: add ziccrse for th1520 In-Reply-To: <20250912-gander-fox-d20c2e431816@spud> References: <20250911184528.1512543-1-rabenda.cn@gmail.com> <20250911184528.1512543-3-rabenda.cn@gmail.com> <20250912-gander-fox-d20c2e431816@spud> Message-ID: On Sat, Sep 13, 2025 at 1:57 AM Conor Dooley wrote: > > On Fri, Sep 12, 2025 at 02:45:27AM +0800, Han Gao wrote: > > th1520 support Ziccrse ISA extension [1]. > > > > Link: https://lore.kernel.org/all/20241103145153.105097-12-alexghiti at rivosinc.com/ [1] > > I don't see what this link has to do with th1520 supporting the > extension. The kernel supporting it has nothing to do with whether it > should be in the dts or not. A useful link would substantiate your > claim. Existing rv64 hardware conforms to the rva20 profile. Ziccrse is an additional extension required by the rva20 profile, so th1520 has this extension. Link: https://github.com/riscv/riscv-profiles/blob/main/src/profiles.adoc#511-rva20u64-mandatory-base [1] > > > Signed-off-by: Han Gao > > Signed-off-by: Han Gao > > You only need to sign this off once. > > Cheers, > Conor.
> > > --- > > arch/riscv/boot/dts/thead/th1520.dtsi | 24 ++++++++++++++++-------- > > 1 file changed, 16 insertions(+), 8 deletions(-) > > > > diff --git a/arch/riscv/boot/dts/thead/th1520.dtsi b/arch/riscv/boot/dts/thead/th1520.dtsi > > index 59d1927764a6..7f07688aa964 100644 > > --- a/arch/riscv/boot/dts/thead/th1520.dtsi > > +++ b/arch/riscv/boot/dts/thead/th1520.dtsi > > @@ -24,8 +24,10 @@ c910_0: cpu at 0 { > > device_type = "cpu"; > > riscv,isa = "rv64imafdc"; > > riscv,isa-base = "rv64i"; > > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > > - "zifencei", "zihpm", "xtheadvector"; > > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > > + "ziccrse", "zicntr", "zicsr", > > + "zifencei", "zihpm", > > + "xtheadvector"; > > thead,vlenb = <16>; > > reg = <0>; > > i-cache-block-size = <64>; > > @@ -49,8 +51,10 @@ c910_1: cpu at 1 { > > device_type = "cpu"; > > riscv,isa = "rv64imafdc"; > > riscv,isa-base = "rv64i"; > > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > > - "zifencei", "zihpm", "xtheadvector"; > > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > > + "ziccrse", "zicntr", "zicsr", > > + "zifencei", "zihpm", > > + "xtheadvector"; > > thead,vlenb = <16>; > > reg = <1>; > > i-cache-block-size = <64>; > > @@ -74,8 +78,10 @@ c910_2: cpu at 2 { > > device_type = "cpu"; > > riscv,isa = "rv64imafdc"; > > riscv,isa-base = "rv64i"; > > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > > - "zifencei", "zihpm", "xtheadvector"; > > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > > + "ziccrse", "zicntr", "zicsr", > > + "zifencei", "zihpm", > > + "xtheadvector"; > > thead,vlenb = <16>; > > reg = <2>; > > i-cache-block-size = <64>; > > @@ -99,8 +105,10 @@ c910_3: cpu at 3 { > > device_type = "cpu"; > > riscv,isa = "rv64imafdc"; > > riscv,isa-base = "rv64i"; > > - riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "zicntr", "zicsr", > > - "zifencei", "zihpm", 
"xtheadvector"; > > + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", > > + "ziccrse", "zicntr", "zicsr", > > + "zifencei", "zihpm", > > + "xtheadvector"; > > thead,vlenb = <16>; > > reg = <3>; > > i-cache-block-size = <64>; > > -- > > 2.47.3 > > From wens at kernel.org Sat Sep 13 02:30:16 2025 From: wens at kernel.org (Chen-Yu Tsai) Date: Sat, 13 Sep 2025 17:30:16 +0800 Subject: [PATCH v8 0/5] Add support for NetCube Systems Nagami SoM and its carrier boards In-Reply-To: <20250831162536.2380589-1-lukas.schmid@netcube.li> References: <20250831162536.2380589-1-lukas.schmid@netcube.li> Message-ID: <175775572870.3891284.1456456718289976149.b4-ty@csie.org> From: Chen-Yu Tsai On Sun, 31 Aug 2025 18:25:29 +0200, Lukas Schmid wrote: > This series adds support for the NetCube Systems Nagami SoM and its > associated carrier boards, the Nagami Basic Carrier and the Nagami Keypad > Carrier. > > Changes in v8: > - Use a gpio-mux instead of the gpio-hog for the USB0_SEC_EN signal > - Fix the dt-schema issues > > [...] Applied to sunxi/dt-for-6.18 in local tree, thanks! [1/5] dt-bindings: arm: sunxi: Add NetCube Systems Nagami SoM and carrier board bindings commit: db5796c5c5c6db72339e818b54e6a2e043f7032c [2/5] riscv: dts: allwinner: d1s-t113: Add pinctrl's required by NetCube Systems Nagami SoM commit: cbce6d5326b116f55dc29f7fc0a7d56a9a03d9e5 [3/5] ARM: dts: sunxi: add support for NetCube Systems Nagami SoM commit: cba2febbd6465aabdff157fb95b1c07d090af1f0 [4/5] ARM: dts: sunxi: add support for NetCube Systems Nagami Basic Carrier commit: e36d4d54eefb60144666b27754007e1c0dd0a581 [5/5] ARM: dts: sunxi: add support for NetCube Systems Nagami Keypad Carrier commit: caffed0800ef4dd29cc29ee17a89d015e867e03a Note that there were some cases in the device tree files where lines were indented more than necessary, like for gpio-line-names and the board level fallback compatible string. Wrapped lines for lists of items should align with the first item on the first line. 
Best regards, -- Chen-Yu Tsai From fustini at kernel.org Sat Sep 13 14:30:59 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:30:59 -0700 Subject: [PATCH 0/7] RISC-V: Add support for Tenstorrent Blackhole SoC Message-ID: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Enable support for the Tenstorrent Blackhole A0 SoC in the Blackhole P100 and P150 PCIe cards [1]. The Blackhole SoC contains four RISC-V CPU tiles consisting of 4x SiFive X280 cores. Each tile is capable of running an instance of Linux. There is a public Linux-on-Blackhole project [2] that enables users to boot Linux on Blackhole PCIe cards. A boot script on the PCIe host loads the kernel image and the rootfs into DDR memory and then takes the X280 cores out of reset. All the low-level SoC initialization is handled by firmware [3] running on a separate management core in the Blackhole SoC. Linux on the X280 cores does not need to deal with any clocks, reset, etc. The management core firmware also controls the PCIe EP functionality. The tt-kmd Linux kernel driver [4] on the PCIe host allows the host to interact with the DDR memory on the Blackhole PCIe card along with other tiles in the SoC accessible from the NoC [5]. There is a virtual UART implemented in OpenSBI [6] that allows a console program on the PCIe host to communicate through shared memory with Linux running on the Blackhole. This does require CONFIG_HVC_RISCV_SBI which is currently hidden behind CONFIG_NONPORTABLE. I would like Blackhole to work with defconfig, so I'm looking into possible ways of solving the issue that caused HVC SBI to be guarded by NONPORTABLE [7]. The public Linux-on-Blackhole project does also make use of virtio to provide networking and storage. However, this relies on changes in our downstream kernel branch [8], so I've removed those dt nodes from this upstream dts series. We hope to eventually leverage the virtio-msg spec to upstream the virtio functionality, too. 
I have also dropped the bootargs from this series. Instead, I will add the ability to fixup the dtb to the boot script on the host [9]. It does need 'console=hvc0' to ensure the full boot output appears in the console program on the host. I also dropped the pmem node from this series as I don't see any upstream users of pmem. I have been using pmem for the rootfs, so I'll update the boot script to add the pmem node and amend 'root=/dev/pmem0' in bootargs. TL;DR: The goal for upstreaming this rather minimal device tree in this series is to make it possible to boot mainline kernel builds. I attended the recent KernelCI workshop, and there are not currently many RISC-V boards doing boot tests. I think the Blackhole cards could help improve the situation once Blackhole is able to boot important trees like mainline and next. The HVC SBI console is sufficient for boot testing. [1] https://tenstorrent.com/hardware/blackhole [2] https://github.com/tenstorrent/tt-bh-linux [3] https://github.com/tenstorrent/tt-zephyr-platforms [4] https://github.com/tenstorrent/tt-kmd [5] https://github.com/tenstorrent/tt-isa-documentation/blob/main/BlackholeA0/ [6] https://github.com/tenstorrent/opensbi/ [7] https://lore.kernel.org/all/20240214153429.16484-2-palmer at rivosinc.com/ [8] https://github.com/tenstorrent/linux/ [9] https://github.com/tenstorrent/tt-bh-linux/blob/dfustini/kernelci/boot.py Signed-off-by: Drew Fustini --- Drew Fustini (7): dt-bindings: vendor-prefixes: Add Tenstorrent AI ULC dt-bindings: riscv: Add Tenstorrent Blackhole compatible dt-bindings: riscv: cpus: Add SiFive X280 compatible dt-bindings: timers: Add Tenstorrent Blackhole compatible dt-bindings: interrupt-controller: Add Tenstorrent Blackhole compatible riscv: dts: Add Tenstorrent Blackhole A0 SoC PCIe cards riscv: Kconfig.socs: Add ARCH_TENSTORRENT for Tenstorrent SoCs .../interrupt-controller/sifive,plic-1.0.0.yaml | 1 + Documentation/devicetree/bindings/riscv/cpus.yaml | 1 + 
.../devicetree/bindings/riscv/tenstorrent.yaml | 28 ++++++ .../devicetree/bindings/timer/sifive,clint.yaml | 1 + .../devicetree/bindings/vendor-prefixes.yaml | 2 + MAINTAINERS | 9 ++ arch/riscv/Kconfig.socs | 8 ++ arch/riscv/boot/dts/Makefile | 1 + arch/riscv/boot/dts/tenstorrent/Makefile | 2 + .../boot/dts/tenstorrent/blackhole-a0-card.dts | 14 +++ arch/riscv/boot/dts/tenstorrent/blackhole-a0.dtsi | 112 +++++++++++++++++++++ 11 files changed, 179 insertions(+) --- base-commit: 76eeb9b8de9880ca38696b2fb56ac45ac0a25c6c change-id: 20250912-tt-bh-dts-d7bcc507a556 Best regards, -- Drew Fustini From fustini at kernel.org Sat Sep 13 14:31:00 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:00 -0700 Subject: [PATCH 1/7] dt-bindings: vendor-prefixes: Add Tenstorrent AI ULC In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-1-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Document vendor prefix for Tenstorrent in DT bindings. Signed-off-by: Drew Fustini --- Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml index 9ec8947dfcad2fa53b2dca2ca06a63710771a600..8bbc0ebdfb9eb5864f2797251a8d144e2eea9a92 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.yaml +++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml @@ -1547,6 +1547,8 @@ patternProperties: description: Teltonika Networks "^tempo,.*": description: Tempo Semiconductor + "^tenstorrent,.*": + description: Tenstorrent AI ULC "^terasic,.*": description: Terasic Inc. 
"^tesla,.*": -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:01 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:01 -0700 Subject: [PATCH 2/7] dt-bindings: riscv: Add Tenstorrent Blackhole compatible In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-2-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Add compatibles for the Tenstorrent Blackhole A0 SoC PCIe card. Signed-off-by: Drew Fustini --- .../devicetree/bindings/riscv/tenstorrent.yaml | 28 ++++++++++++++++++++++ MAINTAINERS | 8 +++++++ 2 files changed, 36 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/tenstorrent.yaml b/Documentation/devicetree/bindings/riscv/tenstorrent.yaml new file mode 100644 index 0000000000000000000000000000000000000000..877da1b0214f6730713369f82a1fdcc44c4ea562 --- /dev/null +++ b/Documentation/devicetree/bindings/riscv/tenstorrent.yaml @@ -0,0 +1,28 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/riscv/tenstorrent.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Tenstorrent SoC-based boards + +maintainers: + - Drew Fustini + - Joel Stanley + +description: + Tenstorrent SoC-based boards + +properties: + $nodename: + const: '/' + compatible: + oneOf: + - description: Tenstorrent Blackhole A0 PCIe card + items: + - const: tenstorrent,blackhole-a0-card + - const: tenstorrent,blackhole-a0 + +additionalProperties: true + +... 
diff --git a/MAINTAINERS b/MAINTAINERS index cd7ff55b5d321752ac44c91d2d7e74de28e08960..f2cb2aae8d66d21bf5c13b16b3b1d8fdc98b9462 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21741,6 +21741,14 @@ F: arch/riscv/boot/dts/spacemit/ N: spacemit K: spacemit +RISC-V TENSTORRENT SoC SUPPORT +M: Drew Fustini +M: Joel Stanley +L: linux-riscv at lists.infradead.org +S: Maintained +T: git https://github.com/tenstorrent/linux.git +F: Documentation/devicetree/bindings/riscv/tenstorrent.yaml + RISC-V THEAD SoC SUPPORT M: Drew Fustini M: Guo Ren -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:02 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:02 -0700 Subject: [PATCH 3/7] dt-bindings: riscv: cpus: Add SiFive X280 compatible In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-3-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Document compatible for the SiFive X280 RISC-V core. 
Signed-off-by: Drew Fustini --- Documentation/devicetree/bindings/riscv/cpus.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml index 1a0cf0702a45d2df38c48f50d66b3d2ac3715da5..bbc3886282dc5e8c53e54c0acd91608b443f590f 100644 --- a/Documentation/devicetree/bindings/riscv/cpus.yaml +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml @@ -69,6 +69,7 @@ properties: - enum: - sifive,e51 - sifive,u54-mc + - sifive,x280 - const: sifive,rocket0 - const: riscv - const: riscv # Simulator only -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:03 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:03 -0700 Subject: [PATCH 4/7] dt-bindings: timers: Add Tenstorrent Blackhole compatible In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-4-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Document clint compatible for the Tenstorrent Blackhole A0 SoC. 
Signed-off-by: Drew Fustini --- Documentation/devicetree/bindings/timer/sifive,clint.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/timer/sifive,clint.yaml b/Documentation/devicetree/bindings/timer/sifive,clint.yaml index d85a1a088b35dabc0aa202475b926302705c4cf1..198146c59de0c95a2ffa052c8d4d7aa3f91f8e92 100644 --- a/Documentation/devicetree/bindings/timer/sifive,clint.yaml +++ b/Documentation/devicetree/bindings/timer/sifive,clint.yaml @@ -36,6 +36,7 @@ properties: - starfive,jh7100-clint # StarFive JH7100 - starfive,jh7110-clint # StarFive JH7110 - starfive,jh8100-clint # StarFive JH8100 + - tenstorrent,blackhole-a0-clint # Tenstorrent Blackhole - const: sifive,clint0 # SiFive CLINT v0 IP block - items: - {} -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:06 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:06 -0700 Subject: [PATCH 7/7] riscv: Kconfig.socs: Add ARCH_TENSTORRENT for Tenstorrent SoCs In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-7-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Add Kconfig option ARCH_TENSTORRENT to enable support for SoCs like the Blackhole A0. Signed-off-by: Drew Fustini --- arch/riscv/Kconfig.socs | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs index 61ceae0aa27a6fa3a91da6a46becfd96da99fd09..ff733a998612d429e7b1e00276eb86290d8331a3 100644 --- a/arch/riscv/Kconfig.socs +++ b/arch/riscv/Kconfig.socs @@ -57,6 +57,14 @@ config ARCH_SUNXI This enables support for Allwinner sun20i platform hardware, including boards based on the D1 and D1s SoCs. +config ARCH_TENSTORRENT + bool "Tenstorrent SoCs" + help + This enables support for Tenstorrent SoC platforms. + Current support is for Blackhole P100 and P150 PCIe cards. 
+ The Blackhole A0 SoC contains four RISC-V CPU tiles each + consisting of 4x SiFive X280 cores. + config ARCH_THEAD bool "T-HEAD RISC-V SoCs" depends on MMU && !XIP_KERNEL -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:04 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:04 -0700 Subject: [PATCH 5/7] dt-bindings: interrupt-controller: Add Tenstorrent Blackhole compatible In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-5-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Document compatible for the PLIC in the Tenstorrent Blackhole A0 SoC. Signed-off-by: Drew Fustini --- .../devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml b/Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml index 5b827bc243011cda1fd45d739d34eca95c6e1ee2..c960a9ec17e9fceb0b754c21162e8730b12120fb 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml +++ b/Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml @@ -63,6 +63,7 @@ properties: - spacemit,k1-plic - starfive,jh7100-plic - starfive,jh7110-plic + - tenstorrent,blackhole-a0-plic - const: sifive,plic-1.0.0 - items: - enum: -- 2.34.1 From fustini at kernel.org Sat Sep 13 14:31:05 2025 From: fustini at kernel.org (Drew Fustini) Date: Sat, 13 Sep 2025 14:31:05 -0700 Subject: [PATCH 6/7] riscv: dts: Add Tenstorrent Blackhole A0 SoC PCIe cards In-Reply-To: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> References: <20250913-tt-bh-dts-v1-0-ddb0d6860fe5@tenstorrent.com> Message-ID: <20250913-tt-bh-dts-v1-6-ddb0d6860fe5@tenstorrent.com> From: Drew Fustini Add device tree source describing the Tenstorrent Blackhole A0 SoC and the Blackhole P100 and P150 PCIe cards. 
There are no differences between the P100 and P150 cards from the perspective of an OS kernel like Linux running on the X280 cores. Link: https://github.com/tenstorrent/tt-isa-documentation/blob/main/BlackholeA0/ Signed-off-by: Drew Fustini --- MAINTAINERS | 1 + arch/riscv/boot/dts/Makefile | 1 + arch/riscv/boot/dts/tenstorrent/Makefile | 2 + .../boot/dts/tenstorrent/blackhole-a0-card.dts | 14 +++ arch/riscv/boot/dts/tenstorrent/blackhole-a0.dtsi | 112 +++++++++++++++++++++ 5 files changed, 130 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index f2cb2aae8d66d21bf5c13b16b3b1d8fdc98b9462..20605d7530a6d19e928709647ea91a9cf7913ee7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21748,6 +21748,7 @@ L: linux-riscv at lists.infradead.org S: Maintained T: git https://github.com/tenstorrent/linux.git F: Documentation/devicetree/bindings/riscv/tenstorrent.yaml +F: arch/riscv/boot/dts/tenstorrent/ RISC-V THEAD SoC SUPPORT M: Drew Fustini diff --git a/arch/riscv/boot/dts/Makefile b/arch/riscv/boot/dts/Makefile index 3b99e91efa25be2d6ca5bc173342c24a72f87187..0624199867065dbb5eb62d660f950b4aa3a7abd7 100644 --- a/arch/riscv/boot/dts/Makefile +++ b/arch/riscv/boot/dts/Makefile @@ -8,4 +8,5 @@ subdir-y += sifive subdir-y += sophgo subdir-y += spacemit subdir-y += starfive +subdir-y += tenstorrent subdir-y += thead diff --git a/arch/riscv/boot/dts/tenstorrent/Makefile b/arch/riscv/boot/dts/tenstorrent/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..009510bea6c8e558bda70850a7f8490b23bffdea --- /dev/null +++ b/arch/riscv/boot/dts/tenstorrent/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +dtb-$(CONFIG_ARCH_TENSTORRENT) += blackhole-a0-card.dtb diff --git a/arch/riscv/boot/dts/tenstorrent/blackhole-a0-card.dts b/arch/riscv/boot/dts/tenstorrent/blackhole-a0-card.dts new file mode 100644 index 0000000000000000000000000000000000000000..b2b08023643a2cebd4f924579024290bb355c9b3 --- /dev/null +++ 
b/arch/riscv/boot/dts/tenstorrent/blackhole-a0-card.dts @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +/dts-v1/; + +#include "blackhole-a0.dtsi" + +/ { + model = "Tenstorrent Blackhole A0 SoC PCIe card"; + compatible = "tenstorrent,blackhole-a0-card", "tenstorrent,blackhole-a0"; + + memory@0 { + device_type = "memory"; + reg = <0x4000 0x30000000 0x1 0x00000000>; + }; +}; diff --git a/arch/riscv/boot/dts/tenstorrent/blackhole-a0.dtsi b/arch/riscv/boot/dts/tenstorrent/blackhole-a0.dtsi new file mode 100644 index 0000000000000000000000000000000000000000..517b6442ff0fe61659069e29318ad3f01bc504e2 --- /dev/null +++ b/arch/riscv/boot/dts/tenstorrent/blackhole-a0.dtsi @@ -0,0 +1,112 @@ +// SPDX-License-Identifier: (GPL-2.0 OR MIT) +// Copyright 2025 Tenstorrent AI ULC +/dts-v1/; + +/ { + compatible = "tenstorrent,blackhole-a0"; + #address-cells = <2>; + #size-cells = <2>; + + cpus { + #address-cells = <0x1>; + #size-cells = <0x0>; + timebase-frequency = <50000000>; + + cpu@0 { + compatible = "sifive,x280", "sifive,rocket0", "riscv"; + device_type = "cpu"; + reg = <0>; + mmu-type = "riscv,sv57"; + riscv,isa = "rv64imafdcv_zicsr_zifencei_zfh_zba_zbb_sscofpmf"; + riscv,isa-base = "rv64i"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "v", "zicsr", + "zifencei", "zfh", "zba", "zbb", "sscofpmf"; + riscv,cboz-block-size = <0x40>; + cpu0_intc: interrupt-controller { + compatible = "riscv,cpu-intc"; + #interrupt-cells = <1>; + interrupt-controller; + }; + }; + + cpu@1 { + compatible = "sifive,x280", "sifive,rocket0", "riscv"; + device_type = "cpu"; + reg = <1>; + mmu-type = "riscv,sv57"; + riscv,isa = "rv64imafdcv_zicsr_zifencei_zfh_zba_zbb_sscofpmf"; + riscv,isa-base = "rv64i"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "v", "zicsr", + "zifencei", "zfh", "zba", "zbb", "sscofpmf"; + riscv,cboz-block-size = <0x40>; + cpu1_intc: interrupt-controller { + compatible = "riscv,cpu-intc"; + #interrupt-cells = <1>; + interrupt-controller; + 
}; + }; + + cpu@2 { + compatible = "sifive,x280", "sifive,rocket0", "riscv"; + device_type = "cpu"; + reg = <2>; + mmu-type = "riscv,sv57"; + riscv,isa = "rv64imafdcv_zicsr_zifencei_zfh_zba_zbb_sscofpmf"; + riscv,isa-base = "rv64i"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "v", "zicsr", + "zifencei", "zfh", "zba", "zbb", "sscofpmf"; + riscv,cboz-block-size = <0x40>; + cpu2_intc: interrupt-controller { + compatible = "riscv,cpu-intc"; + #interrupt-cells = <1>; + interrupt-controller; + }; + }; + + cpu@3 { + compatible = "sifive,x280", "sifive,rocket0", "riscv"; + device_type = "cpu"; + reg = <3>; + mmu-type = "riscv,sv57"; + riscv,isa-base = "rv64i"; + riscv,isa = "rv64imafdcv_zicsr_zifencei_zfh_zba_zbb_sscofpmf"; + riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "v", "zicsr", + "zifencei", "zfh", "zba", "zbb", "sscofpmf"; + riscv,cboz-block-size = <0x40>; + cpu3_intc: interrupt-controller { + compatible = "riscv,cpu-intc"; + #interrupt-cells = <1>; + interrupt-controller; + }; + }; + }; + + soc { + #address-cells = <2>; + #size-cells = <2>; + compatible = "simple-bus"; + ranges; + + clint0: timer@2000000 { + compatible = "tenstorrent,blackhole-a0-clint", "sifive,clint0"; + reg = <0x0 0x2000000 0x0 0x10000>; + interrupts-extended = <&cpu0_intc 0x3>, <&cpu0_intc 0x7>, + <&cpu1_intc 0x3>, <&cpu1_intc 0x7>, + <&cpu2_intc 0x3>, <&cpu2_intc 0x7>, + <&cpu3_intc 0x3>, <&cpu3_intc 0x7>; + }; + + plic0: interrupt-controller@c000000 { + compatible = "tenstorrent,blackhole-a0-plic", "sifive,plic-1.0.0"; + reg = <0x0 0x0c000000 0x0 0x04000000>; + interrupts-extended = <&cpu0_intc 11>, <&cpu0_intc 9>, + <&cpu1_intc 11>, <&cpu1_intc 9>, + <&cpu2_intc 11>, <&cpu2_intc 9>, + <&cpu3_intc 11>, <&cpu3_intc 9>; + interrupt-controller; + #interrupt-cells = <1>; + #address-cells = <0>; + riscv,ndev = <128>; + }; + }; +}; -- 2.34.1 From sboyd at kernel.org Sat Sep 13 15:00:21 2025 From: sboyd at kernel.org (Stephen Boyd) Date: Sat, 13 Sep 2025 15:00:21 -0700 
Subject: [GIT PULL] clk: thead: Updates for v6.18 In-Reply-To: References: Message-ID: <175780082175.4354.18337386109597093831@lazor> Quoting Drew Fustini (2025-09-06 12:12:55) > The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: > > Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git tags/thead-clk-for-v6.18 > > for you to fetch changes up to c567bc5fc68c4388c00e11fc65fd14fe86b52070: > > clk: thead: th1520-ap: set all AXI clocks to CLK_IS_CRITICAL (2025-08-18 14:58:23 -0700) > > ---------------------------------------------------------------- Thanks. Pulled into clk-next From sboyd at kernel.org Sat Sep 13 15:05:56 2025 From: sboyd at kernel.org (Stephen Boyd) Date: Sat, 13 Sep 2025 15:05:56 -0700 Subject: [GIT PULL] clk: spacemit: Updates for v6.18 In-Reply-To: <20250909171321-GYC7803064@gentoo.org> References: <20250909171321-GYC7803064@gentoo.org> Message-ID: <175780115692.4354.14857999848718953911@lazor> Quoting Yixun Lan (2025-09-09 02:15:03) > Hi Stephen, > > Please pull SpacemiT's clock changes for v6.18 > > Yixun Lan > > The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585: > > Linux 6.17-rc1 (2025-08-10 19:41:16 +0300) > > are available in the Git repository at: > > https://github.com/spacemit-com/linux tags/spacemit-clk-for-6.18-1 > > for you to fetch changes up to d02c71cba7bba453d233a49497412ddbf2d44871: > > clk: spacemit: ccu_pll: convert from round_rate() to determine_rate() (2025-08-26 06:07:45 +0800) > > ---------------------------------------------------------------- Thanks. 
Pulled into clk-next From safinaskar at gmail.com Sat Sep 13 20:43:35 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:43:35 +0300 Subject: [PATCH RESEND 37/62] init: remove root_mountflags from init/do_mounts.h In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914034335.3506706-1-safinaskar@gmail.com> It is already declared in include/linux/kernel.h Signed-off-by: Askar Safin --- init/do_mounts.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/init/do_mounts.h b/init/do_mounts.h index 90422fb07c02..e225d594dd06 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -12,8 +12,6 @@ #include #include -extern int root_mountflags; - /* Ensure that async file closing finished to prevent spurious errors. */ static inline void init_flush_fput(void) { -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:50:27 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:50:27 +0300 Subject: [PATCH RESEND 38/62] init: remove most headers from init/do_mounts.h In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035027.3609569-1-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- init/do_mounts.c | 2 ++ init/do_mounts.h | 10 ---------- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/init/do_mounts.c b/init/do_mounts.c index 7ec5ee5a5c19..5b55d0035e03 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -5,12 +5,14 @@ #include #include #include +#include #include #include #include #include #include #include +#include #include #include #include diff --git a/init/do_mounts.h b/init/do_mounts.h index e225d594dd06..53e60add795a 100644 --- a/init/do_mounts.h +++ b/init/do_mounts.h @@ -1,14 +1,4 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#include -#include -#include -#include -#include -#include -#include -#include 
-#include -#include #include #include -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:51:03 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:51:03 +0300 Subject: [PATCH RESEND 39/62] init: make console_on_rootfs static In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035103.3619203-1-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- include/linux/initrd.h | 2 -- init/main.c | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 364b603215ac..55239701c4e0 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -23,6 +23,4 @@ extern unsigned long phys_external_initramfs_size; extern char __builtin_initramfs_start[]; extern unsigned long __builtin_initramfs_size; -void console_on_rootfs(void); - #endif /* __LINUX_INITRD_H */ diff --git a/init/main.c b/init/main.c index 58a7199c81f7..f119460bf8e1 100644 --- a/init/main.c +++ b/init/main.c @@ -1533,7 +1533,7 @@ static int __ref kernel_init(void *unused) } /* Open /dev/console, for stdin/stdout/stderr, this should never fail */ -void __init console_on_rootfs(void) +static void __init console_on_rootfs(void) { struct file *file = filp_open("/dev/console", O_RDWR, 0); -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:51:38 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:51:38 +0300 Subject: [PATCH RESEND 40/62] init: rename free_initrd_mem to free_initramfs_mem In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035138.3631173-1-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- arch/arm/mm/init.c | 2 +- arch/x86/mm/init.c | 2 +- include/linux/initrd.h | 2 +- init/initramfs.c | 10 +++++----- 4 files changed, 8 insertions(+), 8 
deletions(-) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 4faeec51c522..290e9f9874c9 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -437,7 +437,7 @@ void free_initmem(void) } #ifdef CONFIG_BLK_DEV_INITRD -void free_initrd_mem(unsigned long start, unsigned long end) +void free_initramfs_mem(unsigned long start, unsigned long end) { if (start == virt_external_initramfs_start) start = round_down(start, PAGE_SIZE); diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index bb57e93b4caf..c7ca996fb430 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -981,7 +981,7 @@ void __ref free_initmem(void) } #ifdef CONFIG_BLK_DEV_INITRD -void __init free_initrd_mem(unsigned long start, unsigned long end) +void __init free_initramfs_mem(unsigned long start, unsigned long end) { /* * end could be not aligned, and We can not align that, diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 55239701c4e0..b2a0128c3438 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -7,7 +7,7 @@ extern int initramfs_below_start_ok; extern unsigned long virt_external_initramfs_start, virt_external_initramfs_end; -extern void free_initrd_mem(unsigned long, unsigned long); +extern void free_initramfs_mem(unsigned long, unsigned long); #ifdef CONFIG_BLK_DEV_INITRD extern void __init reserve_initrd_mem(void); diff --git a/init/initramfs.c b/init/initramfs.c index 8ed352721a79..7a050e54ff1a 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -642,7 +642,7 @@ void __init reserve_initrd_mem(void) if (!phys_external_initramfs_size) return; /* - * Round the memory region to page boundaries as per free_initrd_mem() + * Round the memory region to page boundaries as per free_initramfs_mem() * This allows us to detect whether the pages overlapping the initrd * are in use, but more importantly, reserves the entire set of pages * as we don't want these pages allocated for other purposes. 
@@ -676,7 +676,7 @@ void __init reserve_initrd_mem(void) virt_external_initramfs_end = 0; } -void __weak __init free_initrd_mem(unsigned long start, unsigned long end) +void __weak __init free_initramfs_mem(unsigned long start, unsigned long end) { #ifdef CONFIG_ARCH_KEEP_MEMBLOCK unsigned long aligned_start = ALIGN_DOWN(start, PAGE_SIZE); @@ -707,9 +707,9 @@ static bool __init kexec_free_initrd(void) */ memset((void *)virt_external_initramfs_start, 0, virt_external_initramfs_end - virt_external_initramfs_start); if (virt_external_initramfs_start < crashk_start) - free_initrd_mem(virt_external_initramfs_start, crashk_start); + free_initramfs_mem(virt_external_initramfs_start, crashk_start); if (virt_external_initramfs_end > crashk_end) - free_initrd_mem(crashk_end, virt_external_initramfs_end); + free_initramfs_mem(crashk_end, virt_external_initramfs_end); return true; } #else @@ -744,7 +744,7 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) * free only memory that is not part of crashkernel region. 
*/ if (!do_retain_initrd && virt_external_initramfs_start && !kexec_free_initrd()) { - free_initrd_mem(virt_external_initramfs_start, virt_external_initramfs_end); + free_initramfs_mem(virt_external_initramfs_start, virt_external_initramfs_end); } else if (do_retain_initrd && virt_external_initramfs_start) { bin_attr_initrd.size = virt_external_initramfs_end - virt_external_initramfs_start; bin_attr_initrd.private = (void *)virt_external_initramfs_start; -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:52:15 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:52:15 +0300 Subject: [PATCH RESEND 41/62] init: rename reserve_initrd_mem to reserve_initramfs_mem In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035215.3641628-1-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- arch/arm/mm/init.c | 2 +- arch/loongarch/kernel/setup.c | 2 +- arch/riscv/mm/init.c | 2 +- include/linux/initrd.h | 4 ++-- init/initramfs.c | 2 +- 5 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 290e9f9874c9..a564cbc36d18 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -186,7 +186,7 @@ void __init arm_memblock_init(const struct machine_desc *mdesc) /* Register the kernel text, kernel data and initrd with memblock. 
*/ memblock_reserve(__pa(KERNEL_START), KERNEL_END - KERNEL_START); - reserve_initrd_mem(); + reserve_initramfs_mem(); arm_mm_memblock_reserve(); diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c index 075b79b2c1d3..226262f35dc1 100644 --- a/arch/loongarch/kernel/setup.c +++ b/arch/loongarch/kernel/setup.c @@ -602,7 +602,7 @@ void __init setup_arch(char **cmdline_p) pagetable_init(); bootcmdline_init(cmdline_p); parse_early_param(); - reserve_initrd_mem(); + reserve_initramfs_mem(); platform_init(); arch_mem_init(cmdline_p); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 15683ae13fa5..b1c4876dadae 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -295,7 +295,7 @@ static void __init setup_bootmem(void) dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn)); - reserve_initrd_mem(); + reserve_initramfs_mem(); /* * No allocation should be done before reserving the memory as defined diff --git a/include/linux/initrd.h b/include/linux/initrd.h index b2a0128c3438..51c473b6a973 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -10,10 +10,10 @@ extern unsigned long virt_external_initramfs_start, virt_external_initramfs_end; extern void free_initramfs_mem(unsigned long, unsigned long); #ifdef CONFIG_BLK_DEV_INITRD -extern void __init reserve_initrd_mem(void); +extern void __init reserve_initramfs_mem(void); extern void wait_for_initramfs(void); #else -static inline void __init reserve_initrd_mem(void) {} +static inline void __init reserve_initramfs_mem(void) {} static inline void wait_for_initramfs(void) {} #endif diff --git a/init/initramfs.c b/init/initramfs.c index 7a050e54ff1a..a6c11260e62b 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -631,7 +631,7 @@ early_param("initrd", early_initrd); static BIN_ATTR(initrd, 0440, sysfs_bin_attr_simple_read, NULL, 0); -void __init reserve_initrd_mem(void) +void __init reserve_initramfs_mem(void) { phys_addr_t start; unsigned long 
size; -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:52:50 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:52:50 +0300 Subject: [PATCH RESEND 42/62] init: rename <linux/initrd.h> to <linux/initramfs.h> In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035250.3651258-1-safinaskar@gmail.com> This is cleanup after initrd removal Signed-off-by: Askar Safin --- arch/alpha/kernel/core_irongate.c | 2 +- arch/alpha/kernel/setup.c | 2 +- arch/arc/mm/init.c | 2 +- arch/arm/kernel/atags_parse.c | 2 +- arch/arm/kernel/setup.c | 2 +- arch/arm/mm/init.c | 2 +- arch/arm64/kernel/setup.c | 2 +- arch/arm64/mm/init.c | 2 +- arch/csky/kernel/setup.c | 2 +- arch/csky/mm/init.c | 2 +- arch/loongarch/kernel/mem.c | 2 +- arch/loongarch/kernel/setup.c | 2 +- arch/m68k/kernel/setup_mm.c | 2 +- arch/m68k/kernel/setup_no.c | 2 +- arch/m68k/kernel/uboot.c | 2 +- arch/microblaze/kernel/cpu/mb.c | 2 +- arch/microblaze/kernel/setup.c | 2 +- arch/microblaze/mm/init.c | 2 +- arch/mips/ath79/prom.c | 2 +- arch/mips/kernel/setup.c | 2 +- arch/mips/mm/init.c | 2 +- arch/mips/sibyte/swarm/setup.c | 2 +- arch/nios2/kernel/setup.c | 2 +- arch/openrisc/kernel/setup.c | 2 +- arch/parisc/kernel/pdt.c | 2 +- arch/parisc/kernel/setup.c | 2 +- arch/parisc/mm/init.c | 2 +- arch/powerpc/kernel/prom.c | 2 +- arch/powerpc/kernel/prom_init.c | 2 +- arch/powerpc/kernel/setup-common.c | 2 +- arch/powerpc/kernel/setup_32.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- arch/powerpc/mm/init_32.c | 2 +- arch/powerpc/platforms/52xx/lite5200.c | 2 +- arch/powerpc/platforms/83xx/km83xx.c | 2 +- arch/powerpc/platforms/85xx/mpc85xx_mds.c | 2 +- arch/powerpc/platforms/chrp/setup.c | 2 +- arch/powerpc/platforms/embedded6xx/linkstation.c | 2 +- arch/powerpc/platforms/embedded6xx/storcenter.c | 2 +- arch/powerpc/platforms/powermac/setup.c | 2 +- arch/riscv/mm/init.c | 2 +- arch/s390/kernel/setup.c | 2 +- arch/s390/mm/init.c | 2 +- 
arch/sh/kernel/setup.c | 2 +- arch/sparc/kernel/setup_32.c | 2 +- arch/sparc/kernel/setup_64.c | 2 +- arch/sparc/mm/init_32.c | 2 +- arch/sparc/mm/init_64.c | 2 +- arch/um/kernel/initrd.c | 2 +- arch/x86/kernel/cpu/microcode/amd.c | 2 +- arch/x86/kernel/cpu/microcode/intel.c | 2 +- arch/x86/kernel/cpu/microcode/internal.h | 2 +- arch/x86/kernel/devicetree.c | 2 +- arch/x86/kernel/setup.c | 2 +- arch/x86/mm/init.c | 2 +- arch/x86/mm/init_32.c | 2 +- arch/x86/mm/init_64.c | 2 +- drivers/acpi/tables.c | 2 +- drivers/base/firmware_loader/main.c | 2 +- drivers/block/brd.c | 2 +- drivers/firmware/efi/efi.c | 2 +- drivers/of/fdt.c | 2 +- include/linux/{initrd.h => initramfs.h} | 6 +++--- init/do_mounts.c | 2 +- init/initramfs.c | 2 +- init/main.c | 2 +- kernel/sysctl.c | 2 +- kernel/umh.c | 2 +- 68 files changed, 70 insertions(+), 70 deletions(-) rename include/linux/{initrd.h => initramfs.h} (89%) diff --git a/arch/alpha/kernel/core_irongate.c b/arch/alpha/kernel/core_irongate.c index 5519bb8fc6f2..83b799848b39 100644 --- a/arch/alpha/kernel/core_irongate.c +++ b/arch/alpha/kernel/core_irongate.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c index a344e71b2d2a..809651206781 100644 --- a/arch/alpha/kernel/setup.c +++ b/arch/alpha/kernel/setup.c @@ -34,7 +34,7 @@ #include #include #include -#include +#include #include #include #ifdef CONFIG_MAGIC_SYSRQ diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c index 1e098d7fc6af..00aaf1ed389f 100644 --- a/arch/arc/mm/init.c +++ b/arch/arc/mm/init.c @@ -7,7 +7,7 @@ #include #include #ifdef CONFIG_BLK_DEV_INITRD -#include +#include #endif #include #include diff --git a/arch/arm/kernel/atags_parse.c b/arch/arm/kernel/atags_parse.c index 615d9e83c9b5..2b49e0ddfa42 100644 --- a/arch/arm/kernel/atags_parse.c +++ b/arch/arm/kernel/atags_parse.c @@ -15,7 +15,7 @@ */ #include -#include +#include #include #include #include diff --git 
a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 0bfd66c7ada0..876039b24290 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index a564cbc36d18..ae5921db626e 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -13,7 +13,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 77c7926a4df6..bddbb473ad88 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -12,7 +12,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 3414e48c8c82..e50533faaece 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c index 403a977b8c1f..9feca38d4c47 100644 --- a/arch/csky/kernel/setup.c +++ b/arch/csky/kernel/setup.c @@ -3,7 +3,7 @@ #include #include -#include +#include #include #include #include diff --git a/arch/csky/mm/init.c b/arch/csky/mm/init.c index 573da66b2543..f2d1004fc6ae 100644 --- a/arch/csky/mm/init.c +++ b/arch/csky/mm/init.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c index aed901c57fb4..5ec4d18c9000 100644 --- a/arch/loongarch/kernel/mem.c +++ b/arch/loongarch/kernel/mem.c @@ -3,7 +3,7 @@ * Copyright (C) 2020-2022 Loongson Technology Corporation Limited */ #include -#include +#include #include #include diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c index 226262f35dc1..5d0124cbe94b 100644 --- a/arch/loongarch/kernel/setup.c +++ b/arch/loongarch/kernel/setup.c @@ -17,7 +17,7 @@ #include #include 
#include -#include +#include #include #include #include diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c index 80f0544c1041..b9c9b2e3a150 100644 --- a/arch/m68k/kernel/setup_mm.c +++ b/arch/m68k/kernel/setup_mm.c @@ -25,7 +25,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/m68k/kernel/setup_no.c b/arch/m68k/kernel/setup_no.c index 4d98e0063725..6d3d5a299383 100644 --- a/arch/m68k/kernel/setup_no.c +++ b/arch/m68k/kernel/setup_no.c @@ -29,7 +29,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/m68k/kernel/uboot.c b/arch/m68k/kernel/uboot.c index 5fc831a0794a..416e3f8f879d 100644 --- a/arch/m68k/kernel/uboot.c +++ b/arch/m68k/kernel/uboot.c @@ -18,7 +18,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/microblaze/kernel/cpu/mb.c b/arch/microblaze/kernel/cpu/mb.c index 37cb2898216b..a5d2c564d4e5 100644 --- a/arch/microblaze/kernel/cpu/mb.c +++ b/arch/microblaze/kernel/cpu/mb.c @@ -13,7 +13,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/microblaze/kernel/setup.c b/arch/microblaze/kernel/setup.c index f417333eccae..7f537307b71c 100644 --- a/arch/microblaze/kernel/setup.c +++ b/arch/microblaze/kernel/setup.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c index fabeca49c2c6..f54d71160712 100644 --- a/arch/microblaze/mm/init.c +++ b/arch/microblaze/mm/init.c @@ -12,7 +12,7 @@ #include #include #include /* mem_init */ -#include +#include #include #include #include diff --git a/arch/mips/ath79/prom.c b/arch/mips/ath79/prom.c index 506dcada711b..fcb45fe198a0 100644 --- a/arch/mips/ath79/prom.c +++ b/arch/mips/ath79/prom.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index 
aed454ebd751..47dc7eb99ef7 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c index a673d3d68254..5b109c737547 100644 --- a/arch/mips/mm/init.c +++ b/arch/mips/mm/init.c @@ -30,7 +30,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/mips/sibyte/swarm/setup.c b/arch/mips/sibyte/swarm/setup.c index 38c90b5e8754..ff8b2d8ad7ab 100644 --- a/arch/mips/sibyte/swarm/setup.c +++ b/arch/mips/sibyte/swarm/setup.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/nios2/kernel/setup.c b/arch/nios2/kernel/setup.c index 3cc44fa4931c..d3d60c42df46 100644 --- a/arch/nios2/kernel/setup.c +++ b/arch/nios2/kernel/setup.c @@ -17,7 +17,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c index 337a0381c452..27ae87c09b0e 100644 --- a/arch/openrisc/kernel/setup.c +++ b/arch/openrisc/kernel/setup.c @@ -29,7 +29,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/parisc/kernel/pdt.c b/arch/parisc/kernel/pdt.c index 3715a3b088a7..49982a48c92c 100644 --- a/arch/parisc/kernel/pdt.c +++ b/arch/parisc/kernel/pdt.c @@ -17,7 +17,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c index 41f45fa177d0..1e403c26070d 100644 --- a/arch/parisc/kernel/setup.c +++ b/arch/parisc/kernel/setup.c @@ -13,7 +13,7 @@ */ #include -#include +#include #include #include #include diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c index af7a33c8bd31..5843f4a46e93 100644 --- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -18,7 +18,7 @@ #include #include #include -#include +#include #include #include #include /* for node_online_map */ 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index b7858b0bd697..a2a1896f9e46 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index 827c958677f8..a0ac845eb504 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -24,7 +24,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index eff369cba0e5..53a416bc41ce 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c index 5a1bf501fbe1..21d21b8291ef 100644 --- a/arch/powerpc/kernel/setup_32.c +++ b/arch/powerpc/kernel/setup_32.c @@ -10,7 +10,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 8fd7cbf3bd04..66c2d563c094 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -13,7 +13,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c index 4e71dfe7d026..f434e6dc1921 100644 --- a/arch/powerpc/mm/init_32.c +++ b/arch/powerpc/mm/init_32.c @@ -22,7 +22,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/platforms/52xx/lite5200.c b/arch/powerpc/platforms/52xx/lite5200.c index 0a161d82a3a8..e4222658ec2d 100644 --- a/arch/powerpc/platforms/52xx/lite5200.c +++ b/arch/powerpc/platforms/52xx/lite5200.c @@ -17,7 +17,7 @@ #include #include #include -#include +#include #include #include #include 
diff --git a/arch/powerpc/platforms/83xx/km83xx.c b/arch/powerpc/platforms/83xx/km83xx.c index 2b5d187d9b62..b0426b35f9ed 100644 --- a/arch/powerpc/platforms/83xx/km83xx.c +++ b/arch/powerpc/platforms/83xx/km83xx.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c index c19490cf6376..6b6c11931c1e 100644 --- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c +++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c @@ -24,7 +24,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/platforms/chrp/setup.c b/arch/powerpc/platforms/chrp/setup.c index c1bfa4c3444c..00a6663a0a88 100644 --- a/arch/powerpc/platforms/chrp/setup.c +++ b/arch/powerpc/platforms/chrp/setup.c @@ -30,7 +30,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/powerpc/platforms/embedded6xx/linkstation.c b/arch/powerpc/platforms/embedded6xx/linkstation.c index 4012f206ec63..8e41d0fb0892 100644 --- a/arch/powerpc/platforms/embedded6xx/linkstation.c +++ b/arch/powerpc/platforms/embedded6xx/linkstation.c @@ -11,7 +11,7 @@ */ #include -#include +#include #include #include diff --git a/arch/powerpc/platforms/embedded6xx/storcenter.c b/arch/powerpc/platforms/embedded6xx/storcenter.c index e49880e8dab8..df458828eb22 100644 --- a/arch/powerpc/platforms/embedded6xx/storcenter.c +++ b/arch/powerpc/platforms/embedded6xx/storcenter.c @@ -13,7 +13,7 @@ #include #include -#include +#include #include #include diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c index 4c3b9ed5428d..ab0860868025 100644 --- a/arch/powerpc/platforms/powermac/setup.c +++ b/arch/powerpc/platforms/powermac/setup.c @@ -32,7 +32,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 
b1c4876dadae..479a0861a93e 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -9,7 +9,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 9bdb6f6b893e..7ce009c2599d 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -33,7 +33,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index e4953453d254..e6556f9f2be3 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -27,7 +27,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c index 9ce9dc5b9e56..814866e35120 100644 --- a/arch/sh/kernel/setup.c +++ b/arch/sh/kernel/setup.c @@ -9,7 +9,7 @@ */ #include #include -#include +#include #include #include #include diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index fb46fb3acf54..b3778d78bb78 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c index 79b56613c6d8..02b16827b664 100644 --- a/arch/sparc/kernel/setup_64.c +++ b/arch/sparc/kernel/setup_64.c @@ -28,7 +28,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c index 7b7722ff5232..f04dd1d6f382 100644 --- a/arch/sparc/mm/init_32.c +++ b/arch/sparc/mm/init_32.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index af249a654e79..b0fa82676e6f 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git 
a/arch/um/kernel/initrd.c b/arch/um/kernel/initrd.c index e6113192a6b6..99edfbd78c00 100644 --- a/arch/um/kernel/initrd.c +++ b/arch/um/kernel/initrd.c @@ -5,7 +5,7 @@ #include #include -#include +#include #include #include #include diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 514f63340880..0086e285d60c 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -26,7 +26,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c index 371ca6eac00e..4bebf8b77542 100644 --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/x86/kernel/cpu/microcode/internal.h b/arch/x86/kernel/cpu/microcode/internal.h index 50a9702ae4e2..b4aec58af7e3 100644 --- a/arch/x86/kernel/cpu/microcode/internal.h +++ b/arch/x86/kernel/cpu/microcode/internal.h @@ -3,7 +3,7 @@ #define _X86_MICROCODE_INTERNAL_H #include -#include +#include #include #include diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c index dd8748c45529..3eb6dad99288 100644 --- a/arch/x86/kernel/devicetree.c +++ b/arch/x86/kernel/devicetree.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 167b9ef12ebb..3b88d156ed39 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index c7ca996fb430..b7c45004f999 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -1,5 +1,5 @@ #include -#include +#include #include #include #include diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8a34fff6ab2b..d075d4178d36 
100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -27,7 +27,7 @@ #include #include #include -#include +#include #include #include diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index b9426fce5f3e..34fcb5b8f386 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index 37ad99c10ac4..4ecb6bf897fd 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include #include "internal.h" diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c index 6942c62fa59d..f32de7459e76 100644 --- a/drivers/base/firmware_loader/main.c +++ b/drivers/base/firmware_loader/main.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 05c4325904d2..a15b699d3a09 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -10,7 +10,7 @@ */ #include -#include +#include #include #include #include diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 7cab72da2ea9..1dcaaea1dcfb 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -21,7 +21,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index 127b37f211cb..2e73de8a1bbe 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/include/linux/initrd.h b/include/linux/initramfs.h similarity index 89% rename from include/linux/initrd.h rename to include/linux/initramfs.h index 51c473b6a973..e9f523917a02 100644 --- a/include/linux/initrd.h +++ b/include/linux/initramfs.h @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#ifndef 
__LINUX_INITRD_H -#define __LINUX_INITRD_H +#ifndef __LINUX_INITRAMFS_H +#define __LINUX_INITRAMFS_H /* 1 if it is not an error if virt_external_initramfs_start < memory_start */ extern int initramfs_below_start_ok; @@ -23,4 +23,4 @@ extern unsigned long phys_external_initramfs_size; extern char __builtin_initramfs_start[]; extern unsigned long __builtin_initramfs_size; -#endif /* __LINUX_INITRD_H */ +#endif /* __LINUX_INITRAMFS_H */ diff --git a/init/do_mounts.c b/init/do_mounts.c index 5b55d0035e03..2df33c573d9c 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/init/initramfs.c b/init/initramfs.c index a6c11260e62b..8b648b09247a 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -597,7 +597,7 @@ static int __init initramfs_async_setup(char *str) } __setup("initramfs_async=", initramfs_async_setup); -#include +#include #include unsigned long virt_external_initramfs_start, virt_external_initramfs_end; diff --git a/init/main.c b/init/main.c index f119460bf8e1..5186233c64fd 100644 --- a/init/main.c +++ b/init/main.c @@ -26,7 +26,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/kernel/sysctl.c b/kernel/sysctl.c index cb6196e3fa99..3bf92703332b 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -12,7 +12,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/kernel/umh.c b/kernel/umh.c index b4da45a3a7cf..c58b3e8e9256 100644 --- a/kernel/umh.c +++ b/kernel/umh.c @@ -26,7 +26,7 @@ #include #include #include -#include +#include #include #include -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:53:26 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:53:26 +0300 Subject: [PATCH RESEND 43/62] setsid: inline ksys_setsid into the only caller In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: 
<20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035326.3661003-1-safinaskar@gmail.com> This is a cleanup after the initrd removal. Signed-off-by: Askar Safin --- include/linux/syscalls.h | 1 - kernel/sys.c | 7 +------ 2 files changed, 1 insertion(+), 7 deletions(-) diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 77f45e5d4413..75e9ee03d19b 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1231,7 +1231,6 @@ int ksys_fchown(unsigned int fd, uid_t user, gid_t group); ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count); void ksys_sync(void); int ksys_unshare(unsigned long unshare_flags); -int ksys_setsid(void); int ksys_sync_file_range(int fd, loff_t offset, loff_t nbytes, unsigned int flags); ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count, diff --git a/kernel/sys.c b/kernel/sys.c index 1e28b40053ce..66e1e2dfd585 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -1265,7 +1265,7 @@ static void set_special_pids(struct pid **pids, struct pid *pid) change_pid(pids, curr, PIDTYPE_PGID, pid); } -int ksys_setsid(void) +SYSCALL_DEFINE0(setsid) { struct task_struct *group_leader = current->group_leader; struct pid *sid = task_pid(group_leader); @@ -1300,11 +1300,6 @@ int ksys_setsid(void) return err; } -SYSCALL_DEFINE0(setsid) -{ - return ksys_setsid(); -} - DECLARE_RWSEM(uts_sem); #ifdef COMPAT_UTS_MACHINE -- 2.47.2 From safinaskar at gmail.com Sat Sep 13 20:54:02 2025 From: safinaskar at gmail.com (Askar Safin) Date: Sun, 14 Sep 2025 06:54:02 +0300 Subject: [PATCH RESEND 44/62] doc: kernel-parameters: remove [RAM] from reserve_mem= In-Reply-To: <20250913003842.41944-1-safinaskar@gmail.com> References: <20250913003842.41944-1-safinaskar@gmail.com> Message-ID: <20250914035402.3670906-1-safinaskar@gmail.com> This parameter has nothing to do with the ramdisk. Signed-off-by: Askar Safin --- Documentation/admin-guide/kernel-parameters.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a259f2bdba0f..0805d3ebc75a 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -6277,8 +6277,7 @@ them. If is less than 0x10000, the region is assumed to be I/O ports; otherwise it is memory. - reserve_mem= [RAM] - Format: nn[KMG]::