[PATCH RT v2 0/3] riscv: add PREEMPT_RT support
Conor Dooley
conor.dooley at microchip.com
Thu Nov 2 05:31:15 PDT 2023
On Wed, Nov 01, 2023 at 07:41:59PM +0800, Jisheng Zhang wrote:
> On Tue, Oct 31, 2023 at 05:44:11PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2023-10-31 23:49:29 [+0800], Jisheng Zhang wrote:
> > > Yes there's no third patch. I didn't use the correct number in patch0's
> > > subject.
> >
> > So it looks fine. The warning was CPU-hotplug related
> > https://lore.kernel.org/all/0abd0acf-70a1-d546-a517-19efe60042d1@microchip.com/
> >
> > and it looks to be gone as of commit
> > 5944ce092b97c ("arch_topology: Build cacheinfo from primary CPU")
> >
> > so that is good. Any double checking is welcome of course ;)
> > JUMP_LABELs don't use stop_cpu. Check.
> > The timer is PERCPU. Check.
> > Can't find perf events. But the commit for threaded interrupts claims to
> > have them per-CPU.
> > Has HAVE_POSIX_CPU_TIMERS_TASK_WORK with generic kvm. Check.
> >
> > die() and die_lock. It looks like die_lock is acquired when the system
> > is done and requires medical assistance. This would qualify it for a
> > raw_spinlock_t. Also, should any of the bad things happen in a section
> > with disabled preemption or interrupts then a spinlock_t can not be
> > acquired. Unless die() is always invoked in a preemptible context…
> >
> > The other things are covered by the generic code. I think I didn't miss
> > anything…
> > I am going to have a new release by the end of the week at the latest
> > with these bits. Please look after the die_lock.
>
> Thank you so much, I will check.
>
> Hi @Conor,
>
> If you are able to try this series, can you please apply Evan's misaligned
> access probe patch? Refer to [1] for details. NOTE: this is not
> related to the RT patches because I can reproduce the bug with v6.6.
Hmm, so I gave it a go with and without Evan's patch, but I am seeing
various issues with locking. For example, here's one with Evan's patch:
Starting kernel ...
[ 0.000000] Linux version 6.6.0-rc6-rt10-00003-gd649dd498753 (conor@wendy) (ClangBuiltLinux clang version 16.0.2 (/home/conor/stuff/dev/llvm/clang 18ddebe1a1a9bde349441631365f0472e9693520), ClangBuiltLinux LLD 16.0.2) #1 SMP PREEMPT_RT @666
[ 0.000000] Machine model: Microchip PolarFire-SoC Icicle Kit
[ 0.000000] SBI specification v1.0 detected
[ 0.000000] SBI implementation ID=0x8 Version=0x10002
[ 0.000000] SBI TIME extension detected
[ 0.000000] SBI IPI extension detected
[ 0.000000] SBI RFENCE extension detected
[ 0.000000] SBI SRST extension detected
[ 0.000000] earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8')
[ 0.000000] printk: legacy bootconsole [ns16550a0] enabled
[ 0.000000] printk: debug: skip boot console de-registration.
[ 0.000000] efi: UEFI not found.
[ 0.000000] OF: reserved mem: 0x00000000bfc00000..0x00000000bfffffff (4096 KiB) nomap non-reusable region@BFC00000
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 [mem 0x0000000080000000-0x00000000ffffffff]
[ 0.000000] Normal [mem 0x0000000100000000-0x000000107fffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080000000-0x00000000bfbfffff]
[ 0.000000] node 0: [mem 0x00000000bfc00000-0x00000000bfffffff]
[ 0.000000] node 0: [mem 0x0000001040000000-0x000000107fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000107fffffff]
[ 0.000000] SBI HSM extension detected
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] Falling back to deprecated "riscv,isa"
[ 0.000000] riscv: base ISA extensions acdfim
[ 0.000000] riscv: ELF capabilities acdfim
[ 0.000000] percpu: Embedded 31 pages/cpu s88096 r8192 d30688 u126976
[ 0.000000] Kernel command line: earlycon keep_bootcon riscv_isa_fallback
[ 0.000000] Unknown kernel command line parameters "riscv_isa_fallback", will be passed to user space.
[ 0.000000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 517120
[ 0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[ 0.000000] software IO TLB: area num 4.
[ 0.000000] software IO TLB: mapped [mem 0x00000000bbc00000-0x00000000bfc00000] (64MB)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] fixmap : 0xffffffc6fea00000 - 0xffffffc6ff000000 (6144 kB)
[ 0.000000] pci io : 0xffffffc6ff000000 - 0xffffffc700000000 ( 16 MB)
[ 0.000000] vmemmap : 0xffffffc700000000 - 0xffffffc800000000 (4096 MB)
[ 0.000000] vmalloc : 0xffffffc800000000 - 0xffffffd800000000 ( 64 GB)
[ 0.000000] modules : 0xffffffff02a26000 - 0xffffffff80000000 (2005 MB)
[ 0.000000] lowmem : 0xffffffd800000000 - 0xffffffe800000000 ( 64 GB)
[ 0.000000] kernel : 0xffffffff80000000 - 0xffffffffffffffff (2047 MB)
[ 0.000000] Memory: 1914732K/2097152K available (12175K kernel code, 7264K rwdata, 6144K rodata, 6211K init, 11249K bss, 182420K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] trace event string verifier disabled
[ 0.000000] Running RCU self tests
[ 0.000000] Running RCU synchronous self tests
[ 0.000000] rcu: Preemptible hierarchical RCU implementation.
[ 0.000000] rcu: RCU lockdep checking is enabled.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=4.
[ 0.000000] rcu: RCU priority boosting: priority 1 delay 500 ms.
[ 0.000000] rcu: RCU_SOFTIRQ processing moved to rcuc kthreads.
[ 0.000000] rcu: RCU debug extended QS entry/exit.
[ 0.000000] No expedited grace period (rcu_normal_after_boot).
[ 0.000000] Trampoline variant of Tasks RCU enabled.
[ 0.000000] Tracing variant of Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[ 0.000000] Running RCU synchronous self tests
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
[ 0.000000] riscv-intc: 64 local interrupts mapped
[ 0.000000] plic: interrupt-controller@c000000: mapped 186 interrupts with 4 handlers for 9 contexts.
[ 0.000000] riscv: providing IPIs using SBI IPI extension
[ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
[ 0.000003] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
[ 0.003776] Console: colour dummy device 80x25
[ 0.003879] printk: legacy console [tty0] enabled
[ 0.004090] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.004107] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.004121] ... MAX_LOCK_DEPTH: 48
[ 0.004135] ... MAX_LOCKDEP_KEYS: 8192
[ 0.004148] ... CLASSHASH_SIZE: 4096
[ 0.004161] ... MAX_LOCKDEP_ENTRIES: 32768
[ 0.004175] ... MAX_LOCKDEP_CHAINS: 65536
[ 0.004188] ... CHAINHASH_SIZE: 32768
[ 0.004200] memory used by lock dependency info: 6493 kB
[ 0.004215] memory used for stack traces: 4224 kB
[ 0.004228] per task-struct memory footprint: 1920 bytes
[ 0.005187] Calibrating delay loop (skipped), value calculated using timer frequency.. 2.00 BogoMIPS (lpj=4000)
[ 0.005236] pid_max: default: 32768 minimum: 301
[ 0.009319] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[ 0.009520] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[ 0.030982] Running RCU synchronous self tests
[ 0.031032] Running RCU synchronous self tests
[ 0.036925] CPU node for /cpus/cpu@0 exist but the possible cpu range is :0-3
[ 0.055142] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1.
[ 0.056482] RCU Tasks Trace: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1.
[ 0.058104] Running RCU-tasks wait API self tests
[ 0.064370] riscv: ELF compat mode unsupported
[ 0.064417] ASID allocator disabled (0 bits)
[ 0.071600] Callback from call_rcu_tasks_trace() invoked.
[ 0.072487] rcu: Hierarchical SRCU implementation.
[ 0.072504] rcu: Max phase no-delay instances is 1000.
[ 0.753395] EFI services will not be available.
[ 0.765374] smp: Bringing up secondary CPUs ...
[ 0.826425] smp: Brought up 1 node, 4 CPUs
[ 0.859509] devtmpfs: initialized
[ 1.033554] Running RCU synchronous self tests
[ 1.033791] Running RCU synchronous self tests
[ 1.044502] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 1.044965] futex hash table entries: 1024 (order: 5, 196608 bytes, linear)
[ 1.049859] pinctrl core: initialized pinctrl subsystem
[ 1.076426] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 1.093390] Callback from call_rcu_tasks() invoked.
[ 1.099141] DMA: preallocated 256 KiB GFP_KERNEL pool for atomic allocations
[ 1.099644] DMA: preallocated 256 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[ 1.123261] cpuidle: using governor menu
[ 1.130384] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 1.130418]
[ 1.130414] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[ 1.130431] ================================
[ 1.130440] WARNING: inconsistent lock state
[ 1.130439] preempt_count: 10001, expected: 0
[ 1.130454] RCU nest depth: 1, expected: 1
[ 1.130452] 6.6.0-rc6-rt10-00003-gd649dd498753 #1 Not tainted
[ 1.130468] INFO: lockdep is turned off.
[ 1.130470] --------------------------------
[ 1.130477] irq event stamp: 1026
[ 1.130479] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[ 1.130496] swapper/3/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 1.130487] hardirqs last enabled at (1025): [<ffffffff80bd5308>] do_irq+0x7e/0xa6
[ 1.130525] ffffffff81c42d38 (
[ 1.130533] hardirqs last disabled at (1026): [<ffffffff80bd52a0>] do_irq+0x16/0xa6
[ 1.130550] &zone->lock){?.+.}-{2:2}
[ 1.130562] softirqs last enabled at (0): [<ffffffff8001543c>] copy_process+0x548/0xdec
[ 1.130577] , at: __rmqueue_pcplist+0x12a/0xbc0
[ 1.130611] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 1.130626] {HARDIRQ-ON-W} state was registered at:
[ 1.130638] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.0-rc6-rt10-00003-gd649dd498753 #1
[ 1.130639] __lock_acquire+0x80a/0xd52
[ 1.130668] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[ 1.130684] Call Trace:
[ 1.130681] lock_acquire+0x116/0x2b0
[ 1.130697] [<ffffffff80006fca>] show_stack+0x2c/0x3c
[ 1.130709] rt_spin_lock+0x26/0xdc
[ 1.130747] __rmqueue_pcplist+0x12a/0xbc0
[ 1.130740] [<ffffffff80bd4c38>] dump_stack_lvl+0x5e/0x84
[ 1.130777] get_page_from_freelist+0x37a/0x143e
[ 1.130787] [<ffffffff80bd4c72>] dump_stack+0x14/0x1c
[ 1.130806] __alloc_pages+0xac/0x1be
[ 1.130822] [<ffffffff8005ba84>] __might_resched+0x1b4/0x1c2
[ 1.130835] alloc_slab_page+0x1c/0xc0
[ 1.130863] new_slab+0x94/0x336
[ 1.130861] [<ffffffff80bdfdd2>] rt_spin_lock+0x42/0xdc
[ 1.130886] ___slab_alloc+0x830/0xcda
[ 1.130896] [<ffffffff80267236>] __rmqueue_pcplist+0x12a/0xbc0
[ 1.130909] __kmem_cache_alloc_node+0xc8/0x1c4
[ 1.130934] [<ffffffff80268204>] get_page_from_freelist+0x37a/0x143e
[ 1.130934] kmalloc_trace+0x22/0x48
[ 1.130971] [<ffffffff80267d78>] __alloc_pages+0xac/0x1be
[ 1.130978] alloc_workqueue+0x96/0x6de
[ 1.131005] [<ffffffff8000412a>] check_unaligned_access+0x34/0x336
[ 1.131010] kmem_cache_init_late+0x1c/0x36
[ 1.131038] [<ffffffff80004660>] check_unaligned_access_nonboot_cpu+0x12/0x1a
[ 1.131046] start_kernel+0x204/0x7e6
[ 1.131082] irq event stamp: 822
[ 1.131071] [<ffffffff800fcba0>] __flush_smp_call_function_queue+0x1de/0x790
[ 1.131093] hardirqs last enabled at (821): [<ffffffff80bd75b0>] default_idle_call+0xfa/0x152
[ 1.131111] [<ffffffff800fd3a0>] generic_smp_call_function_single_interrupt+0xe/0x1a
[ 1.131124] hardirqs last disabled at (822): [<ffffffff80bd52a0>] do_irq+0x16/0xa6
[ 1.131146] [<ffffffff80009edc>] handle_IPI+0x54/0x13c
[ 1.131153] softirqs last enabled at (0): [<ffffffff8001543c>] copy_process+0x548/0xdec
[ 1.131190] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 1.131184] [<ffffffff800a65e6>] handle_percpu_devid_irq+0xc8/0x1c0
[ 1.131211]
[ 1.131211] other info that might help us debug this:
[ 1.131220] Possible unsafe locking scenario:
[ 1.131220]
[ 1.131228] CPU0
[ 1.131235] ----
[ 1.131229] [<ffffffff8009fd96>] generic_handle_domain_irq+0x2c/0x40
[ 1.131241] lock(&zone->lock);
[ 1.131266] <Interrupt>
[ 1.131272] lock(
[ 1.131261] [<ffffffff800ae39c>] ipi_mux_process+0xb0/0x102
[ 1.131279] &zone->lock);
[ 1.131294]
[ 1.131294] *** DEADLOCK ***
[ 1.131294]
[ 1.131302] 2 locks held by swapper/3/0:
[ 1.131298] [<ffffffff8000c0ec>] sbi_ipi_handle+0x50/0x7c
[ 1.131320] #0: ffffffe7fefcf958
[ 1.131328] [<ffffffff8009fd96>] generic_handle_domain_irq+0x2c/0x40
[ 1.131341] (&pcp->lock){+.+.}-{2:2}
[ 1.131358] [<ffffffff805ce5dc>] riscv_intc_irq+0x28/0x44
[ 1.131370] , at: get_page_from_freelist+0x338/0x143e
[ 1.131402] #1:
[ 1.131394] [<ffffffff80bd5370>] handle_riscv_irq+0x40/0x64
[ 1.131411] ffffffff81aa6960 (rcu_read_lock
[ 1.131423] [<ffffffff80bd52ea>] do_irq+0x60/0xa6
[ 1.131437] ){....}-{1:2}, at: rcu_lock_acquire+0x0/0x32
[ 1.131478]
[ 1.131478] stack backtrace:
[ 1.131488] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W 6.6.0-rc6-rt10-00003-gd649dd498753 #1
[ 1.131518] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[ 1.131530] Call Trace:
[ 1.131542] [<ffffffff80006fca>] show_stack+0x2c/0x3c
[ 1.131576] [<ffffffff80bd4c38>] dump_stack_lvl+0x5e/0x84
[ 1.131614] [<ffffffff80bd4c72>] dump_stack+0x14/0x1c
[ 1.131648] [<ffffffff8008c66a>] print_usage_bug+0x430/0x4d2
[ 1.131686] [<ffffffff8008bf6e>] mark_lock_irq+0x504/0x61c
[ 1.131720] [<ffffffff8008b64e>] mark_lock+0x12e/0x194
[ 1.131753] [<ffffffff80088d3a>] __lock_acquire+0x5d2/0xd52
[ 1.131786] [<ffffffff800884fa>] lock_acquire+0x116/0x2b0
[ 1.131819] [<ffffffff80bdfdb6>] rt_spin_lock+0x26/0xdc
[ 1.131852] [<ffffffff80267236>] __rmqueue_pcplist+0x12a/0xbc0
[ 1.131887] [<ffffffff80268204>] get_page_from_freelist+0x37a/0x143e
[ 1.131922] [<ffffffff80267d78>] __alloc_pages+0xac/0x1be
[ 1.131955] [<ffffffff8000412a>] check_unaligned_access+0x34/0x336
[ 1.131986] [<ffffffff80004660>] check_unaligned_access_nonboot_cpu+0x12/0x1a
[ 1.132018] [<ffffffff800fcba0>] __flush_smp_call_function_queue+0x1de/0x790
[ 1.132052] [<ffffffff800fd3a0>] generic_smp_call_function_single_interrupt+0xe/0x1a
[ 1.132085] [<ffffffff80009edc>] handle_IPI+0x54/0x13c
[ 1.132119] [<ffffffff800a65e6>] handle_percpu_devid_irq+0xc8/0x1c0
[ 1.132154] [<ffffffff8009fd96>] generic_handle_domain_irq+0x2c/0x40
[ 1.132184] [<ffffffff800ae39c>] ipi_mux_process+0xb0/0x102
[ 1.132216] [<ffffffff8000c0ec>] sbi_ipi_handle+0x50/0x7c
[ 1.132244] [<ffffffff8009fd96>] generic_handle_domain_irq+0x2c/0x40
[ 1.132273] [<ffffffff805ce5dc>] riscv_intc_irq+0x28/0x44
[ 1.132303] [<ffffffff80bd5370>] handle_riscv_irq+0x40/0x64
[ 1.132331] [<ffffffff80bd52ea>] do_irq+0x60/0xa6
[ 1.156457] cpu1: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
[ 1.156464] cpu3: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
[ 1.156465] cpu2: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
[ 1.184777] cpu0: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
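
The two reports above are interleaved because CPU 1 and CPU 3 were printing at the same time. As a reading aid only, here is a minimal sketch that keeps just the frames printed with an address ("[<addr>] func+0xoff/0xsize"). In the interleaved part those appear to belong to the "sleeping function called from invalid context" backtrace, while the bare "func+0xoff/0xsize" lines appear to come from lockdep's "state was registered at" list. The function name addressed_frames() and the SAMPLE excerpt are illustrative and not part of the series.

```python
import re

# Matches a dmesg line whose message is a stack frame printed with an address,
# e.g. "[ 1.130697] [<ffffffff80006fca>] show_stack+0x2c/0x3c".
# Lines where the address appears later in the message (such as the
# "hardirqs last enabled at ..." stamps) are deliberately not matched.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+"
    r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def addressed_frames(log_text):
    """Return function names of frames printed as '[<addr>] func+0xoff/0xsize'."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


# A few lines copied from the interleaved part of the log above.
SAMPLE = """\
[ 1.130709] rt_spin_lock+0x26/0xdc
[ 1.130861] [<ffffffff80bdfdd2>] rt_spin_lock+0x42/0xdc
[ 1.130886] ___slab_alloc+0x830/0xcda
[ 1.130896] [<ffffffff80267236>] __rmqueue_pcplist+0x12a/0xbc0
[ 1.131005] [<ffffffff8000412a>] check_unaligned_access+0x34/0x336
"""

if __name__ == "__main__":
    for name in addressed_frames(SAMPLE):
        print(name)
```

Run over the full log, this should leave the path from do_irq() through check_unaligned_access() and __alloc_pages() down to rt_spin_lock(), i.e. a page allocation from the IPI handler.
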
The exact tree/.config are here:
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/log/?h=linux-6.6.y-rt
Cheers,
Conor.