[syzbot] BUG: scheduling while atomic in simple_recursive_removal

Bruno Goncalves bgoncalv at redhat.com
Tue Apr 5 05:07:43 PDT 2022


On Wed, Mar 30, 2022 at 7:39 PM syzbot
<syzbot+2778a29e60b4982065a0 at syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    d888c83fcec7 fs: fix fd table size alignment properly
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=103e7b53700000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=cadd7063134e07bc
> dashboard link: https://syzkaller.appspot.com/bug?extid=2778a29e60b4982065a0
> compiler:       aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> userspace arch: arm64
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2778a29e60b4982065a0 at syzkaller.appspotmail.com
>
> BUG: scheduling while atomic: syz-executor.0/2197/0x00000101
> Modules linked in:
> CPU: 0 PID: 2197 Comm: syz-executor.0 Not tainted 5.17.0-syzkaller-13034-gd888c83fcec7 #0
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace.part.0+0xcc/0xe0 arch/arm64/kernel/stacktrace.c:184
>  dump_backtrace arch/arm64/kernel/stacktrace.c:190 [inline]
>  show_stack+0x18/0x6c arch/arm64/kernel/stacktrace.c:191
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0x68/0x84 lib/dump_stack.c:106
>  dump_stack+0x18/0x34 lib/dump_stack.c:113
>  __schedule_bug+0x60/0x80 kernel/sched/core.c:5617
>  schedule_debug kernel/sched/core.c:5644 [inline]
>  __schedule+0x74c/0x7f0 kernel/sched/core.c:6273
>  schedule+0x54/0xd0 kernel/sched/core.c:6454
>  rwsem_down_write_slowpath+0x29c/0x5a0 kernel/locking/rwsem.c:1142
>  __down_write_common kernel/locking/rwsem.c:1259 [inline]
>  __down_write_common kernel/locking/rwsem.c:1256 [inline]
>  __down_write kernel/locking/rwsem.c:1268 [inline]
>  down_write+0x58/0x64 kernel/locking/rwsem.c:1515
>  inode_lock include/linux/fs.h:777 [inline]
>  simple_recursive_removal+0x124/0x270 fs/libfs.c:288
>  debugfs_remove fs/debugfs/inode.c:742 [inline]
>  debugfs_remove+0x5c/0x80 fs/debugfs/inode.c:736
>  blk_release_queue+0x7c/0xf0 block/blk-sysfs.c:784
>  kobject_cleanup lib/kobject.c:705 [inline]
>  kobject_release lib/kobject.c:736 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x98/0x114 lib/kobject.c:753
>  blk_put_queue+0x14/0x20 block/blk-core.c:270
>  blkg_free.part.0+0x54/0x80 block/blk-cgroup.c:86
>  blkg_free block/blk-cgroup.c:78 [inline]
>  __blkg_release+0x44/0x70 block/blk-cgroup.c:102
>  rcu_do_batch kernel/rcu/tree.c:2535 [inline]
>  rcu_core+0x324/0x590 kernel/rcu/tree.c:2786
>  rcu_core_si+0x10/0x20 kernel/rcu/tree.c:2803
>  _stext+0x124/0x2a0
>  do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
>  invoke_softirq kernel/softirq.c:439 [inline]
>  __irq_exit_rcu+0xe4/0x100 kernel/softirq.c:637
>  irq_exit_rcu+0x10/0x1c kernel/softirq.c:649
>  __el1_irq arch/arm64/kernel/entry-common.c:459 [inline]
>  el1_interrupt+0x38/0x64 arch/arm64/kernel/entry-common.c:473
>  el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:478
>  el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:577
>  walk_stackframe arch/arm64/kernel/stacktrace.c:153 [inline]
>  arch_stack_walk+0x68/0x280 arch/arm64/kernel/stacktrace.c:211
>  stack_trace_save+0x50/0x80 kernel/stacktrace.c:122
>  kasan_save_stack+0x2c/0x5c mm/kasan/common.c:38
>  kasan_set_track mm/kasan/common.c:45 [inline]
>  set_alloc_info mm/kasan/common.c:436 [inline]
>  ____kasan_kmalloc mm/kasan/common.c:515 [inline]
>  ____kasan_kmalloc mm/kasan/common.c:474 [inline]
>  __kasan_kmalloc+0xb0/0xd0 mm/kasan/common.c:524
>  kasan_kmalloc include/linux/kasan.h:234 [inline]
>  kmem_cache_alloc_node_trace include/linux/slab.h:488 [inline]
>  kmalloc_node include/linux/slab.h:599 [inline]
>  kzalloc_node include/linux/slab.h:725 [inline]
>  __get_vm_area_node.constprop.0+0xf8/0x224 mm/vmalloc.c:2459
>  __vmalloc_node_range+0xa4/0x5c0 mm/vmalloc.c:3132
>  __vmalloc_node mm/vmalloc.c:3237 [inline]
>  vzalloc+0x74/0x90 mm/vmalloc.c:3307
>  do_ip6t_get_ctl+0x290/0x4d0 net/ipv6/netfilter/ip6_tables.c:817
>  nf_getsockopt+0x60/0x8c net/netfilter/nf_sockopt.c:116
>  ipv6_getsockopt+0x140/0x1c4 net/ipv6/ipv6_sockglue.c:1504
>  tcp_getsockopt+0x20/0x50 net/ipv4/tcp.c:4295
>  sock_common_getsockopt+0x1c/0x30 net/core/sock.c:3478
>  __sys_getsockopt+0xa4/0x214 net/socket.c:2224
>  __do_sys_getsockopt net/socket.c:2239 [inline]
>  __se_sys_getsockopt net/socket.c:2236 [inline]
>  __arm64_sys_getsockopt+0x28/0x3c net/socket.c:2236
>  __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
>  invoke_syscall+0x48/0x114 arch/arm64/kernel/syscall.c:52
>  el0_svc_common.constprop.0+0x44/0xec arch/arm64/kernel/syscall.c:142
>  do_el0_svc+0x6c/0x84 arch/arm64/kernel/syscall.c:181
>  el0_svc+0x44/0xb0 arch/arm64/kernel/entry-common.c:616
>  el0t_64_sync_handler+0x1a4/0x1b0 arch/arm64/kernel/entry-common.c:634
>  el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:581
>

We also seem to hit the same issue. We've hit this during stress-ng
with os stressor:

[ 8221.964778] loop27622: detected capacity change from 0 to 4096
[ 8221.974106] loop66401: detected capacity change from 4096 to 8192
[ 8221.979364] BUG: scheduling while atomic: swapper/18/0/0x00000100
[ 8222.006062] restraintd[2388]: [ 8223.957654] Modules linked in:
binfmt_misc dm_cache_smq dm_cache raid1 dm_raid raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx
dm_thin_pool dm_persistent_data dm_bio_prison nf_conntrack_netlink
xt_addrtype xt_nat xt_mark xt_comment nft_chain_nat xt_MASQUERADE
nf_nat veth bridge stp llc wp512 streebog_generic rmd160
nhpoly1305_neon nhpoly1305 libpoly1305 michael_mic md4 crc32_generic
twofish_generic twofish_common serpent_generic fcrypt des_generic
libdes cast6_generic cast5_generic cast_common camellia_generic
blowfish_generic blowfish_common aegis128 vsock_loopback
vmw_vsock_virtio_transport_common vsock loop tun af_key crypto_user
scsi_transport_iscsi xt_multiport ip_gre ip_tunnel gre overlay
xt_CONNSECMARK xt_SECMARK xt_conntrack nft_compat ah6 ah4 nft_objref
nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink
ip_tables jfs sctp ip6_udp_tunnel udp_tunnel ipmi_watchdog
ipmi_poweroff mlx5_ib ib_uverbs ib_core rfkill sunrpc i2c_smbus joydev
mlx5_core
*** Current Time: Fri Apr 01 19:[ 8223.957775]  psample mlxfw tls
acpi_ipmi ipmi_ssif ipmi_devintf ipmi_msghandler thunderx2_pmu vfat
fat fuse zram xfs nouveau drm_dp_helper ast cec drm_vram_helper
crct10dif_ce drm_kms_helper ghash_ce uas sysimgblt syscopyarea
sysfillrect fb_sys_fops drm_ttm_helper ttm drm usb_storage
i2c_algo_bit gpio_xlp i2c_xlp9xx [last unloaded: dm_persistent_data]
38:37 2022  Localwatchdog at: Fr[ 8224.081739] CPU: 18 PID: 0 Comm:
swapper/18 Tainted: G           OE     5.17.0 #1
i Apr 01 20:28:36 2022
[ 8224.091972] Hardware name: HPE Apollo 70             /C01_APACHE_MB
        , BIOS L50_5.13_1.11 06/18/2019
[ 8224.103779] Call trace:
[ 8224.106213]  dump_backtrace+0xfc/0x14c
[ 8224.109956]  show_stack+0x24/0x68
[ 8224.113259]  dump_stack_lvl+0x64/0x7c
[ 8224.116911]  dump_stack+0x18/0x38
[ 8224.120213]  __schedule_bug+0x68/0x84
[ 8224.123867]  schedule_debug+0x9c/0x134
[ 8224.127606]  __schedule+0x58/0x66c
[ 8224.130998]  schedule+0x70/0xa8
[ 8224.134128]  rwsem_down_write_slowpath+0x2ac/0x744
[ 8224.138908]  down_write+0x50/0x80
[ 8224.142212]  simple_recursive_removal+0xe4/0x1cc
[ 8224.146821]  debugfs_remove+0x64/0x84
[ 8224.150473]  blk_release_queue+0x90/0xf0
[ 8224.154388]  kobject_cleanup+0x74/0x1cc
[ 8224.158215]  kobject_put+0x5c/0x98
[ 8224.161604]  blk_put_queue+0x20/0x2c
[ 8224.165169]  blkg_free+0xc0/0xe8
[ 8224.168386]  __blkg_release+0x98/0x140
[ 8224.172123]  rcu_do_batch+0x1d4/0x4a4
[ 8224.175776]  rcu_core+0x100/0x2fc
[ 8224.179080]  rcu_core_si+0x1c/0x28
[ 8224.182470]  __do_softirq+0xe8/0x344
[ 8224.186034]  irq_exit_rcu+0xa4/0x144
[ 8224.189600]  el1_interrupt+0x94/0xb0
[ 8224.193163]  el1h_64_irq_handler+0x18/0x24
[ 8224.197247]  el1h_64_irq+0x68/0x6c
[ 8224.200637]  arch_local_irq_enable+0xc/0x18
[ 8224.204809]  default_idle_call+0x44/0x100
[ 8224.208807]  do_idle+0xfc/0x278
[ 8224.211938]  cpu_startup_entry+0x30/0x34
[ 8224.215849]  secondary_start_kernel+0x19c/0x1d0
[ 8224.220369]  __secondary_switched+0xa0/0xa4
[ 8224.224586] softirq: huh, entered softirq 9 RCU 00000000a642db6f
with preempt_count 00000100, exited with 00000000?

test logs: https://datawarehouse.cki-project.org/kcidb/tests/3036710
CKI issue tracker: https://datawarehouse.cki-project.org/issue/1096

Thanks,
Bruno Goncalves

>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller at googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>




More information about the linux-arm-kernel mailing list