syzkaller on risc-v

Colin Ian King colin.king at canonical.com
Mon Jul 6 06:12:05 EDT 2020


FYI, increasing the THREAD_SIZE_ORDER to 2 fixes the gcov stack crashes
I'm seeing on a 5.4 kernel.
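
For reference, a rough sketch of what that change means in bytes. It assumes 4 KiB pages and that the kernel stack size is derived as PAGE_SIZE << THREAD_SIZE_ORDER, as in arch/riscv/include/asm/thread_info.h. The EXAMPLE_PAGE_SIZE and example_thread_size names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages. */
#define EXAMPLE_PAGE_SIZE 4096UL

/*
 * Hypothetical helper mirroring the kernel's
 * THREAD_SIZE = PAGE_SIZE << THREAD_SIZE_ORDER relationship.
 */
static inline unsigned long example_thread_size(unsigned int order)
{
	return EXAMPLE_PAGE_SIZE << order;
}

/* Order 1 gives two pages per kernel stack, order 2 gives four. */
static_assert((EXAMPLE_PAGE_SIZE << 1) == 8192UL, "order 1 is 8 KiB");
static_assert((EXAMPLE_PAGE_SIZE << 2) == 16384UL, "order 2 is 16 KiB");
```

So under those assumptions, going from order 1 to order 2 doubles each kernel stack from 8 KiB to 16 KiB.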

On 30/06/2020 14:57, David Abdurachmanov wrote:
> On Tue, Jun 30, 2020 at 4:38 PM Colin Ian King <colin.king at canonical.com> wrote:
>>
>> I believe I'm also seeing some potential stack smashing issues in the
>> lua engine in ZFS on risc-v. It is taking a while for me to debug, but I
>> don't see the failure on other arches.  Is there a way to bump the stack
>> size up temporarily to test with larger stacks on risc-v?
> 
> Dmitry wrote in the original email that the following solves the issues
> with KCOV enabled:
> 
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> -#define THREAD_SIZE_ORDER      (1)
> +#define THREAD_SIZE_ORDER      (2)
> 
> I see MIPS has:
> 
> [..]
> /* thread information allocation */
> #if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_32BIT)
> #define THREAD_SIZE_ORDER (1)
> #endif
> #if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_64BIT)
> #define THREAD_SIZE_ORDER (2)
> [..]
> 
> david
> 
>>
>> Colin
>>
>> On 30/06/2020 14:26, David Abdurachmanov wrote:
>>> On Tue, Jun 30, 2020 at 4:04 PM Andreas Schwab <schwab at suse.de> wrote:
>>>>
>>>> On Jun 30 2020, Dmitry Vyukov wrote:
>>>>
>>>>> I would assume some stack overflows can happen without KCOV as well.
>>>>
>>>> Yes, I see stack overflows quite a lot, like this:
>>>>
>>>> [62192.908680] Kernel panic - not syncing: corrupted stack end detected inside scheduler
>>>> [62192.915752] CPU: 0 PID: 12347 Comm: ld Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
>>>> [62192.925204] Call Trace:
>>>> [62192.927646] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
>>>> [62192.933030] [<ffffffe000202b76>] show_stack+0x2a/0x34
>>>> [62192.938066] [<ffffffe000557d44>] dump_stack+0x6e/0x88
>>>> [62192.943098] [<ffffffe00020c2d2>] panic+0xe8/0x26a
>>>> [62192.947785] [<ffffffe00085ab9c>] schedule+0x0/0xb2
>>>> [62192.952561] [<ffffffe00085af36>] _cond_resched+0x32/0x44
>>>> [62192.957859] [<ffffffe0002f18ea>] invalidate_mapping_pages+0xe0/0x1ce
>>>> [62192.964193] [<ffffffe000370aa4>] inode_lru_isolate+0x238/0x298
>>>> [62192.970012] [<ffffffe000308098>] __list_lru_walk_one+0x5e/0xf6
>>>> [62192.975826] [<ffffffe000308516>] list_lru_walk_one+0x42/0x98
>>>> [62192.981470] [<ffffffe0003717e8>] prune_icache_sb+0x32/0x72
>>>> [62192.986941] [<ffffffe000358366>] super_cache_scan+0xe4/0x13e
>>>> [62192.992586] [<ffffffe0002f1fac>] do_shrink_slab+0x10e/0x17e
>>>> [62192.998142] [<ffffffe0002f2126>] shrink_slab_memcg+0x10a/0x1de
>>>> [62193.003957] [<ffffffe0002f5314>] shrink_node_memcgs+0x12e/0x1a4
>>>> [62193.009861] [<ffffffe0002f5484>] shrink_node+0xfa/0x43c
>>>> [62193.015067] [<ffffffe0002f583e>] shrink_zones+0x78/0x18c
>>>> [62193.020365] [<ffffffe0002f59f0>] do_try_to_free_pages+0x9e/0x23e
>>>> [62193.026352] [<ffffffe0002f65ac>] try_to_free_pages+0xb2/0xf4
>>>> [62193.031991] [<ffffffe000322952>] __alloc_pages_slowpath.constprop.0+0x2d0/0x6c2
>>>> [62193.039284] [<ffffffe000322e9a>] __alloc_pages_nodemask+0x156/0x1b2
>>>> [62193.045535] [<ffffffe00030c730>] do_anonymous_page+0x58/0x41c
>>>> [62193.051266] [<ffffffe00030f50e>] handle_pte_fault+0x12e/0x156
>>>> [62193.056994] [<ffffffe000310444>] __handle_mm_fault+0xca/0x118
>>>> [62193.062725] [<ffffffe000310532>] handle_mm_fault+0xa0/0x152
>>>> [62193.068278] [<ffffffe0002055ba>] do_page_fault+0xd6/0x370
>>>> [62193.073666] [<ffffffe00020140a>] ret_from_exception+0x0/0xc
>>>> [62193.079222] [<ffffffe0004fc16a>] copy_page_to_iter_iovec+0x4c/0x154
>>>
>>> There was a report from Canonical that enabling gcov causes similar issues.
>>>
>>> linux: riscv: corrupted stack detected inside scheduler
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877954
>>>
>>> Adding Colin to CC. So far we couldn't reproduce this locally, I
>>> guess, because we don't have the right config.
>>>
>>> david
>>>
>>>
>>>>
>>>> or this:
>>>>
>>>> [200460.114397] Kernel panic - not syncing: corrupted stack end detected inside scheduler
>>>> [200460.121553] CPU: 0 PID: 32619 Comm: sh Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
>>>> [200460.131090] Call Trace:
>>>> [200460.133623] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
>>>> [200460.139091] [<ffffffe000202b76>] show_stack+0x2a/0x34
>>>> [200460.144212] [<ffffffe000557d44>] dump_stack+0x6e/0x88
>>>> [200460.149335] [<ffffffe00020c2d2>] panic+0xe8/0x26a
>>>> [200460.154109] [<ffffffe00085ab9c>] schedule+0x0/0xb2
>>>> [200460.158969] [<ffffffe00085af36>] _cond_resched+0x32/0x44
>>>> [200460.164348] [<ffffffe000498572>] aa_sk_perm+0x38/0x138
>>>> [200460.169559] [<ffffffe00048d4b4>] apparmor_socket_sendmsg+0x18/0x20
>>>> [200460.175817] [<ffffffe0004508e0>] security_socket_sendmsg+0x2a/0x42
>>>> [200460.182061] [<ffffffe0006f4c0a>] sock_sendmsg+0x1a/0x40
>>>> [200460.195979] [<ffffffdf817210cc>] xprt_sock_sendmsg+0xb2/0x2b6 [sunrpc]
>>>> [200460.210450] [<ffffffdf81723bde>] xs_tcp_send_request+0xc6/0x206 [sunrpc]
>>>> [200460.224930] [<ffffffdf8171f538>] xprt_request_transmit.constprop.0+0x88/0x218 [sunrpc]
>>>> [200460.240731] [<ffffffdf81720610>] xprt_transmit+0x9a/0x182 [sunrpc]
>>>> [200460.254858] [<ffffffdf8171a584>] call_transmit+0x68/0xb8 [sunrpc]
>>>> [200460.268817] [<ffffffdf81726660>] __rpc_execute+0x84/0x222 [sunrpc]
>>>> [200460.282787] [<ffffffdf81726cea>] rpc_execute+0xac/0xb8 [sunrpc]
>>>> [200460.296493] [<ffffffdf8171c5ca>] rpc_run_task+0x122/0x178 [sunrpc]
>>>> [200460.314422] [<ffffffdf82e1533a>] nfs4_do_call_sync+0x64/0x84 [nfsv4]
>>>> [200460.332514] [<ffffffdf82e1541c>] _nfs4_proc_getattr+0xc2/0xd4 [nfsv4]
>>>> [200460.350813] [<ffffffdf82e1cafc>] nfs4_proc_getattr+0x48/0x72 [nfsv4]
>>>> [200460.363307] [<ffffffdf8292c1f6>] __nfs_revalidate_inode+0x104/0x2c8 [nfs]
>>>> [200460.376204] [<ffffffdf82926d18>] nfs_access_get_cached+0x104/0x212 [nfs]
>>>> [200460.389112] [<ffffffdf82926f20>] nfs_do_access+0xfa/0x178 [nfs]
>>>> [200460.401176] [<ffffffdf82927070>] nfs_permission+0x8e/0x184 [nfs]
>>>> [200460.406497] [<ffffffe000361936>] inode_permission.part.0+0x78/0x118
>>>> [200460.412838] [<ffffffe0003638ea>] link_path_walk.part.0+0x1bc/0x212
>>>> [200460.419086] [<ffffffe000363c7e>] path_lookupat+0x34/0x172
>>>> [200460.424559] [<ffffffe0003653de>] filename_lookup+0x5c/0xf4
>>>> [200460.430114] [<ffffffe00036551e>] user_path_at_empty+0x3a/0x5e
>>>> [200460.435931] [<ffffffe00035b838>] vfs_statx+0x62/0xbc
>>>> [200460.440966] [<ffffffe00035b92a>] __do_sys_newfstatat+0x24/0x3a
>>>> [200460.446870] [<ffffffe00035bafa>] sys_newfstatat+0x10/0x18
>>>> [200460.452339] [<ffffffe0002013fc>] ret_from_syscall+0x0/0x2
>>>>
>>>> Andreas.
>>>>
>>>> --
>>>> Andreas Schwab, SUSE Labs, schwab at suse.de
>>>> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
>>>> "And now for something completely different."
>>>>
>>>> _______________________________________________
>>>> linux-riscv mailing list
>>>> linux-riscv at lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>



