[PATCH v5] um: switch to regset API and depend on XSTATE

Brian Norris briannorris at chromium.org
Fri Dec 13 12:00:30 PST 2024


Hi Benjamin,

On Wed, Oct 23, 2024 at 11:41:20AM +0200, Benjamin Berg wrote:
> From: Benjamin Berg <benjamin.berg at intel.com>
> 
> The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE
> CPU feature should also be sufficiently common to be able to rely on it.
> 
> With this, define our internal FP state to be the hosts XSAVE data. Add
> discovery for the hosts XSAVE size and place the FP registers at the end
> of task_struct so that we can adjust the size at runtime.
> 
> Next we can implement the regset API on top and update the signal
> handling as well as ptrace APIs to use them. Also switch coredump
> creation to use the regset API and finally set HAVE_ARCH_TRACEHOOK.
> 
> This considerably improves the signal frames. Previously they might not
> have contained all the registers (i386) and also did not have the
> sizes and magic values set to the correct values to permit userspace to
> decode the frame.
> 
> As a side effect, this will permit UML to run on hosts with newer CPU
> extensions (such as AMX) that need even more register state.
> 
> Signed-off-by: Benjamin Berg <benjamin.berg at intel.com>

This patch seems to trip up KASAN. Or at least, KUnit tests fail when I
enable CONFIG_KASAN, and 'git bisect' points me here:

$ git bisect run ./tools/testing/kunit/kunit.py run stackinit.test_user --kconfig_add CONFIG_KASAN=y
[...]
3f17fed2149192c7d3b76a45a6a87b4ff22cd586 is the first bad commit
commit 3f17fed2149192c7d3b76a45a6a87b4ff22cd586
Author: Benjamin Berg <benjamin.berg at intel.com>
Date:   Wed Oct 23 11:41:20 2024 +0200

    um: switch to regset API and depend on XSTATE
[...]

If I run at Linus's latest:

  243f750a2df0 Merge tag 'gpio-fixes-for-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

I get a KASAN warning and panic [1]. I tried this fix for fun, but it
doesn't help:
Subject: [PATCH] um: add back support for FXSAVE registers
https://lore.kernel.org/linux-um/20241204074827.1582917-1-benjamin@sipsolutions.net/

I'm not very familiar with this area, but let me know if there's more I
can help with on tracking the issue down. Hopefully, it's as easy as
running these same commands for you to reproduce.

Brian

[1]
$ ./tools/testing/kunit/kunit.py run stackinit.test_user --kconfig_add CONFIG_KASAN=y --raw_output=all
[...]
<3>==================================================================
<3>BUG: KASAN: global-out-of-bounds in arch_dup_task_struct+0x4b/0x70
<3>Read of size 4616 at addr 0000000060b1aec0 by task swapper/0
<3>
<3>CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.13.0-rc2-00194-g6787126c27ef #61
<3>Stack:
<4> 00000000 00000000 ffffff00 60acc428
<4> 60ad2ffc 9f225db0 00000001 6008b7fb
<4> 60b17aa0 6003fbf5 60b1aec0 6004c654
<3>Call Trace:
<3> [<60038c0e>] ? show_stack.cold+0x64/0xf3
<3> [<6008b7fb>] ? dump_stack_lvl+0x8b/0xa7
<3> [<6003fbf5>] ? _printk+0x0/0x103
<3> [<6004c654>] ? print_report+0x145/0x519
<3> [<60090f2b>] ? arch_dup_task_struct+0x4b/0x70
<3> [<6031f854>] ? kasan_report+0x114/0x160
<3> [<60090f2b>] ? arch_dup_task_struct+0x4b/0x70
<3> [<60320830>] ? kasan_check_range+0x0/0x1e0
<3> [<603209a0>] ? kasan_check_range+0x170/0x1e0
<3> [<6032135d>] ? __asan_memcpy+0x2d/0x80
<3> [<60090f2b>] ? arch_dup_task_struct+0x4b/0x70
<3> [<600b9381>] ? copy_process+0x3e1/0x7390
<3> [<600af1a0>] ? block_signals+0x0/0x20
<3> [<603bb46e>] ? vfs_kern_mount.part.0+0x6e/0x140
<3> [<601b48d6>] ? stack_trace_save+0x86/0xa0
<3> [<6063ef2c>] ? stack_depot_save_flags+0x2c/0xa80
<3> [<601b4850>] ? stack_trace_save+0x0/0xa0
<3> [<6031e919>] ? kasan_save_stack+0x49/0x60
<3> [<603bb46e>] ? vfs_kern_mount.part.0+0x6e/0x140
<3> [<6031e919>] ? kasan_save_stack+0x49/0x60
<3> [<600b8fa0>] ? copy_process+0x0/0x7390
<3> [<600c04b3>] ? kernel_clone+0xd3/0x8c0
<3> [<600c03e0>] ? kernel_clone+0x0/0x8c0
<3> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<3> [<60038700>] ? arch_local_save_flags+0x0/0x43
<3> [<600c107d>] ? user_mode_thread+0x9d/0xc0
<3> [<600c0fe0>] ? user_mode_thread+0x0/0xc0
<3> [<60926934>] ? kernel_init+0x0/0x18c
<3> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<3> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<3> [<60038700>] ? arch_local_save_flags+0x0/0x43
<3> [<603bb69d>] ? kern_mount+0x3d/0xb0
<3> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<3> [<60926831>] ? rest_init+0x2d/0x130
<3> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<3> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<3> [<60038700>] ? arch_local_save_flags+0x0/0x43
<3> [<60002679>] ? do_one_initcall+0x0/0x450
<3> [<60005c97>] ? start_kernel_proc+0x0/0x1d
<3> [<60005cb0>] ? start_kernel_proc+0x19/0x1d
<3> [<600904fa>] ? new_thread_handler+0xca/0x130
<3>
<3>The buggy address belongs to the variable:
<3> 0x60b1aec0
<3>
<3>The buggy address belongs to the physical page:
<4>page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xb1a
<4>flags: 0x2000(reserved|zone=0)
<4>raw: 0000000000002000 000000009f225db8 000000009f225db8 0000000000000000
<4>raw: 0000000000000000 0000000000000000 00000001ffffffff
<4>page dumped because: kasan: bad access detected
<3>
<3>Memory state around the buggy address:
<3> 0000000060b1b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<3> 0000000060b1b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<3>>0000000060b1b700: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
<3>                                           ^
<3> 0000000060b1b780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<3> 0000000060b1b800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
<3>==================================================================
<4>Disabling lock debugging due to kernel taint
<4>
<6>Pid: 0, comm: swapper Tainted: G    B              6.13.0-rc2-00194-g6787126c27ef
<6>RIP: 0033:copy_namespaces+0x104/0x2b0
<6>RSP: 0000000060b17b70  EFLAGS: 00010246
<6>RAX: 0000000000000001 RBX: 00000000610a8000 RCX: 0000000060133d7f
<6>RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000000
<6>RBP: 0000000000000000 R08: 0000000000000001 R09: 0000100000000000
<6>R10: 0000000000000003 R11: ffffffffffffffff R12: 0000000000800300
<6>R13: 000000006102a000 R14: 00000000610a84d8 R15: 0000000060b31ba0
<0>Kernel panic - not syncing: Segfault with no mm
<4>CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G    B              6.13.0-rc2-00194-g6787126c27ef #61
<4>Tainted: [B]=BAD_PAGE
<4>Stack:
<4> 00000000 60321286 61070380 0c162f92
<4> 00000000 60b1aec0 61070110 610a8000
<4> 610a8498 600bae85 61001400 60b17ed0
<4>Call Trace:
<4> [<60321286>] ? __asan_memset+0x26/0x50
<4> [<600bae85>] ? copy_process+0x1ee5/0x7390
<4> [<600af1a0>] ? block_signals+0x0/0x20
<4> [<6063ef2c>] ? stack_depot_save_flags+0x2c/0xa80
<4> [<601b4850>] ? stack_trace_save+0x0/0xa0
<4> [<6031e919>] ? kasan_save_stack+0x49/0x60
<4> [<603bb46e>] ? vfs_kern_mount.part.0+0x6e/0x140
<4> [<6031e919>] ? kasan_save_stack+0x49/0x60
<4> [<600b8fa0>] ? copy_process+0x0/0x7390
<4> [<600c04b3>] ? kernel_clone+0xd3/0x8c0
<4> [<600c03e0>] ? kernel_clone+0x0/0x8c0
<4> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<4> [<60038700>] ? arch_local_save_flags+0x0/0x43
<4> [<600c107d>] ? user_mode_thread+0x9d/0xc0
<4> [<600c0fe0>] ? user_mode_thread+0x0/0xc0
<4> [<60926934>] ? kernel_init+0x0/0x18c
<4> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<4> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<4> [<60038700>] ? arch_local_save_flags+0x0/0x43
<4> [<603bb69d>] ? kern_mount+0x3d/0xb0
<4> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<4> [<60926831>] ? rest_init+0x2d/0x130
<4> [<6003875e>] ? arch_local_irq_disable+0x0/0xc
<4> [<60038743>] ? arch_irqs_disabled_flags+0x0/0x9
<4> [<60038700>] ? arch_local_save_flags+0x0/0x43
<4> [<60002679>] ? do_one_initcall+0x0/0x450
<4> [<60005c97>] ? start_kernel_proc+0x0/0x1d
<4> [<60005cb0>] ? start_kernel_proc+0x19/0x1d
<4> [<600904fa>] ? new_thread_handler+0xca/0x130
[11:56:56] Elapsed time: 6.794s total, 0.001s configuring, 5.513s building, 1.280s running




More information about the linux-um mailing list