kernel tried to execute NX-protected page - exploit attempt? (uid: 0) and kernel BUG observed on 4.13.0-rc7

Yi Zhang yizhan at redhat.com
Wed Aug 30 21:00:59 PDT 2017


Hi,

I hit a kernel BUG on 4.13.0-rc7; the environment, reproduction steps, and console logs are below.
With these steps I have reproduced it once so far; I will keep trying to find a stable reproducer. Let me know if you need more info, thanks.

Environment:
Link layer: mlx5_roce
The two servers are connected through a switch.
Firmware versions:
[   13.447246] mlx5_core 0000:04:00.0: firmware version: 12.18.1000
[   14.347008] mlx5_core 0000:04:00.1: firmware version: 12.18.1000
[   15.080944] mlx5_core 0000:05:00.0: firmware version: 14.18.1000
[   15.924917] mlx5_core 0000:05:00.1: firmware version: 14.18.1000

Both servers have the following Mellanox cards installed:
04:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
04:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
05:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
05:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

Steps I used:
1. Set up the NVMeoF RoCE RDMA target on the target side (a rough setup sketch follows the script below)
2. Connect to the target from the client side
3. Run the script below on the client side:
#!/bin/bash
fio -filename=/dev/nvme0n1 -iodepth=1 -thread -rw=randwrite -ioengine=psync \
    -bssplit=5k/10:9k/10:13k/10:17k/10:21k/10:25k/10:29k/10:33k/10:37k/10:41k/10 \
    -bs_unaligned -runtime=1200 -size=-group_reporting -name=mytest \
    -numjobs=60 &>/dev/null &
num=0
while [ $num -lt 100 ]
do
    echo "-------------------------------$num"
    echo 1 > /sys/block/nvme0n1/device/reset_controller || exit 1
    ((num++))
done
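
For completeness, steps 1 and 2 look roughly like the minimal configfs/nvme-cli sketch below. The subsystem NQN, port id, and address match the target log further down; the null_blk backing device is inferred from "null: module loaded" in that log, so the exact options used here may have differed.

#!/bin/bash
# Target side (step 1): export a null_blk namespace as subsystem "testnqn"
# over RDMA via the nvmet configfs interface.
modprobe null_blk
modprobe nvmet_rdma
cd /sys/kernel/config/nvmet
mkdir subsystems/testnqn
echo 1 > subsystems/testnqn/attr_allow_any_host
mkdir subsystems/testnqn/namespaces/1
# /dev/nullb0 is an assumption based on "null: module loaded" in the target log
echo -n /dev/nullb0 > subsystems/testnqn/namespaces/1/device_path
echo 1 > subsystems/testnqn/namespaces/1/enable
mkdir ports/2
echo rdma > ports/2/addr_trtype
echo ipv4 > ports/2/addr_adrfam
echo 172.31.40.92 > ports/2/addr_traddr
echo 4420 > ports/2/addr_trsvcid
# Linking the subsystem into the port enables it
# ("nvmet_rdma: enabling port 2 (172.31.40.92:4420)" in the target log)
ln -s /sys/kernel/config/nvmet/subsystems/testnqn ports/2/subsystems/testnqn

# Client side (step 2): discover and connect with nvme-cli.
nvme discover -t rdma -a 172.31.40.92 -s 4420
nvme connect -t rdma -n testnqn -a 172.31.40.92 -s 4420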

Console log:
Client:
[   67.144951] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 172.31.40.92:4420
[   67.398611] nvme nvme0: creating 40 I/O queues.
[   68.560894] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.40.92:4420
[   80.130132] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8005: link becomes ready
[   80.148982] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8005: link is not ready
[   80.158793] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8005: link becomes ready
[   80.167100] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1: link becomes ready
[   80.219463] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8005: link is not ready
[   80.227743] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8003: link is not ready
[   80.236516] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8007: link is not ready
[   80.243940] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1: link is not ready
[   80.252185] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8003: link becomes ready
[   80.268645] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1: link is not ready
[   80.277100] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1: link becomes ready
[   80.293184] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8003: link is not ready
[   80.302954] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8003: link becomes ready
[   80.337096] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8007: link becomes ready
[   80.354517] IPv6: ADDRCONF(NETDEV_UP): mlx5_ib1.8007: link is not ready
[   80.364150] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8007: link becomes ready
[   80.427098] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib1.8005: link becomes ready
rdma-virt-03 login: 

Kernel 4.13.0-rc7 on an x86_64

rdma-virt-03 login: [  134.626661] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[  134.635041] BUG: unable to handle kernel paging request at ffff88207d5cb5b8
[  134.642830] IP: 0xffff88207d5cb5b8
[  134.646633] PGD 207fd64067 
[  134.646633] P4D 207fd64067 
[  134.649755] PUD 10fcd9c063 
[  134.652878] PMD 800000207d4001e3 
[  134.656000] 
[  134.661370] Oops: 0011 [#1] SMP
[  134.664882] Modules linked in: nvme_rdma nvme_fabrics nvme_core sch_mqprio ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge 8021q garp mrp stp llc rpcrdr
[  134.744721]  syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm tg3 mlxfw ahci devlink libahci ptp crc32c_intel libata i2c_core pps_core dm_mirror dm_region_hash dd
[  134.763502] CPU: 25 PID: 2213 Comm: kworker/25:1H Not tainted 4.13.0-rc7 #8
[  134.771291] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.6.2 01/08/2016
[  134.779663] Workqueue: kblockd blk_mq_timeout_work
[  134.785012] task: ffff88207cf0c5c0 task.stack: ffffc90009460000
[  134.791634] RIP: 0010:0xffff88207d5cb5b8
[  134.796022] RSP: 0018:ffffc90009463cb0 EFLAGS: 00010202
[  134.802570] RAX: ffff88207d5cb400 RBX: ffff880f365da440 RCX: ffff88207af00000
[  134.811219] RDX: ffffc90009463cb8 RSI: ffffc90009463cc0 RDI: ffff88207d5cc400
[  134.819863] RBP: ffffc90009463d10 R08: 0000000000000008 R09: 0000000000000000
[  134.828509] R10: 00000000000002ef R11: 00000000000002ee R12: ffff88103eadc000
[  134.837140] R13: ffff88100a920000 R14: ffff88202c8a4000 R15: ffff880f3b295700
[  134.845754] FS:  0000000000000000(0000) GS:ffff88207af00000(0000) knlGS:0000000000000000
[  134.855451] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.862513] CR2: ffff88207d5cb5b8 CR3: 0000002039238000 CR4: 00000000001406e0
[  134.871147] Call Trace:
[  134.874547]  ? nvme_rdma_unmap_data+0x126/0x1c0 [nvme_rdma]
[  134.881427]  nvme_rdma_complete_rq+0x1c/0x30 [nvme_rdma]
[  134.888011]  __blk_mq_complete_request+0x90/0x140
[  134.893931]  blk_mq_rq_timed_out+0x66/0x70
[  134.899178]  blk_mq_check_expired+0x37/0x60
[  134.904528]  bt_iter+0x48/0x50
[  134.908652]  blk_mq_queue_tag_busy_iter+0xdd/0x1f0
[  134.914678]  ? blk_mq_rq_timed_out+0x70/0x70
[  134.920128]  ? blk_mq_rq_timed_out+0x70/0x70
[  134.925557]  blk_mq_timeout_work+0x88/0x180
[  134.930889]  process_one_work+0x149/0x360
[  134.936042]  worker_thread+0x4d/0x3c0
[  134.940791]  kthread+0x109/0x140
[  134.945051]  ? rescuer_thread+0x380/0x380
[  134.950189]  ? kthread_park+0x60/0x60
[  134.954954]  ret_from_fork+0x25/0x30
[  134.959605] Code: 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 a8 b5 5c 7d 20 88 ff ff a8 b5 5c 7d 20 88 ff ff <b8> b5 5c 7d 20 88 ff ff b8  
[  134.982033] RIP: 0xffff88207d5cb5b8 RSP: ffffc90009463cb0
[  134.988749] CR2: ffff88207d5cb5b8
[  134.993152] ---[ end trace 399dfc3e7e0f9bee ]---
[  135.002359] Kernel panic - not syncing: Fatal exception
[  135.008918] Kernel Offset: disabled
[  135.016612] ---[ end Kernel panic - not syncing: Fatal exception
[  135.024025] sched: Unexpected reschedule of offline CPU#0!
[  135.030834] ------------[ cut here ]------------
[  135.036668] WARNING: CPU: 25 PID: 2213 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x3c/0x40
[  135.047921] Modules linked in: nvme_rdma nvme_fabrics nvme_core sch_mqprio ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge 8021q garp mrp stp llc rpcrdr
[  135.132328]  syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx5_core drm tg3 mlxfw ahci devlink libahci ptp crc32c_intel libata i2c_core pps_core dm_mirror dm_region_hash dd
[  135.152485] CPU: 25 PID: 2213 Comm: kworker/25:1H Tainted: G      D         4.13.0-rc7 #8
[  135.162329] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.6.2 01/08/2016
[  135.171409] Workqueue: kblockd blk_mq_timeout_work
[  135.177481] task: ffff88207cf0c5c0 task.stack: ffffc90009460000
[  135.184816] RIP: 0010:native_smp_send_reschedule+0x3c/0x40
[  135.191676] RSP: 0018:ffff88207af03e50 EFLAGS: 00010046
[  135.198242] RAX: 000000000000002e RBX: 0000000000000000 RCX: 0000000000000000
[  135.206958] RDX: 0000000000000000 RSI: ffff88207af0e038 RDI: ffff88207af0e038
[  135.215649] RBP: ffff88207af03e50 R08: 0000000000000000 R09: 00000000000006dd
[  135.224340] R10: 00000000000003ff R11: 0000000000000001 R12: 0000000000000019
[  135.233020] R13: 00000000ffffbfd6 R14: ffff88207cf0c5c0 R15: ffff88207af14368
[  135.241705] FS:  0000000000000000(0000) GS:ffff88207af00000(0000) knlGS:0000000000000000
[  135.251469] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.258601] CR2: ffff88207d5cb5b8 CR3: 0000002039238000 CR4: 00000000001406e0
[  135.267303] Call Trace:
[  135.270752]  <IRQ>
[  135.273721]  trigger_load_balance+0x10e/0x1f0
[  135.279307]  scheduler_tick+0xab/0xe0
[  135.284118]  ? tick_sched_do_timer+0x70/0x70
[  135.289614]  update_process_times+0x47/0x60
[  135.295018]  tick_sched_handle+0x2d/0x60
[  135.300127]  tick_sched_timer+0x39/0x70
[  135.305135]  __hrtimer_run_queues+0xe5/0x230
[  135.310631]  hrtimer_interrupt+0xa8/0x1a0
[  135.315836]  local_apic_timer_interrupt+0x35/0x60
[  135.321827]  smp_apic_timer_interrupt+0x38/0x50
[  135.327651]  apic_timer_interrupt+0x93/0xa0
[  135.333059] RIP: 0010:panic+0x1fd/0x245
[  135.338056] RSP: 0018:ffffc90009463a00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[  135.347225] RAX: 0000000000000034 RBX: 0000000000000000 RCX: 0000000000000006
[  135.355903] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff88207af0e030
[  135.364593] RBP: ffffc90009463a70 R08: 0000000000000000 R09: 00000000000006dc
[  135.373276] R10: 00000000000003ff R11: 0000000000000001 R12: ffffffff81a2e220
[  135.381956] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000046
[  135.390633]  </IRQ>
[  135.393673]  oops_end+0xb8/0xd0
[  135.397873]  no_context+0x19e/0x3f0
[  135.402487]  __bad_area_nosemaphore+0xee/0x1d0
[  135.408137]  bad_area_nosemaphore+0x14/0x20
[  135.413480]  __do_page_fault+0x89/0x4a0
[  135.418407]  do_page_fault+0x30/0x80
[  135.423021]  page_fault+0x28/0x30
[  135.427322] RIP: 0010:0xffff88207d5cb5b8
[  135.432300] RSP: 0018:ffffc90009463cb0 EFLAGS: 00010202
[  135.438707] RAX: ffff88207d5cb400 RBX: ffff880f365da440 RCX: ffff88207af00000
[  135.447229] RDX: ffffc90009463cb8 RSI: ffffc90009463cc0 RDI: ffff88207d5cc400
[  135.455731] RBP: ffffc90009463d10 R08: 0000000000000008 R09: 0000000000000000
[  135.464220] R10: 00000000000002ef R11: 00000000000002ee R12: ffff88103eadc000
[  135.472703] R13: ffff88100a920000 R14: ffff88202c8a4000 R15: ffff880f3b295700
[  135.481185]  ? nvme_rdma_unmap_data+0x126/0x1c0 [nvme_rdma]
[  135.487911]  ? nvme_rdma_complete_rq+0x1c/0x30 [nvme_rdma]
[  135.494535]  ? __blk_mq_complete_request+0x90/0x140
[  135.500481]  ? blk_mq_rq_timed_out+0x66/0x70
[  135.505754]  ? blk_mq_check_expired+0x37/0x60
[  135.511109]  ? bt_iter+0x48/0x50
[  135.515206]  ? blk_mq_queue_tag_busy_iter+0xdd/0x1f0
[  135.521233]  ? blk_mq_rq_timed_out+0x70/0x70
[  135.526489]  ? blk_mq_rq_timed_out+0x70/0x70
[  135.531724]  ? blk_mq_timeout_work+0x88/0x180
[  135.537081]  ? process_one_work+0x149/0x360
[  135.542199]  ? worker_thread+0x4d/0x3c0
[  135.546925]  ? kthread+0x109/0x140
[  135.551163]  ? rescuer_thread+0x380/0x380
[  135.556101]  ? kthread_park+0x60/0x60
[  135.560629]  ? ret_from_fork+0x25/0x30
[  135.565252] Code: dc 00 0f 92 c0 84 c0 74 14 48 8b 05 3f 43 aa 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 e0 50 a3 81 e8 c7 f1 0b 00 <0f> ff 5d c3 0f 1f 44 00 00  
[  135.587290] ---[ end trace 399dfc3e7e0f9bef ]---


Target:
[   96.887568] null: module loaded
[   97.063749] nvmet: adding nsid 1 to subsystem testnqn
[   97.070033] nvmet_rdma: enabling port 2 (172.31.40.92:4420)
[  100.990739] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:NVMf:uuid:00000000-0000-0000-0000-000000000000.
[  101.135413] nvmet_rdma: freeing queue 0
[  101.248275] nvmet: creating controller 1 for subsystem testnqn for NQN nqn.2014-08.org.nvmexpress:NVMf:uuid:00000000-0000-0000-0000-000000000000.
[  102.216999] nvmet: adding queue 1 to ctrl 1.
[  102.221957] nvmet: adding queue 2 to ctrl 1.
[  102.226938] nvmet: adding queue 3 to ctrl 1.
[  102.231925] nvmet: adding queue 4 to ctrl 1.
[  102.236914] nvmet: adding queue 5 to ctrl 1.
[  102.241905] nvmet: adding queue 6 to ctrl 1.
[  102.246852] nvmet: adding queue 7 to ctrl 1.
[  102.251837] nvmet: adding queue 8 to ctrl 1.
[  102.256821] nvmet: adding queue 9 to ctrl 1.
[  102.261798] nvmet: adding queue 10 to ctrl 1.
[  102.266848] nvmet: adding queue 11 to ctrl 1.
[  102.271922] nvmet: adding queue 12 to ctrl 1.
[  102.277009] nvmet: adding queue 13 to ctrl 1.
[  102.282097] nvmet: adding queue 14 to ctrl 1.
[  102.287143] nvmet: adding queue 15 to ctrl 1.
[  102.292225] nvmet: adding queue 16 to ctrl 1.
[  102.297267] nvmet: adding queue 17 to ctrl 1.
[  102.302302] nvmet: adding queue 18 to ctrl 1.
[  102.307337] nvmet: adding queue 19 to ctrl 1.
[  102.312863] nvmet: adding queue 20 to ctrl 1.
[  102.318307] nvmet: adding queue 21 to ctrl 1.
[  102.323746] nvmet: adding queue 22 to ctrl 1.
[  102.329182] nvmet: adding queue 23 to ctrl 1.
[  102.334580] nvmet: adding queue 24 to ctrl 1.
[  102.339968] nvmet: adding queue 25 to ctrl 1.
[  102.345352] nvmet: adding queue 26 to ctrl 1.
[  102.350704] nvmet: adding queue 27 to ctrl 1.
[  102.356085] nvmet: adding queue 28 to ctrl 1.
[  102.361476] nvmet: adding queue 29 to ctrl 1.
[  102.366825] nvmet: adding queue 30 to ctrl 1.
[  102.372163] nvmet: adding queue 31 to ctrl 1.
[  102.377507] nvmet: adding queue 32 to ctrl 1.
[  102.382848] nvmet: adding queue 33 to ctrl 1.
[  102.388188] nvmet: adding queue 34 to ctrl 1.
[  102.393530] nvmet: adding queue 35 to ctrl 1.
[  102.398843] nvmet: adding queue 36 to ctrl 1.
[  102.404181] nvmet: adding queue 37 to ctrl 1.
[  102.409527] nvmet: adding queue 38 to ctrl 1.
[  102.414824] nvmet: adding queue 39 to ctrl 1.
[  102.420114] nvmet: adding queue 40 to ctrl 1.
[  107.163731] nvmet_rdma: freeing queue 1
[  107.168970] nvmet_rdma: freeing queue 2
[  107.174192] nvmet_rdma: freeing queue 3
[  107.179808] nvmet_rdma: freeing queue 4
[  107.185071] nvmet_rdma: freeing queue 5
[  107.190711] nvmet_rdma: freeing queue 6
[  107.196290] nvmet_rdma: freeing queue 7
[  107.201982] nvmet_rdma: freeing queue 8
[  107.208189] nvmet_rdma: freeing queue 9
[  107.214422] nvmet_rdma: freeing queue 10
[  107.220631] nvmet_rdma: freeing queue 11
[  107.226614] nvmet_rdma: freeing queue 12
[  107.232042] nvmet_rdma: freeing queue 13
[  107.238307] nvmet_rdma: freeing queue 14
[  107.245026] nvmet_rdma: freeing queue 15
[  107.251648] nvmet_rdma: freeing queue 16
[  107.257900] nvmet_rdma: freeing queue 17
[  107.264060] nvmet_rdma: freeing queue 18
[  107.270341] nvmet_rdma: freeing queue 19
[  107.276352] nvmet_rdma: freeing queue 20
[  107.282254] nvmet_rdma: freeing queue 21
[  107.288368] nvmet_rdma: freeing queue 22
[  107.293646] nvmet_rdma: freeing queue 23
[  107.299908] nvmet_rdma: freeing queue 24
[  107.322134] nvmet_rdma: freeing queue 25
[  107.328177] nvmet_rdma: freeing queue 26
[  107.334114] nvmet_rdma: freeing queue 27
[  107.340417] nvmet_rdma: freeing queue 28
[  107.346548] nvmet_rdma: freeing queue 29
[  107.352201] nvmet_rdma: freeing queue 30
[  107.358351] nvmet_rdma: freeing queue 31
[  107.365128] nvmet_rdma: freeing queue 32
[  107.371169] nvmet_rdma: freeing queue 33
[  107.377300] nvmet_rdma: freeing queue 34
[  107.383723] nvmet_rdma: freeing queue 35
[  107.390642] nvmet_rdma: freeing queue 36
[  107.397143] nvmet_rdma: freeing queue 37
[  107.402630] nvmet_rdma: freeing queue 38
[  107.409141] nvmet_rdma: freeing queue 39
[  107.414772] nvmet_rdma: freeing queue 40
[  107.441390] nvmet: got io cmd 6 while CC.EN == 0 on qid = 0
[  107.449412] nvmet_rdma: freeing queue 0



Best Regards,
  Yi Zhang
