[PATCHv5 1/2] block: accumulate memory segment gaps per bio

Matthew Wilcox willy at infradead.org
Mon Nov 10 20:26:21 PST 2025


On Tue, Oct 14, 2025 at 08:04:55AM -0700, Keith Busch wrote:
> The blk-mq dma iterator has an optimization for requests that align to
> the device's iommu merge boundary. This boundary may be larger than the
> device's virtual boundary, but the code had been depending on that queue
> limit to know ahead of time if the request is guaranteed to align to
> that optimization.
> 
> Rather than rely on that queue limit, which many devices may not report,
> save the lowest set bit of any boundary gap between each segment in the
> bio while checking the segments. The request stores the value for
> merging and quickly checking per io if the request can use iova
> optimizations.
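
(Aside, mostly to check my own reading of the approach quoted above: the
accumulation, as I understand it, looks roughly like the standalone toy
below. The names (struct seg, accumulate_gap, fits_iova) and the layout are
mine, not what your patch actually uses; it just sketches "OR the physical
addresses on both sides of every interior segment boundary, then test the
low bits against the iommu granule", which is equivalent to tracking the
lowest set bit of any boundary gap.)

/* Toy model of the per-bio gap accumulation described above; the field and
 * function names here are illustrative only, not the patch's actual API. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct seg {
        uint64_t addr;  /* physical address of the segment */
        uint64_t len;   /* length in bytes */
};

/*
 * OR together the physical addresses on both sides of every interior
 * segment boundary. The lowest set bit of the result is the finest
 * misalignment seen anywhere in the bio, i.e. the value the description
 * above says gets saved per bio and carried into the request on merge.
 */
static uint64_t accumulate_gap(const struct seg *segs, int nr)
{
        uint64_t gap = 0;
        int i;

        for (i = 1; i < nr; i++)
                gap |= (segs[i - 1].addr + segs[i - 1].len) | segs[i].addr;
        return gap;
}

/* The iova fast path is only usable if every interior boundary is aligned
 * to the iommu granule (granule must be a power of two). */
static bool fits_iova(uint64_t gap, uint64_t granule)
{
        return (gap & (granule - 1)) == 0;
}

int main(void)
{
        struct seg aligned[]   = { { 0x10000, 0x1000 }, { 0x20000, 0x1000 } };
        struct seg unaligned[] = { { 0x10000, 0x1000 }, { 0x20200, 0x1000 } };

        printf("aligned fits:   %d\n",
               fits_iova(accumulate_gap(aligned, 2), 0x1000));
        printf("unaligned fits: %d\n",
               fits_iova(accumulate_gap(unaligned, 2), 0x1000));
        return 0;
}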

Hi Keith,

I just hit this bug:

generic/455       run fstests generic/455 at 2025-11-11 04:11:25
XFS (vdb): Mounting V5 Filesystem 54edd3b5-5306-493b-9ecd-f06cd9a8d669
XFS (vdb): Ending clean mount
XFS (dm-4): Mounting V5 Filesystem 3eb16918-3537-4d69-999a-ba226510f6c2
XFS (dm-4): Ending clean mount
XFS (dm-4): Unmounting Filesystem 3eb16918-3537-4d69-999a-ba226510f6c2
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 1614197 Comm: kworker/u64:2 Not tainted 6.18.0-rc4-next-20251110-ktest-00017-g2307dc640a8d #131 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Workqueue: dm-thin do_worker
RIP: 0010:bio_get_last_bvec+0x20/0xe0
Code: 90 90 90 90 90 90 90 90 90 90 55 49 89 f2 48 89 f9 48 89 e5 53 8b 77 2c 8b 47 30 44 8b 4f 28 49 89 f0 49 c1 e0 04 4c 03 47 50 <41> 8b 78 08 41 8b 58 0c 4d 8b 18 29 c7 44 39 cf 4d 89 1a 41 0f 47
RSP: 0018:ffff88814d1ef8d8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8881044b5500 RCX: ffff88811fd23d78
RDX: ffff88811fd235f8 RSI: 0000000000000000 RDI: ffff88811fd23d78
RBP: ffff88814d1ef8e0 R08: 0000000000000000 R09: 0000000000020000
R10: ffff88814d1ef8f0 R11: 0000000000000200 R12: 0000000000000000
R13: ffff88811fd235f8 R14: 0000000000000000 R15: ffff8881044b5500
FS:  0000000000000000(0000) GS:ffff8881f6ac9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000000263a000 CR4: 0000000000750eb0
PKRU: 55555554
Call Trace:
 <TASK>
 bio_seg_gap+0x4c/0x150
 bio_attempt_front_merge+0x19a/0x3a0
 blk_attempt_bio_merge.part.0+0xb4/0x110
 blk_attempt_plug_merge+0xd6/0xe0
 blk_mq_submit_bio+0x76c/0x9f0
 ? lock_release+0xbb/0x260
 __submit_bio+0xa5/0x380
 submit_bio_noacct_nocheck+0x126/0x380
 ? submit_bio_noacct_nocheck+0x126/0x380
 submit_bio_noacct+0x17f/0x3c0
 ? __cond_resched+0x1e/0x60
 submit_bio+0xd6/0x100
 end_discard+0x3a/0x90
 process_prepared_discard_passdown_pt1+0xff/0x180
 process_discard_cell_passdown+0x19e/0x2a0
 process_discard_bio+0x105/0x1a0
 do_worker+0x824/0xa40
 ? process_one_work+0x1ad/0x530
 process_one_work+0x1ed/0x530
 ? move_linked_works+0x77/0xb0
 worker_thread+0x1cf/0x3d0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x100/0x220
 ? _raw_spin_unlock_irq+0x2b/0x40
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x249/0x280
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
Modules linked in: [last unloaded: crc_t10dif]
CR2: 0000000000000008
---[ end trace 0000000000000000 ]---
RIP: 0010:bio_get_last_bvec+0x20/0xe0
Code: 90 90 90 90 90 90 90 90 90 90 55 49 89 f2 48 89 f9 48 89 e5 53 8b 77 2c 8b 47 30 44 8b 4f 28 49 89 f0 49 c1 e0 04 4c 03 47 50 <41> 8b 78 08 41 8b 58 0c 4d 8b 18 29 c7 44 39 cf 4d 89 1a 41 0f 47
RSP: 0018:ffff88814d1ef8d8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8881044b5500 RCX: ffff88811fd23d78
RDX: ffff88811fd235f8 RSI: 0000000000000000 RDI: ffff88811fd23d78
RBP: ffff88814d1ef8e0 R08: 0000000000000000 R09: 0000000000020000
R10: ffff88814d1ef8f0 R11: 0000000000000200 R12: 0000000000000000
R13: ffff88811fd235f8 R14: 0000000000000000 R15: ffff8881044b5500
FS:  0000000000000000(0000) GS:ffff8881f6ac9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000000263a000 CR4: 0000000000750eb0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception ]---

I'm not saying it's definitely your patch; after all, there are 17 of
my slab patches on top of next-20251110. But when I looked on lore for
'bio_get_last_bvec', this was the only patch since 2021 that mentioned it,
so I thought I'd drop you a note in case you see the bug immediately.
I'm heading to bed and will be out tomorrow, so my opportunities to be
helpful will be limited.


