[PATCH 0/1] nvmet: add basic in-memory backend support
Chaitanya Kulkarni
chaitanyak at nvidia.com
Tue Nov 4 16:09:39 PST 2025
>>
>> I/O processing uses scatter-gather iteration similar to existing
>> backends,
>> with per-page operations that handle alignment and boundary cases.
>>
> Question remains: why?
> We already have at least two other memory-backed devices (brd, null_blk)
> which should do the job just nicely.
> Why do we need to add another one?
>
> Cheers,
>
> Hannes
Thanks for looking into this.
brd and null_blk require going through the block layer (bio/request
allocation), which adds unnecessary overhead for each I/O and forces every
page access through block layer infrastructure. That infrastructure has
already processed the I/O on the host side, which already provides the
block device infra and the block device view; with nvmet-mem-backend the
target does not need to repeat it.
1. Fundamentally wrong abstraction when accessing memory (pages) :-
* NVMe Host -> Fabric -> nvmet -> block layer -> brd/null_blk -> memory
                                  ^
                                  Unnecessary infrastructure to access memory.
* NVMe Host -> Fabric -> nvmet -> memory
May I please know what the advantage of this intermediary level is?
Accessing the pages has no conceptual obligation to go through block layer
infra, yet null_blk and brd force that detour, unlike nvmet-mem-backend.
2. Unnecessary processing overhead (sketched below):
- bio_alloc() + mempool_alloc()
- bio_add_page() for each page
- submit_bio() -> submit_bio_noacct()
- blk_mq_submit_bio()
- Request allocation and Request queue processing
- Tag allocation
- Plug/unplug handling
- Queue lock contention
- I/O scheduler invocation (noop processing)
- bio_for_each_segment() iteration
- bio_endio() + mempool_free()
# Code path: ~25-30 function calls
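To make that cost concrete, here is a rough sketch of what this path looks
like per I/O. It is modeled loosely on the existing nvmet bdev backend; the
function names (sketch_bdev_rw, sketch_bio_done) are mine, error and
return-value handling is elided, and it is an illustration rather than the
actual code:

/*
 * Illustrative only: SGL pages get re-described as bvecs and pushed
 * through submit_bio() just so brd/null_blk can copy them anyway.
 * Assumes <linux/bio.h> and the nvmet private header are available.
 */
static void sketch_bio_done(struct bio *bio)
{
	struct nvmet_req *req = bio->bi_private;

	nvmet_req_complete(req, blk_to_nvme_status(req, bio->bi_status));
	bio_put(bio);
}

static void sketch_bdev_rw(struct nvmet_req *req)
{
	struct scatterlist *sg;
	struct bio *bio;
	int i;

	/* per-I/O bio allocation from the bioset/mempool */
	bio = bio_alloc(req->ns->bdev, req->sg_cnt,
			req->cmd->rw.opcode == nvme_cmd_write ?
				REQ_OP_WRITE : REQ_OP_READ, GFP_KERNEL);
	bio->bi_iter.bi_sector = nvmet_lba_to_sect(req->ns, req->cmd->rw.slba);
	bio->bi_private = req;
	bio->bi_end_io = sketch_bio_done;

	/* re-describe every SGL page as a bio bvec */
	for_each_sg(req->sg, sg, req->sg_cnt, i)
		bio_add_page(bio, sg_page(sg), sg->length, sg->offset);

	/* blk-mq (null_blk) or brd then walks those bvecs to reach memory */
	submit_bio(bio);
}

All of that work exists only to hand the same pages to a memcpy() on the
other side.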
Even for brd, which doesn't require request allocation, you still need to:
- allocate the bio
- convert the nvmet req->sg pages into bio bvecs
- convert those bvecs back into brd's xarray pages on the other side
- and vice versa, for every single I/O.
May I please know what the advantage of this round-trip conversion is, when
the pages can be accessed directly and efficiently with nvmet-mem-backend?
nvmet-mem direct path (sketched below):
- struct sg_mapping_iter : ~100 bytes (stack, reused - zero heap)
- sg_miter_start() on nvmet request sgl
- kmap_local_page() -> memcpy() -> kunmap_local()
# Code path: ~8-10 function calls
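For comparison, a minimal sketch of the direct path, assuming purely for
brevity a flat, pre-allocated backing buffer per namespace; the actual
patch may organize the backing pages differently, and sketch_mem_rw is an
illustrative name, not code from the series:

static void sketch_mem_rw(struct nvmet_req *req, void *backing)
{
	bool is_write = req->cmd->rw.opcode == nvme_cmd_write;
	size_t off = (size_t)nvmet_lba_to_sect(req->ns,
				req->cmd->rw.slba) << SECTOR_SHIFT;
	struct sg_mapping_iter miter;

	/* walk the request SGL; sg_miter maps/unmaps each page for us */
	sg_miter_start(&miter, req->sg, req->sg_cnt,
		       SG_MITER_ATOMIC |
		       (is_write ? SG_MITER_FROM_SG : SG_MITER_TO_SG));

	while (sg_miter_next(&miter)) {
		if (is_write)
			memcpy(backing + off, miter.addr, miter.length);
		else
			memcpy(miter.addr, backing + off, miter.length);
		off += miter.length;
	}
	sg_miter_stop(&miter);

	nvmet_req_complete(req, NVME_SC_SUCCESS);
}

No allocation, no conversion, no submission machinery; just a mapped page
and a memcpy().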
3. Block layer usage model
The block layer provides critical functionality for _real block devices_:
- I/O scheduling (deadline, mq-deadline, etc.)
- Request merging
- Partitioning
- Device queue limits
- Plug/unplug batching
- Multi-queue infrastructure
- Device abstraction
For memory-backed nvmet namespaces, _none_ of these apply:
- No scheduling needed - memory has no seek time to optimize
- No merging needed - no device to batch requests to
- No partitions - nvmet exposes namespaces, not block devices
- No queue limits - memory has no device-specific constraints
- No batching needed - completion is synchronous
- No multi-queue needed - no hardware queues to distribute to
- No abstraction needed - we ARE the storage back-end
With that, the entire list of calls mentioned in #2 goes away as well.
Every bio/request allocation is pure waste for this use case.
-ck