[PATCH 0/1] nvmet: add basic in-memory backend support
Chaitanya Kulkarni
chaitanyak at nvidia.com
Tue Nov 4 16:09:39 PST 2025
>>
>> I/O processing uses scatter-gather iteration similar to existing
>> backends,
>> with per-page operations that handle alignment and boundary cases.
>>
> Question remains: why?
> We already have at least two other memory-backed devices (brd, null_blk)
> which should do the job just nicely.
> Why do we need to add another one?
>
> Cheers,
>
> Hannes
Thanks for looking into this.
brd and null_blk require going through the block layer (bio/request
allocation), which adds unnecessary overhead for each I/O and forces every
page access through block layer infrastructure. That infrastructure has
already processed the I/O on the host side, which already provides the
block device infra and the block device view; with nvmet-mem-backend the
target does not need to repeat it.
1. Fundamentally wrong abstraction when accessing memory (pages) :-
* NVMe Host -> Fabric -> nvmet -> block layer -> brd/null_blk -> memory
                                  ^
                                  Unnecessary infrastructure to access memory.
* NVMe Host -> Fabric -> nvmet -> memory
May I please know what the advantage of this intermediary level is?
Accessing the pages has no conceptual obligation to go through block layer
infra, yet null_blk and brd force that detour, unlike nvmet-mem-backend.
2. Unnecessary processing overhead (sketched below):
- bio_alloc() + mempool_alloc()
- bio_add_page() for each page
- submit_bio() -> submit_bio_noacct()
- blk_mq_submit_bio()
- Request allocation and Request queue processing
- Tag allocation
- Plug/unplug handling
- Queue lock contention
- I/O scheduler invocation (noop processing)
- bio_for_each_segment() iteration
- bio_endio() + mempool_free()
# Code path: ~25-30 function calls
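To make that cost concrete, here is a rough sketch of what this path looks
like per I/O. It is modeled loosely on the existing nvmet bdev backend; the
function names (sketch_bdev_rw, sketch_bio_done) are mine, error and
return-value handling is elided, and it is an illustration rather than the
actual code:

/*
 * Illustrative only: SGL pages get re-described as bvecs and pushed
 * through submit_bio() just so brd/null_blk can copy them anyway.
 * Assumes <linux/bio.h> and the nvmet private header are available.
 */
static void sketch_bio_done(struct bio *bio)
{
	struct nvmet_req *req = bio->bi_private;

	nvmet_req_complete(req, blk_to_nvme_status(req, bio->bi_status));
	bio_put(bio);
}

static void sketch_bdev_rw(struct nvmet_req *req)
{
	struct scatterlist *sg;
	struct bio *bio;
	int i;

	/* per-I/O bio allocation from the bioset/mempool */
	bio = bio_alloc(req->ns->bdev, req->sg_cnt,
			req->cmd->rw.opcode == nvme_cmd_write ?
				REQ_OP_WRITE : REQ_OP_READ, GFP_KERNEL);
	bio->bi_iter.bi_sector = nvmet_lba_to_sect(req->ns, req->cmd->rw.slba);
	bio->bi_private = req;
	bio->bi_end_io = sketch_bio_done;

	/* re-describe every SGL page as a bio bvec */
	for_each_sg(req->sg, sg, req->sg_cnt, i)
		bio_add_page(bio, sg_page(sg), sg->length, sg->offset);

	/* blk-mq (null_blk) or brd then walks those bvecs to reach memory */
	submit_bio(bio);
}

All of that work exists only to hand the same pages to a memcpy() on the
other side.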
Even for brd, which doesn't require request allocation, you still need to:
- allocate the bio
- convert the nvmet req->sg pages into bio bvecs
- convert those bvecs back into brd's xarray pages on the other side
- and vice versa, for every single I/O.
May I please know what the advantage of this round-trip conversion is, when
the pages can be accessed directly and efficiently with nvmet-mem-backend?
nvmet-mem direct path (sketched below):
- struct sg_mapping_iter : ~100 bytes (stack, reused - zero heap)
- sg_miter_start() on nvmet request sgl
- kmap_local_page() -> memcpy() -> kunmap_local()
# Code path: ~8-10 function calls
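For comparison, a minimal sketch of the direct path, assuming purely for
brevity a flat, pre-allocated backing buffer per namespace; the actual
patch may organize the backing pages differently, and sketch_mem_rw is an
illustrative name, not code from the series:

static void sketch_mem_rw(struct nvmet_req *req, void *backing)
{
	bool is_write = req->cmd->rw.opcode == nvme_cmd_write;
	size_t off = (size_t)nvmet_lba_to_sect(req->ns,
				req->cmd->rw.slba) << SECTOR_SHIFT;
	struct sg_mapping_iter miter;

	/* walk the request SGL; sg_miter maps/unmaps each page for us */
	sg_miter_start(&miter, req->sg, req->sg_cnt,
		       SG_MITER_ATOMIC |
		       (is_write ? SG_MITER_FROM_SG : SG_MITER_TO_SG));

	while (sg_miter_next(&miter)) {
		if (is_write)
			memcpy(backing + off, miter.addr, miter.length);
		else
			memcpy(miter.addr, backing + off, miter.length);
		off += miter.length;
	}
	sg_miter_stop(&miter);

	nvmet_req_complete(req, NVME_SC_SUCCESS);
}

No allocation, no conversion, no submission machinery; just a mapped page
and a memcpy().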
3. Block layer usage model
The block layer provides critical functionality for _real block devices_:
- I/O scheduling (deadline, mq-deadline, etc.)
- Request merging
- Partitioning
- Device queue limits
- Plug/unplug batching
- Multi-queue infrastructure
- Device abstraction
For memory-backed nvmet namespaces, _none_ of these apply:
- No scheduling needed - memory has no seek time to optimize
- No merging needed - no device to batch requests to
- No partitions - nvmet exposes namespaces, not block devices
- No queue limits - memory has no device-specific constraints
- No batching needed - completion is synchronous
- No multi-queue needed - no hardware queues to distribute to
- No abstraction needed - we ARE the storage back-end
With that, the entire list of calls mentioned in #2 goes away as well.
Every bio/request allocation is pure waste for this use case.
-ck