[PATCH 1/5] block: Move integrity kobject to struct gendisk
Mike Snitzer
snitzer at redhat.com
Wed Sep 16 10:26:01 PDT 2015
On Thu, Aug 20 2015 at 4:41pm -0400,
Martin K. Petersen <martin.petersen at oracle.com> wrote:
> The integrity kobject purely exists to support the integrity
> subdirectory in sysfs and doesn't really have anything to do with the
> blk_integrity data structure. Move the kobject to struct gendisk where
> it belongs.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
[I understand both Martin and Keith are on vacation but I'm putting the
time to this now with the understanding that a proper exchange may be
delayed.]
Thinking about this series further: I don't really agree with it.
Maybe in the end the benefit of embedding in gendisk outweighs the
complexity of dynamic allocation... but I need to see it for myself.
Christoph, if you _know_ this to be the right way forward I can accept
it but please elaborate further.
What we currently have (before this patchset) has the very real benefit
of not wasting any memory if the block device doesn't have integrity
(DIF/DIX) support. Especially given how rare bip supporting devices
still are. Is there reason to expect they won't be so rare in the
future? Every NVMe and nvdimm device will have integrity support?
This patch moves thes 64 byte 'struct kobject' out into 'struct
gendisk'. I'm not seeing why (on x86_64 anyway) we'd what to always
allocate an extra 76 bytes (blk_integrity + kobject) for _every_ gendisk
regardless of whether a bip is needed..
76 bytes may not sound like a lot but in the context of DM-mpath that
adds up when you have systems with 1000s of devices with multiple paths
and then a DM mpath device (with its own gendisk) ontop -- 1000 devices
with 4 paths can waste 5000 * 76bytes = ~372K (even 372K isn't much...)
If DM-core could support existing dynamic bip (albeit with more involved
code born out of DM-specific quirks of device stacking and inactive and
active tables) then I have to believe NVMe can too.
As such I'm now going to shift my focus to revisiting what is so hard
for NVMe that it cannot cope with the existing dynamic allocation of
struct blk_integrity.
[If that turns out to be a waste of time (and NVMe/nvdimm/whatever would
be needlessly inconvenienced) then I'll switch back to fixing DM to cope
with this change. But to be clear the current proposed DM changes
_cannot_ go upstream.]
More information about the Linux-nvme
mailing list