extend bi_size to unsigned long ?

Coly Li colyli at suse.de
Tue Dec 13 06:53:55 PST 2016


Hi linux-block and linux-nvme lists,

Recently, while working on md-raid0 DISCARD optimization, I found that
the maximum DISCARD bio length that raid0_make_request() receives is
8388608 sectors. This is because of the limit of bi_size, which is an
unsigned int and 32 bits wide.

With a 32-bit bi_size, a DISCARD bio can cover at most UINT_MAX>>9
sectors, see commit a22c4d7e3440 ("block: re-add discard_granularity and
alignment checks"). To format an XFS volume on 4x4TB NVMe SSDs, the
original DISCARD bio has to be split about 4x1024 times. If bi_size were
a 64-bit unsigned long, then in the ideal case the original DISCARD bio
would only need to be split 4 times, that is, one split bio for each
device.

Nowadays this is not a big issue, since the block layer may merge the
split bios (or may not, if it is block-mq and NVMe). But as underlying
devices become larger and larger, a 32-bit bi_size may hurt DISCARD
performance.

I know this is not simple, since it changes a very important KABI. But
it is an interesting question to ask: is there any plan to extend
bi_size from unsigned int to unsigned long ?

Thanks in advance.

Coly Li


