[PATCH 5/5] nvme: enable logical block size > PAGE_SIZE

Hannes Reinecke hare at suse.de
Mon May 13 06:43:38 PDT 2024


On 5/12/24 04:43, Luis Chamberlain wrote:
> On Sat, May 11, 2024 at 04:51:32PM -0700, Luis Chamberlain wrote:
>> On Sun, May 12, 2024 at 12:30:40AM +0100, Matthew Wilcox wrote:
>>> On Sat, May 11, 2024 at 04:09:48PM -0700, Luis Chamberlain wrote:
>>>> We can't just do this, we need to consider the actual nvme cap (test it,
>>>> and if it crashes and below what the page cache supports, then we have
>>>> to go below), and so make the enablement easier. So we could just move
>>>> this to a helper [0]. Then when the bdev cache patch goes through the
>>>> check for CONFIG_BUFFER_HEAD can be removed, if this goes first.
>>>>
>>>> We crash if we go above 1 MiB today, we should be able to go up to 2
>>>> MiB but that requires some review to see what stupid thing is getting
>>>> in the way.
>>>>
>>>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=20240408-lbs-scsi-kludge&id=1f7f4dce548cc11872e977939a872b107c68ad53
>>>
>>> This is overengineered garbage.  What's the crash?
>>
>> I had only tested it with iomap, I had not tested it with buffer-heads,
>> and so it would require re-testing. It's Saturday 5pm, I should be doing
>> something else other than being on my computer.
> 
> It crashes because we forgot two things in this series, so the
> change below lets us *at least* boot up to the 64k LBA format on
> NVMe.
> 
> One can reproduce this with kdevops with:
> 
> make defconfig-lbs-xfs-bdev-nvme
> make bringup
> make linux
> 
> I've added another defconfig which bumps the LBA format up to 512 KiB to
> see if bootup blows up:
> 
> make lbs-xfs-bdev-large-nvme
> make bringup
> make linux
> 
> That at least booted. Note that the above defconfigs use this thread's
> message ID, so they apply this series on top of the min order branch.
> Only the patch below is needed in addition.
> 
> I'll try next going above 512 KiB.
>   
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 4f73d23c2c46..fa88e300a946 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2360,8 +2360,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>   	if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
>   		limit = inode->i_sb->s_maxbytes;
>   
> -	VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
> -
>   	head = folio_create_buffers(folio, inode, 0);
>   	blocksize = head->b_size;
>   
> diff --git a/fs/mpage.c b/fs/mpage.c
> index e3732686e65f..e124c924b2e7 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -178,7 +178,6 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
>   	gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
>   
>   	/* MAX_BUF_PER_PAGE, for example */
> -	VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
>   
>   	if (args->is_readahead) {
>   		opf |= REQ_RAHEAD;
> 
Thanks. Will be including that in the next round.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



