[RFC][PATCH] netfs, afs, ceph: Use folios

Matthew Wilcox willy at infradead.org
Wed Aug 11 06:54:49 PDT 2021


On Wed, Aug 11, 2021 at 02:07:51PM +0100, David Howells wrote:
> Convert the netfs helper library and the afs filesystem to use folios.
> 
> NOTE: This patch will also need to alter the ceph filesystem, but as that's
> not been done that yet, ceph will fail to build.
> 
> The patch makes two alterations to the mm headers:
> 
>  (1) Fix a bug in readahead_folio() where a NULL return from
>      __readahead_folio() will cause folio_put() to oops.

I'll fold that in.

>  (2) Add folio_change_private() to change the private data on the folio
>      without adjusting the page refcount or changing the flag.  This
>      assumes folio_attach_private() was already called.

Makes sense.

>  (*) Should I be using page_mapping() or page_file_mapping()?

Depends if you can have a swapfile on your filesystem.  I'd like to
get rid of this and only use the directIO path for swap, but that's a
far-distant project.

>  (*) Can page_endio() be split into two separate functions, one for read
>      and one for write?  If seems a waste of time to conditionally switch
>      between two different branches.

So you'd like a folio_end_write() and folio_end_read()?

>  (*) Is there a better way to implement afs_kill_pages() and
>      afs_redirty_pages()?  I was previously using find_get_pages_contig()
>      into a pagevec, but that doesn't look like it'll work with folios, so
>      I'm now calling filemap_get_folio() a lot more - not that it matters
>      so much, as these are failure paths.

I always disliked the _contig variants.  Block filesystems tend to
follow the pattern

	for-each-page-in-range
		if page-is-contig-with-prev
			append-to-bio
		else
			start-new-bio

while network filesystems tend to use the pattern

	for-range
		get-a-batch-of-contig-pages
			submit-an-io-using-these-pages

it'd be nice to follow the same pattern for both.  Would reduce the
amount of duplicated infrastructure.

>      Also, should these be moved into generic code?

I'd have to figure out what they do to answer this question.

>  (*) Can ->page_mkwrite() see which subpage of a folio got hit?

It already does -- you're passed a page, not a folio.  Are you trying
to optimise by only marking part of a folio as dirty?  If so, that's a
bad idea because we're going to want to, eg, map 64KB chunks of a folio
with a single TLB entry on ARM, so you'll only get one notification for
that page.

>  (*) __filemap_get_folio() should be used instead of
>      grab_cache_page_write_begin()?  What should be done if xa_is_value()
>      returns true on the value returned by that?

If you don't pass FGP_ENTRY, it won't return you an xa_is_value() ...




More information about the linux-afs mailing list