[PATCH v6 01/27] mm: Introduce struct folio

Matthew Wilcox willy at infradead.org
Tue Apr 6 13:48:07 BST 2021


On Tue, Apr 06, 2021 at 03:29:18PM +0300, Kirill A. Shutemov wrote:
> On Wed, Mar 31, 2021 at 07:47:02PM +0100, Matthew Wilcox (Oracle) wrote:
> > +/**
> > + * folio_next - Move to the next physical folio.
> > + * @folio: The folio we're currently operating on.
> > + *
> > + * If you have physically contiguous memory which may span more than
> > + * one folio (eg a &struct bio_vec), use this function to move from one
> > + * folio to the next.  Do not use it if the memory is only virtually
> > + * contiguous as the folios are almost certainly not adjacent to each
> > + * other.  This is the folio equivalent to writing ``page++``.
> > + *
> > + * Context: We assume that the folios are refcounted and/or locked at a
> > + * higher level and do not adjust the reference counts.
> > + * Return: The next struct folio.
> > + */
> > +static inline struct folio *folio_next(struct folio *folio)
> > +{
> > +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> > +	return (struct folio *)nth_page(&folio->page, folio_nr_pages(folio));
> > +#else
> > +	return folio + folio_nr_pages(folio);
> > +#endif
> 
> Do we really need the #if here?
> 
> From quick look at nth_page() and memory_model.h, compiler should be able
> to simplify calculation for FLATMEM or SPARSEMEM_VMEMMAP to what you do in
> the #else. No?

No.

0000000000001180 <a>:
struct page *a(struct page *p, unsigned long n)
{
    1180:       e8 00 00 00 00          callq  1185 <a+0x5>
                        1181: R_X86_64_PLT32    __fentry__-0x4
    1185:       55                      push   %rbp
        return nth_page(p, n);
    1186:       48 2b 3d 00 00 00 00    sub    0x0(%rip),%rdi
                        1189: R_X86_64_PC32     vmemmap_base-0x4
    118d:       48 c1 ff 06             sar    $0x6,%rdi
    1191:       48 8d 04 37             lea    (%rdi,%rsi,1),%rax
    1195:       48 89 e5                mov    %rsp,%rbp
        return nth_page(p, n);
    1198:       48 c1 e0 06             shl    $0x6,%rax
    119c:       48 03 05 00 00 00 00    add    0x0(%rip),%rax
                        119f: R_X86_64_PC32     vmemmap_base-0x4
    11a3:       5d                      pop    %rbp
    11a4:       c3                      retq   

vs

00000000000011b0 <b>:

struct page *b(struct page *p, unsigned long n)
{
    11b0:       e8 00 00 00 00          callq  11b5 <b+0x5>
                        11b1: R_X86_64_PLT32    __fentry__-0x4
    11b5:       55                      push   %rbp
        return p + n;
    11b6:       48 c1 e6 06             shl    $0x6,%rsi
    11ba:       48 8d 04 37             lea    (%rdi,%rsi,1),%rax
    11be:       48 89 e5                mov    %rsp,%rbp
    11c1:       5d                      pop    %rbp
    11c2:       c3                      retq   

Now, maybe we should put this optimisation into the definition of nth_page?
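
As a rough illustration of what that could look like, here is a userspace
sketch, not the kernel's actual definition. It assumes, going by the
disassembly above, that nth_page() currently round-trips through the pfn
(subtract the memmap base, add n, add the base back). The names
model_memmap, model_page_to_pfn, model_pfn_to_page, nth_page_roundtrip and
nth_page_sketch are hypothetical stand-ins, and struct page is modelled as
a 64-byte object to match the shift by 6 seen above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page (64 bytes on LP64). */
struct page {
	unsigned long flags;
	char pad[56];
};

/* Hypothetical stand-in for a virtually contiguous memmap (vmemmap). */
static struct page model_memmap[1024];

/* Model of page_to_pfn(): index of the page within the memmap. */
static inline unsigned long model_page_to_pfn(const struct page *p)
{
	return (unsigned long)(p - model_memmap);
}

/* Model of pfn_to_page(): the inverse of the above. */
static inline struct page *model_pfn_to_page(unsigned long pfn)
{
	return model_memmap + pfn;
}

/* Shape assumed for the existing nth_page(): go via the pfn and back. */
static inline struct page *nth_page_roundtrip(struct page *p, unsigned long n)
{
	return model_pfn_to_page(model_page_to_pfn(p) + n);
}

/*
 * Sketch of the suggested optimisation: only the classic SPARSEMEM case
 * (memmap not contiguous across sections) needs the round trip; otherwise
 * plain pointer arithmetic gives the same answer.
 */
static inline struct page *nth_page_sketch(struct page *p, unsigned long n)
{
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
	return nth_page_roundtrip(p, n);
#else
	return p + n;
#endif
}
```

With a definition along these lines, folio_next() would not need its own
#if; it could call nth_page() unconditionally and still get the cheaper
code for FLATMEM and SPARSEMEM_VMEMMAP.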

> > +struct folio {
> > +	/* private: don't document the anon union */
> > +	union {
> > +		struct {
> > +	/* public: */
> > +			unsigned long flags;
> > +			struct list_head lru;
> > +			struct address_space *mapping;
> > +			pgoff_t index;
> > +			unsigned long private;
> > +			atomic_t _mapcount;
> > +			atomic_t _refcount;
> > +#ifdef CONFIG_MEMCG
> > +			unsigned long memcg_data;
> > +#endif
> 
> As Christoph, I'm not a fan of this :/

What would you prefer?


