[PATCH v2 1/4] net: introduce helper sendpages_ok()

Ofir Gal ofir.gal at volumez.com
Mon Jun 3 05:35:34 PDT 2024



On 31/05/2024 11:51, Sagi Grimberg wrote:
>
>
> On 30/05/2024 17:24, Ofir Gal wrote:
>> Network drivers are using sendpage_ok() to check the first page of an
>> iterator in order to disable MSG_SPLICE_PAGES. The iterator can
>> represent list of contiguous pages.
>>
>> When MSG_SPLICE_PAGES is enabled skb_splice_from_iter() is being used,
>> it requires all pages in the iterator to be sendable. Therefore it needs
>> to check that each page is sendable.
>>
>> The patch introduces a helper sendpages_ok(), it returns true if all the
>> contiguous pages are sendable.
>>
>> Drivers who want to send contiguous pages with MSG_SPLICE_PAGES may use
>> this helper to check whether the page list is OK. If the helper does not
>> return true, the driver should remove MSG_SPLICE_PAGES flag.
>>
>> Signed-off-by: Ofir Gal <ofir.gal at volumez.com>
>> ---
>>   include/linux/net.h | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/include/linux/net.h b/include/linux/net.h
>> index 688320b79fcc..b33bdc3e2031 100644
>> --- a/include/linux/net.h
>> +++ b/include/linux/net.h
>> @@ -322,6 +322,26 @@ static inline bool sendpage_ok(struct page *page)
>>       return !PageSlab(page) && page_count(page) >= 1;
>>   }
>>   +/*
>> + * Check sendpage_ok on contiguous pages.
>> + */
>> +static inline bool sendpages_ok(struct page *page, size_t len, size_t offset)
>> +{
>> +    unsigned int pagecount;
>> +    size_t page_offset;
>> +    int k;
>> +
>> +    page = page + offset / PAGE_SIZE;
>> +    page_offset = offset % PAGE_SIZE;
>
> lets not modify the input page variable.
>
> p = page + offset >> PAGE_SHIFT;
> poffset = offset & PAGE_MASK;
Ok, will be applied in the next patch set.

>> +    pagecount = DIV_ROUND_UP(len + page_offset, PAGE_SIZE);
>> +
>> +    for (k = 0; k < pagecount; k++)
>> +        if (!sendpage_ok(page + k))
>> +            return false;
>
> perhaps instead of doing a costly DIV_ROUND_UP for every network send we can do:
>
>         count = 0;
>         while (count < len) {
>                 if (!sendpage_ok(p))
>                         return false;
>                 page++;
>                 count += PAGE_SIZE;
>         }
>
> And we can lose page_offset.
>
> It can be done in a number of ways, but we should be able to do it
> without the DIV_ROUND_UP...
Ok, will be applied in the next patch set.

> I still don't understand how a page in the middle of a contiguous range ends
> up coming from the slab while others don't.
I haven't investigate the origin of the IO
yet. I suspect the first 2 pages are the superblocks of the raid
(mdp_superblock_1 and bitmap_super_s) and the rest of the IO is the bitmap.

> Ofir, can you please check which condition in sendpage_ok actually fails?
It failed because the page has slab, page count is 1. Sorry for not
clarifying this.

"skbuff: !sendpage_ok - page: 0x54f9f140 (pfn: 120757). is_slab: 1, page_count: 1"
                                                                 ^
The print I used:
pr_info(
    "!sendpage_ok - page: 0x%p (pfn: %lx). is_slab: %u, page_count: %u\n",
    (void *)page,
    page_to_pfn(page),
    page_address(page),
    !!PageSlab(page),
    page_count(page)
);





More information about the Linux-nvme mailing list