[PATCH v2 1/4] net: introduce helper sendpages_ok()
Ofir Gal
ofir.gal at volumez.com
Mon Jun 3 05:35:34 PDT 2024
On 31/05/2024 11:51, Sagi Grimberg wrote:
>
>
> On 30/05/2024 17:24, Ofir Gal wrote:
>> Network drivers are using sendpage_ok() to check the first page of an
>> iterator in order to disable MSG_SPLICE_PAGES. The iterator can
>> represent list of contiguous pages.
>>
>> When MSG_SPLICE_PAGES is enabled skb_splice_from_iter() is being used,
>> it requires all pages in the iterator to be sendable. Therefore it needs
>> to check that each page is sendable.
>>
>> The patch introduces a helper sendpages_ok(), it returns true if all the
>> contiguous pages are sendable.
>>
>> Drivers who want to send contiguous pages with MSG_SPLICE_PAGES may use
>> this helper to check whether the page list is OK. If the helper does not
>> return true, the driver should remove MSG_SPLICE_PAGES flag.
>>
>> Signed-off-by: Ofir Gal <ofir.gal at volumez.com>
>> ---
>> include/linux/net.h | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/include/linux/net.h b/include/linux/net.h
>> index 688320b79fcc..b33bdc3e2031 100644
>> --- a/include/linux/net.h
>> +++ b/include/linux/net.h
>> @@ -322,6 +322,26 @@ static inline bool sendpage_ok(struct page *page)
>> return !PageSlab(page) && page_count(page) >= 1;
>> }
>> +/*
>> + * Check sendpage_ok on contiguous pages.
>> + */
>> +static inline bool sendpages_ok(struct page *page, size_t len, size_t offset)
>> +{
>> + unsigned int pagecount;
>> + size_t page_offset;
>> + int k;
>> +
>> + page = page + offset / PAGE_SIZE;
>> + page_offset = offset % PAGE_SIZE;
>
> lets not modify the input page variable.
>
> p = page + offset >> PAGE_SHIFT;
> poffset = offset & PAGE_MASK;
Ok, will be applied in the next patch set.
>> + pagecount = DIV_ROUND_UP(len + page_offset, PAGE_SIZE);
>> +
>> + for (k = 0; k < pagecount; k++)
>> + if (!sendpage_ok(page + k))
>> + return false;
>
> perhaps instead of doing a costly DIV_ROUND_UP for every network send we can do:
>
> count = 0;
> while (count < len) {
> if (!sendpage_ok(p))
> return false;
> page++;
> count += PAGE_SIZE;
> }
>
> And we can lose page_offset.
>
> It can be done in a number of ways, but we should be able to do it
> without the DIV_ROUND_UP...
Ok, will be applied in the next patch set.
> I still don't understand how a page in the middle of a contiguous range ends
> up coming from the slab while others don't.
I haven't investigate the origin of the IO
yet. I suspect the first 2 pages are the superblocks of the raid
(mdp_superblock_1 and bitmap_super_s) and the rest of the IO is the bitmap.
> Ofir, can you please check which condition in sendpage_ok actually fails?
It failed because the page has slab, page count is 1. Sorry for not
clarifying this.
"skbuff: !sendpage_ok - page: 0x54f9f140 (pfn: 120757). is_slab: 1, page_count: 1"
^
The print I used:
pr_info(
"!sendpage_ok - page: 0x%p (pfn: %lx). is_slab: %u, page_count: %u\n",
(void *)page,
page_to_pfn(page),
page_address(page),
!!PageSlab(page),
page_count(page)
);
More information about the Linux-nvme
mailing list