On 31/05/2024 11:51, Sagi Grimberg wrote: > > > On 30/05/2024 17:24, Ofir Gal wrote: >> Network drivers are using sendpage_ok() to check the first page of an >> iterator in order to disable MSG_SPLICE_PAGES. The iterator can >> represent list of contiguous pages. >> >> When MSG_SPLICE_PAGES is enabled skb_splice_from_iter() is being used, >> it requires all pages in the iterator to be sendable. Therefore it needs >> to check that each page is sendable. >> >> The patch introduces a helper sendpages_ok(), it returns true if all the >> contiguous pages are sendable. >> >> Drivers who want to send contiguous pages with MSG_SPLICE_PAGES may use >> this helper to check whether the page list is OK. If the helper does not >> return true, the driver should remove MSG_SPLICE_PAGES flag. >> >> Signed-off-by: Ofir Gal <ofir.gal@xxxxxxxxxxx> >> --- >> include/linux/net.h | 20 ++++++++++++++++++++ >> 1 file changed, 20 insertions(+) >> >> diff --git a/include/linux/net.h b/include/linux/net.h >> index 688320b79fcc..b33bdc3e2031 100644 >> --- a/include/linux/net.h >> +++ b/include/linux/net.h >> @@ -322,6 +322,26 @@ static inline bool sendpage_ok(struct page *page) >> return !PageSlab(page) && page_count(page) >= 1; >> } >> +/* >> + * Check sendpage_ok on contiguous pages. >> + */ >> +static inline bool sendpages_ok(struct page *page, size_t len, size_t offset) >> +{ >> + unsigned int pagecount; >> + size_t page_offset; >> + int k; >> + >> + page = page + offset / PAGE_SIZE; >> + page_offset = offset % PAGE_SIZE; > > lets not modify the input page variable. > > p = page + offset >> PAGE_SHIFT; > poffset = offset & PAGE_MASK; Ok, will be applied in the next patch set. >> + pagecount = DIV_ROUND_UP(len + page_offset, PAGE_SIZE); >> + >> + for (k = 0; k < pagecount; k++) >> + if (!sendpage_ok(page + k)) >> + return false; > > perhaps instead of doing a costly DIV_ROUND_UP for every network send we can do: > > count = 0; > while (count < len) { > if (!sendpage_ok(p)) > return false; > page++; > count += PAGE_SIZE; > } > > And we can lose page_offset. > > It can be done in a number of ways, but we should be able to do it > without the DIV_ROUND_UP... Ok, will be applied in the next patch set. > I still don't understand how a page in the middle of a contiguous range ends > up coming from the slab while others don't. I haven't investigate the origin of the IO yet. I suspect the first 2 pages are the superblocks of the raid (mdp_superblock_1 and bitmap_super_s) and the rest of the IO is the bitmap. > Ofir, can you please check which condition in sendpage_ok actually fails? It failed because the page has slab, page count is 1. Sorry for not clarifying this. "skbuff: !sendpage_ok - page: 0x54f9f140 (pfn: 120757). is_slab: 1, page_count: 1" ^ The print I used: pr_info( "!sendpage_ok - page: 0x%p (pfn: %lx). is_slab: %u, page_count: %u\n", (void *)page, page_to_pfn(page), page_address(page), !!PageSlab(page), page_count(page) );