On Wed, Nov 15, 2023 at 1:21 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>
> On 2023/11/14 21:16, Jason Gunthorpe wrote:
> > On Tue, Nov 14, 2023 at 04:21:26AM -0800, Mina Almasry wrote:
> >
> >> Actually because you put the 'strtuct page for devmem' in
> >> skb->bv_frag, the net stack will grab the 'struct page' for devmem
> >> using skb_frag_page() then call things like page_address(), kmap,
> >> get_page, put_page, etc, etc, etc.
> >
> > Yikes, please no. If net has its own struct page look alike it has to
> > stay entirely inside net. A non-mm owned struct page should not be
> > passed into mm calls. It is just way too hacky to be seriously
> > considered :(
>
> Yes, that is something this patchset is trying to do, defining its own
> struct page look alike for page pool to support devmem.
>
> struct page for devmem will not be called into the mm subsystem, so most
> of the mm calls is avoided by calling into the devmem memory provider'
> ops instead of calling mm calls.
>
> As far as I see for now, only page_ref_count(), page_is_pfmemalloc() and
> PageTail() is called for devmem page, which should be easy to ensure that
> those call for devmem page is consistent with the struct page owned by mm.

I'm not sure this is true. These 3 calls are just the calls you're aware
of. In your proposal you're casting mirror pages into page* and
releasing them into the net stack. You need to scrub the entire net
stack for mm calls, i.e. all driver code and all skb_frag_page() call
sites. Off the top of my head, the driver is probably calling
page_address() and illegal_highdma() is calling PageHighMem(). TCP
zerocopy receive is calling vm_insert_pages().

> I am not sure if we can use some kind of compile/runtime checking to ensure
> those kinds of consistency?
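
To make the question concrete, here is a rough userspace sketch of what
compile-time and runtime checks could look like. Everything in it is
hypothetical: struct demo_page, struct demo_iov, DEMO_DEVMEM_TAG and the
demo_* helpers are made-up stand-ins, not the kernel's struct page, not
the patchset's struct page_pool_iov, and not an existing kernel API. It
only illustrates the idea, and it does not remove the need to audit
every call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mm-owned struct (NOT the real struct page). */
struct demo_page {
	unsigned long pp_magic;
	void *vaddr;
	int refcount;
};

/* Hypothetical provider-owned mirror struct (NOT struct page_pool_iov). */
struct demo_iov {
	unsigned long pp_magic;
	void *unused;
	int refcount;
};

/* Compile-time check: a field shared by both structs must sit at the
 * same offset, otherwise the build fails. */
#define DEMO_ASSERT_SAME_OFFSET(field)					\
	static_assert(offsetof(struct demo_page, field) ==		\
		      offsetof(struct demo_iov, field),			\
		      "mirror struct out of sync: " #field)

DEMO_ASSERT_SAME_OFFSET(pp_magic);
DEMO_ASSERT_SAME_OFFSET(refcount);
static_assert(sizeof(struct demo_page) == sizeof(struct demo_iov),
	      "mirror struct size differs");

/* Assumed tag bit marking a non-mm (devmem) entry. */
#define DEMO_DEVMEM_TAG 0x1UL

/* Runtime check: is this entry tagged as devmem? */
static bool demo_is_devmem(const struct demo_page *p)
{
	return (p->pp_magic & DEMO_DEVMEM_TAG) != 0;
}

/* Wrapper that refuses to hand out a CPU address for a tagged entry,
 * instead of forwarding it to an mm-style helper. */
static void *demo_page_address(const struct demo_page *p)
{
	if (demo_is_devmem(p))
		return NULL;
	return p->vaddr;
}
```

The compile-time part only covers layout. The runtime part only helps at
call sites that actually go through such a wrapper, so any site that
calls an mm helper directly would still have to be found by hand.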
>
> >
> >>> I would expect net stack, page pool, driver still see the 'struct page',
> >>> only memory provider see the specific struct for itself, for the above,
> >>> devmem memory provider sees the 'struct page_pool_iov'.
> >>>
> >>> The reason I still expect driver to see the 'struct page' is that driver
> >>> will still need to support normal memory besides devmem.
> >
> > I wouldn't say this approach is unreasonable, but it does have to be
> > done carefully to isolate the mm. Keeping the struct page in the API
> > is going to make this very hard.
>
> I would expect that most of the isolation is done in page pool, as far as
> I can see:
>
> 1. For control part: the driver may need to tell the page pool which memory
>    provider it want to use. Or the administrator specifies
>    which memory provider to use by some netlink-based cmd.
>
> 2. For data part: I am thinking that driver should only call page_pool_alloc(),
>    page_pool_free() and page_pool_get_dma_addr related function.
>
> Of course the driver may need to be aware of that if it can call kmap() or
> page_address() on the page returned from page_pool_alloc(), and maybe tell
> net stack that those pages is not kmap()'able and page_address()'able.
>
> >
> > Jason
> > .
> >

--
Thanks,
Mina