On 5/23/19 3:37 PM, Ira Weiny wrote:
[...]
> I've dug in further and I see now that release_pages() implements (almost the
> same thing, see below) as put_page().
>
> However, I think we need to be careful here because put_page_testzero() calls
>
> 	page_ref_dec_and_test(page);
>
> ... and after your changes it will need to call ...
>
> 	page_ref_sub_return(page, GUP_PIN_COUNTING_BIAS);
>
> ... on a GUP page:
>
> So how do you propose calling release_pages() from within put_user_pages()?  Or
> were you thinking this would be temporary?

I was thinking of it as a temporary measure, only up until, but not including the
point where put_user_pages() becomes active. That is, the point when put_user_pages()
starts decrementing GUP_PIN_COUNTING_BIAS, instead of just forwarding to put_page().

(For other readers, that's this patch:

    "mm/gup: debug tracking of get_user_pages() references"

...in https://github.com/johnhubbard/linux/tree/gup_dma_core )

>
> That said, there are 2 differences I see between release_pages() and put_page()
>
> 1) release_pages() will only work for a MEMORY_DEVICE_PUBLIC page and not all
>    devmem pages...
>    I think this is a bug, patch to follow shortly.
>
> 2) release_pages() calls __ClearPageActive() while put_page() does not
>
> I have no idea if the second difference is a bug or not.  But it smells of
> one...
>
> It would be nice to know if the open coding of put_page is really a performance
> benefit or not.  It seems like an attempt to optimize the taking of the page
> data lock.
>
> Does anyone have any information about the performance advantage here?
>
> Given the changes above it seems like it would be a benefit to merge the 2 call
> paths more closely to make sure we do the right thing.
>

Yes, it does. Maybe best to not do the temporary measure, then, while this stuff
gets improved. I'll look at your other patch...

thanks,
-- 
John Hubbard
NVIDIA