On 4/19/23 11:23?AM, Lorenzo Stoakes wrote: > On Wed, Apr 19, 2023 at 10:59:27AM -0600, Jens Axboe wrote: >> On 4/19/23 10:35?AM, Jens Axboe wrote: >>> On 4/18/23 9:49?AM, Lorenzo Stoakes wrote: >>>> We are shortly to remove pin_user_pages(), and instead perform the required >>>> VMA checks ourselves. In most cases there will be a single VMA so this >>>> should caues no undue impact on an already slow path. >>>> >>>> Doing this eliminates the one instance of vmas being used by >>>> pin_user_pages(). >>> >>> First up, please don't just send single patches from a series. It's >>> really annoying when you are trying to get the full picture. Just CC the >>> whole series, so reviews don't have to look it up separately. >>> >>> So when you're doing a respin for what I'll mention below and the issue >>> that David found, please don't just show us patch 4+5 of the series. >> >> I'll reply here too rather than keep some of this conversaion >> out-of-band. >> >> I don't necessarily think that making io buffer registration dumber and >> less efficient by needing a separate vma lookup after the fact is a huge >> deal, as I would imagine most workloads register buffers at setup time >> and then don't change them. But if people do switch sets at runtime, >> it's not necessarily a slow path. That said, I suspect the other bits >> that we do in here, like the GUP, is going to dominate the overhead >> anyway. > > Thanks, and indeed I expect the GUP will dominate. Unless you have a lot of vmas... Point is, it's _probably_ not a problem, but it might and it's making things worse for no real gain as far as I can tell outside of some notion of "cleaning up the code". >> My main question is, why don't we just have a __pin_user_pages or >> something helper that still takes the vmas argument, and drop it from >> pin_user_pages() only? That'd still allow the cleanup of the other users >> that don't care about the vma at all, while retaining the bundled >> functionality for the case/cases that do? That would avoid needing >> explicit vma iteration in io_uring. >> > > The desire here is to completely eliminate vmas as an externally available > parameter from GUP. While we do have a newly introduced helper that returns > a VMA, doing the lookup manually for all other vma cases (which look up a > single page and vma), that is more so a helper that sits outside of GUP. > > Having a separate function that still bundled the vmas would essentially > undermine the purpose of the series altogether which is not just to clean > up some NULL's but rather to eliminate vmas as part of the GUP interface > altogether. > > The reason for this is that by doing so we simplify the GUP interface, > eliminate a whole class of possible future bugs with people holding onto > pointers to vmas which may dangle and lead the way to future changes in GUP > which might be more impactful, such as trying to find means to use the fast > paths in more areas with an eye to gradual eradication of the use of > mmap_lock. > > While we return VMAs, none of this is possible and it also makes the > interface more confusing - without vmas GUP takes flags which define its > behaviour and in most cases returns page objects. The odd rules about what > can and cannot return vmas under what circumstances are not helpful for new > users. > > Another point here is that Jason suggested adding a new > FOLL_ALLOW_BROKEN_FILE_MAPPINGS flag which would, by default, not be > set. This could assert that only shmem/hugetlb file mappings are permitted > which would eliminate the need for you to perform this check at all. > > This leads into the larger point that GUP-writing file mappings is > fundamentally broken due to e.g. GUP not honouring write notify so this > check should at least in theory not be necessary. > > So it may be the case that should such a flag be added this code will > simply be deleted at a future point :) Why don't we do that first then? There's nothing more permanent than a temporary workaround/fix. Once it's in there, motivation to get rid of it for most people is zero because they just never see it. Seems like that'd be a much saner approach rather than the other way around, and make this patchset simpler/cleaner too as it'd only be removing code in all of the callers. -- Jens Axboe