On Wed, Mar 24, 2021 at 02:35:38PM +0100, Thomas Hellström (Intel) wrote:

> > In an ideal world the creation/destruction of page table levels would
> > be dynamic at this point, like THP.
>
> Hmm, but I'm not sure what problem we're trying to solve by changing the
> interface in this way?

We are trying to make a sensible driver API to deal with huge pages.

> Currently if the core vm requests a huge pud, we give it one, and if we
> can't or don't want to (because of dirty-tracking, for example, which is
> always done on 4K page-level) we just return VM_FAULT_FALLBACK, and the
> fault is retried at a lower level.

Well, my thought would be to move the pte related stuff into
vmf_insert_range instead of recursing back via VM_FAULT_FALLBACK.

I don't know if the locking works out, but it feels cleaner that the
driver tells the vmf how big a page it can stuff in, not the vm
telling the driver to stuff in a certain size page which it might not
want to do.

Some devices want to work on an in-between page size like 64k, so they
can't form 2M pages but they can stuff 64k of 4K pages in a batch on
every fault.

That idea doesn't fit naturally if the VM is driving the size.

Jason