On 6/18/24 5:05 PM, Elliot Berman wrote:
In arm64 pKVM and QuIC's Gunyah protected VM model, we want to support grabbing shmem user pages instead of using KVM's guest_memfd. These hypervisors provide a different isolation model than the CoCo implementations from x86. KVM's guest_memfd is focused on providing memory that is more isolated than AVF requires. Some specific examples include the ability to pre-load data onto guest-private pages, dynamically sharing/isolating guest pages without copy, and (future) migrating guest-private pages.

Given those differences, and after discussion in [1] and at PUCK, we want to try to stick with existing shmem and extend GUP to support the isolation needs for arm64 pKVM and Gunyah.

To that end, we introduce the concept of "exclusive GUP pinning", which enforces that only one pin of any kind is allowed when the FOLL_EXCLUSIVE flag is set. This behavior doesn't affect FOLL_GET or any other folio refcount operations that don't go through the FOLL_PIN path.

[1]: https://lore.kernel.org/all/20240319143119.GA2736@willie-the-truck/
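To make the rule above concrete, here is a small userspace model of it. This is only a sketch and is not taken from the patches: the `model_*` names and `MODEL_EXCLUSIVE_BIT` are hypothetical, the counter is not atomic, and the series may encode the exclusive state differently. Only `GUP_PIN_COUNTING_BIAS` (1 << 10) matches the mainline value. The model also leaves out the "re-pin a normal pinned page as exclusive" case that one of the patches adds.

```c
#include <assert.h>
#include <stdbool.h>

#define GUP_PIN_COUNTING_BIAS (1U << 10)  /* same value as mainline GUP */
#define MODEL_EXCLUSIVE_BIT   (1U << 30)  /* hypothetical, for this model only */

struct model_page {
	unsigned int refcount;  /* stands in for page->_refcount, not atomic here */
};

static bool model_is_exclusive(const struct model_page *p)
{
	return p->refcount & MODEL_EXCLUSIVE_BIT;
}

/* Pins are counted in units of the bias, as FOLL_PIN does for small pages. */
static unsigned int model_pin_count(const struct model_page *p)
{
	return (p->refcount & ~MODEL_EXCLUSIVE_BIT) / GUP_PIN_COUNTING_BIAS;
}

/* FOLL_GET-style reference: never blocked by an exclusive pin. */
static void model_get(struct model_page *p)
{
	p->refcount += 1;
}

/* FOLL_PIN-style pin: refused while the page is exclusively pinned. */
static bool model_pin(struct model_page *p)
{
	if (model_is_exclusive(p))
		return false;
	p->refcount += GUP_PIN_COUNTING_BIAS;
	return true;
}

/* FOLL_PIN | FOLL_EXCLUSIVE-style pin: only allowed when no pin exists. */
static bool model_pin_exclusive(struct model_page *p)
{
	if (model_is_exclusive(p) || model_pin_count(p) != 0)
		return false;
	p->refcount += GUP_PIN_COUNTING_BIAS;
	p->refcount |= MODEL_EXCLUSIVE_BIT;
	return true;
}

/* Drop the exclusive pin taken by model_pin_exclusive(). */
static void model_unpin_exclusive(struct model_page *p)
{
	p->refcount &= ~MODEL_EXCLUSIVE_BIT;
	p->refcount -= GUP_PIN_COUNTING_BIAS;
}

/* Walk through the rule: one pin of any kind, plain references unaffected. */
int model_demo(void)
{
	struct model_page page = { .refcount = 1 };  /* one ordinary reference */

	assert(model_pin_exclusive(&page));   /* first pin, taken exclusively */
	assert(!model_pin(&page));            /* a second pin of any kind is refused */
	assert(!model_pin_exclusive(&page));
	model_get(&page);                     /* plain references still work */
	assert(page.refcount == ((2 + GUP_PIN_COUNTING_BIAS) | MODEL_EXCLUSIVE_BIT));

	model_unpin_exclusive(&page);
	assert(model_pin(&page));             /* pinnable again after release */
	assert(!model_pin_exclusive(&page));  /* an existing pin blocks exclusivity */
	return 0;
}
```

Calling `model_demo()` from a test harness exercises each case. Like the real bias scheme, the model cannot tell 1024 plain references apart from one pin, which is part of why the choice of encoding matters.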
Hi!

Looking through this, I feel that some intangible threshold of "this is too much overloading of page->_refcount" has been crossed. This is a very specific feature, and it is using approximately one more bit than is really actually "available"...

If we need a bit in struct page/folio, is this really the only way? Willy is working towards getting us an entirely separate folio->pincount, I suppose that might take too long? Or not?

This feels like force-fitting a very specific feature (KVM/CoCo handling of shmem pages) into a more general mechanism that is running low on bits (gup/pup).

Maybe a good topic for LPC!

thanks,
--
John Hubbard
NVIDIA
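For anyone following along, the "running low on bits" concern can be put in numbers with a rough sketch. It assumes a signed 32-bit refcount with 31 usable bits and the mainline `GUP_PIN_COUNTING_BIAS` of 1 << 10. The helper `model_max_pins()` and its `reserved_high_bits` parameter are made up for illustration and do not describe how the series actually encodes the exclusive state.

```c
#define GUP_PIN_COUNTING_BIAS (1U << 10)  /* same value as mainline GUP */

/*
 * Hypothetical helper: how many bias-sized pins fit in a signed 32-bit
 * refcount when some number of the high bits is reserved for flags.
 * Valid for reserved_high_bits in the range 0..30.
 */
static unsigned int model_max_pins(unsigned int reserved_high_bits)
{
	unsigned int usable_bits = 31 - reserved_high_bits;  /* sign bit excluded */
	unsigned int max_count = (1U << (usable_bits - 1)) * 2 - 1;

	return max_count / GUP_PIN_COUNTING_BIAS;
}
```

Under those assumptions, `model_max_pins(0)` is 2097151 and `model_max_pins(1)` is 1048575, so each bit taken from the top of the counter halves the pin headroom. A separate pincount field would avoid that trade-off.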
Tree with patches at:
https://git.codelinaro.org/clo/linux-kernel/gunyah-linux/-/tree/sent/exclusive-gup-v1

anup@xxxxxxxxxxxxxx, paul.walmsley@xxxxxxxxxx, palmer@xxxxxxxxxxx,
aou@xxxxxxxxxxxxxxxxx, seanjc@xxxxxxxxxx, viro@xxxxxxxxxxxxxxxxxx,
brauner@xxxxxxxxxx, willy@xxxxxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx,
xiaoyao.li@xxxxxxxxx, yilun.xu@xxxxxxxxx, chao.p.peng@xxxxxxxxxxxxxxx,
jarkko@xxxxxxxxxx, amoorthy@xxxxxxxxxx, dmatlack@xxxxxxxxxx,
yu.c.zhang@xxxxxxxxxxxxxxx, isaku.yamahata@xxxxxxxxx, mic@xxxxxxxxxxx,
vbabka@xxxxxxx, vannapurve@xxxxxxxxxx, ackerleytng@xxxxxxxxxx,
mail@xxxxxxxxxxxxxxxxxxxxx, david@xxxxxxxxxx, michael.roth@xxxxxxx,
wei.w.wang@xxxxxxxxx, liam.merwick@xxxxxxxxxx, isaku.yamahata@xxxxxxxxx,
kirill.shutemov@xxxxxxxxxxxxxxx, suzuki.poulose@xxxxxxx, steven.price@xxxxxxx,
quic_eberman@xxxxxxxxxxx, quic_mnalajal@xxxxxxxxxxx, quic_tsoni@xxxxxxxxxxx,
quic_svaddagi@xxxxxxxxxxx, quic_cvanscha@xxxxxxxxxxx, quic_pderrin@xxxxxxxxxxx,
quic_pheragu@xxxxxxxxxxx, catalin.marinas@xxxxxxx, james.morse@xxxxxxx,
yuzenghui@xxxxxxxxxx, oliver.upton@xxxxxxxxx, maz@xxxxxxxxxx, will@xxxxxxxxxx,
qperret@xxxxxxxxxx, keirf@xxxxxxxxxx, tabba@xxxxxxxxxx

Signed-off-by: Elliot Berman <quic_eberman@xxxxxxxxxxx>
---
Elliot Berman (2):
      mm/gup-test: Verify exclusive pinned
      mm/gup_test: Verify GUP grabs same pages twice

Fuad Tabba (3):
      mm/gup: Move GUP_PIN_COUNTING_BIAS to page_ref.h
      mm/gup: Add an option for obtaining an exclusive pin
      mm/gup: Add support for re-pinning a normal pinned page as exclusive

 include/linux/mm.h                    |  57 ++++----
 include/linux/mm_types.h              |   2 +
 include/linux/page_ref.h              |  74 ++++++++++
 mm/Kconfig                            |   5 +
 mm/gup.c                              | 265 ++++++++++++++++++++++++++++++----
 mm/gup_test.c                         | 108 ++++++++++++++
 mm/gup_test.h                         |   1 +
 tools/testing/selftests/mm/gup_test.c |   5 +-
 8 files changed, 457 insertions(+), 60 deletions(-)
---
base-commit: 6ba59ff4227927d3a8530fc2973b80e94b54d58f
change-id: 20240509-exclusive-gup-66259138bbff

Best regards,