Re: [PATCH v1 0/5] s390: page_mapcount(), page_has_private() and PG_arch_1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Thu, Apr 04, 2024 at 06:36:37PM +0200, David Hildenbrand wrote:
> On my journey to remove page_mapcount(), I got hooked up on other folio
> cleanups that Willy most certainly will enjoy.
> This series removes the s390x usage of:
> * page_mapcount() [patches WIP]
> * page_has_private() [have patches to remove it]
> ... and makes PG_arch_1 only be set on folio->flags (i.e., never on tail
> pages of large folios).
> Further, one "easy" fix upfront.

Looks like you didn't see:

> ... unfortunately there is one other issue I spotted that I am not
> tackling in this series, because I am not 100% sure what we want to
> do: the usage of page_ref_freeze()/folio_ref_freeze() in
> make_folio_secure() is unsafe. :(
> In make_folio_secure(), we're holding the folio lock, the mmap lock and
> the PT lock. So we are protected against concurrent fork(), zap, GUP,
> swapin, migration ... The page_ref_freeze()/ folio_ref_freeze() should
> also block concurrent GUP-fast very reliably.
> But if the folio is mapped into multiple page tables, we could see
> concurrent zapping of the folio, a pagecache folios could get mapped/
> accessed concurrent, we could see fork() sharing the page in another
> process, GUP ... trying to adjust the folio refcount while we froze it.
> Very bad.

Hmmm.  Why is that not then a problem for, eg, splitting or migrating?
Is it because they unmap first and then try to freeze?

> For anonymous folios, it would likely be sufficient to check that
> folio_mapcount() == 1. For pagecache folios, that's insufficient, likely
> we would have to lock the pagecache. To handle folios mapped into
> multiple page tables, we would have to do what
> split_huge_page_to_list_to_order() does (temporary migration entries).
> So it's a bit more involved, and I'll have to leave that to s390x folks to
> figure out. There are othe reasonable cleanups I think, but I'll have to
> focus on other stuff.
> Compile tested, but not runtime tested, I'll appreiate some testing help
> from people with UV access and experience.
> Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
> Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> Cc: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>
> Cc: Sven Schnelle <svens@xxxxxxxxxxxxx>
> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
> Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
> Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> Cc: Thomas Huth <thuth@xxxxxxxxxx>
> David Hildenbrand (5):
>   s390/uv: don't call wait_on_page_writeback() without a reference
>   s390/uv: convert gmap_make_secure() to work on folios
>   s390/uv: convert PG_arch_1 users to only work on small folios
>   s390/uv: update PG_arch_1 comment
>   s390/hugetlb: convert PG_arch_1 code to work on folio->flags
>  arch/s390/include/asm/page.h |   2 +
>  arch/s390/kernel/uv.c        | 112 ++++++++++++++++++++++-------------
>  arch/s390/mm/gmap.c          |   4 +-
>  arch/s390/mm/hugetlbpage.c   |   8 +--
>  4 files changed, 79 insertions(+), 47 deletions(-)
> -- 
> 2.44.0

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux