>> + for (i = 0; i < nr; i++, page++) { >> + if (anon) { >> + /* >> + * If this page may have been pinned by the >> + * parent process, copy the page immediately for >> + * the child so that we'll always guarantee the >> + * pinned page won't be randomly replaced in the >> + * future. >> + */ >> + if (unlikely(page_try_dup_anon_rmap( >> + page, false, src_vma))) { >> + if (i != 0) >> + break; >> + /* Page may be pinned, we have to copy. */ >> + return copy_present_page( >> + dst_vma, src_vma, dst_pte, >> + src_pte, addr, rss, prealloc, >> + page); >> + } >> + rss[MM_ANONPAGES]++; >> + VM_BUG_ON(PageAnonExclusive(page)); >> + } else { >> + page_dup_file_rmap(page, false); >> + rss[mm_counter_file(page)]++; >> + } >> } >> - rss[MM_ANONPAGES]++; >> - } else if (page) { >> - folio_get(folio); >> - page_dup_file_rmap(page, false); >> - rss[mm_counter_file(page)]++; >> + >> + nr = i; >> + folio_ref_add(folio, nr); > > You're changing the order of mapcount vs. refcount increment. Don't. > Make sure your refcount >= mapcount. > > You can do that easily by doing the folio_ref_add(folio, nr) first and > then decrementing in case of error accordingly. Errors due to pinned > pages are the corner case. > > I'll note that it will make a lot of sense to have batch variants of > page_try_dup_anon_rmap() and page_dup_file_rmap(). > i still don't understand why it is not a entire map+1, but an increment in each basepage. as long as it is a CONTPTE large folio, there is no much difference with PMD-mapped large folio. it has all the chance to be DoubleMap and need split. When A and B share a CONTPTE large folio, we do madvise(DONTNEED) or any similar things on a part of the large folio in process A, this large folio will have partially mapped subpage in A (all CONTPE bits in all subpages need to be removed though we only unmap a part of the large folioas HW requires consistent CONTPTEs); and it has entire map in process B(all PTEs are still CONPTES in process B). isn't it more sensible for this large folios to have entire_map = 0(for process B), and subpages which are still mapped in process A has map_count =0? (start from -1). > Especially, the batch variant of page_try_dup_anon_rmap() would only > check once if the folio maybe pinned, and in that case, you can simply > drop all references again. So you either have all or no ptes to process, > which makes that code easier. > > But that can be added on top, and I'll happily do that. > > -- > Cheers, > > David / dhildenb Thanks Barry