On Tue, May 07, 2024 at 08:15:08PM +0530, Kundan Kumar wrote:
> Add a bigger size from folio to bio and skip processing for pages.
>
> Fetch the offset of page within a folio. Depending on the size of folio
> and folio_offset, fetch a larger length. This length may consist of
> multiple contiguous pages if folio is multiorder.

The problem is that it may not.  Here's the scenario:

int fd, fd2;
int i, j = 0;

fd = open(src, O_RDONLY);
/* Private, file-backed mapping of the source file. */
char *addr = mmap(NULL, 1024 * 1024, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE, fd, 0);

/* Fault in the whole mapping with reads. */
for (i = 0; i < 1024 * 1024; i++)
	j |= addr[i];

/* A single store forces COW of the page backing addr[30000]. */
addr[30000] = 17;

fd2 = open(dest, O_RDWR | O_DIRECT);
write(fd2, &addr[16384], 32768);

Assuming that the source file supports being cached in large folios,
the page array we get from GUP might contain:

f0p4 f0p5 f0p6 f1p0 f0p8 f0p9 ...

because we allocated 'f1' when we did COW due to the store to
addr[30000].

We can certainly reduce the cost of merge if we know two pages are part
of the same folio, but we still need to check that we actually got
pages which are part of the same folio.
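
To make that concrete, something along these lines would do.  It is
only a sketch: contiguous_pages() is a name made up for illustration,
while page_folio() and folio_page_idx() are the existing helpers for
mapping a page back to its folio and to its index within that folio.

#include <linux/mm.h>	/* page_folio(), folio_page_idx() */

/*
 * Illustrative sketch only: return how many pages at the start of
 * @pages (n >= 1) are consecutive pages of a single folio.  For the
 * array above this returns 3 (f0p4 f0p5 f0p6), stopping at f1p0.
 */
static size_t contiguous_pages(struct page **pages, size_t n)
{
	struct folio *folio = page_folio(pages[0]);
	size_t idx = folio_page_idx(folio, pages[0]);
	size_t i;

	for (i = 1; i < n; i++) {
		/* Same folio, and the next page within it? */
		if (page_folio(pages[i]) != folio ||
		    folio_page_idx(folio, pages[i]) != idx + i)
			break;
	}

	return i;
}

The folio pointer comparison is the cheap part; the index check is
what catches the hole left by the COW'd page.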