On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > > page order to keep the same as it used in the user space, but
> > > > > get_user_pages_fast() will mess it up.
> > > >
> > > > I don't understand how get_user_pages_fast() can return the pages in a
> > > > different order in the array from the order they appear in userspace.
> > > > Can you explain?
> > > Please see the code in gup.c:
> > >
> > >     int get_user_pages_fast(unsigned long start, int nr_pages,
> > >                     unsigned int gup_flags, struct page **pages)
> > >     {
> > >             .......
> > >             if (gup_fast_permitted(start, nr_pages)) {
> > >                     local_irq_disable();
> > >                     gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time.
> >
> > Right ... but if it's not filled entirely, it will be filled part-way,
> > and then we stop.
> >
> > >                     local_irq_enable();
> > >                     ret = nr;
> > >             }
> > >             .......
> > >             if (nr < nr_pages) {
> > >                     /* Try to get the remaining pages with get_user_pages */
> > >                     start += nr << PAGE_SHIFT;
> > >                     pages += nr; // The @pages is moved forward.
> >
> > Yes, to the point where gup_pgd_range() stopped.
> >
> > >                     if (gup_flags & FOLL_LONGTERM) {
> > >                             down_read(&current->mm->mmap_sem);
> > >                             ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time
> >
> > Right.
> >
> > >                             /*
> > >                              * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > >                              * possible
> > >                              */
> > >                             ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time.
> > >                                                           pages, gup_flags);
> >
> > Yes.  But they'll be in the same order.
> >
> > > BTW, I do not know why we mess up the page order. It maybe used in some special case.
> >
> > I'm not discounting the possibility that you've found a bug.
> > But documenting that a bug exists is not the solution; the solution is
> > fixing the bug.
> I do not think it is a bug :)
>
> If we use the get_user_pages_unlocked(), DMA is okay, such as:
>         ....
>         get_user_pages_unlocked()
>         sg_alloc_table_from_pages()
>         .....
>
> I think the comment is not accurate enough. So just add more comments, and tell the driver
> users how to use the GUPs.

gup_fast() and gup_unlocked() should return the pages in the same order.
If they do not, then it is a bug.
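
To make the ordering argument concrete, here is a minimal user-space sketch of
the two-phase fill discussed above. It is not kernel code: struct fake_page,
fast_fill(), slow_fill(), two_phase_fill() and in_user_order() are hypothetical
names invented for illustration, and the "fast path stops after fast_limit
pages" behaviour is an assumption standing in for gup_pgd_range() bailing out
early. The only point it tries to show is that advancing both the start
position and the pages pointer by nr keeps the array in userspace order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; one entry per user page. */
struct fake_page { size_t index; };

#define NR_TOTAL 8
static struct fake_page backing[NR_TOTAL];

/* Models the fast path: fills in address order, may stop early at 'limit'. */
static size_t fast_fill(size_t first, size_t nr_pages, size_t limit,
                        struct fake_page **pages)
{
    size_t nr = 0;

    while (nr < nr_pages && nr < limit) {
        pages[nr] = &backing[first + nr];
        nr++;
    }
    return nr;
}

/* Models the slow path: fills everything it is asked for, in address order. */
static size_t slow_fill(size_t first, size_t nr_pages,
                        struct fake_page **pages)
{
    for (size_t i = 0; i < nr_pages; i++)
        pages[i] = &backing[first + i];
    return nr_pages;
}

/* Fast path first, then the slow path resumes exactly where it stopped. */
static size_t two_phase_fill(size_t first, size_t nr_pages, size_t fast_limit,
                             struct fake_page **pages)
{
    size_t nr = fast_fill(first, nr_pages, fast_limit, pages);

    if (nr < nr_pages) {
        first += nr;   /* corresponds to: start += nr << PAGE_SHIFT */
        pages += nr;   /* corresponds to: pages += nr */
        nr += slow_fill(first, nr_pages - nr, pages);
    }
    return nr;
}

/* Returns 1 if pages[i] refers to user page first + i for every i. */
static int in_user_order(size_t first, size_t nr_pages,
                         struct fake_page **pages)
{
    for (size_t i = 0; i < nr_pages; i++)
        if (pages[i] != &backing[first + i])
            return 0;
    return 1;
}
```

Calling two_phase_fill() with any fast_limit between 0 and nr_pages and then
checking in_user_order() on the result should succeed in this model, whichever
path ends up filling the tail of the array.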