On 2019/5/7 下午11:47, Christoph Hellwig wrote: > On Mon, May 06, 2019 at 10:23:29PM -0400, Jason Wang wrote: >> Note: there're archs (few non popular ones) that don't implement >> futex helper, we can't log dirty pages. We can fix them on top or >> simply disable LOG_ALL features of vhost. > > That means vhost now has to depend on HAVE_FUTEX_CMPXCHG to make > sure we have a working implementation. I found HAVE_FUTEX_CMPXCHG is not a must for arch that has the implementation and futex does some kind of runtime detection like: static void __init futex_detect_cmpxchg(void) { #ifndef CONFIG_HAVE_FUTEX_CMPXCHG u32 curval; /* * This will fail and we want it. Some arch implementations do * runtime detection of the futex_atomic_cmpxchg_inatomic() * functionality. We want to know that before we call in any * of the complex code paths. Also we want to prevent * registration of robust lists in that case. NULL is * guaranteed to fault and we get -EFAULT on functional * implementation, the non-functional ones will return * -ENOSYS. */ if (cmpxchg_futex_value_locked(&curval, NULL, 0, 0) == -EFAULT) futex_cmpxchg_enabled = 1; #endif } > > >> #include <linux/sched/signal.h> >> #include <linux/interval_tree_generic.h> >> #include <linux/nospec.h> >> +#include <asm/futex.h> > > Also please include the futex maintainers to make sure they are fine > with this first usage of <asm/futex.h> outside of kernel/futex.c. > Thanks for ccing them. Will do for next version. If we decide to go this way, we probably need to move it to uaccess for a more generic helper. > >> +static int set_bit_to_user(int nr, u32 __user *addr) >> { >> unsigned long log = (unsigned long)addr; >> struct page *page; >> + u32 old_log; >> int r; >> >> r = get_user_pages_fast(log, 1, 1, &page); >> if (r < 0) >> return r; >> BUG_ON(r != 1); >> + >> + r = futex_atomic_cmpxchg_inatomic(&old_log, addr, 0, 0); >> + if (r < 0) >> + return r; >> + >> + old_log |= 1 << nr; >> + r = put_user(old_log, addr); >> + if (r < 0) >> + return r; > > And this just looks odd to me. Why do we need the futex call to > replace a 0 value with 0? Why does it still duplicate the > put_user? This doesn't look like actually working code to me. Yes, this is a bug. Should be something like: static int set_bit_to_user(int nr, u32 __user *addr) { unsigned long log = (unsigned long)addr; struct page *page; u32 old_log, new_log, l; int r; r = get_user_pages_fast(log, 1, 1, &page); if (r < 0) return r; BUG_ON(r != 1); do { r = get_user(old_log, addr); if (r < 0) return r; new_log = old_log | (1 << nr); r = futex_atomic_cmpxchg_inatomic(&l, addr, old_log, new_log); if (r < 0) return r; } while(l != new_log); set_page_dirty_lock(page); put_page(page); return 0; } > > Also don't we need a pagefault_disable() around > futex_atomic_cmpxchg_inatomic? Since we don't want to deal with pagefault, so the page has been pinned before futex_atomic_cmpxchg_inatomic(). Thanks