On Wed, Oct 20, 2021 at 12:44 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote: > > However, with MTE doing both get_user() every 16 bytes and > gup can get pretty expensive. So I really think that anything that is performance-critical had better only do the "fault_in_write()" code path in the cold error path where you took a page fault. IOW, the loop structure should be repeat: take_locks(); pagefault_disable(); ret = do_things_with_locks(); pagefault_enable(); release_locks(); // Failure path? if (unlikely(ret == -EFAULT)) { if (fault_in_writable(..)) goto repeat; return -EFAULT; } rather than have it be some kind of "first do fault_in_writable(), then do the work" model. So I wouldn't worry too much about the performance concerns. It simply shouldn't be a common or hot path. And yes, I've seen code that does that "fault_in_xyz()" before the critical operation that cannot take page faults, and does it unconditionally. But then it isn't the "fault_in_xyz()" that should be blamed if it is slow, but the caller that does things the wrong way around. Linus