On Wed, Mar 9, 2022 at 1:22 AM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Tue, Mar 8, 2022 at 3:25 PM Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote: > > > > Seems to be working on s390x for this test case at least; the kind of > > trace I'm getting is: > > Good. > > > This shows bursts of successful fault-ins in gfs2_file_read_iter > > (read_fault). The pauses in between might be caused by the storage; > > I'm not sure. > > Don't know about the pauses, but the burst size might be impacted by that > > + Â Â Â const size_t max_size = 4 * PAGE_SIZE; > > thing that limits how many calls to fixup_user_fault() we do per > fault_in_safe_writeable(). > > So it might be worth checking if that value seems to make any difference. > > > I'd still let the caller of fault_in_safe_writeable() decide how much > > memory to fault in; the tight cap in fault_in_safe_writeable() doesn't > > seem useful. > > Well, there are real latency concerns there - fixup_user_fault() is > not necessarily all that low-cost. I just don't know if making the GUP based approach work instead of switching to fixup_user_fault(), or introducing something else, would make enough of a performance difference to be worth it. > And it's actually going to be worse when we have the sub-page coloring > issues happening, and will need to probe at a 128-byte granularity > (not on s390, but on arm64). > > At that point, we almost certainly will need to have a "probe user > space non-destructibly for writability" instruction (possibly > extending on our current arch_futex_atomic_op_inuser() > infrastructure). Let me add Catalin Marinas to the CC.