On Tue, Aug 31, 2021 at 03:28:57PM +0000, Al Viro wrote: > On Tue, Aug 31, 2021 at 02:54:50PM +0100, Catalin Marinas wrote: > > An arm64-specific workaround would be for pagefault_disable() to disable > > tag checking. It's a pretty big hammer, weakening the out of bounds > > access detection of MTE. My preference would be a fix in the btrfs code. > > > > A btrfs option would be for copy_to_sk() to return an indication of > > where the fault occurred and get fault_in_pages_writeable() to check > > that location, even if the copying would restart from an earlier offset > > (this requires open-coding copy_to_user_nofault()). An attempt below, > > untested and does not cover read_extent_buffer_to_user_nofault(): > > Umm... There's another copy_to_user_nofault() call in the same function > (same story, AFAICS). I cleaned up this patch [1] but I realised it still doesn't solve it. The arm64 __copy_to_user_inatomic(), while ensuring progress if called in a loop, it does not guarantee precise copy to the fault position. The copy_to_sk(), after returning an error, starts again from the previous sizeof(sh) boundary rather than from where the __copy_to_user_inatomic() stopped. So it can get stuck attempting to copy the same search header. An ugly fix is to fall back to byte by byte copying so that we can attempt the actual fault address in fault_in_pages_writeable(). If the sh being recreated in copy_to_sk() is the same on the retried iteration, we could use an *sk_offset that is not a multiple of sizeof(sh) in order to have progress. But it's not clear to me whether the data being copied can change once btrfs_release_path() is called. [1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=devel/btrfs-fix -- Catalin