On Wed, 2022-06-15 at 19:58 +0300, Kirill A. Shutemov wrote: > On Mon, Jun 13, 2022 at 05:36:43PM +0000, Edgecombe, Rick P wrote: > > On Fri, 2022-06-10 at 17:35 +0300, Kirill A. Shutemov wrote: > > > +#ifdef CONFIG_X86_64 > > > +/* > > > + * Mask out tag bits from the address. > > > + * > > > + * Magic with the 'sign' allows to untag userspace pointer > > > without > > > any branches > > > + * while leaving kernel addresses intact. > > > > Trying to understand the magic part here. I guess how it works is, > > when > > the high bit is set, it does the opposite of untagging the > > addresses by > > setting the tag bits instead of clearing them. So: > > - For proper canonical kernel addresses (with U57) it leaves > > them > > intact since the tag bits were already set. > > - For non-canonical kernel-half addresses, it fixes them up. > > (0xeffffff000000840->0xfffffff000000840) > > - For U48 and 5 level paging, it corrupts some normal kernel > > addresses. (0xff90ffffffffffff->0xffffffffffffffff) > > > > I just ported this to userspace and threw some addresses at it to > > see > > what happened, so hopefully I got that right. > > Ouch. Thanks for noticing this. I should have catched this myself. > Yes, > this implementation is broken for LAM_U48 on 5-level machine. > > What about this: > > #define untagged_addr(mm, > addr) ({ \ > u64 __addr = (__force > u64)(addr); \ > s64 sign = (s64)__addr >> > 63; \ > __addr &= (mm)->context.untag_mask | > sign; \ > (__force > __typeof__(addr))__addr; \ > }) > > It makes mask effectively. all-ones for supervisor addresses. And it > is > less magic to my eyes. Yea, it seems to leave kernel half addresses alone now, including leaving non-canonical addresses as non-canonical and 5 level addresses. With the new bit math: Reviewed-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx> > > The generated code also look sane to me: > > 11d0: 48 89 f8 mov %rdi,%rax > 11d3: 48 c1 f8 3f sar $0x3f,%rax > 11d7: 48 0b 05 52 2e 00 00 or > 0x2e52(%rip),%rax # 4030 <untag_mask> > 11de: 48 21 f8 and %rdi,%rax > > Any comments? > > > Is this special kernel address handling only needed because > > copy_to_kernel_nofault(), etc call the user helpers? > > I did not have any particular use-case in mind. But just if some > kernel > address gets there and bits get cleared we will have very hard to > debug > bug. I just was thinking if we could rearrange the code to avoid untagging kernel addresses, we could skip this, or even VM_WARN_ON() if we see one. Seems ok either way.