On 25/02/2019 16:57, Catalin Marinas wrote:
> On Tue, Feb 19, 2019 at 06:38:31PM +0000, Szabolcs Nagy wrote:
>> i think these rules work for the cases i care about, a more
>> tricky question is when/how to check for the new syscall abi
>> and when/how the TCR_EL1.TBI0 setting may be turned off.
>
> I don't think turning TBI0 off is critical (it's handy for PAC with
> 52-bit VA but then it's short-lived if you want more security features
> like MTE).

yes, i made a mistake assuming TBI0 off is
required for (or at least compatible with) MTE.

if TBI0 needs to be on for MTE then some of my
analysis is wrong, and i expect TBI0 to be on
in the foreseeable future.

>> consider the following cases (tb == top byte):
>>
>> binary 1: user tb = any, syscall tb = 0
>>   tbi is on, "legacy binary"
>>
>> binary 2: user tb = any, syscall tb = any
>>   tbi is on, "new binary using tb"
>>   for backward compat it needs to check for new syscall abi.
>>
>> binary 3: user tb = 0, syscall tb = 0
>>   tbi can be off, "new binary",
>>   binary is marked to indicate unused tb,
>>   kernel may turn tbi off: additional pac bits.
>>
>> binary 4: user tb = mte, syscall tb = mte
>>   like binary 3, but with mte, "new binary using mte"

so this should be "like binary 2, but with mte".

>>   does it have to check for new syscall abi?
>>   or MTE HWCAP would imply it?
>>   (is it possible to use mte without new syscall abi?)
>
> I think MTE HWCAP should imply it.
>
>> in userspace we want most binaries to be like binary 3 and 4
>> eventually, i.e. marked as not-relying-on-tbi, if a dso is
>> loaded that is unmarked (legacy or new tb user), then either
>> the load fails (e.g. if mte is already used? or can we turn
>> mte off at runtime?) or tbi has to be enabled (prctl? does
>> this work with pac? or multi-threads?).
>
> We could enable it via prctl. That's the plan for MTE as well (in
> addition maybe to some ELF flag).
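
to make the cases above concrete, here is a minimal sketch in c of how
i read them, with binary 4 corrected to "like binary 2, but with mte".
the struct and function names are made up for illustration, nothing
like this exists in the kernel or libc, and it only encodes the two
questions asked above (may tbi be off, is the new syscall abi needed).

```c
#include <assert.h>
#include <stdbool.h>

/* hypothetical description of how a binary uses the top byte (tb) */
struct tb_usage {
	bool user_tb;    /* user pointers may carry a non-zero tb */
	bool syscall_tb; /* a non-zero tb may be passed to syscalls */
};

/* the four cases from the thread (binary 4 as corrected above) */
static const struct tb_usage binary1 = { true,  false }; /* legacy */
static const struct tb_usage binary2 = { true,  true  }; /* new, uses tb */
static const struct tb_usage binary3 = { false, false }; /* new, tb unused */
static const struct tb_usage binary4 = { true,  true  }; /* new, uses mte */

/* tbi can only be turned off if no pointer ever carries a non-zero tb */
static bool tbi_may_be_off(struct tb_usage u)
{
	return !u.user_tb && !u.syscall_tb;
}

/* passing a non-zero tb to the kernel relies on the relaxed syscall abi */
static bool needs_new_syscall_abi(struct tb_usage u)
{
	return u.syscall_tb;
}

/* self-check of the classification as described in the thread */
static void check_cases(void)
{
	assert(!tbi_may_be_off(binary1) && !needs_new_syscall_abi(binary1));
	assert(!tbi_may_be_off(binary2) &&  needs_new_syscall_abi(binary2));
	assert( tbi_may_be_off(binary3) && !needs_new_syscall_abi(binary3));
	assert(!tbi_may_be_off(binary4) &&  needs_new_syscall_abi(binary4));
}
```

in this model only binary 3 allows tbi to be off, which matches the
point that TBI0 can stay on if binary 3 is ignored.
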
>
>> as for checking the new syscall abi: i don't see much semantic
>> difference between AT_HWCAP and AT_FLAGS (either way, the user
>> has to check a feature flag before using the feature of the
>> underlying system and it does not matter much if it's a syscall
>> abi feature or cpu feature), but i don't see anything wrong
>> with AT_FLAGS if the kernel prefers that.
>
> The AT_FLAGS is aimed at capturing binary 2 case above, i.e. the
> relaxation of the syscall ABI to accept tb = any. The MTE support will
> have its own AT_HWCAP, likely in addition to AT_FLAGS. Arguably,
> AT_FLAGS is either redundant here if MTE implies it (and no harm in
> keeping it around) or the meaning is different: a tb != 0 may be checked
> by the kernel against the allocation tag (i.e. get_user() could fail,
> the tag is not entirely ignored).
>
>> the discussion here was mostly about binary 2,
>
> That's because passing tb != 0 into the syscall ABI is the main blocker
> here that needs clearing out before merging the MTE support. There is,
> of course, a variation of binary 1 for MTE:
>
> binary 5: user tb = mte, syscall tb = 0
>
> but this requires a lot of C lib changes to support properly.

yes, i don't think we want to do that.
but it's ok to have both syscall tbi AT_FLAGS and MTE HWCAP.

>> but for
>> me the open question is if we can make binary 3/4 work.
>> (which requires some elf binary marking, that is recognised
>> by the kernel and dynamic loader, and efficient handling of
>> the TBI0 bit, ..if it's not possible, then i don't see how
>> mte will be deployed).
>
> If we ignore binary 3, we can keep TBI0 = 1 permanently, whether we have
> MTE or not.
>
>> and i guess on the kernel side the open question is if the
>> rules 1/2/3/4 can be made to work in corner cases e.g. when
>> pointers embedded into structs are passed down in ioctl.
>
> We've been trying to track these down since last summer and we came to
> the conclusion that it should be (mostly) fine for the non-weird memory
> described above.

i think an interesting case is when userspace passes
a pointer to the kernel and later gets it back,
which is why i proposed rule 4 (kernel has to keep
the tag then).

but i wonder what's the right thing to do for sp
(user can malloc thread/sigalt/makecontext stack
which will be mte tagged in practice with mte on)
does tagged sp work? should userspace untag the
stack memory before setting it up as a stack?
(but then user pointers to that allocation may get
broken..)
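
to illustrate the round-trip concern, a minimal sketch in c that
models addresses as plain 64-bit integers so it runs anywhere. the
helper names and roundtrip() are hypothetical, not kernel or libc
interfaces, and it treats the whole top byte (bits 63:56) as the tag
as in the cases above, even though an mte tag would only use part of
that byte.

```c
#include <assert.h>
#include <stdint.h>

#define TB_SHIFT 56
#define TB_MASK  ((uint64_t)0xff << TB_SHIFT)

/* place tb into bits 63:56 of addr, replacing any previous top byte */
static uint64_t set_tb(uint64_t addr, uint8_t tb)
{
	return (addr & ~TB_MASK) | ((uint64_t)tb << TB_SHIFT);
}

/* extract the top byte */
static uint8_t get_tb(uint64_t addr)
{
	return (uint8_t)(addr >> TB_SHIFT);
}

/* what a libc would have to do before every syscall in the binary 5 case */
static uint64_t clear_tb(uint64_t addr)
{
	return addr & ~TB_MASK;
}

/*
 * hypothetical model of an interface that stores a user pointer and
 * hands it back later; keep_tag selects whether rule 4 is followed.
 */
static uint64_t roundtrip(uint64_t user_ptr, int keep_tag)
{
	return keep_tag ? user_ptr : clear_tb(user_ptr);
}

/* self-check: without rule 4 the returned value no longer matches */
static void check_roundtrip(void)
{
	uint64_t p = set_tb(0x0000ffff12345678ULL, 0x2a);

	assert(roundtrip(p, 1) == p);
	assert(roundtrip(p, 0) != p);
	assert(get_tb(roundtrip(p, 0)) == 0);
}
```

if the tag is dropped on the way, userspace comparing the returned
value against its own tagged pointer sees a mismatch, and the same
applies to user pointers into a stack allocation that was untagged
before being set up as a stack.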