On Tue, 2022-11-15 at 14:03 +0100, Peter Zijlstra wrote:
> On Tue, Nov 15, 2022 at 01:26:23PM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 04, 2022 at 03:35:51PM -0700, Rick Edgecombe wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > >
> > > Add three new arch_prctl() handles:
> > >
> > >  - ARCH_CET_ENABLE/DISABLE enables or disables the specified
> > >    feature. Returns 0 on success or an error.
> > >
> > >  - ARCH_CET_LOCK prevents future disabling or enabling of the
> > >    specified feature. Returns 0 on success or an error
> > >
> > > The features are handled per-thread and inherited over
> > > fork(2)/clone(2), but reset on exec().
> > >
> > > This is preparation patch. It does not implement any features.
> >
> > Urgh... so much for sharing with other architectures I suppose :/
> >
> > The ARM64 BTI thing is very similar to IBT (except I think their
> > approach to the legacy bitmap is much saner).
> >
> > Given that IBT isn't supported and needs the whole legacy bitmap
> > mess, do we really want to call this CET ? Why not just make a
> > Shadow Stack API and tackle IBT independently.
>
> On that; ARM64 exposes PROT_BTI (to be used by mprotect()) and have
> an ELF_ARM64_BTI note for the loader to bootstrap things.
>
> We could co-opt that same interface and instead of flipping actual
> PTE bits, have this thing manage the legacy bitmap -- basically have
> the legacy bitmap function as an external PTE bit array (in inverse).
>
> Basically, have every page mapped PROT_EXEC set the bit in the legacy
> bitmap while every page mapped PROT_EXEC|PROT_BTI will have the
> legacy bitmap bit to 0.
>
> And as long as there is a single 0 in the bitmap, the feature is
> enabled.
>
> (obviously we can delay allocating the bitmap until the first
> PROT_EXEC mapping that lacks PROT_BTI)

This is an interesting idea. I'll have to think a little more on it.
One non-impossible issue would be setting IBT in the MSR late. Each
thread would have to be interrupted and have it set, while no new
threads are created. Maybe this is easy and I just don't know how to do
it.

The other thing is there would be overhead compared to an IBT
implementation with a separate interface from BTI. Would have to look
at the tradeoffs.
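
To make the bookkeeping in the bitmap proposal concrete, here is a rough
userspace model of it. This is only a sketch of my reading of the idea,
not kernel code: the names (legacy_model, model_map_exec(), MODEL_PAGES,
and so on) are made up for illustration, the address space is a toy 64
pages, and "a single 0 in the bitmap" is interpreted as "at least one
executable page mapped with the BTI-style flag". It ignores the MSR and
cross-thread problem entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGES 64  /* hypothetical toy address space, in pages */

/* Per-page mapping state tracked by the model (not a kernel type). */
enum page_state { PAGE_NONE, PAGE_EXEC_LEGACY, PAGE_EXEC_IBT };

struct legacy_model {
	enum page_state state[MODEL_PAGES];
	unsigned char *bitmap;  /* NULL until the first legacy exec mapping */
};

/*
 * Record an executable mapping of one page.
 * has_bti models PROT_EXEC|PROT_BTI; !has_bti models plain PROT_EXEC.
 * Returns 0 on success, -1 on a bad page index or allocation failure.
 */
static int model_map_exec(struct legacy_model *m, size_t page, bool has_bti)
{
	unsigned char mask;

	if (page >= MODEL_PAGES)
		return -1;
	mask = (unsigned char)(1u << (page % 8));

	if (has_bti) {
		/* Bit is (or becomes) 0: indirect branches get checked. */
		if (m->bitmap)
			m->bitmap[page / 8] &= (unsigned char)~mask;
		m->state[page] = PAGE_EXEC_IBT;
		return 0;
	}

	/* First mapping that lacks the flag: allocate lazily, zero-filled. */
	if (!m->bitmap) {
		m->bitmap = calloc(MODEL_PAGES / 8, 1);
		if (!m->bitmap)
			return -1;
	}
	m->bitmap[page / 8] |= mask;  /* bit 1: legacy code, no enforcement */
	m->state[page] = PAGE_EXEC_LEGACY;
	return 0;
}

/* True if the page is marked legacy (its bitmap bit is 1). */
static bool model_is_legacy(const struct legacy_model *m, size_t page)
{
	if (page >= MODEL_PAGES || !m->bitmap)
		return false;
	return (m->bitmap[page / 8] >> (page % 8)) & 1u;
}

/* Feature counts as enabled while any exec page carries the flag. */
static bool model_feature_enabled(const struct legacy_model *m)
{
	for (size_t i = 0; i < MODEL_PAGES; i++)
		if (m->state[i] == PAGE_EXEC_IBT)
			return true;
	return false;
}
```

In this model a process that only ever maps flagged pages never
allocates a bitmap at all, which is the cheap common case. The point
where model_feature_enabled() first flips to true is where the late MSR
setting I mentioned above would have to happen.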