On Wed, Feb 09, 2022 at 06:37:53PM -0800, Andy Lutomirski wrote:
> On 2/8/22 18:18, Edgecombe, Rick P wrote:
> > On Tue, 2022-02-08 at 20:02 +0300, Cyrill Gorcunov wrote:
> > > On Tue, Feb 08, 2022 at 08:21:20AM -0800, Andy Lutomirski wrote:
> > > > > > But such a knob will immediately reduce the security value
> > > > > > of the entire thing, and I don't have good ideas how to
> > > > > > deal with it :(
> > > > >
> > > > > Probably a kind of latch in the task_struct which would
> > > > > trigger off once return to a different address happened,
> > > > > thus we would be able to jump inside parasite code. Of
> > > > > course such trigger should be available under proper
> > > > > capability only.
> > > >
> > > > I'm not fully in touch with how parasite, etc works. Are we
> > > > talking about save or restore?
> > >
> > > We use parasite code in question during checkpoint phase as far
> > > as I remember. push addr/lret trick is used to run "injected"
> > > code (code injection itself is done via ptrace) in compat mode
> > > at least. Dima, Andrei, I didn't look into this code for years
> > > already, do we still need to support compat mode at all?
> > >
> > > > If it's restore, what exactly does CRIU need to do? Is it just
> > > > that CRIU needs to return out from its resume code into the
> > > > to-be-resumed program without tripping CET? Would it be
> > > > acceptable for CRIU to require that at least one shstk slot be
> > > > free at save time? Or do we need a mechanism to atomically
> > > > switch to a completely full shadow stack at resume?
> > > >
> > > > Off the top of my head, a sigreturn (or sigreturn-like
> > > > mechanism) that is intended for use for altshadowstack could
> > > > safely verify a token on the altshadowstack, possibly compare
> > > > to something in ucontext (or not -- this isn't clearly
> > > > necessary) and switch back to the previous stack. CRIU could
> > > > use that too. Obviously CRIU will need a way to populate the
> > > > relevant stacks, but WRUSS can be used for that, and I think
> > > > this is a fundamental requirement for CRIU -- CRIU restore
> > > > absolutely needs a way to write the saved shadow stack data
> > > > into the shadow stack.
> >
> > Still wrapping my head around the CRIU save and restore steps, but
> > another general approach might be to give ptrace the ability to
> > temporarily pause/resume/set CET enablement and SSP for a stopped
> > thread. Then injected code doesn't need to jump through any hoops
> > or possibly run into road blocks. I'm not sure how much this opens
> > things up if the thread has to be stopped...
>
> Hmm, that's maybe not insane.
>
> An alternative would be to add a bona fide ptrace call-a-function
> mechanism. I can think of two potentially usable variants:
>
> 1. Straight call. PTRACE_CALL_FUNCTION(addr) just emulates CALL addr,
> shadow stack push and all.
>
> 2. Signal-style. PTRACE_CALL_FUNCTION_SIGFRAME injects an actual
> signal frame just like a real signal is being delivered with the
> specified handler. There could be a variant to opt-in to also using a
> specified altstack and altshadowstack.

I think this would be ideal. In CRIU, the parasite code is executed in
"daemon" mode and returns via sigreturn. Right now, CRIU has to
generate the signal frame itself. If I understand your idea right, the
signal frame would be generated by the kernel instead.

>
> 2 would be more expensive but would avoid the need for much in the
> way of asm magic. The injected code could be plain C (or Rust or Zig
> or whatever).
>
> All of this only really handles save, not restore. I don't understand
> restore enough to fully understand the issue.

In a few words, it works like this: CRIU restores all required
resources and prepares a signal frame with the target process state,
then it switches to a small PIE blob, where it restores the VMAs and
calls rt_sigreturn.

>
> --Andy
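
To make the "prepare a signal frame, then call rt_sigreturn" step a bit
more concrete, here is a minimal userspace sketch. It is not CRIU code.
It assumes Linux with glibc, and the names edit_frame() and
frame_state_installed_by_sigreturn() are made up for illustration. The
handler changes only the signal mask saved in the frame, and the kernel
installs that mask when the handler returns through rt_sigreturn. The
restore path described above relies on the same property, only for the
whole process state and with a frame that CRIU builds itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <string.h>
#include <ucontext.h>

/*
 * Signal handler: edit the state saved in the signal frame, not the
 * live thread state. The kernel reloads the signal mask from
 * uc_sigmask when the handler returns via rt_sigreturn.
 */
static void edit_frame(int sig, siginfo_t *info, void *ctx)
{
	ucontext_t *uc = ctx;

	(void)sig;
	(void)info;
	sigaddset(&uc->uc_sigmask, SIGUSR2);
}

/*
 * Hypothetical helper: returns 1 if SIGUSR2 is blocked after the
 * handler returned (i.e. the edited frame was installed), 0 if it is
 * not, and -1 on error.
 */
int frame_state_installed_by_sigreturn(void)
{
	struct sigaction sa;
	sigset_t set;
	int blocked;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = edit_frame;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return -1;

	/* Start from a known state: both signals unblocked. */
	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	sigaddset(&set, SIGUSR2);
	if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0)
		return -1;

	/* The handler runs and returns before raise() returns. */
	if (raise(SIGUSR1) != 0)
		return -1;

	/* Nobody called sigprocmask() to block SIGUSR2; check anyway. */
	if (sigprocmask(SIG_SETMASK, NULL, &set) != 0)
		return -1;
	blocked = sigismember(&set, SIGUSR2);

	/* Undo the side effect so the helper can be called again. */
	sigemptyset(&set);
	sigaddset(&set, SIGUSR2);
	sigprocmask(SIG_UNBLOCK, &set, NULL);

	return blocked;
}
```

Calling frame_state_installed_by_sigreturn() from a small main() should
return 1 on Linux: SIGUSR2 ends up blocked only because the frame said
so. The sketch touches the mask alone because that field is
architecture independent; registers live in uc_mcontext and are
arch-specific.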