On Wed, Dec 11, 2024 at 2:47 PM Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
>
> Hi Andrei
>
> Thanks for your email.
> I was hoping to get some feedback from CRIU devs, and happy to see you
> reaching out..
> ...
> I have been thinking of other alternatives, but those would require
> more understanding on CRIU use cases.
> One of my questions is: Would CRIU target an individual process? or
> entire systems?

It targets individual processes that have been forked from the main CRIU
process.

>
> If it is an individual process, we could use prctl to opt-in/opt-out
> certain processes. There could be two alternatives.
> 1> Opt-in solution: process must set prctl.seal_criu_mapping, this
> needs to be set before execve() because sealing is applied at execve()
> call.
> 2> opt-out solution: The system will by default seal all of the system
> mappings, but individual processes can opt-out by setting
> prctl.not_seal_criu_mappings. This also needs to be set before
> execve() call.

I like the idea, and I think the opt-out solution should work for CRIU.
CRIU will be able to call this prctl and re-execute itself.

Let me give you a bit of context on how CRIU works. When CRIU restores
processes, it recreates the process tree by forking itself. Afterwards, it
restores all mappings in each process, but not yet at their proper
addresses. After that, each process unmaps the CRIU mappings from its
address space and remaps its restored mappings to the proper addresses. So
CRIU should be able to move system mappings and then seal them if they
were sealed before the dump.

BTW, it isn't just about CRIU. gVisor and maybe some other sandbox
solutions will be affected by this change too. gVisor uses stub processes
to represent guest address spaces. In a stub process, it unmaps all system
mappings.

>
> For both cases, we will want to identify what type of mapping CRIU
> cares about, i.e. maybe CRIU doesn't care about uprobe and vsyscall ?
> and only care about vdso/vvar/sigpage ?

For now, CRIU handles only the vdso/vvar/sigpage mappings. It doesn't care
about vsyscall because that is always mapped at a fixed address. gVisor
should be able to unmap all system mappings from a process address space.

Thanks,
Andrei
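
For reference, the system mappings named in this thread can be listed for
any process by reading /proc/<pid>/maps. The following is only a minimal
sketch of that idea, not how CRIU or gVisor actually do it: the helper
names (parse_system_mappings, read_system_mappings) are hypothetical, and
the set of mapping names is an assumption taken from the discussion above.

```python
from typing import List, NamedTuple

# Assumption: the special mapping names mentioned in this thread, as they
# appear in the last column of /proc/<pid>/maps. Not an exhaustive list.
SYSTEM_MAPPINGS = {"[vdso]", "[vvar]", "[sigpage]", "[vsyscall]", "[uprobes]"}


class Mapping(NamedTuple):
    start: int   # first address of the mapping
    end: int     # address one past the end of the mapping
    perms: str   # permission string, e.g. "r-xp"
    name: str    # bracketed name, e.g. "[vdso]"


def parse_system_mappings(maps_text: str) -> List[Mapping]:
    """Return the system mappings found in the text of a maps file."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional name.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no name column
        name = fields[5].strip()
        if name not in SYSTEM_MAPPINGS:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        found.append(Mapping(start, end, fields[1], name))
    return found


def read_system_mappings(pid: str = "self") -> List[Mapping]:
    """Read /proc/<pid>/maps (Linux only) and return its system mappings."""
    with open(f"/proc/{pid}/maps") as f:
        return parse_system_mappings(f.read())


if __name__ == "__main__":
    for m in read_system_mappings():
        print(f"{m.start:x}-{m.end:x} {m.perms} {m.name}")
```

Run on a Linux host, this prints one line per special mapping of the
current process. A restorer would still have to move or unmap these ranges
itself (for example with mremap() or munmap()); the sketch only locates
them.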