Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> writes:

> I hate the syscall. It's a very un-Linux-y way of doing things. There,
> I said it. Here's an alternative. It still uses the syscall to
> initiate things, but it uses debugfs to transport the data instead.
> This is just a concept demonstration. It doesn't actually work, and I
> wouldn't be using debugfs in practice.

A syscall is a very linux-y way to do it. If you called it a core dump
instead of a checkpoint you would have exactly the same set of issues.

Why we are doing vfs_write instead of file->f_op->write I don't
understand.

> System calls in Linux are fast. Doing lots of them is not a problem.
> If it becomes one, we can always export a condensed version of this
> format next to the expanded one, kinda like ftrace does. Atomicity with
> this approach is also not a problem. The system call in this approach
> doesn't return until the checkpoint is completely written out.

Extra copies of something (memory) you want to transfer quickly and
efficiently are a problem. Reading the memory of another process is a
problem, to the point that the /proc/<pid>/mem interface has been
removed from the kernel.

> This lets userspace pick and choose what parts of the checkpoint it
> cares about. It enables us to do all the I/O from userspace: no
> in-kernel sys_read/write(). I think this interface is much more
> flexible than a plain syscall.

Then get with Roland McGrath and build the next generation user space
debugging interface.

> Want to do a fast checkpoint? Fine, copy all data, use a lot of memory,
> store it in-kernel. Dump that out when the filesystem is accessed.
> Destroy it when userspace asks.
>
> So, why not?

Besides the part about creating a bunch of questionable interfaces that
we need to support forever?

Ultimately the question is how do you do checkpoint/restore, and I just
don't see that happening with a filesystem interface. Way way way too
many dangerous syscalls that are only needed for one thing.
Checkpoint/Restore is an atomic operation, and filesystems suck at
building high level atomic primitives.

Eric
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers