On Thu, Dec 06, 2018 at 11:24:28AM -0600, Eric W. Biederman wrote:
> Daniel Colascione <dancol@xxxxxxxxxx> writes:
>
> > On Thu, Dec 6, 2018 at 7:02 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> >>
> >> Christian Brauner <christian@xxxxxxxxxx> writes:
> >>
> >> > The kill() syscall operates on process identifiers (pid). After a process
> >> > has exited its pid can be reused by another process. If a caller sends a
> >> > signal to a reused pid it will end up signaling the wrong process. This
> >> > issue has often surfaced and there has been a push [1] to address this
> >> > problem.
> >> >
> >> > This patch uses file descriptors (fd) from proc/<pid> as stable handles on
> >> > struct pid. Even if a pid is recycled the handle will not change. The fd
> >> > can be used to send signals to the process it refers to.
> >> > Thus, the new syscall taskfd_send_signal() is introduced to solve this
> >> > problem. Instead of pids it operates on process fds (taskfd).
> >>
> >> I am not yet thrilled with the taskfd naming.
> >
> > Both the old and new names were fine. Do you want to suggest a name at
> > this point? You can't just say "I don't like this. Guess again"
> > forever.
>
> Both names suck, as neither name actually describes what the function is
> designed to do.
>
> Most issues happen at the interface between abstractions. A name that
> confuses your users will just make that confusion more likely. So it is
> important that we do the very best with the name that we can do.
>
> We are already having questions about what happens when you perform the
> non-sense operation of sending a signal to a zombie. It comes up
> because there are races when a process may die and you are not expecting
> it. That is an issue with the existing signal sending API, that has
> caused confusion. That isn't half as confusing as the naming issue.
>
> A task in linux is a single thread. A process is all of the threads.
> If we are going to support both cases it doesn't make sense to hard code
> a single case in the name.
>
> I would be inclined to simplify things and call the syscall something
> like "fdkill(int fd, struct siginfo *info, int flags)". Or perhaps

No, definitely nothing with "kill" will be used because that's absolutely
not expressing what this syscall is doing.

> just "fd_send_signal(int fd, struct siginfo *info, int flags)".
>
> Then we are not overspecifying what the system call does in the name.

I feel changing the name around based on a single person's preferences is
not really a nice thing to do community-wise. So I'd like to hear other
people chime in first before I make that change.

> Plus it makes it clear that the fd specifies where the signal goes.
> Something I see by your reply below that you were confused about.
>
> >> Is there any plan to support sessions and process groups?
> >
> > Such a thing could be added with flags in the future. Why complicate
> > this patch?
>
> Actually that isn't the way this is designed. You would have to have
> another kind of file descriptor. I am asking because it goes to the
> question of naming and what we are trying to do here.
>
> We don't need to implement that but we have already looked into this
> kind of extensibility. If we want the extensibility we should make
> room for it, or just close the door. Having the door half open and a
> confusing interface is a problem for users.
>
> >> I am concerned about using kill_pid_info. It does this:
> >>
> >>
> >>         rcu_read_lock();
> >>         p = pid_task(pid, PIDTYPE_PID);
> >>         if (p)
> >>                 error = group_send_sig_info(sig, info, p, PIDTYPE_TGID);
> >>         rcu_read_unlock();
> >>
> >> That pid_task(PIDTYPE_PID) is fine for existing callers that need bug
> >> compatibility. For new interfaces I would strongly prefer pid_task(PIDTYPE_TGID).
> >
> > What is the bug that PIDTYPE_PID preserves?
>
> I am not 100% certain all of the bits for this to matter have been
> merged yet but we are close enough that it would not be hard to make it
> matter.
>
> There are two strange behaviours of ordinary kill on the linux kernel
> that I am aware of.
>
> 1) kill(thread_id,...) where the thread_id is not the id of the first
>    thread, and thus not the pid of the process, sends the signal
>    to the entire process. Something that arguably should not happen.
>
> 2) kill(pid,...) where the original thread has exited performs the
>    permission checks against the dead signal group leader. Which means
>    that the permission checks for sending a signal are very likely wrong
>    for a multi-threaded process that calls a function like setuid.
>
> To fix the second case we are going to have to perform the permission
> checks on a non-zombie thread. That is something that is straight
> forward to make work with PIDTYPE_TGID. It is not so easy to make work
> with PIDTYPE_PID.
>
> I looked and it doesn't look like I have merged the logic of having
> PIDTYPE_TGID point to a living thread when the signal group leader
> exits and becomes a zombie. It isn't hard but it does require some very
> delicate surgery on the code, so that we don't break some of the
> historic confusion of threads and process groups.

Then this seems irrelevant to the current patch. It seems we can simply
switch to PIDTYPE_TGID once your new logic lands but not right now.
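
For anyone following the zombie and pid-reuse part of this thread, here is
a minimal userspace sketch of the existing kill() behaviour, assuming Linux
and an unprivileged caller signalling its own child. It is not taken from
the patch, and probe_pid() and probe_zombie() are hypothetical helper names
made up for illustration. It forks a child, lets it exit, and probes the pid
with signal 0 while the child is still an unreaped zombie and again after
the child has been reaped.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch under discussion.
 * Signal 0 delivers nothing; it only runs the existence and permission
 * checks. Returns 0 if kill() can still address the pid, otherwise the
 * errno value that kill() reported.
 */
static int probe_pid(pid_t pid)
{
	return kill(pid, 0) == 0 ? 0 : errno;
}

/*
 * Hypothetical helper: fork a child that exits at once, then probe its
 * pid while it is a zombie and again after it has been reaped.
 * Returns 0 on success, -1 if fork() or one of the waits failed.
 */
static int probe_zombie(int *while_zombie, int *after_reap)
{
	siginfo_t info;
	pid_t child;

	/* With SIGCHLD ignored, children would be reaped automatically. */
	signal(SIGCHLD, SIG_DFL);

	child = fork();
	if (child < 0)
		return -1;
	if (child == 0)
		_exit(0);

	/* WNOWAIT: wait for the exit but leave the child as a zombie. */
	if (waitid(P_PID, child, &info, WEXITED | WNOWAIT) < 0)
		return -1;
	*while_zombie = probe_pid(child);

	/* Reap the child; from here on the pid number may be recycled. */
	if (waitpid(child, NULL, 0) != child)
		return -1;
	*after_reap = probe_pid(child);

	return 0;
}
```

A caller passes two ints and inspects them afterwards. While the child is
an unreaped zombie the probe is expected to succeed (0). After the reap it
is expected to fail with ESRCH, unless the pid number has already been
handed to a new process, which is the reuse race the patch description
is about.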