On Tue, Oct 17, 2017 at 3:35 PM, prakash sangappa
<prakash.sangappa@xxxxxxxxxx> wrote:
>
> On 10/17/2017 3:02 PM, Andy Lutomirski wrote:
>>
>> On Tue, Oct 17, 2017 at 8:38 AM, Prakash Sangappa
>> <prakash.sangappa@xxxxxxxxxx> wrote:
>>>
>>> On 10/16/17 5:52 PM, Andy Lutomirski wrote:
>>>>
>>>> On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa
>>>> <prakash.sangappa@xxxxxxxxxx> wrote:
>>>>>
>>>>> On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote:
>>>>>>
>>>>>> On 10/16/2017 02:36 PM, Andrew Morton wrote:
>>>>>>>
>>>>>>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov
>>>>>>> <khlebnikov@xxxxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>>>>> pid_t translate_pid(pid_t pid, int source, int target);
>>>>>>>>>>>
>>>>>>>>>>> This syscall converts pid from source pid-ns into pid in target
>>>>>>>>>>> pid-ns. If pid is unreachable from target pid-ns it returns zero.
>>>>>>>>>>>
>>>>>>>>>>> Pid-namespaces are referred file descriptors opened to proc files
>>>>>>>>>>> /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children. Negative
>>>>>>>>>>> argument refers to current pid namespace, same as file
>>>>>>>>>>> /proc/self/ns/pid.
>>>>>>>>>>>
>>>>>>>>>>> Kernel expose virtual pids in /proc/[pid]/status:NSpid, but
>>>>>>>>>>> backward translation requires scanning all tasks. Also pids could
>>>>>>>>>>> be translated by sending them through unix socket between
>>>>>>>>>>> namespaces, this method is slow and insecure because other side
>>>>>>>>>>> is exposed inside pid namespace.
>>>>>>>>
>>>>>>>> Andrew asked why we might need this.
>>>>>>>>
>>>>>>>> Such conversion is required for interaction between processes across
>>>>>>>> pid-namespaces. For example to identify process in container by pid
>>>>>>>> file looking from outside.
>>>>>>>>
>>>>>>>> Two years ago I've solved this in project of mine with monstrous
>>>>>>>> code which forks couple times just to convert pid, lucky for me
>>>>>>>> performance wasn't important.
>>>>>>>
>>>>>>> That's a single user who needed this a single time, and found a
>>>>>>> userspace-based solution anyway. This is not exactly compelling!
>>>>>>>
>>>>>>> Is there a stronger case to be made? How does this change benefit
>>>>>>> our users? Sell it to us!
>>>>>>
>>>>>> Oracle database is planning to use pid namespace for sandboxing
>>>>>> database instances and they need an API similar to translate_pid to
>>>>>> effectively translate process IDs from other pid namespaces. Prakash
>>>>>> (cced in mail) can provide more details on this usecase.
>>>>>
>>>>> As Nagarathnam indicated, Oracle Database will be using pid namespaces
>>>>> and needs a direct method of converting pids of processes in the pid
>>>>> namespace hierarchy. In this use case multiple nested PID namespaces
>>>>> will be used. The currently available mechanism are not very efficient
>>>>> for this use case. For ex. as Konstantin described, using
>>>>> /proc/<pid>/status would require the application to scan all the pid's
>>>>> status files to determine the pid of given process in a child
>>>>> namespace.
>>>>>
>>>>> Use of SCM_CREDENTIALS's socket message is another way, which would
>>>>> require every process starting inside a pid namespace to send this
>>>>> message and the receiving process in the target namespace would have to
>>>>> save the converted pid and reference it. This mechanism becomes
>>>>> cumbersome especially if the application has to deal with multiple
>>>>> nested pid namespaces. Also, the Database needs to be able to convert a
>>>>> thread's global pid(gettid()).
>>>>> Passing the thread's pid(gettid()) in SCM_CREDENTIALS message requires
>>>>> CAP_SYS_ADMIN, which is an issue.
>>>>>
>>>>> So having a direct method, like the API that Konstantin is proposing,
>>>>> will work best for the Database since pid of a process in any of the
>>>>> nested pid namespaces can be converted as and when required. I think
>>>>> with the proposed API, the application should be able to convert pid of
>>>>> a process or tid(gettid()) of a thread as well.
>>>>>
>>>> Can you explain what Oracle's database is planning to do with this
>>>> information?
>>>
>>> Database uses the PID to programmatically find out if the process/thread
>>> is alive(kill 0) also send signals to the processes requesting it to dump
>>> status/debug information and kill the processes in case of a shutdown
>>> abort of the instance.
>>
>> What I'm wondering is: how does the caller of kill() end up
>> controlling a task whose pid it doesn't know in its own namespace?
>
> I was generally describing how DB would use the PID of process. The above
> description was in the case when no namespaces are used.
>
> With use of namespaces, the DB would convert the PID of processes inside
> its children namespaces to PID in its namespace and use that pid to issue
> kill().

Seems vaguely sensible. If I were designing this type of system, I'd
have a manager process in each namespace running as PID 1, though --
PID 1 is special and needs to understand what's going on anyway. Then
PID 1 would do the kill() calls and wouldn't need translate_pid().

>
> -Prakash.
>
>>
>>> -Prakash.
>>>
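
For reference, the liveness check mentioned above ("kill 0") could look
roughly like the sketch below. It is only an illustration, not code from
this thread: task_is_alive() is a hypothetical helper name, and it assumes
the pid has already been translated into the caller's own pid namespace,
because kill() interprets pids there.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>	/* getpid(), for callers that probe themselves */

/*
 * Hypothetical helper (not part of any proposed API): probe whether a
 * process exists by sending the null signal. `pid` must be valid in the
 * caller's pid namespace.
 *
 * Returns 1 if the process exists, 0 if it does not, and -1 on any
 * other error (errno is left set).
 */
int task_is_alive(pid_t pid)
{
	if (pid <= 0) {
		/* 0 and negative values address process groups, not one task. */
		errno = EINVAL;
		return -1;
	}

	/* Signal 0 performs the existence and permission checks only. */
	if (kill(pid, 0) == 0)
		return 1;

	if (errno == EPERM)
		return 1;	/* exists, but we are not allowed to signal it */
	if (errno == ESRCH)
		return 0;	/* no such process visible from this namespace */

	return -1;
}
```

A caller probing itself, task_is_alive(getpid()), should get 1. Note that
a zombie that has not been reaped yet still counts as existing for this
check.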