Re: [PATCH 00/11] Introduce kernel_clone(), kill _do_fork()
- To: Matthew Wilcox <willy@xxxxxxxxxxxxx>
- Subject: Re: [PATCH 00/11] Introduce kernel_clone(), kill _do_fork()
- From: ebiederm@xxxxxxxxxxxx (Eric W. Biederman)
- Date: Wed, 19 Aug 2020 08:32:59 -0500
- Cc: Christian Brauner <christian.brauner@xxxxxxxxxx>, peterz@xxxxxxxxxxxxx, Christoph Hewllig <hch@xxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, linux-arch@xxxxxxxxxxxxxxx, Jonathan Corbet <corbet@xxxxxxx>, Yoshinori Sato <ysato@xxxxxxxxxxxxxxxxxxxx>, Tony Luck <tony.luck@xxxxxxxxx>, Fenghua Yu <fenghua.yu@xxxxxxxxx>, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>, Ley Foon Tan <ley.foon.tan@xxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, x86@xxxxxxxxxx, Arnd Bergmann <arnd@xxxxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Stafford Horne <shorne@xxxxxxxxx>, Kars de Jong <jongk@xxxxxxxxxxxxxx>, Kees Cook <keescook@xxxxxxxxxxxx>, Greentime Hu <green.hu@xxxxxxxxx>, Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>, Alexandre Chartre <alexandre.chartre@xxxxxxxxxx>, Masami Hiramatsu <mhiramat@xxxxxxxxxx>, Tom Zanussi <zanussi@xxxxxxxxxx>, Xiao Yang <yangx.jy@xxxxxxxxxxxxxx>, linux-doc@xxxxxxxxxxxxxxx, uclinux-h8-devel@xxxxxxxxxxxxxxxxxxxx, linux-ia64@xxxxxxxxxxxxxxx, linux-m68k@xxxxxxxxxxxxxxx, sparclinux@xxxxxxxxxxxxxxx, kgdb-bugreport@xxxxxxxxxxxxxxxxxxxxx, linux-kselftest@xxxxxxxxxxxxxxx
- In-reply-to: <20200819111851.GY17456@casper.infradead.org> (Matthew Wilcox's message of "Wed, 19 Aug 2020 12:18:51 +0100")
- References: <20200818173411.404104-1-christian.brauner@ubuntu.com> <20200818174447.GV17456@casper.infradead.org> <20200819074340.GW2674@hirez.programming.kicks-ass.net> <20200819084556.im5zfpm2iquzvzws@wittgenstein> <20200819111851.GY17456@casper.infradead.org>
- User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
> On Wed, Aug 19, 2020 at 10:45:56AM +0200, Christian Brauner wrote:
>> On Wed, Aug 19, 2020 at 09:43:40AM +0200, peterz@xxxxxxxxxxxxx wrote:
>> > On Tue, Aug 18, 2020 at 06:44:47PM +0100, Matthew Wilcox wrote:
>> > > On Tue, Aug 18, 2020 at 07:34:00PM +0200, Christian Brauner wrote:
>> > > > The only remaining function callable outside of kernel/fork.c is
>> > > > _do_fork(). It doesn't really follow the naming of kernel-internal
>> > > > syscall helpers as Christoph rightly pointed out. Switch all callers and
>> > > > references to kernel_clone() and remove _do_fork() once and for all.
>> > >
>> > > My only concern is around return type. long, int, pid_t ... can we
>> > > choose one and stick to it? pid_t is probably the right return type
>> > > within the kernel, despite the return type of clone3(). It'll save us
>> > > some work if we ever go through the hassle of growing pid_t beyond 31-bit.
>> >
>> > We have at least the futex ABI restricting PID space to 30 bits.
>>
>> Ok, looking into kernel/futex.c I see
>>
>> pid_t pid = uval & FUTEX_TID_MASK;
>>
>> which is probably what this refers to and /proc/sys/kernel/threads-max
>> is restricted to FUTEX_TID_MASK.
>>
>> Afaict, that doesn't block switching kernel_clone() to return pid_t. It
>> can't create anything > FUTEX_TID_MASK anyway without yelling EAGAIN at
>> userspace. But it means that _if_ we were to change the size of pid_t
>> we'd likely need a new futex API.
>
> Yes, there would be a lot of work to do to increase the size of pid_t.
> I'd just like to not do anything to make that harder _now_. Stick to
> using pid_t within the kernel.

Just so people are aware: if you look in include/linux/threads.h you
can see that the maximum value of PID_MAX_LIMIT limits pids to 22 bits.
Further, the design decisions around pids keep us using pids densely. So I
expect it will be a while before we even come close to using 30 bits of
pid space.

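To make the 22-bit versus 30-bit comparison concrete, here is a small userspace sketch of the arithmetic. The two constants are copied by hand from include/linux/threads.h and include/uapi/linux/futex.h as they stand for a 64-bit build without CONFIG_BASE_SMALL, so treat them as assumptions on other configurations. bits_needed() and futex_owner_tid() are hypothetical helpers written for this illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, copied from the kernel headers for a 64-bit build
 * without CONFIG_BASE_SMALL; other configurations use smaller limits. */
#define PID_MAX_LIMIT   (4 * 1024 * 1024)   /* include/linux/threads.h */
#define FUTEX_TID_MASK  0x3fffffffu         /* include/uapi/linux/futex.h */

/* Hypothetical helper: number of bits needed to represent v. */
static unsigned int bits_needed(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* Hypothetical helper mirroring the "uval & FUTEX_TID_MASK" extraction
 * quoted from kernel/futex.c: strip the two flag bits, keep the TID. */
static int32_t futex_owner_tid(uint32_t uval)
{
	return (int32_t)(uval & FUTEX_TID_MASK);
}

/* PID_MAX_LIMIT is exactly 2^22, so the largest pid fits in 22 bits. */
static_assert(PID_MAX_LIMIT == (1 << 22), "PID_MAX_LIMIT is 2^22");

/* Every pid below PID_MAX_LIMIT survives the futex TID mask intact. */
static_assert((PID_MAX_LIMIT - 1) <= FUTEX_TID_MASK,
	      "pid space fits inside the futex TID field");
```

Under these assumptions the futex word leaves 8 bits of headroom over the current pid limit, which is why the futex ABI only becomes a constraint if pid_t itself grows.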
At the same time I do agree that it makes sense to use a consistent type
in the kernel to make it easier to read and update the code.

Eric