Re: What is the practical significance of fork

Greg Freemyer <greg.freemyer@xxxxxxxxx> · Wed, 15 Dec 2010 14:10:22 -0500

On Wed, Dec 15, 2010 at 1:34 PM, Mulyadi Santosa
<mulyadi.santosa@xxxxxxxxx> wrote:
> Hi :)
>
> On Thu, Dec 16, 2010 at 01:01, Chaitannya Mahatme <chaitannya@xxxxxxxxx> wrote:
>> I tried finding an answer to this question in many books but never quite
>> got a satisfactory one.
>
> Satisfaction is hard to reach sometimes, you know :)
>
>> A forked process replicates its parent. My question is:
>>
>> Why is fork necessary to create a process?
>
> It's just a name, actually. What really matters is the procedure
> that is carried out inside fork() (as a system call).
>
>> Why replicate an existing process
>> before creating a new process?
>
> Replicate? You mean Copy on Write(COW)?
>
> OK, I think what you are referring to here is actually COW. Essentially,
> when a new process is created (forked), mainly a new task struct, new
> signal handler tables, and VMA tables are created. But specifically for
> the VMAs (Virtual Memory Areas), they simply point to those of the parent.
>
> Only when a process writes to that COW-ed area is a page fault raised,
> and new pages are allocated one by one, as needed.
>
>> The exec algorithm is executed after fork and overrides what fork has
>> done, so why do the fork? Why can't we directly do exec?
>
> I am not sure that I understand the above question correctly, you mean
> "why fork then exec if we could simply do exec?"
>
> First of all, I forget a few details about fork etc., but let's assume
> there is indeed an exec after fork. What you need to know here is that
> fork() is a way to prepare the complete process structure along with the
> process address space.
>
> exec, on the other hand, specifically just deals with how the file (quite
> likely an ELF binary) should be loaded and executed. So, in short, fork()
> builds the foundation, and exec() puts in the bricks and mortar.
>
> After all, again, fork() just COWs its parent's VMAs, so when exec
> runs there is no significant penalty: no pages are allocated for the
> VMAs during fork().
>
> --
> regards,

My first thought is to agree with Mulyadi: fork() is extremely
lightweight, i.e. fast and low-overhead.

So what exactly is the functionality of fork() that you consider wasted
by user code that does the following?

if (fork() == 0)   /* fork() returns 0 in the child */
   exec(...);

Please be precise.
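
For anyone following along, here is a minimal sketch of the
fork-then-exec pattern in question, written out in full. The helper
name run_child() and the use of execvp() and waitpid() are my own
choices for illustration; they are not taken from the original question.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process and return its
 * exit status, or -1 if something went wrong along the way. */
int run_child(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0. Replace this image with the new one. */
        execvp(argv[0], argv);
        perror("execvp");       /* only reached if the exec failed */
        _exit(127);
    }

    /* Parent: fork() returned the child's pid. Wait for it to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child() with { "sh", "-c", "exit 3", NULL } should return 3.
Anything the child does between fork() returning 0 and the exec call
(redirecting file descriptors, for instance) affects only the child.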

=== Okay, I see an issue.  Maybe someone else (Mulyadi?) can help both
of us.  I'll rephrase your question as I see it.

15 years ago, when Linux was young, executables would typically have
only hundreds of RAM pages assigned, so fork() and the associated
MMU setup was relatively fast.

A large modern app, on the other hand, can carry huge MMU overhead all by itself.

Imagine an app with 20GB of RAM (i.e. roughly 5 million 4KB pages) allocated.

Now when it invokes fork(), is that resource intensive?  Assuming yes,
if it is just going to exec() some small program, then that effort
was mostly wasted.
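
One way to check this rather than guess is to time fork() with
different amounts of touched memory. The following is only a rough
sketch: the helper name time_fork_ns() is made up, it times a single
fork() call in the parent, and the numbers will depend heavily on the
kernel, the hardware and whether huge pages are in use. I am not
claiming any particular result.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: map and touch `bytes` of anonymous memory, then
 * return how long one fork() call takes in the parent, in nanoseconds.
 * Returns -1 on error. */
long long time_fork_ns(size_t bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    unsigned char *mem = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Touch every page so each one is actually backed and mapped. */
    for (size_t off = 0; off < bytes; off += (size_t)page)
        mem[off] = 1;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);               /* child: leave immediately */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long long ns = -1;
    if (pid > 0) {
        waitpid(pid, NULL, 0);  /* reap the child */
        ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL
           + (t1.tv_nsec - t0.tv_nsec);
    }
    munmap(mem, bytes);
    return ns;
}
```

Comparing, say, time_fork_ns(1 << 20) against a call with a few
gigabytes would show how the cost scales with the number of mapped
pages on a given machine.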

Mulyadi, do you know how the MMU is reconfigured during a fork of a
large app like that?   Does the MMU have to be told about each of the
millions of pages?

If the MMU is not smart enough to handle that efficiently, maybe it is
time for a new fork_and_exec() system call that doesn't have to set up
COW mappings for millions of pages?
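
For comparison, POSIX already defines a library call of roughly that
shape, posix_spawn(), and there is also the older vfork(). Whether a
given libc implements posix_spawn() without duplicating the parent's
page tables is an implementation detail that I am not asserting here.
A minimal sketch of using it, with a made-up wrapper name
spawn_and_wait():

```c
#include <assert.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;          /* the caller's environment, passed through */

/* Hypothetical wrapper: start argv[0] via posix_spawnp() and return its
 * exit status, or -1 on error. */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid;
    /* No file actions and no spawn attributes: plain "create and exec". */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        /* posix_spawnp() returns an error number instead of setting errno. */
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is that the caller gives up the window between fork()
and exec: any setup for the child has to be expressed through the
file-actions and attribute arguments instead of ordinary code.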

Greg
