Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@xxxxxxxx> wrote:
> On Mittwoch, 10. März 2010, Junio C Hamano wrote:
>> Johannes Sixt <j6t@xxxxxxxx> writes:
>> > On Samstag, 6. März 2010, Shawn O. Pearce wrote:
>> >> I'm in favor of that.  If we have threaded delta search enabled,
>> >> we probably can also run these async procedures in a POSIX thread
>> >> rather than forking off a child.
>> >
>> > OK. The patch could look like this.
>>
>> Will queue in 'pu', but as Shawn said, we should probably give another
>> closer look at the callees that are started with this interface before
>> moving forward.
>
> So here is my analysis.
>
> There are currently these users of the async infrastructure:
>
> builtin-fetch-pack.c
> builtin-receive-pack.c
> builtin-send-pack.c
> convert.c
> remote-curl.c
> upload_pack.c
>
> The list below shows all functions that read or write potentially global state
> that are called from the async procedure (except for upload_pack.c, see
> below). I ignored all functions that do not depend on global state, like
> strcmp, memcpy, sprintf, strerror, etc.
>

> ----------
> convert.c:filter_buffer()
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>
> if GIT_TRACE is set:
>  xrealloc
>  die
>  free
>
> This one is less trivial. It calls start_command+finish_command from the async
> procedure. The parent calls
>  read()
>  xrealloc() (via strbuf_read())
>  error()
> (and nothing else) between start_async() and finish_async().
>
> As long as GIT_TRACE is not set, we are safe, because the async procedure
> carefully releases all global resources that it allocated, and the parent is
> looking in the other direction while it does os. Even if GIT_TRACE is set,
> xrealloc() in the parent and the async procedure cannot be called
> simultanously because the procedure calls it only before it begins writing
> output, and the parent only after it received this output.

Maybe I'm missing something but, isn't it possible that xrealloc is
called simultaneously from the two threads if GIT_TRACE is set?

Immediately after start_async the parent calls strbuf_read. We then
get the call chain
strbuf_read -> strbuf_grow -> ALLOG_GROW -> xrealloc, so xrealloc is
called before we read any data in the parent.

In the child we have start_command -> trace_argv_printf -> strbuf_grow -> ...

That xmalloc and xrealloc aren't thread-safe feels a bit fragile.
Maybe we should try to fix that.

> ----------
> upload_pack:create_pack_file(): This user is different because it calls into
> the revision walker in the async procedure, which definitely affects global
> state. Therefore, here is the list of functions called by the parent until it
> exits:
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>  alarm
>  die
>  die_errno
>  pthread_join / waitpid

sha1_to_hex is also called by the parent and the current
implementation of that function is not thread-safe. sha1_to_hex is
also called by some paths in the revision machinery, but I don't know
if it will ever be called in this particular case.

Maybe it would be a good idea to create wrappers for getenv, setenv,
unsetenv, and putenv to make them thread-safe as well. Then won't have
to worry about them in the future.

- Fredrik
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]