[PATCH 0/4] teach vcs-svn/line_buffer to handle multiple input files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi David et al,

This collection of patches comes from the svndiff0 series[1].  They
are not urgent --- the motivation is for svn-fe to be able to keep
separate track of input from stdin (svnrdump) and the report-fd (blobs
from fast-import) for the coming Text-delta support --- but ideally I
would like to see them applied at the start of the next merge window,
since they change API that other patches use.  See [*] below for the
open question.

The idea: instead of keeping the input file handle and input buffer as
global variables, pack them in a struct and let the calling program
keep track of them.

Patch 1 makes a previously global buffer local to the two functions
that use it.  Performance impact should be negligibile.  Ideally the
buffer would not be needed at all --- there is enough buffering at
lower layers already --- but stdio does not provide the calls that
would be needed to eliminate it (in particular a wrapper for
sendfile(2)).

Patch 2 replaces a use of the obj_pool library with a strbuf.  The
main immediate effect is to improve error handling behavior (more
importantly, this is needed for patches 3 and 4 since obj_pool is
defined to be global).

Patches 3 and 4 are the main patches, collecting input state in a
struct and moving resposibility for that struct to the calling
program, respectively.

The patches have already received a lot of testing.

[*]
I am not sure whether this is the right approach for reading from the
report-fd.  To avoid deadlock, we cannot issue a blocking read(2)
after the trailing newline has been read from an expected line or the
nth byte has been read in fixed-length input.  This would rule out
fread/fgets if implemented as follows with too large an internal
buffer:

 1. fill internal buffer completely (or stop when an error or
    end of file is encountered)
 2. fill caller's buffer from internal buffer

glibc fread/fgets do not work that way (and in fact will never
deadlock for us).  What about other platforms?  The standards (C,
POSIX) do not make it obvious.

 - maybe setvbuf(f, NULL, _IOLBF, 0) would make fgets safe to use
 - maybe setvbuf(f, NULL, _IONBF, 0) would make fread safe to use.
   On the other hand, it is not clear to me what unbuffered input
   means in this context.  That flag does not do anything meaningful
   on glibc, for example, for input streams.

If all else fails, setting the O_NONBLOCK flag with fcntl (this
could presumably be implemented as SetNamedPipeHandleState(...,
PIPE_NOWAIT) on Windows) would avoid trouble.  After any failing
operation we would have to check that the error is EAGAIN and
clear the error indicator.  Simple.  But it would be even better
to learn that that is not needed.

So far I have been playing it safe with the read(fd, buf, 1) trick
but that does not have great performance, as David noticed[2].

Thoughts?

Jonathan Nieder (4):
  vcs-svn: Eliminate global byte_buffer[] array
  vcs-svn: Replace buffer_read_string memory pool with a strbuf
  vcs-svn: Collect line_buffer data in a struct
  vcs-svn: Teach line_buffer to handle multiple input files

 test-line-buffer.c      |   17 ++++++-----
 vcs-svn/fast_export.c   |    6 ++--
 vcs-svn/fast_export.h   |    5 +++-
 vcs-svn/line_buffer.c   |   66 ++++++++++++++++++++--------------------------
 vcs-svn/line_buffer.h   |   25 +++++++++++++-----
 vcs-svn/line_buffer.txt |    5 ++-
 vcs-svn/svndump.c       |   29 +++++++++++---------
 7 files changed, 82 insertions(+), 71 deletions(-)

[1] http://thread.gmane.org/gmane.comp.version-control.git/158731
[2] http://colabti.org/irclogger/irclogger_log/git-devel?date=2010-12-18
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]