Hi David et al, This collection of patches comes from the svndiff0 series[1]. They are not urgent --- the motivation is for svn-fe to be able to keep separate track of input from stdin (svnrdump) and the report-fd (blobs from fast-import) for the coming Text-delta support --- but ideally I would like to see them applied at the start of the next merge window, since they change API that other patches use. See [*] below for the open question. The idea: instead of keeping the input file handle and input buffer as global variables, pack them in a struct and let the calling program keep track of them. Patch 1 makes a previously global buffer local to the two functions that use it. Performance impact should be negligibile. Ideally the buffer would not be needed at all --- there is enough buffering at lower layers already --- but stdio does not provide the calls that would be needed to eliminate it (in particular a wrapper for sendfile(2)). Patch 2 replaces a use of the obj_pool library with a strbuf. The main immediate effect is to improve error handling behavior (more importantly, this is needed for patches 3 and 4 since obj_pool is defined to be global). Patches 3 and 4 are the main patches, collecting input state in a struct and moving resposibility for that struct to the calling program, respectively. The patches have already received a lot of testing. [*] I am not sure whether this is the right approach for reading from the report-fd. To avoid deadlock, we cannot issue a blocking read(2) after the trailing newline has been read from an expected line or the nth byte has been read in fixed-length input. This would rule out fread/fgets if implemented as follows with too large an internal buffer: 1. fill internal buffer completely (or stop when an error or end of file is encountered) 2. fill caller's buffer from internal buffer glibc fread/fgets do not work that way (and in fact will never deadlock for us). What about other platforms? The standards (C, POSIX) do not make it obvious. - maybe setvbuf(f, NULL, _IOLBF, 0) would make fgets safe to use - maybe setvbuf(f, NULL, _IONBF, 0) would make fread safe to use. On the other hand, it is not clear to me what unbuffered input means in this context. That flag does not do anything meaningful on glibc, for example, for input streams. If all else fails, setting the O_NONBLOCK flag with fcntl (this could presumably be implemented as SetNamedPipeHandleState(..., PIPE_NOWAIT) on Windows) would avoid trouble. After any failing operation we would have to check that the error is EAGAIN and clear the error indicator. Simple. But it would be even better to learn that that is not needed. So far I have been playing it safe with the read(fd, buf, 1) trick but that does not have great performance, as David noticed[2]. Thoughts? Jonathan Nieder (4): vcs-svn: Eliminate global byte_buffer[] array vcs-svn: Replace buffer_read_string memory pool with a strbuf vcs-svn: Collect line_buffer data in a struct vcs-svn: Teach line_buffer to handle multiple input files test-line-buffer.c | 17 ++++++----- vcs-svn/fast_export.c | 6 ++-- vcs-svn/fast_export.h | 5 +++- vcs-svn/line_buffer.c | 66 ++++++++++++++++++++-------------------------- vcs-svn/line_buffer.h | 25 +++++++++++++----- vcs-svn/line_buffer.txt | 5 ++- vcs-svn/svndump.c | 29 +++++++++++--------- 7 files changed, 82 insertions(+), 71 deletions(-) [1] http://thread.gmane.org/gmane.comp.version-control.git/158731 [2] http://colabti.org/irclogger/irclogger_log/git-devel?date=2010-12-18 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html