I really need a sanity checking on this one. I think I got the botched pipeline fixed with the patch I am replying to, but I do not understand the waitpid() business. Care to enlighten me? -- >8 -- Documentation/technical/send-pack-pipeline.txt | 112 ++++++++++++++++++++++++ 1 files changed, 112 insertions(+), 0 deletions(-) create mode 100644 Documentation/technical/send-pack-pipeline.txt diff --git a/Documentation/technical/send-pack-pipeline.txt b/Documentation/technical/send-pack-pipeline.txt new file mode 100644 index 0000000..bd32aff --- /dev/null +++ b/Documentation/technical/send-pack-pipeline.txt @@ -0,0 +1,112 @@ +git-send-pack +============= + +Overall operation +----------------- + +. Connects to the remote side and invokes git-receive-pack. + +. Learns what refs the remote has and what commit they point at. + Matches them to the refspecs we are pushing. + +. Checks if there are non-fast-forwards. Unlike fetch-pack, + the repository send-pack runs in is supposed to be a superset + of the recipient in fast-forward cases, so there is no need + for want/have exchanges, and fast-forward check can be done + locally. Tell the result to the other end. + +. Calls pack_objects() which generates a packfile and sends it + over to the other end. + +. If the remote side is new enough (v1.1.0 or later), wait for + the unpack and hook status from the other end. + +. Exit with appropriate error codes. + + +Pack_objects pipeline +--------------------- + +This function gets one file descriptor (`out`) which is either a +socket (over the network) or a pipe (local). What's written to +this fd goes to git-receive-pack to be unpacked. + + send-pack ---> fd ---> receive-pack + +It somehow forks once, but does not wait for it. I am not sure +why. + +The forked child calls rev_list_generate() with that file +descriptor (while the parent closes `out` -- the child will be +the one that writes the packfile to the other end). + + send-pack + | + rev-list-generate ---> fd ---> receive-pack + + +Then rev-list-generate forks after creates a pipe; the child +will become a pipeline "rev-list --stdin | pack-objects", which +is the rev_list() function, while the parent feeds that pipeline +the list of refs. + + send-pack + | + rev-list-generate ---> fd ---> receive-pack + | ^ (pipe) + v | + rev-list + +The child process, before calling rev-list, rearranges the file +descriptors: + +. what it reads from rev-list-generate via pipe becomes the + stdin; this is to feed the upstream of the pipeline which will + be git-rev-list process. + +. what it writes to its stdout goes to the fd connected to + receive-pack. + +On the other hand, the parent process, before starting to feed +the child pipeline, closes the reading side of the pipe and fd +to receive-pack. + + send-pack + | + rev-list-generate + | + v [0] + rev-list [1] ---> receive-pack + +The parent then writes to the pipe and later closes it. There +is a commented out waitpid to wait for the rev-list side before +it exits, I again do not understand why. + +The rev-list function further sets up a pipe and forks to run +git-rev-list piped to git-pack-objects. The child side, before +exec'ing git-pack-objects, rearranges the file descriptors: + +. what it reads from the pipe becomes the stdin; this gets the + list of objects from the git-rev-list process. + +. its stdout is already connected to receive-pack, so what it + generates goes there. + +The parent process arranges its file descriptors before exec'ing +git-rev-list: + +. its stdout is sent to the pipe to feed git-pack-objects. + +. its stdin is already connected to rev-list-generate and will + read the set of refs from it. + + + send-pack + | + rev-list-generate + | + v [0] + git-rev-list [1] ---> [0] git-pack-objects [1] ---> receive-pack + + + - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html