Mike Hommey <mh@xxxxxxxxxxxx> writes:

> On Tue, Apr 01, 2014 at 09:15:12AM -0400, Jeff King wrote:
>> > It seems to me fast-import keeps a kind of human readable format for its
>> > protocol, i wonder if xdelta format would fit the bill. That being said,
>> > I also wonder if i shouldn't just try to write a pack on my own...
>>
>> The fast-import commands are human readable, but the blob contents are
>> included inline. I don't see how sending a binary delta is any worse
>> than sending a literal binary blob over the stream.
>
> OTOH, the xdelta format is not exactly straightforward to produce, with
> the variable length encoding of integers. Not exactly hard, but when
> everything else in fast-import is straightforward, one has to wonder.

Unless you already have your change as an xdelta on hand, or the
format your foreign change is in gives sufficient information to
produce a corresponding xdelta without looking at the content that
your foreign change applies to, it is silly to try to convert your
foreign change into xdelta and feed it to fast-import.

What constitutes "sufficient" information?  An xdelta is a series of
instructions applied to an existing piece of content (the "source
material", identified by its object name) to produce the result of
"applying delta".  Each instruction lets you either:

 - copy N bytes from an offset in the source material to the
   destination; or

 - copy these N literal bytes to the destination.

As an example, think about the case where you have *,v files used by
RCS (and CVS).  The "foreign changes" given to you by that format are
a series of instructions that roughly corresponds to an "ed" script:
insert these lines at line number L, delete N lines from line
number K, etc.  In order to convert such a change into xdelta, you
would need to know which byte offsets in the original file these
line numbers correspond to.
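
To make the point about line numbers concrete, here is a minimal
Python sketch.  The tuple model of the two instructions and the helper
names (apply_delta, line_offsets, insert_after_line) are made up for
illustration; this is not the byte encoding Git uses, only the
copy/insert idea described above.

```python
from typing import List, Tuple, Union

# Hypothetical in-memory model of the two delta instructions:
#   ("copy", offset, length)  copy bytes from the source material
#   ("insert", data)          copy these literal bytes
Op = Union[Tuple[str, int, int], Tuple[str, bytes]]


def apply_delta(source: bytes, ops: List[Op]) -> bytes:
    """Produce the result of applying the instructions to the source."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def line_offsets(source: bytes) -> List[int]:
    """Byte offset where each line starts, followed by the total length.

    This is the step that forces us to read the original content.
    """
    offsets = [0]
    pos = source.find(b"\n")
    while pos != -1:
        offsets.append(pos + 1)
        pos = source.find(b"\n", pos + 1)
    if offsets[-1] != len(source):  # last line has no trailing newline
        offsets.append(len(source))
    return offsets


def insert_after_line(source: bytes, line_no: int, new: bytes) -> List[Op]:
    """Turn "insert `new` after line `line_no`" (0 = top) into byte-based ops."""
    cut = line_offsets(source)[line_no]
    ops: List[Op] = []
    if cut > 0:
        ops.append(("copy", 0, cut))
    ops.append(("insert", new))
    if cut < len(source):
        ops.append(("copy", cut, len(source) - cut))
    return ops
```

Note that insert_after_line() cannot compute `cut` from the line
number alone; it has to scan the original for newlines first.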
You also may want to know what the Git object name for the original
is, although in the fast-import stream you might be able to get away
with using the object mark facility.

Assuming that you do have and are willing to read the original file,
you have three possible (and one impractical) approaches:

 - Apply the foreign changes to the original file yourself (as that
   is the foreign system you are interested in, you know how to do
   that much better than Git does), and produce xdelta between the
   original and the result using only the original and the result.

 - Apply the foreign changes to the original file yourself, and feed
   the resulting content to fast-import in full, letting fast-import
   convert it into the format Git understands.

 - Interpret the foreign changes, using the original file as a
   reference, to convert them into xdelta.

 - Teach fast-import how to interpret various formats that are used
   to express foreign changes, and feed that.

In the first approach, this "given the original and the result,
produce xdelta between them" can be reused by other people's systems.
You may be able to borrow diff-delta.c from us under our licensing
terms.

The second is the most straightforward; eventual deltification will
happen when the resulting repository is repacked, and that uses the
same code from diff-delta.c.

The third would be "*,v expresses the source location and length in
terms of lines, so look at the original to convert these into the
byte offset and byte length xdelta wants", which I would think is
silly.

And the last one is a maintenance nightmare I do not think we would
want to touch with a ten-foot pole.

In short, the most practical solution would be to reconstitute a
full object and feed that to fast-import, unless you already have
xdelta or you can turn your foreign change into xdelta without ever
looking at the original.
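
For the "feed the full object" route, a rough Python sketch of what
the importer would write to fast-import's standard input might look
like the following.  The helper names, the ref, the path and the
committer identity are placeholders I made up; only the "blob",
"mark", "data", "commit", "committer" and "M" commands come from the
fast-import stream format.

```python
def blob_command(mark: int, content: bytes) -> bytes:
    """A 'blob' command carrying the reconstituted file in full."""
    # "data <count>" gives the exact byte count, so the content may be binary.
    return b"blob\nmark :%d\ndata %d\n" % (mark, len(content)) + content + b"\n"


def commit_command(ref: bytes, mark: int, committer: bytes, when: bytes,
                   message: bytes, path: bytes, blob_mark: int) -> bytes:
    """A 'commit' command that records one file, referring to a blob by mark."""
    return (
        b"commit %s\n" % ref
        + b"mark :%d\n" % mark
        + b"committer %s %s\n" % (committer, when)
        + b"data %d\n" % len(message) + message + b"\n"
        + b"M 100644 :%d %s\n" % (blob_mark, path)
        + b"\n"
    )


def import_stream(content: bytes) -> bytes:
    """Stream for one revision; all names below are placeholder values."""
    return blob_command(1, content) + commit_command(
        ref=b"refs/heads/master",
        mark=2,
        committer=b"A U Thor <author@example.com>",  # hypothetical identity
        when=b"1396358112 -0400",                    # raw "<seconds> <tz>" date
        message=b"import one revision\n",
        path=b"hello.txt",                           # hypothetical path
        blob_mark=1,
    )
```

The bytes returned by import_stream() would be piped into
"git fast-import" run inside the target repository.  The mark on the
blob is what lets the commit refer to it without knowing its object
name up front.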