Re: storing pre-computed fine-grained diffs

Hi Shawn,

On Thu, Mar 8, 2012 at 10:41 AM, ecloud <shawn.t.rutledge@xxxxxxxxx> wrote:
> The thing about git, as well as all version control systems I have known so
> far which store diffs, is that computing the diff means post-analyzing a
> saved file.  That is, you use any editor you like, and after making a whole
> batch of changes you manually commit to the repository, and the diff
> algorithm figures out what you changed.  Some information is already lost
> about what order you made the changes and what the logical chunks actually
> were.  But what if there was an editor that could save each individual
> change as a separate version?  You put the cursor at one point in the file,
> and type some text; then you click elsewhere, and the editor does a "git
> commit this-file" automatically.

I'm going to skip discussing whether this approach is desirable.
All further comment will assume that it is worthwhile.

> Then you select some other text and delete
> it, and it does a commit again.  It would be nice in that case to avoid
> doing the diff at all, because the editor already knows exactly what the
> change was.

This is purely a performance consideration. Two facts are relevant:
delta computation in git is fastest when the difference is small, and
SHA-1 computation imposes a per-change cost proportional to the length
of the whole blob, however small the edit.
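
To make the SHA-1 cost concrete, here is a minimal Python sketch of how
a blob id is derived. The helper name git_blob_id is hypothetical, not
part of git. The hash covers a short header plus the entire file
content, so a one-character edit still rehashes the whole blob.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id git assigns to a blob with this content.

    git_blob_id is an illustrative name, not a git API. The id is the
    SHA-1 of the header "blob <size>\\0" followed by the full content,
    so the cost grows with the size of the file, not of the edit.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Two versions that differ by one character get unrelated ids, and each
# id requires hashing the complete content again.
before = b"hello world\n"
after = b"hello world!\n"
print(git_blob_id(before))
print(git_blob_id(after))
```

The result should match what "git hash-object" reports for a file with
the same bytes.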

> Would it be possible to store these fine-grained changes
> directly in a packfile, efficiently?  Or would it require a different
> storage format?  I know the diff algorithm used is already much smarter than
> a line-by-line diff, but is the storage format capable of representing
> changes over ranges of characters without "extra context" like the
> line-by-line diffs usually have?

The current pack format for git has quite an efficient delta
representation. It uses a byte-wise binary diff, so there is no
context in the physical representation. It is possible to build such a
pack through the fast-import interface, and it would be straightforward
to write an editor backend that persists each change via fast-import
(assuming that you are hacking on an editor).
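
As a rough sketch of what such a backend might emit, the Python below
builds a fast-import stream with one commit per edit. Everything named
here is an assumption made for illustration: the functions data_block,
commit_record and build_stream, the branch refs/heads/edits, the file
name notes.txt, the committer identity and the timestamps. Note that
the stream carries the full file content for every commit; delta
compression is left to git when it writes the pack.

```python
def data_block(payload: bytes) -> bytes:
    """Encode a payload in fast-import's exact-byte-count 'data' format."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def commit_record(ref: str, path: str, content: bytes, message: str,
                  when: int, ident: str) -> bytes:
    """Build one 'commit' command that replaces `path` with `content`.

    No 'from' line is given: within a single fast-import run, a commit
    on an existing branch takes that branch's current tip as its parent.
    """
    return b"".join([
        b"commit " + ref.encode() + b"\n",
        b"committer " + ident.encode() + b" %d +0000\n" % when,
        data_block(message.encode()),
        b"M 100644 inline " + path.encode() + b"\n",
        data_block(content),
        b"\n",
    ])


def build_stream(edits, path="notes.txt", ref="refs/heads/edits",
                 ident="Editor <editor@example.com>",
                 start=1331200000) -> bytes:
    """Turn [(message, full_content), ...] into a fast-import stream.

    Default values are placeholders; timestamps simply count up from
    `start` so that the commits are ordered.
    """
    return b"".join(
        commit_record(ref, path, content, message, start + i, ident)
        for i, (message, content) in enumerate(edits)
    )


# Each tuple is one editor action: a description and the resulting file.
edits = [
    ("type hello", b"hello"),
    ("append world", b"hello world"),
]
stream = build_stream(edits)
```

Writing the stream to the standard input of "git fast-import" inside a
repository, for instance with sys.stdout.buffer.write(stream) and a
shell pipe, should create the branch with one commit per edit.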

--
David Barr