Hacking git for managing machine readable "source" files

Hi git hackers,

I have been scratching my head for quite a few weeks, trying to see whether and how I could hack git to manage files that are not software source code. These files may be text-based (XML, JSON, custom format, ...) but are not intended for humans, so diffing and merging them with the standard git features doesn't really make sense (and the whole "pack" machinery seems useless as well). These files represent a non-software project developed with a graphical software application. I'm talking here about designing and simulating electronic projects, but it could apply to any sort of engineering (mechanical design comes second for me).

I would like to provide support for diffing, merging, branching and forking such electronics projects.

I know that git is not a conventional SCM and, as such, doesn't rely on incremental diffs (unlike CVS, SVN, ...), but...

My graphical software uses a document/command-based approach: it doesn't directly transform user interaction into graphical changes. Instead, the graphical tools generate commands that are then executed on a document; once a command has completed, the graphical view updates its content.

So far, in my context, a document is simply a tree of objects. The lowest-level commands available are:
- Insert an object in the tree.
- Remove an object from the tree.
- Modify an object property.
All higher-level commands are built in terms of the above basic commands.
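
To make the model concrete, here is a minimal Python sketch of such a document with its three basic commands. The class and method names (Node, Document, insert_object, ...) and the add_resistor helper are purely illustrative assumptions, not the API of my actual software.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One object in the document tree (hypothetical model)."""
    object_id: str
    properties: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)  # object_id -> Node


class Document:
    """A tree of objects that is only changed through three basic commands."""

    def __init__(self):
        self.root = Node("root")
        self._index = {"root": self.root}  # object_id -> Node
        self._parent = {}                  # object_id -> parent object_id

    def insert_object(self, parent_id, object_id):
        """Basic command 1: insert an object in the tree."""
        if object_id in self._index:
            raise KeyError(f"duplicate object id: {object_id}")
        node = Node(object_id)
        self._index[parent_id].children[object_id] = node
        self._index[object_id] = node
        self._parent[object_id] = parent_id

    def remove_object(self, object_id):
        """Basic command 2: remove an object (and its subtree) from the tree."""
        if object_id == "root":
            raise ValueError("the root object cannot be removed")
        node = self._index[object_id]
        for child_id in list(node.children):
            self.remove_object(child_id)
        parent = self._index[self._parent.pop(object_id)]
        del parent.children[object_id]
        del self._index[object_id]

    def modify_property(self, object_id, name, value):
        """Basic command 3: modify an object property."""
        self._index[object_id].properties[name] = value


def add_resistor(doc, parent_id, object_id, resistance):
    """A higher-level command, built only from the basic commands."""
    doc.insert_object(parent_id, object_id)
    doc.modify_property(object_id, "kind", "resistor")
    doc.modify_property(object_id, "resistance", resistance)
```

For example, add_resistor(doc, "root", "R1", "10k") expands into one insert followed by two property changes, which is exactly the sequence a command log would record.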

This is, IMHO, an "interesting" feature in the context of traditional SCMs: instead of storing incremental diffs, I could store incremental commands (I know it would be dead slow, but it would definitely work).

Since git is simply a "content-addressable" file system, I can (using plumbing commands) create my own system to store my machine-readable project: a tree of documents, the documents themselves being trees of objects. This fits pretty well with git's commit, tree and blob objects.
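
As an illustration of what "content-addressable" means here, the following Python sketch computes the IDs git would assign to blobs and trees, without calling git at all. It assumes the classic SHA-1 object format ("<type> <size>\0" followed by the payload), regular-file mode 100644 for every blob, and a project represented as nested dicts; the function names are my own, and real code would more likely shell out to git hash-object and git mktree.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> str:
    """SHA-1 of the '<kind> <size>\\0' header followed by the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def blob_id(data: bytes) -> str:
    """ID of a blob, e.g. one serialised object of a document."""
    return git_object_id("blob", data)


def tree_id(entries: dict) -> str:
    """ID of a tree; entries maps name -> bytes (blob) or dict (sub-tree)."""

    def sort_key(item):
        name, value = item
        # Assumption: git orders tree entries as if sub-tree names
        # ended with a '/'.
        return (name + "/" if isinstance(value, dict) else name).encode()

    payload = b""
    for name, value in sorted(entries.items(), key=sort_key):
        if isinstance(value, dict):
            mode, oid = b"40000", tree_id(value)
        else:
            mode, oid = b"100644", blob_id(value)
        # Each entry is '<mode> <name>\0' followed by the raw 20-byte ID.
        payload += mode + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return git_object_id("tree", payload)
```

With this mapping, a project such as {"board": {"R1": b'{"resistance": "10k"}'}} becomes a tree (the project) containing a tree (the document) containing a blob (the object), and two projects with identical content get the same ID.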

I could even store a serialised command stack (as a tree of command objects; again, git fits very well here) along with a commit. This would represent the set of operations (I call this a document transaction) that transforms the git document tree associated with the previous commit into the git document tree associated with the current commit.
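
A rough Python sketch of this idea follows: a transaction is serialised as JSON and replayed on the previous version of a document to obtain the current one. The JSON layout (an "intent" string plus a "commands" list) and the flat document model (object ID -> parent and properties) are assumptions made for the example, not a settled format.

```python
import copy
import json


def apply_command(doc, cmd):
    """Apply one basic command to a document (object id -> object record)."""
    op = cmd["op"]
    if op == "insert":
        if cmd["id"] in doc:
            raise ValueError(f"object already exists: {cmd['id']}")
        doc[cmd["id"]] = {"parent": cmd["parent"], "props": {}}
    elif op == "remove":
        del doc[cmd["id"]]
    elif op == "modify":
        doc[cmd["id"]]["props"][cmd["property"]] = cmd["value"]
    else:
        raise ValueError(f"unknown command: {op}")


def replay(previous, transaction_json):
    """Return the document obtained by replaying a serialised transaction."""
    doc = copy.deepcopy(previous)  # the previous version stays untouched
    for cmd in json.loads(transaction_json)["commands"]:
        apply_command(doc, cmd)
    return doc


# Hypothetical transaction that would be stored along with a commit.
transaction = json.dumps({
    "intent": "lower the cut-off frequency of the input filter",
    "commands": [
        {"op": "modify", "id": "R1", "property": "resistance", "value": "22k"},
        {"op": "insert", "id": "C1", "parent": "root"},
        {"op": "modify", "id": "C1", "property": "capacitance", "value": "100n"},
    ],
})
```

Replaying this transaction on the document of the previous commit should yield exactly the document of the current commit, which also gives a cheap consistency check for the stored command stack.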

I feel very confident that I could create wrappers around git plumbing commands to implement my 3 basic document commands (they would work on the index):
mygit insert-object <document> <object-id>
mygit remove-object <document> <object-id>
mygit change-object <document> <object-id> <property-id> <property-value>
Of course, for this to work, "mygit" needs to be aware of the low-level file format (XML, JSON, ...), but it doesn't need to know how to interpret the whole document. Storing my document transactions in git would definitely help with merging (automatic or manual) and diffing, since a document transaction would carry extra metadata telling what the user really did and why they did it, hence giving hints to the algorithm or to the end user on how to solve a merge conflict, for example.
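
To give an idea of how stored transactions could help with merging, here is a small Python sketch that compares the transactions of two branches and reports where they collide. The command representation (dicts with "op", "id", "property", "value") and the conflict rules are assumptions for illustration only; a real merge would need more cases, for instance two branches inserting an object with the same ID.

```python
def touched(transaction):
    """Summarise a transaction: {(id, property): value} and removed ids."""
    modified, removed = {}, set()
    for cmd in transaction:
        if cmd["op"] == "modify":
            modified[(cmd["id"], cmd["property"])] = cmd["value"]
        elif cmd["op"] == "remove":
            removed.add(cmd["id"])
    return modified, removed


def find_conflicts(ours, theirs):
    """List (kind, object id, property) for commands that cannot both apply."""
    our_mod, our_rm = touched(ours)
    their_mod, their_rm = touched(theirs)
    conflicts = []
    # Both sides set the same property to different values.
    for key in sorted(our_mod.keys() & their_mod.keys()):
        if our_mod[key] != their_mod[key]:
            conflicts.append(("modify/modify", key[0], key[1]))
    # One side modifies an object that the other side removed.
    for obj, prop in sorted(our_mod):
        if obj in their_rm:
            conflicts.append(("modify/remove", obj, prop))
    for obj, prop in sorted(their_mod):
        if obj in our_rm:
            conflicts.append(("remove/modify", obj, prop))
    return conflicts
```

Commands that touch different objects or different properties are not reported, so they could be replayed automatically; only the reported entries would need the user's attention.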

Now, from there, I don't know what the best approach for diffing and merging would be. Should I completely replace git's pack, diff and merge features? Should I rely on my concept of commands and document transactions? Or maybe I should keep the pack feature and simply implement diff and merge using a "clever" algorithm? (That is, just by looking at 2 versions of a document, the algorithm would detect the purpose of the change and replay it on top of another version of the document.)
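
For comparison, the second option (deriving the change from 2 versions of a document) could start from something like the Python sketch below, which expresses the difference as basic commands. It assumes a flat document model (object ID -> parent and properties) and a dict-based command format, both invented for the example. Note that it only recovers what changed, not why, and that deleted properties or re-parented objects cannot be expressed with the 3 basic commands.

```python
_MISSING = object()  # marks a property that is absent in the old version


def diff_documents(old, new):
    """Derive basic commands that turn document `old` into document `new`."""
    commands = []
    # Objects that disappeared.
    for oid in sorted(old.keys() - new.keys()):
        commands.append({"op": "remove", "id": oid})
    # Objects that appeared, followed by their initial properties.
    for oid in sorted(new.keys() - old.keys()):
        commands.append({"op": "insert", "id": oid, "parent": new[oid]["parent"]})
        for prop, value in sorted(new[oid]["props"].items()):
            commands.append(
                {"op": "modify", "id": oid, "property": prop, "value": value})
    # Objects present on both sides: report added or changed properties.
    for oid in sorted(old.keys() & new.keys()):
        for prop, value in sorted(new[oid]["props"].items()):
            if old[oid]["props"].get(prop, _MISSING) != value:
                commands.append(
                    {"op": "modify", "id": oid, "property": prop, "value": value})
    return commands
```

The resulting command list could then be replayed on top of another version of the document, which is essentially a patch expressed in the document's own vocabulary.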

I'm pretty sure I'm not the first person to look into this. I would be glad if anyone could provide feedback from their own experience, give advice on how to proceed, or simply offer criticism or point me to literature or existing projects.

Thanks,
Chris