Re: git bundle format [OT]

Stephen Bash <bash@xxxxxxxxxxx> · Mon, 26 Nov 2012 16:31:09 -0500 (EST)

----- Original Message -----
> From: "Jason J CTR Pyeron (US)" <jason.j.pyeron.ctr@xxxxxxxx>
> Sent: Monday, November 26, 2012 4:06:59 PM
> Subject: RE: git bundle format [OT]
> 
> > First, a shot out of left field: how about a patch based workflow?
> > (similar to the mailing list, just replace email with sneakernet)
> > Patches are plain text and simple to review (preferable to an
> > "opaque" binary format?).
> 
> This is to only address the accidental development on a high side.
> Using this or any process should come with shame or punishment for
> wasting resources/time by not developing on a low side to start
> with.

Ah, if only more of those I (previously) worked with thought as you do :)

> But accepting reality there will be times where code and its
> metadata (commit logs, etc) will be created on a high side and
> should be brought back to the low side.

Using git format-patch and git am, it's possible to retain the commit messages (and other associated metadata).  But again, I'm not the expert on this :)  I've made it work a few times to test patches from this list, but so far I've avoided serious integration into the mailing list workflow.
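
To make "retain the metadata" a bit more concrete, here is a rough sketch of what a single format-patch file carries. It parses a made-up patch with Python's standard email module; the sample hash, author, date, and file name are all hypothetical, and patch_metadata is just an illustrative helper, not anything git ships. Note that this covers author, date, and commit message only: commits recreated by git am generally end up with different hashes, since committer information and timestamps differ.

```python
from email import message_from_string

# Hypothetical sample in the layout git format-patch emits; the hash,
# author, date, and file name are made up for illustration.
SAMPLE_PATCH = """\
From 1111111111111111111111111111111111111111 Mon Sep 17 00:00:00 2001
From: A U Thor <author@example.com>
Date: Mon, 26 Nov 2012 16:31:09 -0500
Subject: [PATCH] Fix off-by-one in parser

Explain why the loop bound was wrong.
---
 parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/parser.c b/parser.c
index 2222222..3333333 100644
--- a/parser.c
+++ b/parser.c
@@ -1 +1 @@
-for (i = 0; i <= n; i++)
+for (i = 0; i < n; i++)
"""


def patch_metadata(patch_text):
    """Return the commit metadata carried in one format-patch message."""
    # The leading "From <hash> <date>" line is treated as the mbox envelope.
    msg = message_from_string(patch_text)
    body = msg.get_payload()
    # The commit message body ends at the first line that is exactly '---'.
    message, _, _rest = body.partition("\n---\n")
    return {
        "author": msg["From"],
        "date": msg["Date"],
        "subject": msg["Subject"],
        "message": message.strip(),
    }


if __name__ == "__main__":
    for key, value in patch_metadata(SAMPLE_PATCH).items():
        print(f"{key}: {value}")
```

Because the patch is plain text, a reviewer can read exactly these fields before anything leaves the high side.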

> >   2) Do the diffs applied to public repo contain any sensitive
> >   data?
> 
> That is a great question. Can a change to the code imply or
> demonstrate a secret even when neither the original nor the result
> is itself secret? I think the answer is yes.

In actual fact I was thinking about the simple case where the result included an "Eek! 3.1415926 cannot show up in this code!" (sometimes that's easier to see in a diff than a full text blob).  Obviously the first line of defense should catch such mistakes.  But yes, your point is also a good one.  I'd be hard-pressed to argue that a particular series of commits leaks information on its own, but it can certainly corroborate other available information.
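
As a minimal sketch of that simple case, the snippet below scans the added lines of a unified diff for strings that must not appear. The function name flag_added_lines, the sample diff, and the forbidden list are all assumptions for illustration; a real review would need a properly maintained list and human eyes on the result.

```python
def flag_added_lines(diff_text, forbidden):
    """Return (line_number, text) for added diff lines containing a forbidden string."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # '+++' introduces the new-file header, not added content.
        # (An added line whose own text begins with '++' would be skipped too;
        # this sketch accepts that limitation.)
        if line.startswith("+") and not line.startswith("+++"):
            added = line[1:]
            if any(token in added for token in forbidden):
                hits.append((number, added))
    return hits


# Hypothetical diff for illustration only.
SAMPLE_DIFF = """\
--- a/constants.py
+++ b/constants.py
@@ -1,2 +1,3 @@
 SCALE = 2
+RATIO = 3.1415926
 NAME = "demo"
"""

if __name__ == "__main__":
    for number, text in flag_added_lines(SAMPLE_DIFF, ["3.1415926"]):
        print(f"diff line {number}: {text}")
```

This only catches literal matches in added lines, so it is a complement to reading the diffs, not a replacement.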

> > Question 2 is relatively straightforward and led me to the patch
> > idea.  I would:
> >   - Bundle the public repository
> >   - Init a new repo in the secure space from the public bundle
> >   - Fetch from the to-be-sanitized bundle into the new repo
> >   - Examine commits (diffs) introduced by branches in the to-be-
> >   sanitized bundle
> >   - Perhaps get a list of all the objects in the to-be-sanitized
> >   bundle and do a git-cat-file on each of them (if the bundle is
> >   assembled correctly it shouldn't have any unreachable objects...).
> >   This step may be extraneous after the previous.
> 
> Here we would be missing the metadata that goes along with the
> commit. Especially the SHA sums.

Ah sorry, I guess I wasn't complete.  Once that process has been done on the high side, one has to go back to question 1 and see if it's safe to move the bundle out to repeat the process on the low side.
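
Before fetching from a to-be-sanitized bundle, it may help to see which refs it claims to carry and which prerequisite commits it expects. The sketch below reads the text header of a bundle, assuming the v2 layout: a "# v2 git bundle" signature line, "-<hash> <comment>" prerequisite lines, "<hash> <refname>" lines, a blank line, then the pack data. The function name read_bundle_header and the sample bytes are hypothetical; git bundle verify and git bundle list-heads remain the usual tools for this.

```python
import io

SIGNATURE = b"# v2 git bundle\n"


def read_bundle_header(stream):
    """Parse the text header of a v2 git bundle from a binary stream.

    Returns (prerequisites, refs) and leaves the stream positioned at the
    start of the pack data.
    """
    if stream.readline() != SIGNATURE:
        raise ValueError("not a v2 git bundle")
    prerequisites, refs = [], {}
    while True:
        line = stream.readline()
        if line == b"":
            raise ValueError("bundle header is truncated")
        if line == b"\n":  # a blank line ends the header
            break
        text = line.rstrip(b"\n").decode("utf-8", errors="replace")
        if text.startswith("-"):
            # Prerequisite: '-<hash>' optionally followed by a comment.
            prerequisites.append(text[1:].split(" ", 1)[0])
        else:
            object_id, _, ref_name = text.partition(" ")
            refs[ref_name] = object_id
    return prerequisites, refs


# Hypothetical header for illustration; the hashes are placeholders and
# b"PACK" stands in for the real pack data that would follow.
SAMPLE_BUNDLE = b"".join([
    SIGNATURE,
    b"-" + b"a" * 40 + b" commit already on the low side\n",
    b"b" * 40 + b" refs/heads/master\n",
    b"\n",
    b"PACK",
])

if __name__ == "__main__":
    prereqs, refs = read_bundle_header(io.BytesIO(SAMPLE_BUNDLE))
    print("prerequisites:", prereqs)
    print("refs:", refs)
```

The header only names refs and prerequisites; the objects themselves sit in the pack that follows, so this does not replace examining the commits after the fetch.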

Stephen