On Sat, May 12, 2012 at 1:12 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes: > >> On Sat, May 12, 2012 at 4:16 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: >> ... >>> It is not automatically "converting"; as far as the notes subsystem is >>> concerned, the data you throw at it to be associated with an object the >>> note annotates has always been uninterpreted stream of bytes. If an >>> application wants to store the raw representation of a commit object >>> including the log message and its header, it has every right to expect >>> that it can use "git cat-file commit $argument_to_the_C_option" as the >>> source of that uninterpreted stream of bytes, doesn't it? >> >> Some part of git-notes expects this stream of bytes to be textual, >> human readable. In that case, "git notes add -C commit/tag"'s stuffing >> a blob with the given commit/tag content to notes tree may make sense. >> Personally I'd rather stuff the commit/tag object instead so no new >> blobs are created. The end result is the same: read_sha1_file() always >> return correct text, so does "git notes show". > > No, the end result is definitely not the same. > > There are two important characteristics of "uninterpreted byte stream" the > above thinking is not taking into consideration: > > (1) we do not interpret what the application stores; and > (2) the application is *not* limited by our type system. > > Suppose the application happens to want to stuff the contents it took from > a commit object, and "add -C $objecname" is a convenient way to do so. We > have recorded it as "blob" because it is uninterpreted stream of bytes. If > you record that as a leaf note in the note tree, does that mean the note > tree now suddenly have a submodule? Hell, no. > > What if the application wanted to record the contents of a tree object > instead? How would that affect the fan-out mechanism the note subsystem > uses to hash the 40-hexadecimal object names? After descending the notes > tree to consume the object name to reach the leaf node, it still finds > even more level hanging below. Not very careful "list all object names > that have notes attached in this note tree" implementation may end up > descending into the tree object, because of this patch. Another bad > implication of such a change is that suddenly we would start including all > the subobjects in that tree object when computing the reachability of the > notes tree. Hmm.. you are right. Consider this series dropped. > > The application needs to have a way to tell what is in the data it stores > anyway, because it is not necessarily "enhancing git" kind of application > that talks about relationships between git objects. And it may do so > either by convention (e.g. my "notes/amlog" notes tree only records a > single-line text that is a Message-Id header of the original patch e-mail > commits came from) or by having its own way to identify the application > specific data type (e.g. json, pickle, protobuf, etc.). It is pointless > to say "Git object types can be stored natively, but there is only one > type of blob so the application needs to find a way to coax the types of > data it wants to store itself." It needs to do so anyway, and having > native and standardized way only for git object types does not improve the > system. It only ties our hands going forward because we need to define > what it _means_ to store non-blob types in the notes tree, and support > that forever. > > So this 1/4 patch is _not_ a bugfix at all. It breaks perfectly good > current storage semantics without no good reason. > > For that matter, as long as $EDITOR is set to something appropriate for > the application specific data, there is no reason to forbid editing, > either. > > The only sensible safety against "oops, I forgot that this notes tree > stores binary gunk" I can think of offhand that won't cripple sensible use > case is to sample the data to see if it is binary and ask confirmation, > similar to how "less" asks "frotz may be a binary file. See it anyway?", > and do so only when we are spewing it to the terminal. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html