On Fri, May 21, 2021 at 6:44 AM ZheNing Hu <adlternative@xxxxxxxxx> wrote: > > Christian Couder <christian.couder@xxxxxxxxx> 于2021年5月21日周五 上午12:20写道: > > > > On Thu, May 20, 2021 at 10:49 AM ZheNing Hu via GitGitGadget > > <gitgitgadget@xxxxxxxxx> wrote: > > > > > > From: ZheNing Hu <adlternative@xxxxxxxxx> > > > > > > Add new formatting option %(contents:raw), which will > > > print all the object contents without any changes. > > > > Maybe you could tell how it would be different from %(contents), or in > > other words what are the changes that %(contents) makes. > > > > Isn't %(contents) only printing the commit message or the tag message > > of a commit or a tag respectively? If %(contents:raw) would print all > > the object contents, it could feel strange that it is actually > > printing more than %(contents). > > Okay, some explanations are indeed missing here: By the way I forgot to say that your patch might want to also properly document %(contents:raw). If the doc was updated as part of the patch, maybe the whole patch would be easier to understand. > %(contents) will discard the metadata part of the object file, It's not clear what you mean when you talk about metadata in objects. Are you taking about only the headers, like the "tree XXX", "parent YYY", etc lines in commits, and the "object OOO", "type TTT", etc lines in tags? Or does it also includes the lines in tree objects that reference other trees or blobs? > and only print the data contents part of it. %(contents:raw) > can will not discard the metadata part of the object file, this s/can// > means that it can print the "raw" content of an object. The above seems to be changing the definition of %(contents) (as previously it was only the commit message of a commit or the tag message a tag) to explain %(contents:raw)... > In addition, %(contents:raw) can support print commit, blob, > tag, tree objects contents which %(contents) can only support > commit,tag objects. ...but this is saying that the definition of %(contents) actually doesn't change. This doesn't seem logical to me. If %(contents:raw) can support any kind of object (commit, blob, tag or tree) then it would be logical that %(contents) also support any kind of object. So if you really want to define %(contents:raw) as the raw object contents, you might want to consider a preparatory patch that would first change the definition of %(contents) to be the object contents without the headers. This would keep the same output for any commit or tag. But for blobs and trees it would print the whole content of the object in the same way as `git cat-file -p` does, instead of nothing. I think this acceptable as refs rarely point to something other than commits, so people are not likely to rely on the fact that %(contents) currently prints nothing for blobs and trees. > E.g: > > git for-each-ref --format=%(contents:raw) refs/heads/foo > > will have the same output as: > > git rev-parse refs/heads/foo | git cat-file --batch Ok, I think that would indeed be useful. > > Also is %(contents:raw) supposed to print something for a blob or a > > tree, while I guess %(contents) is printing nothing for them? > > Now my thoughts are: > Let %(contents) learn to print four kinds of objects. > and then let %(contents:raw) learn to print metadata. > I will split it into two patches. Yeah, great! > > > It will help further to migrate all cat-file formatting > > > logic from cat-file to ref-filter. > > > > > > Signed-off-by: ZheNing Hu <adlternative@xxxxxxxxx> > > > > It looks like you rewrote the patch nearly completely, but if you > > based your patch on, or got inspired by, Olga's work, it might be nice > > to acknowledge this using a trailer (for example "Based-on-patch-by: > > ..." or "Helped-by:..."). > > Okay, "Based-on-patch-by" would be more appropriate here. Nice! > > > @@ -1312,6 +1315,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf) > > > !starts_with(name, "trailers") && > > > !starts_with(name, "contents")) > > > continue; > > > + if (atom->u.contents.option == C_RAW) { > > > + v->s = xmemdupz(buf, buf_size); > > > + continue; > > > + } > > > + if (object_type != OBJ_TAG && object_type != OBJ_COMMIT) > > > + continue; > > > > When seeing the 2 lines above, I am guessing that, before this patch, > > grab_sub_body_contents() couldn't actually be called when object_type > > was OBJ_BLOB or OBJ_TREE, but you have made other changes elsewhere so > > that now it can. As only the atom->u.contents.option == C_RAW case is > > relevant in this case, you added this check. Let's see if I am > > right... > > > > > if (!subpos) > > > find_subpos(buf, > > > &subpos, &sublen, > > > @@ -1374,25 +1384,30 @@ static void fill_missing_values(struct atom_value *val) > > > * pointed at by the ref itself; otherwise it is the object the > > > * ref (which is a tag) refers to. > > > */ > > > -static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf) > > > +static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data) > > > { > > > + void *buf = data->content; > > > + unsigned long buf_size = data->size; > > > + > > > switch (obj->type) { > > > case OBJ_TAG: > > > grab_tag_values(val, deref, obj); > > > - grab_sub_body_contents(val, deref, buf); > > > + grab_sub_body_contents(val, deref, buf, buf_size, obj->type); > > > grab_person("tagger", val, deref, buf); > > > break; > > > case OBJ_COMMIT: > > > grab_commit_values(val, deref, obj); > > > - grab_sub_body_contents(val, deref, buf); > > > + grab_sub_body_contents(val, deref, buf, buf_size, obj->type); > > > grab_person("author", val, deref, buf); > > > grab_person("committer", val, deref, buf); > > > break; > > > case OBJ_TREE: > > > /* grab_tree_values(val, deref, obj, buf, sz); */ > > > + grab_sub_body_contents(val, deref, buf, buf_size, obj->type); > > > break; > > > case OBJ_BLOB: > > > /* grab_blob_values(val, deref, obj, buf, sz); */ > > > + grab_sub_body_contents(val, deref, buf, buf_size, obj->type); > > > > ...ok, I was right above. The issue now is that I wonder if > > grab_sub_body_contents() is still a good name for a function that can > > be called for a blob or a tree which does not really have a body. > > Makes sense, It might be better to use the new name: grab_contents(). Yeah.