"Nick Edelen" <sirnot@xxxxxxxxx> writes:

> +DESCRIPTION
> +-----------
> +The revision cache ('rev-cache') provides a mechanism for significantly
> +speeding up revision traversals. It does this by creating an efficient
> +database (cache) of commits, their related objects and topological relations.
> +Independant of packs and the object store, this database is composed of "independent"
> +rev-cache "slices" -- each a different file storing a given segment of commit
> +history. To map commits to their respective slices, a single index file is
> +kept for the rev-cache.
> +
> +'git-rev-cache' provides a front-end for the rev-cache mechanism, intended for
> +updating and maintaining rev-cache slices in the current repository. New cache
> +slice files can be 'add'ed, to keep the cache up-to-date; individual slices can
> +be traversed; smaller slices can be 'fuse'd into a larger slice; and the
> +rev-cache index can be regenerated.

What is the practical use of traversing a single individual slice?

> +COMMANDS
> +--------
> +
> +add
> +~~~
> +Add revisions to the cache by creating a new cache slice. Reads a revision
> +list from the command line, formatted as: `START START ... \--not END END ...`
> +
> +Options:
> +
> +\--all::
> +    Include all refs in the new cache slice, like the \--all option in
> +    'rev-list'.
> +
> +\--fresh/\--incremental::
> +    Exclude everything already in the revision cache, analogous to
> +    \--incremental in 'pack-objects'.

Write these on separate lines, like this:

    --fresh::
    --incremental::
        Exclude ...

> +\--stdin::
> +    Read newline-seperated revisions from the standard input. Use \--not
> +    to exclude commits, as on the command line.
> +
> +\--legs::
> +    Ensure newly-generated cache slice has no partial ends. This means that
> +    no commit has partially cached parents, in that all its parents are
> +    cached or none of them are. 99.9% of users can ignore this command.

Bad presentation.
I am sure 99.9% of readers would not understand what you are talking
about, and I am sure I am among them.

A "partial end" is an unexplained and undefined term at this point, and
you use it to explain what --legs is about in the first sentence. This
results in giving _no_ information to the reader with the first sentence.

Then, the second sentence, by starting with "This means that", attempts
to define the unexplained term "partial end" by rephrasing it differently.
Such a presentation structure is good only if

 1. the rephrased explanation ("all or none of the parents are cached")
    is much more understandable than the new unfamiliar term ("partial
    end");

 2. the new unfamiliar term is much more concise; and

 3. the new unfamiliar term is used repeatedly in other parts of the
    documentation.

But the explanation of --legs does not satisfy any of the above three.
It is not clear what it means for a commit to get its parents "cached";
the rephrased explanation is not much longer than "partial end"; and you
do not use "partial ends" to further explain other things anywhere in
the documentation.

In such a case, you are better off dropping the first cryptic "has no
partial ends" together with "This means" and introducing the concept more
directly. And because you would be dropping the latter half of the first
sentence and "This means that", you can do this with a longer and
easier-to-understand explanation. Perhaps...

    --legs::
        Make sure each and every commit in the created cache slice
        either has all its parents in the same slice, or none of its
        parents in it.

I said "in the _same_ slice" in my version, but I do not know if that is
what you meant by "cached". Maybe you meant "in _some_ slice" instead.
That is the kind of clarification you can afford to make, once you stop
introducing an otherwise unused term like "partial ends" here.

Also, I do not find that the word "legs" particularly "clicks" with the
"no partial ends" concept you are trying to define.
It is often good to use a verb that can be made into an adjective for
things like this. How about calling this "--close"?

    Close the newly created cache slice, i.e. make sure that each and
    every commit in the slice has all its parents in the same slice,
    or none of its parents in it.

Then later you could use "a closed slice" (vs "an open slice"), if the
distinction between a slice that was created with --legs and one created
without it becomes useful.

> +walk
> +~~~~
> +Analogous to a slice-oriented 'rev-list', 'walk' will traverse a region in a
> +particular cache slice. Interesting and uninteresting (delimited, as with
> +'rev-list', with \--not) are specified on the command line, and output is the
> +same as vanilla 'rev-list'.
> +
> +Options:
> +
> +\--objects::
> +    Like 'rev-list', 'walk' will normally only list commits. Use this
> +    option to list non-commit objects as well, if they are present in the
> +    cache slice.
> +
> +Output:
> +
> +'walk' will simply dump the contents of the output commit list, work list, and
> +pending object array. The headers are outputed on `stderr`, the object hashes
> +and names on `stdout`.

What is the practical use of traversing a single individual slice?

For example, if you have a slice created by an earlier 'add' that was run
with, say, v1.6.0..v1.6.1 as the parameter (so presumably it will know
only about the commits and their associated objects between these
versions), and you tell the command to 'walk' v1.0.0..v1.3.0 on the
slice, what happens?

What I am getting at is whether this command is also mainly intended for
debugging, just like the --no-objects option above.

> +fuse
> +~~~~
> +Merge several cache slices into a single large slice, like 'repack' for
> +'rev-cache'. On each invocation of 'add' a new file ("slice") is added to the

At this point, the reader has already read the explanation of what a
slice is, so it is easier to read if you said "... a new slice is added
to the ..."
here, without using the ambiguous but more familiar word "file".

> +Running 'fuse' every once in a while will solve this problem by coalescing all
> +the cache slices into one larger slice. For very large projects, using
> +\--ignore-size is advisable to prevent overly large cache slices. Setting git
> +'config' option 'gc.revcache' to 1 will enable cache slice fusion upon garbage
> +collection.

I am still unhappy with the word "--ignore-size". It is the threshold
above which existing slices will be kept uncoalesced; the option is not
about "ignoring" the size, but means the exact opposite. The command
actively pays attention to the size while operating under this option.

Perhaps --keep-size might be slightly more appropriate; even though
"size" does not tell if it is a lower bound or an upper bound, at least
it makes it clear that it is about keeping them from getting collapsed.

> +Note that 'fuse' uses the internal revision walker, so the options used in

Internal to what? Internal to git? Internal to the rev-cache creator?
Internal to the fuse command implementation (and if so, why)?

> +This command prints the SHA-1 of the new slice on `stdout`, and information
> +about its work on `stderr` -- specifically which files it's removing.

When talking about the "standard output" in general terms, I'd prefer
spelling it out, reserving `stdout` as a precise technical term for the
standard output stream, used only when discussing actual programming in
an environment that names the stream with that particular spelling.

> +index
> +~~~~~
> +Regenerate the revision cache index. If the rev-cache index file associating
> +objects with cache slices gets corrupted, lost, or otherwise becomes unusable,
> +'index' will quickly regenerate the file. It's most likely that this won't be
> +needed in every day use, as it is targeted towards debugging and development.

Perhaps "reindex"?
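
To make the threshold reading of --ignore-size (or --keep-size) discussed above concrete, here is a minimal sketch in Python. It is only an illustration of the semantics as I understand them from the documentation, not the rev-cache implementation; the function name `plan_fuse`, the `keep_size` parameter and the slice names are all made up for the example.

```python
from typing import Dict, List, Tuple


def plan_fuse(slice_sizes: Dict[str, int], keep_size: int) -> Tuple[List[str], List[str]]:
    """Split slices into (kept as-is, to be coalesced).

    Assumption: a slice strictly larger than keep_size is left alone,
    everything else is merged into the new slice.
    """
    kept = sorted(name for name, size in slice_sizes.items() if size > keep_size)
    fused = sorted(name for name, size in slice_sizes.items() if size <= keep_size)
    return kept, fused


# Hypothetical slice sizes in bytes.
sizes = {"slice-a": 40_000_000, "slice-b": 300_000, "slice-c": 120_000}
kept, fused = plan_fuse(sizes, keep_size=1_000_000)
print("kept:", kept)
print("fused:", fused)
```

Seen this way, the option selects which slices are kept out of the fusion, which is why a name built around "keep" reads more naturally than one built around "ignore".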
> +alt
> +~~~
> +Create a cache slice pointer to another slice, identified by its full path:
> +`fuse path/to/other/slice`
> +
> +This command is useful if you have several repositories sharing a common
> +history. Although space requirements for rev-cache are slim anyway, you can in
> +this situation reduce it further by using slice pointers, pointing to relavant
> +slices in other repositories. Note that only one level of redirection is
> +allowed, and the slice pointer will break if the original slice is removed.

Hmm, why this inconsistency? I think the other symbolic-link-like
constructs we have follow 5 levels or so...

How would you break the dependency once you make your rev-cache dependent
on another?

> diff --git a/Documentation/technical/rev-cache.txt b/Documentation/technical/rev-cache.txt
> new file mode 100644
> index 0000000..91fce8b
> --- /dev/null
> +++ b/Documentation/technical/rev-cache.txt
> @@ -0,0 +1,634 @@
> +rev-cache
> +=========
> +
> +The revision cache API ('rev-cache') provides a method for efficiently storing
> +and accessing commit branch sections. Such branch slices are defined by a
> +series of start/top (interesting) and end/bottom (uninteresting) commits. Each

It is often necessary to list synonyms like "start/top (interesting)" in
a description when a concept has been widely used before being formalized
and different people used different words to refer to the same concept.

But here you are introducing the rev-cache and its related concepts for
the first time. You don't have to give three words to each of these two
concepts from the beginning. Instead, pick one unambiguous pair and stick
to it everywhere: in the code, in the input/output to/from the commands,
and in the documentation.

If it is important to be able to distinguish the "uninteresting"-ness
used by rev-list from the "bottom"-ness used by rev-cache, then I would
suggest using "top/bottom". Otherwise, I would suggest
"interesting/uninteresting".
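
Coming back to the all-or-none property proposed for --legs (or --close) earlier in this message, the definition can be sketched as a small check. This is only an illustration of the wording I suggested, under the "in the _same_ slice" reading; it is not how rev-cache implements it, and the names `open_commits`, `is_closed` and the toy history are hypothetical.

```python
from typing import Dict, List, Set


def open_commits(parents: Dict[str, List[str]], in_slice: Set[str]) -> List[str]:
    """Return commits in the slice that have some, but not all, parents in it."""
    result = []
    for commit in sorted(in_slice):
        commit_parents = parents.get(commit, [])
        present = sum(1 for p in commit_parents if p in in_slice)
        # Neither "all parents" nor "no parents": this commit keeps the slice open.
        if 0 < present < len(commit_parents):
            result.append(commit)
    return result


def is_closed(parents: Dict[str, List[str]], in_slice: Set[str]) -> bool:
    """A slice is closed when no commit in it has only part of its parents."""
    return not open_commits(parents, in_slice)


# Toy history: M merges A and B, which both descend from R.
history = {"M": ["A", "B"], "A": ["R"], "B": ["R"], "R": []}
print(open_commits(history, {"M", "A"}))     # M has A but not B in the slice
print(is_closed(history, {"M", "A", "B"}))   # A and B each have none of their parents
```

If "cached" was meant as "in _some_ slice", `in_slice` would instead be the union of all slices, and the same check applies.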