Hello Philip,

Philip Oakley <philipoakley@iee.email> writes:
> On 06/11/2020 18:26, Jakub Narębski wrote:
>> Junio C Hamano <gitster@xxxxxxxxx> writes:
>>> Philip Oakley <philipoakley@iee.email> writes:
>>>
>>>> This may be not part of the the main project, but could you consider, if
>>>> time permits, also adding some entries into the Git Glossary (`git help
>>>> glossary`) for the various terms we are using here and elsewhere, e.g.
>>>> 'topological levels', 'generation number', 'corrected commit date' (and
>>>> its fancy technical name for the use of date heuristics e.g. the
>>>> 'chronological ordering';).
>>>>
>>>> The glossary can provide a reference, once the issues are resolved. The
>>>> History Simplification and Commit Ordering section of git-log maybe a
>>>> useful guide to some of the terms that would link to the glossary.
>>>
>>> Ah, I first thought that Documentation/rev-list-options.txt (which
>>> is the relevant part of "git log" documentation you mention here)
>>> already have references to deep technical terms explained in the
>>> glossary and you are suggesting Abhishek to mimic the arrangement by
>>> adding new and agreed-upon terms to the glossary and referring to
>>> them from the commit-graph documentation updated by this series.
>>>
>>> But sadly that is not the case. What you are saying is that you
>>> noticed that rev-list-options.txt needs a similar "the terms we use
>>> to explain these two sections should be defined and explained in the
>>> glossary (if they are not) and new references to glossary should be
>>> added there" update.

Which terms do you feel need a glossary entry?

>>> In any case, that is a very good suggestion.
>>> I agree that updating
>>> "git log" doc may be outside the scope of Abhishek's theme, but it
>>> would be very good to have such an update by anybody ;-)
>>
>> The only possible problem I see with this suggestion is that some of
>> those terms (like 'topological levels' and 'corrected commit date') are
>> technical terms that should be not of concern for Git user, only for
>> developers working on Git. (However one could encounter the term
>> "generation number" in `git commit-graph verify` output.)

To be more precise, I think that the user-facing glossary should include
only terms that appear in user-facing documentation and in output
messages of Git commands (with the possible exception of output messages
of some low-level plumbing). I think that the developer-facing glossary
should include terms that appear in technical documentation and in
commit messages in Git history.

> However we do mention "topolog*" in a number of the manual pages, and
> rather less, as yet, in the technical pages.
>
> "Lexicographic" and "chronological" are in the same group of fancy
> technical words ;-)

I think that 'topological level' would appear only in technical
documentation; if that is the case, then there is no reason to add it to
the user-facing glossary (the gitglossary manpage).

'Topological order' or 'topological sort', 'lexicographical order' and
'chronological order' are not Git-specific terms, and there are no
Git-specific ambiguities. I am therefore a bit unsure about adding them
to the *Git* glossary.

- In computer science, a _topological sort_ or _topological ordering_ of
  a directed graph is a linear ordering of its vertices such that for
  every directed edge uv from vertex u to vertex v, u comes before v in
  the ordering. For Git it means that, top to bottom, commits always
  appear before their parents. With `--graph` or `--topo-order` Git also
  avoids showing commits on multiple lines of history intermixed.
- In mathematics, the _lexicographic_ or _lexicographical order_ (also
  known as lexical order, dictionary order, etc.) is a generalization of
  the alphabetical order. For Git it is simply alphabetical order.

- _Chronological order_ is the arrangement of things following one after
  another in time; in other words, date order. Note that with
  `git log --date-order` commits also always appear before their
  parents, but otherwise commits are shown in commit timestamp order
  (committer date order).

>>
>> I don't think adding technical terms that the user won't encounter in
>> the documentation or among messages that Git outputs would be not a good
>> idea. It could confuse users, rather than help them.
>>
>> Conversely, perhaps we should add Documentation/technical/glossary.txt
>> to help developers.
>
> I would agree that the Glossary probably ought to be split into the
> primary, secondary and background terms so that the core concepts are
> separated from the academic/developer style terms.

I don't think we need three separate layers; in my opinion, separating
terms that a user of Git might encounter from terms that somebody
working on developing Git may encounter would be enough.

The technical glossary / dictionary could also help onboarding...

>
> Git does rip up most of what folks think about version "control",
> usually based on the imperfect replication of physical artefacts.

I don't quite understand what you wanted to say there. Could you
explain in more detail, please?

>> P.S. By the way, when looking at Documentation/glossary-content.txt, I
>> have noticed few obsolescent entries, like "Git archive", few that have
>> description that soon could be or is obsolete and would need updating,
>> like "master" (when default branch switch to "main"), or "object
>> identifier" and "SHA-1" (when Git switches away from SHA-1 as hash
>> function).
>
> The obsolescent items can be updated.
> I'm expecting that the 'main' and
> 'SHA-' changes will eventually be picked up as part of the respective
> patch series, hopefully as part of the global replacements.

Here I meant that the "Git archive" entry is not important anymore, as I
think there are no active users of the GNU arch version control system
(no "arch people"); arch's last release was in 2006, and its
replacement, Bazaar (or 'bzr'), doesn't use this term. So I think it can
be safely removed in 2020, 14 years after the last release of arch.

In most cases "SHA-1" in the descriptions of terms in the glossary
should be replaced by "object identifier" (to be more generic). This can
be safely done before the switch to NewHash is ready and announced.

Best,
-- 
Jakub Narębski