On Mon, Jul 29, 2013 at 10:55:41AM -0400, Marc Branchaud wrote:
> On 13-07-29 04:18 AM, Ondřej Bílka wrote:
> > Hi,
> >
> > I improved my tool and it catched following additional typos.
> >
> > As with any big project best way to catch errors is to have automated
> > checks that catch them ( Other possibility would be to read everything ten
> > times to get error rate down but nobody wants to do it).
> >
> > If you want you could add a pre-commit hook
> > stylepp-spellcheck --hook
> > that checks comments for likely typos (misspells by aspell and not
> > occurring in code). It uses aspell to identify them so you need to
> > teach aspell which words are valid.
> >
> > I would like make possible to share dictionaries so teaching phase can
> > be done only once instead for each person but I did not found suitable
> > workflow yet.
>
> Unfortunately no automated system is perfect (see some of my comments below).
> I'm all for an automated system that identifies potential misspellings, but

> > that checks comments for likely typos (misspells by aspell and not
> > occurring in code)

It just prints likely typos, nothing more.

> I'm wary of anything that attempts to automatically correct perceived errors,
> or that can't be overruled. In the end a human must make the final decision.
>
It's more about minimizing the time a human must spend on review. It is
faster to read sentences with the corrections applied, check whether they
make sense, and fix the few that do not, than to keep switching between
finding a typo, reading the sentence, looking up alternatives, and deciding
which one makes sense. It is natural that there will be errors in the
generated corrections; for a start, I lack the necessary domain knowledge.
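For illustration, the filter quoted above (report a comment word only when
the spell checker rejects it and it does not also occur in the code) can be
sketched roughly as follows. This is a minimal Python sketch under stated
assumptions, not the stylepp implementation: the `known_words` set stands in
for aspell, and the function name and the C-comment regex are hypothetical.

```python
import re

# Matches /* ... */ and // ... comments; string literals are not handled.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
WORD_RE = re.compile(r"[A-Za-z]+")


def likely_typos(source, known_words):
    """Return comment words that are neither known words nor used in code.

    `known_words` is a stand-in for a real spell checker such as aspell.
    """
    comments = COMMENT_RE.findall(source)
    # Blank out the comments so that only the code remains.
    code = COMMENT_RE.sub(" ", source)
    # Identifiers such as pathspec_count are split into their word parts.
    code_words = {w.lower() for w in WORD_RE.findall(code)}
    known = {w.lower() for w in known_words}

    typos = []
    for comment in comments:
        for word in WORD_RE.findall(comment):
            lw = word.lower()
            if lw not in known and lw not in code_words and lw not in typos:
                typos.append(lw)
    return typos


if __name__ == "__main__":
    sample = "/* Finds teh pathspec matches */\nint pathspec_count(void);\n"
    print(likely_typos(sample, {"finds", "matches"}))  # -> ['teh']
```

Here "pathspec" is not reported because it also occurs in the code, which
is the point of the second condition; "teh" only gets printed, it is not
corrected.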
> > Signed-off-by: Ondřej Bílka <neleai@xxxxxxxxx>
> >
> > diff --git a/pathspec.c b/pathspec.c
> > index 6ea0867..27ffe77 100644
> > --- a/pathspec.c
> > +++ b/pathspec.c
> > @@ -40,7 +40,7 @@ void add_pathspec_matches_against_index(const char **pathspec,
> >  /*
> >   * Finds which of the given pathspecs match items in the index.
> >   *
> > - * This is a one-shot wrapper around add_pathspec_matches_against_index()
> > + * This is an one-shot wrapper around add_pathspec_matches_against_index()
>
> As many others have already said, this is not a typo.
>
> The use of "a" or "an" depends on whether or not the O's sound is hard or
> soft. So although we say "an orange" we also say "a one-in-a-million chance".
>
Well, it slipped through both my filter and my review. Ideally a script
could just look up the pronunciation in a dictionary, but I have not found
a downloadable one yet.

> >
>
> [ ... snip ... ]
>
> >
> > diff --git a/Documentation/RelNotes/1.7.9.1.txt b/Documentation/RelNotes/1.7.9.1.txt
> > index 6957183..e8fddb8 100644
> > --- a/Documentation/RelNotes/1.7.9.1.txt
> > +++ b/Documentation/RelNotes/1.7.9.1.txt
> > @@ -20,7 +20,7 @@ Fixes since v1.7.9
> >     submodule that only has uncommitted local changes in the patch
> >     prepared by for the user to edit.
> >
> > - * Typo in "git branch --edit-description my-tpoic" was not diagnosed.
> > + * Typo in "git branch --edit-description my-topic" was not diagnosed.
>
> Here "tpoic" is illustrating the typo that was being misdiagnosed.
>
Yes, domain knowledge.

> >
>
> [ ... snip ... ]
>
> >
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index e0b923f..8420aff 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -434,11 +434,11 @@ core.repositoryFormatVersion::
> >      version.
> >
> >  core.sharedRepository::
> > -    When 'group' (or 'true'), the repository is made shareable between
> > +    When 'group' (or 'true'), the repository is made sharable between
> >      several users in a group (making sure all the files and objects are
> >      group-writable). When 'all' (or 'world' or 'everybody'), the
> >      repository will be readable by all users, additionally to being
> > -    group-shareable. When 'umask' (or 'false'), Git will use permissions
> > +    group-sharable. When 'umask' (or 'false'), Git will use permissions
>
> "Sharable" is the North American spelling. AFAIK git doesn't specify what
> kind of English the documentation source files should use. Perhaps one day
> there'll be en_UK and en_US translations, and all the sources will be written
> in Klingon...
>
> Until that day, or until the git project starts to care a lot more about
> English style, I think patches that translate spellings between English
> variants are a bit of a waste of time.
>
I need a better dictionary than the one aspell currently has. My
replacements were mostly generated from the commit histories of several
projects: I looked for commits where a wrong word changes to a correct one
and the letters stay mostly the same. This caused false positives like
this one.

> >
>
> [ ... snip ... ]
>
> >
> > diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
> > index fe723e4..1491d69 100644
> > --- a/Documentation/user-manual.txt
> > +++ b/Documentation/user-manual.txt
> > @@ -3116,7 +3116,7 @@ Trust
> >  If you receive the SHA-1 name of a blob from one source, and its contents
> >  from another (possibly untrusted) source, you can still trust that those
> >  contents are correct as long as the SHA-1 name agrees. This is because
> > -the SHA-1 is designed so that it is infeasible to find different contents
> > +the SHA-1 is designed so that it is unfeasible to find different contents
> >  that produce the same hash.
> >
> >  Similarly, you need only trust the SHA-1 name of a top-level tree object
>
> Both "infeasible" and "unfeasible" are in common usage. If you want to avoid
> future patches going back and forth on this, try "not feasible".
>
> M.
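
To make the replacement mining mentioned above more concrete (take the old
and new version of a changed line, and keep word pairs where one word was
swapped for another and the letters stay mostly the same), here is a rough
Python sketch. It is not the code the tool uses: the function name, the 0.8
threshold, and the choice of difflib's `SequenceMatcher` as the similarity
measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def candidate_fixes(old_line, new_line, threshold=0.8):
    """Return (old_word, new_word) pairs that differ only slightly.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    old_words = old_line.split()
    new_words = new_line.split()
    pairs = []
    matcher = SequenceMatcher(None, old_words, new_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only consider one-for-one word replacements.
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            continue
        for old, new in zip(old_words[i1:i2], new_words[j1:j2]):
            # Keep the pair only if the letters stay mostly the same.
            if SequenceMatcher(None, old, new).ratio() >= threshold:
                pairs.append((old, new))
    return pairs


if __name__ == "__main__":
    print(candidate_fixes(
        "the repository is made shareable between",
        "the repository is made sharable between",
    ))  # -> [('shareable', 'sharable')]
```

The example also shows the weakness discussed in this thread: a commit that
merely switches between spelling variants looks exactly like a typo fix to
such a heuristic, so the pair ends up in the replacement list unless a human
or a better dictionary filters it out.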