Hi Finn Arne, this is great stuff! On Mon, 10 May 2010, Finn Arne Gangstad wrote: > Previously, autocrlf would only work well for normalized > repositories. Any text files that contained CRLF in the repository > would cause problems, and would be modified when handled with > core.autocrlf set. > > Change autocrlf to not do any conversions to files that in the > repository already contain a CR. git with autocrlf set will never > create such a file, or change a LF only file to contain CRs, so the > (new) assumption is that if a file contains a CR, it is intentional, > and autocrlf should not change that. > > The following sequence should now always be a NOP even with autocrlf > set (assuming a clean working directory): > > git checkout <something> > touch * > git add -A . (will add nothing) > git comit (nothing to commit) s/comit/commit/ > Previously this would break for any text file containing a CR > > Signed-off-by: Finn Arne Gangstad <finag@xxxxxxx> > --- > > Some of you may have been folowing Eyvind's excellent thread about > trying to make end-of-line translation in git a bit smoother. > > I decided to attack the problem from a different angle: Is it possible > to make autocrlf behave non-destructively for all the previous problem cases? > > Stealing the problem from Eyvind's initial mail (paraphrased and > summarized a bit): > > 1. Setting autocrlf globally is a pain since autocrlf does not work well > with CRLF in the repo > 2. Setting it in individual repos is hard since you do it "too late" > (the clone will get it wrong) > 3. If someone checks in a file with CRLF later, you get into problems again > 4. If a repository once has contained CRLF, you can't tell autocrlf > at which commit everything is sane again > 5. autocrlf does needless work if you know that all your users want > the same EOL style. > > I belive that this patch makes autocrlf a safe (and good) default > setting for Windows, and this solves problems 1-4. > > I implemented it by looking for CR charactes in the index, and > aborting any conversion attempt if this is found. The code to read > the index contents was copied pretty verbatim from attr.c, and should > probably be made into a non-static function instead if there is no > better way of doing this. One technical question, see below. > Note that ALL the tests still pass unmodified. This is a bit > surprising perhaps, but think it is an indication that no one ever > intented autocrlf to do what it does to files containing CRs. Indeed. But a test of its own would be nice, no? If you do not have time, I will come up with one. BTW all this technical description after the "---" should probably go into the commit message. > diff --git a/convert.c b/convert.c > index 4f8fcb7..9d062c8 100644 > --- a/convert.c > +++ b/convert.c > @@ -120,6 +120,43 @@ static void check_safe_crlf(const char *path, int action, > } > } > > +static int has_cr_in_index(const char *path) > +{ > + int pos, len; > + unsigned long sz; > + enum object_type type; > + void *data; > + int has_cr; > + struct index_state *istate = &the_index; > + > + len = strlen(path); > + pos = index_name_pos(istate, path, len); > + if (pos < 0) { > + /* > + * We might be in the middle of a merge, in which > + * case we would read stage #2 (ours). > + */ > + int i; > + for (i = -pos - 1; > + (pos < 0 && i < istate->cache_nr && > + !strcmp(istate->cache[i]->name, path)); > + i++) > + if (ce_stage(istate->cache[i]) == 2) > + pos = i; > + } I think it makes sense to assume that "ours" determines whether we should assume that the index has a wrong format. But if there is also a "base" that disagrees on CR-ness with "ours", should we not try to pick "ours"? Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html