Scott Johnson venit, vidit, dixit 15.12.2010 04:47:
> I am attempting to do a word diff of an html source file. Part of the removed
> html is disappearing from the diff when I enable the fancy html word diff.
>
> Here's the output from basic `git diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>    <ul>
>      <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>      <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
> -    <li class="yws-maps"><em></em><a href="#">yws-maps</a></li>
> -    <li class="ydn-delicious"><em></em><a href="#">ydn-delicious</a></li>
> +    <li><em></em><a href="#">yws-maps</a></li>
> +    <li><em></em><a href="#">ydn-delicious</a></li>
>      <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>      <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>    </ul>
>
>
> Here's the default `git diff --word-diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>   <ul>
>     <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>     <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
>     [-<li class="yws-maps"><em></em><a-]{+<li><em></em><a+} href="#">yws-maps</a></li>
>     [-<li class="ydn-delicious"><em></em><a-]{+<li><em></em><a+} href="#">ydn-delicious</a></li>
>     <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>     <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>   </ul>
>
> Which is correct, but less than ideal because it highlights much more than the
> actual changes.
>
> So I create a .gitattributes file with one line:
> *.html diff=html
>
> And rerun `git diff --word-diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>   <ul>
>     <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>     <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
>     <li[-class="yws-maps"-]><em></em><a href="#">yws-maps</a></li>
>     <li><em></em><a href="#">ydn-delicious</a></li>
>     <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>     <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>   </ul>
>
> Yikes! What happened to the second line of changes? The removed code is not
> displayed at all.
>
> This is running git 1.7.3.3.
>
> I suspect the problem is in the html patterns in userdiff.c, but I don't
> understand the word-diff-regex well enough to fix it.

The wordRegex should really only control what comprises a word, i.e. the
granularity of --word-diff. (Where do we insert additional line-breaks
before running ordinary diff?)

If a wordRegex can make parts of the diff disappear, then there is a problem
deeper in the diff machinery. Can you trim this down to a minimal example?

Michael
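
To illustrate the expectation that a word regex only sets the granularity, here is a rough Python sketch of the idea: split both versions into words, then run an ordinary sequence diff over the words. This is not git's implementation. The regex is only an approximation of the html pattern in userdiff.c (an assumption on my part), the names `WORD_RE`, `words` and `removed_words` are made up for this sketch, and difflib's matching differs from git's diff algorithm.

```python
import difflib
import re

# Assumption: a rough stand-in for git's html word regex. A run of
# characters other than '<', '>', '=' and whitespace is one word;
# any other non-space character is a word on its own.
WORD_RE = re.compile(r"[^<>=\s]+|\S")


def words(text):
    """Split text into the tokens that a word diff would compare."""
    return WORD_RE.findall(text)


def removed_words(old, new):
    """Return the tokens of old that have no counterpart in new."""
    a, b = words(old), words(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(a[i1:i2])
    return removed


# The two changed lines from the report, before and after.
OLD = (
    '<li class="yws-maps"><em></em><a href="#">yws-maps</a></li>\n'
    '<li class="ydn-delicious"><em></em><a href="#">ydn-delicious</a></li>\n'
)
NEW = (
    '<li><em></em><a href="#">yws-maps</a></li>\n'
    '<li><em></em><a href="#">ydn-delicious</a></li>\n'
)

if __name__ == "__main__":
    print(removed_words(OLD, NEW))
```

With this tokenization the removed words come from both changed lines, so whatever the word regex is, one would expect both `class` attributes to show up as removals in the word diff.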