p.p.s. git 2.20 has the same problem. On 12/30/22, Samuel Wales <samologist@xxxxxxxxx> wrote: > i am not subscribed, but am of the impression that's ok. please copy > me directly. > > > tldr: git diff is showing differences that do not exist in the files > themselves. > > i have nothing staged, nothing fancy like stashing, etc. this is a > repo of mostly emacs org mode files. mostly ascii text. > > git status and these commands show nothing unusual: > > git fsck --strict --no-dangling > git gc --prune="0 days" > > > the problem that seems like data corruption is that a few lines appear > twice as - and once as +. but in the current version of the files, > those lines exist only once. here are the lines. there are 2 - > versions and one + version: > > +***************** REF bigpart is a partition > +biglike and homelike are distracting nonsense i think except > +to describe inferior filesets. anomalous subset of home > +might be called homelike or so. > > > emacs magit shows the same problem. however, it shows a slightly > different diff. i did a meta-diff on git diff vs. magit, and there > are about 800 + real-content lines that magit shows but git diff does > not. i do not know what this means. wc -l is like > > 62540 aaa.diff > 62965 bbb--magit.txt > > idk why a diff would be different with only + lines being different? > > > in summary, what is wrong with my repo, if anything, and what can i do > about it? nothing on the web for git corruption seems to say much, > other than pull from github or whatever. this is my own repo, the > original repo, so i cannot do that. org annex has an uncorrupt tool > of some kind, but it did not seem relevant. i do have rsnapshot > [basically rsync] backups of the repo and the most significant files > and dirs, but i do not know what one does to use that to repair any > issues. i won't get into why, but changes were made over months. > > is there a protocol for this? > > would git fsck have balked? > > thank you! > > > p.s. > > i have no reason to believe this is related, but git diff has > intermingled emacs org mode entries. but i don't have to talk about > it in org terms; in generic text terms, it has intermingled parts of > different paragraphs. as a user, i'd prefer that completely unrelated > paragraphs not be mingled, regardless of minimality. if possible. > > with respect to the intermingling only, unless this is related to the > possible corruption, i will presume the diff is correct, in that a > patch from it would produce the same result as a patch that does not > intermingle. i believe this intermingling is because diff does not > understand org, or paragraphs for that matter. in org, an entry > starts with "^[*]+ " and ends at the beginning of another entry or at > eof. they consist in my case mostly of ascii text paragraphs. just > as with paragraphs, if you move an entry, you don't expect it to be > mingled with a different one in the diff. > > i have been told that this cannot be fixed by merely telling a > slightly improved differ that stuff between stars is worth preserving, > but that a parser, not merely a couple of regexps, is needed to reduce > this intermingling. i have also been told that difftastic uses > tree-sitter, which might get such a syntax for emacs org mode. and so > maybe at some point git diff can use that. idk. > > idk if any of this is related but i include it for completeness. > > also, please don't laugh, but i am using git version 2.11.0. i will > upgrade pending various library and os stuff but my main concern is > not for git, but for possible corruption in the repo and what is > possible to do, at least given rsnapshot, to fix it. > -- The Kafka Pandemic A blog about science, health, human rights, and misopathy: https://thekafkapandemic.blogspot.com