Re: crlf with git-svn driving me nuts...

On Thu, Apr 17, 2008 at 1:46 AM, Dmitry Potapov <dpotapov@xxxxxxxxx> wrote:
> On Thu, Apr 17, 2008 at 12:07:27AM +0100, Nigel Magnay wrote:
>  > >  > The bit I really don't understand is why git thinks a file that has
>  > >  > just been touched has changed when it hasn't,
>  > >
>  > >  Actually, it did change in the sense that if you try to commit this
>  > >  file now into the repository, you will have a different file in Git!
>  > >  So, it is more correct to say that Git did not notice this change until
>  > >  you touch this file, because this change is indirect (autocrlf causes
>  > >  a different interpretation of the file).
>  > >
>  >
>  > Okay - at the very least this behaviour is really, really confusing.
>  > And I think there's actually a bug: it should *always* report that the
>  > file is different, not magically only after it's been touched.
>
>  I don't think there is a simple way to correct that without penalizing
>  normal use cases. Usually, people do not change autocrlf during their
>  normal work. Besides, you can have your own input filters and they may
>  cause the same effect. So, Git works on the assumption that input filters
>  always produce the same results...

This has nothing to do with changing core.autocrlf after checkout -
it's a problem with *any* repo with CRLF files, being checked out on a
core.autocrlf=true machine, which is basically any Windows machine.

The current 'isDirty' check seems to be something like

    isDirty = ( wc.file.mtime > someValue )
           && ( repository.file != filter(wc.file) )

I'm saying it ought to be something like

    isDirty = ( wc.file.mtime > someValue )
           && ( sha1(repository.file) != sha1(wc.file) )
           && ( repository.file != filter(wc.file) )
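
As a rough illustration of the difference between the two checks, here is a
minimal Python sketch. The function names, the mtime_changed flag and the
crlf_to_lf stand-in for the autocrlf input filter are all made up for this
example, and hashing the raw bytes is a simplification; this is not Git's
actual code, only the logic described above.

```python
import hashlib


def crlf_to_lf(data: bytes) -> bytes:
    """Hypothetical stand-in for the autocrlf input filter (CRLF -> LF)."""
    return data.replace(b"\r\n", b"\n")


def is_dirty_current(repo_blob: bytes, wc_bytes: bytes, mtime_changed: bool) -> bool:
    """The check as it appears to behave today: once the mtime looks newer,
    compare the stored blob with the *filtered* work-tree content."""
    return mtime_changed and repo_blob != crlf_to_lf(wc_bytes)


def is_dirty_proposed(repo_blob: bytes, wc_bytes: bytes, mtime_changed: bool) -> bool:
    """The suggested check: a work-tree file that is byte-identical to the
    stored blob is never dirty, whatever the filter would do to it."""
    same_bytes = hashlib.sha1(repo_blob).digest() == hashlib.sha1(wc_bytes).digest()
    return mtime_changed and not same_bytes and repo_blob != crlf_to_lf(wc_bytes)
```

With a blob stored as CRLF and an untouched, byte-identical work-tree copy,
the first function reports dirty as soon as the file is touched, while the
second does not; a real edit is reported by both.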


>
>
>  >
>  > But fixing that minor bug still leads to badness for the user. Doing
>  > (on a core.autocrlf=true machine) a checkout of any revision
>  > containing a file that is (currently) CRLF in the repository, and your
>  > WC is *immediately* dirty. However technically correct that is, it
>  > doesn't fit most people's user model of an SCM, because they haven't
>  > made any modification.
>
>  IMHO, the only sane way is to never store CRLF in the Git repository.
>  You can have whatever ending you like in your work tree, but inside
>  of Git, LF is the actual marker of the end-of-line.
>

Great. I'll go and argue with the team using svn, who don't even
*notice* this issue, and try to get them to adjust the metadata on
every single file in the repository.

Then, for a bonus, I'll try the same with every OSS project that I'm
tracking with git-svn. :-(

I get that things are horribly broken if you get CRLF in your
repository. But it's unreasonable to expect the ability to bend the
rest of the world to what's convenient for me! Some of our Windows
coders probably even *like* svn:eol-style=CRLF!

>
>  > And if 1 person makes a change along with their
>  > conversion, and the other 'just' does a CRLF->LF conversion,
>
>  If you imported correctly in Git, it should not have CRLF for text
>  files. So, there is no conversion that a user does explicitly.
>
>
>  > And because the svn is
>  > mastered crlf (well, strictly speaking, it's ignorant of line endings)
>  > this is gonna happen a lot.
>
>  Not really. SVN has its own setting for EOL conversion. If you have
>  'svn:eol-style' set to 'native' for any text file then SVN will
>  check out text files according to your native EOL (you can specify
>  your native EOL using the --native-eol option when it is necessary).
>

Can I set this personally, without affecting the svn repo? If so, why
isn't git-svn doing this anyway, and can I tell it to do so?

>
>  > Can't git be taught that if the WC is byte-identical to the revision
>  > in the repository (regardless of autocrlf) then that ought not to be
>  > regarded as a change?
>
>  Why should it not? If a file is different as far as the Git repository is
>  concerned, then it *is* a change. Git does a binary comparison of files
>  _after_ applying all specified filters (and you can have your own filters,
>  not only autocrlf).
>

See above. Unchanged (on disk, byte identical) files, if touched, get
(sometimes) marked as dirty.

>
>  > Is there a way I can persuade the diff / merge mechanisms to normalise
>  > before they operate (e.g. if core.autocrlf does lf->crlf/crlf->lf,
>  > then an equivalent that does crlf->lf/crlf->lf before doing the
>  > merge)?
>
>  I am not sure if there is a standard option for that, but it is
>  certainly possible to define your own merge strategy.
>
Ok - I'll have a look into this - just a filter on each file before
merging would be sufficient. Presumably people that do things like
$Id$ expansion need something similar to avoid constant merge
conflicts.
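
The kind of per-file filter meant here could look like the following Python
sketch. The names normalize_eol and normalize_merge_inputs are hypothetical,
and how such a step would be hooked into Git (for instance through a custom
merge driver) is left open; the sketch only shows the normalisation applied
to the three inputs of a merge.

```python
from typing import Tuple


def normalize_eol(data: bytes) -> bytes:
    """Convert CRLF to LF; content that is already LF is left unchanged."""
    return data.replace(b"\r\n", b"\n")


def normalize_merge_inputs(base: bytes, ours: bytes, theirs: bytes) -> Tuple[bytes, bytes, bytes]:
    """Normalise all three sides so that a pure CRLF->LF conversion on one
    side no longer looks like a change to every line of the file."""
    return normalize_eol(base), normalize_eol(ours), normalize_eol(theirs)
```

After this step, a side that only converted line endings compares equal to
the base, so only the side with a real edit contributes changes to the merge.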

>
>  >
>  > In a perfect world I'd be able to switch all files in the repo to LF,
>  > but that's not going to happen any time soon because of the majority
>  > of developers, still on svn, still on windows.
>
>  Well, I don't see any problem here if everything is configured properly.
>  How files are stored inside and what you have in your work tree does
>  not have to be the same. So, storing everything inside with LF is
>  certainly possible. Actually, I believe it is exactly what CVS does
>  (unless you added a file with '-kb'), and people use CVS on Windows.
>  Importing files with CRLF into Git is like putting files in as _binary_
>  in CVS.
>
>  Dmitry
>