Re: [PATCH] fast-import.c: always honor the filename case

On 2014-02-03 23.11, Reuben Hawkins wrote:
>
>
>
> On Mon, Feb 3, 2014 at 2:21 PM, Torsten Bögershausen <tboegi@xxxxxx> wrote:
>
>     []
>     > So to summarize, when fast-import uses strncmp_icase (what fast-import does now) import on a repository where ignorecase=true is wrong.  My patch, "fast-import.c: always honor the filename case" fixes this.  Can you verify?
>     >
>     > Thanks in advance,
>     > Reuben
>     >
>     Yes, I can verify. My feeling is that
>     a) the fast-export should generate the rename the other way around.
>        Would that be feasible?
>
>
> I *think* this is feasible.  I did try this, and it worked, but I didn't like the idea of having to fix all the exporters.  I know about hg-export and svn-export, but I assume there are more that I don't know about, and maybe even other tools out there none of us know about, which would also have to be fixed in the same way.  As such, I decided fixing fast-export isn't really the right thing to do...I don't really think fast-export is broken to begin with.  I'm hoping there is a way to fix ignorecase such that it doesn't create this type of problem with this...
>
> M 100644 :1 Filename.txt
> D FileName.txt
>
> ..maybe by very carefully identifying when ignorecase should apply and when it shouldn't (I'm still trying to sort that out, the docs on ignorecase aren't specific).
>
> But for what it's worth, to "fix" fast-export, I added a check in builtin/fast-export.c in the function depth_first before all the other checks so it would always make diff_filepair->status == 'D' the lesser when not equal...something like this (I'm not looking at the code now, so this may *not* be what I did)...
>
> if (a->status != b->status) {
>   if (a->status == 'D') return -1;
>   if (b->status == 'D') return 1;
> }
>
> /Reuben
>  
>
>        Or generate a real rename ?
>
>
> A rename may also work, but it may not.  And that would also require fixing all other exporters.  If I understand the docs well enough, a rename would be done like so...
>
> R Filename.txt FileName.txt
>
> I think in the ignorecase=true situation, case folding would happen and this would be a nop like this...
>
> R Filename.txt Filename.txt
>
> ...Right?  I haven't tested this, but I *suspect* the result would be to not rename the file when ignorecase=true...I definitely think it's worth a try just to know the result, but this fixes the ignorecase problem in fast-import by passing a requirement to all the fast-exporters...a semi-artificial requirement created because ignorecase *could* be true, but may not be.
> /Reuben
>  
>
>       (I'm not using fast-export or import myself)
>
>     b)  As a workaround, does it help if you manually set core.ignorecase false ?
>
>
> Yes, this works.  It makes a single step clone, git clone hg::..., into a multi step process like this...
>
> $ mkdir test
> $ cd test
> $ git init
> $ git config core.ignorecase false
> $ git remote add origin hg::whatever
> $ git fetch origin
> $ git reset --hard origin/master
> $ git branch --set-upstream-to=origin/master master
>
> That isn't too big a deal for people fluent in Git (if you only have to do it once, and maybe wrote it down).  It works, just not ideally, and it's easy to get burned because git clone sort of works; it just doesn't truly clone.  The resulting cloned repo can be mangled by case folding.
>
> Typically, unless somebody explains the multi step process to everybody, some people will have master with commit xxxxxx and others will have the exact same master with commit yyyyyy.  Some will have Filename.txt instead of FileName.txt way back in history.  Merging those branches is a mess.
>
> So setting core.ignorecase=false does work, it's just a bit cumbersome.  My fingers really want to just type git clone hg::whatever and I hope to get a true 'clone' as in an exact, identical copy on all machines regardless of filesystem.  If Git wants to do case folding after that I suppose it would be fine.  Maybe I'm expecting too much, but I've been under the impression for years that a clone of any git repo will have all the exact same commit IDs.
> /Reuben
>
>
>     c)  Does it help to use git-hg-remote ? (could be another workaround)
>
>
> Yes, sorry, I guess I wasn't clear on that point.  That is what I'm using. 
> /Reuben
>
>
>     And no,  50906e04e8f48215b0 does not include any test cases.
>     (try git show 50906e04e8)
>
>     This is only a short answer; I can prepare a longer answer about ignorecase in the next few days.
>     /Torsten
>
>
> Thank you!  That would be very helpful.  I'm still trying to wrap my head around what ignorecase really needs to do, when and where it needs to do it and what it shouldn't do.  I suspect ignorecase is touching too many code paths and needs to be reined in a bit.
>
> I'm also wondering if it's possible to test a bunch of situations in 'make tests' with ignorecase=true/false....but I can't think of any way other than mounting filesystems on loopbacks to setup the tests (to ensure a vfat fs for example)....do you know a better way?
>
> /Reuben 
>
My experience and understanding is that ignorecase=true is useful when working in a Windows environment.
Some tools (or users) change the case of a filename (or directory name) and Git compensates for that.
This means that you can rename a file behind Git's back, and Git does not complain.
(Junio recently explained this much better and in more detail.)

The thing is that you can push and pull between different machines, and ignorecase=true makes sure that
Git finds the "right file", much as Windows does.

The same is true for directory names, and I could call the whole ignorecase=true feature
a kind of "don't worry" package, or a "changing case is harmless" insurance.

Since 50906e04e8 this is even more true, because even fast-import is covered:
the importer will find the "right file" even if the exporter did not tell us about the rename.
(Or should we say re-case?)

Some trouble starts when you push and pull such a repo to a Linux machine.
(You can replace Windows with Mac OS + case-insensitive HFS+, and Linux with
Mac OS + case-sensitive HFS+ (or Unix).)

The case folding of directory names is especially interesting:
for Linux, "Dir1/File", "dir1/file" and "dir1/File" are completely different; for Windows they are not.
So if a mixture of Windows and Linux systems is working together, it can help to set
ignorecase=false on Windows, to detect some of the mess seen by Linux.
(And that's why I suggested setting ignorecase=false as a workaround.)

Reading the answers from Peff and Junio, I am convinced that the fast-import should
not look at core.ignorecase at all.
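
To make the M/D sequence quoted above concrete, here is a toy model in C. It is only a sketch, not fast-import's real data structures: struct toy_tree, toy_find(), toy_modify() and toy_delete() are hypothetical names, and the assumption that a folded "M" reuses the existing entry is mine. The outcome in real fast-import may differ; the toy only shows how folding makes both commands hit the same entry.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

#define MAX_ENTRIES 8
#define MAX_NAME 64

/* Toy stand-in for a tree: a flat list of file names. */
struct toy_tree {
	char names[MAX_ENTRIES][MAX_NAME];
	int count;
};

/* Return the index of the entry matching name, or -1 if there is none. */
static int toy_find(const struct toy_tree *t, const char *name, int fold_case)
{
	for (int i = 0; i < t->count; i++) {
		int same = fold_case ? strcasecmp(t->names[i], name) == 0
				     : strcmp(t->names[i], name) == 0;
		if (same)
			return i;
	}
	return -1;
}

/* "M <name>": reuse a matching entry (assumption), else append a new one. */
static void toy_modify(struct toy_tree *t, const char *name, int fold_case)
{
	assert(strlen(name) < MAX_NAME);
	if (toy_find(t, name, fold_case) >= 0)
		return;	/* content update is not modeled */
	assert(t->count < MAX_ENTRIES);
	strcpy(t->names[t->count++], name);
}

/* "D <name>": drop the matching entry, if any. */
static void toy_delete(struct toy_tree *t, const char *name, int fold_case)
{
	int i = toy_find(t, name, fold_case);

	if (i < 0)
		return;
	if (i != t->count - 1)
		strcpy(t->names[i], t->names[t->count - 1]);
	t->count--;
}
```

Starting from a tree that holds FileName.txt, replaying "M Filename.txt" and then "D FileName.txt" with fold_case set leaves the toy tree empty, because the D matches the entry the M just reused. With fold_case clear, only Filename.txt remains, which is what the stream intended.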

If I am allowed to make a suggestion:
the patch for fast-import.c (0001-fast-import.c-ignorecase-iff-explicitly-told-to.patch)
that you sent out is a good starting point.

(I would avoid using ignore_case and strncmp_icase().
 A file-local variable like fi_ignore_case or so, and a local function, will keep fast-import's behavior isolated from the global setting.)
 
Perhaps slow_same_name() from name-hash.c can be used (it needs to be made non-local)
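
A minimal sketch of the file-local approach, assuming the flag would only be set when the user asks for it explicitly (how it is set is not shown). The function name fi_strncmp() is made up for illustration, and this does not use slow_same_name(); it relies on the POSIX strncasecmp(), which only folds case according to the current locale.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/*
 * File-local switch, independent of the global ignore_case.
 * It defaults to 0, so names are compared exactly unless told otherwise.
 */
static int fi_ignore_case;

/* Hypothetical local helper: compare at most n bytes of two names. */
static int fi_strncmp(const char *a, const char *b, size_t n)
{
	assert(a && b);
	return fi_ignore_case ? strncasecmp(a, b, n) : strncmp(a, b, n);
}
```

With the default of 0, fi_strncmp("Filename.txt", "FileName.txt", 12) is nonzero, so the two spellings stay distinct; it returns 0 only after fi_ignore_case has been set.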

On top of that, a test case would be good.
 You have sent a shell script to demonstrate the problem using fast-export and fast-import.
 This can be used as a start.

And if you want to experiment with case-sensitive/case-insensitive file systems
under Mac OS, simply get a USB drive (an 8 GB flash stick could be enough)
and format it with the HFS+ variant you don't have on your hard disk.
/Torsten
 
