RE: FW: Windows. Git, and Dedupe

Yes, Dedup is in fact a Server-only feature.  However, there are lots of people using the Server SKU as development workstations (especially here at Microsoft <g>).  There are also some sysadmins that I know of who use git and download sysadmin scripts via git to Servers.  Finally, I would hazard a guess that it's possible to mount an NTFS filesystem containing deduped files from a Server machine onto a client SKU and access those files.  (I'm not on the NTFS team, and haven't tried it.)  So I think there are good reasons to support reparse points on Windows.  

The reparse point could be identified as a non-symlink reparse item by inspecting its reparse tag; in those cases, treating the file as an "ordinary" file would be appropriate.

For example, see the fsutil output at the end of this message.  The reparse tag value for symlinks is IO_REPARSE_TAG_SYMLINK (0xa000000c) and for deduped files is IO_REPARSE_TAG_DEDUP (0x80000013).  The values are listed in the documentation at [1].
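
To make the idea concrete, here is a rough sketch in C of how the tag check might look.  It is not taken from the git codebase and calls no Win32 API; it only classifies a tag value that has already been read.  The two tag values are the ones quoted above.  The two bit masks are my understanding of what the IsReparseTagMicrosoft and IsReparseTagNameSurrogate macros in the Windows headers test, and all of the function and macro names below are made up for illustration.

```c
#include <stdint.h>

/* Tag values quoted in this message. */
#define REPARSE_TAG_SYMLINK 0xA000000Cu
#define REPARSE_TAG_DEDUP   0x80000013u

/* Assumption: bit masks matching what the IsReparseTagMicrosoft and
 * IsReparseTagNameSurrogate macros in the Windows headers test. */
#define REPARSE_BIT_MICROSOFT      0x80000000u
#define REPARSE_BIT_NAME_SURROGATE 0x20000000u

/* Nonzero if the tag is owned by Microsoft ("Tag value: Microsoft"). */
int reparse_tag_is_microsoft(uint32_t tag)
{
    return (tag & REPARSE_BIT_MICROSOFT) != 0;
}

/* Nonzero if the file stands in for another named entity
 * ("Tag value: Name Surrogate"). */
int reparse_tag_is_name_surrogate(uint32_t tag)
{
    return (tag & REPARSE_BIT_NAME_SURROGATE) != 0;
}

/* Nonzero only for the symbolic link tag. */
int reparse_tag_is_symlink(uint32_t tag)
{
    return tag == REPARSE_TAG_SYMLINK;
}

/* Hypothetical policy: anything that is not a symlink is handled
 * like an ordinary file, which covers deduped files. */
int reparse_treat_as_regular_file(uint32_t tag)
{
    return !reparse_tag_is_symlink(tag);
}
```

With these helpers, 0xa000000c is reported as a symlink and a name surrogate, while 0x80000013 is neither and would be handled as a regular file.  That matches the fsutil output, which prints "Name Surrogate" and "Symbolic Link" only for the mklink-created file.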

I admit to not having looked at the git code, and I am not familiar with mingw.  Are native Win32 calls supported in the git codebase?

Jmr


[1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365740(v=vs.85).aspx


PS I:\temp> cmd /c mklink x y
symbolic link created for x <<===>> y
PS I:\temp> fsutil reparsepoint query x
Reparse Tag Value : 0xa000000c
Tag value: Microsoft
Tag value: Name Surrogate
Tag value: Symbolic Link

Reparse Data Length: 0x00000010
Reparse Data:
0000:  02 00 02 00 00 00 02 00  01 00 00 00 79 00 79 00  ............y.y.
PS I:\temp> fsutil reparsepoint query x.txt
Reparse Tag Value : 0x80000013
Tag value: Microsoft

Reparse Data Length: 0x0000007c
Reparse Data:
0000:  01 02 7c 00 00 00 00 00  66 9c 1a 01 00 00 00 00  ..|.....f.......
0010:  00 00 01 00 00 00 00 00  cb eb c5 00 6a 97 63 4d  ............j.cM
0020:  97 9c 13 0c 41 8e ed 8b  40 00 40 00 40 00 00 00  ....A...@.@.@...
0030:  d3 b9 a8 d4 e4 c6 cd 01  55 ca 02 00 00 00 05 00  ........U.......
0040:  70 ac 21 04 00 00 05 00  01 00 00 00 88 8d 00 00  p.!.............
0050:  c8 30 00 00 00 00 00 00  c8 44 db 94 6c 88 9a d4  .0.......D..l...
0060:  0a a9 01 3a 1f 80 80 8d  ea 0d 53 d7 36 49 b9 a4  ...:......S.6I..
0070:  82 a2 b9 4e 2a 16 4b a1  2e d9 f3 dd              ...N*.K.....

-----Original Message-----
From: René Scharfe [mailto:rene.scharfe@xxxxxxxxxxxxxx] 
Sent: Tuesday, March 19, 2013 2:08 PM
To: Josh Rowe
Cc: git@xxxxxxxxxxxxxxx; msysgit@xxxxxxxxxxxxxxxx
Subject: Re: FW: Windows. Git, and Dedupe

Am 18.03.2013 22:20, schrieb Josh Rowe:
> On Windows with an NTFS volume with Deduplication enabled, Git 
> believes that deduplicated files are symlinks.  It then fails to be 
> able to do anything with the file.  This can be repro-ed by creating 
> an NTFS volume with dedup, creating some duplicate files, verifying 
> that a few files are deduped, and trying to add and commit the files 
> via git.

Both Single Instance Storage[1] and Data Deduplication[2] (introduced with Windows Server 2012) seem to be server-only features.  How about keeping regular git repositories with checked-out files on client disks and using the server only for bare repositories (without working tree)?

When I tried to add a symbolic link created with mklink on Windows 8, the mingw version of git refused because readlink(2) is not supported.  This seems to be sufficient to reproduce the issue.

I couldn't test the Cygwin version, though, because http://cygwin.com doesn't respond at the moment.

But a working readlink(2) wouldn't help anyway, I guess.  I imagine that the reparse points used for deduplication point into a magic block store which performs garbage collection of content that is no longer referenced -- which probably means that a recreated "symlink" may point to blocks that have been deleted in the meantime.

Perhaps you need a way to ask git to always follow symlinks instead of trying to store their target specification.

René


[1] http://technet.microsoft.com/en-us/library/dd573308%28v=ws.10%29.aspx
[2] http://msdn.microsoft.com/en-us/library/windows/desktop/hh769303%28v=vs.85%29.aspx