Re: newbie questions about git design and features (some wrt hg)

Matt Mackall wrote:
> On Wed, Jan 31, 2007 at 11:56:01AM +0100, Jakub Narebski wrote:
>> Theodore Tso wrote:
>> 
>>> On Tue, Jan 30, 2007 at 11:55:48AM -0500, Shawn O. Pearce wrote:
>>>> I think hg modifies files as it goes, which could cause some issues
>>>> when a writer is aborted.  I'm sure they have thought about the
>>>> problem and tried to make it safe, but there isn't anything safer
>>>> than just leaving the damn thing alone.  :)
>>> 
>>> To be fair hg modifies files using O_APPEND only.  That isn't quite
>>> as safe as "only creating new files", but it is relatively safe.
>> 
>> From (libc.info):
>> 
>>  -- Macro: int O_APPEND
[...] 
>> I don't quite understand how that would help hg (Mercurial) to have
>> operations like commit, pull/fetch or push atomic, i.e. all or
>> nothing. 
> 
> That's because it's unrelated.
[...]
> Mercurial has write-side locks so there can only ever be one writer at
> a time. There are no locks needed on the read side, so there can be
> any number of readers, even while commits are happening.
> 
>> What happens if operation is interrupted (e.g. lost connection to
>> network during fetch)?
> 
> We keep a simple transaction journal. As Mercurial revlogs are
> append-only, rolling back a transaction just means truncating all
> files in a transaction to their original length.

Thanks a lot for the complete answer. So Mercurial uses write-side locks
to deal with concurrent operations, and a transaction journal to deal
with interrupted operations. I guess that incomplete transactions are
rolled back on the next hg command...
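
To check my understanding of the journal scheme, here is a minimal
sketch in Python. It is not Mercurial's actual code: the Transaction
class, the rollback() function and the journal format (one
"path NUL length" record per line) are all made up for illustration.
The only idea taken from the description above is that files are
append-only, so undoing a transaction means truncating each file to
the length it had before.

```python
import os


class Transaction:
    """Toy append-only transaction (hypothetical, not Mercurial's code)."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.seen = set()

    def append(self, path, data):
        if path not in self.seen:
            # Record the pre-transaction length before touching the file.
            size = os.path.getsize(path) if os.path.exists(path) else 0
            with open(self.journal_path, "a") as j:
                j.write(f"{path}\0{size}\n")
                j.flush()
                os.fsync(j.fileno())
            self.seen.add(path)
        with open(path, "ab") as f:
            f.write(data)

    def commit(self):
        # Removing the journal marks the transaction as complete.
        os.remove(self.journal_path)


def rollback(journal_path):
    """Truncate every journaled file back to its recorded length.

    Returns False when there is no journal, i.e. nothing to undo.
    """
    if not os.path.exists(journal_path):
        return False
    with open(journal_path) as j:
        for line in j:
            path, size = line.rstrip("\n").split("\0")
            if os.path.exists(path):
                with open(path, "r+b") as f:
                    f.truncate(int(size))
    os.remove(journal_path)
    return True
```

In this sketch a leftover journal file is what tells the next command
that the previous transaction was interrupted, so calling rollback()
at startup would restore the old lengths.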

I guess (please correct me if I'm wrong) that git uses a "put the
reference after putting the data" scheme, plus a write-side lock in the
few places where it is needed.
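
What I mean by that scheme, as a rough Python sketch. This is a
simplification under stated assumptions, not git's implementation:
write_object() and update_ref() are hypothetical names, the flat
objects/ and refs/ layout and the ".tmp" suffix are made up, and real
git adds a header to each object and compresses it before storing.
The point is only the ordering: the data is written first under a
name derived from its content, and the reference is switched
afterwards in one atomic step, under a lock file.

```python
import hashlib
import os


def write_object(repo, data):
    """Store data under its SHA-1 name; existing objects are never modified."""
    name = hashlib.sha1(data).hexdigest()
    path = os.path.join(repo, "objects", name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    if not os.path.exists(path):
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # the object appears whole or not at all
    return name


def update_ref(repo, ref, name):
    """Point ref at an object that is already safely on disk."""
    path = os.path.join(repo, "refs", ref)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    lock = path + ".lock"
    # O_EXCL makes creation fail if another writer holds the lock.
    fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(name + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(lock, path)  # readers see the old or the new value
    except BaseException:
        os.unlink(lock)
        raise
```

If the process dies between the two calls, the object is left
unreferenced (the prune-able crud mentioned below), but no reference
ever points at missing data.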
 
>> In git both situations result in some prune-able and fsck-visible crud in
>> repository, but repository stays uncorrupted, and all operations are atomic
>> (all or nothing).
> 
> If a Mercurial transaction is interrupted and not rolled back, the
> result is prune-able and fsck-visible crud. But this doesn't happen
> much in practice.
> 
> The claim that's been made is that a) truncate is unsafe because Linux
> has historically had problems in this area and b) git is safer because
> it doesn't do this sort of thing. 
> 
> My response is a) those problems are overstated and Linux has never
> had difficulty with the sorts of straightforward single writer
> operations Mercurial uses and b) normal git usage involves regular
> rewrites of data with packing operations that makes its exposure to
> filesystem bugs equivalent or greater.

Rewrites in git perhaps are (or should be) regular, but they need not be
frequent. And with the new idea/feature of kept packs, a rewrite need not
cover the full data.

One command which _is_ (a bit) unsafe in git is git-prune. I'm not sure
if it could be made safe. But not pruning only affects repository size a
bit (where I think git is the best of all SCMs), not performance.

On the other hand, the hg repository structure (namely the log-like,
append-only changelog / revlog used to store commits) makes it, I think,
hard to have multiple persistent branches.

Sidenote 1: it looks like git is optimized for speed of merge and checkout
(branch switching, or going to a given point in history for bisect), and
probably accidentally for multi-branch repos, while Mercurial is optimized
for speed of commit and patch.

Sidenote 2: the Mercurial repository structure might make it use "file-ids"
(perhaps implicitly), with all the disadvantages of those (different
renames on different branches).

> In either case, both provide strong integrity checks with recursive
> SHA1 hashing, zlib CRCs, and GPG signatures (as well as distributed
> "back-up"!) so this is largely a non-issue relative to traditional
> systems.

Integrity checks can tell you that a repository is corrupted, but it would
be better if it didn't get corrupted in the first place.
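
To illustrate the distinction, a small Python sketch of a
content-addressed check. The store() and fsck() names and the
in-memory dict are assumptions for illustration only, not how either
tool stores data. Naming data by its SHA-1 lets a checker notice
damage after the fact, but nothing in it stops the damage from
happening.

```python
import hashlib


def store(objects, data):
    """Save data in the dict under the SHA-1 of its content."""
    name = hashlib.sha1(data).hexdigest()
    objects[name] = data
    return name


def fsck(objects):
    """Return the names whose content no longer hashes to the name."""
    return [
        name
        for name, data in objects.items()
        if hashlib.sha1(data).hexdigest() != name
    ]
```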

Besides: zlib CRC for Mercurial? I thought that hg didn't compress the
data, and only stored it as delta chains?
-- 
Jakub Narebski
Poland