Re: long fast-import errors out "failed to apply delta"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Dmitry,

Dmitry Ivankov wrote:

> 3) Is this breakage specific to tags/v1.7.3-rc0~75^2 "Teach
> fast-import to import subtrees named by tree id" 30 Jun 2010 (allowing
> M 040000 <tree id> pathname)?

Probably. :)

> The harder fix is to try to keep e->versions[0].sha1 for trees correct.

Context for those who don't remember all the details:

When first introduced, "struct tree_entry" was very simple: a mode, a
filename, and:

 - for regular objects and symlinks, a blob object name representing
   its content;

 - for subdirectories, a pointer to "struct tree_content" listing its
   contents, along with an _optional_ cached tree object name.

When modifying a tree entry, fast-import would walk through the path
to it and invalidate the cached tree names at each step.  Shawn wrote:

	Currently trees and commits aren't being deltafied when written to
	the pack and branch reloading from the current pack doesn't work,
	so at most 5 branches can be worked with at any one time.

but the advantage was that it was very simple.  Later, delta
compression arrived:

	commit 4cabf858
	Author: Shawn O. Pearce <spearce@xxxxxxxxxxx>
	Date:   Mon Aug 28 12:22:50 2006 -0400

	    Implemented tree delta compression in fast-import.

	    We now store for every tree entry two modes and two sha1 values;
	    the base (aka "version 0") and the current/new (aka "version 1").
	    When we generate a tree object we also regenerate the prior version
	    object and use that as our base object for a delta.  This strategy
	    saves a significant amount of memory as we can continue to use the
	    atom pool for file/directory names and only increases each tree
	    entry by an additional 24 bytes of memory.

In other words, this commit introduced a "prior" mode and tree or blob
name to give the pack-writing machinery a hint about what to delta
against.

With that in mind, it seems very weird that the version 0 tree would
ever be changed and need to have its object name invalidated.  Perhaps
we are releasing some memory too early and it is getting clobbered?
Unless I am missing something, the patch

> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -1514,6 +1514,9 @@ static int tree_content_set(
>                                 if (e->tree)
>                                         release_tree_content_recursive(e->tree);
>                                 e->tree = subtree;
> +                               if (S_ISDIR(mode)) {
> +                                       hashclr(e->versions[0].sha1);
> +                               }
>                                 hashclr(root->versions[1].sha1);
>                                 return 1;

just disables deltas for trees that have been modified by a "M 040000"
command, so it feels more like a workaround than a fundamental fix.

Could you save the svn-fe output (e.g., by introducing "tee" in the
middle of the "svn-fe | fast-import" pipeline) and put it up
somewhere online?  This would also be a good starting point for coming
up with a reduced testcase.

Hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]