Hi, Jonathan Nieder <jrnieder@xxxxxxxxx> writes: > (+cc David, who I think mentioned wishing for something like this) > > Raja R Harinath wrote: > >> Subject: fast-import: export correctly marks larger than 2^20-1 > > Thank you! That would be a very good thing. I needed this so that I could maintain two 'marks' counters, one for commits counting up from 1, and one for files counting down from some large value, where the file marks could be re-used. http://gitorious.org/~harinath/svn2git/rrh-svn2git/commit/ffc5270a6fa106fecad1a6a9f1520ca8f075c6b7 However, the 1M limit on marks is a bit too tight. For instance, the mono SVN repository has 160K commits, with one project alone using 60K commits. And one of the commits touched nearly 24K files. The number of marks used is uncomfortably close: within one order of magnitude of the limit. I'd like 10 more bits of breathing space, please :-) >> dump_marks_helper() has a bug when dumping marks larger than 2^20-1, >> i.e., when the sparse array has more than two levels. The bug was >> that the 'base' counter was being shifted by 20 bits at level 3, and >> then again by 10 bits at level 2, rather than a total shift of 20 bits >> in this argument to the recursive call: > > I haven’t read or grokked that code you are changing, so I can’t > comment on the substance of your patch. In case no one with such > knowledge turns up, could you give a quick summary of what the > existing code does and why? This is not particular quick or clear, but here goes. The marks are stored in a sparse array data structure. The sparse array is represented as a 1024-tree, with object storage only at the leafs, and every path from root to leaf being the same length. So, each node can, and does, have the notion of a level, which is represented by a number of bits to shift. When you try to lookup at a non-leaf, the lookup index can be shifted right the given number of bits, and masked to 10 bits [1], to get the sub-tree to continue lookup in. IOW, all the indices in a subtree have a common prefix. In dump_marks_helper, the 'base' argument contains that common prefix to the current tree. In the recursive case, the common prefix of the sub-tree is computed by taking this 'base', and adding to it the contribution from this node, i.e., k << shift [2]. The bug was that 'base' was being shifted too, even though it already had the correct value from its caller. - Hari [1] The code actually does something different, but equivalent. It masks off the top bits immediately before recursing. However, it's easier to explain dump_marks_helper if we think of the index surviving to the leaf. [2] Now that I think of it, the other fix, that I called more elegant, is harder to explain. So, let's not go with that. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html