Re: Extremely slow progress during 'git reflog expire --all'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Apr 04, 2010 at 11:22:18AM -0700, Junio C Hamano wrote:

> Thanks for the analysis, but expire_reflog() that is run for each ref
> already does that, I think.  It first runs mark_reachable(), then walks
> each reflog entry for the ref to call expire_reflog_ent(), which in turn
> calls unreachable() that first checks if mark_reachable() has marked the
> commit, and if so we don't run in_merge_bases().

Hmm. It looks like mark_reachable() stops traversing when it hits a
commit older than expire_total. I imagine that's to avoid going all the
way to the roots. But if we hit any unreachable entry, in_merge_bases()
is going to have to go all the way to the roots, anyway.

If we just marked everything, couldn't we then trust the REACHABLE
value and not have to do the in_merge_bases double-check? It has worse
best-case cost (since we always go to the root), but better worst-case
(since we can potentially go to the roots for each unreachable reflog
entry).

For example:

> +	/*
> +	 * Unless there was a clock skew, younger ones that are
> +	 * reachable should have been marked by mark_reachable().
> +	 */
> +	if (cb->cmd->expire_total < commit->date)
> +		return 1;
> +
>  	if (in_merge_bases(commit, &cb->ref_commit, 1))
>  		return 0;

If we haven't done a "reflog expire" in a while, won't we have a bunch
of old commits that will still need double-checked, and produce the same
slow behavior? Or is that what your "clock skew" is meant to mean? That
we would have removed those old ones already assuming the commit date
and the reflog entry for them match up? That's not necessarily true if
you move to an older commit, which freshens its reflog entry compared to
the commit date.

I wonder if, in addition to your patch, we should remove the
double-check in_merge_bases and simply report those old ones as
reachable. We may be wrong, but we are erring on the side of keeping
entries, and they will eventually expire in the regular cycle (i.e., 90
days instead of 30).

All of that being said, your patch does drop Frans' case down to about
1s of CPU time, so perhaps it is not worth worrying about beyond that.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]