Re: [PATCH] travis-ci: run previously failed tests first, then slowest to fastest

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 01 Feb 2016 10:17:24 -0800

Clemens Buchacher <drizzd@xxxxxx> writes:

> Ok, then let's take a step back. I do not actually care if git diff and
> friends say the worktree is clean or not.

You may not, but many existing scripts people have do.

> But I know that I did not make
> any modifications to the worktree, because I just did git reset --hard.
> And now I want to use commands like cherry-pick and checkout without
> failure. But they can fail, because they essentially use git diff to
> check if there are worktree changes, and if so refuse to overwrite them.

Yes, exactly.

> So, if the check "Am I allowed to modify the worktree file?", would go
> the extra mile to also check if the worktree is clean in the sense that
> convert_to_worktree(index state) matches the worktree. If this is the
> case, then it is safe to modify the file because it is the committed
> state, and can be recovered.

So in essense, the proposed "fix" is "let's fix it in the right
way"?

The way we defined "would we lose some changes that are only in the
working tree?", aka "is the working tree dirty wrt the index?", has
been to check if "git add -u" would change the states in the index.
And for scripted Porcelains and end-user scripts, "git diff-files",
aka "what change would 'git add -u' make to the states in the
index?", has been the command to do the same check.

Your proposal is to redefine "is the working tree dirty?"; it would
check if "git checkout -f" would change what is in the working tree.

I agree that indeed is "would we lose some changes that are only in
the working tree", and I think we can do that transparently for
"internal" commands, i.e. without any end-user impact, as the new
check would behave identically when they have sane contents--the
difference between the current check and the new check only exists
when the contents in the index contradicts what the user specifies
for to-git conversion via eol or clean filter.

We would need a way for our scripted Porcelains and end-user scripts
to ask that new question, though, but I think that is not something
insurmountable.  A new option to "diff-files" or something, perhaps,
would be workable, but having a new "git require-clean-work-tree"
plumbing, which would replace require_clean_work_tree shell helper
in git-sh-setup, may be conceptually much cleaner, because the new
definition of "working tree being clean" is no longer tied to what
"diff" should say.

I like that as a general direction.

> Regarding performance impact: We only need to do this extra check if the
> usual check convert_to_git(work tree) against index state fails, and
> conversion is in effect.

How would you detect the failure, though?  Having contents in the
index that contradicts the attributes and eol settings affects the
cleanliness both ways.  Running the working tree contents via to-git
conversion and hashing may match the blob in the index, declaring
that the index entry is "clean", but passing the blob to to-worktree
conversion may produce result different from what is in the
worktree, which is "falsely clean".  That is an equally important
case that is opposite from what we have been primarily discussing,
which is "falsely dirty".

>> Besides, I do not think the above approach really solves the issue,
>> either.  After "git reset --hard" to have the contents in the index
>> dumped to the working tree, if your core.autocrlf is flipped,
>
> Indeed, if the user configuration changes, then we cannot always detect
> this (at least if the filter is an external program, and the behavior of
> that changes). But the user is in control of that, and we can document
> this limitation.

That argument does not result in a very useful result, though.
Because the user is in control of what attributes and eol settings
are in effect in her repository, we can just document that the
current check will give unspecified result if the indexed contents
contradict with that setting, e.g. when you have CRLF encoded data
in the index but the eol conversion assumes LF in the repository.
But this discussion is an attempt to do better than that, no?

> On the other hand, a user who simply follows an upstream repository by
> doing git pull all the time, and who does not make changes to their
> configuration, can still run into this issue, because upstream could
> change .gitattributes. This part we could actually detect by hashing the
> attributes for each index entry, and if that changes we re-evaluate the
> file state.

If this has to bloat each index entry, I do not think solving the
problem is worth that cost of that overhead.  I'd rather just say
"if you have inconsistent data, here is a workaround using 'reset'
and then 'reset --hard'" and be done with it.

> This is also an issue only if a smudge filter is in place. The eol
> conversion which only acts in the convert_to_git direction is not
> affected.

IIRC, autocrlf=true would strip CR at the end of line in to-git
conversion, and would add CR in to-worktree conversion.  So some eol
conversion may only act in to-git, but some others do affect both,
and without needing you to touch attributes.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html