Re: [PATCH] travis-ci: run previously failed tests first, then slowest to fastest

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/25, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:
>
> > On 01/24, Junio C Hamano wrote:
> >> To put it differently, if a blob stored at path has CRLF line
> >> endings and .gitattributes is changed after the fact to say that it
> >> must have LF line endings, we should treat it as a broken transitory
> >> state.
> >
> > Right, I wasn't considering this as a broken state, because t0025 uses
> > just this to transition between the states.
> >
> >> There may have to be a way to "fix" an already "wrong" blob
> >> in the index that is milder than "rm --cached && add .", but I do
> >> not think write_entry(), which is shared by all the normal codepaths
> >> that writes out to the working tree, is the best place to do so, if
> >> doing so forces the contents of the paths to be always re-checked,
> >> just in case the user is in such a broken transitory state.
> >
> > Maybe I'm misunderstanding something, but the contents of the paths
> > are only re-checked if we are in such a broken transition state, and
>
> What I do not understand here is how the added check ensures that
> "only if in such a broken transition state".  would_convert_to_git()
> does not take the contents but is called only with the pathname to
> key into the attributes, so in a typical cross platform project
> where all the source files are "text" and the repository can set
> core.eol to adjust the end of line convention for its working tree,
> the added check has no way to differentiate the paths that are
> recorded with CRLF line endings in the object database by mistake,
> i.e. the ones in the broken transitory state, and all the other
> paths that are following the "text" and storing their blobs with LF
> line endings.  The added check would trigger "is it racy" check to
> re-reads the contents that we have written out from the working tree
> for the paths with "wrong" blobs, but how would it avoid doing so
> for the paths whose blobs are already stored correctly?
>
> The new code affects not just "reset --hard", but everybody who
> writes out from the object database to the working tree and records
> that these paths are checked out in the index.  How does the new
> code avoid destroying the performance for all paths?

I misunderstood the way would_convert_to_git() works, I should have
actually read the code, instead of just relying on my test, which was
even wrong.  Sorry about the confusion, my patch does indeed hurt
the performance.

> > the file stored in git has crlf line endings, and thus would be
> > normalized.  In this case we currently re-check the contents of that
> > file anyway, at least when the file and the index have the same mtime,
> > and we actually show the correct output.
>
> The right way to at that "correct output", I think, is that it
> happens to be shown that way by accident.  It is questionable that
> it is correct to report that such a path is modified.  Immediately
> after you check out a path to the working tree out of the index, via
> "reset --hard" and "checkout $path", by definition it ought to be
> the same between the working tree file and the index.
>
> Unless it is in this transititory broken state, that is.
>
> The "by accident" happens only because racy-git avoidance is being
> implemented in one particular way.  Is about protecting people from
> making changes to the working tree files immediately after their
> previous contents are registered to the index (and the index files
> written to the disk), and immediately after we write things out of
> the index and by definition the result and the indexed blob ought to
> match there is no reason to re-read and recheck unless the working
> tree files are further edited.
>
> The current way the racy-git avoidance works by re-reading and
> re-checking the contents when the timestamps match is merely one
> possible implementation, and that is the only thing that produces
> your "correct" output most of the time in this test.  We could have
> waited after writing the index time for a second before giving the
> control back to the user instead, which would have also allowed us
> to implement the racy-git avoidance, and in that alternate world,
> all these transitory broken paths would have been correctly reported
> as unmodified.
>
> IOW, I would think the test in question is insisting a single
> outcome for an operation whose result is undefined, and it is
> harmful to twist the system by pessimizing the common cases just
> to cater to this transititory broken state.
>
> I am not saying that we shouldn't have support for users to fix
> their repository and get out of this transititory broken state.  A
> recent work by Torsten Bögershausen to have ls-files report the end
> of line convention used in the blob in the index and the settings
> that affect conversion for each path (among other things) is a step
> in the right direction.  With a support like that, those who noticed
> that they by mistake added CRLF files to the index as-is when they
> wanted their project to be cross platform can recover from it by
> setting necessary attributes (i.e. mark them as "text") and then
> find paths that are broken out of "ls-files --eol" output to see
> which ones are not using lf end-of-line in the index.
>
> I do not think there is a canned command to help dealing with these
> broken paths right now.  You would have to check them out of the
> index (you would get a CRLF file in the working tree in the example
> we are discussing), fix the line endings (you would run dos2unix on
> it in this example, as you would want "text=true" attribute) and
> "git add" them to recover manually, but I can imagine that Torsten's
> work can be extended to do all of these, without molesting the
> working tree files, with minimum work by the end user.  That is:
>
>  * Reuse Torsten's "ls-files --eol" code to find paths that record
>    the blob in the index that does not follow the eol convention
>    specified for the path;
>
>  * For each of these index entries, run convert_to_working_tree() on
>    the indexed contents, and then on the result of it, run
>    convert_to_git().  The result is the blob that the index ought to
>    have had, if it were to be consistent with the attribute
>    settings.  So add that to the index.
>
>  * Write the index out.
>
>  * Tell the user to commit or commit it automatically with a canned
>    log message "fix broken encoding" or something.

Thanks for the thorough explanation, and the patch in the next email,
I'll have a look at that tomorrow.

--
Thomas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]