Re: [PATCH 2/6] diff: ignore sparse paths in diffstat

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/20/2021 5:32 PM, Elijah Newren wrote:
> On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>>
>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>
>> The diff_populate_filespec() method is used to describe the diff after a
>> merge operation is complete, especially when a conflict appears. In
>> order to avoid expanding a sparse index, the reuse_worktree_file() needs
>> to be adapted to ignore files that are outside of the sparse-checkout
>> cone. The file names and OIDs used for this check come from the merged
>> tree in the case of the ORT strategy, not the index, hence the ability
>> to look into these paths without having already expanded the index.
> 
> I'm confused; I thought the diffstat was only shown if the merge was
> successful, in which case there would be no conflicts appearing.

That's my mistake. I'll edit the message accordingly.
 
> Also, I'm not that familiar with the general diff machinery (just the
> rename detection parts), but...if the diffstat only shows when the
> merge is successful, wouldn't we be comparing two REVS (ORIG_HEAD to
> HEAD)?  Why would we make use of the working tree at all in such a
> case?  And, wouldn't using the working tree be dangerous...what if
> there was a merge performed with a dirty working tree?
> 
> On a bit of a tangent, I know diffcore-rename.c calls into
> diff_populate_filespec() as well, and I have some code doing merges in
> a bare repository (where there obviously is no index).  It seemed to
> be working, but given this commit message, now I'm wondering if I've
> missed something fundamental either in that implementation or there's
> something amiss in this patch.  Or both.  Maybe I need to dig into
> diff_populate_filespec() more, but it seems you already have.  Any
> pointers to orient me on why your changes are right here (and, if you
> know, why diffcore-rename.c should or shouldn't be using
> diff_populate_filespec() in a bare repo)?

I think the cases you are thinking about are covered by this
condition before the one I'm inserting:

	/* We want to avoid the working directory if our caller
	 * doesn't need the data in a normal file, this system
	 * is rather slow with its stat/open/mmap/close syscalls,
	 * and the object is contained in a pack file.  The pack
	 * is probably already open and will be faster to obtain
	 * the data through than the working directory.  Loose
	 * objects however would tend to be slower as they need
	 * to be individually opened and inflated.
	 */
	if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
		return 0;

or after:

	/*
	 * Similarly, if we'd have to convert the file contents anyway, that
	 * makes the optimization not worthwhile.
	 */
	if (!want_file && would_convert_to_git(istate, name))
		return 0;

(This makes me think that I should move my new condition further down
so these two can be linked by context.)

Sounds like this is just an optimization, so it is fine to opt out of it
if we think the optimization isn't necessary. Outside of the sparse-checkout
cone qualifies.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux