Re: [RFC PATCH] diff: only prefetch for certain output formats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Sometimes "status" does need the actual blobs, if it's going to do
> inexact rename detection. Likewise if break-detection is turned on,
> though I don't think anything does it by default, and there's (I don't
> think) any config option to enable it.
> 
> So something like "git log --name-status -B -M" would probably regress
> from this (though only in speed, of course; I do think we can play a
> little loose with heuristics since we'd generate the correct answer
> either way).
> 
> You could get pretty specific by putting logic inside diffcore_rename(),
> which would know if anything is left over after exact rename detection,
> but I suspect just checking:
> 
>   (options->break_opt != -1 || options->detect_rename)
> 
> in diffcore_std() would be OK in practice.

Thanks for taking a look at this patch and for the pointers, especially
to rename detection. I investigated and found that in practice, with
respect to rename detection, options->detect_rename is insufficient to
determine exactly when we need to fetch; we need to fetch when
(for example) a file is deleted and another added, but not when a file
is merely changed, and these rules are not reflected in
options->detect_rename. These rules indeed are in diffcore_rename(), as
you mentioned, but putting logic inside diffcore_rename() (or copying
the same logic over to diffcore_std()) complicates things for too little
benefit, I think.

To add to this, rename detection is turned on by default, so it wouldn't
even fix the original issue with "status".

So I'll abandon this patch, at least until someone finds a use case for
diffing with no rename detection on a partial clone and would rather not
have a prefetch.

> >  - Is the whitelist of output_format constants the best?
> 
> I think it could be pared down a bit. For example, --raw doesn't
> need the blobs (aside from renames, etc, above). I think the same is
> true of --summary. You've already omitted --name-status and --name-only,
> which makes sense.
> 
> I think --dirstat, even though it is only showing per-file info, still
> relies on the line-level stat info. So it should stay.

Thanks for taking a look at this.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux