Re: [PATCH 0/9] respect binary attribute in grep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Feb 04, 2012 at 02:22:52PM -0500, Pete Wyckoff wrote:

> I took a look at this series.  It's nice.  My worry was that the
> extra open() of non-existent .gitattributes files in all the
> directories would cause performance problems across networked
> filesystems like NFS.

Yeah, I was able to measure a small slow-down on a quick grep even with
a warm cache. So it does take some extra effort, but I think the
correctness is worth it (and note that the slow down is in the tens of
milliseconds if you have a reasonable stat()).

If people have big trees on NFS (or some other slow-stat system) where
these lookups are actually a problem, I'd rather see a global option to
disable .gitattributes lookups for both diff and grep (i.e., a "trust
me, I'm not using gitattributes, and don't bother with stat" flag). In
practice, though, I think such a thing is not necessary because the
stat() is local to the file being examined (e.g., for "foo/bar/baz", we
look only at "foo/bar/.gitattributes", "foo/.gitattributes", and
".gitattributes", without having to touch other parts of the tree).

Anyway, thanks for doing some performance testing. More data is always
good.

> It could be plausible that deep directory structures with few
> grep-able files will suffer with this change.  For example, many
> big binary blobs in deep directory hierarchies, but also some
> useful files here and there.
>
> One could argue that with the use of .gitattributes to specify
> which blobs should not be searched, this series makes this faster
> by not having to to read the binary blobs at all.  And I'd be
> okay with that.

Yes, exactly. I think this will end up being a big win for such cases,
because the cost of loading even one large binary file from disk will
dwarf all of the stats. But it does depend on people marking their
binaries and using "-I".

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]