Re: [PATCH] cache-tree: replace a sscanf() by two strtol() calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> On one of my systems, sscanf() first calls strlen() on the buffer. But
> this buffer is not terminated by NUL. So git crashed.

Interesting.

"git grep sscanf next -- '*.c'" reports handful sscanf() there.
Most of them scan argv[], which are NUL terminated, so they
should be OK.  The rest in convert-objects.c, tree-walk.c, and
tree.c all scan the mode bits in tree objects, which will later
have the pathname component terminated with NUL, so that would
also be OK.

But I think your patch is wrong, and makes it ignore cache-tree
structure; I suspect you have two off-by-one errors and are
making buf and size out of sync.

> 	Maybe, a better solution would be to store the integers in 
> 	binary form. But I am not familiar with that part of git, and
> 	further, it would break setups which already have an index
> 	with cache-tree information.

In theory, it is stashed in the extension section of index so we
could define a new extension type, say "TRE2" and store the
information in the new format.  But this is purely a cache used
as optimiation, so we could just say, "make sure to save local
modifications before doing an update, then run 'rm .git/index &&
git-read-tree HEAD' please".

I've applied a fixed up one, but I am actually thinking about
ripping out the whole cache-tree thing and redoing it all in the
index.
 
Currently the index stores set of blobs after flattening the
hierarchical tree structure, losing the original "tree"
information.  We could instead store something that looks like
"ls-tree -t -r" output (plus the toplevel tree information which
"ls-tree -t -r" does not give you).  Just like an update-index
on the path t/t0000-basic.sh invalidates the cache-tree entry
for "" and "t/", we could either invalidate or recompute (I am
inclined to do the former to make things lazy) these "tree"
entries in the index.  This would be more direct way to store
what I am storing in the cache-tree.

Keeping the object names of unchanged subtrees available will
allow us to walk the index and a tree (or two or more trees) in
parallel in various applications.  "diff-index --cached" and
"read-tree -m" extract one entry from tree and index for each
blob, but when we have an up-to-date information for a subtree
in the index, and when that subtree matches the corresponding
subtree in the tree object diff-index or read-tree is reading,
the application can short-cut without reading anything in the
subtree.

-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]