On Friday 28 August 2009, Johannes Schindelin wrote:
> Hi,
>
> On Fri, 28 Aug 2009, Johan Herland wrote:
> > On Thursday 27 August 2009, Junio C Hamano wrote:
> > > "Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:
> > > > Yea, it was me. I still think it might be a useful idea, since
> > > > it allows you better density of loading notes when parsing the
> > > > recent commits. In theory the last 256 commits can easily be in
> > > > each of the 2/ fanout buckets, making 2/38 pointless for
> > > > reducing the search space. Commit date on the other hand can
> > > > probably force all of them into the same bucket, making it easy
> > > > to have the last 256 commits in cache, from a single bucket.
> > > >
> > > > But I thought you shot it down, by saying that we also wanted
> > > > to support notes on blobs. I happen to see no value in a note
> > > > on a blob, a blob alone doesn't make much sense without at
> > > > least an annotated tag or commit to provide it some named
> > > > context, and the latter two have dates.
> > >
> > > Yeah, and in this thread everybody seems to be talking about
> > > commits so I think it is fine to limit notes only to commits.
> >
> > Agreed. I'm starting to come around to the idea of storing them in
> > subtrees based on commit dates. For one, you don't have multiple
> > notes for one commit in the same notes tree. Also, the common-case
> > access pattern seems tempting.
> >
> > Dscho: Were there other problems with the date-based approach other
> > than not supporting notes on trees and blobs?
>
> It emphasized an implementation detail too much for my liking.
>
> And I would rather have some flexibility in the code as to _when_ it
> fans out and when not.
>
> So I can easily imagine a full repository which has only, say, 5
> notes. Why not have a single tree for all of those?

Yes, if you only have a handful of notes, the date-based approach is
definitely overkill.
On the other hand, if you only have a handful of notes, performance is
not going to be a problem in the first place, no matter which notes
structure you use...

> And I can easily imagine a repository that has a daily note generated
> by an automatic build, and no other notes. The date-based fan-out
> just wastes our time here, and even hurts performance.

What about a month-based fanout? Looking at the kernel repo with

  git log --all --date=iso --format="%ad" | cut -c1-7 | sort | uniq -c | sort -n

I find that commits are spread across 66 months, and the most active
month (2008-07) has 5661 commits. If we assume the one-note-per-commit
worst case, that gives up to 5661 notes per month-based subdir. Is that
too much? Doing

  for subdir in $(find . -type d); do
      echo "$(ls -1 "$subdir" | wc -l) $subdir"
  done | sort -n

shows me that the currently largest tree in the kernel has 985 entries
(include/linux), so a 5661-entry tree is probably larger than what git
is used to...

...just thinking that we should make things as simple as possible (but
no simpler), and if a month-based fanout works adequately in all
practical cases, then we should go with that...


...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net
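
To make the month-based fanout above concrete, here is a minimal
sketch of how a note's location could be derived from a commit. The
helper name note_path and the "YYYY-MM/<sha1>" layout are assumptions
made up for illustration; nothing in this thread fixes the actual
tree layout.

```shell
# Hypothetical helper (not part of git): map a commit's date and SHA-1
# to a path inside a month-based notes tree.
# Assumed layout: one subtree per month, named YYYY-MM, holding one
# entry per annotated commit.
note_path() {
    np_date=$1    # date as printed by --date=iso, e.g. "2008-07-15 12:34:56 +0200"
    np_sha1=$2    # full or abbreviated commit SHA-1
    # Same truncation as the "cut -c1-7" in the pipeline above: keep YYYY-MM.
    np_month=$(printf '%s' "$np_date" | cut -c1-7)
    printf '%s/%s\n' "$np_month" "$np_sha1"
}
```

With this layout, a lookup for a recent commit only needs the one
month subtree that matches its date, and the worst case measured
above (5661 commits in 2008-07) bounds the size of any single
subtree in the kernel repo.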