On Sun, Apr 05, 2009 at 12:04:12AM -0700, Robin H. Johnson wrote:

> Before I answer the rest of your post, I'd like to note that the matter
> of which choice between single-repo, repo-per-package, repo-per-category
> has been flogged to death within Gentoo.
>
> I did not come to the Git mailing list to rehash those choices. I came
> here to find a solution to the performance problem.

I understand. I know two ways to resolve this:
- by resolving the performance problem itself,
- by changing the workflow to something more accurate and better suited
  to the facts.

My point is that going from a centralized to a decentralized SCM strongly
breaks how developers and maintainers work. What you're currently
suggesting is a way to work with Git in a centralized way. This sucks.

To get things right with Git I would avoid shared and global
repositories. Gnome is doing it this way:

  http://gitorious.org/projects/gnome-svn-hooks/repos/mainline/trees/master

> The GSoC 2009 ideas contain a potential project for caching the
> generated packs, which, while having value in itself, could be partially
> avoided by sending suitable pre-built packs (if they exist) without any
> repacking.

Right. Waiting to see whether the GSoC project delivers something could
be an option.

> Also, I should note that working on the tree isn't the only reason to
> have the tree checked out. While the great majority of Gentoo users have
> their trees purely from rsync, there is nothing stopping you from using
> a tree from CVS (anonCVS for the users, master CVS server for the
> developers).
>
> A quick bit of stats run show that while some developers only touch a
> few packages, there are at least 200 developers that have done a major
> change to 100 or more packages.

That's a point that has to be reconsidered. Not the fact that at least
200 developers work on over 100 packages (this is really not an issue)¹
but the fact that they do that directly on the main repo/server.
The right way to do this is for each developer to send their work to the
maintainer². The main issue is better code review.

1. Some or all of the per-category repos can be tracked with a simple
   script.
2. The maintainers may or may not be the same developers as today.
   Adding a layer of maintainers in charge of EAPI review (for example)
   up to the package maintainers could help fix a lot of portage issues
   and would keep "simple developers" from doing crap on the main
   repo(s) that users download.

> And per-package numbers, because we DID do an experimental conversion,
> last year, although the packs might not have been optimal:
> - ~410MiB of content (w/ 4kb inodes)
> - 4.7GiB of Git total overhead, with a breakdown:
>   - 1.9GiB in inode waste
>   - 2.8GiB in packs

Ok.

> > One repo per category could be a good compromise assuming one separate
> > branch per package, then.
> Other downsides to repo-per-category and repo-per-package:

Let's forget repo-per-package.

> - Raises difficulty in adding a new package/category.
>   You cannot just do 'mkdir && vi ... && git add && git commit' anymore.

Right, but categories are not evolving that much.

> - The name of the directory for both of the category AND the package are not
>   specified in the ebuild, as such, unless they are checked out to the right
>   location, you will get breakage (definitely in the package name, and
>   about 10% of the time with categories).

Of course. Quite frankly, it's recoverable without pain.

A repo-per-category local workflow would be:

  $ git branch
    master
  * next
    package_one
    package_two
    [...]
  $ tree -a
  |-- .git
  |   |-- [...]
  |   [...]
  |-- package_one
  |   |-- ChangeLog
  |   |-- Manifest
  |   |-- metadata.xml
  |   |-- package_one-0.4.ebuild
  |   `-- package_one-0.5.ebuild
  |-- package_two
  |   |-- ChangeLog
  |   |-- Manifest
  |   |-- files
  |   |   |-- package_two.confd
  |   |   `-- package_two.rc
  |   |-- metadata.xml
  |   `-- package_two-0.7-r3.ebuild
  [...]
  $ git checkout package_one
  $ tree -a
  |-- .git
  |   |-- [...]
  |   [...]
  `-- package_one
      |-- ChangeLog
      |-- Manifest
      |-- metadata.xml
      |-- package_one-0.4.ebuild
      `-- package_one-0.5.ebuild
  $ <hack, hack, hack>
  $ git checkout next
  $ git merge package_one

> - Does NOT present a good base for anybody wanting to branch the entire
>   tree themselves.

Scriptable.

> We're already on track to drop the CVS $Header$, and thereafter, some of the
> ebuilds are already on track to be smaller. Here's our prototype
> dev-perl/Sub-Name-0.04.
> ====
> # Copyright 1999-2009 Gentoo Foundation
> # Distributed under the terms of the GNU General Public License v2
> MODULE_AUTHOR=XMATH
> inherit perl-module
> DESCRIPTION="(re)name a sub"
> LICENSE="|| ( Artistic GPL-2 )"
> SLOT="0"
> KEYWORDS="~amd64 ~x86"
> IUSE=""
> SRC_TEST=do
> ====
>
> We can have all the CPAN packages from CPAN author XMATH, with changing
> only the DESCRIPTION string. KEYWORDS then just changes over the package
> lifespan.

Sounds good.

-- 
Nicolas Sebrecht
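
Regarding footnote 1 above: a minimal sketch of what such a "simple
script" could look like, written in Python. It assumes one Git repo per
category, each cloned into a directory named after the category (so the
checkout location that the ebuilds depend on stays right). The base URL,
the category list and the function names are hypothetical, made up for
illustration; this is not an existing Gentoo tool.

```python
import subprocess
from pathlib import Path

# Hypothetical values, for illustration only.
BASE_URL = "git://git.example.org/gentoo"
CATEGORIES = ["dev-perl", "app-editors"]


def plan_sync(tree_root, categories, base_url):
    """Return one git command (as an argument list) per category repo.

    A category that is not checked out yet is cloned into
    tree_root/<category>; an existing checkout is fast-forwarded.
    """
    root = Path(tree_root)
    commands = []
    for category in categories:
        target = root / category
        if (target / ".git").is_dir():
            # Existing checkout: update it, refusing non-fast-forward merges.
            commands.append(["git", "-C", str(target), "pull", "--ff-only"])
        else:
            # Missing checkout: clone into a directory named after the category.
            url = f"{base_url}/{category}.git"
            commands.append(["git", "clone", url, str(target)])
    return commands


def sync(tree_root, categories, base_url, dry_run=True):
    """Print the planned commands, and run them unless dry_run is set."""
    for command in plan_sync(tree_root, categories, base_url):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    sync("/usr/portage", CATEGORIES, BASE_URL, dry_run=True)
```

With dry_run=True it only prints the git commands it would run, so the
plan can be checked before touching the tree.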