On Tue, 7 Apr 2009, Jakub Narebski wrote:

> On Tue, 7 Apr 2009, Nicolas Pitre wrote:
>
> > Having git-rev-list consume about 2G RSS for the enumeration of 4M
> > objects is simply unacceptable, period.  This is the equivalent of 500
> > bytes per object pinned in memory on average, just for listing objects,
> > which is completely silly.  We ought to do better than that.
>
> I had thought that the large amount of memory consumed by git-rev-list
> was caused by not-so-sequential access to a very large packfile (1.5GB+ if
> I remember correctly), which I thought causes the whole packfile to be
> mmapped and not only a window, plus a large number of objects in the 300MB+
> range or something; those two together would account for around 2GB.

The pack does not have to be mapped all at once.  On 32-bit machines, at
least, the total pack mappings cannot exceed 256MB by default.  On 64-bit
machines the default is 8GB, which might not work very well if the total
amount of RAM is lower than that.

Another consideration is the object layout in a pack.  Currently we have
tree and blob objects mixed together so as to get sequential pack access
when performing a checkout.  Maybe having trees packed together would
help a lot with object enumeration, since the blobs would not have to be
mapped at all.  It remains to be seen how that might impact other
operations, though.

> Besides, even if git-rev-list didn't take so much memory, object
> enumeration caching would still help with CPU load... admittedly less.

Yes, but let's not lose sight of all the drawbacks associated with
extra caching.  If we can get away without it then all the better.


Nicolas
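
For reference, the 500-bytes-per-object figure quoted in the message above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name bytes_per_object is hypothetical, and reading "2G" and "4M" as decimal quantities is an assumption (taking 2G as 2 GiB instead gives roughly 537 bytes per object).

```python
# Back-of-the-envelope check of the per-object memory figure from the thread.
# The inputs are the numbers quoted in the message; decimal units are assumed.

def bytes_per_object(rss_bytes: int, object_count: int) -> float:
    """Average resident memory per enumerated object (hypothetical helper)."""
    return rss_bytes / object_count

rss = 2 * 10**9       # "about 2G RSS"
objects = 4 * 10**6   # "4M objects"

per_object = bytes_per_object(rss, objects)
print(f"{per_object:.0f} bytes per object on average")
```

The result is an average over all enumerated objects, so it says nothing about which allocations actually dominate the resident set.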