On Sun, Mar 3, 2019 at 2:18 PM Christian Couder <christian.couder@xxxxxxxxx> wrote:
> One thing I am still worried about is if we are sure that adding
> parallelism is likely to get us a significant performance improvement
> or not. If the performance of this code is bounded by disk or memory
> access, then adding parallelism might not bring any benefit. (It could
> perhaps decrease performance if memory locality gets worse.) So I'd
> like some confirmation either by running some tests or by experienced
> Git developers that it is likely to be a win.

This is a good point. My guess is that pack access consists of two
parts: decompression (inflating zlib data and resolving delta objects,
which is just another form of compression) and the actual I/O. The
former is CPU bound and may take advantage of multiple cores. However,
the cache we have already helps reduce the CPU workload, so perhaps the
actual gain is not that much (or maybe we could just improve this cache
to be more efficient). I'm adding Jeff; maybe he has done some
experiments on parallel pack access.

The second benefit of parallel pack access is not about utilizing
processing power from multiple cores, but about _not_ blocking. I think
one example use case here is parallel checkout. While one thread is
blocked in the pack access code for whatever reason, the others can
still continue doing other work (e.g. writing a checked-out file to
disk) or even access the pack again to check more things out.

--
Duy
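
A minimal sketch of how the CPU-bound half of the guess above could be
checked by experiment. This is not Git code: it only inflates in-memory
zlib buffers serially and from a thread pool, and it does not model
delta resolution, the existing cache, or disk I/O. All names
(make_blobs, inflate_serial, inflate_parallel, timed) are hypothetical,
and the blob count and size are arbitrary assumptions.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_blobs(count=64, size=256 * 1024):
    """Build `count` zlib-compressed buffers standing in for packed objects."""
    blobs = []
    for i in range(count):
        # Distinct, highly compressible payload per blob (synthetic data,
        # not representative of real pack contents).
        payload = (b"%d:" % i + b"git object data ") * (size // 16)
        blobs.append(zlib.compress(payload))
    return blobs


def inflate_serial(blobs):
    """Inflate every blob one after another on the calling thread."""
    return [zlib.decompress(b) for b in blobs]


def inflate_parallel(blobs, workers=4):
    """Inflate the blobs from a thread pool, preserving input order.

    CPython's zlib module releases the GIL while inflating, so the
    threads can overlap on multiple cores.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, blobs))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    blobs = make_blobs()
    serial, t_serial = timed(inflate_serial, blobs)
    parallel, t_parallel = timed(inflate_parallel, blobs)
    # Both paths must produce identical data; only the timing may differ.
    assert serial == parallel
    print(f"serial:   {t_serial:.4f}s")
    print(f"parallel: {t_parallel:.4f}s")
```

The timings depend heavily on the machine and on the data, so a run of
this says little about real packs. A meaningful test would have to go
through the actual pack access code with the cache enabled.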