Thomas Rast <trast@xxxxxxxxxxx> writes:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> Thomas Rast <trast@xxxxxxxxxxxxxxx> writes:
>>
>>> I like the general idea, too, but I think there is a long way ahead, and
>>> we shouldn't hold up v5 on this.
>>
>> We shouldn't rush, only to keep some deadline, and regret it later
>> that we butchered the index format without thinking things through.
>> When this was added to the GSoC idea page, I already said upfront
>> that this was way too big a topic to be a GSoC project, didn't I?
>
> Let me spell out my concern. There are two v5s here:
>
> * The extent of the GSoC task.
>
> * The eventual implementation of index-v5 that goes into Git mainline.
>
> IMHO this thread is mixing up the two. There indeed must not be any
> rush in the final implementation of index-v5. However, the GSoC ends in
> less than two weeks, and I have to evaluate Thomas on whatever is
> finished until then.

This is the primary reason why I have recused myself from the Mentor
pool. My involvement in this thread is mostly about the latter. It
is not like "I do not really care about GSoC", but the maintainer
works for what is best for the project, not for GSoC schedule.

> AFAIK Thomas is now cleaning up the existing code to be in readable
> shape, using your feedback, which is great. However, the above
> suggestion is such a fuzzily-specified task that there is no way to even
> find out what needs to be done within the next two weeks.

Yes, it is the mentor's job to (1) keep an eye on the progress of
the student, (2) avoid giving a task that is too big to chew within
the given timeframe, and (3) help the student learn the skill to
break down large tasks to manageable pieces.

> Perhaps it
> makes sense, at this point, to wrap anything that ended up having _v[25]
> suffixes in an index_ops like Duy did.

Yes, I think that suggestion was a welcome input for the mentor and
the student (item (3) above).
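
To make the index_ops suggestion concrete, here is a rough sketch of what
such a table could look like. It is not code from the series or from
Duy's patches. Every name in it (index_sketch, index_ops_sketch,
index_ops_for, read_index_sketch and the stub readers and writers) is
made up for illustration, and the stubs do no real parsing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-core index; not Git's struct index_state. */
struct index_sketch {
	int version;      /* on-disk format version, 2 or 5 in this sketch */
	unsigned int nr;  /* number of entries held in core */
};

/* Hypothetical table of the operations that differ per on-disk format. */
struct index_ops_sketch {
	const char *name;
	int (*read_entries)(struct index_sketch *istate);
	int (*write_entries)(const struct index_sketch *istate);
};

/* Stubs: a real reader or writer would parse or emit the file here. */
static int read_entries_v2(struct index_sketch *istate) { istate->nr = 0; return 0; }
static int write_entries_v2(const struct index_sketch *istate) { (void)istate; return 0; }
static int read_entries_v5(struct index_sketch *istate) { istate->nr = 0; return 0; }
static int write_entries_v5(const struct index_sketch *istate) { (void)istate; return 0; }

static const struct index_ops_sketch v2_ops = { "v2", read_entries_v2, write_entries_v2 };
static const struct index_ops_sketch v5_ops = { "v5", read_entries_v5, write_entries_v5 };

/* Pick the table once; returns NULL for a version this sketch does not know. */
static const struct index_ops_sketch *index_ops_for(int version)
{
	switch (version) {
	case 2: return &v2_ops;
	case 5: return &v5_ops;
	default: return NULL;
	}
}

/* Format-agnostic entry point: callers never name a _v2 or _v5 function. */
static int read_index_sketch(struct index_sketch *istate)
{
	const struct index_ops_sketch *ops;

	assert(istate != NULL);
	ops = index_ops_for(istate->version);
	if (!ops)
		return -1; /* unsupported version */
	return ops->read_entries(istate);
}
```

The point of the shape is that the _v2/_v5 suffixes stay inside one file,
and the rest of the code only ever calls the format-agnostic entry point.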
> That's a long way from actually
> following through on the idea, though.

I think that is perfectly fine, both from the point of view of the
project maintainer (who officially does not give a whit about GSoC
schedule) and from the point of view of somebody who cares about the
health of the development community (and as one part of it, cares
about the GSoC student project).

If Git GSoC admins initially picked a project that is too large by
mistake, finishing a subpart of it that is of reasonable size and
polishing the result into a nice shape would be the best the student
can do, and the grading should be done on the quality of that
subtask alone. It may not directly help the project without the
remainder, but that is not the student's fault.

But as I am not part of the Mentor pool, what I wrote in this
paragraph is just my opinion.

> I think the part you snipped
>
>>> the loops that iterate over the index [...] either
>>> skip unmerged entries or specifically look for them. There are subtle
>>> differences between the loops on many points: what do they do when they
>>> hit an unmerged entry? Or a CE_REMOVED or CE_VALID one?
>
> is a symptom of the same general problem: the data structures are sound,
> but they are leaking all over the code and now we have lots of
> complexity to do even simple operations like "for each unmerged entry".

I do not think I was arguing against an updated cleaner API, so we
are in agreement. In fact, I was saying that the calling code
should be ported to such a cleaner API and in-core data structure
first, and only then an optimal on-disk representation of the
in-core data structure can be designed.

The mistaken title of this GSoC topic was one of the root causes of
the issues you are seeing, I think. It said "faster file format",
but file format is a result of a design of the code that uses the
data, not the other way around. That, and also the project scope is
too large for a summer student project as I said in the very
beginning.
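
As an illustration of what "for each unmerged entry" might look like
behind a cleaner API, here is a minimal sketch. It is not Git's code:
entry_sketch, index_entries_sketch, SKETCH_REMOVED,
for_each_unmerged_entry and count_entry are all hypothetical names, and
the structs carry only the fields the example needs. Skipping entries
marked as removed is one arbitrary answer to the question raised in the
quoted text, chosen only so that the policy lives in a single place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag standing in for a "removed entry" marker. */
#define SKETCH_REMOVED 0x1u

/* Hypothetical, heavily simplified index entry. */
struct entry_sketch {
	const char *path;
	unsigned int stage;  /* 0 = merged, 1..3 = unmerged stages */
	unsigned int flags;
};

/* Hypothetical in-core index: just an array of entries. */
struct index_entries_sketch {
	const struct entry_sketch *entries;
	size_t nr;
};

typedef int (*each_entry_fn)(const struct entry_sketch *ce, void *cb_data);

/*
 * Call fn for every unmerged entry that is not marked removed.
 * Stops early and returns fn's value as soon as fn returns nonzero.
 */
static int for_each_unmerged_entry(const struct index_entries_sketch *istate,
				   each_entry_fn fn, void *cb_data)
{
	size_t i;

	assert(istate != NULL && fn != NULL);
	for (i = 0; i < istate->nr; i++) {
		const struct entry_sketch *ce = &istate->entries[i];
		int ret;

		if (!ce->stage || (ce->flags & SKETCH_REMOVED))
			continue; /* merged or removed: not shown to fn */
		ret = fn(ce, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the entries it is shown. */
static int count_entry(const struct entry_sketch *ce, void *cb_data)
{
	(void)ce;
	++*(size_t *)cb_data;
	return 0;
}
```

A caller would pass count_entry and a pointer to a size_t counter. It no
longer needs to know how stages or flags are stored, which is the kind of
decoupling that would let the on-disk format change later.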