Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: >> It is a bit surprising that ref_iterator does not know which >> repository it is working in (regardless of "include broken" bit). >> Do you think it will stay that way? I have this nagging feeling >> that it won't, and having "struct repository *repository" pointer >> that always points at the repository the ref-store belongs to in a >> ref_iterator instance would become necessary in the longer run. > > I think it's better if it stays that way, so that callers that don't > want to access the ODB (all the callers that currently use > DO_FOR_EACH_INCLUDE_BROKEN) can be assured that the iterator won't do > that. If we had a non-NULL "struct repository *repository" parameter, a > future code change might inadvertently use it, thus causing a bug. Hmph, I am unlikely to be the one who is doing such a future code change, but anybody who equates having a pointer to the repository structure and the need to always validate the tips of refs with the object store associated to the repository would be poor future developers we wouldn't want their hands on our codebase. An in-core repository has a lot more than the object store. Besides, if we really want to have choice between "do check them" and "ignore broken" expressed cleanly in the code, it would be much better to be explicit about it, and a member in the context struct whose name is ".repo" is not it[*]. A reader would say "ok, I see a repo. What is that thing for? Can I reliably use it to figure out other things about the repository this reference enumeration is going on from it? Ah, no, it sometimes is NULL---what crazy thing is going on here???". Side note. If it were called .check_ref_tips_against_the_object_store_of_this_repository, it would be a different story. >> In which case, this .repo member this patch adds would become a big >> problem, no? If we were to validate objects at the tip of the refs >> against object store, we will always use the object store that >> belongs to the iterator->repository, so the only valid states for >> iterator->repo are either NULL or iterator->repository. That again >> is the same problem I pointed out already about the parameter the >> do_for_each_repo_ref() helper that is inviting future bugs, it seems >> to me. Wouldn't it make more sense to add >> >> * iterator->repository that points at the repository in which we >> are iterating the refs >> >> * a bit in iterator that chooses between "do not bother checking" >> and "do check the tip of refs against the object store of >> iterator->repository >> >> to avoid such a mess? Perhaps we already have such a bit in the >> flags word in the ref_iterator but I didn't check. > > If we need iterator->repository, then I agree with you. The bit could > then be DO_FOR_EACH_INCLUDE_BROKEN (which currently exists, and which I > am removing in this patch, but if we think we should keep it, then we > should keep it). I do not care too much about the bit itself. I have huge trouble with the idea that representing a single bit with an entire pointer to a repository, which can cause confusion down the line. Those who want to have an access to the repository does not have a reliable way to get to it, those who do set it can mistakenly set to point at an unrelated repository. > Having said all that, it may be better to have a non-NULL repository > reference in the iterator and retain DO_FOR_EACH_INCLUDE_BROKEN - at the Yes. > very least, this is a more gradual change and still leaves open the > possibility of turning the repository reference into a nullable one. > Callers that use DO_FOR_EACH_INCLUDE_BROKEN will have to deal with an > API that is unclear about what is being done with the repository object > being passed in, but that is the same as the status quo. I'll try it and > see how it goes (it will probably simplify patch 2 too). I do not think a structure member of type "struct repository" that signals anything but "we are not operating in a repository" by being NULL is a sane interface, so I do not see any positive value in leaving opent he possibility at all. The next person who would want to _optionally_ use a repository for some other purpose (other than "checking the validity of the object name") may be tempted to add another member .repo2 next to your .repo---and that would be insane, given that ref iterator will be iterating over a ref store of a single repo at any given time. It is much saner to have a single "this is the repository the refstore we are iterating over is attached to" instance, with separate bits "please do validate" and "do whatever check the second develoepr wanted to signal the need for with the .repo2 member". Thanks.