On 3/21/2023 2:27 PM, Jeff King wrote:
> On Tue, Mar 21, 2023 at 02:16:40PM -0400, Taylor Blau wrote:
>
>> On Tue, Mar 21, 2023 at 02:13:15PM -0400, Jeff King wrote:
>>> I'm not 100% sure on where these offsets come from. But it looks like
>>> they're coming from the bitmap lookup table. In which case a bogus value
>>> there should be an error(), and not a BUG(), I would think.
>>
>> They do come from the lookup table, yes. I'm not sure that I agree that
>> bogus values here should be an error() or a BUG(), or if I even have a
>> strong preference between one and the other.
>
> The usual philosophy we've applied is: a BUG() should not be
> trigger-able, even if Git is fed bad data. A BUG() should indicate an
> error in the program logic, and if we see one, there should be a code
> fix that handles the case.
>
> Whereas if I understand this correctly, if I corrupt the bitmap file on
> disk, we'd trigger this BUG().
>
> In many cases I think one could argue that it's kind of academic. But in
> this case we should be able to say "oops, the bitmap file seems corrupt"
> and skip using it, rather than bailing completely from the process.

It's not just academic. BUG() statements kill the process without
running important cleanup steps like deleting open .lock files or
outputting the final traces. This can be especially problematic when
we count on those operations in order to recover a repository from
such errors.

>> But I do think that trying to make it an error() makes it awkward for
>> all of the other callers that want it to be a BUG(), since the detail of
>> whether to call one or the other is private to bitmap_index_seek().
>>
>> We *could* open-code it, introduce a variant of bitmap_index_seek(),
>> make it take an additional parameter specifying whether to call one over
>> the other, *or* check the bounds ourselves before even calling
>> bitmap_index_seek().
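
To make the options above a bit more concrete, here is a rough sketch
of a seek helper that reports a bad offset to its caller instead of
calling BUG(). This is not a patch: "struct sketch_bitmap",
sketch_bitmap_seek() and sketch_use_lookup_offset() are made-up names
for illustration and are not the real pack-bitmap.c code, and the
fprintf() only stands in for error().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the real bitmap index struct. */
struct sketch_bitmap {
	const unsigned char *map; /* start of the mapped bitmap file */
	size_t map_size;          /* total size of the mapping */
	size_t map_pos;           /* current read position */
};

/*
 * Move the read position to an absolute offset.
 *
 * Returns 0 on success and -1 if the offset does not point at a
 * readable byte inside the mapping. On failure the position is left
 * untouched and the process keeps running, so the usual cleanup
 * (lock files, traces) still happens later.
 */
static int sketch_bitmap_seek(struct sketch_bitmap *b, size_t offset)
{
	if (offset >= b->map_size) {
		/* assumption: real code would call error() here */
		fprintf(stderr,
			"error: bitmap offset %zu out of range (size %zu)\n",
			offset, b->map_size);
		return -1;
	}
	b->map_pos = offset;
	return 0;
}

/*
 * Example caller: an offset read from the on-disk lookup table is
 * untrusted data, so a failed seek means "corrupt file, skip the
 * bitmap" rather than a bug in the program logic.
 */
static int sketch_use_lookup_offset(struct sketch_bitmap *b, size_t offset)
{
	if (sketch_bitmap_seek(b, offset) < 0)
		return -1; /* caller falls back to not using bitmaps */
	return 0;
}
```

The point of the sketch is only that the check lives in one place and
every caller sees its result; whether the real helper should take a
flag or grow a variant is the open question in this thread.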
>
> I'm mostly unconvinced of the value of bitmap_index_seek() doing
> checking at all, because it is too late in most of the cases. In fact it
> is only in this case that it is doing something useful, which makes me
> think that the check should be open-coded here.

If we universally check whether bitmap_index_seek() works, then there
is value. It avoids the existing ad-hoc checks in favor of always-on
checks (as well as avoiding potential disconnects between the check
and the seeked position in the future).

Thanks,
-Stolee