On Thu, Mar 07, 2024 at 04:40:46PM -0800, Junio C Hamano wrote: > Junio C Hamano <gitster@xxxxxxxxx> writes: > > > Patrick Steinhardt <ps@xxxxxx> writes: > > > >> The consequence is that `binsearch()` essentially always returns 0, > >> indicacting to us that we must start searching right at the beginning of > >> the block. This works by chance because we now always do a linear scan > >> from the start of the block, and thus we would still end up finding the > >> desired record. But needless to say, this makes the optimization quite > >> useless. > > >> Fix this bug by returning whether the current key is smaller than the > >> searched key. As the current behaviour was correct it is not possible to > >> write a test. Furthermore it is also not really possible to demonstrate > >> in a benchmark that this fix speeds up seeking records. > > > > This is an amusing bug. > > Having said all that. > > I have to wonder if it is the custom implementation of binsearch() > the reftable/basic.c file has, not this particular comparison > callback. It makes an unusual expectation on the comparison > function, unlike bsearch(3) whose compar(a,b) is expected to return > an answer with the same sign as "a - b". > > I just checked the binary search loops we have in the core part of > the system, like the one in hash-lookup.c (which takes advantage of > the random and uniform nature of hashed values to converge faster > than log2) and ones in builtin/pack-objects.c (both of which are > absolute bog-standard). Luckily, we do not use such an unusual > convention (well, we avoid overhead of compar callbacks to begin > with, so it is a bit of apples-to-oranges comparison). Very true, this behaviour cought me by surprise, as well, and I do think it's quite easy to get wrong. Now I would've understood if `binsearch()` were able to handle and forward errors to the caller by passing -1. And I almost thought that was the case because `restart_key_less()` can indeed fail, and it would return a negative value if so. But that error return code is then not taken as an indicator of failure, but instead will cause us to treat the current value as smaller than the comparison key. But we do know to bubble the error up via the pasesd-in args by setting `args->error = -1`. Funny thing though: I just now noticed that we check for `args.error` _before_ we call `binsearch()`. Oops. I will send a follow-up patch that addresses these issues. Patrick
Attachment:
signature.asc
Description: PGP signature