On Fri, Dec 10, 2021 at 03:27:24PM +0100, Johannes Schindelin wrote: > > I'm not sure if this is helpful or not, but this is the minimal fix I > > came up with that runs the testcase I showed earlier. It's basically > > just swapping out "int" for "ssize_t" for any variables we use to index > > the arrays (though note a few are themselves held in arrays, and we have > > to cross some function boundaries). > > > > I won't be surprised if it doesn't hit all cases, or if it even hits a > > few it doesn't need to (e.g., should "phase" be dragged along with "i" > > and "j" in the first hunk?). I mostly did guess-and-check on the > > test-case, fixing whatever segfaulted and then running again until it > > worked. I didn't even really read the code very carefully. > > > > I think you _did_ do more of that careful reading, and broke down the > > refactorings into separate patches in your series. Which is good. So I > > think what we'd want is to pick out those parts of your series that end > > up switching the variable type. My goal in sharing this here is just to > > show that the end result of the fix can (and IMHO should) be around this > > same order of magnitude. > > I am in favor of this patch. Will you have time to submit this with a > commit message? I'm not at all sure that it's sufficient. It avoids overflowing the cost array accesses, and was tested on a square input of 2^15. But: - some of the other ints may need changing, too (e.g., column_count). Probably 2^31 commits is out of reach in practice (and probably other parts of Git run into problems there anyway). But some of those arguments may just be (a->nr * b->nr), whereas I was testing overflow at (a->nr + b->nr)^2. - I threw around ssize_t willy-nilly. Some of those could be size_t, and I think Ævar's patches go in the direction of splitting the two, which is good. - some light refactoring may be helpful to split those cases ahead of time. So I was hoping Ævar would take this approach and run with it. I just reviewed his follow-up series, though, and it looks like it is still putting bounds-checks into the COST macro, which I think is not sufficient. -Peff