On 10/2/24 7:56 PM, Jeff King wrote:
On Wed, Oct 02, 2024 at 04:07:04PM +0000, Derrick Stolee via GitGitGadget wrote:
...
Good catch. Your analysis looks correct, though I had two questions I managed to answer myself: 1. This seems like an obvious use-after-free problem. Why didn't we catch it sooner? I think the answer is that most uses of the output_prefix callback happen via diff_line_prefix(), which was not broken by 394affd46d. It's only line-log that was affected. Building with ASan and running: ./git log --graph -L :diff_line_prefix:diff.c shows the problem immediately (and that your patch fixes it). Should we include a new test in this patch?
Thank you for sending a test in your second reply. I will include it in v2.
2. If the caller isn't freeing it, then who does? Should diffopt cleanup do so? The answer is "no", the pointer is not owned by that struct. In your cases (1) and (3), the caller does so after finishing with the diffopt struct. In case (2) it is effectively "leaked", but still reachable by the static strbuf. That's not great, and is a problem for eventual libification. But for now, we can ignore it as it won't trigger the leak-checker.
I agree with this, including the potential for cleaning up the static strbuf and using the 'data' parameter like the other functions.
It does make me wonder what leak Patrick saw that caused him to write 394affd46d, and whether we're now leaking in some case that I'm missing.
Looking at the change, I can only guess that it was the previous use of char *prefix = ""; that signaled that an unfreeable string was being assigned to a non-const pointer. This signals that _something_ is wrong with the function, but the way that the buffer is returned by the function pointer is suspicious, too.
I do think this would have been a harder mistake to make if the callback simply returned a "const char *" pointer. We'd lose the ability to show prefixes with embedded NULs, but I'm not sure that's a big deal. It would also help for line-log to use the existing helper rather than inventing its own. So together on top something like this (which I'd probably turn into two patches if this seems to others like it's valuable and not just churn):
I do agree that changing the return type will make this easier to prevent and the code should be easier to read as well. Your diffed patch looks pretty good. I made an attempt at guessing where you would have split them (first remove the duplicate method, then change the method prototype and callers). I even took some time to attempt to remove the static strbuf from diff_output_prefix_callback() in favor of using the 'data' member of the diff_options struct, but it was not incredibly obvious how to communicate ownership of the struct which would need to store both the graph struct and the strbuf. Perhaps this would be good for #leftoverbits. Thanks, -Stolee