On Sun, Oct 8, 2023 at 6:44 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> On Sun, Oct 08, 2023 at 06:45:02AM +0000, Elijah Newren via GitGitGadget wrote:
> > It turns out that AI is pretty good at making small fixes to documentation;
> > certainly not perfect, but it provides quite good signal. Unfortunately,
> > there is a lot to sift through. Some points about my strategy:
>
> Quite interesting ;-).
>
> I'm curious to learn a little bit more about your
> strategy beyond what you wrote:
>
>   - What tool did you use? ChatGPT? Something home-grown?

A mixture of gpt-4 and gpt-4-32k (I would have just used gpt-4, but
trying to give it a full file blows the token limit on several of
Git's documentation files). Also, it was sent to an internally hosted
instance. On this internal instance, it seemed to require passing the
api-version=2023-03-15-preview parameter. I don't really know what
that parameter means, but I suspect it might have been some
6-months-ish old version of gpt-4?

>   - (Assuming this was generated by some sort of LLM): what did you
>     prompt it with?

Note that it was exactly one file per prompt, which was as follows:

"""
For the asciidoc file below, are there any typos, grammatical errors, or wording problems? If so, please highlight them along with proposed corrections:
--------------------
${FILE_CONTENTS}
"""

If I had to do it over, I'd be much more explicit about the output
format. Probably, "Please respond by outputting the full file, with
any corrections included. If there are no corrections, simply output
the original file as-is." which would allow me to simply diff the
output and look at the changes. Also, I would probably specify that
"The ascii doc file starts three lines below, just after the line of
dashes", hoping that would help it avoid sometimes presuming that the
dashes were part of the file.

>   - What was the output format: the edited text in its entirety, or a
>     patch that can be applied on top?

My wording was unfortunately vague, so sometimes I got human prose
instructing me to make a change, sometimes I got a bulleted list in
the form "${old_text} -> ${new_text}", but most of the time it printed
the file (or a subset thereof) with corrections. I also had all the
output concatenated into one large file, which made it "fun" to work
through all the changes. Even when diffing files, I manually applied
any changes I saw to the actual file (which did risk introducing new
typos and missing some of the corrections, but did ensure I reviewed
everything).

Also, not only did I get different output formats, but there were many
times the file was cut off at some point. I sometimes assumed that
just meant there were no changes outside that region, but there were
times where there was only one change and it had given me hundreds of
lines of context around it before it cut off, so it did leave me with
the feeling it might have only processed or responded to part of the
file.

There were also several times where the changes it suggested were a
no-op, making me wonder if it just failed or something -- I looked at
it really closely (including sometimes piping the output through xxd,
and thus once noticed a change of tab-after-period to
space-after-period), but when it was responding with human prose and
said something like "Change the sentence that reads '${old_version}'
-> '${old_version}'", it made me wonder if something just went haywire
with the LLM and I should retry.

However, despite the above issues making me think there are more
documentation issues to be found with an LLM, I didn't re-check any
files unless I got an error with no output (e.g. excessive number of
tokens, or hitting rate limits on the API). I didn't bother, because
the firehose of changes it provided me even without those caveats was
far more than enough to deal with.
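
For anyone wanting to try the "do it over" approach described above
(ask for the whole file back, then diff it), here is a rough Python
sketch. It is only an illustration, not the script that was actually
used: the function names, the 20-dash separator, and the 80% line-count
threshold for guessing at a cut-off response are all assumptions, and
the call to the model itself is deliberately left out.

```python
import difflib

# Assumed separator; any unambiguous line of dashes would do.
SEPARATOR = "-" * 20


def build_prompt(file_contents: str) -> str:
    """Build a one-file-per-prompt request asking for the full file back."""
    return (
        "For the asciidoc file below, fix any typos, grammatical errors, "
        "or wording problems.\n"
        "Please respond by outputting the full file, with any corrections "
        "included. If there are no corrections, simply output the original "
        "file as-is.\n"
        "The asciidoc file starts three lines below, just after the line "
        "of dashes.\n"
        "\n"
        f"{SEPARATOR}\n"
        f"{file_contents}"
    )


def diff_response(original: str, response: str, name: str) -> str:
    """Return a unified diff; an empty string means the response was a no-op."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        response.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


def looks_truncated(original: str, response: str, tolerance: float = 0.8) -> bool:
    """Heuristic (an assumption): flag responses much shorter than the input."""
    return len(response.splitlines()) < tolerance * len(original.splitlines())
```

An empty result from diff_response() would flag the no-op case, and
looks_truncated() is a crude way to decide whether a file is worth
resending; both still leave every suggested change to be reviewed by
hand.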