Hi Konstantin,

On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:
> Hi, all:
>
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
>
> https://asciinema.org/a/643435
>
> A few notes:
>
> 1. This is obviously not a replacement for actually reading email, but can
> potentially be a useful asset for a busy maintainer who just wants a quick
> summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
> ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
> I know it's nothing compared to how expensive some of y'all's time is, and
> you can probably easily get that expensed by your employers, but for many
> others it's a pretty expensive toy. I managed to make it a bit cheaper by
> doing some surgery on the threads before feeding them to chatgpt (like
> removing most of the message headers and throwing out some of the quoted
> content), but there's a limit to how much we can throw out before the
> analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
> ChatGPT-3.5 to even process.
>
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.

Amusingly, I've run experiments on something comparable with my own e-mails
(I'd like a few-line summary before reading them), and thought about
summarizing long LKML threads so as to keep track of what's going on without
having to spend a lot of time on all of them.
I identified a number of shortcomings with this. I suspect that those most
interested in such output are either, a bit like me, not very active in kernel
development, or focused on a specific area and mostly wanting to stay aware of
ongoing changes in other areas they're not really familiar with. Because of
this, I couldn't figure out on what boundaries to cut the analysis:

  - If it's "since the last time I read my email", it can only be done locally
    and will be per-user.

  - If it's a summary of a finished thread, it's not that interesting, and
    it's better explained (IMHO) on LWN, where the hot topics are summarized
    and developed.

  - If it's the list of threads of the day, I suspect there are so many that
    it's unlikely I'd read all of them every evening or every morning.

I've been wondering if an interesting approach would be to only summarize long
threads, since most short ones are a patch, a review and an ACK and do not
need to be summarized. But I think that most of us, seeing a subject repeat
over many e-mails, will just look at a few of the exchanges there to get an
idea of what's going on. Ideally, having a link in each thread to a place
where a summary is kept could be nice, except that it's not how such tools
work: you certainly don't want to re-run the analysis on a whole thread every
time it grows by a few messages, due to processing time and cost.

Also, regarding processing costs, I've had extremely good results using the
Mixtral-8x7B LLM in instruct mode running locally. It has a 32k context like
GPT-4. And if that's not enough, given that most of a long thread's contents
is in fact quoted text, it could be sufficient to drop deeper quote indents,
preserving a response and its immediate context while dropping most of the
repetition (it cuts your example thread roughly in half). But this still takes
quite a bit of processing time: processing the 14 mails from the thread above
took 13 minutes on an 80-core Ampere Altra system (no GPU involved here).
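For what it's worth, the quote-trimming idea above could be sketched like
this (a minimal illustration, not what I actually ran; the function names and
the depth threshold of 1 are just assumptions for the example):

```python
def quote_depth(line: str) -> int:
    """Count leading '>' markers, tolerating the usual '> > ' spacing."""
    depth = 0
    i = 0
    while i < len(line):
        if line[i] == '>':
            depth += 1
            i += 1
        elif line[i] == ' ' and depth > 0:
            i += 1
        else:
            break
    return depth


def trim_quotes(body: str, max_depth: int = 1) -> str:
    """Keep each message's own text plus max_depth levels of quoted
    context, and drop anything quoted deeper, before feeding the
    thread to the LLM."""
    kept = [ln for ln in body.splitlines() if quote_depth(ln) <= max_depth]
    return "\n".join(kept)
```

With max_depth=1 each reply keeps the text it directly responds to, and the
older quotes-of-quotes (which already appear unquoted earlier in the thread)
are the ones that get dropped.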
That's roughly 1 minute per e-mail, which adds up to a lot per day, not
counting the time needed to tune the prompt to get the best results! Overall,
while I think that some people might find "something like this" useful, most
of them would want it "slightly different" to be useful to them.

Just my two cents,
Willy