Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
> OK, so I understand how to clone archives from lore.kernel.org and how
> to convert a git archive to a maildir (thanks, Konstantin!)
>
> What I *don't* understand is how to effectively read this locally.
> Ideally I'd like to run mutt, possibly with notmuch for indexing.  But
> a maildir with 3M files seems impractical.  I did actually try it
> (without notmuch), but it takes mutt about 5 minutes to start up.  And
> the maildir is about 23G, compared with 7.5G for the git archive.

Right, relying on Maildir for long-term storage of giant archives
is not a usable solution with any general purpose FSes I know
about.  git itself had the same problem with loose object
scalability in the old days and packs were invented as a result.

> Any pointers?  I guess there's no mutt backend that can read a
> public-inbox archive directly?

There are mutt patches to support reading over NNTP, so that works:

    mutt -f news://$INBOX_HOST/$INBOX_NEWSGROUP

I don't think mutt handles mboxrd 100% correctly, but it's close
enough that you can download the gzipped mboxrd of a search
query and open it via "mutt -f /path/to/downloaded/mbox.gz"

    curl -XPOST -OJ "$INBOX_URL/?q=$SEARCH_QUERY&x=m"

POST is required(*), and -OJ lets it use the Content-Disposition:
header for a meaningful server-generated name, but you can also
redirect the result to whatever you want.

For all messages since March 1, you could use:

    SEARCH_QUERY=d:20190301..

All the supported search queries are documented in
$INBOX_URL/_/text/help/ and the search prefixes
(e.g. "d:", "s:", "b:") are modeled after what's in mairix.
You'll need to escape the queries for URIs
(e.g. " " => "+", and so on).

Xapian requires date ranges to be denoted with ".."
whereas mairix uses "-" for ranges.

The main thing public-inbox search misses from mairix is
support for "-t" which grabs non-matching messages from the same
thread.
I would like to support that someday, but don't have enough time
(or funding) to make it happen at the moment.

(*) to reliably avoid wasting resources from spiders/prefetchers

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
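
For anyone scripting the mbox download described in this message, the
URI escaping step (" " => "+", and so on) could look roughly like the
POSIX sh sketch below.  The helper names urlencode and mbox_url are
hypothetical and not part of public-inbox, the example.org URL is a
placeholder, and combining "d:" with an "s:" term is only an
illustration.  It is meant for ASCII queries; non-ASCII input is not
handled here.

```shell
#!/bin/sh
# Hypothetical helper: escape a search query for use in the q= parameter.
# Unreserved characters and ":" pass through, " " becomes "+", and any
# other character is percent-encoded.  ASCII input is assumed.
# The function body is a subshell so its variables do not leak.
urlencode() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~:-]) out=$out$c ;;
            ' ') out=$out+ ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Hypothetical helper: build the gzipped-mboxrd URL for an inbox URL ($1)
# and an unescaped search query ($2), in the "?q=...&x=m" form shown in
# the message.
mbox_url() {
    printf '%s/?q=%s&x=m\n' "$1" "$(urlencode "$2")"
}

# Example (placeholder URL, illustrative query); uncomment to download:
# INBOX_URL=https://example.org/inbox
# SEARCH_QUERY='d:20190301.. s:PCI'
# curl -XPOST -OJ "$(mbox_url "$INBOX_URL" "$SEARCH_QUERY")"
```

The functions only print strings, so the resulting URL can be checked
by eye before it is handed to curl.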