On Thu, Mar 25, 2010 at 11:02 AM, Allan McRae <allan@xxxxxxxxxxxxx> wrote:
> Upstream big update.
>
> Local changelog:
> - Removed the multibyte locale speed-up patch (and all the patches to fix
>   the issues it created...) as it is now included upstream.
> - Removed the other patches as it appears they are not being considered
>   upstream.
>
> Upstream NEWS:
> * Noteworthy changes in release 2.6 (2010-03-23) [stable]
>
> ** Speed improvements
>
>   grep is much faster on multibyte character sets, especially (but not
>   limited to) UTF-8 character sets.  The speed improvement is also very
>   pronounced with case-insensitive matches.
>

That's awesome. After all these years, I thought this would never happen :)

I did a quick benchmark before and after the update, and I got very similar
results, so we are good.

grep -i is still considerably slower than grep in UTF-8 (0.1s -> 1.5s, that
is, 15x slower), but IIRC it was MUCH worse with an unpatched grep 2.5,
something like hundreds of times slower.
With LANG=C, grep and grep -i are both at 0.1s.
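
For anyone who wants to repeat this kind of comparison, here is a rough sketch of how it could be timed. It is not the exact benchmark used above. The locale name en_US.UTF-8, the sample text, the pattern, and the helper names (grep_env, time_grep, slowdown) are all assumptions made for illustration, and the absolute numbers will depend on the machine and the input.

```python
import os
import shutil
import subprocess
import tempfile
import time


def grep_env(locale_name):
    """Copy the current environment, forcing the locale to locale_name."""
    env = dict(os.environ)
    env["LANG"] = locale_name
    env["LC_ALL"] = locale_name  # LC_ALL takes precedence over LANG
    return env


def time_grep(extra_args, pattern, path, locale_name, runs=3):
    """Best wall-clock time (seconds) of `grep -c [extra_args] pattern path`."""
    cmd = ["grep", "-c"] + list(extra_args) + [pattern, path]
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # grep exits with status 1 when nothing matches, so no check=True.
        subprocess.run(cmd, env=grep_env(locale_name),
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(baseline, measured):
    """How many times slower `measured` is compared to `baseline`."""
    return measured / baseline


def main():
    if shutil.which("grep") is None:
        print("grep not found, nothing to measure")
        return
    # Hypothetical sample input: UTF-8 text containing non-ASCII characters.
    with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                     delete=False) as sample:
        sample.write("Der schnelle braune Fuchs läuft über Zürich\n" * 50000)
        path = sample.name
    try:
        # en_US.UTF-8 is an assumption; use any UTF-8 locale that is installed.
        for locale_name in ("C", "en_US.UTF-8"):
            plain = time_grep([], "fuchs", path, locale_name)
            nocase = time_grep(["-i"], "fuchs", path, locale_name)
            print(f"{locale_name}: grep {plain:.3f}s, grep -i {nocase:.3f}s, "
                  f"ratio {slowdown(plain, nocase):.1f}x")
    finally:
        os.unlink(path)


if __name__ == "__main__":
    main()
```

Taking the best of a few runs reduces the effect of disk cache and scheduling noise. If the chosen UTF-8 locale is not installed, grep falls back to the C locale and the two rows will look the same.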