Hi Jeremy, On Sun, Jun 11, 2023 at 09:17:19AM +0100, Jeremy Sowden wrote: > The `shift` variable which indicates the offset in the string at which > to start matching the pattern is initialized to `bm->patlen - 1`, but it > is not reset when a new block is retrieved. This means the implemen- > tation may start looking at later and later positions in each successive > block and miss occurrences of the pattern at the beginning. E.g., > consider a HTTP packet held in a non-linear skb, where the HTTP request > line occurs in the second block: > > [... 52 bytes of packet headers ...] > GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n > > and the pattern is "GET /bmtest". > > Once the first block comprising the packet headers has been examined, > `shift` will be pointing to somewhere near the end of the block, and so > when the second block is examined the request line at the beginning will > be missed. > > Reinitialize the variable for each new block. > > Adjust some indentation and remove some trailing white-space at the same > time. > > Fixes: 8082e4ed0a61 ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2") > Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390 > Signed-off-by: Jeremy Sowden <jeremy@xxxxxxxxxx> > --- > lib/ts_bm.c | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/lib/ts_bm.c b/lib/ts_bm.c > index 1f2234221dd1..ef448490a2cc 100644 > --- a/lib/ts_bm.c > +++ b/lib/ts_bm.c > @@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state) > struct ts_bm *bm = ts_config_priv(conf); > unsigned int i, text_len, consumed = state->offset; > const u8 *text; > - int shift = bm->patlen - 1, bs; > + int bs; > const u8 icase = conf->flags & TS_IGNORECASE; > > for (;;) { > + int shift = bm->patlen - 1; This line is the fix, right? > text_len = conf->get_next_block(consumed, &text, conf, state); > > if (unlikely(text_len == 0)) > break; > These updates below are a clean up, right? If so, maybe split this in two patches I'd suggest? > while (shift < text_len) { > - DEBUGP("Searching in position %d (%c)\n", > - shift, text[shift]); > - for (i = 0; i < bm->patlen; i++) > + DEBUGP("Searching in position %d (%c)\n", > + shift, text[shift]); > + for (i = 0; i < bm->patlen; i++) > if ((icase ? toupper(text[shift-i]) > - : text[shift-i]) > - != bm->pattern[bm->patlen-1-i]) > - goto next; > + : text[shift-i]) > + != bm->pattern[bm->patlen-1-i]) Maybe disentagle this with a few helper functions? static char bm_get_char(const char *text, unsigned int pos, bool icase) { return icase ? toupper(text[pos]) : text[pos]; } Thanks > if ((icase ? toupper(text[shift-i]) > - : text[shift-i]) > + goto next; > > /* London calling... */ > DEBUGP("found!\n"); > -- > 2.39.2 >