On 2023-06-19, at 15:49:14 +0200, Pablo Neira Ayuso wrote:
> On Sun, Jun 11, 2023 at 09:17:19AM +0100, Jeremy Sowden wrote:
> > The `shift` variable which indicates the offset in the string at which
> > to start matching the pattern is initialized to `bm->patlen - 1`, but it
> > is not reset when a new block is retrieved. This means the
> > implementation may start looking at later and later positions in each
> > successive block and miss occurrences of the pattern at the beginning.
> > E.g., consider a HTTP packet held in a non-linear skb, where the HTTP
> > request line occurs in the second block:
> >
> >   [... 52 bytes of packet headers ...]
> >   GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n
> >
> > and the pattern is "GET /bmtest".
> >
> > Once the first block comprising the packet headers has been examined,
> > `shift` will be pointing to somewhere near the end of the block, and so
> > when the second block is examined the request line at the beginning will
> > be missed.
> >
> > Reinitialize the variable for each new block.
> >
> > Adjust some indentation and remove some trailing white-space at the same
> > time.
> >
> > Fixes: 8082e4ed0a61 ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2")
> > Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390
> > Signed-off-by: Jeremy Sowden <jeremy@xxxxxxxxxx>
> > ---
> >  lib/ts_bm.c | 16 +++++++++-------
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/ts_bm.c b/lib/ts_bm.c
> > index 1f2234221dd1..ef448490a2cc 100644
> > --- a/lib/ts_bm.c
> > +++ b/lib/ts_bm.c
> > @@ -60,23 +60,25 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
> >  	struct ts_bm *bm = ts_config_priv(conf);
> >  	unsigned int i, text_len, consumed = state->offset;
> >  	const u8 *text;
> > -	int shift = bm->patlen - 1, bs;
> > +	int bs;
> >  	const u8 icase = conf->flags & TS_IGNORECASE;
> >
> >  	for (;;) {
> > +		int shift = bm->patlen - 1;
>
> This line is the fix, right?

Yup.

> >  		text_len = conf->get_next_block(consumed, &text, conf, state);
> >
> >  		if (unlikely(text_len == 0))
> >  			break;
> >
>
> These updates below are a clean up, right? If so, maybe split this in
> two patches I'd suggest?

Sure.

> >  		while (shift < text_len) {
> > -			DEBUGP("Searching in position %d (%c)\n",
> > -				shift, text[shift]);
> > -			for (i = 0; i < bm->patlen; i++)
> > +			DEBUGP("Searching in position %d (%c)\n",
> > +			       shift, text[shift]);
> > +			for (i = 0; i < bm->patlen; i++)
> >  				if ((icase ? toupper(text[shift-i])
> > -				    : text[shift-i])
> > -					!= bm->pattern[bm->patlen-1-i])
> > -				     goto next;
> > +				     : text[shift-i])
> > +				    != bm->pattern[bm->patlen-1-i])
>
> Maybe disentagle this with a few helper functions?
>
> static char bm_get_char(const char *text, unsigned int pos, bool icase)
> {
> 	return icase ? toupper(text[pos]) : text[pos];
> }

Sure.

> Thanks
>
> >  				if ((icase ? toupper(text[shift-i])
> > -				    : text[shift-i])
> > +					goto next;
> >
> >  			/* London calling... */
> >  			DEBUGP("found!\n");
> > --
> > 2.39.2

J.
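
P.S. For anyone following the thread who wants to see the failure mode from the commit message in isolation, here is a minimal user-space sketch. It is not the kernel code: `struct block` and `block_search()` are hypothetical names made up for illustration, the search advances one position per mismatch instead of using the Boyer-Moore skip tables, and there is no case folding. The `reset_shift` flag only exists to switch between the old behaviour (`shift` initialized once) and the patched behaviour (`shift` reinitialized for each block).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one block of a non-linear skb. */
struct block {
	const char *data;
	size_t len;
};

/*
 * Search each block for pat, comparing backwards from position `shift`
 * in the style of bm_find(), but advancing by one on a mismatch
 * (no bad-character or good-suffix tables; this is only a sketch).
 *
 * reset_shift == 0: shift is set once, before the first block (old code).
 * reset_shift != 0: shift is set again for every block (patched code).
 *
 * Returns the offset of the match from the start of the first block,
 * or -1 if nothing is found. Matches spanning two blocks are not found.
 */
static long block_search(const struct block *blocks, size_t nblocks,
			 const char *pat, int reset_shift)
{
	size_t patlen = strlen(pat);
	size_t consumed = 0;
	size_t shift, b, i;

	if (patlen == 0)
		return -1;

	shift = patlen - 1;

	for (b = 0; b < nblocks; b++) {
		if (reset_shift)
			shift = patlen - 1;

		while (shift < blocks[b].len) {
			for (i = 0; i < patlen; i++)
				if (blocks[b].data[shift - i] != pat[patlen - 1 - i])
					goto next;

			return (long)(consumed + shift - (patlen - 1));
next:
			shift++;
		}

		consumed += blocks[b].len;
	}

	return -1;
}
```

With a first block of 52 filler bytes and a second block holding the request line from the commit message, searching for "GET /bmtest" with `reset_shift` set to 0 finds nothing: `shift` is left at 52 after the first block, which is already past the end of the 47-byte second block. With `reset_shift` set to 1 the match is reported at offset 52.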