[PATCH iptables] man: string: document BM false negatives

Jeremy Sowden <jeremy@xxxxxxxxxx> · Sun, 11 Jun 2023 09:38:05 +0100

For non-linear skbs, the kernel's Boyer-Moore text-search implementation
may miss matches.  There's a warning about this in the kernel source.
Include that warning in the man page.

Signed-off-by: Jeremy Sowden <jeremy@xxxxxxxxxx>
---
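Not part of the patch: a minimal Python sketch of why a search that
treats each block separately can miss a match that straddles a block
boundary, while a matcher that carries its state across blocks (as a
KMP-style automaton can) still finds it. This is an illustration under
stated assumptions, not the kernel code: the function names and the
block split are hypothetical, and find_per_block() uses bytes.find()
as a stand-in for any per-block search, not a real Boyer-Moore.

```python
def find_per_block(blocks, pattern):
    """Search each block on its own; no state survives a block boundary.

    Stand-in for a matcher that cannot see across blocks (assumption:
    bytes.find() is used here instead of a real Boyer-Moore search).
    """
    offset = 0
    for block in blocks:
        idx = block.find(pattern)
        if idx != -1:
            return offset + idx
        offset += len(block)
    return -1


def kmp_failure(pattern):
    """Build the KMP failure table (longest proper prefix-suffix lengths)."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_streaming_kmp(blocks, pattern):
    """KMP over a sequence of blocks, keeping the match state between them."""
    if not pattern:
        return 0
    fail = kmp_failure(pattern)
    k = 0      # number of pattern bytes matched so far
    pos = 0    # absolute offset of the current byte
    for block in blocks:
        for byte in block:
            while k > 0 and byte != pattern[k]:
                k = fail[k - 1]
            if byte == pattern[k]:
                k += 1
            if k == len(pattern):
                return pos - len(pattern) + 1
            pos += 1
    return -1


# Hypothetical packet payload split across two discontiguous blocks.
split = [b"GET /ind", b"ex.html HTTP/1.1"]
whole = [b"GET /index.html HTTP/1.1"]
needle = b"GET /index.html"

print(find_per_block(whole, needle))       # contiguous data: found
print(find_per_block(split, needle))       # straddles the boundary: missed
print(find_streaming_kmp(split, needle))   # state carried over: found
```

With contiguous data both functions agree; only the split case differs,
which is the false negative the man-page text warns about.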
 extensions/libxt_string.man | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/extensions/libxt_string.man b/extensions/libxt_string.man
index 5f1a993c57eb..34a8755ba14e 100644
--- a/extensions/libxt_string.man
+++ b/extensions/libxt_string.man
@@ -29,3 +29,18 @@ iptables \-A INPUT \-p tcp \-\-dport 80 \-m string \-\-algo bm \-\-string 'GET /
 # The hex string pattern can be used for non-printable characters, like |0D 0A| or |0D0A|.
 .br
 iptables \-p udp \-\-dport 53 \-m string \-\-algo bm \-\-from 40 \-\-to 57 \-\-hex\-string '|03|www|09|netfilter|03|org|00|'
+.P
+Note: Since Boyer-Moore (BM) searches for matches from right to left and the
+kernel may store a packet in multiple discontiguous blocks, a match may be
+spread over more than one block, in which case this algorithm won't find
+it.
+.P
+If you wish to ensure that such a thing never happens, use the
+Knuth-Morris-Pratt (KMP) implementation instead. In conclusion, choose the
+proper string search algorithm depending on your setting.
+.P
+For example, if you're using the textsearch infrastructure for filtering, NIDS
+or any similar security-focused purpose, choose KMP. Otherwise, if you really
+care about performance, say you're classifying packets to apply Quality of
+Service (QoS) policies, and you don't mind missing matches spread over
+multiple fragments, choose BM.
-- 
2.39.2