No, I agree: you can never write a regular expression good enough to
match all, or even most, of the lingo we use.

I've been thinking about that, though. You could theoretically have the
computer do a sound comparison between two words, which is what the
soundex() function was made for (although it isn't very exact).

On 3/12/06, Julien Bonastre <julien@xxxxxxxxxxxxxxxx> wrote:
> Oh quite right, quite right
>
> I will never put my word down in stone and say that a particular
> rule/pattern or, for that matter, anything I say, can be held to 100%
> certainty
>
> Who can? Ha
>
> You are quite right Ludvig, we can only work with what we are given. I
> merely attempt to "imagine" all the possibilities. Some people strive to
> create them.
>
> It's a delicate balance, but at what point do we draw the line at
> automated user input verification and simply use actual human
> verification methods?
>
> A computer will have a much more difficult time recognising a word which
> we read as slang but has been creatively disguised by a fellow human
> brain, there is no doubt there.
>
> I merely tried to slightly improve his filtering regex pattern. It's
> never going to be perfect, but it's a workable example.
>
> Anyway, tata!
>
> ---oOo--- Allowing users to execute CGI scripts in any directory should
> only be considered if: ... a.. You have no users, and nobody ever visits
> your server. ... Extracted Quote: Security Tips - Apache HTTP
> Server ---oOo---
> ------oOo---------------oOo------
> Julien Bonastre [The_RadiX]
> The-Spectrum Network CEO
> ABN: 64 235 749 494
> julien@xxxxxxxxxxxxxxxx
> www.the-spectrum.org
> ------oOo---------------oOo------
>
> ----- Original Message -----
> From: "Ludvig Ericson" <ludvig.ericson@xxxxxxxxx>
> To: "Julien Bonastre" <julien@xxxxxxxxxxxxxxxx>
> Cc: <php-db@xxxxxxxxxxxxx>
> Sent: Sunday, March 12, 2006 11:31 AM
> Subject: Re: Database abuse help needed
>
> Well, no matter how long you spend on coding a regex - no sane one
> would capture all misspellings possible. It's impossible. Think of
> these: fukc, fucck, f uck, fu ck, fuc k, f ukc, fu kc, fuk c, fu kk,
> fawk, faak, fak, etc.
>
> There are quite a lot
>
> A not too sober Ludvig.
>
> On 3/12/06, Julien Bonastre <julien@xxxxxxxxxxxxxxxx> wrote:
> > Yes.. elitism ;-)
> >
> > That is I....
> >
> > The indentation, yes, formatting of emails across different clients
> > will always be an issue. Regardless though, and thankfully, my code
> > was only a few one liners, whereby the indentation didn't play a huge
> > role at all in representing statements and their conditional
> > execution basis [as there wasn't one :p ]
> >
> > Next, my snippet was an example, as I'm certain I mentioned.
> >
> > A slightly modified regex could be:
> >
> > /(fuc?k|dic?k|wank)(e(r|d|n)|hea?d|wit|ing?)?/i
> >
> > that would capture many more variations of these profanities and their
> > common derivatives and suffixes..
> >
> > [aside]
> > That I assume was where you were going with the "spelling" issue???
> > [/end of aside]
> >
> > What is unpredictable by the way?
> >
> > You seem as though you are targeting the regex patterns themselves.
> >
> > Remember, there is virtually no such thing as a "computer error",
> > only humans that don't know how to use the computers.
> >
> > If a regex behaves differently than what you expected, there is
> > beyond a 99.9999% certainty that it is due to not having formulated
> > the regex correctly.
> >
> > There have been many times when even I, yes, Supreme Commander of the
> > entire known and even undiscovered Universe, have forged together a
> > pattern, run it, achieved desired results, then realised later down
> > the track a certain word/condition it wasn't matching... Generally
> > this is due to overlooking some small condition in the pattern or a
> > particular situation you hadn't thought of.
> >
> > For example, in the above regex I didn't rule out strings like:
> > "F|_|CK"
> > "F\_/CK"
> > "D|CK"
> > "W/\NK"
> >
> > which do look like the word I want to ensure doesn't exist on the
> > site.
> >
> > Catch is? Before I run this regex I also ensure the string only
> > contains the following char classes: /[a-z0-9_-]/i
> >
> > There we go..
> >
> > Anyway, pick me more, please I love it!!!
> >
> > ----- Original Message -----
> > From: "Ludvig Ericson" <ludvig.ericson@xxxxxxxxx>
> > To: "Julien Bonastre" <julien@xxxxxxxxxxxxxxxx>
> > Cc: "Chris Payne" <cjp@xxxxxxxxxxxxxxxxx>; <php-db@xxxxxxxxxxxxx>
> > Sent: Sunday, March 12, 2006 12:18 AM
> > Subject: Re: Database abuse help needed
> >
> > Erm, dude, chill out with the elitism.
> > I think there's more than 2% knowing about regexes, and more than 5%
> > of those 2% that can write "oh-so-complex regular expressions"
> >
> > (Either GMail mangled the indentation or you need help with that part,
> > by the way >_>)
> >
> > Oh and you complain about it not catching spelling mistakes? Yours
> > doesn't either - want to know why? Because they're so unpredictable.
> >
> > Cheers, toxik
> >
> > On 3/11/06, Julien Bonastre <julien@xxxxxxxxxxxxxxxx> wrote:
> > > Well this is cute, really it is.
> > >
> > > Kudos to all the in_array ideas and so forth
> > >
> > > But really this is just an example.
> > >
> > > In reality this wouldn't work how you've planned.
> > >
> > > For example take this quite realistic possibility.
> > >
> > > Let's assume the word "bad" is in your array of bad words
> > >
> > > Now for realistic reasons I will tell you now that the word "bad" I
> > > am going to use stands for the word we all know exists as a
> > > derogatory slang form of human reproduction or cursing [it starts
> > > with an F in case you haven't figured it out yet, four letters, ends
> > > in K, got it yet? ]
> > >
> > > Now as we know this "bad" word can be written many ways, remember, I
> > > won't use the real word, just our safe-substitute:
> > > bad, bader, bading, baden, badhead, badwit, badoff, baded,
> > >
> > > and there may be many more I can't think of....
> > >
> > > Point being? Unless you do something more exotic than a precise word
> > > match then it won't get these suffixed versions, or even altered
> > > spelling versions.
> > >
> > > Now the next even larger problem?
> > >
> > > This in_array thing? It's cute, but if you have more than one word
> > > in any of your POST variables [which would be pretty safe to assume
> > > unless you have a bad habit of sending those one word subject, one
> > > word content, one word sender types of emails] then it won't work
> > > either
> > >
> > > If this is passed as say $_POST["name"]="You are a bad head!"
> > >
> > > your little snippet here will try to match "You are a bad head!" to
> > > singular words such as ["this" "is" "a" "bad" "word"]
> > >
> > > What you need is to break up each word in your string, then do some
> > > form of processing ;-)
> > >
> > > Ok ok, so you want the secrets now don't you??
> > >
> > > Ok try signing up at these sites with names like: root, radix,
> > > admin, or some common profanity, which is located anywhere in the
> > > username, alias, etc:
> > > http://www.befitcommunity.com
> > > www.the-spectrum.org
> > >
> > > Exactly..
> > >
> > > Now for my implementation I ONCE AGAIN "BAD"ING rely on my regular
> > > expressions
> > >
> > > OH SURPRISE SURPRISE, maybe they were invented for a purpose???
> > >
> > > It's ok, nevermind, it's a personal joke of mine on this list, it
> > > seems 2% of the PHP dev population is aware of what a regular
> > > expression is, and only 5% of those 2% know how to write a
> > > functioning OH SO difficult expression pattern..
> > >
> > > Here's the code [brace yourself, it's SOOOO advanced, took me a
> > > WHOLE 0 text books to master how to handle myself with a regular
> > > expression parser]:
> > >
> > > $SYSTEM["REX_FILTER"]=Array();
> > > $SYSTEM["REX_FILTER"]["user_name"]="/^[a-z]{2,}[a-z0-9\_\-]+$/i";
> > > $SYSTEM["REX_FILTER"]["password"]="/^[a-z0-9\_\-\ \!\.]+$/i";
> > > //$SYSTEM["REX_FILTER"]["password_chk"]="/([0-9]+[a-zA-Z\_\-\ ]+|[a-zA-Z\_\-\ ]+[0-9]+).*[0-9]*$/i";
> > > $SYSTEM["REX_FILTER"]["alias"]="/^[a-z0-9\.\_\-\!ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£¥]+$/i";
> > > $SYSTEM["REX_FILTER"]["email"]="/^[a-z\_0-9\.]+@[A-Za-z0-9\-]+\.[A-Za-z0-9\-]{2,}/i";
> > > $SYSTEM["REX_FILTER"]["name"]="/^[a-zÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£¥]+$/i";
> > > $SYSTEM["REX_FILTER"]["RESERVED_WORDS"]="/admin|web.+(master|root)|root|forum|profile|preview|befit/i";
> > > $SYSTEM["REX_FILTER"]["BANNED_WORDS"]="/(fuck|cunt|shit|wanker|dick([^ ]*(head|suck|lick)))/i";
> > >
> > > if(strlen($_POST["user_name"])<5 or strlen($_POST["user_name"])>32)
> > >   $errarr[]=$owner."user name must be between 5 and 32 characters [inclusive]";
> > > elseif(!preg_match($SYSTEM["REX_FILTER"]["user_name"],$_POST["user_name"]))
> > >   $errarr[]=$owner."user name must start with at least 2 alphabetical characters and must be followed by only alphanumerical characters and/or the following characters: - (hyphen) _ (underscore)";
> > > elseif(preg_match($SYSTEM["REX_FILTER"]["RESERVED_WORDS"],$_POST["user_name"]))
> > >   $errarr[]=$owner."user name contains reserved or system words";
> > > elseif(preg_match($SYSTEM["REX_FILTER"]["BANNED_WORDS"],$_POST["user_name"]))
> > >   $errarr[]=$owner."user name contains \"inappropriate\" or \"offensive\" words";
> > >
> > > Ok so first, that comes from two far and distant libraries on my
> > > site. The first part, with the Array definition, is contained in a
> > > global core variable definition library I have...
> > >
> > > It's basically just there to define the patterns I've chosen to use
> > > for particular fields. Easy enough?
> > >
> > > Then I have the second part, which uses the PCRE [perl compat reg
> > > exp] handler functions of PHP to attempt matching my patterns to the
> > > given inputs from the user.
> > >
> > > Easy right???
> > >
> > > Too easy, and extremely fast and effective...
> > >
> > > Feel free to pick me apart though, I'd love to hear all the negative
> > > things people have to say about regular expressions.
> > >
> > > They are like cars I find, everyone bitches about how expensive they
> > > are to run, but wouldn't we be BADed without them!?!?!?
> > >
> > > ----- Original Message -----
> > > From: "Chris Payne" <cjp@xxxxxxxxxxxxxxxxx>
> > > To: <php-db@xxxxxxxxxxxxx>
> > > Sent: Saturday, March 11, 2006 2:53 AM
> > > Subject: RE: Database abuse help needed
> > >
> > > > Ahhh thank you everyone,
> > > >
> > > > I came up with the same solution - kind of, but I used about 5
> > > > more lines of code to achieve the same thing as below so I was on
> > > > the same tracks just not quite as efficient :-)
> > > >
> > > > Chris
> > > >
> > > > Incorporating what Bastien said:
> > > >
> > > > $badWordsArray = array("these" ,"are", "bad", "words");
> > > > foreach($_POST as $key => $value){
> > > >   if( in_array($value, $badWordsArray) ){
> > > >     //$value was found in $badWordsArray
> > > >   }
> > > > }
> > > >
> > > > http://us2.php.net/in_array
> > > >
> > > > -----Original Message-----
> > > > From: Chris Payne [mailto:cjp@xxxxxxxxxxxxxxxxx]
> > > > Sent: Thursday, March 09, 2006 8:40 PM
> > > > To: php-db@xxxxxxxxxxxxx
> > > > Subject: RE: Database abuse help needed
> > > >
> > > > Thank you for that. And excuse the inexperience, but how would I
> > > > use an Array with the below? I mean say I had words such as
> > > > this,is,a,bad,word (Just as examples as I can't post what I'm
> > > > trying to block on here) how would I loop through those to check
> > > > if any of them exist and if they do THEN execute the error script?
> > > > I'm not too good with Arrays - but I'm learning.
> > > >
> > > > Thank you
> > > >
> > > > Chris
> > > >
> > > > If you POST from your form use $_POST, or $_GET for a form GET
> > > >
> > > > foreach($_POST as $key => $value){
> > > >   if( strpos($value, $findme) !== false ){
> > > >     //$findme was found in $value
> > > >   }
> > > > }
> > > >
> > > > http://php.net/manual/en/reserved.variables.php
> > > > http://us2.php.net/manual/en/control-structures.foreach.php
> > > > http://us2.php.net/strpos
> > > > Yes, that's !== or ===
> > > >
> > > > -----Original Message-----
> > > > From: Chris Payne [mailto:chris@xxxxxxxxxxxx]
> > > > Sent: Thursday, March 09, 2006 5:21 PM
> > > > To: php-db@xxxxxxxxxxxxx
> > > > Subject: Database abuse help needed
> > > >
> > > > Hi there everyone,
> > > >
> > > > Is there a better way I can do this?
> > > >
> > > > if ($email == "mur@xxxxxxx" OR $subject == "Rulez666")
> > > >
> > > > Basically, if I have data coming from a form to a DB, is there a
> > > > better way to say check EVERY variable for a specific set of words
> > > > rather than doing $name, $subject etc .... separately?
> > > >
> > > > The reason I ask is my scripts are being exploited and I can fix
> > > > it when the attacks happen, but I'd like to be able to have a
> > > > string which checks all the form data and takes action if a word I
> > > > define in a list exists.
> > > >
> > > > So, instead of doing if ($name == " mememe " ...... if($email == "
> > > > Rulez666@xxxxxxxxxxxx " ....... I could just have a simple
> > > > statement with a group of words, and if one of the words appears
> > > > it takes an action I specify such as do not proceed to add to DB
> > > > etc ....
> > > >
> > > > Any help would be greatly appreciated as I am tired of writing the
> > > > same scripts with different variables. I'd love to just grab all
> > > > the variables from the form and perform the action ONCE on the
> > > > incoming form data, so that all the variables are affected instead
> > > > of doing each one.
> > > >
> > > > Please save me from going nuts :-)
> > > >
> > > > Chris
> > > >
> > > > --
> > > > No virus found in this incoming message.
> > > > Checked by AVG Free Edition.
> > > > Version: 7.1.375 / Virus Database: 268.2.1/278 - Release Date:
> > > > 3/9/2006
> > > >
> > > > --
> > > > PHP Database Mailing List (http://www.php.net/)
> > > > To unsubscribe, visit: http://www.php.net/unsub.php
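
For what it's worth, here is a rough sketch of the sound-comparison idea from the top of this message, combined with the "break up each word" advice quoted above. It is only an illustration under stated assumptions: the word list ("heck" and "darn" standing in for real banned words), the helper names splitWords(), looksLike() and findBannedWord(), and the edit-distance limit of 2 are all made up for this example. soundex(), levenshtein() and preg_split() are the standard PHP functions.

```php
<?php
// Sketch only. The word list, the function names and the distance limit
// are assumptions for illustration, not anything used on a real site.

// "heck" and "darn" stand in for real banned words.
$bannedWords = array("heck", "darn");

// Break a string into lowercase alphabetic words.
function splitWords($text) {
    return preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// A word "looks like" a banned word when both have the same soundex key
// AND are at most 2 edits apart. soundex() alone is too loose, so the
// edit distance is used here to cut down on false positives.
function looksLike($word, $banned) {
    return soundex($word) === soundex($banned)
        && levenshtein($word, $banned) <= 2;
}

// Return the banned word that some word in $text resembles, or null.
function findBannedWord($text, $bannedWords) {
    foreach (splitWords($text) as $word) {
        foreach ($bannedWords as $banned) {
            if (looksLike($word, $banned)) {
                return $banned;
            }
        }
    }
    return null;
}
```

With this, findBannedWord("what the hekk", $bannedWords) reports "heck", while "give me a hug" passes even though soundex("hug") and soundex("heck") are both H200, which is the inexactness mentioned above. A harmless word such as "hack" would still be flagged, and anything split by spaces or written with digits or symbols slips through, so this narrows the problem rather than solving it.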