Re: filter_var using regex


On Wed, 2011-05-04 at 13:46 -0600, Jason Gerfen wrote:

> On 05/04/2011 01:27 PM, Ashley Sheridan wrote:
> > On Wed, 2011-05-04 at 13:20 -0600, Jason Gerfen wrote:
> > 
> >> I am running into a problem using the REGEXP option with filter_var().
> >>
> >> The string I am using: 09VolunteerApplication.doc
> >> The PCRE regex I am using:
> >> /^[a-z0-9]\.[doc|pdf|txt|jpg|jpeg|png|docx|csv|xls]{1,4}$/Di
> >>
> >> The function in its entirety:
> >> return (!filter_var('09VolunteerApplication.doc',
> >> FILTER_VALIDATE_REGEXP,
> >> array('options'=>array('regexp'=>'/^[a-z0-9]\.[doc|pdf|txt|jpg|jpeg|png|docx|csv|xls]{1,4}$/Di'))))
> >> ? false : true;
> >>
> >> Anyone have any insight into this?
> >>
> > 
> > 
> > You missed a + in your regex. At the moment you're only checking that
> > the filename starts with a single a-z letter or digit, followed
> > immediately by the period. Then, oddly, you're checking for one to four
> > characters from the extension list; are you sure you want to do that?
> > Square brackets match single characters, not strings. Use parentheses
> > to choose between alternative strings.
> > 
> > Try this:
> > 
> > '/^[a-z0-9]+\.(doc|pdf|txt|jpg|jpeg|png|docx|csv|xls)$/Di'
> > 
> > One other thing you should be aware of: filenames won't always
> > consist of just the letters a-z and numbers 0-9. They may contain
> > accented or foreign letters, hyphens, spaces and a number of other
> > characters, depending on the client machine's OS. Windows, for example,
> > allows fewer characters than Unix-like OSes such as Mac OS and
> > Linux.
> > 
> 
> Both are valid PCRE regexes. However, the rule about using
> parentheses for an XOR of strings does not explain why a similar regex
> works with filter_var() like so:
> 
> return (filter_var('kc-1', FILTER_VALIDATE_REGEXP,
> array('options'=>array('regexp'=>'/^[kc\-1|kc\-color|gr\-1|fa\-1|un\-1|un\-color|ben\-1|bencolor|sage\-1|sr\-1|st\-1]{1,8}$/Di')))
> ? true : false);
> 
> The above returns string(4) "kc-1"
> 
> Another test using the following works similarly:
> 
> return (filter_var('u0368839', FILTER_VALIDATE_REGEXP,
> array('options'=>array('regexp'=>'/^[gp|u|gx]{1,2}[\d+]{6,15}$/Di'))) ?
> true : false);
> 
> The above returns string(8) "u0368839"
> 
> And
> return (filter_var('gp123456', FILTER_VALIDATE_REGEXP,
> array('options'=>array('regexp'=>'/^[gp|u|gx]{1,2}[\d+]{6,15}$/Di'))) ?
> true : false);
> 
> returns string(8) "gp123456"
> 
> As you can see, these three examples use a leading [] as an XOR
> conditional over multiple prefix strings.
> 
> 
> 


Not quite. You think they match correctly because matching input is all
you're testing, and you're not trying anything that might disprove it.
Your last regex, for example, will also match these strings:

gu0368839
xx0368839
p0368839
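
If you want to check this yourself, here is a minimal sketch. The second
pattern is only my assumption about what you intended (one of the literal
prefixes gp, u or gx followed by 6 to 15 digits); the variable names are
made up for the example.

```php
<?php
// Original pattern: [gp|u|gx] is a character class holding g, p, u, x and |.
$charClass = '/^[gp|u|gx]{1,2}[\d+]{6,15}$/Di';
// Assumed intent: a literal prefix gp, u or gx, then 6 to 15 digits.
$alternation = '/^(gp|u|gx)\d{6,15}$/Di';

$ids = array('u0368839', 'gp123456', 'gu0368839', 'xx0368839', 'p0368839');

foreach ($ids as $id) {
    // preg_match() returns 1 on a match and 0 on no match.
    printf("%-10s class=%d alternation=%d\n", $id,
        preg_match($charClass, $id), preg_match($alternation, $id));
}
```

The character-class version accepts all five strings, while the
alternation version accepts only the first two.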


I tested your first regex with '09VolunteerApplication.doc' and it
doesn't work at all until you add in that plus after the basename match
part of the regex:

^[a-z0-9]+\.[doc|pdf|txt|jpg|jpeg|png|docx|csv|xls]{1,4}$

However, your regex (with the plus) also matches these strings:

09VolunteerApplication.docp
09VolunteerApplication.docj
09VolunteerApplication.doc|    <-- note it's matching the literal bar character
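
A quick way to reproduce that, as a rough sketch using the same
filter_var() call as in your function (the $pattern and $names variables
are just for illustration):

```php
<?php
// The pattern with the added plus, still using a character class for the extension.
$pattern = '/^[a-z0-9]+\.[doc|pdf|txt|jpg|jpeg|png|docx|csv|xls]{1,4}$/Di';

$names = array(
    '09VolunteerApplication.doc',
    '09VolunteerApplication.docp',
    '09VolunteerApplication.docj',
    '09VolunteerApplication.doc|',
);

foreach ($names as $name) {
    // filter_var() returns the input string on a match and false otherwise.
    $result = filter_var($name, FILTER_VALIDATE_REGEXP,
        array('options' => array('regexp' => $pattern)));
    var_dump($result);
}
```

Each of the four names comes back as a string rather than false, so the
bogus extensions pass validation.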

Making the changes I suggested
(^[a-z0-9]+\.(doc|pdf|txt|jpg|jpeg|png|docx|csv|xls)$) means the regex
works as you expect. Square brackets in a regex match a character class,
not a literal string, and without a quantifier they match only a single
character from that class. So in your example, you're matching an
extension of one to four characters drawn from '|cdefgjlnopstvx', and a
basename containing only 1 character that is either an a-z letter or a
number.
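
For completeness, a small sketch of the corrected pattern wrapped in a
function. The name has_allowed_extension() is hypothetical, not something
from your code, and the comparison against false is my own choice so the
function always returns a boolean.

```php
<?php
// Hypothetical helper, not from the thread: true when $name is a basename of
// a-z letters and digits followed by exactly one of the listed extensions.
function has_allowed_extension($name)
{
    $pattern = '/^[a-z0-9]+\.(doc|pdf|txt|jpg|jpeg|png|docx|csv|xls)$/Di';

    // filter_var() returns the string on a match and false otherwise.
    return filter_var($name, FILTER_VALIDATE_REGEXP,
        array('options' => array('regexp' => $pattern))) !== false;
}

var_dump(has_allowed_extension('09VolunteerApplication.doc'));   // bool(true)
var_dump(has_allowed_extension('09VolunteerApplication.docx'));  // bool(true)
var_dump(has_allowed_extension('09VolunteerApplication.docp'));  // bool(false)
var_dump(has_allowed_extension('09VolunteerApplication.doc|'));  // bool(false)
```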

-- 
Thanks,
Ash
http://www.ashleysheridan.co.uk


