On Fri, Mar 13, 2009 at 12:34 PM, Matt Neimeyer <matt@xxxxxxxxxxxx> wrote:
> I'm trying to sanitize some numeric data that's coming to us from
> another system which I have no control over where all fields are
> character fields with no formatting from the end user so data is a
> mishmash of clean and mixed types of dirty.
>
> I know I can use intval and floatval to sanitize if the numeric data
> is at the front of the string but what about when it's not?
>
> For example, Jersey Number = #45 or Dues = $1,234.56....
>
> I see in the comments at php.net for floatval a lot of very complex
> solutions... am I missing something about the following that wouldn't
> cover me?
>
> <?php $output = floatval(ereg_replace("[^-0-9\.]","",$input)); ?>
>
> I'm willing to assume only US formatted numbers... and knowing that
> if they put in 45/46 for jersey it would come out 4546 (but I might
> put in additional code for that specific case on that specific
> field...). I'm also looking for something that I can generically apply
> to any numeric field.

Man... I went completely apeshit on this with regex and made a long, complicated pattern to match numeric values and strip out possible false positives like IP addresses, etc... if you really want it, I'll post it.

Anyway, the point I'm going to make instead is that you need to do more than just strip out everything but digits, minus signs, and periods, unless all of your values are going to be in the format "Some name = [possible junk]<some value>[possible junk]". If all of your values are guaranteed to be in this format, and there will be no values that would act as false positives (like IP addresses), then stripping everything else should be fine.

However, values like "1 and 5" will turn into "15", "3 sizes. 30 day trial." will turn into "3.30." (which floatval reads as 3.3), etc... just something to keep in mind (if cases like these can arise in your data).
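To make those failure cases concrete, here is a minimal sketch, not the long regex I mentioned above. The function names strip_to_float() and extract_single_number() are made up for illustration, and the code assumes PHP 7.1 or later and US number formatting. It uses preg_replace() in place of ereg_replace(), since the ereg functions were deprecated in PHP 5.3 and removed in PHP 7. The second function is one possible stricter approach: it only returns a value when the string contains exactly one number-like token, and returns null otherwise so the record can be flagged.

```php
<?php
// Hypothetical helper names; neither function comes from the original thread.

// The approach from the question, with preg_replace() standing in for
// the old ereg_replace(): drop everything except digits, '-' and '.'.
function strip_to_float(string $input): float
{
    return floatval(preg_replace('/[^-0-9.]/', '', $input));
}

// A stricter sketch: accept the input only if it contains exactly one
// number-like token (US formatting assumed); otherwise return null.
function extract_single_number(string $input): ?float
{
    // Optional minus sign, digits with optional thousands commas,
    // optional decimal part.
    $pattern = '/-?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?/';

    // preg_match_all() returns the number of full matches (or false).
    if (preg_match_all($pattern, $input, $matches) !== 1) {
        return null; // zero or several candidates: flag for manual review
    }

    // Remove thousands separators before converting.
    return (float) str_replace(',', '', $matches[0][0]);
}
```

With this sketch, "#45" and "$1,234.56" come through both functions as 45 and 1234.56, while "1 and 5", "45/46", and an IP address like "192.168.0.1" make extract_single_number() return null instead of a silently merged number. Whether rejecting is better than guessing depends on the field, so treat it as a starting point.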
--
// Todd