Re: Comparing strings (revisited)

On Mon, 25 May 2009 02:11:24 -0400, paulf@xxxxxxxxxxxxxxxxx (Paul M Foster) wrote:

.........
>
>This is why I originated a thread along these lines some time ago.  I
>sympathize with your pain, being a C programmer as well. Apparently, PHP
>plays fast and loose with types when doing == comparisons. And empty()
>has a really wild way of determining if something is "empty" (an integer
>0 is "empty"?). Which is why I originally asked if strcmp() was the
>preferred method of comparison for the list members.
>
>In any case, strcmp() does what you want and is the safest way to
>compare strings, which is what PHP passes around a lot (data comes out
>of databases as strings, comes back from forms as strings, etc.). And
>since most of the syntax and library functions of PHP are based on C
>paradigms, I'm guessing that the PHP strcmp() function is a thin veneer
>over the actual C function.

Thanks, Paul. 

I have done some more experimenting and have a better handle on what is going on now, so
I don't think I will fall into any unexpected holes (apart from being careless!)

If you enter a value directly (e.g. $a[0] = 000a; ) PHP tries to parse the input as a
number, and rejects with a parse error any input it cannot parse (such as 000a). However,
if the value is quoted it is stored internally as a string.
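
To make the difference between unquoted and quoted values visible, here is a small sketch
using gettype() and is_numeric(). The array names $types and $looksNumeric are just mine
for illustration; they are not part of the test programs further down.

```php
<?php
// Unquoted numeric literals are turned into int or float when the script is parsed.
$types = array(
    '2000'   => gettype(2000),    // integer
    '20e2'   => gettype(20e2),    // double (PHP's name for float)
    '2.e3'   => gettype(2.e3),    // double
    '4000/2' => gettype(4000/2),  // integer, because the division is exact
    "'20E2'" => gettype('20E2'),  // string: quoted values are not converted
    "'000A'" => gettype('000A'),  // string
);
foreach ($types as $literal => $type) {
    echo $literal, ' is stored as ', $type, "\n";
}

// A quoted value can still look like a number to PHP, which matters in comparisons.
$looksNumeric = array(
    '20E2' => is_numeric('20E2'),  // true
    '000A' => is_numeric('000A'),  // false
);

// An unquoted 000A is not a valid literal, so the next line would be a parse error:
// $a[] = 000A;
```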

If the data is stored as elements of a string, and is exploded into an array, no attempt
is made to interpret the elements, and they are stored as strings in their original form.
They appear to retain this form, but if they are compared with some other value the two
values are adjusted until they are of the same type, and then they are compared. The
results often seem absurd at first glance. For example 000A < 2 < 10, but A > 9999. I
think the reason for this is that if both values can be treated as numbers they are
compared as numbers, but otherwise they are compared as strings, character by character
from the left. If one string runs out first it sorts first, which is much the same as if
the one with fewer characters had been right padded with spaces. Thus '000A' < '2   ',
and 'A   ' > '9999'.
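
The comparisons mentioned above can be checked with a few lines. This is only a sketch of
what I believe == and < do when both operands are strings; the variable names are mine,
and the shortened value list is just a sample of the test data.

```php
<?php
// Every element produced by explode() is a string.
$a = explode(';', '2000;20e2;0010;10;000A;2;A;9999');

// Both sides look like numbers, so == and < compare them as numbers.
$sameNumber   = ($a[0] == $a[1]);   // '2000' == '20e2' : true, both are 2000
$leadingZeros = ($a[2] == $a[3]);   // '0010' == '10'   : true, both are 10
$twoBelowTen  = ($a[5] <  $a[3]);   // '2' < '10'       : true, 2 < 10

// At least one side is not numeric, so the strings are compared byte by byte.
$zeroBeforeTwo = ($a[4] < $a[5]);   // '000A' < '2'     : true, '0' sorts before '2'
$letterAfter9  = ($a[6] > $a[7]);   // 'A' > '9999'     : true, 'A' sorts after '9'

// === never converts: two strings are identical only if they match exactly.
$identical = ($a[2] === $a[3]);     // '0010' === '10'  : false

var_dump($sameNumber, $leadingZeros, $twoBelowTen,
         $zeroBeforeTwo, $letterAfter9, $identical);
```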

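If the values are compared as strings (using strcmp or SORT_STRING) no numeric conversion
takes place, and the results are entirely logical if all the strings are of the same
length. If the strings are of different lengths they are still compared character by
character from the left, and if the shorter one runs out first it sorts first. The effect
is the one I would expect from right padding the shorter one (probably with spaces) and
then comparing the two.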
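
A short sketch of the pure string comparison, for anyone who wants to try it. Only the
sign of the strcmp() result is relied on, since the exact number returned can differ
between PHP versions. The tab example is my own addition: as far as I can tell strcmp()
does not literally pad with spaces, a string that is a prefix of another simply sorts
first.

```php
<?php
// strcmp() returns a negative, zero or positive integer; only the sign matters here.
$r1 = strcmp('2', '10');     // positive: '2' sorts after '1', so "2" > "10" as strings
$r2 = strcmp('0010', '10');  // negative: '0' sorts before '1', so they are not equal
$r3 = strcmp('A', 'A000');   // negative: a prefix sorts before the longer string
$r4 = strcmp('A', "A\t");    // negative too, although a tab sorts before a space

// sort() with SORT_STRING uses the same kind of comparison for every pair.
$b = array('10', '2', '0010', 'A', '000A');
sort($b, SORT_STRING);
echo implode(' ', $b), "\n"; // 000A 0010 10 2 A
```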

These points are illustrated in the following test programs.

<?php
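// Test one data: unquoted literals are parsed as numbers, quoted ones stay strings.
	$a = array();

	$a[] = 2000; 	$a[] = 20e2; 	$a[] = 2.e3; 	$a[] = 2.E3;
 	$a[] = 2.000e3; 	$a[] = 4000/2; 	$a[] = 4.0e3/2.0; 	$a[] = '20E2';
 	$a[] = 9999; 	$a[] = '9999';	$a[] = '000A';	// 	$a[] = 000A;  (unquoted: parse error)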

echo '<p>&nbsp;</p>Test 1. Values entered directly<p>&nbsp;</p>';
	$i = 0; $n = count ($a);
	while ($i < $n)
		{
		echo '<p> $a['.$i.']: '.$a[$i].' = ';
		$j = 0; while ($j < $n)
			{
			if (($i != $j) && ($a[$i] == $a[$j])) { echo $a[$j].', '; }
			++$j;
			}
		++$i; echo '</p>';
		}

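// Test two data: the same kind of values held in one string and split on ';',
// so every element of $a is a string.
	$ss = '2000;20e2;2.e3;2.E3;2.000e3;4000/2;4.0e3/2.0;20E2;9999;000A;A000;2;0010;A;10;20;21';
	$a = explode (';',$ss);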

echo '<p>&nbsp;</p>Test 2. Values exploded into array<p>&nbsp;</p>';
	$i = 0; $n = count ($a);
	while ($i < $n)
		{
		echo '<p> $a['.$i.']: '.$a[$i].' = ';
		$j = 0; while ($j < $n)
			{
			if (($i != $j) && ($a[$i] == $a[$j])) { echo $a[$j].', '; }
			++$j;
			}
		++$i; echo '</p>';
		}
	
// Test 3.
	$b = $a;
	sort ($b, SORT_STRING);
	sort ($a);
	echo '<p>&nbsp;</p><p>  Sort normal.</p>';
	$i = 0; while ($i < $n)
		{
		echo '<p>$a['.$i.'] = '.$a[$i].'</p>';
		++$i;
		}

	echo '<p>&nbsp;</p><p>  Sort string.</p>';
	$i = 0; while ($i < $n)
		{
		echo '<p>$b['.$i.'] = '.$b[$i].'</p>';
		++$i;
		}
?>
Results:

Test 1. Values entered directly. All values are converted to the simplest form on input.
 
$a[0]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[1]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[2]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[3]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[4]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[5]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[6]: 2000 = 2000, 2000, 2000, 2000, 2000, 2000, 20E2, 
$a[7]: 20E2 = 2000, 2000, 2000, 2000, 2000, 2000, 2000, 
$a[8]: 9999 = 9999, 
$a[9]: 9999 = 9999, 
$a[10]: 000A = 
 
Test 2. Values exploded into array. Values are preserved as strings until compared. 
 
$a[0]: 2000 = 20e2, 2.e3, 2.E3, 2.000e3, 20E2, 
$a[1]: 20e2 = 2000, 2.e3, 2.E3, 2.000e3, 20E2, 
$a[2]: 2.e3 = 2000, 20e2, 2.E3, 2.000e3, 20E2, 
$a[3]: 2.E3 = 2000, 20e2, 2.e3, 2.000e3, 20E2, 
$a[4]: 2.000e3 = 2000, 20e2, 2.e3, 2.E3, 20E2, 
$a[5]: 4000/2 = 
$a[6]: 4.0e3/2.0 = 
$a[7]: 20E2 = 2000, 20e2, 2.e3, 2.E3, 2.000e3, 
$a[8]: 9999 = 
$a[9]: 000A = 
$a[10]: A000 = 
$a[11]: 2 = 
$a[12]: 0010 = 10, 
$a[13]: A = 
$a[14]: 10 = 0010, 
$a[15]: 20 = 
$a[16]: 21 = 
 
Test 3. Sort normal.
$a[0] = 000A
$a[1] = 2
$a[2] = 10
$a[3] = 0010
$a[4] = 20
$a[5] = 21
$a[6] = 20e2
$a[7] = 2.e3
$a[8] = 2000
$a[9] = 2.E3
$a[10] = 20E2
$a[11] = 2.000e3
$a[12] = 4.0e3/2.0
$a[13] = 4000/2
$a[14] = 9999
$a[15] = A
$a[16] = A000
 
Test 4. Sort string.
$b[0] = 000A
$b[1] = 0010
$b[2] = 10
$b[3] = 2
$b[4] = 2.000e3
$b[5] = 2.E3
$b[6] = 2.e3
$b[7] = 20
$b[8] = 2000
$b[9] = 20E2
$b[10] = 20e2
$b[11] = 21
$b[12] = 4.0e3/2.0
$b[13] = 4000/2
$b[14] = 9999
$b[15] = A
$b[16] = A000


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

