Re: Why does count() make copies of arrays?

Robert Cummings wrote:
On Mon, 2006-08-14 at 17:24 -0400, Adam Zey wrote:
Robert Cummings wrote:
On Mon, 2006-08-14 at 13:16 -0400, Adam Zey wrote:
I was writing a shell script in PHP (4.4.2) that dealt with a rather large array. To figure out what I needed the new memory limit to be, I did a memory_get_usage() at the end of my script, and came up with about 5.5MB. I then set the memory limit to 8MB.

When I tried to run it, the script ran out of memory on the line:

$numwords = count($words);

However, when I switched to simply incrementing $numwords every time I added an element to $words, the memory limit of 8MB was fine.

So my question is, if PHP does copy-on-write, why does PHP make a copy of an array when you use count() on it, which should NOT be modifying the array?
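
For reference, the workaround described above can be sketched roughly like this. It is only an illustration: the sample words and the loop are placeholders, not the original script.

```php
<?php
// Keep a running counter instead of calling count() on the big array.
$words    = array();
$numwords = 0;

// Placeholder input; the original script built $words from its own data.
foreach (array('alpha', 'beta', 'gamma') as $word) {
    $words[] = $word;
    $numwords++; // incremented once per element added
}

echo $numwords . "\n";
```
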
For some reason the memory_get_usage() function wouldn't appear in my
PHP compilation even after using the --enable-memory-limit flag, and
rather than dig very deep, I whipped up the following script to test
your issue (under PHP 4.2.2):

<?php

    //echo 'Mem Usage: '.memory_get_usage()."\n";

    $foo = array();

    for( $i = 0; $i < 10000000; $i++ )
    {
        $foo[$i] = $i;
    }

    echo 'Created big array!'."\n";
    sleep( 10 );

    //echo 'Mem Usage: '.memory_get_usage()."\n";

    $numEntries = count( $foo );

    echo 'Counted big array!'."\n";
    sleep( 10 );

    //echo 'Mem Usage: '.memory_get_usage()."\n";
?>

Using the following command:

    watch -n 0 'ps awxu | grep foo.php | grep -v grep'

I got the following snapshots during the two sleep steps:

    rob      16018 66.7 44.7 935084 928684 pts/7   S+   17:11   0:18 /usr/local/bin/php -qC ./foo.php

    rob      16018 43.9 44.7 935084 928684 pts/7   S+   17:11   0:18 /usr/local/bin/php -qC ./foo.php

Which indicated no change from the 935 megs of memory already allocated
before the count().

You've either encountered a bug in your version, or a confounding
variable :)

Cheers,
Rob.
That's the thing, count only creates a duplicate of the array (or consumes massive amounts of memory) *during* the call of count(). It frees the memory right after. The problem is that if you've got a 2MB array, you can't call count() on it because the temporarily increased memory usage will break the 4MB memory limit.

Here's a better test case:

1) Ensure the memory limit is enabled and set to 4MB
2) Create an array that is 3MB in size
3) Try to call count() on that array

With PHP 4.4.2, this will fail, because count will try to copy the array (or do something else that consumes a lot of memory). If you increase the memory limit to compensate, the memory usage goes back down immediately after the count call. For this reason, memory_get_usage() will never show the extra memory usage; it's allocated and freed entirely during the count() call.
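
On PHP 5.2 and later (so not the 4.4.2 discussed here), memory_get_peak_usage() is one way to look for a transient spike of this kind, since it reports the high-water mark rather than the current usage. A minimal sketch, assuming a newer PHP with closures; peak_growth_during() is a hypothetical helper name and the array size is arbitrary:

```php
<?php
// Hypothetical helper: returns how far the peak memory mark rose while $fn ran.
// A result of 0 only means the peak did not exceed its earlier high-water mark,
// so a spike smaller than the existing headroom would go unnoticed.
function peak_growth_during($fn)
{
    $peakBefore = memory_get_peak_usage();
    $fn();
    return memory_get_peak_usage() - $peakBefore;
}

// Arbitrary test data, small enough for a default memory limit.
$words = array();
for ($i = 0; $i < 50000; $i++) {
    $words[] = 'word' . $i;
}

$growth = peak_growth_during(function () use (&$words) {
    count($words);
});

echo 'Peak growth during count(): ' . $growth . " bytes\n";
```

If count() really did copy the array, the growth reported here would be expected to be on the order of the array's own size, provided the peak had not already been pushed that high while the array was being built.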

When I ran the original test I was watching the process. It generally
takes more than a second on most systems to allocate several hundred
megabytes, which would have exposed your problem as a spike. At any
rate...

I figured out my problem with the recompile and then ran the script with
appropriate settings. On the first run I determined the memory required
and then for the second run I set the memory to an amount very close to
what was used. Here is the second script:

#!/usr/local/bin/php -qC
<?php

ini_set( 'memory_limit', '627150412' );

echo 'Mem Usage: '.memory_get_usage()."\n";

$foo = array();

for( $i = 0; $i < 10000000; $i++ )
{
    $foo[$i] = $i;
}

echo 'Created big array!'."\n";
echo 'Mem Usage: '.memory_get_usage()."\n";

$numEntries = count( $foo );

echo 'Counted big array!'."\n";
echo 'Mem Usage: '.memory_get_usage()."\n";

?>

Following is the output:

Mem Usage: 41296
Created big array!
Mem Usage: 627150256
Counted big array!
Mem Usage: 627150320

Changing the memory limit from '627150412' to '627150212' results in the
expected memory limit exception:

Mem Usage: 41296
<br />
<b>Fatal error</b>:  Allowed memory size of 627150212 bytes exhausted
(tried to allocate 12 bytes) in <b>/home/suds/foo.php</b> on line
<b>10</b><br />

So I'm not experiencing your memory issue: given the immense
size of the array I'm creating, it would certainly show if a copy was
performed. That said (and maybe this is related to the recent memory
thread on internals that I sort of skipped over), I'm very surprised
that while I allowed '627150412' bytes for memory, the PHP process
climbed to 900+ megs. It seems as though it doesn't account for its own
usage of memory, which is extremely misleading. Admittedly this kind of
allocation on a production web site would normally be considered
ludicrous, but it still strikes me that the memory_limit ini setting is
somewhat misleading -- in this case by about 30%.
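
On PHP 5.2 and later, one way to see a gap between what the script is using and what PHP has taken from the system is the optional real_usage argument of memory_get_usage(). This is only a sketch with an arbitrary array size. It does not claim to explain the numbers above, and the process size reported by ps also includes the PHP binary and its libraries:

```php
<?php
// Arbitrary test data.
$foo = array();
for ($i = 0; $i < 100000; $i++) {
    $foo[$i] = $i;
}

$used = memory_get_usage();     // memory currently in use by the script
$real = memory_get_usage(true); // memory allocated from the system, including unused pages

echo 'Used by script:        ' . $used . "\n";
echo 'Allocated from system: ' . $real . "\n";
```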

Cheers,
Rob.

Further experimentation shows that the problem only occurs if the variable being count'd is a static variable inside a function. Of course, the original point still stands, static or no, count shouldn't make a copy. Here is a sample script that I can confirm reproduces the issue:

<?php

use_mem();

function use_mem()
{
        static $foo = "";

        for ( $x=0; $x <= 70000; $x++ )
                $foo[] = "BwaHA" . mt_rand(0, 1000000);

        echo memory_get_usage();
        $numrows = count($foo);
}

?>

PHP's default memory limit is 8MB. This script creates an array that's about 5.5MB. That part's fine. But it fails on that last line with the count($foo).

I realize that this particular function doesn't need the variable to be static, but it's just a demonstration. My actual script had a function that needed some data to be read in from a file into an array. Rather than reading it in in the main script and passing it to the function, or having the function read the data in every execution, I simply made the variable static and had the function check if the variable was empty to see if it was the first time the function had been called.
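
The load-once pattern described above might look roughly like the following. This is a sketch, not the actual script: load_words(), has_word() and the sample data are hypothetical stand-ins for the real file reading and lookup.

```php
<?php
// Hypothetical stand-in for reading the data file; counts how often it runs.
function load_words()
{
    global $loadCalls;
    $loadCalls++;
    return array('alpha', 'beta', 'gamma');
}

// Hypothetical lookup function that keeps its data in a static variable.
function has_word($word)
{
    static $words = array();

    // An empty static means this is the first call, so load the data once.
    if (empty($words)) {
        $words = load_words();
    }

    return in_array($word, $words);
}

$loadCalls = 0;
var_dump(has_word('beta'));
var_dump(has_word('delta'));
echo 'Loads: ' . $loadCalls . "\n";
```

Note that if the data source itself is empty, this check reloads on every call; a separate static flag would avoid that.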

Sorry for not realizing that the static variable is the key to reproducing this issue. Does the script that I've pasted here give you any better luck in reproducing?

Regards, Adam Zey.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

