RE: references, arrays, and why does it take up so much memory? [SOLVED]

> -----Original Message-----
> From: Stuart Dallas [mailto:stuart@xxxxxxxx]
> Sent: Tuesday, September 03, 2013 2:37 PM
> To: Daevid Vincent
> Cc: php-general@xxxxxxxxxxxxx; 'Jim Giner'
> Subject: Re: references, arrays, and why does it take up so much
> memory? [SOLVED]
> 
> On 3 Sep 2013, at 21:47, "Daevid Vincent" <daevid@xxxxxxxxxx> wrote:
> 
> > There were reasons I had the $id -- I only showed the relevant parts of
> > the code for the sake of not overly complicating what I was trying to
> > illustrate. There is other processing that had to be done too in the
> > loop and that is also what I illustrated.
> >
> > Here is your version effectively:
> >
> > 	private function _normalize_result_set() //Stuart
> > 	{
> > 		  if (!$this->tmp_results || count($this->tmp_results) < 1)
> > 			return;
> >
> > 		  $new_tmp_results = array();
> >
> > 		  // Loop around just the keys in the array.
> > 		  $D_start_mem_usage = memory_get_usage();
> > 		  foreach (array_keys($this->tmp_results) as $k)
> > 		  {
> 
> You could save another, relatively small, chunk of memory by crafting your
> loop with the rewind, key, current and next methods (look them up to see
> what they do). Using those you won't need to make a copy of the array keys
> as done in the above line. When you've got the amount of data you're
> dealing with it may be worth investing that time.
> 
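For reference, a rough sketch of that pointer-style loop. For a plain array the functions are reset(), key(), current() and next(); rewind() is the name used by the Iterator interface. The sample rows below are made up and only stand in for $this->tmp_results.

```php
<?php
// Hypothetical sample data standing in for $this->tmp_results.
$tmp_results = array(
    array('id' => 17, 'genres' => 'drama|comedy|'),
    array('id' => 42, 'genres' => 'action|'),
);

$new_tmp_results = array();

// Walk the array with its internal pointer instead of building a
// separate list of keys with array_keys().
for (reset($tmp_results); ($k = key($tmp_results)) !== null; next($tmp_results)) {
    // Re-key each row by its 'id' field, as in the quoted function.
    $new_tmp_results[$tmp_results[$k]['id']] = $tmp_results[$k];
}

print_r(array_keys($new_tmp_results));
```

Whether this saves a noticeable amount compared with array_keys() would need to be measured on the real data.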
> > 			/*
> > 		  	if ($this->tmp_results[$k]['genres'])
> > 			{
> > 				// rip through each scene's `genres` and store them as an array since we'll need'em later too
> > 				$g = explode('|', $this->tmp_results[$k]['genres']);
> > 				array_pop($g); // there is an extra '' element due to the final | character. :-\
> 
> Then remove that from the string before you explode. Munging arrays is
> expensive, both computationally and in terms of memory usage.
> 
> > 				$this->tmp_results[$k]['g'] = $g;
> 
> Get rid of the temporary variable again - there's no need for it.
>
> $this->tmp_results[$k]['g'] = explode('|', trim($this->tmp_results[$k]['genres'], '|'));

Maybe an option. I'll look into trimming the trailing "|" off the tmp_results
values in a loop at the top. I'm not sure whether changing an existing element
will have the same memory effect as adding a new one does. Interesting to see...
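
A small sketch comparing the two approaches on a made-up genres string. It uses rtrim() rather than trim(), on the assumption that only the trailing "|" needs to go; trim() would strip a leading "|" as well.

```php
<?php
// Hypothetical value with the trailing '|' described above.
$genres = 'drama|comedy|horror|';

// Original approach: explode, then drop the empty last element.
$with_pop = explode('|', $genres);
array_pop($with_pop);

// Suggested approach: strip the trailing '|' first, then explode.
$trimmed = explode('|', rtrim($genres, '|'));

// Caveat: explode() on an empty string yields one empty element,
// so an empty 'genres' value still needs a guard such as the if() above.
$empty = explode('|', rtrim('', '|'));

var_dump($with_pop === $trimmed);
```

Both give the same list for a non-empty string; they differ only for an empty one, which the existing if() check already skips.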

> If this is going in to a class, and you have control over how it's
> accessed, you have the ability to do this when the value is accessed. This
> means you won't need to
> 
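One way the "do it when the value is accessed" idea might look, as a sketch only. The SceneRow class and its genres() method are hypothetical names, not part of the code in this thread, and wrapping every row in an object carries a memory cost of its own that would have to be measured.

```php
<?php
// Hypothetical wrapper; class and method names are illustrative only.
class SceneRow
{
    private $row;
    private $genres = null; // filled in on first access

    public function __construct(array $row)
    {
        $this->row = $row;
    }

    // Split the 'genres' string only when a caller asks for it,
    // and remember the result for later calls.
    public function genres()
    {
        if ($this->genres === null) {
            $raw = isset($this->row['genres']) ? rtrim($this->row['genres'], '|') : '';
            $this->genres = ($raw === '') ? array() : explode('|', $raw);
        }
        return $this->genres;
    }
}

$scene = new SceneRow(array('id' => 17, 'genres' => 'drama|comedy|'));
print_r($scene->genres());
```

Rows whose genres are never read never pay for the extra array.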
> > 			}
> > 			*/
> >
> > 		  	// Store the item in the temporary array with the ID as the key.
> > 		    // Note no pointless variable for the ID, and no use of &!
> > 		    $new_tmp_results[$this->tmp_results[$k]['id']] = $this->tmp_results[$k];
> > 		  }
> >
> > 		  // Assign the temporary variable to the original variable.
> > 		  $this->tmp_results = $new_tmp_results;
> > 		  echo "\nMEMORY USED FOR STUART's version: ".number_format(memory_get_usage() - $D_start_mem_usage)." PEAK: (".number_format(memory_get_peak_usage(true)).")<br>\n";
> > 		  var_dump($this->tmp_results);
> > 		  exit();
> > 	}
> >
> > MEMORY USED FOR STUART's version: -128 PEAK: (90,439,680)
> >
> > With the processing in the genres block
> > MEMORY USED FOR STUART's version: 97,264,368 PEAK: (187,695,104)
> >
> > So a slight improvement from the original of -28,573,696
> > MEMORY USED FOR _normalize_result_set(): 97,264,912 PEAK: (216,268,800)
> 
> Awesome.
> 
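As an aside, the re-keying part of the function above (without the genres processing) can probably be written with array_column(), assuming PHP 5.5 or later is available. Passing null as the column keeps each whole row, and the third argument names the field to use as the key. The rows here are invented, and no claim is made about its memory use relative to the loop.

```php
<?php
// Hypothetical sample rows standing in for $this->tmp_results.
$tmp_results = array(
    array('id' => 17, 'title' => 'A'),
    array('id' => 42, 'title' => 'B'),
);

// null column key = keep the full row; 'id' = index the result by that field.
$tmp_results = array_column($tmp_results, null, 'id');

print_r(array_keys($tmp_results));
```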
> > No matter what I tried however it seems that frustratingly just the
> > simple act of adding a new hash to the array is causing a significant
> > memory jump. That really blows! Therefore my solution was to not store
> > the $g as ['g'] -- which would seem to be the more efficient way of doing
> > this once and re-use the array over and over, but instead I am forced to
> > inline rip through and explode() in three different places of my code.
> 
> Consider what you're asking PHP to do. You're taking an element in the
> middle of an array structure in memory and asking PHP to make it bigger.
> What's PHP going to do? It's going to copy the entire array to a new
> location in memory with an additional amount reserved for what you're
> adding. Note that this is just a guess - it's entirely possible that PHP
> manages its memory better than that, but I wouldn't count on it.
> 
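Rather than guess, the cost of the extra ['g'] arrays can be measured in isolation. The sketch below uses made-up rows (10,000 of them, each with a short genres string) and reports the difference in memory_get_usage() before and after adding the exploded array to every row. The numbers will vary by PHP version and will not match the production figures quoted in this thread.

```php
<?php
// Build hypothetical rows; the count and field values are invented.
$rows = array();
for ($i = 0; $i < 10000; $i++) {
    $rows[$i] = array('id' => $i, 'genres' => 'drama|comedy|horror|');
}

$before = memory_get_usage();

// Add a new nested array to every row, as the ['g'] assignment does.
foreach (array_keys($rows) as $k) {
    $rows[$k]['g'] = explode('|', rtrim($rows[$k]['genres'], '|'));
}

$after = memory_get_usage();

echo number_format($after - $before), " bytes for ", count($rows), " extra arrays\n";
```

Dividing the difference by the row count gives a rough per-row cost, which helps show whether the jump comes from the many small arrays themselves or from something else in the loop.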
> > We get over 30,000 hits per second, and even with lots of caching, 216MB
> > vs 70-96MB is significant and the speed hit is only about 1.5 seconds more
> > per page.
> >
> > Here are three distinctly different example pages that exercise
> > different parts of the code path:
> >
> > PAGE RENDERED IN 7.0466279983521 SECONDS
> > MEMORY USED @START: 262,144 - @END: 26,738,688 = 26,476,544 BYTES
> > MEMORY PEAK USAGE: 69,730,304 BYTES
> >
> > PAGE RENDERED IN 6.9327299594879 SECONDS
> > MEMORY USED @START: 262,144 - @END: 53,739,520 = 53,477,376 BYTES
> > MEMORY PEAK USAGE: 79,167,488 BYTES
> >
> > PAGE RENDERED IN 7.558168888092 SECONDS
> > MEMORY USED @START: 262,144 - @END: 50,855,936 = 50,593,792 BYTES
> > MEMORY PEAK USAGE: 96,206,848 BYTES
> 
> Knowing nothing about your application I'm obviously not in a strong
> position to comment, but seven seconds to generate a page would be
> unacceptable to me and any of my clients.

It's a "one time hit" and the rest is served from a cache for the next 24
hours, which is very fast after that initial rendering. It's just that we
have so many thousands of pages that this becomes an issue -- especially
when web crawlers hit us and thread out, so MANY pages are trying to render
at the same time. The worst are the ones towards the end, which haven't been
cached since real people rarely get that far (pages 900, 901, 902, etc.).
With new content each day, page 1 today is page 2 tomorrow, so it's a
constant thorn.

> I'll put money on it being
> possible to cut that time by changing your caching strategy. The memory
> usage is also ridiculous - does a single page really display that amount
> of data? Granted, there are some applications that cannot be optimised
> beyond a certain point, but those numbers make me sad!

HA! It was over 400MB per page a few weeks ago. I keep whittling it down,
but I think I'm hitting the lower limit at this point.

It's a tough balance between database hits, cache hits, network traffic
(memcached), disk i/o, page speed, load balancing, etc. All we can do is try
things and tweak and see what works and what brings the servers to their
binary knees.


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




