On Fri, Apr 18, 2008 at 10:53 AM, Daniel Kolbo <kolb0057@xxxxxxx> wrote:
>
> Struan Donald wrote:
> > * at 17/04 16:30 -0500 Daniel Kolbo said:
> > >
> > > Hello,
> > >
> > > I am writing a PHP script for a local application (not web/html
> > > based). My script is taking a longer time to execute than I want.
> > > The source code is a few thousand lines, so I will spare you all
> > > this level of detail.
> > >
> > > I prefer to write in PHP because that is what I know best. However,
> > > I do not think there is anything inherent in my script that
> > > requires PHP over C/C++/C#.
> >
> > I think this points to an answer. If you're not too familiar with one
> > of the compiled languages, then producing code that runs faster than
> > your current PHP implementation is a tall order. PHP, like most
> > scripting languages, is compiled into an internal format as a first
> > step, and it is this that is then run. A lot of effort has gone into
> > making this pretty fast, and by deciding to rewrite in a compiled
> > language you are betting that the C code, or whatever, you write will
> > be faster. Given that the effort of translating a few thousand lines
> > of PHP into one of the languages you name is likely to be
> > significant, you'd want to be sure of winning that bet.
> >
> > > If I wrote the console application in a C language (and compiled
> > > it), would one expect to see any improvement in performance? If so,
> > > how much improvement could one expect (in general)?
> >
> > How long will it take you to convert the program? How much more time
> > will you spend on support and bug fixing?
> >
> > > I assume that because PHP is not compiled, this real-time
> > > interpretation of the script by the Zend Engine must take some
> > > time. This is why I am thinking about rewriting my whole script in
> > > a C language. But before I begin that ordeal, I wanted to ask the
> > > community for their opinions. If you think using a C language would
> > > suit me well, what language would you recommend?
> >
> > It's not real-time interpretation. It's a one-time parse and compile
> > when the script starts, and then it runs the internal bytecode. If
> > you have APC or some other caching mechanism installed, then part of
> > the speed-up comes from caching the bytecode and saving the initial
> > parse and compile phase.
> >
> > As to which language: if you want to go ahead and do this, you should
> > pick the one you know best. If you don't know any of them that well,
> > then I really think your time would be better spent on optimising the
> > existing PHP code first. Are you sure it's running as fast as it can?
> > Do you know where it's slow?
> >
> > Rewriting it in another language really is the 50-pound lump hammer
> > solution to the problem if you've not tried anything else to speed it
> > up.
> >
> > > My Google and mail archive searching on this yielded mainly PHP for
> > > web apps, so I am asking all of you.
> > >
> > > My main question is: how much of an improvement in performance will
> > > one see by using a compiled version of an application versus a
> > > scripted version?
> > >
> > > I looked at PHP's bcompiler, but the documentation is minimal, so I
> > > am hesitant to dig much deeper into that, unless someone strongly
> > > suggests otherwise.
> >
> > A quick look at the docs tells me that all bcompiler will do is save
> > you the initial parse and compile phase; it will not speed up the
> > execution of the code.
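On the "do you know where it's slow?" point: before rewriting anything, a
quick check with microtime(true) around the suspected hot spot confirms
where the time actually goes. The sketch below is only illustrative;
suspected_hot_spot() is a stand-in for whatever the real function is, and
a dedicated profiler such as Xdebug would give a fuller picture.

<?php
// Minimal timing sketch: call the suspected hot spot in a loop and measure
// wall-clock time with microtime(true). In real use you would call your
// own function with representative data instead of the stand-in below.
function suspected_hot_spot(array $haystack, $needle)
{
    return in_array($needle, $haystack);   // stand-in for the real work
}

$haystack   = range(1, 1000);
$iterations = 100000;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    suspected_hot_spot($haystack, -1);     // worst case: needle absent
}
$elapsed = microtime(true) - $start;

printf("%d calls: %.3f s total, %.2f usec per call\n",
       $iterations, $elapsed, ($elapsed / $iterations) * 1e6);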
> >
> > The one thing you don't say is exactly how long is too long. Is it
> > hours or minutes? If it's seconds, then consider, as someone has
> > suggested elsewhere in the thread, looking at APC, as that should cut
> > down the start-up time of the script.
> >
> > HTH, and apologies if none of this is news to you.
> >
> > Struan
>
> You are correct in that I want to be pretty sure I win the bet before
> translating all the code. The code really isn't that complicated, so I
> think I am capable of translating it: just a bunch of pretty small
> functions. The code is a simulation-model type of program, so the
> bottleneck in the code is the "looping". I know the exact function that
> takes up 86% of the time. I have tried to rewrite this function from
> different approaches. The bottom line is that "work" needs to be done,
> and a lot of it. I really can't think of any other ways to improve the
> "logic" of the code at this time. Perhaps there are different methods I
> could be using to speed up execution. Again, I think the source of the
> issue is looping.
>
> Here is the function that takes 86% of the time... This function is
> called 500,000,000 times with different parameters ($arr and $arr2),
> whose values I cannot predict, only their sizes.
>
> ========= C O D E === S T A R T ========
> // this is essentially a search-and-remove function for a nested array
>
> foreach ($arr as $key => $value) {           // count($arr) == 3
>     foreach ($value as $key2 => $value2) {   // 0 <= count($value) <= 1000
>         foreach ($arr2 as $value3) {         // count($arr2) == 2
>             if (in_array($value3, $value2)) {
>                 unset($arr[$key][$key2]);
>                 break;
>             }
>         }
>     }
> }
> ========= C O D E === E N D ========
>
> So essentially three nested foreach loops, invoking in_array() and
> unset(). I rewrote the above code by making $arr a one-dimensional
> array and 'storing' the nested keys as a delimited string index, so
> that I could unset the original nested $arr by exploding this index...
> I'll just show the code.
>
> ========= C O D E 2 ==== S T A R T =======
> // first I prepare $arr
>
> function CompressRay($some_nested_ray, $delimiter = "|") {
>     // not really compression, just flattens the array;
>     // returns an array mapping key strings to the final values
>     $answer_ray = array();
>     foreach ($some_nested_ray as $key => $value) {
>         $key_string = (string)$key . $delimiter;
>         if (is_array($value)) {
>             $compressed_sub_ray = CompressRay($value, $delimiter);
>             // echo "Compressed Sub is \n";
>             // print_r($compressed_sub_ray);
>             foreach ($compressed_sub_ray as $sub_key_string => $final_value) {
>                 $answer_ray[$key_string . $sub_key_string] = $final_value;
>             }
>         } else {
>             $answer_ray[substr($key_string, 0, -1)] = $value;
>         }
>     }
>     return $answer_ray;
> }
>
> $arr['compressed'] = CompressRay($arr);
> // this part happens quickly, no worries so far
>
> // then I call the below procedure, oh, about 500,000,000 times
>
> foreach ($arr2 as $value3) {
>     $key_strings = array_keys($arr['compressed'], $value3);
>     foreach ($key_strings as $key_string) {
>         $key_sequence = explode("|", $key_string);
>         unset($arr[$key_sequence[0]][$key_sequence[1]]);
>         $upto_hole = substr($key_string, 0, -2);
>         unset($arr['compressed'][$upto_hole . "|0"]);
>         // to keep the compressed archive accurate
>         unset($arr['compressed'][$upto_hole . "|1"]);
>         // to keep the compressed archive accurate
>     }
> }
> ========= C O D E 2 ==== E N D =======
>
> To my surprise, code 2 was actually slower: twice as slow.
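For comparison, here is one more PHP-level variant of that first
search-and-remove loop that might be worth benchmarking before porting
anything to C. This is only a sketch: it assumes the leaf values are plain
integers or strings (array_flip() requires that), and isset() on flipped
keys is effectively a strict comparison, whereas in_array() compares
loosely by default, so results could differ on mixed types.

<?php
// Variant of the hot loop: flip the two search values once per call and
// test membership with isset() hash lookups instead of in_array() scans.
// Whether this is actually faster depends on the real data, so it needs
// benchmarking against the original.
function remove_matching_entries(array &$arr, array $arr2)
{
    $needles = array_flip($arr2);               // value => dummy key

    foreach ($arr as $key => $value) {
        foreach ($value as $key2 => $value2) {
            foreach ($value2 as $candidate) {
                if (isset($needles[$candidate])) {
                    unset($arr[$key][$key2]);   // same removal as the original
                    break;                      // entry removed; move on
                }
            }
        }
    }
}

// Example call on data shaped like the description above (3 outer keys,
// up to 1000 entries each, $arr2 of size 2):
$arr = array(
    'a' => array(array(1, 2), array(3, 4)),
    'b' => array(array(5, 6)),
    'c' => array(),
);
$arr2 = array(4, 9);
remove_matching_entries($arr, $arr2);
print_r($arr);   // 'a' loses its second entry; 'b' and 'c' are untouched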
> I started thinking that maybe passing the relatively large $arr by
> value 500 million times was taking up a lot of time... but some
> benchmark testing I did actually didn't show much improvement (if any)
> from passing a large array by reference. This seemed counterintuitive
> to me, and because I would have to make a duplicate copy of $arr anyway
> if I passed it by reference, it seemed like little gain would come of
> it.
>
> Like I said, some work has to be done... these iterations have to be
> performed. By a long time, I mean days. I am not entirely convinced
> that making minor optimization changes to the particular syntax or
> methods invoked will yield any order-of-magnitude difference. The
> order-of-magnitude difference I need (I think) must come from changing
> the actual logic of the code, which is difficult to do in what is
> almost a simple iteration procedure. An analogy: it doesn't matter
> whether the code is Lance Armstrong or some big NFL lineman; either one
> running 100,000 back-to-back marathons is going to get tired and start
> crawling.
>
> This is why I feel I am up against a brick wall and must start looking
> for a language that runs a bit faster. Some preliminary looping of 1
> billion iterations in C++ vs. PHP has yielded a substantial
> difference... something like a 10^4 difference in time. This makes me
> feel like my bet on translating the code is justified.
>
> I am going to miss PHP ;(
>
> As I know the bottleneck is in the actual execution of the code, APC
> and bcompiler won't offer much gain, but thanks for the consideration
> and for looking into those.
>
> At this point some of you may encourage me to go to C++ just so I stop
> asking this question... but I'd like to hear whether you all agree that
> perhaps it is time to pull out the 50 lb lump hammer?
>
> Thanks,
> Dan K

Like I said before, since you know that most of your time is spent in a
specific part of your script, just move that function into a custom
extension written in C/C++:

http://talks.php.net/show/extending-php-apachecon2003/0
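If the extension route is taken, one way to keep the rest of the script
untouched is to hide the hot spot behind a single PHP function that uses
the compiled implementation when it is available and falls back to the
existing PHP code otherwise. This is only a sketch; the extension name
'myext' and the function myext_search_and_remove() are invented here for
illustration, not part of any real extension.

<?php
// Wrapper around the hot spot: prefer a compiled extension function if one
// is loaded, otherwise run the original pure-PHP search and remove.
function search_and_remove(array &$arr, array $arr2)
{
    if (extension_loaded('myext') && function_exists('myext_search_and_remove')) {
        myext_search_and_remove($arr, $arr2);   // hypothetical C implementation
        return;
    }

    // Pure-PHP fallback: the original nested-foreach search and remove.
    foreach ($arr as $key => $value) {
        foreach ($value as $key2 => $value2) {
            foreach ($arr2 as $value3) {
                if (in_array($value3, $value2)) {
                    unset($arr[$key][$key2]);
                    break;
                }
            }
        }
    }
}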