Programming Theory
 
#1  Anonymous, October 5th, 2003, 09:06 PM
Passing strings by value in PHP

In an effort to understand PHP better I decided recently to do a number of tests to find out once and for all how PHP handled passing strings by value.

I wanted to see if it was advantageous to pass strings by reference at all.

What I discovered was very interesting, but it also made me wonder.

Most times when you copy a string by value, the copy is not done immediately. I call this copy 'deferred'.

When the string is only read later, no copy is ever made; the original content is used.

php Code:
$s = 'hello';
$a = $s;
$b = $a;
$c = str_replace('hello', 'goodbye', $b);


Here no copy is made; str_replace works directly on $s's content. This is because str_replace takes its string arguments by value and only reads them.
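
One rough way to check the claim above on a current PHP build is to watch the heap with memory_get_usage() around the assignments. This is only a sketch: the function name, the 1.2 MB test string and the array keys are made up for illustration, and the exact byte counts will vary by version.

```php
<?php
// Sketch: measure heap growth around plain assignments and a read-only call.
// measureDeferredCopies() and the 1.2 MB test string are illustrative only.
function measureDeferredCopies()
{
    $s = str_repeat('hello ', 200000);   // about 1.2 MB of content

    $before = memory_get_usage();
    $a = $s;                             // no string data is duplicated here
    $b = $a;                             // nor here
    $afterCopies = memory_get_usage();

    // str_replace() only reads $b; the new bytes belong to its return value.
    $c = str_replace('hello', 'goodbye', $b);

    return array(
        'growth_after_copies' => $afterCopies - $before,
        'inputs_unchanged'    => ($s === $a && $a === $b),
        'result_differs'      => ($c !== $s),
    );
}

$report = measureDeferredCopies();
printf("Heap growth after two assignments: %d bytes\n", $report['growth_after_copies']);
```

If the copies really are deferred, the reported growth should be tiny compared with the size of the string.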

When you introduce references, the picture changes a bit.

php Code:
$s = 'hello';
$a = $s;
$b =& $a;


Here the first two lines don't copy the string, but as soon as $b references $a, $a's deferred copy is performed. This makes sense: if you make a reference like this, chances are you plan to modify that exact string.

php Code:
$s = 'hello';
$a = $s;
$b =& $s;


Here it looks like no copy should take place, but it turns out that when $b references $s, $a's copy is performed.

It seems that as soon as a variable is referenced by another, any deferred copies of it are performed immediately. Put another way, once content is referenced by more than one symbol, every pending copy of that content is made at once.
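
Whatever the engine does internally, the visible contract is that a value copy must never see writes made through a later reference. The sketch below shows only that contract; the function name is made up. As a possible explanation for the eager copy (an assumption, not something established in this thread): the PHP 4 engine keeps a single reference count and a single is_ref flag per stored value, so one stored value cannot easily be both a pending copy for $a and a shared reference for $s and $b.

```php
<?php
// Sketch: a value copy must stay independent of a later reference set.
// referenceAfterCopy() is a hypothetical name used only for this illustration.
function referenceAfterCopy()
{
    $s = 'hello';
    $a = $s;        // value copy (deferred by the engine)
    $b =& $s;       // $b and $s are now two names for one value
    $b = 'changed'; // the write goes through the reference

    return array('s' => $s, 'a' => $a, 'b' => $b);
}

print_r(referenceAfterCopy());
```

Here $s and $b end up holding 'changed' while $a still holds 'hello', so at some point the engine has to give $a content of its own.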

php Code:
$s = 'hello';
$a = $s;
unset($s);
$b =& $a;


Here, when $s is unset, its content is not removed from memory because a deferred copy is still pending on it. When $b references $a, $a simply takes over $s's old content.
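
The observable side of this case can be sketched as follows. It shows only what a script can see (the name $s disappears, the content lives on through $a); it does not prove how the engine stores the content. The function name is hypothetical.

```php
<?php
// Sketch: unsetting the original name does not destroy content a copy still uses.
// unsetThenReference() is a hypothetical name used only for this illustration.
function unsetThenReference()
{
    $s = 'hello';
    $a = $s;            // deferred copy
    unset($s);          // removes the name $s only
    $sIsGone = !isset($s);

    $b =& $a;           // $a and $b now share the surviving content
    $b .= ' world';     // the change is visible through both names

    return array('s_is_gone' => $sIsGone, 'a' => $a, 'b' => $b);
}

var_dump(unsetThenReference());
```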

So it seems there is mostly no reason to pass strings by reference, because doing so can force a copy that would otherwise never happen.

What I am wondering is why, when another symbol references a symbol's content, all deferred copies of that content are performed immediately. It seems more sensible to leave them deferred until the content is modified or deleted, or until the copies are actually needed.

Perhaps somebody can shed some light on this, and even explain why this can/can't be done, with respect to the data structures used by PHP when storing symbols and content.

The version of PHP I did these tests with was 4.2.3. What will PHP5 change in the way passing by reference/value works, specifically with respect to these deferred copies?

#2  -vertigo-, October 5th, 2003, 09:10 PM

I didn't mean to post the previous post anonymously, it was me who posted it.

#3  brut, October 6th, 2003, 09:28 PM

This is referred to as 'reference counting'.
Here is an excerpt:
"Note that PHP 4 is not like C: passing variables by reference is not necessarily faster than passing them by value. Indeed, in PHP 4 it is usually better to pass a variable by value, except if the function changes the passed value or if a reference (or alias, see Aliasing: added language flexibility) is being passed."

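The advice in the excerpt can be illustrated with three small functions. All three names are hypothetical, and the sketch shows only the visible behaviour of each calling style, not its cost.

```php
<?php
// Sketch with hypothetical helper names: read-only by value, modifying by value,
// and modifying by reference.
function countWords($text)      // by value, read-only: nothing needs copying
{
    return str_word_count($text);
}

function shoutCopy($text)       // by value, modifies: only the local copy changes
{
    $text = strtoupper($text);
    return $text;
}

function shoutInPlace(&$text)   // by reference: the caller's variable changes
{
    $text = strtoupper($text);
}

$message = 'passing strings by value';
$words = countWords($message);
$loud = shoutCopy($message);
$afterValueCalls = $message;    // still the original text at this point
shoutInPlace($message);         // now $message itself is upper case
```

By-reference passing is only needed for the last case, where the function is meant to change the caller's variable.
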
#4  -vertigo-, October 6th, 2003, 09:57 PM

Thanks for the link, it was informative. Unfortunately it didn't say why, when content referenced by one symbol gets referenced by another, all deferred copies of that content are performed immediately.

Surely those copies could wait longer, since the content hasn't been changed yet. That would remove the main drawback of passing by reference, namely that it can cause copies to be performed unnecessarily.

Any ideas on that?

#5  brut, October 9th, 2003, 09:59 PM

Hey Vertigo-
Sorry, been a while getting back. I was wondering what method you're using to determine whether or not a copy has taken place?

#6  zombie, October 9th, 2003, 10:07 PM

me too! i didn't get that part. i don't even know if it can be done.

#7  -vertigo-, October 15th, 2003, 12:21 AM

Sorry for the late reply, I haven't been online much lately.

Obviously it is impossible to see how long a single copy of an ordinary short string takes, so I read a 4.4MB mp3 into a string and did the tests with that.

I also ran the tests in a loop that runs 100 times, so the difference in running time is definitely noticeable. On my setup 100 copies of that string take 1.4 seconds, so it is easy to tell how many copies are being done per iteration: divide the running time by 1.4 seconds.

I don't know any fancy way of doing this.

Here is an example of a loop like I used:

php Code:
// Assumed helper: getmicrotime() is not a PHP built-in, so it is defined here
// in the usual PHP 4 way. Returns wall-clock time in seconds as a float.
function getmicrotime() {
    list($usec, $sec) = explode(' ', microtime());
    return (float)$usec + (float)$sec;
}

function file_string($filename) {
    // read a large file into a string, don't use file_get_contents, it sux.
    $fd = fopen($filename, "rb");
    $content = fread($fd, filesize($filename));
    fclose($fd);
    return $content;
}

$s = file_string('F:\mp3\Skunk anansie\Skunk Anansie - Brazen (weep).mp3');

$i = 0; // this seems to have an effect on the time
$t1 = getmicrotime();
for ($i = 100; $i > 0; --$i) {
    $a = $s;  // delayed
    $b =& $s; // forces $a's copy of $s (see explanation below for why)
    unset($b);
    $b = $a;  // delayed
    $a = 1;   // $a's old content is kept, since a delayed copy on it exists
    $c =& $b;
    // above: $b's copy of $a is needed, old content has refcount 0, 1 deferred copy, content is inherited
    unset($c);
}
$t2 = getmicrotime();

print 'Time: '.($t2-$t1)."<br>\n";
unset($a); unset($b); unset($c);
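
The timing idea above can also be packaged as a small helper. This is a sketch for a current PHP (closures and microtime(true) do not exist in 4.2.3); averageSeconds() is a hypothetical helper, and the 4 MB generated string stands in for the mp3.

```php
<?php
// Sketch: average wall-clock time of a callable, assuming PHP 7 or later.
// averageSeconds() is a hypothetical helper, not part of the script above.
function averageSeconds(callable $work, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $work();
    }
    return (microtime(true) - $start) / $iterations;
}

$big = str_repeat('x', 4 * 1024 * 1024);   // stand-in for the 4.4MB file

// Assignment only: the copy stays deferred.
$assignOnly = averageSeconds(function () use ($big) {
    $copy = $big;
}, 100);

// Assignment plus a one-byte write: the write forces a real copy.
$assignAndWrite = averageSeconds(function () use ($big) {
    $copy = $big;
    $copy[0] = 'y';
}, 100);

printf("assign only: %.6f s, assign and write: %.6f s\n", $assignOnly, $assignAndWrite);
```

Comparing the two averages gives a per-copy cost without having to divide a total running time by hand. Timings will still fluctuate with machine load.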

#8  brut, October 15th, 2003, 09:32 PM

I modified your script to output the time for each step. The results were strange. I'm using an mp3 as well (11.8MB). The time of each loop is consistently inconsistent.
Look here:
http://dev.mcstore.com/~brut/mem.php
It alternates, roughly, between 0.26 seconds and 0.51 seconds every other loop.
I can't figure out why this would be.

#9  -vertigo-, October 17th, 2003, 04:24 PM

Brut, I ran your code and the times were pretty consistent for me.

Perhaps you have other processes in the background that are making the times irregular. I get about 20% fluctuation in the running times, but you were getting 100%. If you don't have output buffering enabled, try turning it on; perhaps that will make the times more regular.

My original question still stands: why are pending copies performed as soon as a reference is made?

#10  zombie, October 17th, 2003, 11:16 PM

short answer is probably: to speed things up.

see, when the php interpreter needs to copy some variable to another, it doesn't know if you are going to only read from that other variable, or also write to it.

if you only read, it is stupid to create another copy (so you get two variables pointing to the same memory space). on the other hand, if you modify it, you need two different variables, and thus you get two of them.
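
The read-versus-write distinction described above can be made visible with memory_get_usage(). This is a sketch for a current PHP build; the function name and the 1 MB test string are made up for illustration.

```php
<?php
// Sketch: the deferred copy becomes a real one at the first write.
// measureCopyOnWrite() is a hypothetical name used only for this illustration.
function measureCopyOnWrite()
{
    $original = str_repeat('x', 1000000);   // about 1 MB

    $start = memory_get_usage();
    $copy = $original;                      // deferred: both names share one buffer
    $afterAssign = memory_get_usage();
    $copy[0] = 'y';                         // first write: the buffer is duplicated
    $afterWrite = memory_get_usage();

    return array(
        'growth_on_assign' => $afterAssign - $start,
        'growth_on_write'  => $afterWrite - $afterAssign,
        'original_intact'  => ($original[0] === 'x'),
    );
}

print_r(measureCopyOnWrite());
```

The heap should barely move on the assignment and grow by roughly the size of the string on the write.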

#11  -vertigo-, October 19th, 2003, 07:28 PM

Zombie, thanks for your patient answer. Unfortunately, that wasn't what I wanted to know. I think I should have phrased the question better. I know why they don't do the copies until they are needed, as you explained.

But notice that as soon as you make a reference to a variable, if that variable holds a deferred copy of another, the copy is performed immediately. Also, if I write ($a = $s) and then ($b = &$s), $a's copy of $s is performed immediately. It isn't delayed there.

So my question is not why do they delay the copies, but rather why do they perform the delayed copies immediately when you make a reference?

Surely it would be better to still leave those copies delayed, even if you make a reference, because perhaps content could be inherited. Here's an example:

php Code:
$a = $s; //this copy is left for later
$b = &$s; //here a's copy is done, because ref made
$a = 1; //notice here $a never really needed a copy of s, that copy was a waste


Hopefully this makes it more clear.
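
To see what a given PHP build does in this exact scenario, one option is to measure heap growth at the moment the reference is made. The sketch below assumes nothing about the result, since it depends on the engine version; the function name and the 1 MB string are illustrative.

```php
<?php
// Sketch: report how much the heap grows when a reference is taken while a
// deferred copy is pending. No particular result is assumed.
// probeReferenceCost() is a hypothetical name used only for this illustration.
function probeReferenceCost()
{
    $s = str_repeat('x', 1000000);   // about 1 MB
    $a = $s;                         // deferred copy

    $before = memory_get_usage();
    $b =& $s;                        // the step this thread is asking about
    $growth = memory_get_usage() - $before;

    $a = 1;                          // $a never needed its own copy
    return array(
        'growth_on_reference' => $growth,
        's_intact'            => (strlen($s) === 1000000),
    );
}

$probe = probeReferenceCost();
printf("Heap growth when the reference was made: %d bytes\n", $probe['growth_on_reference']);
```

Growth close to the size of the string would suggest the pending copy was performed right away; growth near zero would suggest it was left deferred.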

#12  zombie, October 28th, 2003, 12:19 AM

ahh.. i get your question now. sorry.

but maybe the answer is very simple.

it might be that they thought the scenario you are presenting is not that common, and that to fully cover it (in the fastest way for the algorithm) might take too much code or just be too hard.

so they "cut the corner" and just made it that (simpler) way.


anyway, you don't really need to know all this. from your point of view, it will always act the same (except for some minor speed differences in some weird cases), and your script should *never* rely on that, because they can just as well change the inner workings in some newer version, and you end up with a (much) slower script.

#13  xs0, November 25th, 2003, 03:53 PM