While using an optimized math function from one of my previous posts in a real project, I got a really interesting surprise: the gain from the optimized function was much bigger than the one I had profiled in that post.
Great news, but what happened?
Here is what my profiling test looked like:
var i:Number, j:Number;
var val:Number = 0;
const REPS:Number = 14;
const INC:Number = 0.01;

//------------test0
var calibrateTime:Number = AccurateTimer.calibrateTimer();
var t:int = AccurateTimer.startTimer();
for (i = -REPS; i < REPS; i += INC) {
    for (j = -REPS; j < REPS; j += INC) {
        val = Math.atan2(i, j);
    }
}
val = 0; // the result of the loop is thrown away here
var time0:Number = AccurateTimer.endTimer(t, calibrateTime);
Blazing fast: 46 ms!
Now let's change one tiny thing in the test: instead of val = 0 after the loop, let's put val++.
//------------test1
var calibrateTime:Number = AccurateTimer.calibrateTimer();
t = AccurateTimer.startTimer();
for (i = -REPS; i < REPS; i += INC) {
    for (j = -REPS; j < REPS; j += INC) {
        val = Math.atan2(i, j);
    }
}
val++; // was val = 0; in test0
var time1:Number = AccurateTimer.endTimer(t, calibrateTime);
How fast is the new test? 590 ms!
That is really slow. What happened?
Maybe the compiler realized that val was set for nothing in the first test and discarded the useless work for me?
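A quick back-of-the-envelope check makes that suspicion more concrete. The sketch below is TypeScript rather than ActionScript, so that it can run outside Flash. It only counts how many times the loop body executes and divides the two timings quoted above by that count. The variable names other than REPS and INC are mine, not part of the original test.

```typescript
// Same loop bounds as the tests in the post; only the counting is new.
const REPS = 14;
const INC = 0.01;

// Count how many times the inner statement runs.
let calls = 0;
for (let i = -REPS; i < REPS; i += INC) {
  for (let j = -REPS; j < REPS; j += INC) {
    calls++;
  }
}

// Timings quoted in the post, in milliseconds.
const test0Ms = 46;
const test1Ms = 590;

// Convert each timing to nanoseconds per Math.atan2 call.
const nsPerCallTest0 = (test0Ms * 1e6) / calls;
const nsPerCallTest1 = (test1Ms * 1e6) / calls;

console.log(calls, nsPerCallTest0.toFixed(1), nsPerCallTest1.toFixed(1));
```

The body runs roughly 7.8 million times, so test0 works out to about 6 ns per call and test1 to about 75 ns per call. That is a gap of more than 12x between two loops that only differ in one statement after the loop has finished.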
Let's find out by looking at the compiled bytecode.
Here is a comparison of the generated code for test0 and test1:
They are identical, except of course for the label numbers and for the last statement:
val=0 compiled to PushByte(0) ConvertDouble() SetLocal(3)
val++ compiled to GetLocal(3) Increment() ConvertDouble() SetLocal(3)
So the compiler optimized nothing, which means the optimization must be done on the fly, at runtime, by Adobe's player!
Here are two more tests:
test2: here the loop needs the previous value of val, since it does val +=, yet val is still reset to zero right after the loop
//------------test2
t = AccurateTimer.startTimer();
for (i = -REPS; i < REPS; i += INC) {
    for (j = -REPS; j < REPS; j += INC) {
        val += Math.atan2(i, j);
    }
}
val = 0;
var time2:Number = AccurateTimer.endTimer(t, calibrateTime);
test3: I added an indirection with val2 = val, but val2 is also reset to zero after the loop
//------------test3
t = AccurateTimer.startTimer();
for (i = -REPS; i < REPS; i += INC) {
    for (j = -REPS; j < REPS; j += INC) {
        val = Math.atan2(i, j);
        val2 = val;
    }
}
val = 0;
val2 = 0;
var time3:Number = AccurateTimer.endTimer(t, calibrateTime);
Here are the results:
These results make optimizing, profiling, and comparing a little harder. We really need to be careful with the test code, and the best approach is to stay as close as possible to a real use case.
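One way to stay closer to a real use case is to make sure the value computed in the loop is still needed once the timer has stopped. Here is a minimal sketch of that idea, again in TypeScript rather than ActionScript so it can run outside Flash. The timeAtan2 helper and its checksum field are hypothetical names of mine, not part of AccurateTimer, and Date.now() stands in for the timer. Whether a given runtime would drop the work without this precaution is an assumption on my part; the sketch only shows how to keep the result alive.

```typescript
// Hypothetical helper: times a grid of Math.atan2 calls and returns
// the accumulated result together with the elapsed time.
function timeAtan2(reps: number, inc: number): { elapsedMs: number; checksum: number } {
  let checksum = 0;
  const start = Date.now();
  for (let i = -reps; i < reps; i += inc) {
    for (let j = -reps; j < reps; j += inc) {
      // Accumulate instead of overwriting, so every call contributes.
      checksum += Math.atan2(i, j);
    }
  }
  const elapsedMs = Date.now() - start;
  // Returning the checksum means the caller still depends on the loop's work.
  return { elapsedMs, checksum };
}

// Usage: print both values so the result is observably consumed.
const result = timeAtan2(14, 0.01);
console.log("elapsed ms:", result.elapsedMs, "checksum:", result.checksum);
```

Accumulating adds one floating-point addition per iteration, so the measured time includes a small overhead. When comparing two implementations this is usually acceptable, as long as both go through the same harness.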
Given those results, I updated my previous posts (pow, atan2, reciprocal …), and the good news is that most of my optimizations turn out to be much better than what I originally measured.
What do you get?
Full Code
protected function mathProfiling():void {
    var i:Number, j:Number;
    var val:Number = 0, val2:Number;
    const REPS:Number = 14;
    const INC:Number = 0.01;

    //------------test0
    var calibrateTime:Number = AccurateTimer.calibrateTimer();
    var t:int = AccurateTimer.startTimer();
    for (i = -REPS; i < REPS; i += INC) {
        for (j = -REPS; j < REPS; j += INC) {
            val = Math.atan2(i, j);
        }
    }
    val = 0;
    var time0:Number = AccurateTimer.endTimer(t, calibrateTime);

    //------------test1
    t = AccurateTimer.startTimer();
    for (i = -REPS; i < REPS; i += INC) {
        for (j = -REPS; j < REPS; j += INC) {
            val = Math.atan2(i, j);
        }
    }
    val++;
    var time1:Number = AccurateTimer.endTimer(t, calibrateTime);

    //------------test2
    t = AccurateTimer.startTimer();
    for (i = -REPS; i < REPS; i += INC) {
        for (j = -REPS; j < REPS; j += INC) {
            val += Math.atan2(i, j);
        }
    }
    val = 0;
    var time2:Number = AccurateTimer.endTimer(t, calibrateTime);

    //------------test3
    t = AccurateTimer.startTimer();
    for (i = -REPS; i < REPS; i += INC) {
        for (j = -REPS; j < REPS; j += INC) {
            val = Math.atan2(i, j);
            val2 = val;
        }
    }
    val = 0;
    val2 = 0;
    var time3:Number = AccurateTimer.endTimer(t, calibrateTime);

    startTest();
    setTest("test0", "atan", time0);
    setTest("test1", "atan", time1);
    setTest("test2", "atan", time2);
    setTest("test3", "atan", time3);
    endTest();
}
I get the same results as you, roughly, and agree that you should always make sure that you use the results of your code to avoid optimization at the compiler or JIT layers. It can be a bit annoying, but accurate results are worth it!
I got test3 as the winner.
http://prntscr.com/23vg38