After trying to convert an integer to a Number faster, let's explore some new conversion tricks.
After reading the blog post “Understanding fast float/integer conversions”, let's try some of its techniques!
This time, I want to convert a byte, or an integer in [0;255], to a Number and normalize the result to [0;1].
const oo255:Number = 1/255;
And the conversion code:
var n:Number = byteValue * oo255;
Can we go faster?
First, let's fill a lookup table when our program starts:
for (var i:int = 0; i < 256; i++) {
    // a double takes 8 bytes, so entry i is stored at byte offset i<<3
    fastmem.fastSetDouble(i * oo255, i << 3);
}
And our conversion code becomes:
var n:Number = fastmem.fastGetDouble(byteValue << 3);
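To make the `<<3` offset concrete: a double occupies 8 bytes, so table entry i starts at byte i*8. The sketch below builds the same table with the standard ByteArray API instead of the fastmem calls. It only illustrates the memory layout and is not the fast path being measured here; the name `table` is made up for this example.

```actionscript
import flash.utils.ByteArray;
import flash.utils.Endian;

// Illustration only: a plain ByteArray laid out like the fastmem table.
var table:ByteArray = new ByteArray();
table.endian = Endian.LITTLE_ENDIAN;
table.length = 256 * 8;              // 256 entries, 8 bytes per double

const oo255:Number = 1 / 255;
for (var i:int = 0; i < 256; i++) {
    table.position = i << 3;         // i << 3 == i * 8, byte offset of entry i
    table.writeDouble(i * oo255);
}

// Read entry 128 back from byte offset 1024.
table.position = 128 << 3;
var n:Number = table.readDouble();   // the double stored for 128 * oo255
```

The table needs 256 * 8 = 2048 bytes, so it fits comfortably in a small buffer.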
Which version is faster? Let's find out...
Okay, the lookup table is slower, but not by much, so a LUT for a formula with more math than a single multiply should quickly become the faster option!
So here, let's assume we want to add a simple ease-in/ease-out function, a very simple sigmoid:
n = n*n*(3-2*n)
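As a quick check that this polynomial really eases in and out on [0;1], here is a short worked derivation. The function name s is introduced only for this note; the curve is the one commonly known as smoothstep.

```latex
s(n) = n^2 (3 - 2n) = 3n^2 - 2n^3
\qquad
s(0) = 0, \quad s\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2}, \quad s(1) = 1

s'(n) = 6n - 6n^2 = 6n(1 - n)
\qquad
s'(0) = s'(1) = 0
```

The slope is zero at both ends and the endpoints are preserved, which is what gives the ease-in/ease-out shape. It also shows the cost of the direct version: a handful of multiplies and subtractions per value.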
Which method is best?
1) Direct computation:
a = val * oo255; a = a * a * (3 - a - a);
2) LUT and Alchemy:
First, initialize our LUT:
for (i = 0; i < 256; i++) {
    var n:Number = i * oo255;
    n = n * n * (3 - n - n);
    fastmem.fastSetDouble(n, i << 3);
}
Now, the only call we need to profile:
a = fastmem.fastGetDouble(val<<3);
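Before comparing speeds, it is worth confirming that the table and the formula return the same values. The sketch below is a hypothetical helper (`maxSigmoidLutError` is not part of the original code) and uses a plain ByteArray rather than fastmem, so it checks the table contents only, not the Alchemy access path.

```actionscript
import flash.utils.ByteArray;
import flash.utils.Endian;

// Hypothetical helper: largest difference between the table lookup and
// the direct formula over all 256 byte values.
function maxSigmoidLutError():Number {
    var oo255:Number = 1 / 255;
    var table:ByteArray = new ByteArray();
    table.endian = Endian.LITTLE_ENDIAN;
    table.length = 256 << 3;         // 256 doubles, 8 bytes each

    var i:int;
    var n:Number;

    // Fill the table with the sigmoid of every normalized byte value.
    for (i = 0; i < 256; i++) {
        n = i * oo255;
        table.position = i << 3;
        table.writeDouble(n * n * (3 - n - n));
    }

    // Compare each stored entry with the directly computed value.
    var maxError:Number = 0;
    for (i = 0; i < 256; i++) {
        n = i * oo255;
        var direct:Number = n * n * (3 - n - n);
        table.position = i << 3;
        var diff:Number = Math.abs(table.readDouble() - direct);
        if (diff > maxError) maxError = diff;
    }
    return maxError;
}

trace(maxSigmoidLutError());
```

Since both paths run the same arithmetic on the same inputs, the reported error should be zero or vanishingly small. Anything larger would point to a wrong offset or an endianness mismatch.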
Which method is faster?
The LUT is only slightly faster, so, lessons learned:
* Flash can be pretty good at brute-force math if it is carefully coded.
* Lookup tables (LUTs) are not always faster.
Next time... we could try a slower function, such as:
What do you get?
The code:
// Needs flash.utils.ByteArray, flash.utils.Endian, flash.utils.getTimer and the Azoth fastmem class.
protected function byteto01():void {
    // initialize alchemy
    var ba:ByteArray = new ByteArray();
    ba.endian = Endian.LITTLE_ENDIAN;
    ba.length = 4096;
    fastmem.fastSelectMem(ba); // azoth

    var i:int = 0;
    var a:Number;
    var val:int;
    var oo255:Number = 1.0 / 255.0;

    // Fill our look-up table:
    for (i = 0; i < 256; i++) {
        fastmem.fastSetDouble(i * oo255, i << 3);
    }

    // Test 1: direct multiply
    var time1:uint = getTimer();
    for (i = 0; i < 50000000; i++) {
        val = i & 255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
        a = val * oo255;
    }

    // Test 2: look-up table
    var time2:uint = getTimer();
    for (i = 0; i < 50000000; i++) {
        val = i & 255;
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
    }
    var time3:uint = getTimer();

    // With a simple sigmoid - look-up table
    // y = x*x*(3-2*x)
    for (i = 0; i < 256; i++) {
        var n:Number = i * oo255;
        n = n * n * (3 - n - n);
        fastmem.fastSetDouble(n, i << 3);
    }

    // Test 3: direct sigmoid computation
    var time4:uint = getTimer();
    for (i = 0; i < 50000000; i++) {
        val = i & 255;
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
        a = val * oo255; a = a * a * (3 - a - a);
    }

    // Test 4: sigmoid look-up table
    var time5:uint = getTimer();
    for (i = 0; i < 50000000; i++) {
        val = i & 255;
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
        a = fastmem.fastGetDouble(val << 3);
    }
    var time6:uint = getTimer();

    startTest();
    setTest("n=i*oo255", "byteTo01", time2 - time1);
    setTest("n=alchemyLookupTable", "byteTo01", time3 - time2);
    setTest("n=i*oo255;n=n*n*(3-n-n)", "byteTo01_sigmoid", time5 - time4);
    setTest("n=alchemyLookupTable", "byteTo01_sigmoid", time6 - time5);
    endTest();
}