In a previous post, I explored ways to compute the Math.min or Math.max for Numbers (http://guihaire.com/code/?p=550)
What if we need to perform this operation when the floats are in a ByteArray ? Lets explore some interesting methods
So our ByteArray contains a list of floats or numbers, the easiest way is to use the ternary operator to choose which value to write back to the byteArray:
Here a very basic implementation (here using alchemy with apparat):
//get afloat and the following bfloat, save the min afloat = fastmem.fastGetFloat(adr); bfloat = fastmem.fastGetFloat(adr+4); /*get min using ternary operator between afloat & bfloat, and write result in offset 0*/ fastmem.fastSetFloat( afloat<bfloat ? afloat:bfloat , 0);
Only Positives floats
Now, if we know that all the floats are positive, then we can consider the byteArray as an array of integers, compare the integers pairs, and save the min … and it works !
The reason it works, is because The IEEE float and double formats is designed so that the numbers are “lexicographically ordered”, and they are ordered the same way when their bits are interpreted as Sign-Magnitude integers. (So there will be more work for negatives numbers)
so in a byteArray full of positive floats, we can find the min by doing:
aint = fastmem.fastGetI32(adr); bint = fastmem.fastGetI32(adr+4); /*get min between afloat & bfloat, and write result in 0, this works if aint and bint are >=0*/ fastmem.fastSetI32( aint<bint ? aint:bint , 0);
This is great, because then we can use one of our bittwiddling to do min or max on integers:
aint = fastmem.fastGetI32(adr); bint = fastmem.fastGetI32(adr+4); //test that works when all values are positives: fastmem.fastSetI32(bint ^ ((aint ^ bint) & ((aint - bint)>>31)),0);
Only Negatives floats
Another special case, is if all the values are negatives, then we can simply reverse the test to get the proper min:
fastmem.fastSetI32( aint>bint ? aint:bint , 0);
An alternate method would be to invert the bits in the test : ~aint<~bint ? aint:bint
Positives and Negatives floats
What if the floats values in the byteArray are both negatives and positives ? Can we still read the floats as integers to compare ?
Yes! And here the trick !!! If we treat the values as uint:
* for negative values, if we invert all the bits, this means we can do the regular test: (~a < ~b) ? a : b
Problem with that, is now we have a mix of positives and negatives values, we are going to get collisions.
* but if we always set the bit sign of the positive values, (and unset for negatives, which is done by inverting all the bits for negative floats), then reading the floats bits as uint, we make sure the reading makes all the floats positives values be bigger than the floats negatives values.
so we need to:
if(value<0) value = ~value;
if(vale>=0) value = values | 0x80000000;
We can combine both tests:
aint = (aint^((aint >> 31) | 0x80000000)
lets verify:
if aint>0 , aint>>31 is 0, (0|0x80000000) is 0x80000000, aint^0x80000000 -> this set the bit sign
if aint<0 , aint>>31 is -1,(-1|0x80000000) is -1 , aint^-1 -> this invert all bits.
So with this trick, we can now compare both positives and negatives floats by aliasing them as integers.
//get afloat and bfloat as integer: aint = fastmem.fastGetI32(adr); bint = fastmem.fastGetI32(adr+4); //test that works for negatives and positives values: fastmem.fastSetI32(((aint^((aint >> 31) | 0x80000000))<(bint^((bint >> 31) | 0x80000000)))?aint:bint,0);
We can try to rewrite this code in assembly and hope for the best:
__asm( GetLocal(adr), Dup, PushInt(4), AddInt, GetInt, //bint SetLocal(bint), GetInt, //aint Dup, Dup, PushInt(31), ShiftRight, PushInt(-2147483648),//PushDouble(2.147483648E9), BitOr, BitXor, GetLocal(bint), Dup, PushInt(31), ShiftRight, PushInt(-2147483648),//PushDouble(2.147483648E9), BitOr, BitXor, IfNotLessThan("L10"), Jump("L11"), "L10:", Pop, GetLocal(bint), //ConvertInt, "L11:", PushByte(0), SetInt );
okay ! lets profile
what do you get ?
Notes
What is interesting with the integer aliased method, is that we can now compare floats as integers, which means we should be able to sort floats with a Radix algorithm..
It will be the subject of my next post 🙂
The full code:
private function minmaxFloatByteArray():void { var REPS:int = 4096; var NBR_FLOATS:int = 4096; var i:int,j:int; var aint:uint; var bint:uint; var cint:int; var afloat:Number; var bfloat:Number; var adr:int; var min_float:Number; var min_alch:Number; //POSITIVES ONLY: //initialize an array of REPS floats in our fast byteArray. for(i= 0;i<NBR_FLOATS;i++) { //compute address where to get put float adr = (i<<2) + 4 //write a random positive float @ adr fastmem.fastSetFloat( Math.random() * REPS * 2 /*- REPS*/, adr ); //test positive values } var time1:uint = getTimer(); //TEST1 : get pair of floats, compare, write min float value @ adress 0 for(j=REPS;--j;) for(i= 0;i<NBR_FLOATS;i+=2) { //compute address where to get the float adr = (i<<2) + 4; //get afloat and the following bfloat afloat = fastmem.fastGetFloat(adr); bfloat = fastmem.fastGetFloat(adr+4); //get min between afloat & bfloat, and write result in 0 fastmem.fastSetFloat( afloat<bfloat ? afloat:bfloat , 0); } var time2:uint = getTimer(); //TEST2 : get int32,bit twiddling,compare, write min integer value, un-twiddle. for(j=REPS;--j;) for(i= 0;i<NBR_FLOATS;i+=2) { //compute address where to get the float adr = (i<<2) + 4; //get afloat and bfloat as integer: aint = fastmem.fastGetI32(adr); bint = fastmem.fastGetI32(adr+4); //test that works when all values are positives: fastmem.fastSetI32(bint ^ ((aint ^ bint) & ((aint - bint)>>31)),0); } var time3:uint = getTimer(); for(i= 0;i<NBR_FLOATS;i++) { //compute address where to get put float adr = (i<<2) + 4 //write a random negativ or positive float @ adr fastmem.fastSetFloat( Math.random() * REPS * 2 - REPS, adr ); //to test positives & negatives values } var time4:uint = getTimer(); //TEST1 : get floats, compare, write min float value for(j=REPS;--j;) for(i= 0;i<NBR_FLOATS;i+=2) { //compute address where to get the float adr = (i<<2) + 4; //get afloat and the following bfloat afloat = fastmem.fastGetFloat(adr); bfloat = fastmem.fastGetFloat(adr+4); //get min between afloat & bfloat, and write result in 0 fastmem.fastSetFloat( afloat<bfloat ? afloat:bfloat , 0); } var time5:uint = getTimer(); //TEST2 : get int32,bit twiddling,compare, write min integer value, un-twiddle. for(j=REPS;--j;) for(i= 0;i<NBR_FLOATS;i+=2) { //compute address where to get the float adr = (i<<2) + 4; //get afloat and bfloat as integer: aint = fastmem.fastGetI32(adr); bint = fastmem.fastGetI32(adr+4); //test that works for negatives and positives values: -- slower fastmem.fastSetI32( ((aint^((aint >> 31) | 0x80000000))<(bint^((bint >> 31) | 0x80000000)))?aint:bint,0); // some debug code to verify our results: //afloat = fastmem.fastGetFloat(adr); //bfloat = fastmem.fastGetFloat(adr+4); //min_float = (afloat<bfloat)?afloat:bfloat; //min_alch = fastmem.fastGetFloat(0); //if( min_float != min_alch) // throw("min error"); } var time6:uint = getTimer(); //TEST3 : same as above, but byte code optimized for(j=REPS;--j;) for(i= 0;i<NBR_FLOATS;i+=2) { //compute address where to get the float adr = (i<<2) + 4; __asm( GetLocal(adr), Dup, PushInt(4), AddInt, GetInt, //bint //ConvertInt, SetLocal(bint), GetInt, //aint Dup, Dup, PushInt(31), ShiftRight, PushInt(-2147483648),//PushDouble(2.147483648E9), BitOr, BitXor, GetLocal(bint), Dup, PushInt(31), ShiftRight, PushInt(-2147483648),//PushDouble(2.147483648E9), BitOr, BitXor, IfNotLessThan("L10"), Jump("L11"), "L10:", Pop, GetLocal(bint), //ConvertInt, "L11:", PushByte(0), SetInt ); // some debug code to verify our results: //afloat = fastmem.fastGetFloat(adr); //bfloat = fastmem.fastGetFloat(adr+4); //min_float = (afloat<bfloat)?afloat:bfloat; //min_alch = fastmem.fastGetFloat(0); //if( min_float != min_alch) // throw("min error"); } var time7:uint = getTimer(); startTest(); setTest( "(i<j)?j:i" , "byteArray_float+_min", time2-time1 ); setTest( "bit twiddling" , "byteArray_float+_min", time3-time2 ); setTest( "(i<j)?j:i" , "byteArray_float+-_min", time5-time4 ); setTest( "bit twiddling" , "byteArray_float+-_min", time6-time5 ); setTest( "bit twiddling asm" , "byteArray_float+-_min", time7-time6 ); endTest(); }
Leave a Reply