Swarm of the elements!!! AGH!

So over at the Actionscript.org forums another question was brought up about a specific effect in this thread. It interested me and I hopped right on making it for the blog here.

The effect is basically a cluster effect: when you click on an object, any similar objects cluster around it, and other objects get out of the way. I approached this problem by utilizing a node system. All nodes have sibling nodes that are “similar” to them and should cluster around them. With a little math we get this result:

To attack this problem I started with a construct of two basic classes:

The first, Node.as, is very simple. It extends the Sprite class so that it can contain DisplayObjects and Graphics, and have a position. On top of that it contains an array called “_siblings”. This is a list of all other nodes that are considered similar to it. I protected this array so that you can only add or remove objects of type “Node” to/from it. We don’t want any other kind of object appearing as a sibling or the engine would break! There is also a method for retrieving a clone of the array of all siblings. We don’t want to hand out a direct reference to this array, or it would allow other objects to put things in the array that are not of type Node. This is one of the downsides to working with untyped arrays in AS3. In the latest version of ActionScript Adobe has added typed arrays, which will allow much more freedom with storing objects in these kinds of lists.
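
To make the idea concrete, here is a minimal sketch of what a Node class along these lines might look like (the method names are my own guesses, not necessarily the ones in the actual download):

package
{
    import flash.display.Sprite;

    public class Node extends Sprite
    {
        // only Nodes may live in this list, so it stays protected
        protected var _siblings:Array = [];

        public function addSibling(node:Node):void
        {
            if (_siblings.indexOf(node) == -1)
                _siblings.push(node);
        }

        public function removeSibling(node:Node):void
        {
            var i:int = _siblings.indexOf(node);
            if (i != -1)
                _siblings.splice(i, 1);
        }

        // hand out a clone so callers can't put non-Nodes into the real array
        public function get siblings():Array
        {
            return _siblings.concat();
        }
    }
}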

Next we have the Engine, ClusterEvent.as, in which we perform all the actual work to move nodes around. The engine is a far more complex class than Node.as, so let’s dig into it to get a look at what it does exactly.

Properties

Note I protect ALL properties of my Classes. This is a good practice for any developer to work on. It gives you, the coder, the freedom to control any and all access to your properties, and to separate the internal variable name from the external property name. Note how I use the underscore for all internal properties, while the getter/setter names for those properties are more descriptive and lack the underscore.

Static constant properties, on the other hand, I keep public; they can’t be changed anyway because they are constants!

In the engine I store several properties:

  • _stage – a reference to the stage, which should be passed in through the constructor. Because the engine isn’t a DisplayObject but needs access to the stage, we need this reference
  • _nodes – an array of all the nodes in the node tree for this engine
  • _activeNode – an internal property that refers to the node being dragged at the moment, if any
  • _siblingPositions – a Dictionary of points that describe where all the siblings of the activeNode should be with respect to the activeNode

The first few methods in my Class are just getter and setter methods to access any protected variables I would like to be made public. Note that some of these properties perform extra tasks in setting the related protected value. For instance if I change the reference to the stage to something new I should remove all event listeners from the old stage reference first. Or when I hand out a copy of the node tree, I don’t want to give direct access to it, but instead a clone of the node tree.
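
As a rough illustration (this is a sketch, not the actual engine source, and the handler names are assumptions on my part), a stage setter and a node-tree getter might look something like this inside the engine class:

public function get stage():Stage
{
    return _stage;
}

public function set stage(value:Stage):void
{
    // clean up listeners on the old stage before swapping the reference
    if (_stage != null)
    {
        _stage.removeEventListener(Event.ENTER_FRAME, updateNodePositions);
        _stage.removeEventListener(MouseEvent.MOUSE_UP, dropCurrentNode);
    }
    _stage = value;
}

public function get nodes():Array
{
    // hand out a clone of the node tree, never the original array
    return _nodes.concat();
}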

I follow this up with two methods to start and stop the engine.

The Algorithm

The algorithm is fairly basic, but is made up of several methods to facilitate all the things that occur. First there are the event listeners. At the end of the Class I have 4 methods for adding and removing events. There is a MOUSE_DOWN listener set on each node in the node tree, and two listeners added to the stage: ENTER_FRAME for updating and lerping positions, and MOUSE_UP to stop dragging any active nodes.

There is also a small group of repetitive functions I abstracted from the main algorithm just to clean the code up. This includes: generating the Dictionary of points around the activeNode for the sibling cluster; lerping the position of a node to a specific point; and filtering the node tree of any siblings and returning a cloned array.

Finally there is the Algorithm for the effect. This is made of three parts:

  • grabNewNode – this stores a reference to the recently clicked node as the active node, and generates a cluster of points around it for all siblings.
  • dropCurrentNode – this removes any reference to an active node when the mouse is released. It also clears the dictionary of points.
  • updateNodePositions – Here we update the positions of all nodes in the node tree with respect to the activeNode

When updating the positions we do 3 basic steps (a rough sketch in code follows the list):

  • Update the activeNode – find the mouse position and lerp the currently active node toward it
  • Update sibling nodes – grab a copy of the array of siblings for the currently active node, compare it to the dictionary, and update their positions
  • Update non-sibling nodes – filter the node tree of all siblings and then update their positions if and only if they are too close to the active node.
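
Here is a rough sketch of what that update routine might look like. The helper names (lerpNode, the 0.2 lerp amount, the minimum distance) are my own placeholders, not necessarily what is in the actual source:

protected function updateNodePositions(event:Event):void
{
    if (_activeNode == null) return;

    // 1. lerp the active node toward the mouse
    lerpNode(_activeNode, new Point(_stage.mouseX, _stage.mouseY));

    // 2. lerp each sibling toward the cluster point stored for it
    var siblings:Array = _activeNode.siblings;
    for each (var sib:Node in siblings)
    {
        lerpNode(sib, _siblingPositions[sib]);
    }

    // 3. push non-siblings away only if they sit too close to the active node
    var minDistance:Number = 100; // arbitrary spacing picked for this sketch
    for each (var node:Node in _nodes)
    {
        if (node == _activeNode || siblings.indexOf(node) != -1) continue;
        var dx:Number = node.x - _activeNode.x;
        var dy:Number = node.y - _activeNode.y;
        var dist:Number = Math.sqrt(dx * dx + dy * dy);
        if (dist > 0 && dist < minDistance)
        {
            lerpNode(node, new Point(_activeNode.x + dx / dist * minDistance,
                                     _activeNode.y + dy / dist * minDistance));
        }
    }
}

protected function lerpNode(node:Node, target:Point, amount:Number = 0.2):void
{
    node.x += (target.x - node.x) * amount;
    node.y += (target.y - node.y) * amount;
}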

And that is it! In the package I’ve included 2 more classes though. The Main.as file contains an instance of all the nodes and the engine, and the BoxBug.as Class extends the Node.as Class so as to add some visual information to the Nodes.

link to ElementCluster source code

Vector is as Vector does

Hanging around on the internet I notice a lot of beginners to programming don’t have much of a math background. I’d like to approach this as carefully as possible. You as an amateur programmer should refresh your algebra and even check out some higher level mathematics. Calculus is good to know because if you know Calculus, you’ve mastered Trig and Algebra. Discrete Mathematics is also a very effective tool to have under your belt as programming is mostly all about Discrete arithmetic.

In this post today though, I’m gonna dig into some Algebra with everyone. Functional Algebra and Vector Algebra to be exact. The downside to this highly versatile tool is that a lot of American schools don’t discuss it in their Algebra courses, yet colleges expect you to know it. I distinctly remember my Calculus 3 teacher asking if we remembered Vectors from Geometry and Algebra 2 in High School… I for one only knew of them from teaching myself at home and never once had a high school class discuss the topic. So I’m going to assume a lot of you amateur programmers who haven’t taken higher level math classes also haven’t heard of them.

Variables and Functions

X, Y, Z, t, i, etc… What the hell is with all the letters? This is fairly basic: a letter represents an assumed value. The action of adding 1 and 2 is very similar to the action of adding 4 and 12. It’s only addition, and the process can be easily described without the need for actual values. If 4 + 12 = 16, we can also rearrange this to say 16 - 4 = 12. This same movement of values works no matter the values present. If a + b = c, then c - a = b.

I hope we all understand this, and variables representing values, be they constant or variable, shouldn’t be a new thing for most of us. I’m not here to discuss that. What I am here to discuss is that a variable doesn’t have to merely represent a numeric value. In algebra we’ve designated letters as representatives of numerical values merely as a standard. But that doesn’t mean letters are the only variables out there.

f(x) = 2x

The definition of an algebraic function is very simple: for any given input there is 1 and only 1 distinct output. In the function ‘f(x)’ above, no matter how many times you put the value 4 into the function, the value 8 will always be returned. By definition it is a function.

In Algebra class a lot of teachers make the mistake of letting the student create a distinct boundary between functions and variables. I personally believe this to be a bad choice on the educator’s side. They are the same thing in application, and you as a programmer should understand this fully. Let’s consider this on a programmatic level.

In Object Oriented Programming and several other programming flavours there is the “function” object. It is probably the most often used object in all OO languages, and they are called functions for a reason! Though some OO languages might allow for more complex outputs, the most simple programmatic function has 1 distinct output for any given input. Very often when we write some kind of expression, we can treat a function call just like a numeric value:

function manipulate(val1:int, val2:int):int
{
    return (val1 + val2) % val1;
}

var value:int = manipulate(3, 4) - 2;

This same practice takes place in Algebra and functions are treated just like variables are:

f(x) = 2x + 3

g(x) = x^2 + f(x)^2

The function g with respect to x can be expanded to mean:

g(x) = x^2 + (2x + 3)^2
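
In ActionScript the same idea looks like this (a trivial throwaway example, nothing from any particular library):

function f(x:Number):Number
{
    return 2 * x + 3;
}

function g(x:Number):Number
{
    // f is used just like a variable would be
    return x * x + f(x) * f(x);
}

trace(g(1)); // 1 + 5 * 5 = 26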

Vectors

What is a vector? A Euclidean Vector is an object with direction and magnitude.

Whoo, ummm… what the hell does that mean? It’s simple: a vector is similar to a line, except that it doesn’t have a position. By removing the position, the vector becomes very useful. You know how a line segment has a position, a slope, and a length? Well, a vector is the slope and length of the line, without the position.

The vector is described piecewise, and all operations are performed on it piecewise. This piecewise nature is the greatest thing about the vector: instead of describing a multidimensional object as a complex multivariable function, we can break it into multiple dimensions and alter each dimension on its own.

< x-dimension, y-dimension, z-dimension, ... >

each dimension has a standardized letter associated with it:

< i, j, k >

From here on out if I say i, j or k, I am referring to their respective dimension in the vector.

So let’s start making some of these former multi-dimensional equations more usable! How about the line:

y = 2x + 4

This is a line that passes through (0,4) with a slope of 2. We know that for every 1 unit we move along the positive x axis, we move 2 units up the positive y axis:

< 1, 2 >

and we start at the point (0,4) described as a vector pointing at that point:

< 0, 4 >

Now we can describe this line moving through 2 dimensions with 1 variable. Let’s call that variable t for time. Every 1 unit of time we move 1 unit right and 2 units up from (0,4):

f(t) = < 0, 4 > + t * < 1, 2 >

so where are we when t = 2?

< 0, 4 > + 2 * < 1, 2 >
< 0, 4 > + < 2, 4 >
< 0 + 2, 4 + 4 >
< 2, 8 >

Check back with y = 2x + 4. Our result says that i = 2 and j = 8… so when we are 2 units along the x axis we should be 8 units up the y axis:

y = 2(2) + 4
y = 8

it checks out just fine!
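
If you want to poke at this in code, here is a quick sketch of the same parametric line as a throwaway ActionScript function:

import flash.geom.Point;

function lineAt(t:Number):Point
{
    // f(t) = < 0, 4 > + t * < 1, 2 >
    return new Point(0 + t * 1, 4 + t * 2);
}

trace(lineAt(2)); // expect (x=2, y=8)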

Now some might be thinking: why use this? Slope-intercept form has worked fine for me all this time, and multi-variable functions aren’t that big a deal to me. Ok, well, let’s try something new:

x^2 + y^2 = 9

It’s a circle. Why is it a circle? Because your teacher told you it is. But there is nothing functional about it. We can’t describe this as a function of x because solving for y requires a square root, which gives 2 possible outcomes for any input (the top half of the circle and the bottom half). Let’s describe a circle in vector form now. We know the circle is centered at the origin with a radius of 3 by definition.

radius = 3
center = (0, 0)
base form of vector circle = < Cos( t ), Sin( t ) > || by definition

f(t) = center + radius * < Cos( t ), Sin( t ) >

f(t) = < 0, 0 > + 3 * < Cos( t ), Sin( t ) >

Graph it for t from 0 to 2PI and you get a circle.

Yes, it has some trig in it now, but it is functional as well. No matter the value of t that is put in, only one unique value is returned. Vectors are just starting to shed some light on what they can do! How about something like a spiral… what is the equation of a spiral in regular multi-variable form? I bet you don’t know, and it certainly isn’t functional. So let’s describe one in vector form.

Let’s describe a spiral that starts at (0,0) and expands by 1 unit in radius for every 2PI radians of rotation (every 360 degrees). Now a spiral is basically a circle that expands its radius at some rate, so let’s use the base form of a vector circle:

Radius = t / 2PI || note that when t = 2PI the radius is 1… and so on and so forth
center = (0,0)
base form of vector circle = < Cos( t ), Sin( t ) >

f(t) = < 0, 0 > + (t / 2PI) * < Cos( t ), Sin( t ) >

Graph it for t from 0 to infinity and you get a spiral expanding at the constant rate of 1 unit of radius for every 2PI radians of rotation. And again the function is functional: for any value of t put in, a unique value is returned.
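
As a quick sketch of how you might plot this in Flash (drop it in a frame script; the step size and pixel scale are arbitrary choices of mine, and swapping the radius term for a constant gives you the circle from before):

import flash.display.Sprite;

var spiral:Sprite = new Sprite();
addChild(spiral);
spiral.x = stage.stageWidth / 2;
spiral.y = stage.stageHeight / 2;

spiral.graphics.lineStyle(1, 0x000000);
spiral.graphics.moveTo(0, 0);

var scale:Number = 20; // pixels per unit, purely for visibility
for (var t:Number = 0.05; t <= Math.PI * 20; t += 0.05)
{
    // f(t) = center + (t / 2PI) * < cos(t), sin(t) >
    var r:Number = t / (2 * Math.PI);
    spiral.graphics.lineTo(r * Math.cos(t) * scale, r * Math.sin(t) * scale);
}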

To be continued…

Floating under the weight of an integer – Part 2

Errors are my way

This is a follow up to Part 1 of “Floating under the weight of an integer”. This article will assume you have read that article, or know something about converting decimal values to binary values. If you need a refresher return to: Floating under the weight of an integer – Part 1.

In the previous article I described how computer systems store decimal values in binary notation.

History of Number Systems
We humans developed decimal notation a couple thousand years ago. The numeral system we use today was originally devised by the wonderful math freaks over in India and brought to the West through the Middle East on trading routes (so history assumes). Before the current system we had several other systems of counting, which included:

  • Roman Numerals – most Americans and Europeans learn these in primary school
  • Greek Numerals – very similar to Roman Numerals, the symbols differ slightly
  • Babylonian Numerals – The Babylonians utilized a base 60 system… it consisted of 59 symbols, but no zero because the mathematical concept of zero had yet to be discovered

There are others as well. Do note, the earlier numeral systems we used in the West functioned very poorly for math. Take Roman numerals: addition was relatively simple:

V + I = VI
V – I = IV

some problems arose:

X – III = VII

It makes sense, true, but visually and literally it doesn’t follow any valuable pattern. This is because how the numerals were written didn’t follow a robust mathematical system. The Babylonians were the only ones who considered using a base system, but the system was so grotesquely massive (base 60) it is rather hard to even understand how higher level math could be written without looking like a massive splotch of ink! It reminds me of someone obsessed with circles or something, but wasn’t it the Greeks who invented the 360 degree circle?

It goes to show you though, a value can be written any number of ways. Our choice to use the symbols 0 -> 9 is completely arbitrary, and its origin is debated at length. I personally think the hand theory makes the most sense: we have 10 fingers, so we are used to a base 10 system and can conceptualize powers of ten easily. That is possibly also the reason why our naming system is built around tens. Keep in mind, the name of a value is rooted more in the way we perceive the value than in the value itself. “Twenty one” is as arbitrary a name as “vinentu”.

ten = 10 (decimal) = X (roman) = 1010 (binary) = A (hexadecimal) = 12 (octal)

Well, computers don’t have 10 fingers. Computers have two. Well, they don’t have fingers, but they have registers that can be on or off… which can be related to two. Binary made the most logical sense when designing the computer. It also helps that adding binary values is extremely simple:

101 + 10 = 111

Note how the bits just fall into place. If a 1 already exists in its place, the 1 gets pushed to the left; in subtraction, to the right. A child who has never seen the values 2 through 9 would probably find this arithmetic a million times easier than base 10 math. No need to remember that 3 and 5 is 8; you just remember 0 and 0 is 0, 1 and 0 is 1, and 1 and 1 is 10.
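
You can try this arithmetic right in the Flash Actions panel, since parseInt takes a radix and toString on a number does too:

var a:int = parseInt("101", 2); // 5
var b:int = parseInt("10", 2);  // 2
trace((a + b).toString(2));     // 111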

Significant Values

In the previous article I also described the storage technique of the Double Precision 64-bit Floating Point Value in computers. In it I showed how the 64-bit register is broken into 3 important sections: sign, exponent, mantissa/fraction.

The significant value stored in the mantissa is only 52 bits in length, which results in approximately 16 significant decimal digits. This level of significance, along with translating repeating values between binary and decimal, creates many arithmetic errors when performing float math. Most of these errors are so small that they are nearly insignificant, but at times the error is horribly bad. The most horrible errors are usually a result of the number of significant values that can be stored.

If you have taken a science course that deals with any math, you have probably heard of significant values. You may have also seen them in a Statistics course, or, for you great math nerds, in just about any of your courses.

Numerical values sometimes have what are called insignificant values. We see this often in the description of population. It is commonly stated that the world population is 6.6 billion. But we all know there is no way that we have exactly 6,600,000,000 people on earth. That is such a bizarrely round number of people to have. The population is more like 6,602,224,175 (July 2007 estimate). The number is so large though that we aren’t concerned about every individual, instead we describe it as an estimate in scientific notation.

Usually in science and statistics we reduce the number of significant values to 3 or 4. It makes the value much easier to read and understand. The value PI isn’t usually written as the infinite string it is, but instead as 3.1415. Or when measuring the length of something, we can’t capture the exact specific length, so we settle on the closest significant value. The table is 1.5 meters long… not 1.50000000000000000000001 meters long!

We limit the number of significant figures for several reasons.

  • The exact value is not that significant – 1.50000000000000000000001 as opposed to 1.5
  • The value is uncomfortable to read – 6,602,224,175 as opposed to 6.6 billion
  • It isn’t humanly possible to record the exact value – such as measuring the distance from the earth to sun
  • The value is so large that it is inconceivable at an exact value – such as the distance across the Milky Way galaxy
  • The container in which we store the value can not handle the size of the value – such as in the case of computers

The latter of these reasons is the one we are concerned with when it comes to programming a computer. Reducing significant values reduces the accuracy of the result. Sometimes the accuracy can be spared, but in a situation where accuracy is needed we want as many significant values as possible. Infinite would be a dream, but in a computer you do not have infinite amounts of space. It is a physical machine, thus it has physical constraints.

The average computer runs as a series of 32-bit registers in binary. The average modern processor has 2MB of cache. That is 16777216 bits to store 1s or 0s. In theory we could only store a value in binary with a significant length of 16777216 bits. That sounds like a lot, but again that’s just theory, and in actuality that cache needs to store a lot more than just one unsigned value.

A Double Precision 64-bit Floating Point Value can store 53 significant bits (52 stored bits plus an assumed leading 1). In my last article I showed some basic code that described this problem. Here it is again:

In the Flash Actions panel (or in any language with a double precision float) type the following code:

——-
var num:Number = 98765432109876543210;
trace(num);

the console should output:

98765432109876540000
——-

Let’s continue this on to show the amount of error it can cause in fractional values:

——-
var num:Number = 10000000000000000.999999999;
trace(num);

the console should say:

10000000000000000
——-

We lost all that decimal information! This is a problem. If we took the value 100000000000000000 and continually added 1 to it, the value would never change:

——-
var num:Number = 100000000000000000;

for (var i:int = 0; i < 10; i++)
{
    num++;
    trace(num);
}
-------

The 1 added on to 100000000000000000 is outside of the significant value range. It gets chopped off each time the arithmetic occurs! This is one problem with how significant values can affect you. Now you might think the difference between 100000000000000001 and 100000000000000000 is so insignificant it wouldn’t matter. But if we added 1 to it over and over, 100000000000000000 times, we should get 200000000000000000, a difference that is very significant. This is the kind of error we have to accept when dealing with Double Precision Floats.

Of course, it isn’t often we deal with extremely large numbers and extremely small values at the same time, and floats are great for this. As long as the values you are performing arithmetic on are in the same range as each other, floats perform fantastically well at representing extremely small and large values. You just can’t mix and match 40 quintillion with the fraction 4 billionths.

Repeating values make me cry
Significant values aren’t the only problem when dealing with floats. Decimal to Binary conversion can cause several errors as well.

Most of us are used to repeating values in decimal notation. 1/3 isn’t so pretty written out as a decimal: .3333333333~. This value has an infinite number of significant digits, and thus must be truncated if we want to store it in a float. But why does it repeat? We understand why irrational values like “PI” and “e” go on forever: they are irrational, and there is no ratio of whole numbers that represents their value. “Pi” is the ratio of a circle’s circumference to its diameter, c/d, but there is no pair of whole integers c and d that gives that ratio exactly. 1/3 does have such a pair, 1 and 3. So why does it still repeat? It merely has to do with the decimal system we are writing it in.

There is no congruency between 3 and 10. There are no multiples of 3 that are equal to a power of 10. For instance:

1/4 == 25 / 100

the product of 25 and 4 is 100, which equals 10^2.

There is no multiple of 3 that is equal to 10^n (when considering only whole integer values). This is merely a result of the base 10 system we are in. Values that repeat in decimal also repeat in binary, but it doesn’t work the other way around: there are plenty of non-repeating values in decimal that do repeat in binary.

Take for instance the value .9 or 9/10: in binary we get .11100110011001100~. Same goes for .1 or 1/10: in binary, .000110011001100~. Repeating values are actually far more abundant in binary when converting from decimal. Why, you might ask? Well, think about it: 1/2 is the simplest binary fraction, and that’s 5/10 in decimal. For a decimal fraction to terminate in binary, its denominator (in lowest terms) has to be a power of 2; any leftover factor of 5 in the denominator means the binary expansion repeats.

Repeating values have to be truncated to fit into the 52 bits of significant value a Double can store. This turns the value into an estimate, and it is no longer exact. When converting it back to decimal notation the value can sometimes change slightly due to this inaccuracy. This is why you will see small fractions of error when performing arithmetic with fractional values:

—–
var num = 0;

for (var i:int = 0; i < 50; i++)
{
    num += .1;
    trace(num);
}
-----

Note what the console outputs. Several times you get values with a long string of 9s trailing out of them, leaving an error on the order of .000000000000001. This is extremely common and you should just get used to it when dealing with decimal fractions. There are ways around it: strip long decimals to a significant value of your choice, or turn all fractional values into whole integer values first, though that still doesn’t work for values like 1/3, which repeat in decimal as well.

Floating under the weight of an integer – Part 1

Converting Decimal Values

Some people know what a float is, others don’t. Some know that floats have issues when performing arithmetic, others don’t. A lot don’t know why these issues occur, just that they do, and that’s what I want to talk about today.

A Floating Point Number, otherwise known as a float, is a way of storing extremely large values and decimal values on a highly integer-based system like a computer. There are several different formats for it; the two most common are the Single Precision 32-bit Floating Point Value and the Double Precision 64-bit Floating Point Value. Most of my tech threads happen to be about Flash right now, and I hang out at a lot of ActionScript forums, so I am going to describe the Double in this discussion, because that is the type of float Flash has used for years.

A double float is stored in a 64-bit register value. The 64 bits are broken into 3 major sections to describe a value in scientific notation:

sign –
the first leading bit denotes whether the value is positive or negative: 0 is positive, 1 is negative.

exponent –
the next 11 bits denote the exponent, the power of two the value is scaled by in scientific notation. The 11 bits can hold values from 0 to 2047, but we know we need negative exponents to create fractions less than 1 (or the misnomer “decimal values”). The stored value actually represents an exponent from -1022 to 1023, found by subtracting 1023 from the value shown. The values 0 and 2047 are reserved to represent +/- infinity, NaN and zero, mainly because none of those can be written as a significand with a leading 1 times a power of two. Why a leading 1? Well, that has to do with the mantissa.

mantissa / fraction –
the last 52 bits represent the significant value.

Different floating point formats use different ways of storing the mantissa or ‘significand’. In the double it is stored as a binary fractional value, giving 53 bits of significance from 52 stored bits plus one implicit bit. A normalized binary value greater than or equal to 1 always has a leading 1, no matter what, so we shave that 1 off and store the rest as the mantissa.

This results in log10(2^53) = 15.955, or approximately 16 significant decimal digits. Notice whenever you write a double float you can only store a significant string up to about 16 digits long. Let’s test that out.

In the Flash Actions panel (or in any language with a double precision float) type the following code:

var num:Number = 98765432109876543210;
trace(num);

the console should output:

98765432109876540000

Note the trailing 3210 was cut off, leaving 16 significant digits. Due to the conversion of decimal values to binary values, and the rounding algorithms in that conversion, some numbers keep 17 significant digits and others only 15:

var num:Number = 12345678901234567890;
trace(num);

will yield 17 sig values:

12345678901234567000

So let’s get into the math of this whole process for why errors come flying out every which way during float arithmetic.

Decimal to Binary conversion

Computers store values in registers. Registers have an on or off state. That is only two states, so we store values in binary notation. This means the register has only two symbols at its disposal, 1 and 0. No positive, no negative, no point, nothing but 1 and 0. So the simplest and easiest storage method is the unsigned integer.

Binary works just like decimal notation. Not decimal values like they taught you in school; as I said, that is a misnomer. Decimal does not mean fractional; decimal literally means “proceeding by tens”, hence the prefix ‘deci’ which means tenths. For you Europeans, think decimeter or dekameter.

In decimal notation we have 10 symbols: 0123456789. These are then used to represent a value by adding up these symbols multiplied by their ‘place’ in the number. For instance:

125 == 1 * hundreds place + 2 * tens place + 5 * ones place == 1 * 10^2 + 2 * 10^1 + 5 * 10^0

Binary works the same, but the places are of base 2. So it goes that:

101 == 1 * fours place + 0 * twos place + 1 * ones place == 1 * 2^2 + 0 * 2^1 + 1 * 2^0

So to convert a decimal value to binary is very simple. In very human terms the basics are as follows.

122

Convert by finding the highest power of two that fits into the value. Write down a 1 and subtract that power of two from the value. Now drop to the next lowest power (2^(n-1)). If it fits into what’s left, write down another 1 and subtract; if not, write down a 0. Then drop the power again. Do this until you’ve worked through all powers from n down to 0.

122 -> 64 || 2^6 == 1
58 -> 32 || 2^5 == 11
26 -> 16 || 2^4 == 111
10 -> 8 || 2^3 == 1111
2 -> 4 || 2^2 == 11110
2 -> 2 || 2^1 == 111101
0 -> 1 || 2^0 == 1111010

122 decimal == 1111010 binary
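
If you want to check a hand conversion in Flash, toString with a radix of 2 does it for you, and here is the subtract-the-powers method from above written out as a little helper of my own (not anything built in):

function toBinary(value:uint):String
{
    if (value == 0) return "0";

    // find the highest power of two that fits into the value
    var power:uint = 1;
    while (power * 2 <= value) power *= 2;

    // walk down through the powers, writing a 1 or a 0 for each
    var bits:String = "";
    while (power > 0)
    {
        if (value >= power)
        {
            bits += "1";
            value -= power;
        }
        else
        {
            bits += "0";
        }
        power = power >> 1;
    }
    return bits;
}

trace(toBinary(122));     // 1111010
trace((122).toString(2)); // 1111010 -- the built-in way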

How about fractional values? Let’s look at this process in more human/layman’s terms for the sake of ease. First we scrap the sign and store it at the front of the double float; now we have an unsigned value. Next we split the number into two parts, the part greater than or equal to 1 and the part less than 1, basically splitting it at the decimal point.

12.5 -> 12 and .5

We convert the non-fractional part like before:
12 -> 8 == 1
4 -> 4 == 11
0 -> 2 == 110
0 -> 1 == 1100

Converting the fraction is a little different. To tell you the truth I don’t know exactly what the computer does internally to convert it, but so you can see how the two are equal, let’s go through a basic algorithm for finding it.

It’s pretty simple. Multiply the fraction by 2, and write down the whole value left of the decimal point; it can only be a 1 or a 0, no other value will result. Drop off that leading whole number and repeat with what’s left.

.5 * 2 = 1.0 == .1 in binary

For a better representation, how about converting .625:

.625 * 2 = 1.25 == .1
.25 * 2 = 0.5 == .10
.5 * 2 = 1 == .101

.625 decimal == .101 binary
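
Here is the same multiply-by-two routine written as a quick helper function (again just my own sketch for illustration; the maxBits guard is there so repeating fractions don’t loop forever):

function fractionToBinary(fraction:Number, maxBits:int = 16):String
{
    var bits:String = ".";
    var count:int = 0;
    while (fraction > 0 && count < maxBits)
    {
        fraction *= 2;
        if (fraction >= 1)
        {
            bits += "1";
            fraction -= 1; // drop the leading whole number and keep going
        }
        else
        {
            bits += "0";
        }
        count++;
    }
    return bits;
}

trace(fractionToBinary(.625)); // .101
trace(fractionToBinary(.1));   // .0001100110011001 -- the real expansion repeats forever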

Fractional Binary to Double Float

So now we have our value 12.5 in fractional binary 1100.1

To make it a float we normalize the value; this means we convert it to binary scientific notation.

1100.1 == 1.1001e+3 (that is, 1.1001 * 2^3)

The exponent then has 1111111111 (1023, the bias) added to it and the result is placed into the 11 exponent bits. Then we shave off the leading whole-number 1 and store the fractional part as the mantissa.

the float now looks like this:

0 10000000010 1001000000000000000000000000000000000000000000000000

to make it easier on the eyes, in hex:

0x4029 0000 0000 0000
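
If you want to peek at those raw bits from ActionScript itself, ByteArray can do it (a quick sketch; the default big-endian byte order means the first 32 bits read back are the sign, the exponent, and the top of the mantissa):

import flash.utils.ByteArray;

var bytes:ByteArray = new ByteArray();
bytes.writeDouble(12.5);
bytes.position = 0;

var hi:uint = bytes.readUnsignedInt();
var lo:uint = bytes.readUnsignedInt();

trace(hi.toString(16), lo.toString(16)); // 40290000 0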

Wasn’t that fun?