Vectors...what's the point?

Roquen · May 31, 2011, 11:15am

No, I don’t mean the “geometric” concept of a point. I mean literally, as in “why bother?” Since vectors is such an overloaded term, let me be clear that I’m only talking about vector analysis (a.k.a. that thing with the cross and dot product).

Topics containing questions about vectors have been (naturally) popping up and I’ve started to respond, then stopped because it’s just too much of a pain to explain in an understandable way. Don’t get me wrong. Vectors aren’t really hard. They are just confusing and difficult to (properly) use for a number of reasons. They are a useful tool to have access to, but they aren’t the only game in town and with additional knowledge of your other choice(s), you just might make your life much easier (and come to grasp vectors in a deeper way).

Now, having said all of that, let me limit what I’m going to babble about. I only really want talk about 2 dimensions. Not the 3 dimensions is any harder, but because I want as many people as possible to follow what I’m saying. Because most people here will have had some exposure to an alternate way to talk about geometric operations without vectors in 2 dimension: complex numbers. Actually I’m kind of hoping that just a little programming and algebra knowledge will be enough to follow. I’ll be throwing in some math-speak so people that haven’t run across the terms before will have a first exposure. A lot of the URLs that I link (MathWorld and Wikipedia) are very formal (aka almost unreadable) but I don’t know of any really down-to-earth descriptions.

Note that I’m not attempting to suggest that things like complex numbers are replacements for vectors. They are a complimentary tool. But most of the time you can do things at least as easy in complex if not significantly easier.

To a certain extent the real purpose of the post has nothing to do with vectors or even mathematics. The attempted “point” is: think outside the box. If you’re having trouble finding a solution, maybe you need to turn it around and look at it from a different point of view. In this case, change the mathematical type that you’re thinking in. As an aside, in some instances the real problem is that you need to change the question.

Tuples

Before saying anything about complex numbers, let’s talk briefly about Tuples, which are an order list of elements (or components). Where each component is accessed by some name or ordinal.

They can be logically used to represent any “chunk” of data. The easiest direct notion of a tuple is an array. Each component is simply referred to by it’s ordinal (position) within the array. Note that there is generally no restriction on what’s stored in the array (so arrays of references are perfectly fine). Some programming languages can only represent data-structures via tuples (notable some functional languages). Objects of prototype based languages have their member fields represented by a mapping of a name to some value, which is another obvious example of something that’s logically a tuple. What perhaps isn’t so obvious that the fields of a language like java are also representable as a tuple. It just simply required choosing some logical ordering of fields. I keep using the term “logical” to point out that this has nothing to do with implementation, and is simply choosing a standardized way to think about the data.

For my purposes here is no reason to go any further with tuples in general and we’ll move directly to n-tuples of real values. (Therefore reals are the scalars) Where ‘n’ is the number of components and each is real value (modeled by some floating point type in implementation). So I’ll write some 2-tuple as <a,b>, where a & b (as stated) are reals. The common way to write this tuple is (a,b), but I’m avoiding that notation for readability reasons. Now let start build up a base type (class) of 2-tuples by defining some functions (operators, methods, whatever ya want to call them).

From now on I’ll write scalar values as small letters and when representing some tuple symbolically as capitals. (I’m using := for definitions)

Define two tuples: X := <a,b> Y := <c,d>

Let’s define addition and subtract to be component by component: X+Y := <a,b> + <c,d> = <a+c, b+d> X-Y := <a,b> - <c,d> = <a-c, b-d>

With subtraction, we can define a negation: -X := -<a,b> = <0,0>-<a,b> = <-a,-b>

And define multiplying by a scalar to distribute (scalar means the type of component…in this case a Real), so for scalar ‘s’: sX := s<a,b> = <sa, sb>

However, since s<a,b> is a little neater, I’ll leave them written that way rather than expanding.

Now let’s stop and think. We could consider these two values to represent coordinates in the Euclidean plane, where the first is some distance from the origin in ‘x’ and the second is the distance in ‘y’. Actually saying Cartesian would be better, but I’m not that anal today. Then the following is a sample of some geometric interpretations we could make with the operations that we’ve defined:

addition: translation
negation: reflection about the origin (really it’s an inversion…but it’s commonly called a reflection)
scalar multiply: uniform scaling

And since we’re considering the values to be coordinates, we could also formulate various L_n norms and distances, such as L₂ (Euclidean, commonly simply called length or distance), L₁ (Manhattan) and L_Inf (Chessboard or Chebyshev).

Choosing the tuple to explicitly represent a coordinate from some predefined origin is exactly what one does when using a vector as a “position vector”. If we subtract one tuple from another, then the result can be consider the “direction” between two (a “direction vector” in, well, vectors). And we could “normalize” this tuple if we felt like it.

Now we could go a step further and note that we could write these tuples in a trig form: m <cos(t), sin(t)>

where, ‘m’ is a positive and ‘t’ is an angle on (-pi, pi]. Although, for the moment, this trig form is of little use, since none of our operations naturally ‘fit’ in this trig form. Notice that the magnitude (L₂ norm) of the inside of the tuple is one (sqrt(cos(t)2+sin(t)2)=1). Remember that this is just how we’re thinking about the data and doesn’t imply how we’re storing it.

And we could choose to think of the tuple components as represent radically different things: a magnitude and angle for another plane representation (or a cylinder with an implied axis in 3d), a pair of angles for points on a sphere or polar representation of a unit direction, etc. etc.

Basically this is how any mathematical ‘type’ can be looked at. It ‘is’ only the set of properties that it has. Any visualizations, interpretations, etc. are just that. So this 2-tuple can be used for anything we can fit within some sub-set of its properties, but a given usage should be not confused with any intrinsic meaning of the tuple (or type) itself. Also, the functions that I defined above make sense for the kind of tuples we’ll be talking about, but that doesn’t imply that they’re reasonable for all 2-tuples of reals.

Jumping directly to tuples is usually considered to be a “bad” way to learn anything, because it hides the underlying structure of the mathematical type being discussed. I’m of the opinion that, like complex number and vectors, being able to look at something in slightly different ways is always a plus. Sometime one will be more obvious, at other time the other will be. As well as which is the easier way will vary from person to person.

Ok, enough of that…

What’s in a name? And the rules of the game.

Now let’s starting thinking about algebras (I’m being very loose with terminology here. See here if you’re hardcore). In the tuples above, we didn’t assign any names to the components. One way to do so (for an algebra) is to say something like:

“The basis set is {i,j}”.

Now bear with me, 'cause this is kind hard. This is saying the the first component of the tuple is associated with ‘i’ and the second component is associated with ‘j’. Shall I repeat it? Nah. Where the association is an implied multiplication. So for this example, the tuple <x,y> can be written in an explicit form like this:`

<x,y> = xi + yj`

and the tuple notation is now a shorthand for the above. The names of the basis “i” and “j” act like variables do in plain old algebra with couple slight differences. The first is that they never have a value. The are placeholders that separate the components when written in explicit form in the same way that the commas do in the tuple notation. The second role that they play is that they are used to define the core rules of the algebra, from which everything else springs from. The rules are called either: axioms or postulates.

We won’t bother with any rules for the moment. Going from a basis set and defining axioms and showing how they work is typically called construction. Aside: it’s not uncommon for a given type to have many possible constructions.

Vectors aren’t Complex

Vector analysis is a strange mathematical type for a number of reasons. There’s no point into going into any of my “issues” with vectors, other than one. It doesn’t have a product. As far as the properties of vectors, we’ve already developed all of them (informally…without a basis & axioms) in the section on tuples. Well almost. There are two operators (functions) defined via axioms that we need to add. The cross and dot products. The axioms are a mess, so you can look them up on the web somewhere, if you care. The basis set is {i,j}. Here are the functions (for 2D):`

<a,b> . <c,d> := ac+bd
<a,b> x <c,d> := ad-bc`

Notice that the results of both are scalars and thus cannot be shown in tuple notation. (Actually the cross product result is a bivector, which is the pseudoscalar of 2D…but enough of that, we’ll pretend it’s a scalar) If vectors included a scalar part then they would have an unnamed basis called ‘1’. I know that probably doesn’t make sense yet, but wait for it. The cross and dot product are very useful and there are plenty of resources that describe their properties.

The (Historic) Prisoner

OK. This section can easily be skipped. I want to do two things here. The first is informally show the start of a construction of a family of algebras and the second is to show that considering 'i = sqrt(-1)' in complex algebra is pretty much historical nonsense. It’s not “invalid” or anything, but it doesn’t provide any helpful insight (in fact I think that it’s harmful). Note that I’m out of the mainstream in this thinking, even considering the fact that it’s a paradox. So if why you can get meaningful results from nonsense doesn’t bother you and seeing a base construction doesn’t interest you…skip to the next bolded section.

Let’s start with a 2-tuple with the basis set {1,e}. The two components are Reals and we have a single axiom: “e² = k”, where “k” is a Real. The first basis ‘1’ is an unnamed basis and says that we won’t be adding any new rules for it (so it logically acts as multiplying by ‘1’). So for all members of this family tuples in the form <r,0> act like plain old reals.

Let’s form the products for this family of algebras:

Define ‘X’ & ‘Y’: X := <a,b> = a(1)+b[i]e[/i] = a+b[i]e[/i] Y := <c,d> = c(1)+d[i]e[/i] = c+d[i]e[/i]

Expand the product (just like “regular” algebra): XY = (a+b[i]e[/i])(c+d[i]e[/i]) = a(c+d[i]e[/i])+b[i]e[/i](c+d[i]e[/i]) = ac + ad[i]e[/i] + b[i]e[/i]c + b[i]e[/i]d[i]e[/i] = ac + ad[i]e[/i] + bc[i]e[/i] + bd[i]e[/i][i]e[/i] = ac + (ad + bc)[i]e[/i] + bd([i]e[/i]2)

From the axiom above we’ve stated that e² = k and is a real: XY = (ac + bdk) + (ad + bc)[i]e[/i] = <ac+bdk, ad+bc>

Let: k = 0 XY = <ac + bdk, ad + bc> = <ac, ad+bc>

Let: k = 1 XY = <ac + bdk, ad + bc> = <ac+bd, ad+bc>

Let: k = -1 XY = <ac+bdk, ad+bc> = <ac-bd, ad+bc>

When k=0, it form dual numbers, when k=1 forms split-complex numbers and of course when k = -1, then it’s complex numbers. In the complex case, since (what I’m calling e) squares to -1, then e=sqrt(-1) is the mainstream thinking. Consider the other two cases. For dual numbers e² = 0, then e must be 0 (if it were a number). That means the second component of the tuple can’t exist. x=a(1)+b(0)=a. In the split complex case, well it’s confusing since if ‘e’ where a number it would be either +/-1. As an aside I’ll bring up quaternions. The sqrt(-1) in quaternions has an infinite number of solutions which form a unit sphere. Quaternions can be described as a 4-tuple of reals with the basis set of {1,i,j,k}. Axiom: i²=j²=k²=ijk=-1. The mainstream thinking this means that i, j and k are all sqrt(-1), but that they’re magically independent of one another under addition. Pure silliness IHMO.

Feature creep

OK. I’m running out of steam and haven’t even started to get to the point. Complex numbers. There’s no reason to cover much about them since there are tons of resources on the web, so I’ll just focus on a couple things to make my point and touch on some basics that for some weird reason aren’t talked about.

First is that complex numbers have a product, where vectors have two partial products. Also, complex numbers have a multiplicative inverse. The advantage of having a product and an inverse is that you can manipulation equations algebraically. That way you can reason “algebraically”. Where with vectors one is almost always stuck reasoning geometrically or punting to linear algebra.

Recall the complex product is: <a,b><c,d> = <ac-bd, ad+bc>. We can now verify that the scalar multiplication that we just defined in the tuple section is valid:
s<x,y> = <s,0><x,y> = <sx-0y, sy-0x> = <sx,sy>

The interesting thing to note is that scale factors can be pulled out. So: (aA)(bB)=ab(AB). Combining this with the fact that multiplication distributes: A(B+C) = AB + AC. These two things together means the complex product is a linear transform.

If we were to think of one as being in the trig form that we had in the tuple section then:
m<cos(t), sin(t)><x,y> = m<x cos(t)-y sin(t), y cos(t) + x sin(t)>

Which is a scaled (by m) counter-clockwise rotation about the origin with an angle of ‘t’. So the product is a scaled rotation if one is considered to represent a scaled rotation and the other to represent a coordinate. If the one that represents the rotation is a unit complex number (m=1), then it’s simply a rotation about the origin. So if we call P a coordinate to rotate, R the rotation and P’ the point after rotation, then we have the super complex equation:
P' = RP

Let’s say that we have unit directions A and B and we want to know how to rotate A into B.
RA = B R = B(1/A) R = BA* (where the superscript star is conjugation…comes up later)

Rotating about the origin is OK, but we really need to be able to rotate about an aribrary point C. Well, let’s translate C to the origin, perform the rotation and then move it back:
P' = R(P-C)+C

I know that most people consider optimization to be bad, but I don’t:
P' = R(P-C)+C = RP-RC+C = RP+C-RC = RP+C(1-R) = RP+T

where T = C(1-R). Look at the last line. It is telling you that the rotation R about point C is equivalent to a rotation about the origin R followed by a translation T. And C is recoverable from T & R: C = T/(1-R) (assuming R isn’t 1).

Now the product of two in trig form:
m<cos(t), sin(t)> n<cos(s), sin(s)> = mn<cos(s)cos(t)-sin(s)sin(t), cos(t)sin(s)+cos(s)sin(t)>

which simplifies by trig identities to:
mn<cos(s+t), sin(s+t)>

Which is the composition of two scaled rotations about the origin. So rotation composition is also the product of two complex numbers.

The conjugate of complex number is defined as:
<a,b>* = <a,-b>

This definition is by axiom, which along with one for the product form all the properties of complex numbers. Now in trig form:
m<cos(t),sin(t)>* = m<cos(t),-sin(t)> = m<cos(-t), sin(-t)>

So conjugation negates the angle. Notice that this can be considered a reflection about the line of reals (which we’re thinking of as the ‘x’ axis). Now I could go on and show how to generalize reflections and how rotations, reflections, glide reflections and translation are all related, but my point isn’t a tutorial on complex numbers. So I’ll simply state that generalized reflections and glide reflections can be written as:
P' = RP*+D

which is a reflection if RD*+D=0 about the axis (line) with the direction of half the angle of R (in trig form) and through the point D/2. Otherwise it’s a glide reflection (about line with direction (RD*+D)/2 through point D/2). This equation along with the last rotation cover all the isometries of the plane when the magnitude of R in each is one.

Dude: what’s your point?

I’m getting there. After a brief detour. Complex conjugation is an involution, which is math-speak for a function that is its own inverse. (All reflections are involutions) So if some function ‘F’ is an involution, then F(F(X)) = X. Which, besides optimiz…err…formula reduction, is frequently used for extract/isolation of parts. Using this technique with addition and the conjugate yield:
(X+X*)/2 = (<a,b>+<a,-b>)/2 = <2a,0>/2 = <a,0> (X-X*)/2 = (<a,b>-<a,-b>)/2 = <0,2b>/2 = <0,b>
which isolate the real and complex parts respectively. Note that the algebraic form is only useful for equation manipulation and that (of course) you simply grab the parts for computation.

Lets now do the same thing for multiplication:
X^Y := (X*Y-XY*) = (<a,b>*<c,d>-<a,b><c,d>*) = <0,ad-bc> X.Y := (X*Y+XY*) = (<a,b>*<c,d>+<a,b><c,d>*) = <ac+bd,0>
where I’m using ‘^’ not for raising to a power (since I can do superscripts) but as a symbol for a partial product. We’ll call the first an exterior product and the second an interior product. If you compare the inner product with the vector dot product you’ll notice that they are the same (<ac+bd,0> is scalar ac+bd). The exterior product has the same form as the cross product, except that its result is multiplied by ‘i’ (<0,1>).

Notice that adding the exterior and interior products together yield:
X*Y = X.Y + X^Y XY* = X.Y - X^Y
So the product of two complex numbers when one is conjugated can be considered to be isolation of relative angle information between the two.
XY* = |X||Y|<cos(t),sin(t)>
Which is hopefully obvious if you expand in trig form.

So all the operations that you can perform in 2D vectors can likewise be performed in complex numbers. Complex numbers have the advantage it has more algebraic operators which are relatively easy to manipulate.

My tone has been anti-vectors, but that’s just posing for added impact. Vectors are important. From a practical standpoint so much existing work is in terms of vectors and chances are you’ll be required to learn them. They are also needed from a geometric standpoint as well…I’ll just leave it at that, cause it’s too hard to explain. Just remember that complex numbers and vectors don’t have the same properties, but you can often abuse both to represent things.

Let the flamewar begin.

P.S. I’m sure that somehow this means that Java is dead.
P.P.S. All mistakes are Riven’s fault (assuming that he’s mod that approved this thing)

CommanderKeith · June 1, 2011, 12:57pm

Thanks for posting this article, it’s interesting.

I’ve read it a few times, and your main point seems to be that complex numbers can be easier to work with mathematically. You say that this is because more operations are defined. Is this because there is always a solution when you deal with complex numbers? The classic example I remember from school is the problem of finding the roots of a parabola (where a parabola cuts the x-axis).

A parabola is y = ax² + bx + c
There will always be two solutions if you include complex numbers but when restricted to real numbers there could be 2, 1 or no solutions. This is because
x = (-b + (b² - 4ac)^1/2)/(2a) or x = (-b - (b² - 4ac)^1/2)/(2a)
so if (b² - 4ac) is negative then there is no solution in the real number plane, but there are still two in the complex plane.

So is that another example of what you mean, that complex numbers are a more general method and therefore are more convenient than vectors which are limited to the real number plane?

As an aside, why do you use e instead of i for complex numbers?

Roquen · June 2, 2011, 2:02pm

I used ‘e’ in that section at random…just to not use ‘i’ since I was babbling about other algebras than complex at the same time in an attempt to avoid confusion. I should have pointed out that ‘i’, ‘j’ and ‘k’ of vectors and ‘i’, ‘j’ and ‘k’ of complex (including quaternions) are not the same. They are very close so confusion is possible.

My attempted point is that complex numbers can be manipulated at the symbolic level in the exact same way that regular algebra can. So yeah, for alot of people working with equations in complex will be easier than working in vectors. Basically you have that extra binary operator (conjugation) added to the mix, with it’s set of basic identities and that’s about it. So once you have an equation or set of equations pasted together from what you already know, then your “geometric thinking” is done. You just mechanically manipulate the equation(s) in what ever direction you want to go. Also, since it’s more a verbose (if you will) type, then there’s a greater number of initial equations, that all say the same thing, that all converge to the same reduced result.

Obviously if you already know a solution in some other type there is no reason to re-examine it under complex. Except in special cases like where you have a sneaking suspicion that you might be missing something or there’s a bug/gotcha that you can’t track down.

Complex numbers when used to solve for roots of real functions is using them in a different manner. I don’t see any obvious meaning of interpreting the results as (say) some coordinates. But, that’s not to say that there isn’t. Regardless of what math ‘type’ you use to solve a geometric problem (of the type we’re talking about), the final numeric result will be the same. (Like anything, there are probably exceptions to this, but on the whole it’ll be true.) In some cases if you look at the low level operations occurring, they will be identical. In others they will not.

CommanderKeith · June 7, 2011, 2:24pm

Thanks for explaining. I’m not experienced enough with complex numbers to be able to manipulate them so easily, but it’s a very interesting way of looking at these problems. I wonder what other people think.

Have you studied maths at university or some other course?

Roquen · June 7, 2011, 3:50pm

I probably haven’t given enough examples, but I was running low on motivation.

Most of my math knowledge is self taught…needing to read up on X to be able to understand Y kind of things.

CommanderKeith · June 8, 2011, 4:57am

I wonder where complex numbers came up in your general self-taught study, it always seemed like the most obscure topic in my maths course at school. And I would never have guessed that there was an application of complex numbers in games.

It’s interesting though because lately I’ve had to look into exponential and logarithmic graphs, in relation to continuous versus discrete compounding growth rates, and there’s a few places where the numbers ‘e’ and ‘i’ crop up in the same equation like euler’s identity, which I am still trying to figure out (http://en.wikipedia.org/wiki/Euler’s_identity).

princec · June 8, 2011, 9:13am

Whilst I love complex numbers and entirely grok them when it comes to their use in fractals, I confess that this whole article sailed straight over my head with a whooshing noise Sometimes I’m glad to be a bit dim and leave the complicated thinky parts to other people!

Cas

Roquen · June 8, 2011, 2:33pm

I re-examined complex numbers back in the early 90s when learning quaternions. All quaternions that fall in a plane that contains the line of reals act identically to complex numbers. And like rotating a point with complex (translate to origin, perform operation and translate back), many quaternion problems solved in the same manner…rotate to plane where the operation is easy, perform the operation and rotate back. This is exactly how SLERP works BTW. This kind of thing is an ulterior motive for my article. Some of this stuff shows up over-and-over in mathematics. (Like the extraction bit for instance.)

Complex numbers basically parametrize an angle and magnitude. (I’m making myself cringe a-bit, but I’ll get over it). If you look at all the basic operations & functions of them, they are all very naturally described in the trig form. The only exception to this is addition. So they basically auto-magically contain lots (like my precision) of more complex (like my pun) functions.

On the exponential function (and all the other basic analytic functions) it’s more useful to think in terms of its series expansion, if one exists.
exp(x) = Sum (xn)/n!, for n=0 to Infinity

Using i²=-1, this can be broken into parts, which give a final result of:
exp(<a,b>) = exp(a)<cos(b), sin(b)> exp(<0,t>) = exp(0)<cos(t), sin(t)> = <cos(t), sin(t)> exp(<s,0>) = exp(s)<cos(0), sin(0)> = exp(s)<1,0> = exp(s)
So: exp(i Pi)+1 = exp(<0,Pi>)+1 = <cos(Pi), sin(Pi)>+1 = <-1,0>+1 = -1+1 = 0

While I’m babbling, the inverse function of exp is log:
log(m<cos(t), sin(t)>) = <log(m), t> log(<cos(t), sin(t)>) = <0, t>
So the pair cover (algebraically) converting between polar and rectangular coordinates.

Strictly speaking, raising to a real power ‘n’ is multivalued, but the principle power (one with the smallest angle) is:
(m <cos(t), sin<t>)n = mn <cos(n t), sin(n t)>
So raising to a power parametrizes the angle. For example if n is on [0,1] then it linear interpolates the angle from the line of reals to whatever the angle of the input complex number is. This is the core of how SLERP works. And if you set ‘n’ to 1/2 you get the square root:
sqrt(m <cos(t), sin<t>) = sqrt(m)<cos(t/2), sin(t/2)>
So the magnitude is square-rooted and the angle is halved. Square root of -1?
sqrt(<-1,0>) = sqrt(<cos(pi), sin(pi)> = <cos(pi/2), sin(pi/2)> = <0,1>
Big deal.

@princec - slacker! Seriously, is there anything that you’d want to understand that I could attempt to clarify?

ra4king · June 8, 2011, 7:25pm

I actually want to understand all of this since I just <3 math but I do believe I’m in a worse situation than princec where this entire article jet-ed over my head with a deafening sonic boom.

Roquen · June 9, 2011, 5:50am

What might help? More equation examples? Toy source code? Something else?

ra4king · June 9, 2011, 8:23am

Pictures! Lol, well I’m mainly a visual learner and I also learn through examples! When I was 1/4 reading into this article my brain started overheating and melting so I just skimmed through the rest.

Spasi · June 9, 2011, 10:58am

See this and this. Maybe our kids/grandchildren will use tools like that in school to learn math. I admit I really started to get a true grasp of math after getting familiar with 3D graphics, e.g. it’s very easy to understand what an equation does and how it behaves if you can see the result visualized in real-time.

Also want to say thanks to Roquen for this article, it’s very interesting.

princec · June 9, 2011, 2:28pm

I’m very task-oriented, that is, I only do something or find something out when it’s to actually solve a specific problem I’ve got. Right now my 2D games require so little in the way of maths beyond sin/cos/atan2/sqrt I don’t really know where I’d manage to squeeze an imaginary number!

Cas

Roquen · June 11, 2011, 11:58am

@princec - that’s what I figured.
@Spasi - thanks!
@ra4king - gotcha. I’ll completely ignore pictures and examples and post some example code…that what you can make your own interactive examples and pictures! It’s the computer game baby, static stuff is totally out! Seriously, if I motive myself I’ll do some applets with the code base that drawn stuff and spew numbers (unless some kind soul beats me to it.)

Roquen · June 15, 2011, 4:40pm

Here’s a first pass at some toy code.

http://pastebin.java-gaming.org/905a3437141

Roquen · March 8, 2012, 3:02pm

EXAMPLE: Rectangle intersection with Points and Circles

(Warning: banged out quickly…probably some mistakes. Code untested.)

Intersection tests don’t change under linear transforms. This is a fancy way of saying that we can translate, rotate & reflect (among other things) and the problem (the yes or no answer) remains the same.

If we draw some objects on transparency paper, we can slide (translate), twist (rotate) and flip the page over (reflect) and the relationship between them remains the same. To get a solution to a rotated rectangle vs. circle test, we’ll use all three of these transforms.

One of the simpliest ways we can look at a rectangle is when its center at the origin and the edges are aligned to the coordinate frame. So our rectangle is just defined by width & height.

Then testing if a point is inside is really easy:


  if (p.x >  width/2)  return false
  if (p.x < -width/2)  return false
  if (p.y >  height/2) return false
  if (p.y < -height/2) return false
  return true;

Two things to note. First we’re always dividing by two. Second two of these test are just reflections of the other two. Since our rectangle is at the origin, we can fold the paper at the x-axis and the bottom half matches the top, then we can fold at the y-axis. Regardless of where the test point was originally, the answer to our question doesn’t change.

Now our half-width and half-height are the postive distances to folded up edges in the first quadrant. We’ll put that into a 2d vector (or complex number) called delta.


  if (Math.abs(p.x) > delta.x)  return false
  if (Math.abs(p.y) > delta.y)  return false
  return true;

Now let’s allow the aligned rectangle to translate. We’ll call it’s translation ‘center’. All we need to do is reverse the translation back to the origin and we’re done:


  if (Math.abs(p.x-center.x) > delta.x)  return false
  if (Math.abs(p.y-center.y) > delta.y)  return false
  return true;

Now we’re almost done with the point version. Now we just need to allow the rectangle to be rotated. So let’s add a (unit) complex number called orientation to represent the rotation. As noted in the main text, to reverse a rotation you conjugatge and then multiply. So here’s the completed rotated-translated rectangle vs. point test:


public boolean intersects(Point p)
{
  // translate the system so the centroid of the rectangle
  // is at the origin.
  float tx = p.x - center.x;
  float ty = p.y - center.y;
  
  // unrotate the system so the edges of the rect are aligned
  // to the coordinate frame.
  float px = orientation.x*tx + orientation.y*ty;
  float py = orientation.x*ty - orientation.y*tx;
  
  // fold the problem into the first quadrant.
  px = Math.abs(px);
  py = Math.abs(py);
  
  if (px > delta.x) return false;
  if (py > delta.y) return false;
  
  return true;
}

Which can be reworked into this:


public boolean intersects(Point p)
{
  // undo the rectangle's translation
  float tx = p.x - center.x;
  float ty = p.y - center.y;

  // unrotate x, fold and compare
  if (Math.abs(orientation.x*tx + orientation.y*ty) > delta.x)
    return false;
  
  // unrotate y, fold and compare
  if (Math.abs(orientation.x*ty - orientation.y*tx) > delta.y)
    return false; 
  
  return true;
}

Now onto testing against a circle. First thing is we can ‘extend’ the rectangle by the radius. If the center of the circle is outside of this extended rectangle, then it cannot intersect. Likewise if the center of the circle is inside the unextended rectangle, then it must intersect. Modifing the first of the above gives us this:


public boolean intersects(Circle c)
{
  float tx = c.x - center.x;
  float ty = c.y - center.y;
  float px = orientation.x*tx + orientation.y*ty;
  float py = orientation.x*ty - orientation.y*tx;
  float r  = circle.radius;
  
  px = Math.abs(px);
  py = Math.abs(py);

  // modified: outside extended?
  if (px > delta.x + r) return false;
  if (py > delta.y + r) return false;

  // modified: inside un-extended?
  if (px <= delta.x) return true;
  if (py <= delta.y) return true;
  
  // NOT DONE YET
}

Now all that’s left to handle is the (litterally) corner case. So we’re going to do is to translate the corner of the rectangle to origin and test against the radius of the circle.

										
  // translate the corner to the origin
  px -= delta.x;
  py -= delta.y;
  px *= px;
  py *= py;
    
  // classic distance squared test
  return px+py <= r*r;

And we’re done! Well almost. Notice that we’re comparing px with delta.x and py with delta.y quite a bit. Then we end up subtracting it out. We can pull all of that up to drop a number of operation. So in effect, after our reflections (fold operations), we’re translating the remaining corner to the origin.

And like the final point test, let’s group together the tests by axis:

  
public boolean intersects(Circle c)
{
  float tx = c.x - center.x;
  float ty = c.y - center.y;
  float r  = c.radius;
  
  float px = Math.abs(orientation.x*tx + orientation.y*ty) - delta.x;
  if (px > r)  return false;
  
  float py = Math.abs(orientation.x*ty - orientation.y*tx) - delta.y;
  if (py > r)  return false;

  if (px <= 0) return true;
  if (py <= 0) return true;
  
  px *= px;
  py *= py;
  
  return px+py <= r*r;
}

(EDIT: Last version had a bug)

Roquen · May 11, 2012, 2:32pm

Relation between unit direction vectors and rotations

Let say we have two points A and B and want the direction from A to B, then it’s (B-A):
dx = b.x - a.x; dy = b.y - a.y;
and of course it’s much more useful normalized D = (B-A)/|B-A|:
s = 1/sqrt(dx*dx + dy*dy); dx *= s; dy *= s;
So in the tuple form the unit direction is: D = <dx, dy> = <cos(t), sin(t)>

We don’t know the actual value of ‘t’…we currently only have ‘dx’ and ‘dy’. We could find it by using atan2, but we don’t need to as ‘D’ is already in the form of a rotation (SEE above). If you don’t want to or cannot directly use complex multiplication to perform rotations, then simply plug the (dx, dy) values into your matrix.