Awesome SAT! 7-8 ms for 60k objects.

So I recently implemented SAT (Seperating axis theorem), which is an algorithm for finding out whether objects collide. These objects can be rotated, scaled, can have as much vertices as they want, etc.

The algorithm is mainly this: “Two convex polygons don’t intersect, if you are able to draw a straight line in between them.”

After about ~10 hours of blindly writing vector math and collision stuff code from online tutorials and own existing code, I got my version running pretty well the first time.

I corrected some lapses, for example not re-calculating the normals after updating the matrix of them or calculating the normals wrong (forgetting to create the left perpendicular of them) and things like that…

So in this screenshot I’ve got 60k polygons, each having 8 vertices, which are randomly generated. (all convex)

Usually the background is black…
Runs at a somehow steady 30 fps, due to the rendering… the collision tests only take 14-15 milliseconds.
I’ve got a GTX 550 Ti, 8GB and 3.3 GHz 8-core Xenon cpu running here.
The collision detection is single-threaded.

So here the same program is with only 60 polygons:

As you can see the background now is black ( ;D ;D ) and the vertices from the polygons are drawn white, if they’re not colliding with the “mouse-quad” and black, if they do.

The program only tests the “mouse-quad” (a 4-vertex polygon, which you can rotate with the mousewheel and move with the mouse) against all avaliable polygons, so the polygons don’t test collisions with each other, only with the mouse-quad.

I’ve got a pretty good pc, I want to see what performance you get :slight_smile:

[url=]Download HERE[/url] I want to see what performance you get :) Remember to start via command line to see additional information!

Added download link, see topic :slight_smile:

How many objects can your pc handle? :wink:

I further improved this version with using a circle vs. circle pre-collision test instead of a AABB vs AABB collision test.

This now works almost twice as fast!
It takes 7-8 milliseconds to test a quad against 60k 8-sided polygons. This means the demo is actually able to run with 60 frames per second with 60k polygons on my pc.

I’m satisfied now :slight_smile:

But I still want to know how far you can get…


I tested it, my CPU is an intel i7 2630QM.
I can have 40000 polygons taking 16ms … your PC is really fast :smiley:

Wow, at least there are some people trying it :slight_smile:

Well, I just built the new version (the link was still the old one…).
Here is ne new version based on Circles as secondary boundaries:

EDIT: Oh wait… that’s a totally other version… fixing this now…

Okey, now the right version is uploaded :slight_smile:

It should be up to 2 times faster for you know, deathpat :slight_smile:
(which means 40k would run with about 9 ms per calculation)

ok just tried it … and it runs much slower than before :slight_smile:
It takes now 32ms with 40000 polygons … but now all the polygons are rotating + the bounding circle is drawn … maybe it is just the display that is slow ? (as it is drawing much more things )

Now to find a way to implement this in a game… ???

Seriously, what do you hope to do with this.

A Space Shooter maybe?

Link is dead :frowning:

You should edit the first post’s link to fix it.

Any way we could get the source? or at least a way to implement it in our own?

I want to test it against my un-optimized version I mucked around with over a year ago :slight_smile:

I got 200K at 33ms

and 1 million at 170ms :stuck_out_tongue:

I tried 100 million and after a good 10 minutes I got a heap size error :frowning:

1 million at 97ms

Around 125k polygons cost 16ms. I’m on an i7 860 @ 3.52 GHz.

Ok yay it worked!

I think I win this thread: 175K polygons cost 16ms. Intel Core i7 2600K @ 4.2GHz

I am getting the same 175k at 16ms on a Core i7 2600k running at 3.4Ghz (Did you overclock yours?)

Which means, that its clearly not the cpu that having the limiting factor anymore?

What videocard do you have? I am running a NVidia Geforce GTX 570

matheus23, the time we are seeing in the console, is that between each frame(including rendering and collision) or is that only collision?

Yup overclocked and I’m using a beastly GTX 580 ;D

It’s a challenge, time to turn my water pump speed up and get 4.5ghz
Ram is probably a big factor here though. Guess I shouldn’t bother. :frowning:

50k 15 ms not bad :slight_smile:
(core 2 duo)
ps: use multy thread, 99% (here) have more then 1 cpu core :wink:
give one half poly check loop - to first thread and second half - to second thread =) (or more threads)

Exactly how does the SAT function work? If it doesn’t use any trigonometry, square roots or the likes, it’s most likely RAM limited. Multithreading might help up to a point, especially on hyperthreaded computers. Hyperthreading will allow it to quickly switch to another thread if a cache miss occurs (as far as I know).

(Shameless pug:;

Wooooot? This thread somehow was just born :smiley:
Okey… propably due to the time differences :wink:


Let’s define version numbers: 1 was my first version you tested before, 2 the one you just tested, and 3 the one everyone else after that tested.

Yes, 2 was drawing muuuuch more vertices. But that’s not the problem. Let’s quickly put in another quote here:

It’s only the time it takes for this code:

		long time = (Sys.getTime() * 1_000) / Sys.getTimerResolution();
		colliding = false;
		for (int i = 0; i < polys.length; i++) {
			if (Collision.polyVsPoly(quad, polys[i])) {
				collisions[i] = true;
				colliding = true;
		System.out.println("Time taken: " + (((Sys.getTime() * 1_000) / Sys.getTimerResolution())-time) + ", axis tested: " + Collision.axesTested + ", objects tested: " + Collision.objectsTested);

Which means, yeah it’s not including rendering time, which is huge here (almost the same time, but slightly less I think…)

Sooo, to come back to quote 1:
In that version I hard-tested the “engine”. I rotated each polygon (really each one). And each polygon has 2 sets of vertices: The standard model vertices, and the pre-calculated transformed vertices. Every polygon also references a 4x4 Matrix.
And I was updating each matrix each frame and transforming each vertex of each polygon by it’s matrix. And that time was included in the console output…
So this is what made it much slower…

So everyone else now tested the 3rd version:

Whaaaa! Multithreading!
Yeah. I’ve already worked with mutithreading (I had a game running simulation and rendering in two different thread one time… I even think it was my first game? Yes it was… intresting… never did it again ;D )

Yeah, even I do have more than 1 core… I’ve got 8 logical and 4 physical cores, and yeah I’ve got hyperthreading, and yeah It’d speed everything up… And finally yeah, it would be very nerdy… But intresting…

I don’t know how you’d go about doing this… I mean, what exactly would you perform twice or more at the same time?

Splitting up the polygons to check into 2 different threads:
Meeeh… this could give very strange behavior if you want to do something with those collisions, since reactions would be run on 2 different threads (or even more…)… Also, in a real game example each polygon has to calculate collisions with each other polygon, which would give much much harder thread-safety problems…

And I can’t think of another way…

How the SAT works:
first check circle vs. circle (not using sqrt().)…
and then: for each axis of the 2 to-check polygons: (axis = normal)
calculate dot product of each vertex and the normal,

These are all calculations done…
So yeah, trigonometry is calculated, but it’s only the dot product, so only + and *…
And finally:
About your threading library: I’ve already seen it once… I didn’t take a look back then, but how exactly does it work, theagentBigDecimal (kidding ;D )?

Anyways, now thank you all for your intrest, I really appreciate this :slight_smile: