Mithrandir's state sorting post

system · October 17, 2004, 7:45pm

It’s been several months since the post, and I’ve finally convinced myself to attempt to cobble together the beginings of a 3-D engine. I’ve also purchased Practical Java Game Programming, and have been looking at that. Both the book and Mithrandirs post have been helpful, but I still have plenty of questions about how to begin.

Anyway, it seems logical to have a Scene object containing Sector objects (which can be frustum culled), with the Sector objects containing lists of objects within those sectors. I can see the use of this being that frustum culling the Sectors would automatically cull all of the objects in that sector if done correctly. It also seems logical to have Appearance objects associated with the Sector objects so that those Apperances can be shared.
So, if I were to rig this up, I can imagine a set of calls like:


// Called from drawing loop
myScene.draw();

// Method of Scene
public void draw(GL gl)
{
   for(int i=0; i<sectorList.length; i++)
   {
       sectorList[i].draw(GL gl);
   }
}

// Method of Sector
public void draw(GL gl)
{
   for(int i=0; i<objectList.length; i++)
   {
       objectList[i].draw(GL gl);
   }
}

// Method of gameObject
public void draw(GL gl)
{
    // Use the vertex arrays and associated
    // Appearance object to render the objects
    // individual polygons
}

All of that sounds OK until I remember that there isn’t any state sorting at all. There clearly needs to be a list of some sort at the global level so that all of these things can BE state sorted, collision detected, etc.

My problems:

If I place references to the gameObjects in the global lists, how do the individual polygons making up those complex objects get sorted?
If there is a global list of objects, then I don’t need to
call draw() down the hierarchy, I could just call draw() on the global list. BUT, then how do I do the frustum culling of the objects in each sector? I could imagine having a flag in each object to set it’s drawable state, and then flipping them all on or off based on the results of the frustum cullilng, but that can’t be right.

If someone has the patience to answer this and the myriad of other questions I’m going to have, my hat is off to you

I have no doubt that an answer to these 2 questions will cause me to have many more…

Thanks,

Keith

p.s. Thanks Mithrandir for writing those lengthy post. I have them printed and saved

cylab · October 17, 2004, 9:29pm

One idea to that problem was to completely separate the scene from the rendering process. So you end up with bunch of ‘Entities’ like ‘Person’ or ‘Animal’ wich have several ‘Primitives’ connected to them. A Primitive can be a 3D-mesh, a sound origin or anything else, that generates “user experience” like e.g. a visual effect like heat haze.

Entities (Person,etc.) are placed inside a ‘World’ which is organized in a Quadtree or Octree of Sections (maybe partitioned and connected by Portals) and have their own coordinates and simple boundaries (spheres, boxes). A Entitiy can be modified by interaction, behaviour or physics threads (pathfinder; lifecycle: sleep, eat, work, eat, f*, pee, sleep; wind)

To get an “insight” into this world you need a ‘Camera’ with its own coordinates and view frustum. You want to get a scene from this Camera by calling getScene(). This call analyses its World and sorts out all Sections in the tree, that are not at least partial inside the Cameras frustum. After that it takes a look at the Entities inside the partial visible Sections and sorts out all Entities that are not at least partial in the frustum based on their boundaries (spheres, boxes as stated above).

The getScene() method returns all Primitives that are contained in the remaining Entities as a flat Array, since that is all what is needed for rendering (or soundmixing, if the Primitive is a Sound origin). In your rendering thread you can now sort this Primitives according to your needs and iterate them for rendering at last.

You can design this Primitives as descriptive datastructures with no code at all, so that you can design your renderer replaceable or you can just implement a “render”-method on it, that uses JOGL for drawing (and JOAL for mixing).

cya
cylab

system · October 17, 2004, 10:43pm

Thanks for the quick reply

I’m going to try to summarize your idea. Please correct the errant parts.

Use frustum culling to get the visible Sections
Extract all primitives from the entities which exist
inside of the visible Sections.
Sort the primitives and render.
Do this every frame.

Question: You stated that your primitives can be something complex like a mesh. If the mesh is made up
of geometry with different textures, materials, etc. how can the individual polygons be sorted in the ‘big picture’ that contains all of the polygons in the other primitives?
For instance, what if your world was made up of 100 cubes with a different texture and material on each side of each cube (each cube is identical to every other). In this case it seems you would extract all 100 cubes (if they passed the frustum cull) and then do what with them? As you can see, I’m still stuck on how to minimize state changes when you’ve got all those cubes in an array. It seems when you render each cube you would have 6 texture binds and 6 changes in material, which would of course give no advantages at all.

Thanks,

Keith

kevglass · October 18, 2004, 5:31am

You introduce something like Shape3D (see Java3D/Xith3D) which prevent the API user from specifing more than one texture per mesh (and hence makes them split it up into smaller meshes), this allows for efficient state sorting.

Kev

cylab · October 18, 2004, 6:31am

Yes, thats basically the idea: make your Primitives as primitive as possible, so that you have “scene-atoms” that could be sorted. In your example, the cube would be an Entity with 6 Primitives, one for each side.

You can define your Entity as a Composite- pattern, so you can aggregate your Entities out of smaller parts, e.g. ‘Person’ would consist of 2 instances of ‘Arm’, one ‘Torso’, two ‘Legs’, etc. wich all are made of Primitives for rendering.

kaffiene · October 18, 2004, 9:34am

[quote]You introduce something like Shape3D (see Java3D/Xith3D) which prevent the API user from specifing more than one texture per mesh (and hence makes them split it up into smaller meshes), this allows for efficient state sorting.

Kev
[/quote]
And what do you do if your object is rendered using texture splatting?

system · October 18, 2004, 11:28am

OK, the idea of ‘world-atoms’ and ‘Shape3D’ seems clear at this point.

Let’s say this happens:

Create model
Send model’s shapes to the collection, wherever it is
model’s shapes are scattered in collection due to
sorting
model needs to be removed from world
How do you find it’s shapes to remove them from the list?

cylab · October 18, 2004, 11:34am

Well, there are limits to everything. It depends how you implement your Primitives.

You could implement a sorting mechanism, which groups all interdependent Primitives, so that you can call a prepareState()-method on the first Primitive in that group before calling render() on every Primitive in this particular chunk. At the beginning of the next chunk of similar Primitives you again call the prepareState()-method on it’s first entry followed by iterative render() calls on the Primitives in that group and so on.

Sorting a scene graph for realtime rendering is one of the trickier parts in game development…

cylab · October 18, 2004, 11:44am

zahl2001:

In a simple implementation you just provide a method named ‘getPrimitives()’ in the Entity (Model). For each frame you recreate the Primitive list for rendering, so if the Model vanishes, it’s primitives are not collected by the getScene()-call. Thats it

You might want to create a more complex implementation, that minimizes the need to change the sorted primitive list, e.g. just delete unused and add new primitives. You could install some sort of central event bus to notify the different parts of your engine about such changes. Don’t try to use the EventListener scheme from Java Swing here, because you will get myriads of Objects and interdependencies.

mithrandir · October 18, 2004, 1:16pm

ooow… my ears are burning!

;D

I think most of the above posts have covered the basics. The biggest gain that you’ll get is by not changing state at all. All those primitives that may have multiple textures per primitive are going to have to be broken down. They need it for two reasons: Firstly, OpenGL does not handle more than one texture per primitive, and secondly because you need to minimise state changes and sending anything down the pipe that has not changed. If you have two sets of primitives both sharing the same textures, you’re going to need to pull those apart and send bits of each primitive down to OpenGL at the same time.

This is how the speedup occurs. If all you did was sort the higher-level primitives you’d still be getting larges amount of state change per primitive, and you’ll barely get any performance boost. The major boost comes when you strip everything back to absolutely bare minimum and sort based on those small chunks. Even resetting the same texture twice in a row with glBindTexture() can still dramatically kill performance on some video cards.

kevglass · October 18, 2004, 1:32pm

Depends on how you’re implementing your splatting now doesn’t it? If you’re using multitexturing (which aims you at more modern hardware) then you have a new state (the one in which you load multiple textures in) see (TextureUnitState in Java3D). If you’re using multiple coplanar alphaed polys (preferred for my stuff) then you have no problem.

Kev

system · October 18, 2004, 2:57pm

Since my knowledge of Computer Science is limited, I’d like to make a proposed case that I feel is going to be a giant waste of memory, or inefficient, and have someone offer a better proposal.

I realize this isn’t going along with some of the information posted above, I’m just curious to learn some reasons whay not to do certain types of operations.

Lanscape is composed of 100 acres (Sections).
Each Section contains 100 trees.
Each tree contains a certain number of Primitives, each of which conatains a flag called isDrawable.
On startup a referencee to every Primitive in the world is placed in a global collection.
The collection is sorted.
At runtime frustum culling checks each Section for
visibility. If invisible, each Primitive in that Section is set to not drawable. If visible, the opposite happens.
Visible items are rendered.

Thanks for all the input so far,

Keith

cylab · October 18, 2004, 6:25pm

I assume, that step 7 “Visible items are rendered” means, you iterate through your global collection and only draw those primitives that have “isDrawable” = true. One drawback of this approach is, that you can’t reuse primitives this way by referencing them, since it might be drawable in one context and not drawable in another.

Also you have to iterate over all your primitives in your world every frame, at least to call “isDrawable()”. If so, you have to set all invisible primitives to “isDrawable” = false. Only setting the primitives in the visible section does not suffice, despite you are just drawing the visible section alone, but than what is the use of the global collection at all?

Since you are analysing your world to gather visibility information anyway, you could just add the visible primitive to your rendering collection and sort this every frame. This way you are also not bound to static environments, since you can add, move or modify entities in your world as time goes by. Some of those changes (e.g. changing the texture of a primitive or adding an effect to your scene/world) affect the rendering order, since they add or alter state changes.

For reducing the cost of visibility analysis, I would recommend using quad or octrees. An introduction to octrees and their applications can be found here: http://www.gametutorials.com/Tutorials/OpenGL/Octree.htm

cya
cylab

system · October 18, 2004, 7:11pm

Ok, I think I have enough info to get started except for one thing: what type of collection to hold all of the Primitive objects. A previous suggestion was to use a wrapper class that implemented Comparable, but at school we have never done such a thing, so I am unsure on how to proceed. I looked at Comparable, and none of the implementing classes seem to support objects. I was hoping to see something like ArrayList, or LinkedList. I realize that a class (or subclass of something else) is going to have to be created, and then compareTo() defined, but that’s about it.

Thanks,

Keith

p.s. After this I’ll give you a rest for a while

Orangy_Tang · October 18, 2004, 7:53pm

Just use whatever collection is suitable (Set, ArrayList, whatever) and use Collections.sort(). Then either write a Comparator class to sort your rendering objects or build it in to the primatives themselves.

cylab · October 18, 2004, 8:13pm

Take a look at the “Comparing Objects” section (http://java.sun.com/developer/technicalArticles/Collections/Using/#co) of the “Using Collections” article at sun.com (http://java.sun.com/developer/technicalArticles/Collections/Using/)

system · October 18, 2004, 8:19pm

Ok, thanks for the suggestions. I’ll see if I can 't get this figured out soon.

Keith

system · October 20, 2004, 11:18am

I was doodling some ideas on paper yesterday, and I’m getting a scenario with very little vertex data sharing due to the fact that the primitives are small and encapsulated. Do you geniuses have some arcane method of getting around this? Or maybe I don’t want to know…

Seriously, though, if it’s normal for there to be little vertex data sharing, I suppose that’s the lesser of the evils? (The other being a ton of state changes).

I’m looking at some tutorials for building software renderers, and the basic unit of geometry for each of those I’m reading is a single polygon containing information about it’s texture, location in 3D space, etc. The state-sorting approach in JOGL seems to be heading in that direction if I understand what’s been said so far. Although a Polygon3D class would be very easy for me to understand and deal with, I suppose it would be a giant memory waste with no data sharing.

Well, if anyone has any thoughts they want to share on these topics I would be interested to hear your opinions.

Keith

mithrandir · October 20, 2004, 12:15pm

Lots of little primitives is fine. Makes it easy to visibility cull them. In addition, if you set up your scene graph correctly, you can share the individual primitive through mulitple branches of the scene graph, thus saving a lot of memory usage. This is particularly relevant if you’re doing outdoor scenes where you have lots of the same object (eg trees, benchs etc).

Software renderers do take a slightly different tack to those that are hardware based. Or at least they used to. These days, the same tactics that make hardware renderers fast will also make software renderers fast. Due to the huge speed differentials between the various memory levels from CPU to main memory, the more that you can make use of cache coherency, the better. Keeping small tight loops, reusing as much data as possible, not rendering anything that you don’t see. All of these approaches are the same both for software and hardware. Some of the costs for state changing may be be different between the two, for example texture contexts, but in the end, it’s not going to make that much difference. If you’re reading a book on software rendering, make sure it’s not much older than a year or two. Anything older than that just doesn’t have concepts that can be applied to the modern computer architecture setup.

A classic example of this that we found yesterday - a two line change in code gave us a 50% better framerate on heavily transparent worlds. We’d taken an older-style approach to dealing with the transparent geometry, but we found that by applying a different technique that was already in another piece of code, gave us a massive performance boost. Our code was correct according to anything you read from the last couple of years, but this simple change made far, far better use of the internal CPU and driver resources than the “correct” way of doing it.

princec · October 20, 2004, 12:29pm

That sounds like a neat optimisation… care to go into details in another post?

Cas