Sprites!

Well I looked into it, and apparently it does via an OpenGL ES extension. So if I had the time to drop in some optimizations the game could really benefit from a few. I’ve used FBOs in a couple places, but no VBOs.

The current state of play with the server VM on the 1.6GHz Turion64x2 and Nvidia 6150 Go:


     Compiled + native   Method                        
 19.0%  8234  +     0    com.shavenpuppy.jglib.sprites.DefaultSpriteRenderer.writeSpriteToBufferF
  9.4%  4089  +     0    worm.QuadTree.doCheckCollisions2
  4.2%  1819  +     0    worm.QuadTree.checkCollisions
  4.0%  1750  +     0    com.shavenpuppy.jglib.algorithms.RadixSort.sort
  3.4%  1477  +     0    worm.features.LayersFeature.updateColors
  3.0%  1316  +     0    worm.MapRenderer$RenderedTile.setTiles
  2.9%  1267  +     0    com.shavenpuppy.jglib.sprites.DefaultSpriteRenderer.sort
  2.8%  1216  +     0    com.shavenpuppy.jglib.sprites.DefaultSpriteRenderer.build
  2.5%  1084  +     0    net.puppygames.applet.Screen$1.tick
  2.1%   899  +     0    com.shavenpuppy.jglib.sprites.StaticSpriteEngine$1.processRendering
  1.8%   797  +     0    net.puppygames.applet.Screen.tick
  1.8%   787  +     8    worm.path.AStar.goalNotFound
  1.1%   488  +     0    worm.entities.GidrahGameMapTopology.getNeighbours
  1.0%   433  +     0    worm.MapRenderer.render
  1.0%   422  +     0    worm.entities.Building.getCurrentAppearance
  0.9%   385  +     0    worm.MapRenderer$RenderedTile.setLocation
  0.7%   287  +     0    worm.MapRenderer.postRender
  0.6%   258  +     1    worm.WormGameState.checkMouse
  0.6%   249  +     0    worm.QuadTree.doCheckCollisions
  0.6%   246  +     0    worm.path.AStar.nextStep
  0.5%   222  +     0    worm.Entity.isTouching
  0.5%   212  +     2    worm.QuadTree.checkLeafCollisions
  0.4%   195  +     0    worm.WormGameState.tickEntities
  0.3%   149  +     0    worm.Entity.update
  0.3%   143  +     0    net.puppygames.applet.effects.Particle.tick
 69.8% 30225  +    44    Total compiled (including elided)

         Stub + native   Method                        
  6.6%     0  +  2860    org.lwjgl.opengl.ARBBufferObject.nglMapBufferARB
  3.8%     0  +  1669    org.lwjgl.opengl.GL11.nglColor4ub
  3.7%     0  +  1610    org.lwjgl.opengl.WindowsContextImplementation.nSwapBuffers
  1.7%     0  +   746    org.lwjgl.opengl.ARBBufferObject.nglUnmapBufferARB
  1.1%     0  +   477    org.lwjgl.opengl.GL11.nglTexCoord2f
  1.1%     0  +   459    org.lwjgl.opengl.GL11.nglVertex2i
  0.8%     0  +   328    org.lwjgl.openal.AL10.nalGetSourcei
  0.7%     0  +   297    org.lwjgl.WindowsSysImplementation.nGetTime
  0.7%    52  +   245    net.java.games.input.IDirectInputDevice.nGetDeviceData
  0.7%     0  +   286    org.lwjgl.opengl.ARBBufferObject.nglBindBufferARB
  0.5%     0  +   221    net.java.games.input.IDirectInputDevice.nGetDeviceState
  0.5%     8  +   190    org.lwjgl.opengl.WindowsDisplay.nUpdate
  0.4%     0  +   182    org.lwjgl.opengl.GL11.nglEnable
  0.4%     0  +   155    org.lwjgl.openal.AL10.nalSourcef
  0.3%     0  +   124    java.lang.System.arraycopy
  0.3%     0  +   112    org.lwjgl.opengl.GL11.nglBindTexture
  0.2%     0  +    83    org.lwjgl.opengl.GL11.nglTexEnvi
  0.2%     0  +    82    org.lwjgl.opengl.GL11.nglDrawArrays
  0.2%     0  +    78    org.lwjgl.opengl.GL11.nglEnableClientState
  0.2%     0  +    74    java.lang.Object.hashCode
  0.1%     0  +    60    org.lwjgl.opengl.GL11.nglDisableClientState
  0.1%     0  +    58    org.lwjgl.opengl.GL11.nglDisable
  0.1%     0  +    42    java.lang.Thread.sleep
  0.1%     0  +    40    org.lwjgl.opengl.GL11.nglBlendFunc
  0.1%     0  +    40    net.java.games.input.IDirectInputDevice.nPoll
 25.4%    60  + 10950    Total stub (including elided)

That’s after 563.75 secs (49696 total ticks). Frame rate dropped to about 45fps with over 2,300 sprites and well over 120 gidrahs. Still not good enough!

A fair amount of time is wasted in immediate mode rendering (the glColor4ub, glTexCoord2f and glVertex2i calls above), which I will gradually replace with triple-buffered VBO code.
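To make the triple-buffering idea concrete, here is a minimal sketch, assuming it means cycling through three VBOs so that the buffer being written this frame is not one the GPU may still be reading. `VboRing` and its methods are hypothetical names, and the handles are plain ints standing in for whatever buffer ids the GL calls return.

```java
// Hypothetical helper, not part of LWJGL or the game: rotates through buffer
// handles so each frame writes to a different buffer than the previous frames.
final class VboRing {
    private final int[] handles;
    private int index = -1;

    VboRing(int... handles) {
        if (handles.length == 0) {
            throw new IllegalArgumentException("need at least one buffer");
        }
        this.handles = handles.clone();
    }

    // Call once per frame: returns the handle to bind, map and fill this frame.
    // The actual bind/map/unmap calls are left to the renderer.
    int next() {
        index = (index + 1) % handles.length;
        return handles[index];
    }
}
```

With three handles the renderer would call `next()` at the start of each frame and only come back to the same buffer two frames later.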

Collision detection seems to be taking too long. I may try switching to cell-based collisions instead of the quadtree, though that may not have much effect. The real problem here is that so many gidrahs are packed tightly together attacking some walls in this test, so perhaps I should implement a bit of behaviour to keep them apart.
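For reference, a cell-based broad phase could look roughly like the sketch below. It is an illustration only, not the game's code: `CellGrid`, the circular entities and the integer coordinates are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical uniform-grid broad phase; an illustration, not the game's QuadTree.
final class CellGrid {
    private final int cellSize;
    // Cell key -> entities overlapping that cell, each stored as {id, x, y, radius}.
    private final Map<Long, List<int[]>> cells = new HashMap<>();

    CellGrid(int cellSize) {
        this.cellSize = cellSize;
    }

    // Packs two ints into one long so a pair can be used as a map or set key.
    private static long key(int a, int b) {
        return ((long) a << 32) | (b & 0xFFFFFFFFL);
    }

    // Adds a circular entity to every cell that its bounding box overlaps.
    void insert(int id, int x, int y, int radius) {
        int[] entity = {id, x, y, radius};
        int minX = Math.floorDiv(x - radius, cellSize);
        int maxX = Math.floorDiv(x + radius, cellSize);
        int minY = Math.floorDiv(y - radius, cellSize);
        int maxY = Math.floorDiv(y + radius, cellSize);
        for (int cx = minX; cx <= maxX; cx++) {
            for (int cy = minY; cy <= maxY; cy++) {
                cells.computeIfAbsent(key(cx, cy), k -> new ArrayList<>()).add(entity);
            }
        }
    }

    // Returns each touching pair once, as {lowerId, higherId}.
    List<int[]> touchingPairs() {
        List<int[]> pairs = new ArrayList<>();
        Set<Long> tested = new HashSet<>();
        for (List<int[]> cell : cells.values()) {
            for (int i = 0; i < cell.size(); i++) {
                for (int j = i + 1; j < cell.size(); j++) {
                    int[] a = cell.get(i);
                    int[] b = cell.get(j);
                    int lo = Math.min(a[0], b[0]);
                    int hi = Math.max(a[0], b[0]);
                    // Entities sharing several cells must only be tested once.
                    if (!tested.add(key(lo, hi))) {
                        continue;
                    }
                    long dx = a[1] - b[1];
                    long dy = a[2] - b[2];
                    long reach = a[3] + b[3];
                    if (dx * dx + dy * dy <= reach * reach) {
                        pairs.add(new int[] {lo, hi});
                    }
                }
            }
        }
        return pairs;
    }
}
```

Note that every pair inside a cell is still tested, so a crowd packed into a few cells stays quadratic there, which is the same crowding problem described above.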

As we can see, though, by far the biggest single cost is still writeSpriteToBuffer, which honestly should be far faster than Java allows it to be.

Cas :slight_smile:

hi

I am using immediate mode for particle sprite rendering. I got a huge speedup by avoiding many glBegin/glEnd calls.
So changing this


for all particles p:
  glBegin(GL_QUADS)
  glVertex(...)
  glEnd
endfor

to


glBegin(GL_QUADS)
for all particles p:
  glVertex(...)
endfor
glEnd

made a huge difference.
Now rendering 30,000 particles is no problem anymore.

Only a loony would do it the first way though :slight_smile:

Cas :slight_smile:

At first I wanted to do it the OOP style with particle.draw. That ended up as the first version, with very poor performance. Now I am happy with fast immediate mode.

For better performance, try vertex arrays and vertex buffer objects.

Of course, it won’t matter if the bottleneck is fill rate, but it’s worth a try.
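A rough sketch of the vertex array route for the particle case: pack every quad into one direct FloatBuffer, then hand the whole thing to GL in a single draw. `ParticleBatch` and its methods are made-up names for illustration, and the GL calls are only indicated in a comment since they need a live context.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical batcher: packs one untextured quad per particle into a direct buffer.
final class ParticleBatch {
    static final int FLOATS_PER_QUAD = 8; // 4 corners * (x, y)
    private final FloatBuffer vertices;

    ParticleBatch(int maxParticles) {
        // Direct, native-order memory is what GL bindings expect for vertex data.
        vertices = ByteBuffer.allocateDirect(maxParticles * FLOATS_PER_QUAD * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // xs/ys are particle centres, half is half the quad size.
    // Returns the number of vertices written; throws BufferOverflowException
    // if more particles are passed than the batch was sized for.
    int fill(float[] xs, float[] ys, float half) {
        vertices.clear();
        for (int i = 0; i < xs.length; i++) {
            float x0 = xs[i] - half, x1 = xs[i] + half;
            float y0 = ys[i] - half, y1 = ys[i] + half;
            vertices.put(x0).put(y0).put(x1).put(y0).put(x1).put(y1).put(x0).put(y1);
        }
        vertices.flip();
        return vertices.remaining() / 2;
    }

    FloatBuffer buffer() {
        return vertices;
    }
}
// With a GL context, the buffer would then be drawn once per frame, e.g. in LWJGL:
//   GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
//   GL11.glVertexPointer(2, 0, batch.buffer());
//   GL11.glDrawArrays(GL11.GL_QUADS, 0, vertexCount);
```

This replaces thousands of glVertex calls with one buffer fill and one draw call; texture coordinates and colours would need their own arrays or an interleaved layout.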