The XAP development diary

Chaz has done a Bubble! It looks about, ooh, 100x better than my bubble. I have written a packing routine which takes a directory of PNGs and squeezes them all into a legal OpenGL texture - that is, one with side lengths that are powers of 2. It trims off any blank space in each PNG first and then uses a recursive subdivision algorithm to try and fit them in as best it can. It also writes out an XML file containing details of where it stuck all the sprites and their new hotspots as well. When CVS at sourceforge is working I’ll upload the new tool to SPGL.
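A rough sketch of the recursive subdivision idea, for anyone curious. This is not the SPGL tool: the class and method names are made up for illustration, it assumes a square texture (OpenGL only needs each side to be a power of 2), and it leaves out the trimming and the XML output. Each sprite goes in the top-left corner of the first free rectangle that can hold it, and the leftover space is split into a strip to the right and a strip below.

```java
/** Hypothetical sketch of recursive-subdivision packing; not the actual SPGL tool. */
final class SpritePackerSketch {

    /** Smallest power of two that is >= n, for n >= 1. */
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    /** A rectangle of texture space: either free, or used and split in two. */
    static final class Node {
        final int x, y, w, h;
        boolean used;
        Node right, below;

        Node(int x, int y, int w, int h) {
            this.x = x;
            this.y = y;
            this.w = w;
            this.h = h;
        }

        /** Places an sw-by-sh sprite; returns the node whose x,y it got, or null if no room. */
        Node insert(int sw, int sh) {
            if (used) {
                // Already occupied: recurse into the two leftover rectangles.
                Node n = right.insert(sw, sh);
                return n != null ? n : below.insert(sw, sh);
            }
            if (sw > w || sh > h) {
                return null;
            }
            used = true;
            right = new Node(x + sw, y, w - sw, sh);  // strip beside the sprite
            below = new Node(x, y + sh, w, h - sh);   // full-width strip underneath
            return this;
        }
    }

    /** Packs sizes[i] = {width, height}. Returns {x, y} per sprite and the side in sideOut[0]. */
    static int[][] pack(int[][] sizes, int[] sideOut) {
        int side = 1;
        for (int[] s : sizes) {
            side = Math.max(side, nextPowerOfTwo(Math.max(s[0], s[1])));
        }
        while (true) {
            Node root = new Node(0, 0, side, side);
            int[][] pos = new int[sizes.length][];
            boolean fits = true;
            for (int i = 0; i < sizes.length && fits; i++) {
                Node n = root.insert(sizes[i][0], sizes[i][1]);
                if (n == null) {
                    fits = false;
                } else {
                    pos[i] = new int[] {n.x, n.y};
                }
            }
            if (fits) {
                sideOut[0] = side;
                return pos;
            }
            side <<= 1; // did not fit: try the next power of two
        }
    }
}
```

Sorting the sprites largest-first before packing usually wastes less space, but that is left out to keep the sketch short.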

Created a new particle feature today - the Explosion particle. Very similar to a SimpleParticle except every few ticks an Explosion particle spawns some smoke behind it - the smoke itself is just a plain old SimpleParticle like I was using before.

The explosions now occur in X, Y, and Z - as in fact do all sprites as I’ve updated the SPGL sprite engine to work in 3D now!

The 3D effect is quite noticeable.

Chaz has also done me a player ship! I’m just waiting for that to turn up in a .zip sometime this evening.

I am using Remote Administrator to control Chaz’s computer now and again (don’t worry - he knows :wink: ). It’s a godsend for getting non-developer types up and running with Eclipse and CVS. He can now get the latest code straight out of my CVS repository. Remote Administrator is by far and away the best remote admin tool out there; I even paid money for it, so it must be good!

Can’t resist a couple of screenies:

http://www.shavenpuppy.com/screenshots/bubbles.jpg

At last, lovely bubbles!

http://www.shavenpuppy.com/screenshots/jellies.jpg

A Jelly Incursion! It incurred right on top of me so I died though.

Cas :slight_smile:

Chaz did the player ship and committed it via CVS. I plugged it in and suddenly XAP no longer looks entirely shit!

The player ship consists of 64 different rotations. It constantly faces towards the mouse pointer. I have a funny feeling there’s a lot of newbies who would like to know how to do this, so here’s the code that does it:


// target is a Vector2f which has the mouse position in world coordinates in it
// (not screen coordinates). position is a Vector2f with the player's world
// coordinates in it. We subtract position from target to get the 0,0 based
// direction vector.
Vector2f.sub(target, position, scratchVector);

// Then we find out the angle in radians of this vector using a built-in Java
// function:
double angle = Math.atan2(scratchVector.y, scratchVector.x);

// The angle needs to be converted to our range of 0..63 now (assuming there's
// 64 rotations possible):
int spriteAngle = (int)(rotation.length * angle / (2.0 * Math.PI));

// Sometimes spriteAngle is negative, so to ensure it's in the range 0..63 we
// add 64 to it, plus a small offset to match the angle versus the angle of my
// sprites (they happen to be 90 degrees out as they're drawn) and then use the
// modulus % operator to get us back to 0..63:
spriteAngle = (spriteAngle + rotation.length + 16) % rotation.length;

// And that's the sprite image we want:
sprite.setImage(rotation[spriteAngle]);
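
For anyone who wants to try the same maths without the sprite engine, here is a self-contained variation. It is a sketch rather than the code above: the class and method names are invented, it rounds to the nearest frame instead of truncating, and it uses Math.floorMod (available on current JDKs) to wrap negative values instead of adding the frame count first.

```java
/** Hypothetical standalone version of the "face the mouse" calculation. */
final class ShipRotationSketch {

    /**
     * Maps a direction vector to a frame index in 0..frames-1.
     * offset is the index of the frame that was drawn pointing along +x
     * (16 for 64 frames that are 90 degrees out, as in the post).
     */
    static int frameFor(double dx, double dy, int frames, int offset) {
        double angle = Math.atan2(dy, dx); // -PI..PI
        // Nearest frame step; may be negative for downward-pointing vectors.
        int step = (int) Math.round(frames * angle / (2.0 * Math.PI));
        // floorMod always returns a value in 0..frames-1, even for negative input.
        return Math.floorMod(step + offset, frames);
    }
}
```

With 64 frames and an offset of 16, a target directly along +x picks frame 16 and one directly along -x picks frame 48.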

I put in a little smoke trail from the ship’s thrusters which looks nice. It needs a pair of glowing jets as well which I’ll figure out tomorrow some time. I suppose it’ll need a little quiet “whooshy” noise for when the jets are firing too.

Here’s a pic of the ship:

http://www.shavenpuppy.com/screenshots/ship.jpg

Did some adjustment of some of the sound samples today; the gidrahs’ bullet is less annoying, and the gidrahs’ explosion now echoes, just like the original Defender did. Lovely.

Discovered a garbage collection glitch today. After running for a few minutes, the game starts to stutter and jerk. I was rather perplexed by this at first but then I realised that the humble fps counter was creating about half a megabyte of garbage a second. You have been warned! I have cured it by using -Xincgc and -Xconcgc but the irritating thing is these aren’t turned on by default. It’s not such a huge issue really because a) it won’t be in the released version and b) I’m deploying it as an .EXE anyway so I won’t have -Xincgc and -Xconcgc - just whatever Jet 3.0 has in it.

Tomorrow I’ll be fixing glitches and bugs before I go any further.

Cas :slight_smile:

Worked for the Man instead all day :frowning:

Chaz sent me a new player ship and new bubble. The player ship is now ever so slightly larger. The bubble now has a subtle oily film to it - looks nice. Wobbles a bit wrongly though. Can’t put my finger on it.

Tried the game in 32 bit - much subtler shading - nice. Can’t really rely on 32 bit though because the performance will start to suffer once there’s a lot going on.

Cas :slight_smile:

Discovered a bug in my sprite packer that screwed up sprite hotspots. This caused the bubble to reel around the screen a bit drunkenly.

Tweaked the smoke so it looked more even - much better. The player now has 3,000 smoke particles allocated for death! The gidrahs by comparison only get 2,048 between them.

I noticed a little bit of a slowdown when twatting the gidrahs with a smartbomb. The sheer number of particles generated seems to drop the framerate somewhat - to about 45fps or so. It’s barely noticeable because the smoke dies in 2/3rds of a second but it was still enough to make me go in search of tweaks.

First up, the SPGL sprite renderer now uses NV_vertex_array_range if it can find it. Anyone know what the ATI/Matrox/S3 equivalents are? That seems to have sped it up a teeny fraction.

Secondly I tried running it with the server VM again, and it’s quite a bit smoother when a smartbomb goes off. Later I shall compile with Jet and see how that fares, as it’s the crucial decider.

I think it’s fill-rate bound now rather than vertex bound which bodes ill for sticking a nicer background in the game. 5,000+ alpha blended particles is quite a lot for a card to handle. I wonder how I might scale it down for older cards? I’m still hoping that it’ll all run smoothly on a TNT. I might be cunning and write a frame rate analysing thingy. This thingy will watch for frame rate drops and start to lower the ceiling on particle creation accordingly. Particles are just eye candy after all, and if they get in the way of the most important bit of eye candy, which is graphics that are smooth as a baby’s bum, then they’ve got to go.

Chaz can’t do any work today on the game but apparently he’s got all Wednesday to do sprites. Hurrah! I’ll set him to work on the Tringles, Jellies, Gunner and Blob, which will be the complete complement of sprites for the alpha tests to come.

Speaking of which, I’ll be needing some alpha testers soon - preferably with diverse hardware. I’ve got the following hardware to hand and I need some volunteers with freaky hardware that’s different to what I’ve got - particularly, I’m interested in older processors and graphics cards.

Dual P3/700 with Geforce 1, Win2k Server (edit: realised it’s GF1 these days…)
P3-M 1.2 with Geforce 2 Go, WinXP Pro

Mail to cprince@shavenpuppy.com to volunteer please.

Cas :slight_smile:

Have a look at this:


     Compiled + native   Method                        
 21.7%  3233  +     0    java.util.Arrays.mergeSort
 10.1%  1507  +     0    com.shavenpuppy.jglib.sprites.SpriteRenderer.writeSpriteToBuffer
  4.6%   681  +     0    xap.particles.SimpleParticleFeature$SimpleParticleInstance.doTick
  3.7%   546  +     0    com.shavenpuppy.jglib.sprites.SpriteEngine.render
  3.2%   484  +     0    xap.ColorSequence.getColor
  3.2%   475  +     0    xap.features.ParticleFeature.tick
  2.7%   395  +     0    xap.GamePanel.doAnimationTick
  2.0%   302  +     0    java.nio.DirectByteBuffer.put
  1.3%   197  +     1    com.shavenpuppy.jglib.sprites.Sprite.tick
  1.1%   169  +     0    xap.particles.ExplosionParticleFeature$SimpleParticleInstance.doTick
  1.1%   165  +     0    com.shavenpuppy.jglib.sprites.SpriteRenderer.build
  0.6%    82  +     0    java.nio.Buffer.position
  0.4%    57  +     0    vtable chunks
  0.3%    52  +     0    com.shavenpuppy.jglib.sprites.SpriteEngine.tick
  0.3%    45  +     0    java.util.Arrays.binarySearch
  0.3%    38  +     0    com.shavenpuppy.jglib.sprites.SpriteRenderer.postRender
  0.2%    24  +     0    adapters
  0.1%    18  +     0    com.shavenpuppy.jglib.sprites.GlowingStyle.setupState
  0.1%    18  +     0    com.shavenpuppy.jglib.sprites.SpriteEngine.allocate
  0.1%    17  +     0    com.shavenpuppy.jglib.sprites.RandomGotoCommand.execute
  0.1%    17  +     0    xap.features.ParticleFeature$ParticleInstance.deallocate
  0.1%    13  +     0    xap.features.ParticleFeature$ParticleInstance.spawn
  0.1%    13  +     0    com.shavenpuppy.jglib.sprites.GlowingStyle.resetState
  0.1%    13  +     0    java.util.Arrays.swap
  0.1%    11  +     0    com.shavenpuppy.jglib.sprites.TransparentStyle.setupState
 58.1%  8655  +     3    Total compiled (including elided)

         Stub + native   Method                        
  8.7%     8  +  1283    org.lwjgl.opengl.CoreGL.drawArrays
  1.8%    35  +   240    org.lwjgl.opengl.CoreGL.texEnvi
  1.7%     0  +   260    org.lwjgl.opengl.CoreGL.callList
  1.6%    21  +   221    org.lwjgl.opengl.CoreGL.bindTexture
  1.6%    21  +   214    org.lwjgl.opengl.CoreGL.enable
  1.6%    31  +   203    org.lwjgl.opengl.CoreGL.disable
  1.6%    40  +   193    org.lwjgl.opengl.CoreGL.enableClientState
  1.4%    28  +   187    org.lwjgl.opengl.CoreGL.disableClientState
  0.9%    22  +   114    org.lwjgl.opengl.CoreGL.blendFunc
  0.3%     0  +    38    org.lwjgl.opengl.CoreGL.pushMatrix
  0.0%     3  +     4    org.lwjgl.opengl.CoreGL.popMatrix
  0.0%     0  +     7    org.lwjgl.opengl.CoreGL.translatef
  0.0%     0  +     5    org.lwjgl.opengl.CoreGL.lineWidth
  0.0%     1  +     2    org.lwjgl.opengl.CoreGL.color4ub
  0.0%     1  +     1    org.lwjgl.opengl.CoreGL.vertex2d
  0.0%     0  +     1    org.lwjgl.opengl.CoreGL.scissor
  0.0%     1  +     0    org.lwjgl.opengl.CoreGL.texCoord2f
  0.0%     0  +     1    org.lwjgl.opengl.CoreGL.matrixMode
 21.4%   212  +  2974    Total stub

Using the server VM (which approximates the performance of Jet), I’m only getting a mere 40-odd fps when I generate hundreds of particles - and this is on a 1.2GHz P3. To get 60fps, my frames have to take no longer than 16ms or so. Currently they’re taking 25-30ms when there are a lot of sprites on the screen. At first I thought it was fill-rate but a closer look at the native side of things shows me that in fact the limiting factor is my own compiled code at 58% CPU (15ms or so).

This is bad.

As usual the main culprit is in a library that someone else has written. The mergeSort is probably a perfectly fine routine, and here in fact we can deduce that the figure of 21.7% given includes the inlined comparator that I use to compare the sprites with. First off - can I replace it with a radix sort? Will that be any faster? Can I do something cunning so that I don’t have to sort my sprites in the first place?
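
To show what a radix sort would look like here, a minimal sketch follows. It assumes each sprite can be boiled down to a single non-negative int sort key (layer, texture or whatever the comparator currently looks at); that key and all the names are hypothetical. It returns the order to draw the sprites in rather than moving the sprites themselves, and it needs no comparator calls at all, just four counting passes over the keys.

```java
import java.util.Arrays;

/** Hypothetical sketch: stable LSD radix sort over non-negative int sort keys. */
final class RadixSortSketch {

    /** Returns sprite indices ordered by ascending key; equal keys keep their original order. */
    static int[] sortedOrder(int[] keys) {
        int n = keys.length;
        int[] src = new int[n];
        int[] dst = new int[n];
        for (int i = 0; i < n; i++) {
            src[i] = i;
        }
        int[] count = new int[257];
        // Four passes of 8 bits each, least significant byte first.
        for (int shift = 0; shift < 32; shift += 8) {
            Arrays.fill(count, 0);
            for (int idx : src) {
                count[((keys[idx] >>> shift) & 0xFF) + 1]++;
            }
            // Prefix sum: count[b] becomes the first output slot for byte value b.
            for (int b = 0; b < 256; b++) {
                count[b + 1] += count[b];
            }
            for (int idx : src) {
                dst[count[(keys[idx] >>> shift) & 0xFF]++] = idx;
            }
            int[] t = src;
            src = dst;
            dst = t;
        }
        return src;
    }
}
```

The cost is linear in the number of sprites, which is where the hoped-for win over a comparison sort would come from. Whether it is actually faster for a given sprite count is something only a profile can say.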

Secondly, there’s writeSpriteToBuffer. This has actually got its work cut out for it as it has to scale and rotate the sprites itself in order to take advantage of glDrawArrays. I’m not doing any rotation but quite a few sprites are being scaled using floating point maths when really they could probably be done using integer maths somehow. Instead of specifying a scale in terms of 1.0f being 1:1 scaling I should specify an absolute width and height for the sprite to be rendered at (which would default to the actual width and height of the image for sprites which aren’t scaled).

Thirdly, the SimpleParticleInstance.doTick method looks like it could do with some optimising as well. All this has to do is add vectors together. Perhaps particles need to be moved into fixed-point integer maths because this is just far too slow. There’s only a thousand or so particles whizzing around, and that’s just not acceptable performance.

Finally, ColorSequence getColor is taking a strangely long time, which is odd because it’s quite trivial - so it’ll need investigating.

It’s worth pausing work on the game and getting these bits optimised now because they’re “finished” - the code works and will be in the game in this state, so now’s the time to tweak it and make it faster whilst it’s easy to check that it’s working properly.

Surprisingly, performance optimisation is actually quite a fun aspect of games programming, and it doesn’t get much of a mention.

Cas :slight_smile:

Hey Cas, what exactly are your plans for this? Are we gonna have to buy it when it’s done to take it for a spin?
:wink:

Ooh! A visitor!
Yeah, hopefully :slight_smile: I’ll be releasing it through an indie publisher of some kind (I mentioned a couple of favourites earlier) and I expect it to cost about £15 or so (gowaaaaaan! the price of a takeaway curry! cuttin’ me own throat etc.) and if I end up making enough out of it to quit the day job - I will! And write another one. I’ve got 3 games lined up, all very thoroughly planned.

Intrepid beta testers can of course expect their own copy :slight_smile:

Oh yeah of course - there’s a 4 level demo too.

Right then, I’d better get on putting the radix sort into the sprite engine…

Cas :slight_smile:

Now take a look at this:


     Compiled + native   Method                        
  8.3%  2052  +     0    com.shavenpuppy.jglib.sprites.SpriteRenderer.writeSpriteToBuffer
  5.2%  1283  +     0    xap.BattleZone.render
  5.0%  1252  +     0    xap.GamePanel.doAnimationTick
  4.7%  1165  +     0    com.shavenpuppy.jglib.algorithms.RadixSort.sort
  4.1%  1008  +     0    xap.particles.SimpleParticleFeature$SimpleParticleInstance.doTick
  2.5%   633  +     0    xap.ColorSequence.getColor
  2.5%   624  +     0    com.shavenpuppy.jglib.sprites.SpriteRenderer.sort
  1.5%   367  +     0    java.nio.DirectByteBuffer.put
  1.1%   268  +     0    xap.particles.ExplosionParticleFeature$ExplosionParticleInstance.doTick
  0.6%   143  +     0    vtable chunks
  0.4%   107  +     0    com.shavenpuppy.jglib.sprites.Animation.animate
  0.2%    43  +     0    com.shavenpuppy.jglib.sprites.RandomGotoCommand.execute
  0.1%    29  +     0    xap.features.ParticleFeature$ParticleInstance.deallocate
  0.1%    26  +     0    com.shavenpuppy.jglib.sprites.FrameCommand.execute
  0.1%    22  +     0    adapters
  0.1%    21  +     0    xap.gui.Component.render
  0.1%    21  +     0    xap.features.EntityFeature.tickAllEntities
  0.1%    20  +     0    xap.features.GidrahFeature$GidrahInstance.doAliveTick
  0.1%    19  +     0    xap.gui.Label.renderSelf
  0.1%    19  +     0    com.shavenpuppy.jglib.sprites.TransparentStyle.resetState
  0.1%    16  +     0    xap.BattleZone.tick
  0.1%    16  +     0    com.shavenpuppy.jglib.sprites.GlowingStyle.resetState
  0.1%    15  +     0    com.shavenpuppy.jglib.sprites.GlowingStyle.setupState
  0.1%    15  +     0    com.shavenpuppy.jglib.sprites.TransparentStyle.setupState
  0.1%    13  +     0    xap.gui.Interface.mainLoop
 37.7%  9367  +    10    Total compiled (including elided)

         Stub + native   Method                        
 16.1%     0  +  4000    org.lwjgl.opengl.BaseGL.swapBuffers
  6.6%    11  +  1618    org.lwjgl.opengl.CoreGL.drawArrays
  1.6%     2  +   403    org.lwjgl.opengl.CoreGL.callList
  1.5%    52  +   322    org.lwjgl.opengl.CoreGL.disable
  1.4%    18  +   325    org.lwjgl.opengl.CoreGL.bindTexture
  1.4%    54  +   286    org.lwjgl.opengl.CoreGL.enable
  1.4%    27  +   312    org.lwjgl.opengl.CoreGL.texEnvi
  1.2%    64  +   240    org.lwjgl.opengl.CoreGL.enableClientState
  1.1%    40  +   234    org.lwjgl.opengl.CoreGL.disableClientState
  0.9%     2  +   234    org.lwjgl.opengl.GL.finishFenceNV
  0.7%    23  +   149    org.lwjgl.opengl.CoreGL.blendFunc
  0.5%     5  +   123    org.lwjgl.opengl.CoreGL.pushMatrix
  0.3%     4  +    79    org.lwjgl.opengl.CoreGL.begin
  0.3%    17  +    61    org.lwjgl.opengl.CoreGL.texCoord2f
  0.3%     8  +    56    org.lwjgl.opengl.CoreGL.vertex2i
  0.2%     0  +    41    org.lwjgl.opengl.CoreGL.clear
  0.1%     3  +    33    org.lwjgl.opengl.CoreGL.translatef
  0.1%     3  +    25    org.lwjgl.opengl.CoreGL.popMatrix
  0.1%     2  +    23    org.lwjgl.opengl.CoreGL.scissor
  0.1%     0  +    19    org.lwjgl.input.Mouse.nPoll
  0.1%     1  +    15    org.lwjgl.opengl.CoreGL.loadIdentity
  0.1%     1  +    15    org.lwjgl.opengl.CoreGL.color4ub
  0.1%     0  +    16    org.lwjgl.opengl.GLU.ortho2D
  0.1%     0  +    14    org.lwjgl.Sys.getTime
  0.0%     6  +     4    org.lwjgl.opengl.CoreGL.end
 36.6%   361  +  8733    Total stub (including elided)

Sorting now takes a mere 7% of the processing time instead of 22% - so I’ve just speeded up the whole process by a huge amount. Now the frame rate only begins to drop when there are thousands of particles on the screen. I’ve still got to figure out what’s killing my dual P3/700 GF1 Win2k system when anything explodes. I mean, just a single explosion slows it down to 15fps, but you can have tons of other stuff going on and it doesn’t bat an eyelid. I’m wondering if I haven’t picked a mode which requires a software path on the GF1 but to my knowledge it’s all pretty trivial blending going on. Odd…

writeSpriteToBuffer might still be tweaked but I don’t think it’s worth the effort - if I can handle 1,000 sprites without performance degradation on the target system I’ll be happy. There are a number of issues concerning floating point performance that I’m worried about though, and I’m thinking it might be wise to try and move to ints where possible. Once upon a time floating point was a supreme luxury that scientists and other incredibly patient users used to do clever things. Now it’s just an easy peasy way of moving gidrahs. How times change :wink:

The moral of the tale: Never trust anyone else’s code! This was actually something I learned from Michael Abrash’s Graphics Programming Black Book (an excellent book - dated but entirely relevant).

Other things I have done - there’s a fps watcher in the main loop now. If the fps drops below 35 for longer than a second or so then it tells the game to cut back on the special effects. Right now this is a rather simplistic reduction in the number of particles allowed in the game. It’ll probably do the trick. The end result is that the game tunes itself slightly to your system.
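
Roughly how such a watcher could work, as a sketch only. The 35fps threshold and the one second grace period come from the description above; halving the particle ceiling, the floor of 64 particles, and every name in the code are assumptions made for the example.

```java
/** Hypothetical sketch of an fps watcher that lowers a particle ceiling. */
final class FpsWatcherSketch {
    private static final int MIN_CEILING = 64; // assumed floor so some particles always survive

    private final double minFps;
    private final double graceSeconds;
    private double secondsBelow;
    private int particleCeiling;

    FpsWatcherSketch(double minFps, double graceSeconds, int initialCeiling) {
        this.minFps = minFps;
        this.graceSeconds = graceSeconds;
        this.particleCeiling = initialCeiling;
    }

    /** Call once per frame with that frame's duration in seconds; returns the current ceiling. */
    int onFrame(double frameSeconds) {
        double fps = 1.0 / frameSeconds;
        if (fps < minFps) {
            secondsBelow += frameSeconds;
            if (secondsBelow >= graceSeconds) {
                // Slow for long enough: cut the effects budget and start counting again.
                particleCeiling = Math.max(MIN_CEILING, particleCeiling / 2);
                secondsBelow = 0.0;
            }
        } else {
            // One good frame resets the clock, so brief hiccups are ignored.
            secondsBelow = 0.0;
        }
        return particleCeiling;
    }
}
```

The main loop would call onFrame with each measured frame time and pass the returned ceiling to whatever allocates particles. This version only ever lowers the ceiling; raising it again when things calm down is a separate design decision.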

There’s also a gidrah queue now. I’ve limited the maximum number of gidrahs in the game to 32 (it gets too crowded otherwise and runs the risk of slowing down heinously). When a gidrah is spawned and there’s no free slots it sits patiently in a queue waiting for one of its brethren to get the finger, at which point it appears as normal. Some gidrahs bypass the queue and will appear even if there are no free slots - these are the ones that appear dynamically in the game rather than ressing in, like Mad Jelly mutations or worms. Oops, gave one away there :-X
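
A bare-bones sketch of that queueing rule, with gidrahs reduced to plain names. Everything here is illustrative: the class, the methods and the String stand-in are hypothetical, and the real game objects will obviously carry a lot more state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a spawn queue with a cap on active gidrahs. */
final class GidrahQueueSketch {
    private final int maxActive;
    private int active;
    private final Deque<String> waiting = new ArrayDeque<>();

    GidrahQueueSketch(int maxActive) {
        this.maxActive = maxActive;
    }

    /** Returns true if the gidrah appears now, false if it has to wait in the queue. */
    boolean spawn(String gidrah, boolean bypassQueue) {
        if (bypassQueue || active < maxActive) {
            active++; // bypassing spawns may push the count above the cap
            return true;
        }
        waiting.addLast(gidrah);
        return false;
    }

    /** Call when an active gidrah dies; returns the queued gidrah taking its slot, or null. */
    String onDeath() {
        active--;
        if (active < maxActive && !waiting.isEmpty()) {
            active++;
            return waiting.removeFirst();
        }
        return null;
    }

    int activeCount() {
        return active;
    }
}
```

Note that after a bypassing spawn the count can sit above the cap, so the next death frees no slot and the queue stays put until the count drops back under the limit.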

Chaz has done me a Mad Jelly and a Tringle. What can I say - they are absolutely fantastic. You’ve never seen lime jelly look so mean. It’s like evil flubber. Hopefully he’ll do the cute purple blobs tonight. I think he wants to make them furry with big googly eyes. It’ll be a sin to pop a cap in one that’s for sure.

I’m thinking that to reduce the size of the final executable I may have to skip every other animation frame Chaz has done as we’ve already grown a megabyte in size with just four sprites done. I doubt whether anyone will be able to tell the difference as everything’s so fast anyway.

Also put in the skeleton code for the powerups, and the other supporting code for rotating orbs and homing rockets. I don’t think I’ll include powerups in the demo as a little bit of an incentive :-p

Cas :slight_smile:

AAAGH! There, that’s better.

I have spent all day today replacing nearly all of the floating point calculation in the game with fixed point (16:16) arithmetic in an attempt to get greater performance. What an absolute nightmare! It’s still not perfect because I need a fast fixed-point multiply and fast fixed-point divide (I still cheat and convert to floats) which are accurate without overflowing. A fixed-point sqrt would be nice too; there’s one in Allegro I might try and figure out.
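
One common way to get a 16:16 multiply and divide that do not overflow part-way through is to widen to a 64-bit long for the intermediate value. The sketch below shows that approach; it is not the code in the game, the names are made up, and whether it is fast enough on the target hardware is an open question that only measuring will answer.

```java
/** Hypothetical sketch of 16:16 fixed-point helpers using a 64-bit intermediate. */
final class Fixed16Sketch {
    static final int ONE = 1 << 16; // 1.0 in 16:16

    static int fromInt(int i) {
        return i << 16;
    }

    static int fromDouble(double d) {
        return (int) Math.round(d * ONE);
    }

    static double toDouble(int f) {
        return f / (double) ONE;
    }

    /** a * b: the 64-bit product cannot overflow; the result must still fit in 16:16. */
    static int mul(int a, int b) {
        return (int) (((long) a * b) >> 16);
    }

    /** a / b: the numerator is pre-shifted in 64 bits so no precision is lost first. */
    static int div(int a, int b) {
        return (int) (((long) a << 16) / b);
    }
}
```

For example, mul(fromDouble(1.5), fromInt(2)) gives fromInt(3), and div(fromInt(3), fromInt(2)) gives fromDouble(1.5). The final cast back to int will still wrap if the true result is outside the 16:16 range, so very large values need care.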

I’ve also abandoned WGL_EXT_swap_control - probably a good thing as it’s not cross-platform - and now resort to timing frames. I had to do this because as soon as a frame exceeded 17ms I missed the vertical blank, and had to wait for another one to come along 17ms later. The result: instant 30fps slowdown. At least with frame rate capping it degrades very gradually; I very rarely see less than 60fps on the GF1/P3-700/Win2K box now even in full catastrophe mode. Even Brian gets 60fps on his cranky 500MHz celery/GF2MX rig. However, it tears a bit now when you zoom around. Bugger. Still, it can’t be helped.
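
The arithmetic behind that cliff, as a small sketch. It assumes a 60Hz display (which is what the 17ms figure implies) and made-up method names: with vsync a late frame has to wait for the next whole vertical blank, whereas with a plain cap the frame rate is just whatever the frame time allows.

```java
/** Hypothetical sketch comparing vsync-locked and capped frame rates. */
final class FramePacingSketch {

    /** With vsync, every frame occupies a whole number of refresh periods. */
    static double fpsWithVsync(double frameMs, double refreshHz) {
        double periodMs = 1000.0 / refreshHz;
        double blanksPerFrame = Math.max(1.0, Math.ceil(frameMs / periodMs));
        return refreshHz / blanksPerFrame;
    }

    /** With a timed cap and no vsync, the rate simply follows the frame time up to the cap. */
    static double fpsWithCap(double frameMs, double capFps) {
        return Math.min(capFps, 1000.0 / frameMs);
    }
}
```

A 17ms frame comes out at 30fps under vsync but a little under 59fps with the cap, which is the gradual degradation described above. The price is the tearing, since frames are no longer presented on the vertical blank.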

The good news is - performance is nearly doubled thanks to this gargantuan 14 hour effort, including all sorts of other tiny tweaks. What’s the moral? Well - don’t use floating point, that’s what. Modern computers can handle floating point all right - the laptop never saw a slowdown but it’s got hardware T&L and a 1.2GHz P3-M in it - but hardly anyone actually owns fast hardware. The vast majority of people are still using wanky 500MHz beasts with TNTs. I’m excluding totally casual gamers from my target market by the way - maybe a bad business decision, but ultimately, it’s not a very casual shoot-em-up. It’s very, very, hardcore :smiley:

Of course, when I get hold of a system without T&L in hardware it’s back to doing millions of floating point operations per second again. This might not be good. I will have to include an option to disable eye candy and such. Bah.

It’s 1:20am again. Yawn. Chaz is coming over tomorrow all the way from sunny Brighton to put in cute purple blobs and mutation animation, and the Gunner graphics. And… that’s the core game! So it’ll be time to start adding stuff like title screens and score panels etc. Public alpha test in 2 weeks’ time. Still looking for volunteers to try it out…

Hey - still no coffee!

Cas :slight_smile:

Oh! pick me! pick me! [hand waving violently in the air]
I volunteer to test it. :slight_smile:

Patience, grasshopper.

Chaz arrived this evening. Instead of doing any work though we got pissed and played a few indie games to see what we were up against. Mutant Storm - pheweeee, now that is a fine game. (PomPom are insanely good). There’s Galaxy Force, Gridrunner++, Space Tripper, BrainWave (!), Strayfire, and of course, the original Defender running in MAME32.

The good news is, XAP is beginning to look like a proper pro competitor. The gameplay is very solid, and it turns out people are actually really enjoying whizzing around shooting stuff. I even catch myself at it for no good reason when I should be working for The Man.

Charlotte slagged my laser off again. Chaz doesn’t like it much either :frowning: I’m gutted, because I really like it. I might possibly replace it with a single elongated quad though because drawing 400 sprites to render a single laser is serious overkill. You can have 12 of them on the screen which can amount to an awful lot of processing for not a lot more effect. But then - what should the laser look like? Answers by email please…

Cas :slight_smile:

Particles have now been subdivided into two types - sprite based particles and GL_POINTS based particles. Gidrah and player explosions together accounted for nearly 1,000 sprites in sparks alone; at four vertices per sprite quad that’s 4,000 vertices to process. Now they’re rendered using GL_POINTS, which is only 1,000 vertices and a corresponding 4x speedup in processing. Always trying to get those low-end systems performing acceptably. They don’t look quite as nice as the sprite based sparks but after a little while I didn’t notice any more.

Talking of which - we all noticed that no-one ever looks at the player’s ship. It’s so obvious of course that no-one has ever mentioned it - but you’re always looking at the target, not the avatar. It’s been the same since the very first Space Invaders of course - right after you’ve marvelled at the graphic for the first time you stop looking at it and worry about the things that are whizzing around the screen trying to kill you, and never really look at it again! (Just like riding motorbikes in London) We saw, for example, that the player’s ship in Mutant Storm is crap! But it doesn’t matter, because you don’t look at it!

Chaz didn’t manage to finish the cute purple blobs but he did do the Gunner, which is an angry, throbbing, slightly translucent red ball with spines. I’m going to make it smoke a bit while it chases you, and try to create a sizzling sound effect like frying bacon. The gunner is now faster than the player - you can’t outrun it - but it’s slower to accelerate, so it tends to shoot past you. I might stop it from shooting as it travels faster than bullets anyway.

We’ve started work on the new background for XAP as well to replace my naff circuit board patterns. There’s a bottom, repeating texture layer, which just scrolls away and never ends, and over the top of it, there’s a transparent layer drawn as a grid (indexed GL_TRIANGLES, if you’re interested). I’m a little worried about burning fill rate on older cards as this effectively means blitting the whole display and then blending the whole display as well as drawing all the sprites and points. We’ll have to see how it goes.

The top texture sees the return of the classic water rippling effect to perturb it. When I put some lighting on it it will wobble and ripple when things get blasted on it or res-in.
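
For reference, the water effect most people mean by this is the old two-buffer height-field trick, sketched below. That this is the exact variant going into the game is an assumption, as are the names, the integer heights and the damping of 1/32 per step. How the heights are then used to perturb the texture or the lighting is not shown.

```java
/** Hypothetical sketch of the classic two-buffer water ripple height field. */
final class RippleSketch {
    final int w, h;
    private int[] prev; // heights one step ago
    private int[] cur;  // latest heights

    RippleSketch(int w, int h) {
        this.w = w;
        this.h = h;
        prev = new int[w * h];
        cur = new int[w * h];
    }

    /** Pokes the surface, e.g. where something explodes or resses in. */
    void disturb(int x, int y, int amount) {
        cur[y * w + x] += amount;
    }

    int height(int x, int y) {
        return cur[y * w + x];
    }

    /** Advances the simulation one step; border cells are left at zero. */
    void step() {
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int i = y * w + x;
                // Half the sum of the four neighbours, minus the height one step ago.
                int v = ((cur[i - 1] + cur[i + 1] + cur[i - w] + cur[i + w]) >> 1) - prev[i];
                prev[i] = v - (v >> 5); // damp by 1/32 so ripples die away
            }
        }
        int[] t = prev;
        prev = cur;
        cur = t;
    }
}
```

A single disturbance spreads outwards as a ring over successive steps, and it is all integer maths, which suits slow CPUs.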

Technical Fact Time

Once upon a time using an indexed GL_TRIANGLE_STRIP was all the rage, and it was important that you figured out how to draw everything by stitching it together as a triangle strip, naively believing that this was the fastest way to do stuff in OpenGL (or D3D for that matter).

Since then of course I’ve sat and had a think about how drivers and hardware work when using indexed primitives.

Any driver worth its salt will cache the last couple of vertex transformations it has done. This means if you are just drawing with GL_TRIANGLES, and draw an adjacent triangle (eg. 1, 2, 3, 2, 3, 4) then only one vertex actually needs to be transformed. Furthermore, in these days of T&L and on-card vertex caches (I believe even the lowly TNT has a 3-vertex cache) it’s very likely that the transformed vertex won’t even need to be sent down the bus again. In other words, using GL_TRIANGLES is, in general, no slower than using GL_TRIANGLE_STRIP. What’s more, because you’ve avoided the headache of trying to turn a bunch of discrete triangles into strips, you’ve probably made your overall code even faster. There’s only one downside, and that is you need a slightly larger indexing array.
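
To put a number on that larger indexing array, here is a sketch that builds the indices for a grid like the background layer as plain indexed triangles, and counts what one strip per row would need instead. The names and the vertex layout (row-major, cols+1 vertices per row) are assumptions for the example.

```java
/** Hypothetical sketch: index arrays for a grid drawn as indexed GL_TRIANGLES. */
final class GridIndicesSketch {

    /** Indices for cols x rows quads over a (cols+1) x (rows+1) row-major vertex grid. */
    static int[] triangleIndices(int cols, int rows) {
        int[] idx = new int[cols * rows * 6];
        int stride = cols + 1;
        int n = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int tl = r * stride + c; // top-left vertex of this quad
                int tr = tl + 1;
                int bl = tl + stride;
                int br = bl + 1;
                // Two triangles per quad; the second reuses two vertices of the first.
                idx[n++] = tl;
                idx[n++] = bl;
                idx[n++] = tr;
                idx[n++] = tr;
                idx[n++] = bl;
                idx[n++] = br;
            }
        }
        return idx;
    }

    /** Index count if each row of quads were drawn as its own triangle strip. */
    static int stripIndexCount(int cols, int rows) {
        return rows * 2 * (cols + 1);
    }
}
```

For a 2 x 1 grid that is 12 indices as triangles against 6 as a strip, and the ratio approaches 3:1 on big grids. With small index types that is a modest amount of extra memory set against not having to build strips at all.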

There is of course a whole other thing about optimising triangle rendering order to maximise the use of on-card vertex caches (the GeForce range has a 16 vertex cache, which is, it turns out, almost the perfect size for nearly all geometry). But I’m drawing sprites, and they’re quads, and they’re not connected to each other anyway. So I’ll maybe talk about that another time :slight_smile:

Cas :slight_smile:

All work and no play makes caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspiana dull boy. All work and no play makes Caspian a dull boy. All work and no pla ymakes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dulll boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dullboy. All work and no play makes Caspian a dull boy. all work and no play makes Caspian a dull boy. All work and no play makes Caspian a dul lboy. all work and no play makess Caspian a dull boy. All work and no play makes Caspian a dull boy. All workand no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. all work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All work and no play makes Caspian a dull boy. All wwork and no play makes Caspian a dull boy. All work and no playmakes Caspian a dull boy. All work and no play makes Caspian a dull boy.All work and no play makes Caspian a dull boy.

Cas :slight_smile:

I’ve been feckin’ around with backgrounds again, trying to make the water effect look a) not shit and b) not slow but sadly I seem to be failing in the shitness department. Slowness is probably going to be a problem too as it relies on drawing every triangle twice - thanks to the fact it’s transparent I’ve got to do a depth buffer write pass first - and I’ve got to use lighting, which has always been a fairly slow affair on cards without T&L built-in as the computation has to be done on what is probably also a very slow CPU.

In other words, it would gobble up all that optimisation I’ve already done to make it run acceptably on the baseline system.

It looks like the backgrounds are once again up in the air for a complete rethink.

To distract me from hair-pulling antics we’ve been trying to think of a proper name for XAP. “Alien Flux” is looking to be the favourite. I think I might start a new thread too as it’s getting to take a very long time to reply to this one.

Cas :slight_smile: