Crabs!

I’m not entirely sure how many phones have dual-core processors, but as you seem to have done everything you can (according to you xd), why not just multithread it? Well, it wouldn’t improve performance on older phones (which I suppose is what you want most), but basic number crunching like this scales almost linearly with the number of cores. If the program is just memory bandwidth limited, hyperthreading can provide a bigger boost than a second core! For example, when I was experimenting with particle engines, I used Riven’s MappedObject library (which you really should use if you can) and multithreading. Using 2 threads on my dual core I got a mere 1.5x speedup, but using 4 threads with hyperthreading gave me a 1.95x speedup. This was without drawing anything though (scaling gets a little worse as the drawing has to be done single-threaded), but at the same time the numbers are a little biased by Turbo Boost, so scaling should at least get to 2x using hyperthreading. Basically, cores increase computation performance while hyperthreading compensates (more than expected, actually) for memory bottlenecks.
Well, I’m not entirely sure this is very attractive for phones though… xd
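
To make the "just multithread it" idea concrete, here is a minimal sketch of splitting a particle update across worker threads. It is only an illustration under my own assumptions: the class name `ParallelParticles`, the flat `pos`/`vel` arrays and the per-call thread pool are all hypothetical and are not taken from the particle engine or from MappedObject.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: advances particles in parallel by index range.
final class ParallelParticles {

    /** Adds vel[i] * dt to pos[i], with the index range split across threadCount workers. */
    static void update(final float[] pos, final float[] vel, final float dt, int threadCount) {
        // A real engine would keep one pool alive instead of creating it per call.
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<?>> pending = new ArrayList<Future<?>>();
            int chunk = (pos.length + threadCount - 1) / threadCount; // ceiling division
            for (int t = 0; t < threadCount; t++) {
                final int from = Math.min(pos.length, t * chunk);
                final int to = Math.min(pos.length, from + chunk);
                pending.add(pool.submit(new Runnable() {
                    public void run() {
                        // Each worker touches only its own slice, so no locking is needed.
                        for (int i = from; i < to; i++) {
                            pos[i] += vel[i] * dt;
                        }
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the slice; also makes its writes visible to this thread
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ParallelParticles.update(pos, vel, dt, 2)` would use two workers. Whether that actually pays off depends on the device, which is the open question in this thread.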

princec, have you tried the libgdx library for doing Android stuff? It is made by badlogic, and now Nate is also working on it.

If I remember correctly, the libgdx library fixes some performance issues with ByteBuffers and the OpenGL APIs that were present in earlier versions of Android (not 1.6 early, but 2.0/2.1 early).

Cas already applies that ‘workaround’ by pushing all data into an int[]
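
For anyone wondering what that workaround looks like, here is a rough sketch: write everything into a plain int[] first, then copy it into the direct buffer with a single bulk put instead of many small per-element puts. The class and method names (`IntArrayStaging`, `putFloat`, `flush`) are made up for illustration and are not from Cas's sprite engine or from libgdx.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical staging helper: cheap array writes first, one bulk copy afterwards.
final class IntArrayStaging {
    private final int[] staging;    // plain Java array, fast to write element by element
    private final IntBuffer target; // direct buffer that would be handed to OpenGL
    private int count;

    IntArrayStaging(int capacityInts) {
        staging = new int[capacityInts];
        target = ByteBuffer.allocateDirect(capacityInts * 4)
                .order(ByteOrder.nativeOrder())
                .asIntBuffer();
    }

    /** Stores a float as its raw bit pattern so it can share the int[] with integer data. */
    void putFloat(float v) {
        staging[count++] = Float.floatToRawIntBits(v);
    }

    void putInt(int v) {
        staging[count++] = v;
    }

    /** Copies everything staged so far into the direct buffer in one bulk put. */
    IntBuffer flush() {
        target.clear();
        target.put(staging, 0, count);
        target.flip();
        count = 0;
        return target;
    }
}
```

The price is the extra copy through the array, which is the "write everything to an array first" cost that comes up later in the thread.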

10-10 09:17:47.844: DEBUG/libEGL(1488): loaded /system/lib/egl/libGLES_android.so
10-10 09:17:47.854: DEBUG/libEGL(1488): loaded /system/lib/egl/libEGL_adreno200.so
10-10 09:17:47.854: DEBUG/libEGL(1488): loaded /system/lib/egl/libGLESv1_CM_adreno200.so
10-10 09:17:47.854: DEBUG/libEGL(1488): loaded /system/lib/egl/libGLESv2_adreno200.so
10-10 09:17:47.874: INFO/System.out(1488): Surface created
10-10 09:17:47.874: INFO/System.out(1488): init
10-10 09:17:47.874: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:47.874: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:47.994: INFO/System.out(1488): LOADING sprites0.mp3 from assets
10-10 09:17:47.994: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.054: INFO/dalvikvm-heap(1488): Grow heap (frag case) to 3.217MB for 262160-byte allocation
10-10 09:17:48.094: INFO/System.out(1488): Loaded an image 256x256
10-10 09:17:48.144: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.154: INFO/System.out(1488): LOADING sprites1.mp3 from assets
10-10 09:17:48.164: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.204: INFO/System.out(1488): Loaded an image 256x256
10-10 09:17:48.214: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.264: INFO/System.out(1488): LOADING range.jgimage from assets
10-10 09:17:48.264: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.264: INFO/System.out(1488): Loaded an image 16x1
10-10 09:17:48.264: INFO/global(1488): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
10-10 09:17:48.464: INFO/dalvikvm-heap(1488): Grow heap (frag case) to 11.022MB for 4194320-byte allocation
10-10 09:17:48.684: INFO/dalvikvm-heap(1488): Grow heap (frag case) to 21.441MB for 4194320-byte allocation
10-10 09:17:48.714: ERROR/dalvikvm-heap(1488): 2516580-byte external allocation too large for this process.
10-10 09:17:48.714: WARN/OSMemory(1488): External allocation of 2516580 bytes was rejected
10-10 09:17:48.734: INFO/ActivityManager(150): Displayed activity net.puppygames.minandroid/.MinimalAndroidActivity: 1047 ms (total 1047 ms)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488): FATAL EXCEPTION: GLThread 10
10-10 09:17:48.744: ERROR/AndroidRuntime(1488): java.lang.OutOfMemoryError
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at org.apache.harmony.luni.platform.OSMemory.malloc(Native Method)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at org.apache.harmony.luni.platform.PlatformAddressFactory.alloc(PlatformAddressFactory.java:150)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:66)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at java.nio.ReadWriteDirectByteBuffer.<init>(ReadWriteDirectByteBuffer.java:51)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at java.nio.BufferFactory.newDirectByteBuffer(BufferFactory.java:93)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:68)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at com.shavenpuppy.jglib.sprites.SpriteEngine$RenderQueue.create(SpriteEngine.java:159)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at com.shavenpuppy.jglib.sprites.SpriteEngine$RingBuffer.create(SpriteEngine.java:420)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at com.shavenpuppy.jglib.sprites.SpriteEngine.doCreate(SpriteEngine.java:666)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at com.shavenpuppy.jglib.Feature.create(Feature.java:131)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at net.puppygames.minandroid.MinimalAndroidActivity.init(MinimalAndroidActivity.java:527)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at net.puppygames.minandroid.MinimalAndroidActivity$3.onSurfaceCreated(MinimalAndroidActivity.java:239)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at android.opengl.GLSurfaceView$GLThread.guardedRun(GLSurfaceView.java:1317)
10-10 09:17:48.744: ERROR/AndroidRuntime(1488):     at android.opengl.GLSurfaceView$GLThread.run(GLSurfaceView.java:1116)
10-10 09:17:48.774: WARN/ActivityManager(150):   Force finishing activity net.puppygames.minandroid/.MinimalAndroidActivity
10-10 09:17:48.794: INFO/System.out(1488): onPause
10-10 09:17:48.794: INFO/System.out(1488): We're finishing!
10-10 09:17:48.794: INFO/System.out(1488): Saving application state
10-10 09:17:48.794: INFO/System.out(1488): cleanup

Huawei ideos x5
Android 2.21
512MB memory; booted the phone just for testing, so nothing was running in the background.

Yeah, I do need to sort out the lifecycle stuff as it’s not shutting down cleanly. It also allocates a bit more RAM than it really needs… (16mb for a single sprite engine :))

Cas :slight_smile:

“Just multithread” eh? LOL

Actually it already is, but I wouldn’t trivialise the task like that if I were you.

MappedObject is unavailable on Dalvik. Hyperthreading is not available on ARM. I don’t think we’re near memory bandwidth saturation yet. The reason it’s slow right now is that the Dalvik VM isn’t very good and does no inlining, bounds check elimination or intrinsification of Buffer writes. Also I notice that VBOs in OpenGL ES 1.1 don’t seem to have a flag that allows you to specify that they are write-only, so all those buffer writes are polluting the cache unnecessarily. Mind you, the cache is already buggered from having to write everything to an array first anyway.

Cas :slight_smile:

Incredibly I got another 50% speedup by doing the same thing to the index buffer (not sure why I forgot to do that last iteration but there we go). Up to approx 7500 sprites @ 15fps on the Galaxy, so we’re 3x faster than where I started without resorting to native code. I think that’ll do! Realistically that’s going to be about 3,000 sprites at my target framerate of 30fps on the Galaxy, or if I widen the net to cheapy phones, 1000 sprites - enough for Titan Attacks. Job’s a good 'un.

Now to fix all the lifecycle problems.

I’ll open the source to this lot later, not that it’s probably much use to many people (everyone else: you are better off with libgdx!) Many thanks for the fast cos/sin Riven - surprising how much faster it is than the intrinsics on Dalvik.

Cas :slight_smile:

MappedObject could be reworked to dump data in an int[] (dropping support for long and double)
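
A hand-written sketch of what "struct fields in an int[]" could look like, with floats stored as raw bits. This is not the MappedObject API (which generates this kind of access for you); `SpriteView`, its three-field layout and its accessor names are purely hypothetical.

```java
// Hypothetical flyweight view over an int[]; each sprite occupies STRIDE consecutive ints.
final class SpriteView {
    static final int STRIDE = 3; // assumed layout: x, y, colour

    private final int[] data;
    private int base; // offset of the sprite currently being viewed

    SpriteView(int[] data) {
        this.data = data;
    }

    /** Repositions this view onto the sprite at the given index and returns it. */
    SpriteView at(int index) {
        base = index * STRIDE;
        return this;
    }

    float x()          { return Float.intBitsToFloat(data[base]); }
    void x(float v)    { data[base] = Float.floatToRawIntBits(v); }

    float y()          { return Float.intBitsToFloat(data[base + 1]); }
    void y(float v)    { data[base + 1] = Float.floatToRawIntBits(v); }

    int colour()       { return data[base + 2]; }
    void colour(int c) { data[base + 2] = c; }
}
```

Only 32-bit fields fit one slot each, which is why long and double would have to go.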

Hm, I’m not really sure MappedObject will help me as basically all of the data needs processing on the way out of the sprite anyway. I suppose it might help data locality if all the sprites are contiguous in memory but they probably are, by and large, anyway.

Cas :slight_smile:

You should fiddle with SIN_BITS to reduce cache thrashing.

I downloaded a fresh apk this morning. I get a brief flash of the text at the top of the screen, then it all goes black. I know it’s still running as I can hear the effects when I tap the screen.

Endolf

Yeah that’s the OOME. I’ll upload a new one soon when I get lifecycle figured out. I’m sure I’m doing something fundamentally wrong.

Cas :slight_smile:

Installed it, and all I could do was hear the tapping sound.
HTC Desire HD @ Android 2.3.3

Can you explain this a little more? I use sin/cos lookup tables a lot in my Boxtris (a physics Tetris game, http://code.google.com/p/boxtris/ ) and every bit of performance is needed.

Princec: It would be nice to see a performance comparison between your sprite engine and the libgdx SpriteBatch. Maybe this could benefit both systems.

With 12 bits, two float[2^12]s are allocated: 32KB

By changing the SIN_BITS you trade memory-footprint for accuracy and performance.
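
A minimal sketch of that kind of lookup table, to show where SIN_BITS comes in. It is written in the spirit of the table being discussed and is not a copy of the actual code; the class name `SinTable` and everything except the name SIN_BITS are my own assumptions.

```java
// Hypothetical sin/cos lookup table; the table size is 2^SIN_BITS entries per function.
final class SinTable {
    static final int SIN_BITS = 12;                // lower = smaller tables, coarser angles
    static final int SIN_MASK = ~(-1 << SIN_BITS); // 0xFFF for 12 bits
    static final int SIN_COUNT = SIN_MASK + 1;     // 4096 entries for 12 bits

    private static final float RAD_TO_INDEX = (float) (SIN_COUNT / (2.0 * Math.PI));
    private static final float[] SIN = new float[SIN_COUNT];
    private static final float[] COS = new float[SIN_COUNT];

    static {
        for (int i = 0; i < SIN_COUNT; i++) {
            double rad = i * 2.0 * Math.PI / SIN_COUNT;
            SIN[i] = (float) Math.sin(rad);
            COS[i] = (float) Math.cos(rad);
        }
    }

    /** Table lookup; the mask wraps any angle (including negative ones) into the table. */
    static float sin(float rad) {
        return SIN[(int) (rad * RAD_TO_INDEX) & SIN_MASK];
    }

    static float cos(float rad) {
        return COS[(int) (rad * RAD_TO_INDEX) & SIN_MASK];
    }

    /** Bytes used by both tables: 2 arrays * SIN_COUNT floats * 4 bytes each. */
    static int footprintBytes() {
        return 2 * SIN_COUNT * 4;
    }
}
```

With SIN_BITS = 12 the two arrays take 2 * 4096 * 4 = 32768 bytes, the 32KB mentioned above. Each bit removed halves that footprint and doubles the angular step between entries.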

8 bits seems to be indistinguishable for the purposes of rotating little sprites.

The latest version is here, and now should have proper application lifecycle management. And more sprites :slight_smile: As before, if you could let me know the point at which it reaches 15fps on your phone steadily that’d be grand. And any other weirdness. This looks like I’ve cracked it now, so if this test goes well, on with a game!

Cas :slight_smile:

Samsung Galaxy Gio: black screen, audio when touching and background music. Crashed a bit later, when I started touching the black surface again.

Damn :frowning: Anything interesting in logcat?

Cas :slight_smile:

Same here: sound is ok, but just a black screen. Very occasionally flashes a screen full of random colored blobs. Samsung Fascinate with CM7

Oh dear, this is beginning to look like vertex buffer objects don’t work.

Cas :slight_smile: