You guys might have run into this issue: when writing AI, you want your units’ behaviour to be coded as simply as possible, preferably with code that looks single-threaded. The problem, however, is that each unit would then need its own thread, and that’s just bad for performance, due to both context switches and lock contention.
There are a few alternatives:
[*] Splitting your code into very small chunks and scheduling those tasks in a queue (poor man’s green threads)
[*] Using a library that injects continuations into your code (a little less poor man’s green threads)
[*] Using a scripting language that is interruptible, with a local stack and program counter. You have to write your code in another/new language, without IDE support, and performance is probably terrible.
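To make the first alternative concrete, here is a rough sketch of what the "small chunks in a queue" approach tends to look like. The names (ChunkedUnit, ChunkedDemo, the findTarget/moveTo/attack steps) are made up for illustration. Note how the unit's straight-line behaviour has to be chopped into a hand-written state machine:

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical unit whose behaviour is split into small chunks by hand. */
final class ChunkedUnit implements Runnable {
    private final String name;
    private final Queue<Runnable> queue;
    private final StringBuilder log;
    private int state = 0; // hand-rolled "program counter"

    ChunkedUnit(String name, Queue<Runnable> queue, StringBuilder log) {
        this.name = name;
        this.queue = queue;
        this.log = log;
    }

    @Override
    public void run() {
        switch (state++) {
            case 0: log.append(name).append(":findTarget "); break;
            case 1: log.append(name).append(":moveTo "); break;
            case 2: log.append(name).append(":attack "); return; // finished, do not re-enqueue
            default: return;
        }
        queue.add(this); // yield: let the other units run their next chunk first
    }
}

final class ChunkedDemo {
    /** Runs two units on a single thread, interleaving their chunks. */
    static String run() {
        Queue<Runnable> queue = new ArrayDeque<>();
        StringBuilder log = new StringBuilder();
        queue.add(new ChunkedUnit("A", queue, log));
        queue.add(new ChunkedUnit("B", queue, log));
        Runnable task;
        while ((task = queue.poll()) != null) {
            task.run();
        }
        return log.toString().trim();
    }
}
```

ChunkedDemo.run() interleaves the two units chunk by chunk on one thread, but every local variable that must survive a yield has to become a field, which is exactly the bookkeeping you would rather not write by hand.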
I just had the following idea - and feel free to point out the obvious mistakes… if any.
It would be a major advantage to write your AI code as if it were single-threaded, written in Java, in your favourite IDE, generating a standard Java class file. Now the idea is not to load it in the JVM, but to interpret it yourself. It might sound slow, and like reinventing the wheel, and what-not, but here’s the catch: when you stumble upon the invoke* family of bytecodes (invokevirtual, invokestatic and friends), or the getfield/putfield bytecodes, you’d simply use Java Reflection to let the JVM run the target at full speed. Suddenly you have this class in which you can step through the bytecodes (with a custom program counter and stack), making it possible to put the execution on hold or to interleave thousands of units’ ‘green threads’, while you’d still be able to call into all your other classes at full speed.
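A minimal sketch of that stepping loop, under big assumptions: there is no class file parsing here, and Insn is a made-up, decoded stand-in for three bytecode-like operations. GreenThread, Insn and InterpreterDemo are hypothetical names. What it does show is the per-unit program counter and stack, plus delegation of the call to the JVM through reflection:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, heavily simplified stand-in for a decoded bytecode instruction. */
final class Insn {
    enum Op { PUSH, INVOKESTATIC, RETURN }
    final Op op;
    final Object constant; // operand of PUSH
    final Method target;   // operand of INVOKESTATIC
    private Insn(Op op, Object constant, Method target) {
        this.op = op; this.constant = constant; this.target = target;
    }
    static Insn push(Object c) { return new Insn(Op.PUSH, c, null); }
    static Insn invokeStatic(Method m) { return new Insn(Op.INVOKESTATIC, null, m); }
    static Insn ret() { return new Insn(Op.RETURN, null, null); }
}

/** One interpreted unit: its own program counter and operand stack. */
final class GreenThread {
    private final Insn[] code;
    private final List<Object> stack = new ArrayList<>();
    private int pc = 0;
    private boolean done = false;

    GreenThread(Insn[] code) { this.code = code; }
    boolean isDone() { return done; }
    Object top() { return stack.get(stack.size() - 1); }
    private Object pop() { return stack.remove(stack.size() - 1); }

    /** Executes exactly one instruction, then returns control to the scheduler. */
    void step() throws Exception {
        if (done) return;
        Insn in = code[pc++];
        switch (in.op) {
            case PUSH:
                stack.add(in.constant);
                break;
            case INVOKESTATIC: {
                int n = in.target.getParameterCount();
                Object[] args = new Object[n];
                for (int i = n - 1; i >= 0; i--) args[i] = pop(); // last argument is on top
                Object result = in.target.invoke(null, args);     // runs as normal JIT-compiled code
                if (in.target.getReturnType() != void.class) stack.add(result);
                break;
            }
            case RETURN:
                done = true;
                break;
        }
    }
}

final class InterpreterDemo {
    /** Interleaves 1000 units round-robin, one instruction each per pass. */
    static int run() {
        try {
            Method max = Math.class.getMethod("max", int.class, int.class);
            Insn[] code = { Insn.push(3), Insn.push(7), Insn.invokeStatic(max), Insn.ret() };
            List<GreenThread> units = new ArrayList<>();
            for (int i = 0; i < 1000; i++) units.add(new GreenThread(code));
            boolean anyAlive = true;
            while (anyAlive) {
                anyAlive = false;
                for (GreenThread t : units) {
                    if (!t.isDone()) { t.step(); anyAlive = true; }
                }
            }
            return (Integer) units.get(0).top();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

A real version would decode the Code attribute of the class file and would also need local variables, branches and instance calls. The scheduler would not change much though: it just decides which unit gets the next step().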
The interpreter might be 50x (or more) slower than execution by the JVM, but in the end you’re probably doing all the hard stuff in ‘external’ code (JIT compiled classes) anyway, so you might not even notice the performance degradation. It might even be faster (both in coding time and in execution time), as you won’t have to split up your tasks into dozens of Runnables, and call sites that see many implementations of one interface are hard for the JIT to inline, which slows down invokeinterface (more or less).
Implementing a Java bytecode interpreter that delegates the ‘hard stuff’ to the JVM itself shouldn’t be too hard, right? The only non-trivial thing is throwing and catching exceptions.
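For the exception part, a rough sketch of what the interpreter would have to do: a compiled method carries an exception table (start pc, end pc, handler pc, catch type), and the first entry that covers the current pc and matches the thrown type wins. HandlerEntry and ExceptionDispatch are hypothetical names, and the table below is made up instead of being read from a class file. Note that an exception thrown inside a reflective call arrives wrapped in an InvocationTargetException, so it has to be unwrapped first:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory form of one exception table entry. */
final class HandlerEntry {
    final int startPc;   // inclusive
    final int endPc;     // exclusive
    final int handlerPc; // where to continue interpreting
    final Class<? extends Throwable> catchType; // null means catch-all (finally)

    HandlerEntry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
        this.startPc = startPc; this.endPc = endPc;
        this.handlerPc = handlerPc; this.catchType = catchType;
    }
}

final class ExceptionDispatch {
    /** Returns the handler pc of the first matching entry, or -1 to unwind to the caller. */
    static int findHandler(List<HandlerEntry> table, int pc, Throwable thrown) {
        for (HandlerEntry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;
            boolean typeMatches = e.catchType == null || e.catchType.isInstance(thrown);
            if (inRange && typeMatches) return e.handlerPc;
        }
        return -1;
    }

    /** Delegates a failing call to the JVM and dispatches the unwrapped exception. */
    static int demo() {
        List<HandlerEntry> table = Arrays.asList(
            new HandlerEntry(0, 10, 40, IllegalStateException.class),
            new HandlerEntry(0, 10, 50, IllegalArgumentException.class));
        try {
            Method parse = Integer.class.getMethod("parseInt", String.class);
            parse.invoke(null, "not a number"); // throws NumberFormatException inside the JVM
            return -1;
        } catch (InvocationTargetException e) {
            // The real exception is the cause; pretend the call sat at pc 5.
            return findHandler(table, 5, e.getCause());
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Once a handler is found, the interpreter would clear the operand stack, push the exception and set its program counter to the handler pc. If nothing matches, it pops the interpreted frame and repeats the lookup in the caller.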
So… what are your thoughts? Best of both worlds, or did I miss something obvious?
