Java Emulator Framework

Since Java is a great standard, offers a lot of nice patterns, and has such a fast runtime, I wonder why there has been no attempt at an Emulator Design Framework?

I know there are a lot of emulator projects in Java, but there is no project that tries to federate them the way MAME has done. So, just to start, I propose:

  • A standard CPU pattern: all CPU emulation attempts share some basic functionality such as cycle management, IRQ management, etc.
  • A set of assembly instructions
  • Some CPU input/output: CPU port management, etc.
  • Some very specific hardware management (e.g. custom manufacturer chips)

There could also be a game driver pattern, such as the one in MAME. A rough sketch of the CPU part is below.
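Every name here is hypothetical; this is just an illustration of the kind of standard CPU and bus contract I mean, not an existing API:

// Hypothetical sketch of a shared CPU contract; all names are illustrative only.
interface Cpu {
  void reset();                        // power-on / reset state
  int step();                          // execute one instruction, return cycles used
  void run(int cycles);                // run until the cycle budget is spent
  void requestInterrupt(int vector);   // IRQ management
  void attachBus(Bus bus);             // memory and I/O live behind a bus
}

// Minimal bus abstraction for memory and port access.
interface Bus {
  int read(int address);
  void write(int address, int value);
  int readPort(int port);              // CPU port management
  void writePort(int port, int value);
}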

Hope this will interest someone… :slight_smile:

Note: sorry for my poor English. Little French guy here :wink:

CottAGE is what you are describing. I haven’t looked at it in a while, but when I did I thought it was crunchy at best as a framework.

Funny, CottAGE’s emulation layer is called ‘JEF’, for ‘Java Emulation Framework’, the topic of this forum.
It might be what you’re looking for. :slight_smile:

( Besides, they are looking for programmers for the project… so… )

True, JEF needs work (and a lot of cleaning up) to be a really good framework, but the basic building blocks are there to build your own emulator.
Although I haven’t had much time to work on it recently, I’m still in the process of refactoring and redesigning things.
CottAGE uses JEF to make an arcade emulator based on MAME’s driver architecture.
We are looking for contributors to improve the project, and if you have any questions about how it works, don’t hesitate to email me.

Erik :slight_smile:

erikd:
What if you used BCEL http://jakarta.apache.org/bcel/index.html (or another similar tool) to regenerate a replica of the ROM as a list of calls, one per instruction? That way you would have a bytecode equivalent of the ROM and would no longer need runtime parsing. Moreover, it would be the JVM’s job to inline and optimise whatever it can.
I’d be curious to know how fast it could be after such a treatment.
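Just to illustrate the idea (purely hypothetical, not actual BCEL output), a translated ROM fragment might end up as plain straight-line Java, one statement per original instruction:

// Hypothetical illustration: a statically translated ROM block as straight-line
// Java, one statement per original instruction, free for the JVM to inline.
class TranslatedRom {
  int a, hl;                         // toy register file
  int[] memory = new int[0x10000];

  // Body generated from the instructions found at ROM address 0x0100.
  void block_0100() {
    a = 0x10;                        // LD A, 0x10
    hl = 0x4000;                     // LD HL, 0x4000
    memory[hl] = a;                  // LD (HL), A
    block_0200();                    // JP 0x0200 becomes a plain call
  }

  void block_0200() { /* next translated block */ }
}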

[quote]What if you used BCEL http://jakarta.apache.org/bcel/index.html (or another similar tool) to regenerate a replica of the ROM as a list of calls, one per instruction?
[/quote]
Yes, someone else suggested something like that too. It’s a fascinating idea; however, I think it’s close to impossible to do a static recompilation of a ROM to bytecodes. How would you distinguish code from data, or handle generated or self-modifying code?

Maybe it is possible to take a dynamic approach, sort of like a “JIT CPU-native-opcode-to-Java-bytecode translator” 8)
I don’t have a plan for how I could implement it, though. It would require some mix of the current interpreting and translating to bytecodes, so there would always be some parsing left.
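Something like this, maybe (a very rough sketch, nothing of this exists in JEF; all names are made up):

// Hybrid loop sketch: run cached translated blocks where available,
// fall back to plain interpretation everywhere else.
import java.util.HashMap;
import java.util.Map;

class HybridCpu {
  interface Block { int run(); }                 // a translated basic block, returns the new PC
  final Map<Integer, Block> translated = new HashMap<>();
  int pc;

  void step() {
    Block block = translated.get(pc);
    if (block == null) {
      block = translate(pc);                     // try to emit bytecode for this block
      if (block != null) translated.put(pc, block);
    }
    if (block != null) {
      pc = block.run();                          // run the translated code at full speed
    } else {
      interpretOne();                            // some parsing is always left
    }
  }

  Block translate(int address) {
    // A real implementation would generate bytecode here (e.g. with BCEL);
    // returning null means "keep interpreting this part".
    return null;
  }

  void interpretOne() {
    pc++;                                        // placeholder for the existing interpreter
  }
}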

Erik

Seems interesting. But I’m not sure the “fastest way” is the “best way”. I think what is important in this project is not getting the fastest emulation code, but code that is fast to develop.

To sum up, what is important is not to be the fastest, but to be fast enough… :slight_smile:

[quote]I think what is important in this project is not getting the fastest emulation code, but code that is fast to develop.
[/quote]
You’re right, although you can happily write an emulator using JEF (well, the upcoming new version of it anyway ;)) without looking at the CPU code at all, so the fastest code doesn’t have to stand in the way of fast development.

On a side note, in CottAGE’s Z80-based emulators the main bottleneck is not the CPU emulation but the video emulation.
OTOH, someone is developing a 68000 emulator for JEF, and that will most probably be slow if it’s an interpreting core, so it might benefit from such an approach.

Erik

[quote]How would you distinguish code from data,
[/quote]
By analysing the instruction path: follow all the paths the code can take, and what is left is data. Maybe that’s simplistic, but I don’t see why it shouldn’t work. Maybe I’m missing something…

[quote] or handle generated or self-modifying code?
[/quote]
Errrr. I didn’t think arcade games used such advanced methods. Yes, that would complicate the whole thing.

[quote]Quote: How would you distinguish code from data,

By analysing the instruction path: follow all the paths the code can take, and what is left is data. Maybe that’s simplistic, but I don’t see why it shouldn’t work. Maybe I’m missing something…
[/quote]
You can do that to a certain degree, but still it would be very difficult.
For example jumps are not always static.
Sometimes you have an opcode like JP (HL), which jumps to the address stored in register HL. Which values can HL take? Hard to tell (and hard to be sure of). It may be possible to write some analyzer that comes up with the correct answer (and the complete instruction path), but that would be quite a piece of code!
Many programs use this opcode, Pac-Man being a popular example :slight_smile:
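To give an idea of where such an analyzer gets stuck, here is a toy sketch (illustrative only, with only two real Z80 opcodes handled):

// Toy reachability tracer: follow static control flow from the entry point and
// mark what it reaches as code; everything else is presumed data.
import java.util.ArrayDeque;
import java.util.Deque;

class CodeTracer {
  static boolean[] trace(int[] rom, int entry) {
    boolean[] isCode = new boolean[rom.length];
    Deque<Integer> work = new ArrayDeque<>();
    work.push(entry);
    while (!work.isEmpty()) {
      int pc = work.pop();
      if (pc < 0 || pc >= rom.length || isCode[pc]) continue;
      isCode[pc] = true;
      switch (rom[pc] & 0xFF) {
        case 0xC3:                                    // JP nn: static target, easy to follow
          if (pc + 2 < rom.length)
            work.push((rom[pc + 1] & 0xFF) | ((rom[pc + 2] & 0xFF) << 8));
          break;
        case 0xE9:                                    // JP (HL): the target is a runtime
          break;                                      // register value, so the trace stops here
        default:                                      // simplification: treat everything else as a
          work.push(pc + 1);                          // one-byte instruction that falls through
      }
    }
    return isCode;
  }
}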

But maybe it’s possible to analyze and translate as much as possible up front, regard the rest as ‘data’ and ‘untranslated code’ that will be interpreted or even JIT-ed.

I haven’t seen self-modifying code in arcade games either, but JEF is not meant to be just for arcade emulation. What I have seen is little pieces of code inserted at an interrupt hook at run time.

Self-modifying code was a popular technique in the ’80s. You could often use it to optimize loops so that you didn’t have to check conditions inside the loop. Arcade machines running code from ROMs would obviously have a harder time… but there could still be jump tables etc. that are copied to RAM. Games for home computers like the Atari 400, 800, and ST, or the Commodore 64, 128, and Amiga, would be more likely to use self-modifying code, I think.

I think it became less popular when processors became more advanced and things like separate data and instruction caches made self-modifying algorithms very expensive to use.

If it weren’t for that, I would figure the best emulation technique would be as suggested… not to emulate the processor core, but to treat the machine instructions the way JIT compilers treat bytecode… as instructions for a virtual machine that you compile to “native” code.

You probably can detect what is code and what is data at runtime if you can exercise all paths of the program… if the path of execution ever leads the instruction pointer to address X, then address X must hold an instruction or the code would crash. It would be no easy task, though, to track ‘data’ that is written to address X in a way that would let you compile it as well.

Maybe the thing to do is to assume that there is no self-modifying code and see how far that gets you… then work on patching the ‘special cases’ where self-modifying code is used.
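Something along these lines, perhaps (just a sketch of the idea, with made-up names):

// Sketch of the "patch the special cases" idea: remember which addresses have
// been executed, and treat a write to one of them as a signal that any
// translation covering that address is now stale.
class SelfModWatch {
  final boolean[] executedAsCode = new boolean[0x10000];

  void onExecute(int address) {
    executedAsCode[address] = true;          // the instruction pointer reached it, so it is code
  }

  void onWrite(int address, int value) {
    if (executedAsCode[address]) {
      invalidateTranslation(address);        // self-modifying code detected
      executedAsCode[address] = false;
    }
  }

  void invalidateTranslation(int address) {
    // A real translator would drop or re-translate the affected block here.
  }
}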

I’m just curious: remember jump tables? They used to be useful when doing assembly or even C.

In the case of an emulator, if the CPU being emulated has one-byte opcodes, it’s possible to build a jump table for them, giving a very simple main loop.

Maybe this is the way things usually work, say, in MAME. I’ve never looked at emulation code.

I wonder how well a jump table would work in Java, taking into account the way the VM optimizes code.

Of course, in this case you would have to have a table of classes implementing something like CPUOpcode:


interface CPUOpcode {
  void execute( CPU cpu );   // each opcode implementation knows how to execute itself
}

class CPU {
  CPUOpcode[] opcodes;
  byte[] memory;
  int instructionPointer;

  public void run() {
    do {
      // mask to 0..255 so opcodes >= 0x80 don't give a negative array index
      ( opcodes[ memory[ instructionPointer ] & 0xFF ] ).execute( this );
    } while (true);
  }
}

Of course, it’s a tad more complicated than this, since there are usually some two-byte opcodes and such. But that’s the basic idea.

It would probably be hard for the VM to optimize some of this when compiling to native code; perhaps it isn’t even a good idea for native emulators. Any comments?

Seb

Just to clarify what I said above a little more.

The idea is to eliminate the switch statements in the main loop, which I think are much slower for interpreting, and which are way too popular in main message loops for my taste.

Seb

A good compiler will use jump tables for switch statements when appropriate. A switch does not necessarily compile to the equivalent of
if…else-if…else-if…

[quote]A good compiler will use jump tables for switch statements when appropriate. A switch does not necessarily compile to the equivalent of
if…else-if…else-if…
[/quote]
True, but you have no control over it.

Seb

My point is: let the compiler do what it can; if your program runs fast enough, you are done. If it doesn’t, profile the code. If you find that something like the switch statement is in a critical spot, then you can experiment with alternatives.
But it’s best to know for sure whether you are smarter than your compiler about certain things. Take, for example, the poster on these forums who found that manually inlining methods slowed down his code rather than speeding it up. So try the jump table method and benchmark it. And keep in mind that particular characteristics of the switch() statement might lead the compiler to optimize differently… for instance, if you have any cases that ‘fall through’ (no break or return at the end), or possibly if your case values are not sorted, the translation to a table will not be easy for the compiler. And your table accesses come with array bounds checking that may or may not be optimized away by the JIT compiler… there are so many variables…
You never know… I’m not even sure if javac does that sort of optimization.

I agree with Seb. Your design matches what I think could be a good design (more object-oriented). I would like to add that on a HotSpot VM the result would be better, because the VM would easily identify the most frequent opcode array accesses. We all know that in a game 90% of the CPU time is monopolized by the same 20% of the code.
We’re making progress :slight_smile:

[quote]Quote:A good compiler will use jump tables for switch statements when appropriate. A switch does not necessarily compile to the equivalent of
if…else-if…else-if…

True, but you have no control over it.
[/quote]
You do have some control over it: if your case values form a dense range (in ascending order), the switch will be compiled to a jump table.

We have tried your suggestion of mimicking a jump table in Java. The MC6809 emulator, for example, was at one point built exactly as you describe (except that we use int arrays for memory, because they are faster). However, it was rewritten to a switch/case because that turned out to be a good deal faster.
It might be ‘less OO’, but as it turns out emulators are not really Java’s cup of tea performance-wise, so to me ‘less OO’ is better in this case.
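For illustration, the switch/case style looks roughly like this (toy opcodes only, not the actual MC6809 core):

// Illustrative switch/case decode loop with made-up opcodes; not the real core.
class SwitchCpu {
  int[] memory = new int[0x10000];   // int array, as mentioned above
  int pc;
  int a;

  void run(int cycles) {
    while (cycles > 0) {
      int opcode = memory[pc] & 0xFF;
      pc = (pc + 1) & 0xFFFF;                    // wrap the 16-bit program counter
      switch (opcode) {                          // dense cases compile to a jump table
        case 0x00: cycles -= 2; break;                                               // NOP
        case 0x01: a = memory[pc] & 0xFF; pc = (pc + 1) & 0xFFFF; cycles -= 2; break; // load immediate into A
        case 0x02: memory[0x2000] = a; cycles -= 3; break;                           // toy store of A
        default:   cycles -= 1; break;                                               // unimplemented opcode
      }
    }
  }
}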

Erik

Well, that was my question: whether anybody had tried the method.

I’m also not sure if javac uses internal jump tables at any point for switch statements.

As for doing it by hand: I figured that although bounds checking is NOT eliminated here (that usually happens only for array accesses inside ‘for’ and ‘while’ loops where the array index is also the loop counter), it could still be faster than even short switch statements.

Also, if javac does implement jump tables when it can, those are probably rare occasions; one would have to be careful to sort the cases, make sure none fall through, make sure there are no gaps, etc.

It seems complicated, and any careless modification could alter performance wildly.

An intermediate solution is probably to research how the original processor did the decode stage and code something analogous to that circuitry.

I’m calling this intermediate because I figure that in most cases CPUs start decoding from the most significant bit or bits, perhaps ending up with something like a search tree, where some nodes could be implemented as jump tables, others as switch statements, and others could directly execute an opcode.
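Roughly something like this (the opcode layout here is invented, just to show the shape):

// Toy sketch of decoding from the most significant bits first; the opcode
// layout is made up purely to show the tree-like structure.
class TreeDecoder {
  void decode(int opcode) {
    switch ((opcode >>> 6) & 0x03) {        // top two bits pick an instruction family
      case 0: loadGroup(opcode & 0x3F); break;
      case 1: aluGroup(opcode & 0x3F); break;
      case 2: branchGroup(opcode & 0x3F); break;
      default: miscGroup(opcode & 0x3F); break;
    }
  }

  void loadGroup(int low)   { /* further table or switch on the low bits */ }
  void aluGroup(int low)    { /* ... */ }
  void branchGroup(int low) { /* ... */ }
  void miscGroup(int low)   { /* ... */ }
}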

Seb

[quote]I’m also not sure if javac uses internal jump tables at any point for switch statements.
[/quote]
Well, AFAIK there are two different opcodes in Java bytecode for switch/case (tableswitch and lookupswitch). The tableswitch uses a jump table, is the fastest, and is used when the cases form a dense ascending range. I don’t think small gaps make any difference (my Z80 emulator does have gaps, and I checked that it still uses the table switch).
You’re right that it’s not pretty and can be fragile code, but it helps if you only put function calls in the switch.

Anyway, it’s not that bad. I mean, how complicated is it to put all the cases in ascending order and not forget the break; statements?
Not very.
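If you want to see for yourself which opcode javac picked, compile something like this (the class is just an example) and disassemble it with javap:

// Compile this and run "javap -c DecodeCheck" to see whether javac emitted a
// tableswitch (jump table) or a lookupswitch for the switch below.
class DecodeCheck {
  static int decode(int opcode) {
    switch (opcode) {
      case 0: return 1;
      case 1: return 2;
      case 2: return 3;
      case 3: return 4;
      default: return 0;
    }
  }
}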

BTW, Seb, could you explain how your intermediate and final solutions would help make things faster?
(Or which problem were they the solution to? ::))
I figure that the closer to the hardware you emulate, the slower the emulation will be because you’ll be emulating unneeded details.
(OK, those details are not always unneeded. For example, the Sega System 32 and the SNES need cycle accuracy to keep everything in sync.)

What I think is that the greatest slowdown in emulation using Java is that Java’s datatypes often don’t map very well to the emulated datatypes.
In Java there’s a lot of casting and/or masking needed to convert its datatypes to the emulated ones.
In C, for example, you have more flavours of datatype and more low-level control over them. One thing that would speed up my Z80 emulation a good deal, for example, is if Java had something like a union.
I think this impacts speed more than the opcode interpretation itself.
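A small example of what I mean (just a sketch):

// Sketch of the masking overhead: an 8-bit A register and the 16-bit HL pair
// emulated with Java ints need masking after nearly every operation.
class RegisterSketch {
  int a, h, l;                      // 8-bit registers kept in ints

  void addToA(int value) {
    a = (a + value) & 0xFF;         // wrap to 8 bits by hand
  }

  int hl() {
    return (h << 8) | l;            // a C union could alias H, L and HL directly
  }

  void setHl(int value) {
    h = (value >>> 8) & 0xFF;       // split the 16-bit value back into halves
    l = value & 0xFF;
  }
}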

Greetings,
Erik