Java Emulator Framework

Well, I don’t know if it would make things faster, nor do I propose it as a final solution. I’m just discussing the different methods since I find it worthwhile. With the switch statement working through a lookup, I think it’s hard to find something that works faster.

I agree it’s not optimal to emulate close to the hardware in many cases. However in the particular case of part of the decode circuitry it appears to me it could be good. But I’m just imagining a particular type of decode circuitry. One that would look like an opcode search tree.

Using a search tree, one could be guaranteed that it would find the code for an opcode within, say, 8 or so tree lookups max, for 8 bit opcodes. It also seems flexible enough to account for different bit sizes in opcodes naturally.

This tree wouldn’t have to exist as a complete tree either. It’s only the search that’s structured hierarchically, starting by verifying the most significant bits first, not necessarily one bit at a time.

At the different points in this hierarchical decode, one could use either switch statements or jump tables. But, as you said, it can all be done with switch if you’re careful. You’re right, it’s not that hard to maintain that code; you just gotta know how to get the compiler to compile it right.

Seb

Xbrain,
Although the VM could possibly identify which opcodes are used more frequently, I’m afraid there’s not much it can do to optimize those particular code paths.

It can’t inline anything at that point based on frequency of access; it has to handle all possible situations. So there’s not much optimization possible other than a jump table at this point.

Even the method I proposed in the previous post, the hierarchical decode, is probably slower. I only like it because it seems like an elegant solution, but that would have to be seen in an actual implementation.

Seb

[quote]It can’t inline anything at that point based on frequency of access
[/quote]
I think it would probably inline the functions into the switch cases rather than make the function call. But other than that you are right… it can’t really make the lookup for a particular case go faster.

[quote]Using a search tree, one could be guaranteed that it would find the code for an opcode within, say, 8 or so tree lookups max, for 8 bit opcodes. It also seems flexible enough to account for different bit sizes in opcodes naturally.
[/quote]
Search trees (TreeMap) or hash maps would work… but the lookup operation would still be much more expensive than a bounds check on an array.
For 8 bit instructions it is easy enough to fill any potential gaps in the array with a call to trigger an invalid opcode exception.
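To make the array idea concrete, here is a minimal sketch. It is not taken from CottAGE or JEF: the class `TableDispatch`, the `Op` interface and the register `a` are made up for illustration, and only two real Z80 opcodes (NOP and INC A) are filled in. Every other slot gets the same "invalid opcode" handler, so there are no gaps to check for at run time.

```java
import java.util.Arrays;

/** Hypothetical sketch: array-based opcode dispatch for an 8-bit CPU. */
class TableDispatch {
    /** One handler per opcode; the name and shape are assumptions for this example. */
    interface Op {
        void execute(TableDispatch cpu);
    }

    int a; // accumulator of the imaginary CPU
    final Op[] ops = new Op[256];

    TableDispatch() {
        // Fill every slot first so undefined opcodes fail loudly instead of hitting null.
        Op invalid = cpu -> { throw new IllegalStateException("invalid opcode"); };
        Arrays.fill(ops, invalid);

        ops[0x00] = cpu -> { };                          // NOP
        ops[0x3C] = cpu -> cpu.a = (cpu.a + 1) & 0xFF;   // INC A
    }

    /** Dispatch is one masked array read plus an interface call. */
    void step(int opcode) {
        ops[opcode & 0xFF].execute(this);
    }
}
```

Masking with `0xFF` keeps the index inside the 256-entry array, so the only remaining cost is the array read and the call through the interface.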
In this particular case though, as erikd has determined by experimentation, the switch case is more optimal still… ultimately compiling down to a lookup table in the bytecode… so you won’t likely get any faster than that for instruction dispatching.
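For reference, a switch-based dispatcher could look roughly like the sketch below. The class `SwitchDispatch` and its two registers are invented for this example; the four opcodes are real Z80 encodings. javac emits either a `tableswitch` or a `lookupswitch` instruction depending on how dense the case labels are, so with only a handful of sparse cases, as here, it may well pick `lookupswitch`. A full 256-case switch is dense, which is where the table lookup comes from. Running `javap -c` on the compiled class shows which one was chosen.

```java
/** Hypothetical sketch: switch-based dispatch over a few Z80 opcodes. */
class SwitchDispatch {
    int a, b; // two registers of the imaginary CPU

    void step(int opcode) {
        switch (opcode & 0xFF) {
            case 0x00: break;                       // NOP
            case 0x04: b = (b + 1) & 0xFF; break;   // INC B
            case 0x3C: a = (a + 1) & 0xFF; break;   // INC A
            case 0x78: a = b; break;                // LD A,B
            default:
                // Anything not listed is treated as invalid in this sketch.
                throw new IllegalStateException("invalid opcode " + (opcode & 0xFF));
        }
    }
}
```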
Perhaps more optimizations are waiting to be found in the ALU emulation?
I just had an interesting thought…
Could you do something really advanced like emulate a very modern CPU core, so for instance on a dual processor machine you could fetch two instructions at a time, and execute them in parallel? Maybe at that point you would have to get fancy and do instruction reordering etc… That would be something!
Entertaining perhaps… but probably too complex and not worth the effort in the end.

Well, a TreeMap wasn’t in my mind at all.

Lemme’ try to say it clearly in one posting.

CPU opcode formats usually have some sort of semantic logic to them. Take some z80 instructions for example (What I’m gonna say is oversimplified but it makes the point) :


----------- 76 543 210  76543210
LD r,r'     01 ddd sss
LD r,(HL)   01 ddd 110
LD (HL),r   01 110 sss
LD r,n      00 ddd 110  nnnnnnnn
LD (HL),n   00 110 110  nnnnnnnn

Well, having the most significant bit (bit 7) equal to 0 means we’re doing a load/store.

The next bit (bit 6) equal to 1 means there are no immediate operands and we proceed to figure out destination and source from the next two groups of 3 bits. If one of these is 6 then it means indirect addressing for source or destination, using the address at HL, otherwise we’re referring to one of 7 registers.

If instead bit 6 were 0, then we have an operand in the next byte. If the destination is 6, we’re once more using indirect addressing.

The thing is, this structure lends itself to hierarchical decode (hence my comment about this being similar to a tree search, although I admit it wasn’t the most appropriate way to describe it :stuck_out_tongue: )

This, of course, is not as efficient as the switch compiled to a table lookup, I just mentioned it because it seemed elegant and because I wasn’t sure at the time if javac did lookup-table optimization on switch/case so I figured it could provide a performance boost. But I wouldn’t implement this with a TreeMap, I would hardcode the decoding process.
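A minimal sketch of what hardcoding that decode could look like, covering only the five LD forms in the table above. The class `HierarchicalDecode`, the method `decode` and the returned strings are made up for illustration. It follows the same simplification as the post (bit 7 clear is treated as the load group), with two hedges spelled out in the code: opcode 0x76 (01 110 110) is HALT on a real Z80 rather than a load, and most other 00xxxxxx opcodes are not loads at all, so the sketch just reports them as not decoded.

```java
/** Hypothetical sketch: hardcoded hierarchical decode of a few Z80 LD forms. */
class HierarchicalDecode {
    // Z80 register names indexed by the 3-bit field; value 6 selects (HL).
    static final String[] REG = {"B", "C", "D", "E", "H", "L", "(HL)", "A"};

    /** Returns a disassembly string; 'next' is the byte following the opcode. */
    static String decode(int opcode, int next) {
        opcode &= 0xFF;

        // Level 1: bit 7. Set means we are outside the simplified load group.
        if ((opcode & 0x80) != 0) {
            return "not in the load group";
        }

        int dst = (opcode >> 3) & 7; // bits 5..3
        int src = opcode & 7;        // bits 2..0

        // Level 2: bit 6. Set means register/indirect forms with no immediate.
        if ((opcode & 0x40) != 0) {
            if (dst == 6 && src == 6) {
                return "HALT"; // 0x76 is the one exception in this block
            }
            return "LD " + REG[dst] + "," + REG[src];
        }

        // Bit 6 clear: only the 00 ddd 110 pattern is a load with an immediate byte.
        if (src == 6) {
            return "LD " + REG[dst] + "," + String.format("%02Xh", next & 0xFF);
        }
        return "not decoded in this sketch";
    }
}
```

For example, `decode(0x7E, 0)` takes the bit 6 branch and yields `LD A,(HL)`, while `decode(0x36, 0x10)` takes the immediate branch and yields `LD (HL),10h`. Each call costs a few masks, shifts and compares, which is the extra work compared with a single table lookup.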

Seb

Here’s another emulation framework in Java. It’s not as functional as Cottage.

http://jfrace.sourceforge.net/index.html

Seb

I’m a newbie here (and in Java) and an emulator fan…

Don’t you think Java’s speed is a problem for developing something like a SNES or Genesis emulator??? A PSX emulator??? (I can dream, can’t I??? :wink: )

When I see C or C++ emulators running so slowly (i.e. less than 60 fps) on my AMD 500 MHz + TNT2…

@++

PS: excuse my very poor English… The French are very bad at foreign languages :-[

If you think emulators written in C or C++ are slow, then yes, there’s also a speed problem with java.
But remember that emulators are extremely demanding in terms of CPU speed, no matter in what language you make them.
I believe that if the emulated CPU’s datatypes map fairly well to java’s datatypes, it doesn’t make that much difference in speed whether you use java or C++.
For example, when you run CottAGE using the IBM JVM, many games run just about as fast as MAME32 in a window. (it’s a shaky comparison, but hey :))

I think a good SNES, Genesis or PSX emulator written in java is possible if you have a beefy CPU in your machine, but it probably won’t be faster than its C or C++ based brothers.

Well… if that’s true, it’s sweet for me to imagine a JBle*m :wink:

When does this dream begin?

Do you know if any “advanced console or computer” emulators have been written in Java?

I know of an Atari ST emulator (whose hardware is comparable to a Genesis). A co-author of CottAGE (Gollum) has made a Gameboy Advance emulator in java and even a PC/MS DOS emulator in java.
There are no more advanced emulators yet, but I know this will change soon.
JEF (Java Emulation Framework) has gone through some serious changes and rewrites and is getting some cool additions like 68000 emulation, Q-Sound, ADPCM, 2 different FM chips and more.

Greetings,
Erik

[quote]CPU opcode formats usually have some sort of semantic logic to them.
[/quote]
Yes, true. But unfortunately, decoding the instruction in software based on that structure just takes many more operations.

The native CPU hardware would rip the instruction apart in parallel (e.g. the register #s would be tied directly to some sort of ‘register address lines’) so the elegance of the structure shows through…
When you don’t have all those different hardware units running in parallel I’m afraid that model can’t be exploited efficiently.

[quote]Xbrain,
Although the VM could possibly identify which opcodes are used more frequently, I’m afraid there’s not much it can do to optimize those particular code paths.

It can’t inline anything at that point based on frequency of access, it has to handle all possible situations. So there’s not much optimization possible other than a jump table at this point.

Even the method I proposed in the previous post, the hierarchical decode, is probably slower, I only like it because it seems like an elegant solution, but that would have to be seen in an actual implementation.

Seb
[/quote]
Maybe using some kind of short circuits on a range check…
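One possible reading of that suggestion, as a rough sketch: test for a frequently hit opcode range with a cheap comparison before falling through to the general dispatch. The class `RangeShortCircuit` and its register array are invented for this example, and picking the 0x40..0x7F block (the register-to-register LD forms on a Z80) as the "hot" range is purely an assumption. Whether this beats a plain switch would have to be measured.

```java
/** Hypothetical sketch: range check in front of the general opcode dispatch. */
class RangeShortCircuit {
    // Register file indexed by the 3-bit field; slot 6 means (HL) and is unused here.
    final int[] r = new int[8];

    /** Returns true if the fast path handled the opcode, false if the caller must dispatch it. */
    boolean step(int opcode) {
        opcode &= 0xFF;
        int dst = (opcode >> 3) & 7;
        int src = opcode & 7;

        // Fast path: 0x40..0x7F with plain registers on both sides is LD r,r'.
        // Forms involving (HL), and HALT at 0x76, have field value 6 and are skipped.
        if (opcode >= 0x40 && opcode <= 0x7F && dst != 6 && src != 6) {
            r[dst] = r[src];
            return true;
        }
        return false; // fall back to the switch or table
    }
}
```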

Why don’t you guys make an emulator API?

That’s basically what JEF is.

o…

are there any that have been written with that?

Yes, CottAGE:
http://cottage.emuunlim.com

and there’s a demo space invaders emu included in JEF.

C or C++ emulators tend to have highly optimized inline assembly.
A pure C/C++ emulator is not likely to be that much faster than a Java one.

[quote]A pure C/C++ emulator is not likely to be that much faster than a Java one.
[/quote]
In my experience, emulators are not really java’s cup of tea. While it’s true that most emulators written in C/C++ have some inline assembly, it’s also true that because you have much more low level control in plain C/C++, C/C++ based emulators are generally quite a lot faster than the java ones. This doesn’t necessarily make C/C++ quite a lot faster than java by definition, but it does make it a better choice for creating emulators if performance is a major concern.

Look at the emulator ‘Modeler’ for example (emulates Sega System32 arcade machines). It’s completely written in C++, no fancy dynamic recompiling CPU cores (just interpreters), and it emulates possibly the most complex 2D video hardware ever. It runs quite acceptably on a 450 MHz P2.
You probably wouldn’t come close to that kind of performance if you did it in java.

Erik

I have waited for this emulator for a long time. I wanted to code an emulator in java long ago, but I have had no time; work keeps me too busy, and it needs a team. Dear all, would you start this project with me? 8)

To be honest, I’m not much into emulators anymore. I’ve written a couple of emulators around my own CPU cores and started an open source MAME-like effort, but to start a new project now… No, I don’t have the time I’m afraid.
You really need a lot of patience. Really rewarding if the emulator starts working, but a major pain to debug.