Some performance-related Java language ideas

I had some Java language ideas recently, and decided to write them up and post to JavaLobby, but as of yet I’ve had no replies. I decided to post it here too:
http://www.javalobby.org/thread.jspa?forumID=152&threadID=10670

I don’t claim to be an expert in this stuff, but here are some ideas I had. Comments welcome.

By default, “float” should be optimised for performance while “double” should be optimised for accuracy. Float is typically chosen for speed and double is most often chosen when accuracy is crucial, so each should default to the optimisation that matches how it is used. Currently, both are optimised for accuracy by default. (I’m sure the HPC guys want more flexible optimisation options for doubles too, but here’s a quick and simple idea I think most people could agree on.)

“primative” class modifier: an instance of a “primative” class is not shared through references; each variable holds its own copy. With the code “int c = 1; int d = c;” the value of c is copied to d, whereas in “Object c = new Object(); Object d = c;” the variable d becomes a reference to the same object as c. With the new primative class, “PrimativeObject c = new PrimativeObject(); PrimativeObject d = c;” copies c to d, creating a new object. A decent JVM optimiser would be able to “inline” most uses so that the JVM doesn’t actually need to do object creation/deletion for them. One major use of this class is to have data structures with minimal memory usage (which may require other features of normal objects to be removed from primative objects too). For example, an array of 1000 instances of a primative class with just one non-static variable (an “int”) would take up almost exactly as much memory as int[1000] while being much more useful. This would be good for 3D, multimedia and some maths stuff, e.g. arrays of complex numbers. There could be a special sub-type of “primative class” for numbers which can use numerical operator overloading, e.g. you could write “ComplexNumber c = new ComplexNumber(1, 1); ComplexNumber d = c * c;” (I’m NOT suggesting operator overloading should be allowed in general.)
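To make the copy-versus-reference distinction concrete, here is a minimal sketch of how today's Java behaves. The class names `MutableComplex` and `CopySemanticsDemo` are made up for illustration, and the copy constructor only approximates by hand what the proposed modifier would do automatically on assignment.

```java
// Hypothetical example class; not part of any library or proposal.
final class MutableComplex {
    double re;
    double im;

    MutableComplex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Copy constructor: a manual stand-in for copy-on-assignment.
    MutableComplex(MutableComplex other) {
        this(other.re, other.im);
    }
}

class CopySemanticsDemo {
    public static void main(String[] args) {
        int c = 1;
        int d = c;   // the value is copied
        d = 2;
        System.out.println(c);      // c is unaffected by the write to d

        MutableComplex p = new MutableComplex(1, 1);
        MutableComplex q = p;       // q refers to the same object as p
        q.re = 5;
        System.out.println(p.re);   // the write through q is visible through p

        MutableComplex r = new MutableComplex(p); // explicit copy
        r.re = 9;
        System.out.println(p.re);   // p is unaffected by the write to r
    }
}
```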

Number formats with SIMD optimisation support: use standard primative classes with operator overloading to define some new standard classes. This would help performance and ease of implementation in user code for multimedia apps, particularly 2D graphics processing with some of the more unusual data types. For example, say you have a 2D image (with 24-bit RGB pixels) and want to brighten it a bit. You can just add 1 to all the RGB values, but if a value is already at 255 (the maximum for 8 bits unsigned) then adding 1 will make it wrap around to 0, which you don’t want (what actually happens in Java depends on what you use for the calculations: signed ints or signed bytes). Many modern CPUs have SIMD instructions where you can do the add, with the checking to prevent wrap-around, in just one cycle for all 3 values (or 4 if you include the alpha channel). If the Java spec had standard data types for the various multimedia number formats, the JVM could map operations on them to SIMD instructions in CPUs. So if you have “ARGB c = new ARGB(255,255,255,255); ARGB brighten = new ARGB(0,1,1,1); c+= brighten;” then not only is that much simpler than what you’d have to do today, it could be optimised incredibly well, and in a portable way too.
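For comparison, this is roughly what the brighten operation looks like when written by hand in current Java. It is a sketch only: it assumes pixels are packed as 0xAARRGGBB in an int, and the class and method names (`ArgbSaturatingAdd`, `addSaturated`) are invented for the example.

```java
class ArgbSaturatingAdd {
    // Adds two packed 0xAARRGGBB pixels channel by channel,
    // clamping each 8-bit channel at 255 instead of letting it wrap.
    static int addSaturated(int a, int b) {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int sum = ((a >>> shift) & 0xFF) + ((b >>> shift) & 0xFF);
            result |= Math.min(sum, 0xFF) << shift;
        }
        return result;
    }

    public static void main(String[] args) {
        int brighten = 0x00010101; // +1 on R, G and B, alpha untouched
        System.out.println(Integer.toHexString(addSaturated(0xFFFFFFFF, brighten))); // ffffffff
        System.out.println(Integer.toHexString(addSaturated(0xFF10FE20, brighten))); // ff11ff21
    }
}
```

Four masked adds and four clamps per pixel is the work that a single saturating-add SIMD instruction would replace.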

I would strongly recommend you correct your spelling.

Primitive.

Changing the symantics of the language, depending on the type of class is a BIG no no.
Also, your assumption that float is normally used for speed, double for accuracy is flawed.

It is far more logical to default to accuracy; if the programmer finds accurate calcs. to be the bottleneck, they should then have the option of requesting speed over accuracy.

Your final suggestion, adding operator overloading and new classes for explicitly dealing with SIMD operations is IMO another bad idea :-/
SIMD optimisations are platform/processor specific; the API is meant to be abstract.
SIMD optimisation should therefore be done by the VM’s JIT/HotSpot compiler.

Personally I’m in favour of anything that advances towards reconciling primitives (nb: no “a” :)) with classes in Java - it is the source, historically, of so much pain…and in recent years of so many ugly hacks.

I hate boxing for being a crude hack; I love it for doing some of the reconciliation that is so badly needed by Java if it is to thrive for a long time in all the areas where it thrives today (instead of merely surviving in some of them and thriving in others). :wink:

Long-term, if the primitive/class multiple-personality-disorder of Java does NOT get fixed, I’ll be in that large group of programmers who moved from C/C++ to Java early on - and then did the same in the early days of a java-successor, largely because java’s fundamental problems waste too damned much of my life.

Personally, I’m still fascinated by the story of primitives in Java. It seems that the initial all-consuming antagonism towards primitives in the early days of the JLS has never died out in Sun. (That antagonism is described in historical looks at Java: how late primitives were added, and how close they came to being kept out. Perhaps these were over-dramatised, but the conclusions were generally that primitive-fanciers had to fight REALLY hard to get them in.) It seems there are still many people who really want to get rid of them, even though tens (or hundreds?) of thousands of Java developers are still heavily reliant on them for memory and execution performance.

It seems the thinking today is that “the only way we’ll get people using objects as much as they should”, or “the only way we’ll get the focus we need on improving object-handling performance as far as it can and should go”, is if “we only tolerate the extreme view that everything MUST be a full-fat object”.

I think your “primitive” class-modifier is the best suggestion I’ve yet heard (but then again, I don’t subscribe to relevant lists etc, so there’s probably lots of other ones floating around too :)) for resolving the dichotomy. As you point out, there are desirable and valid reasons for utilising the unique characteristics of primitives in classes - and the reasons for cloning functionality the other way are so compelling that Sun has been struggling to bring the unique characteristics of classes to primitives for many many years

(cf. java.lang.Void.TYPE:

“The Void class is an uninstantiable placeholder class to hold a reference to the Class object representing the Java keyword void.”

yuk! )

Agreed, changes to “semantics” ;D are usually bad…but as a starting point for fixing the primitive/class dichotomy it’s good. Bear in mind that the semantics are ALREADY screwed up, by dint of the fact that the valid semantics of a random variable, assuming it’s declared in a separate line:


int myVariable;
...
myVariable = ...;

are dependent upon its pseudo-class (by which I mean “whether it’s a primitive or a class”). Semantically, there is IIRC no way of differentiating the two? The only possible compiler errors from forgetting whether something is primitive or not are all based on the type-checking?

The RFE to add “structs” to Java goes some way to reconciling the extreme awkwardness that adding primitive classes would entail at this stage in the language evolution. Structs would behave the same way as any other Object in Java, except that their layout in memory is guaranteed. They also serve a very specific problem domain: getting structured data in and out of ByteBuffers in an object-orientated manner.

The “struct” RFE could be implemented by current VMs by synthesizing glue bytecode that mimicked the required operation until the JVM generated specific optimized code to deal with the situation. (And one of the reasons for the RFE was that otherwise we have to write our own glue code and it’s a bit pointless hoomin beans doing what machines are best at innit?)
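As an illustration of the glue code in question, here is a minimal hand-written sketch of a struct-like view over a ByteBuffer. The `Vec3View` class, its three-float layout and its method names are assumptions made for the example; they are not taken from the RFE.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical hand-written "struct" view: three floats (x, y, z) packed into 12 bytes.
final class Vec3View {
    static final int SIZE_BYTES = 12;

    private final ByteBuffer buffer;
    private int base; // byte offset of the element this view currently points at

    Vec3View(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Re-points this view at element number `index`; no new object is created.
    Vec3View at(int index) {
        base = index * SIZE_BYTES;
        return this;
    }

    float x() { return buffer.getFloat(base); }
    float y() { return buffer.getFloat(base + 4); }
    float z() { return buffer.getFloat(base + 8); }

    // Writes straight into the underlying buffer, i.e. in place.
    void set(float x, float y, float z) {
        buffer.putFloat(base, x);
        buffer.putFloat(base + 4, y);
        buffer.putFloat(base + 8, z);
    }
}

class Vec3ViewDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(4 * Vec3View.SIZE_BYTES)
                                   .order(ByteOrder.nativeOrder());
        Vec3View v = new Vec3View(buf);
        v.at(2).set(1f, 2f, 3f);
        System.out.println(v.at(2).y());
    }
}
```

Every field needs its offset spelled out by hand, which is exactly the repetitive work a compiler-supported struct would generate.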

As for embracing SIMD instructions - they’re not platform specific. They should be exposed as an API like any other; platform implementations of the JVM should then detect explicit use of that API and emit single SIMD instructions instead of multiple bytecode instructions. E.g. a java.lang.SIMD class might contain a method int clampedAdd(int a, int b), which is implemented as normal bytecode on those systems that do not have a single machine code instruction to perform the operation.
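A pure-Java fallback for such a method could look something like the sketch below. No java.lang.SIMD class exists; `SimdFallback` is a made-up name, and clamping to the full signed int range is an assumption, since the post does not say what range clampedAdd should clamp to.

```java
// Hypothetical fallback class; java.lang.SIMD does not exist in the JDK.
final class SimdFallback {
    private SimdFallback() {}

    // Saturating signed add: clamps to Integer.MIN_VALUE..Integer.MAX_VALUE instead of wrapping.
    static int clampedAdd(int a, int b) {
        long sum = (long) a + b; // widen first so the add itself cannot overflow
        if (sum > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (sum < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) sum;
    }

    // Element-wise dst[i] = clampedAdd(dst[i], src[i]); the kind of loop a JVM
    // with knowledge of the API could map to vector instructions.
    static void clampedAddInPlace(int[] dst, int[] src) {
        if (dst.length != src.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        for (int i = 0; i < dst.length; i++) {
            dst[i] = clampedAdd(dst[i], src[i]);
        }
    }

    public static void main(String[] args) {
        System.out.println(clampedAdd(Integer.MAX_VALUE, 1) == Integer.MAX_VALUE);
    }
}
```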

The only problem is that without pass-by-reference for primitive types most of this API will be useless as most enhanced CPU instruction sets operate directly in place on data. Unless we get used to passing everything in ByteBuffers. In which case we’ll definitely need the “struct” enhancement as it makes passing by reference a relative doddle compared to overhauling the whole VM to allow reference arguments.

Any and all enhancements along these lines should be encouraged so long as a fallback can be implemented where the underlying hardware does not support it or the JVM can simulate the operation in bytecode as normal.

(Right about now I’m pretty sick of the cutting-off-nose-to-spite-face attitude with regard to performance in Java. It’s time the specs accurately reflected the state of hardware today, or the competitors (C# and God help us even VB.NET) are going to stomp on the performance benchmarks, both micro and macro.)

Cas :slight_smile:

IMHO, this is insane :slight_smile:
I have two issues.

  1. SIMD instructions are platform dependent. Actual instructions and support for different data sizes are highly variable across architectures.
  2. I would never want to have to use a SIMD API that is only POSSIBLY going to be accelerated on a given platform. We went through this idea in the Java game experts group. If you provide an API but cannot guarantee that it is accelerated (which in this case you cannot), then it will not get used. It’s that simple. The way that the JIT currently uses SIMD is much preferred.

Think about things like JOGL and JOAL. For production games we will use JOGL, but it requires 3D acceleration also. It’s true that OGL can render in software but every game that uses OGL requires the 3D hardware as well.

I see my suggestion as an addition. I would certainly never recommend preventing older byte-codes from running or older code from compiling. For example, I wouldn’t use this to change the current primitive system.

[quote]Also, your assumption that float is normally used for speed, double for accuracy is flawed.

It is far more logical to default to accuracy; if the programmer finds accurate calcs. to be the bottleneck, they should then have the option of requesting speed over accuracy.
[/quote]
Isn’t using double instead of float a “request” for more accuracy in the first place though…? So conversely, isn’t using a float a request for speed?

[quote]Your final suggestion, adding operator overloading and new classes for explicitly dealing with SIMD operations is IMO another bad idea :-/
SIMD optimisations are platform/processor specific; the API is meant to be abstract.
SIMD optimisation should therefore be done by the VM’s JIT/HotSpot compiler.
[/quote]
Isn’t that what I suggested? The JVM would optimise the code if possible, otherwise it would execute the byte-codes as normal. I thought the example made it fairly clear. You have a class that is an abstraction of a pixel and you have methods you can call for typical operations on pixels. Your JVM can optimise the underlying code as it would for any other class, or with special knowledge (since they are standard classes) it can use SIMD instructions instead. The operator overloading (not something I’m a fan of) is just syntactic sugar.

The example I gave is also a very common, standard SIMD operation as far as I’m aware - every modern CPU has it, I think. Only some embedded chips don’t. For some other types of operation, I agree there may not be enough support, or implementation variances might be too great - in which case, either don’t support classes that depend on them or warn developers that optimisation support is lacking.

In OpenGL we query for capabilities but don’t know what’s hardware accelerated either, and that doesn’t stop us from realising it’s the best API there is for graphics :slight_smile:

SIMD isn’t platform specific. In fact, lots of CPU-specific instructions can be added in here, not just SIMD ones. Just keep adding support for lots and lots of CPUs in there and let developers write code that uses particular profiles. In general there are only a few profiles that anyone need consider, and furthermore, this is very specialist stuff, for use by high performance VM. But so that it actually does run on any implementation of Java there’s no harm in a pure bytecode representation being the default - so long as it works as expected it shouldn’t be any slower than not using the API.

It’s not a fundamentally bad idea. Once again, it’s just an enabling technology, and there are no technological reasons not to implement it. No-one has to use it, but for those that do, the performance gains would be massive.

Cas :slight_smile:

floats might also be used for size. Also… just like in physics class you may choose the representation that has the accuracy required. If you don’t know what you are getting with float then how can you make that decision?
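What you currently get with float is at least well defined, as a small sketch shows: a float has a 24-bit significand, so whole numbers above 2^24 are no longer all representable, while a double still holds them exactly. The class and method names are made up for the example.

```java
class FloatPrecisionDemo {
    // True if n survives a round trip through float unchanged.
    static boolean floatKeeps(int n) {
        return (long) (float) n == n;
    }

    // True if n survives a round trip through double unchanged.
    static boolean doubleKeeps(int n) {
        return (long) (double) n == n;
    }

    public static void main(String[] args) {
        int n = 16777217; // 2^24 + 1
        System.out.println(floatKeeps(n));   // false: rounds to 16777216
        System.out.println(doubleKeeps(n));  // true
        System.out.println(Float.SIZE / 8 + " vs " + Double.SIZE / 8 + " bytes per value");
    }
}
```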

The primitive thing is useless IMO. Call clone() and let the VM optimize.

I also see nothing wrong with Void.TYPE… it’s a tag… for identifying ‘void’… no big deal.

SIMD is complex. I have not seen any way to use it effectively in a high level language without the need for totally new language concepts & keywords. What the VM can automatically do now does not include ANY of the ‘MD’ part of SIMD. It may improve… but if you want to write a video codec or do proper image scaling with filtering you are going to need a much lower level interface to do it right. Maybe compiler tech will catch up some day… but the expense of what is required to truly exploit SIMD from a high level description means it isn’t going to happen soon for a JVM.

SIMD is platform specific. I can show you many processors with no SIMD instructions. Unless you mean “instruction” as a Java language instruction… in which case the implementation can be done without native SIMD. Which IMHO is what is required to do SIMD right in a high level language.

OK, my point was about why this type of field (there’s one for each basic datatype) exists, and why it is necessary in Java: because “primitives as Objects” is incompletely realised in the JLS, and where it’s mandatory to accept them as args to methods that MUST be specified in terms of a superclass of both “Object” and of “(all the primitive types)” you have a considerable problem, solved by the inclusion of these tags.

Including “tags” in the base library for the “lang” package of a language to overcome deficiencies in the language specification is - surely? - not justifiable except in terms of “urgency” and “path of least resistance”; it screams “post-release hack!” (nb: in fact, this behaviour was introduced for the 1.1.x series to allow reflection to work - it was a hack; IIRC there was no other core library that required these tags?).
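For anyone who has not run into these tags, here is a short sketch of where reflection hands them out. The class `VoidTagDemo` and its methods are invented for the example; `Void.TYPE`, `Integer.TYPE` and `Method.getReturnType()` are the real JDK names.

```java
import java.lang.reflect.Method;

class VoidTagDemo {
    static void doNothing() {}
    static int answer() { return 42; }

    // Looks up a no-arg method of this class by name and returns its declared return type.
    static Class<?> returnTypeOf(String name) {
        try {
            Method m = VoidTagDemo.class.getDeclaredMethod(name);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(name);
        }
    }

    public static void main(String[] args) {
        System.out.println(returnTypeOf("doNothing") == Void.TYPE);  // true
        System.out.println(returnTypeOf("answer") == Integer.TYPE);  // true
        System.out.println(int.class == Integer.TYPE);               // true
    }
}
```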

Unless…I guess if your language is designed as a metadata-tag based language, or similar. That seems to me at odds with some of java’s fundamental behaviour - a very :wink: strongly typed language, not even allowing pointers :slight_smile: - whereas I’d expect it “as standard” in something typeless like a BCPL derivative, where you’ve no semantics to ease development, and arguably need all the “soft” help you can get (like tags).

[quote]floats might also be used for size. Also… just like in physics class you may choose the representation that has the accuracy required. If you don’t know what you are getting with float then how can you make that decision?
[/quote]
If you were using floats for size, wouldn’t that often be because of performance reasons…?

Anyway, on second thought, changing the spec for float would be bad - too many potential problems for old code. D’oh.

I did wonder if using meta-data (or similar) might be a good way, but from what I understand of floating-point accuracy models, this would be very hard to actually implement in general and on x86 in particular. sigh

[quote]The primitive thing is useless IMO. Call clone() and let the VM optimize.
[/quote]
For many of the optimisations I think my primitive class idea would allow (e.g. an efficient array of objects), it would be really hard for the VM to be able to optimise to a similar degree. It would also require the programmer to really really know exactly what the capabilities of the optimiser are - one false step and bang goes your performance, and trying to figure out why would be really hard.
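The usual workaround today for an “efficient array of objects” is to flatten the fields into a primitive array by hand. The sketch below does that for complex numbers; `FlatComplexArray` and its methods are hypothetical names, and the interleaved re/im layout is just one possible choice.

```java
// Hypothetical hand-flattened storage: one double[] holding re0, im0, re1, im1, ...
final class FlatComplexArray {
    private final double[] data;

    FlatComplexArray(int length) {
        data = new double[2 * length];
    }

    int length() { return data.length / 2; }
    double re(int i) { return data[2 * i]; }
    double im(int i) { return data[2 * i + 1]; }

    void set(int i, double re, double im) {
        data[2 * i] = re;
        data[2 * i + 1] = im;
    }

    // element[i] *= element[j], using (a+bi)(c+di) = (ac - bd) + (ad + bc)i.
    // All four inputs are read before anything is written, so i == j is safe.
    void multiplyInPlace(int i, int j) {
        double a = re(i), b = im(i), c = re(j), d = im(j);
        set(i, a * c - b * d, a * d + b * c);
    }
}

class FlatComplexDemo {
    public static void main(String[] args) {
        FlatComplexArray numbers = new FlatComplexArray(1000); // one array object, no per-element objects
        numbers.set(0, 1, 1);
        numbers.multiplyInPlace(0, 0);
        System.out.println(numbers.re(0) + " + " + numbers.im(0) + "i");
    }
}
```

The memory layout is compact, but the type has disappeared: callers juggle indices instead of ComplexNumber values, which is the usability cost the proposal is trying to avoid.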

[quote]The primitive thing is useless IMO. Call clone() and let the VM optimize.
[/quote]
Unfortunately it remains beyond the current state of the art to match primitive type performance without giving hints to the optimiser (at least for some types of code). The Smalltalk answer to this is that the people who write this type of application should use Fortran.

I think there are some better proposals out there for lightweight classes of various kinds. I would like to see one of them adopted in Java. My preference is for the proposals that require these classes to be immutable (and which do not mandate any object layout).

…which sadly makes them pretty useless for fast I/O operations in Buffers.

Cas :slight_smile:
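For reference, the immutable flavour of lightweight class mentioned a few posts up can be approximated in plain Java today, minus any layout or allocation guarantees. This is only a sketch with a made-up class name, not any particular proposal's syntax.

```java
// Hypothetical immutable value-style class: final fields, no setters, equality by value.
final class ImmutableComplex {
    final double re;
    final double im;

    ImmutableComplex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Returns a new value; neither operand is modified.
    ImmutableComplex times(ImmutableComplex o) {
        return new ImmutableComplex(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ImmutableComplex)) return false;
        ImmutableComplex o = (ImmutableComplex) obj;
        return Double.compare(re, o.re) == 0 && Double.compare(im, o.im) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Double.valueOf(re).hashCode() + Double.valueOf(im).hashCode();
    }

    public static void main(String[] args) {
        ImmutableComplex c = new ImmutableComplex(1, 1);
        ImmutableComplex d = c.times(c);
        System.out.println(d.re + " + " + d.im + "i");
    }
}
```

Because nothing can be written after construction, such a value cannot be updated in place inside a Buffer, which is the I/O limitation raised above.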