Structs

I gather that some here would like to see structs added to Java to improve the efficiency of communication with other systems (notably hardware). Rather than add a new language feature, I propose a class (below) which the JIT could optimise to achieve the same efficiency as a C struct.


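import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public abstract class Struct {
   private final ByteBuffer buffer;
   private int position;

   protected Struct(ByteBuffer b, ByteOrder order) {
      // slice() always yields a big-endian buffer, so the order is set afterwards
      buffer = b.slice();
      buffer.order(order);
   }

   // Size of the struct in bytes; should return a constant
   public abstract int size();
   // Required alignment in bytes; must be a power of two for the mask test below
   public int alignment() {return 1;}
   public final int position() {return position;}
   public final void position(int p) {
      // Written as a subtraction so that p+size() cannot overflow
      if (p < 0 || p > buffer.limit()-size()) throw new IndexOutOfBoundsException();
      if ((p&(alignment()-1)) != 0) throw new IllegalArgumentException("Bad alignment");
      position = p;
   }
   // Treats the buffer as an array of structs and selects element i
   public final void setIndex(int i) {
      position(i*size());
   }

   // Accessors take an offset relative to the current position
   protected int getInt(int offset) {return buffer.getInt(position+offset);}
   protected void setInt(int offset, int value) {buffer.putInt(position+offset, value);}
}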

class MyStruct extends Struct {
   public MyStruct(ByteBuffer buf) {super(buf, ByteOrder.nativeOrder());}
   public int size() {return 8;}
   public int getX() {return getInt(0);}
   public void setX(int x) {setInt(0, x);}
   public int getY() {return getInt(4);}
   public void setY(int y) {setInt(4, y);}
}

Now the JIT should be able to optimise this provided that:
- The size() method returns a constant.
- The supplied ByteOrder is suitable. Note that the byte order can’t be changed after construction of the Struct.
- The calls to the get and set methods have constant values for the offset, or the offset can be deduced to always lie in the range 0 … size()-sizeof(type).
- The alignment is constant and suitable for the processor.
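To make those conditions a bit more concrete, here is a rough hand-written sketch of the kind of loop a JIT might effectively reduce the accessors to. The class name, SIZE and the X/Y offsets are my own assumptions for illustration (they mirror MyStruct above); no JVM is known to do this for the Struct class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical hand-written equivalent of what an optimising JIT might emit.
final class HoistedCheckSketch {
    static final int SIZE = 8;      // assumed constant result of size()
    static final int X = 0, Y = 4;  // assumed constant field offsets

    // Sums x + y over every complete record in the buffer.
    static long sumXY(ByteBuffer source) {
        // duplicate() shares the bytes but has its own byte order setting
        ByteBuffer buf = source.duplicate().order(ByteOrder.nativeOrder());
        int count = buf.limit() / SIZE;
        long sum = 0;
        for (int i = 0; i < count; i++) {
            int p = i * SIZE;
            // p + SIZE <= limit follows from the loop bound, so with constant
            // offsets both reads are provably in range for the whole loop
            sum += (long) buf.getInt(p + X) + buf.getInt(p + Y);
        }
        return sum;
    }
}
```

Because SIZE and the offsets are constants and the loop bound is derived from the limit, a compiler can prove every access is in range once, rather than checking each getInt separately.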

This mechanism also allows structs to contain unions, and you can put padding where appropriate.
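As a sketch of the union and padding point: the layout below (a one-byte tag, three bytes of padding, then four bytes readable as either int or float) is purely a made-up example, and TaggedValue is a hypothetical class that talks to the ByteBuffer directly rather than extending Struct, just to keep the snippet self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical layout: byte tag at 0, padding at 1..3, 4-byte union at 4.
final class TaggedValue {
    static final int SIZE = 8;
    private static final int TAG = 0;
    private static final int VALUE = 4; // skips the three padding bytes

    private final ByteBuffer buffer;

    TaggedValue(ByteBuffer b) {
        buffer = b.slice().order(ByteOrder.nativeOrder());
    }

    byte getTag() { return buffer.get(TAG); }
    void setTag(byte t) { buffer.put(TAG, t); }

    // Two views of the same four bytes: this is the 'union'
    int getAsInt() { return buffer.getInt(VALUE); }
    void setAsInt(int v) { buffer.putInt(VALUE, v); }
    float getAsFloat() { return buffer.getFloat(VALUE); }
    void setAsFloat(float v) { buffer.putFloat(VALUE, v); }
}
```

Writing through setAsFloat and reading through getAsInt returns the raw bit pattern of the float, which is the same behaviour you would get from a C union.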

Thus the performance objective can be achieved by adding just one ‘special’ class and modifying JITs to make use of the properties of that class. Even better the code will work with existing JVMs without change (just slower).

OK, so you have to do a bit more typing (or use a wizard), but it ought to be far easier to get this change through the JCP than one which requires a change to the language itself.

The mechanism I’ve proposed before doesn’t need a language change; it needs a JVM change to work efficiently. Anything descended from javax.nio.Struct, or Mapped, or whatever, has its fields laid out in memory in a well-defined manner and overlaps a ByteBuffer.

Cas :slight_smile:

[quote]fields laid out in memory in a well-defined manner and overlaps a ByteBuffer.
[/quote]
I don’t think you can do that without losing some of the characteristics of an object. Specifically, the hidden words that provide for synchronization and class information can’t be in the ByteBuffer. So either they must be removed, in which case the Struct is no longer an Object, or the field references become indirect (effectively the same as my Struct class).

That’s because you misunderstand my implementation :slight_smile:

The base class Struct is thus:


class Struct {
private int position;
private final StructBuffer buf;
}

and it lives within the context of a StructBuffer:


class StructBuffer {
private final ByteBuffer buf;
private final int stride;
}

So you see - there are two classes; one defines the container for a structure, and the other defines merely a position within the container from where data can be read.

Instances of Struct themselves are normal Java objects but all fields in derived classes are defined to live within their parent StructBuffer’s ByteBuffer. Synchronisation and other object overhead tags are all present and correct, just as for any normal Object.

You do not keep millions of Struct instances hanging about everywhere, for that would defeat the purpose of Structs. You only need one Struct, really, which is repeatedly positioned within its parent StructBuffer according to its stride.
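A rough approximation of that arrangement in today's Java, for anyone who wants to picture it. StructBuffer and stride come from the outline above; PointCursor, count(), moveTo() and the accessor methods are hypothetical stand-ins, since without JVM support plain fields cannot be mapped onto the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Container: a ByteBuffer viewed as a run of fixed-stride records.
final class StructBuffer {
    final ByteBuffer buf;
    final int stride;

    StructBuffer(ByteBuffer buf, int stride) {
        this.buf = buf.slice().order(ByteOrder.nativeOrder());
        this.stride = stride;
    }

    // Number of complete records that fit in the buffer
    int count() { return buf.limit() / stride; }
}

// Cursor: an ordinary Java object that holds only a position in its parent.
final class PointCursor {
    private final StructBuffer parent;
    private int position;

    PointCursor(StructBuffer parent) { this.parent = parent; }

    void moveTo(int index) {
        if (index < 0 || index >= parent.count()) throw new IndexOutOfBoundsException();
        position = index * parent.stride;
    }

    // Accessors standing in for the fields the proposal would map for you
    int getX() { return parent.buf.getInt(position); }
    void setX(int x) { parent.buf.putInt(position, x); }
    int getY() { return parent.buf.getInt(position + 4); }
    void setY(int y) { parent.buf.putInt(position + 4, y); }
}
```

One PointCursor is created and then moved with moveTo(i) for each record, so the object header cost is paid once no matter how many records the buffer holds.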

It’s brilliant, even if I do say so myself :slight_smile:

Cas :slight_smile:

I think your original proposal included a struct keyword, which is certainly a language change. My proposal and yours have the same effect but differ in the following respects:
- You change the semantics of fields in subclasses of Struct. This also counts as a language change. I presume that attempts to include reference fields would be rejected (compile-time error).
- I allow the struct to be positioned anywhere in the buffer, subject to the alignment constraint. This allows it to be used for non-homogenous data. Access to data beyond the declared size would also be permitted but not optimised. This would be useful with variable-size structures which have a fixed ‘header’ section.
- In my proposal the code would run on both existing and new JVMs with the same results. Only the performance is affected.

All that my Struct does is provide the compiler with some guarantees:
- There are at least size() bytes after the position.
- The position is always a multiple of alignment(). Some machines will require alignment to achieve best optimisation.
- The byte order doesn’t change.

You add some syntactic convenience and leave the layout of the struct to the compiler (presumably in the order specified, without gaps), whereas currently the order of fields makes no difference to a Java program.

In my case the derived class methods could include loops accessing the buffer. Provided that the JIT could deduce (from the loop range) that the resulting accesses always fall within the size() constraint, they too could be optimised (i.e. bounds checks eliminated).
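To illustrate the variable-size case with a fixed header, here is a minimal sketch. The wire format (an int type, an int payload length, then the payload bytes, big-endian, no alignment) and the names MessageWalker and sumPayload are invented for the example; they are not part of either proposal.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical wire format: int type, int payloadLength, then payload bytes.
final class MessageWalker {
    static final int HEADER_SIZE = 8;

    // Returns the sum of the payload bytes of all messages of the wanted type.
    static int sumPayload(ByteBuffer source, int wantedType) {
        ByteBuffer buf = source.duplicate().order(ByteOrder.BIG_ENDIAN);
        int position = 0;
        int sum = 0;
        while (position + HEADER_SIZE <= buf.limit()) {
            // Fixed header: always within HEADER_SIZE bytes of the position
            int type = buf.getInt(position);
            int length = buf.getInt(position + 4);
            if (length < 0 || length > buf.limit() - position - HEADER_SIZE) {
                throw new IllegalStateException("Truncated message");
            }
            int end = position + HEADER_SIZE + length;
            if (type == wantedType) {
                // Variable part: the loop range bounds every access
                for (int i = position + HEADER_SIZE; i < end; i++) sum += buf.get(i);
            }
            position = end; // the next header starts wherever this message ends
        }
        return sum;
    }
}
```

The header reads are the part that could be optimised under the size() guarantee; the payload loop is the "access beyond the declared size" that stays permitted but relies on ordinary bounds checks unless the JIT can reason about the loop range.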

Isn’t the point of structs that they are public by nature?

This would prevent other usage that would belong to a class and also emphasise the use of structures:
-unsafe
-fast
-no fancypants functions
-no overhead and easy to optimize/“inline”

No, that’s not the point of “Structs” - the point of structs is combining efficient memory access with full-blown Java OO capabilities.

I’m having a think about the idea of non-homogenous buffers. I don’t like them really - if the data stream is just that, a stream, then you want a stream based protocol to read the data out in the first place, which we’ve got in NIO already. You can synthesise non-homogenous buffers anyway by wrapping the same bytebuffer in different structbuffers with different offsets and strides but that sounds like a burden of complexity on the programmer. Hm.
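For what it's worth, carving one ByteBuffer into regions with different offsets can already be done with the standard NIO calls. A minimal sketch follows; Regions and region() are hypothetical helper names, and each region would then be wrapped in its own StructBuffer with whatever stride suits it.

```java
import java.nio.ByteBuffer;

final class Regions {
    // Returns a view of bytes [offset, offset + length) of source.
    // The view shares content with source but has its own position and limit.
    static ByteBuffer region(ByteBuffer source, int offset, int length) {
        ByteBuffer d = source.duplicate(); // leave the caller's position untouched
        d.limit(offset + length);
        d.position(offset);
        return d.slice(); // note: the slice is always big-endian until order() is called
    }
}
```

For example, region(buf, 0, 16) could hold a header and region(buf, 16, buf.capacity() - 16) the records that follow it; writes through either view are visible in the original buffer.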

Cas :slight_smile:

As the purpose of structs would be to improve efficiency of communication with hardware or other processes, we will often not get the luxury of designing the protocol. It was for this reason that I included non-homogenous buffers.

You mentioned elsewhere that the facility might also be used for other purposes purely within Java, such as holding computation information. While this is possible, I think there are better ways of addressing those shortcomings. My preference would be for immutable classes with value semantics (much like those proposed by Gosling).

[quote]No, that’s not the point of “Structs” - the point of structs is combining efficient memory access with full-blown Java OO capabilities.

I’m having a think about the idea of non-homogenous buffers. I don’t like them really - if the data stream is just that, a stream, then you want a stream based protocol to read the data out in the first place, which we’ve got in NIO already. You can synthesise non-homogenous buffers anyway by wrapping the same bytebuffer in different structbuffers with different offsets and strides but that sounds like a burden of complexity on the programmer. Hm.

Cas :slight_smile:
[/quote]
Well, why access the memory through buffers then at all?

Why not propose a “struct” that tells the compiler to allocate a given amount of memory for an object whose lifetime is in the hands of the programmer?

It doesn’t help at all if you add another indirect route.

But I guess in its essence your idea is like that?

So long as a reference to the object exists we want the behaviour when attempting to use it to be well defined. Unlike C++.

Incidentally, this RFE appears to have stalled most spectacularly:

http://developer.java.sun.com/developer/bugParade/bugs/4820062.html

Is anyone still interested in it, or do priorities lie elsewhere these days? If still wanted, maybe it should be written up with current understanding and resubmitted?

Maybe. I heard that dougtwilleager was interested in the proposal. Or I might have imagined it.

Cas :slight_smile:

Wait til the C# advocates make enough noise and then it will rise in priority. While C# is not a cross platform language and .Net pretty much sucks rocks, the Java devs could learn a lot from the language semantics of C# (yeah it has structs too… and they get allocated on the stack instead of the heap).

I couldn’t find the original thread on this but this seems as good as any place to post…
Now that 1.5 beta is out, the RFE list is getting cleaned up a bit.

Currently, there are still several RFEs in the top 20 that are in 1.5 and will get pulled but even without that being done yet, right now the “structs” RFE is at #15.

BUT STILL NO “Evaluation”!

Anyway, here’s the link again so vote 'em if ya got 'em

Provide “struct” syntax in the Java language
http://developer.java.sun.com/developer/bugParade/bugs/4820062.html

and the top RFE link
http://developer.java.sun.com/developer/bugParade/top25rfes.html

I notice that JSR-203 is now targeted at J2SE 1.6, having previously fallen off the end of JSR-51 (part of which made 1.4). This relates to the find free disk space RFE.
There are several other RFEs above Struct which clearly aren’t going to make Tiger. In fact, as the feature cut-off is long gone, if it isn’t in 1.5 already it won’t make it.

I certainly don’t expect it to make it to 1.5.
What I was saying is that, now that the 1.5 beta is out, a bunch of the higher RFEs will be cleared and structs will rise into the top 10 soon, yet still there has been no eval.

Also, I would like to make a call for votes again on this topic.
Whether or not you like the way this struct proposal is being made, more votes for the struct RFE will still help get the issue addressed officially by Sun.

I don’t know that it must be implemented exactly as described in the RFE. But I agree with the general idea of a mechanism to efficiently map the contents of ByteBuffers to a particular arrangement of primitive types. If there is some fancy way of tagging fields in a class so that they can be backed by a particular offset into a byte buffer, that may do the trick… and only be slightly different from how structs are currently proposed.
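One way to picture the field-tagging idea with what Java 1.5 brings is an annotation carrying the offset. This is only a sketch under my own assumptions: @Offset, TaggedPoint and OffsetMapper are hypothetical names, and the reflective loader below copies values out of the buffer rather than truly backing the fields with it, which would need JVM support to be efficient.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

// Hypothetical annotation: byte offset of an int field within a record.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Offset {
    int value();
}

final class TaggedPoint {
    @Offset(0) int x;
    @Offset(4) int y;
}

final class OffsetMapper {
    // Copies every @Offset-tagged int field of target from buf, relative to base.
    static void load(Object target, ByteBuffer buf, int base) {
        for (Field f : target.getClass().getDeclaredFields()) {
            Offset o = f.getAnnotation(Offset.class);
            if (o == null || f.getType() != int.class) continue; // ints only in this sketch
            try {
                f.setAccessible(true);
                f.setInt(target, buf.getInt(base + o.value()));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

The interesting part is the declaration side: the offsets are explicit, so field order in the source file stays irrelevant, and a JVM that understood the annotation could in principle replace the copy with direct buffer access.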

Mind you I also think there are a lot of other critical enhancements, like that mentioned of better access to the filesystems - free disk space, file attributes like read,write, executable bits, owner, created time vs. modified time, etc. Those things are more obviously lacking and it is depressing to see it take so long for such basic things to get addressed. It’s all related to more efficient integration with the external systems though - something critical for Java to work well on the client.

Latest progress update -
Structs moved up 1 from 15 to 14, due to removal of a complete RFE above in the list.
4 more to go to top ten and well, then maybe there will be an official Sun response.

Keep on voting!

Latest progress update -
Structs RFE has moved up to number 12 in the RFE list.
I am on a push in my communities to get another round of votes to get it over the hill to make the top 10.
Only need about 20 more people to give all 3 of their votes.

Remember, even if you don’t like the particular proposal, at least it can get a RESPONSE from Sun if we get it up there.

Keep on voting!

Top 25 RFEs (Requests for Enhancement)
http://bugs.sun.com/bugdatabase/top25_rfes.do

Actual RFE
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4820062

I am not a C++ shark. But I seem to recall that the only difference between a struct and a class is that members of a struct are public by default.

But the difference between a C++ class and a Java class is that in C++ you can cast an arbitrary memory pointer to a pointer to a given class (or struct) and then scroll your ‘class view’ through memory with pointer arithmetic.

I would hate to see such a thing in Java, because pointer arithmetic is awful.

But luckily, the code samples that you have posted look more like the good old Flyweight pattern from the GoF book.

I am not sure I totally follow how the JVM should be able to optimize this code in a particular way.

Where is the significant performance gain?