JEP for making Unsafe a public API

A quick hack: https://github.com/roquendm/JGO-Grabbag/tree/master/src/roquen/pr0n

On heap seems to work fine (today on my version of hotspot).

My other approach is… interesting too: (typing it out - probably doesn’t even compile)

public static long getAddressOfReference(Object ref) {
   Object[] holder = new Object[1];
   holder[0] = ref;

   long addr;
   // read address from holder[0]
   if(is32bitJVM)
      addr = unsafe.getInt(holder, OBJECT_ARRAY_DATA_OFFSET + 0) & 0xFFFF_FFFFL; // mask: treat the 32-bit value as unsigned
   else
      addr = unsafe.getLong(holder, OBJECT_ARRAY_DATA_OFFSET + 0);

   return addr;
}

public static Object makeReferenceToAddress(long addr) {
   Object[] holder = new Object[1];

   // write address into holder[0]
   if(is32bitJVM)
      unsafe.putInt(holder, OBJECT_ARRAY_DATA_OFFSET + 0, (int)addr);
   else
      unsafe.putLong(holder, OBJECT_ARRAY_DATA_OFFSET + 0, addr);

   return holder[0];
}

public static void copyFloatArrayHeadersToAddress(float[] src, long dstAddr) {
   for(int i=0; i<FLOAT_ARRAY_DATA_OFFSET; i+=4) {
      unsafe.putInt(dstAddr + i, unsafe.getInt(src, i));
   }
}

Now make a float[] reference in off-heap memory, shared with a FloatBuffer:


final int sizeofFloat = 4;
int sizeofMalloc = FLOAT_ARRAY_DATA_OFFSET + elements * sizeofFloat;
ByteBuffer bb = ByteBuffer.allocateDirect(sizeofMalloc);
long bbAddr = getBufferAddress(bb);

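// the length field is part of the copied header, so the donor array needs the wanted length
float[] dummy = new float[elements];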
copyFloatArrayHeadersToAddress(dummy, bbAddr);

float[] floatArray = (float[]) makeReferenceToAddress(bbAddr);

bb.position(FLOAT_ARRAY_DATA_OFFSET);
FloatBuffer floatBuffer = bb.slice().order(ByteOrder.nativeOrder()).asFloatBuffer();


floatArray[1337] = 13.37f;
assert floatBuffer.get(1337) == 13.37f;
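
The buffer half of this trick needs no Unsafe at all. A minimal sketch, assuming a made-up 16-byte header size in place of FLOAT_ARRAY_DATA_OFFSET (the real value depends on the JVM) and hypothetical class and method names, showing that the sliced FloatBuffer and the original ByteBuffer see the same bytes:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

class SharedFloatView {
    // Assumption: stand-in for FLOAT_ARRAY_DATA_OFFSET; the real value is JVM-specific.
    static final int HEADER_BYTES = 16;
    static final int SIZEOF_FLOAT = 4;

    /** Allocates header + payload off-heap, with the byte order set to native. */
    static ByteBuffer allocate(int elements) {
        return ByteBuffer.allocateDirect(HEADER_BYTES + elements * SIZEOF_FLOAT)
                         .order(ByteOrder.nativeOrder());
    }

    /** Returns a float view that starts right after the header and shares bb's memory. */
    static FloatBuffer floatView(ByteBuffer bb) {
        ByteBuffer dup = bb.duplicate();   // leave bb's own position untouched
        dup.position(HEADER_BYTES);
        // duplicate() and slice() do not inherit the byte order, so set it again.
        return dup.slice().order(ByteOrder.nativeOrder()).asFloatBuffer();
    }

    public static void main(String[] args) {
        ByteBuffer bb = allocate(2048);
        FloatBuffer floats = floatView(bb);
        floats.put(1337, 13.37f);
        // Same bytes, read back through the byte buffer at the matching offset.
        System.out.println(bb.getFloat(HEADER_BYTES + 1337 * SIZEOF_FLOAT));
    }
}
```

The float[] half (copying a header and forging a reference) is the part that depends on HotSpot internals and is deliberately left out of this sketch.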

My code is without question “nuts”. I was just curious if it would work at all. Remember, I was just thinking about a potential workaround for Android.

Why so apologetic / defensive?

I didn’t mean it to sound that way. Without any extra information it doesn’t seem useful for Java/HotSpot.

Small usability problem with getting a raw address: the GC must be tracking references exactly, so if the instance is moved after getting the address, the reconstructed reference will be foo-bared. (I just checked.) Easy enough to shore up.

(EDIT: grammar mistake)

Had some trouble with multithreaded access in this regard. In between getting the address of a heap-based object and writing the data to it - literally a couple of instructions later - the objects were being moved when I added another thread into the mix.

Cas :slight_smile:

For heap-based objects, you use [icode]Unsafe.[put|get]…(Object ref, long offset, … value)[/icode]
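
A minimal sketch of that two-argument (object, offset) form on a heap array. The class and helper names are made up for illustration; `theUnsafe`, `arrayBaseOffset`, `arrayIndexScale`, `getFloat` and `putFloat` are sun.misc.Unsafe members as far as I know, and newer JDKs may warn about their use:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class HeapUnsafeAccess {
    static final Unsafe UNSAFE = loadUnsafe();
    // Byte offset of element 0 inside a float[] (i.e. the size of the array header).
    static final long FLOAT_ARRAY_DATA_OFFSET = UNSAFE.arrayBaseOffset(float[].class);
    static final long FLOAT_ARRAY_INDEX_SCALE = UNSAFE.arrayIndexScale(float[].class);

    static Unsafe loadUnsafe() {
        try {
            // The usual reflection route to the singleton instance.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // The reference is passed along with the offset, so a GC move in between is harmless.
    // Note: no bounds check is performed here.
    static void put(float[] array, int index, float value) {
        UNSAFE.putFloat(array, FLOAT_ARRAY_DATA_OFFSET + index * FLOAT_ARRAY_INDEX_SCALE, value);
    }

    static float get(float[] array, int index) {
        return UNSAFE.getFloat(array, FLOAT_ARRAY_DATA_OFFSET + index * FLOAT_ARRAY_INDEX_SCALE);
    }

    public static void main(String[] args) {
        float[] data = new float[8];
        put(data, 3, 13.37f);
        System.out.println(data[3] == get(data, 3));
    }
}
```

Because the JVM resolves object + offset at the moment of the access, this avoids the moved-object problem that comes with holding a raw address.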

What did you try to accomplish? Why would you want to write into an instance with Unsafe?

Experiments…

I was trying to get some cache-friendly way of storing processed sprite vertex data actually inside the sprites themselves (as I was having to touch the sprites anyways whilst iterating through them). So I had a big chunk of vertex00, vertex01, vertex02, vertex03, etc floats in each sprite, and I’d stick a direct FloatBuffer window over them pointing at vertex00, and write the transformed data out there; next time I had to write the sprite out to a VBO I’d point the window at it and do a bulk FloatBuffer copy straight into the VBO.

Alas, when I started to add more threads, it would mysteriously glitch now and again, and after ruling out concurrency issues with the cheeky Unsafe hacks, it turned out that the underlying sprite objects were indeed moving at arbitrary moments after I’d pointed the FloatBuffer window at them. So I gave up on that idea.

Cas :slight_smile:

So has anyone looked at how hotspot expands the various get/put methods after compiling?

BTW: I never stated my reservation about the illegal method + lie about the type of the array (edit: the tech doesn’t matter). It’s that the compiler will end up seeing this after it inlines:

pointer integerArray = byteArray

so it’s almost assured to eliminate ‘integerArray’ since it’s simply an alias of ‘byteArray’. Although it has all the information to still do the right thing, we’re falling into an untested situation. However, if in some instances accessing memory via an array is actually faster than get/put (like potentially where range-checks can be eliminated), then there’s a fall-back position of making the conversion more involved such that the compiler can’t determine that the two pointers are aliases. Obviously it’d be nice if it does work with the alias elimination 'cause that’d be faster.

(EDIT: Oh and I added alternate lie-about-types methods into that playground code using riven’s construction and a loop to nod about the GC maybe moving it)

They compile down to trivial memory loads/stores. At least in the x86 disassemblies I’ve inspected. However…

The real problem with Unsafe access comes from hotspot NEVER reordering accesses in the JITed code. This has a measurable effect on performance and I first encountered it when doing performance testing with Riven’s original mapped objects library. I’m not sure what the technical problem is, but it looks like all native calls (intrinsified or not) act as implicit code barriers and GC safepoints. Unsafe access disassembly looks exactly like the source Java code (same access order that is) and I don’t know if this is a bug or a correctness/security issue. Definitely though, hotspot does not apply the same optimizations to Unsafe as it does to fields (of course I’m talking about plain access, not volatile/ordered) and any computation going in or out of off-heap memory will suffer a performance penalty.

I’ve written a JMH benchmark that showcases the problem. The baseline is Matrix4fJava.mul4f, which performs standard 4x4 matrix multiplication, where the target may be one of the operands. The matrix is represented as a POJO with float fields. It is compared to doing the exact same calculation with a float array, a FloatBuffer and finally, Unsafe. There are also optimized versions that have been manually tuned to minimize stack usage, close to what hotspot does automatically for field access. Results on Java 8 GA x64 (edit: added float vs int results and made more clear what the baseline is):

// Floating point
Benchmark                       Mode   Samples         Mean   Mean error    Units
UnsafeBench.field               avgt        10       24.972        0.229    ns/op

UnsafeBench.array               avgt        10       30.871        1.331    ns/op
UnsafeBench.arrayOptimized      avgt        10       26.681        0.044    ns/op
UnsafeBench.buffer              avgt        10       31.548        0.493    ns/op
UnsafeBench.bufferOptimized     avgt        10       28.809        0.071    ns/op
UnsafeBench.unsafe              avgt        10       31.541        0.160    ns/op
UnsafeBench.unsafeOptimized     avgt        10       24.517        0.056    ns/op

// Integer
Benchmark                       Mode   Samples         Mean   Mean error    Units
UnsafeBench.field               avgt        10       29.582        0.153    ns/op

UnsafeBench.array               avgt        10       37.068        0.303    ns/op
UnsafeBench.arrayOptimized      avgt        10       31.804        0.125    ns/op
UnsafeBench.buffer              avgt        10       44.780        0.134    ns/op
UnsafeBench.bufferOptimized     avgt        10       39.070        0.175    ns/op
UnsafeBench.unsafe              avgt        10       37.281        0.343    ns/op
UnsafeBench.unsafeOptimized     avgt        10       29.032        0.194    ns/op

This is the reason I’m not particularly enthusiastic about this JEP. My only hope is seeing John Rose being involved with both value types and Project Panama. It’s the best opportunity to highlight what a pain IPC and interacting with native APIs is, and somehow making value types/arrays possible to back with off-heap memory.
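
For anyone who has not opened the benchmark: a rough sketch of the float-array flavour of such a 4x4 multiply, where the destination may alias an operand. This is not Spasi's benchmark code; the class name, the row-major layout and the copy-through-a-temporary approach are assumptions made for illustration.

```java
class Mat4Array {
    /** dst = a * b for row-major 4x4 matrices stored in float[16]; dst may be a or b. */
    static void mul4f(float[] a, float[] b, float[] dst) {
        float[] tmp = new float[16];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                float sum = 0f;
                for (int k = 0; k < 4; k++) {
                    sum += a[row * 4 + k] * b[k * 4 + col];
                }
                tmp[row * 4 + col] = sum;
            }
        }
        // Only now overwrite dst, so aliasing with a or b cannot corrupt the inputs.
        System.arraycopy(tmp, 0, dst, 0, 16);
    }

    public static void main(String[] args) {
        float[] m = { 1, 2, 3, 4,  5, 6, 7, 8,  9, 10, 11, 12,  13, 14, 15, 16 };
        float[] expected = new float[16];
        mul4f(m, m, expected);   // out of place
        mul4f(m, m, m);          // in place: dst aliases both operands
        System.out.println(java.util.Arrays.equals(m, expected));
    }
}
```

The "Optimized" rows in the results presumably avoid a temporary like this by hand-ordering loads and stores; that tuning is not reproduced here.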

Interesting. I was expecting to hear that get/set were identical to field accesses. It certainly seems like a ‘defect’ if they are acting like barriers. If you haven’t then now might be a good time to ask the question on the compiler mailing list since issues like the improved memory model, public Unsafe and structure like functionality are on people’s minds. I’ll look at the source soon.

Spasi’s benchmark indicates this is very unlikely to be interesting, but I tossed this together anyway if anyone is motivated to play around:

Makes an off-heap object with overlaid array headers for (byte, int, float).

In a moment of work-avoidance I took a quick peek at android’s Unsafe. It looks like one could use a FatOaf backing to port LibStruct.

AFAIK Android’s Unsafe only supports bulk operations, making it useless for LibStruct :persecutioncomplex:

I was looking here:
https://android.googlesource.com/platform/libcore/+/master/libdvm/src/main/java/sun/misc/Unsafe.java

Ah right…


public native int getInt(Object obj, long offset);
public native void putInt(Object obj, long offset, int newValue);

I, however, need:


public native int getInt(long address);
public native void putInt(long address, int newValue);
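
For contrast, a minimal sketch of that single-argument (raw address) form on desktop HotSpot. The class and method names are illustrative; `allocateMemory`, `putInt(long, int)`, `getInt(long)` and `freeMemory` are sun.misc.Unsafe members as far as I know, and this says nothing about what Android actually ships:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class OffHeapInts {
    static final Unsafe UNSAFE = loadUnsafe();

    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Writes 0..count-1 into malloc'ed memory, reads them back and returns their sum. */
    static long sumRoundTrip(int count) {
        long base = UNSAFE.allocateMemory(4L * count);   // raw address, invisible to the GC
        try {
            for (int i = 0; i < count; i++) {
                UNSAFE.putInt(base + 4L * i, i);
            }
            long sum = 0;
            for (int i = 0; i < count; i++) {
                sum += UNSAFE.getInt(base + 4L * i);
            }
            return sum;
        } finally {
            UNSAFE.freeMemory(base);                     // nothing frees this automatically
        }
    }

    public static void main(String[] args) {
        System.out.println(sumRoundTrip(10));
    }
}
```

With only the (Object, offset) overloads available there is no way to address memory like this directly, which is the gap being pointed out.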

Besides, if they can’t get FloatBuffer performance right after half a decade, what are the odds Unsafe calls will be intrinsified?

On intrinsics: zero percent. That’s why I was thinking that FatOaf thing…then they are needed. Maybe I should attempt to motivate myself to see if it actually works first before opening my mouth any more.

My Google Fu failed on ‘FatOaf’ :emo:

I assume he’s talking about this: https://github.com/roquendm/JGO-Grabbag/blob/master/src/roquen/vm/FatOaf.java