TUER: Truly Unusual Experience of Revolution, FPS using JOGL

the main problem with direct allocated buffers is that they can’t be allocated just anywhere. They are allocated in native memory outside the regular Java heap, which limits the maximum usable amount of memory.

(remember that direct buffers are never moved by the GC, which has advantages and disadvantages, e.g. fragmentation; on the other hand, this makes them the best vehicle for transferring memory between the JVM and native libraries)

@see -XX:MaxDirectMemorySize=foobar

if you are running out of direct memory (on recent JREs), you will usually see that in the exception message.

as Riven said, use big buffers and slice them, and avoid allocating direct buffers again and again.
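That advice can be sketched as a small allocator class. This is a minimal sketch; the `SliceAllocator` name and `next()` method are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Minimal sketch of the "one big buffer, many slices" pattern:
// allocate a single large direct buffer up front, then hand out
// slices of it instead of calling allocateDirect() repeatedly.
public class SliceAllocator {

    private final ByteBuffer pool;

    public SliceAllocator(int capacityBytes) {
        pool = ByteBuffer.allocateDirect(capacityBytes)
                         .order(ByteOrder.nativeOrder());
    }

    // Carves the next 'size' bytes out of the pool as an independent buffer.
    public ByteBuffer next(int size) {
        pool.limit(pool.position() + size);  // fence off the requested range
        ByteBuffer slice = pool.slice();     // slice spans position..limit
        pool.position(pool.limit());         // advance past the handed-out range
        pool.limit(pool.capacity());         // restore the pool's limit
        return slice;
    }
}
```

Each returned slice is still direct and shares the pool’s memory, so only one native allocation is ever made.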

[edit] the latest version of VisualVM with JRE 6 can track direct memory usage; a nice tool for debugging.

I see what you mean, I increased the perm gen size with:

[quote]-XX:PermSize=1024m
[/quote]

That is why I’m going to try to use direct NIO buffers only when they are going to be used by JOGL.

What is the difference between -XX:MaxDirectMemorySize and -XX:PermSize?

When I need to allocate some temporary NIO buffers, I will use indirect ones instead, thanks. I will use jvisualvm if I fail to track direct memory usage; I already used it to find the high memory consumption in TextRenderer :wink:

What do you mean by “overhead”? Why 4 KB? Can you explain it a bit more, please? It is unclear to me.

I have never used the slice() method. As far as I know, I don’t need several buffers sharing the same content, but maybe it will be useful to share only part of a buffer with another one.

Because MappedByteBuffers must be page-aligned (pages are 4 KB on most systems), all direct buffers are page-aligned :slight_smile:

allocating a direct buffer basically looks like:


int bytes = ...;                      // requested capacity
final int pageSize = 4096;
// over-allocate by one page so an aligned base address always fits
long pointer = unsafe.allocateMemory(bytes + pageSize); // unsafe is a sun.misc.Unsafe instance
// round the raw address up to the next page boundary
long base = (pointer + (pageSize - 1)) / pageSize * pageSize;
Buffer buffer = createBufferAt(base); // pseudocode: wrap the aligned address
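The rounding step above is just integer arithmetic; here it is as a standalone method (the `PageAlign` class name is made up for this sketch):

```java
public class PageAlign {

    static final long PAGE_SIZE = 4096;

    // Rounds a raw address up to the next page boundary.
    // Integer division truncates, so adding PAGE_SIZE - 1 first
    // makes the division round up instead of down.
    static long align(long pointer) {
        return (pointer + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    }
}
```

Addresses already on a boundary are left unchanged; everything else is pushed up to the next multiple of 4096.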

:o Now I see what you mean: I have to avoid using direct buffers to store tiny data; it was a huge waste of memory.

Hi!

Now JFPSM needs 8 times less memory to do the same job; I thank Bienator and Riven for their advice. In the past, I was creating about 3 million small direct float buffers.

Nice. I put it on my blog, with easy-to-use sample code:

It is an excellent and comprehensible explanation, thank you :slight_smile:

I don’t understand why direct buffers have been implemented so that they are page-aligned.

Some lazy bastard at Sun didn’t want to write two codepaths…

It’s a perfectly reasonable optimisation when you consider how bytebuffers are meant to be used. They’re specifically meant to be allocated rarely, as very big buffers, sliced up into little bits as needed, and used solely for bulk I/O operations (be that network, audiovisual, or file). If you’re doing anything else with them - that is, anything other than streaming data in or out of them - you’re not using them for what they’re intended.

Cas :slight_smile:

Of course, but few people know that.

In Java you can do new float[1] with hardly any overhead (maybe 32 bytes), yet an allocated DirectFloatBuffer of 1 element (4 bytes) has an overhead of 4096 bytes, while a HeapFloatBuffer of 1 element does not have that kind of overhead either.

It simply comes as a surprise to most people, if noticed at all.
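A back-of-the-envelope calculation shows why this matters for the 3 million tiny buffers mentioned earlier in the thread, assuming roughly one wasted 4 KiB page per allocation (the `Overhead` class is a throwaway name for this sketch):

```java
public class Overhead {

    // Rough estimate: each tiny direct buffer costs about one extra
    // page of native memory for alignment, whatever its payload size.
    static long wastedBytes(long bufferCount, long pageSize) {
        return bufferCount * pageSize;
    }
}
```

For 3,000,000 buffers that is over 11 GiB of padding alone, hence the advice to slice one big buffer instead.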

Ah, I thought everyone knew that… but that’s probably because I’ve had my head buried in this stuff since before it was even officially released in 1.4 8)

Cas :slight_smile:


currentBuffer.limit(currentBuffer.position() + size);    // fence off the next 'size' bytes
ByteBuffer result = currentBuffer.slice();               // slice spans position..limit
currentBuffer.position(currentBuffer.limit());           // advance past the handed-out range
currentBuffer.limit(currentBuffer.capacity());           // restore the original limit

I am far from being an expert on ByteBuffer, but according to the javadoc, for slice() the “capacity and its limit will be the number of bytes remaining in this buffer”.
Maybe the code should rather look like the following, no?


ByteBuffer result = currentBuffer.slice();
result.limit(size);
currentBuffer.position(currentBuffer.position() + size);

No :slight_smile:

That remark is about the newly created buffer, not the original buffer.

ByteBuffer a = …;
ByteBuffer b = a.slice();

b = the range between a.position() and a.limit()
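That behaviour is easy to check: a slice’s capacity equals the bytes remaining between position and limit in the original buffer (`SliceDemo` is a throwaway name for this sketch):

```java
import java.nio.ByteBuffer;

public class SliceDemo {

    // Builds a 100-byte buffer, narrows its window to [10, 30),
    // and returns a slice of that window.
    static ByteBuffer demo() {
        ByteBuffer a = ByteBuffer.allocate(100);
        a.position(10);
        a.limit(30);
        return a.slice(); // capacity == limit - position == 20
    }
}
```

The slice starts at position 0 with capacity and limit 20; changes made through it are visible in the original buffer’s [10, 30) range, since both share the same backing storage.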

Ha, OK, sorry, I did not understand the docs…

Why does page alignment improve performance, then?

For your average direct ByteBuffer: a negligibly small reduction in cache misses.

Well, not just that; it also helps the OS malloc the memory more easily without waste in the first place. But nonetheless, the intended use of DirectByteBuffer is such that you shouldn’t construct many of them anyway.

Cas :slight_smile:

As you can see, the malloc by the OS is unaligned. The Java code aligns it.

Hi!

My project stagnated for some weeks because I was busy with interviews; I was in a reality TV show… I hope to find some time to work on it in June.

Haha woah cool, when can I see it? What show was it? Why were you selected?