TUER: Truly Unusual Experience of Revolution, FPS using JOGL

the main problem with direct allocated buffers is that they can’t be allocated just anywhere. They are allocated in native memory outside the regular Java heap, which limits the maximum usable amount of memory.

(remember that direct buffers are never moved by the GC, which has advantages and disadvantages, e.g. fragmentation; on the other hand, this makes them the best vehicle for transferring memory between the JVM and native libraries)

@see -XX:MaxDirectMemorySize=foobar

if you are running out of direct memory (on recent JREs), you will usually see that in the exception message.

as Riven said, use big buffers and slice them, and avoid allocating direct buffers again and again.
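That advice can be sketched as a small allocator class. This is a minimal sketch; the `SliceAllocator` name and `next()` method are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Minimal sketch of the "one big buffer, many slices" pattern:
// allocate a single large direct buffer up front, then hand out
// slices of it instead of calling allocateDirect() repeatedly.
public class SliceAllocator {

    private final ByteBuffer pool;

    public SliceAllocator(int capacityBytes) {
        pool = ByteBuffer.allocateDirect(capacityBytes)
                         .order(ByteOrder.nativeOrder());
    }

    // Carves the next 'size' bytes out of the pool as an independent buffer.
    public ByteBuffer next(int size) {
        pool.limit(pool.position() + size);  // fence off the requested range
        ByteBuffer slice = pool.slice();     // slice spans position..limit
        pool.position(pool.limit());         // advance past the handed-out range
        pool.limit(pool.capacity());         // restore the pool's limit
        return slice;
    }
}
```

Each returned slice is still direct and shares the pool’s memory, so only one native allocation is ever made.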

[edit] the latest version of VisualVM with JRE 6 can track direct memory usage; a nice tool for debugging.

I see what you mean, I increased the perm gen size with:

[quote]-XX:PermSize=1024m
[/quote]

That is why I’m going to try to use direct NIO buffers only when they are going to be used by JOGL.

What is the difference between -XX:MaxDirectMemorySize and -XX:PermSize?

When I need to allocate some temporary NIO buffers, I will use indirect ones instead, thanks. I will use jvisualvm if I fail to track direct memory usage; I already used it to find the high memory consumption in TextRenderer :wink:

What do you mean by “overhead”? Why 4 KB? Can you explain it a bit more, please? It is unclear to me.

I have never used the slice() method. As far as I know, I don’t need several buffers sharing the same content, but maybe it will be useful to share only part of a buffer with another one.

Because MappedByteBuffers must be page-aligned (pages are 4 KB on most systems), all direct buffers are page-aligned :slight_smile:

allocating a direct buffer basically looks like:


int bytes = ...;                      // requested capacity
final int pageSize = 4096;
// over-allocate by one page so an aligned base address always fits
long pointer = unsafe.allocateMemory(bytes + pageSize); // unsafe is a sun.misc.Unsafe instance
// round the raw address up to the next page boundary
long base = (pointer + (pageSize - 1)) / pageSize * pageSize;
Buffer buffer = createBufferAt(base); // pseudocode: wrap the aligned address
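The rounding step above is just integer arithmetic; here it is as a standalone method (the `PageAlign` class name is made up for this sketch):

```java
public class PageAlign {

    static final long PAGE_SIZE = 4096;

    // Rounds a raw address up to the next page boundary.
    // Integer division truncates, so adding PAGE_SIZE - 1 first
    // makes the division round up instead of down.
    static long align(long pointer) {
        return (pointer + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    }
}
```

Addresses already on a boundary are left unchanged; everything else is pushed up to the next multiple of 4096.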

:o Now I see what you mean: I have to avoid using direct buffers to store tiny data; it was a huge waste of memory.

Hi!

Now JFPSM needs 8 times less memory to do the same job; I thank Bienator and Riven for their advice. In the past, I was creating about 3 million small direct float buffers.

Nice. I put it on my blog, with easy-to-use sample code:

It is an excellent and comprehensible explanation, thank you :slight_smile:

I don’t understand why direct buffers have been implemented so that they are page-aligned.

Some lazy bastard at Sun didn’t want to write two codepaths…

It’s a perfectly reasonable optimisation when you consider how bytebuffers are meant to be used. They’re specifically meant to be allocated rarely, as very big buffers, sliced up into little bits as needed, and used solely for bulk I/O operations (be that network, audiovisual, or file). If you’re doing anything else with them - that is, anything other than streaming data in or out of them - you’re not using them for what they’re intended.

Cas :slight_smile:

Of course, but few people know that.

In Java you can do new float[1] with hardly any overhead (maybe 32 bytes), yet an allocated DirectFloatBuffer of 1 element (4 bytes) has an overhead of 4096 bytes, while a HeapFloatBuffer of 1 element does not have that kind of overhead either.

It simply comes as a surprise to most people, if noticed at all.
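A back-of-the-envelope calculation shows why this matters for the 3 million tiny buffers mentioned earlier in the thread, assuming roughly one wasted 4 KiB page per allocation (the `Overhead` class is a throwaway name for this sketch):

```java
public class Overhead {

    // Rough estimate: each tiny direct buffer costs about one extra
    // page of native memory for alignment, whatever its payload size.
    static long wastedBytes(long bufferCount, long pageSize) {
        return bufferCount * pageSize;
    }
}
```

For 3,000,000 buffers that is over 11 GiB of padding alone, hence the advice to slice one big buffer instead.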

Ah, I thought everyone knew that… but that’s probably because I’ve had my head buried in this stuff since before it was even officially released in 1.4 8)

Cas :slight_smile:


currentBuffer.limit(currentBuffer.position() + size);    // fence off the next 'size' bytes
ByteBuffer result = currentBuffer.slice();               // slice spans position..limit
currentBuffer.position(currentBuffer.limit());           // advance past the handed-out range
currentBuffer.limit(currentBuffer.capacity());           // restore the original limit

I am far from being an expert on ByteBuffer, but according to the javadoc, for slice() the “capacity and its limit will be the number of bytes remaining in this buffer”.
Maybe the code should rather look like the following, no?


ByteBuffer result = currentBuffer.slice();
result.limit(size);
currentBuffer.position(currentBuffer.position() + size);

No :slight_smile:

That remark is about the newly created buffer, not the original buffer.

ByteBuffer a = …;
ByteBuffer b = a.slice();

b = the range between a.position() and a.limit()
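That behaviour is easy to check: a slice’s capacity equals the bytes remaining between position and limit in the original buffer (`SliceDemo` is a throwaway name for this sketch):

```java
import java.nio.ByteBuffer;

public class SliceDemo {

    // Builds a 100-byte buffer, narrows its window to [10, 30),
    // and returns a slice of that window.
    static ByteBuffer demo() {
        ByteBuffer a = ByteBuffer.allocate(100);
        a.position(10);
        a.limit(30);
        return a.slice(); // capacity == limit - position == 20
    }
}
```

The slice starts at position 0 with capacity and limit 20; changes made through it are visible in the original buffer’s [10, 30) range, since both share the same backing storage.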

Ha, OK, sorry, I did not understand the docs…

Why does page alignment improve performance, then?

For your average direct ByteBuffer: a negligibly small reduction in cache misses.

Well, not just that; it also helps the OS malloc the memory more easily without waste in the first place. But nonetheless, the intended use of DirectByteBuffer is such that you shouldn’t construct many of them anyway.

Cas :slight_smile:

As you can see, the malloc by the OS is unaligned. The Java code aligns it.

Hi!

My project stagnated for some weeks because I was busy with interviews; I was in a reality TV show… I hope to find some time to work on it in June.

Haha woah cool, when can I see it? What show was it? Why were you selected?