I really need help. JOGL is so slow! :(

Hi, I am working on a simple JOGL speed test for my diploma thesis. I want to compare the speed of the JOGL render loop
with the same render loop written in C with GLUT. The speed test draws 1000 textured cubes, which rotate and move around.
The cubes are drawn with a display list.

My result: JOGL/Java gets ~370 FPS and C/GLUT gets 620 FPS!

Is it really true that JOGL is so slow? My university of applied sciences has been thinking about switching from GLUT/C to
JOGL/Java. But if it is really that slow, I think they will not switch :frowning:

Here is my render loop in JOGL and in C.

JOGL


public void display(GLAutoDrawable glDrawable)
{
        now = System.nanoTime();
        fullatency+=latency;
            
        if ((now-fullatency) >= (before + 1000000000))
        {
            before = System.nanoTime();
            fullatency=latency;
            fps = frames;
            frames = 1;
        }
        else
            frames++;
        
        gl = glDrawable.getGL();
        gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
        gl.glLoadIdentity();
       
        cubeTex.enable(); 
        
        for (int i=0;i < numCubes; i++)
        {
            gl.glPushMatrix();
            gl.glTranslatef(posX[i], posY[i], posZ[i]);
            gl.glRotatef(angle2[i]+=angle2Speed[i], angle2X[i], angle2Y[i], angle2Z[i]);
            gl.glTranslatef(distX[i], distY[i], distZ[i]);
            gl.glRotatef(angle1[i]+=angle1Speed[i], angle1X[i], angle1Y[i], angle1Z[i]);
            gl.glCallList(1234); 
            gl.glPopMatrix();
        }
        
        cubeTex.disable();
}

…and here in C


void renderScene(void)
{
    QueryPerformanceCounter(&now);
    fullatency.QuadPart+=latency.QuadPart;

    if ((now.QuadPart-fullatency.QuadPart) >= (before.QuadPart + frequenzy.QuadPart))
    {
        QueryPerformanceCounter(&before);
        fullatency.QuadPart = latency.QuadPart;
        fps = frames;
        frames = 1;
    }
    else
        frames++;

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glLoadIdentity();

    for (int i=0;i < numCubes; i++)
    {
        glPushMatrix();
        glTranslatef(posX[i], posY[i], posZ[i]);
        glRotatef(angle2[i]+=angle2Speed[i], angle2X[i], angle2Y[i], angle2Z[i]);
        glTranslatef(distX[i], distY[i], distZ[i]);
        glRotatef(angle1[i]+=angle1Speed[i], angle1X[i], angle1Y[i], angle1Z[i]);
        glCallList(1234);
        glPopMatrix();
    }
    glutSwapBuffers();
}

  • I have Vsync disabled and each program uses fullscreen exclusive mode at 1024x768x32.
  • I use a single-CPU P4 3.2 GHz with an Nvidia GeForce 6800 on Windows XP Professional.
    I have the newest Detonator drivers and the newest JOGL library!
  • Where could my bottleneck be (if there is one) in my speed test? Or is JNI so slow in general?
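For reference, the frame counting used in both render loops can be isolated into a small helper. This is only a sketch: the class name FpsCounter and its method are made up for illustration and are not part of JOGL or of the code above, and it leaves out the timer latency compensation. The timestamp is passed in as a parameter so the counter can be checked with synthetic times.

```java
/** Hypothetical helper: counts frames and reports the rate once per second. */
final class FpsCounter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStart; // timestamp at which the current one-second window began
    private int frames;       // frames counted in the current window
    private int fps;          // result of the last completed window
    private boolean started;

    /** Call once per frame with System.nanoTime(); returns the last completed FPS value. */
    int frame(long nowNanos) {
        if (!started) {
            // The first call only starts the clock and is not counted as a frame.
            started = true;
            windowStart = nowNanos;
            return fps;
        }
        frames++;
        if (nowNanos - windowStart >= ONE_SECOND_NANOS) {
            fps = frames;
            frames = 0;
            windowStart = nowNanos;
        }
        return fps;
    }

    public static void main(String[] args) {
        FpsCounter counter = new FpsCounter();
        long t = 0;
        int last = 0;
        // Simulate frames arriving every 2 ms for a little over two seconds.
        for (int i = 0; i <= 1250; i++) {
            last = counter.frame(t);
            t += 2_000_000L;
        }
        System.out.println("fps = " + last);
    }
}
```

In display() this would be called as counter.frame(System.nanoTime()), so the same counting logic runs regardless of what is being rendered.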

Try removing the lines
cubeTex.enable()
cubeTex.disable()
and only call
cubeTex.bind()
once in your init function.

Thank you for the answer. But it does not change the speed at all :frowning:

Maybe I should post more code?



public class SimpleRoutine extends JOGLMainRoutine implements KeyListener
{
    private GL gl                          = null;
    private final GLU glu                  = new GLU();
    
    public SimpleRoutine(JOGLMainFrameInterf mainFrame)
    {
        super(mainFrame);
    }

    Texture cubeTex             = null;
    TextureCoords coords  = null;
    final int numCubes        = 1000;
  
    float[] angle1              = new float[numCubes];
    float[] angle1Speed  = new float[numCubes];
    float[] angle1X            = new float[numCubes];
    float[] angle1Y            = new float[numCubes];
    float[] angle1Z            = new float[numCubes];
    float[] angle2              = new float[numCubes];
    float[] angle2Speed  = new float[numCubes];
    float[] angle2X            = new float[numCubes];
    float[] angle2Y            = new float[numCubes];
    float[] angle2Z            = new float[numCubes];
    
    // Place at position
    float[] posX                = new float[numCubes];
    float[] posY                = new float[numCubes];
    float[] posZ                = new float[numCubes];
    float[] distX               = new float[numCubes];
    float[] distY               = new float[numCubes];
    float[] distZ               = new float[numCubes];
    private long now           = 0;
    private long before       = 0;
    private long latency      = 0;
    private long fullatency  = 0;
    private int frames         = 0;
    private int fps                = 0;
    
    
    public void init(GLAutoDrawable glDrawable)
    {
        glDrawable.addKeyListener(this);
        
        gl = glDrawable.getGL();
        gl.glShadeModel(GL.GL_SMOOTH);              
        gl.glClearColor(0.0f, 0.0f, 0.0f, 0.5f);    
        gl.glClearDepth(1.0f);                      
        gl.glEnable(GL.GL_DEPTH_TEST);              
        gl.glDepthFunc(GL.GL_LEQUAL);               
        gl.glHint(GL.GL_PERSPECTIVE_CORRECTION_HINT, GL.GL_NICEST);
            
        try
        {
            cubeTex = TextureIO.newTexture(new File ("AppGfx/cubeTex2.png"), true);
            cubeTex.bind();
            cubeTex.enable();
            coords = cubeTex.getImageTexCoords();
        }
        catch(Exception e) {}
                
        for (int i=0;i < numCubes; i++)
        {
            Random Zufallsgenerator = new Random();
            angle1[i]       = Zufallsgenerator.nextFloat()*360.0f;
            angle1Speed[i]  = Zufallsgenerator.nextFloat()*3.0f;
            angle1X[i]      = (float)Zufallsgenerator.nextInt(2);
            angle1Y[i]      = (float)Zufallsgenerator.nextInt(2);
            angle1Z[i]      = (float)Zufallsgenerator.nextInt(2);
            angle2[i]       = Zufallsgenerator.nextFloat()*360.0f;
            angle2Speed[i]  = Zufallsgenerator.nextFloat()*3.0f;
            angle2X[i]      = (float)Zufallsgenerator.nextInt(2);
            angle2Y[i]      = (float)Zufallsgenerator.nextInt(2);
            angle2Z[i]      = (float)Zufallsgenerator.nextInt(2);
        
            if (Zufallsgenerator.nextInt(2) == 0)
                posX[i]         = Zufallsgenerator.nextFloat()*5.0f;
            else
                posX[i]         = Zufallsgenerator.nextFloat()*(-5.0f);
                
            if (Zufallsgenerator.nextInt(2) == 0)
                posY[i]         = Zufallsgenerator.nextFloat()*5.0f;
            else
                posY[i]         = Zufallsgenerator.nextFloat()*(-5.0f);

            posZ[i]         = Zufallsgenerator.nextFloat()*(-10.0f)-3.0f;
            
            if (Zufallsgenerator.nextInt(2) == 0)
                distX[i]         = Zufallsgenerator.nextFloat()*5.0f;
            else
                distX[i]         = Zufallsgenerator.nextFloat()*(-5.0f);
            
            if (Zufallsgenerator.nextInt(2) == 0)
                distY[i]         = Zufallsgenerator.nextFloat()*5.0f;
            else
                distY[i]         = Zufallsgenerator.nextFloat()*(-5.0f);
                
            if (Zufallsgenerator.nextInt(2) == 0)
                distZ[i]         = Zufallsgenerator.nextFloat()*2.0f;
            else
                distZ[i]         = Zufallsgenerator.nextFloat()*(-2.0f);    
        }
        
        gl.glNewList (1234, GL.GL_COMPILE);
        gl.glBegin(GL.GL_QUADS);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(-0.2f,  0.2f, 0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f( 0.2f,  0.2f, 0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f( 0.2f, -0.2f, 0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(-0.2f,  -0.2f, 0.2f);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(-0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f( 0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f( 0.2f, -0.2f, -0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(-0.2f,  -0.2f, -0.2f);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(-0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f(-0.2f,  0.2f,  0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f(-0.2f, -0.2f,  0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(-0.2f,  -0.2f, -0.2f);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f(0.2f,  0.2f,  0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f(0.2f, -0.2f,  0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(0.2f,  -0.2f, -0.2f);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(-0.2f,  -0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f( 0.2f, -0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f( 0.2f, -0.2f, 0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(-0.2f,  -0.2f, 0.2f);
        gl.glTexCoord2f(coords.left(), coords.top());
        gl.glVertex3f(-0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.top());
        gl.glVertex3f( 0.2f,  0.2f, -0.2f);
        gl.glTexCoord2f(coords.right(), coords.bottom());
        gl.glVertex3f( 0.2f,  0.2f,  0.2f);
        gl.glTexCoord2f(coords.left(), coords.bottom());
        gl.glVertex3f(-0.2f,  0.2f,  0.2f);
        gl.glEnd();
        gl.glEndList();
        
        before  = System.nanoTime();
        now     = System.nanoTime();
        latency = now - before;
        now     = 0;
        before  = 0;
    }

    public void display(GLAutoDrawable glDrawable)
    {
        now = System.nanoTime();
        fullatency+=latency;
            
        if ((now-fullatency) >= (before + 1000000000))
        {
            before = System.nanoTime();
            fullatency=latency;
            fps = frames;
            frames = 1;
        }
        else
            frames++;
  
        gl = glDrawable.getGL();
        gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
        gl.glLoadIdentity();
        
        for (int i=0;i < numCubes; i++)
        {
            gl.glPushMatrix();

            gl.glTranslatef(posX[i], posY[i], posZ[i]);
            gl.glRotatef(angle2[i]+=angle2Speed[i], angle2X[i], angle2Y[i], angle2Z[i]);
            gl.glTranslatef(distX[i], distY[i], distZ[i]);
            gl.glRotatef(angle1[i]+=angle1Speed[i], angle1X[i], angle1Y[i], angle1Z[i]);
            gl.glCallList(1234); 
            gl.glPopMatrix();
        }
    }
    
    public void reshape(GLAutoDrawable glDrawable, int x, int y, int width, int height)
    {
        gl = glDrawable.getGL();
        if (height <= 0) height = 1;
        final float ratio = (float)width / (float)height;
        gl.glViewport(0, 0, width, height);
        gl.glMatrixMode(GL.GL_PROJECTION);
        gl.glLoadIdentity();
        glu.gluPerspective(45.0f, ratio, 0.1, 1000.0);
        gl.glMatrixMode(GL.GL_MODELVIEW);
        gl.glLoadIdentity();
    }
    
............
...........
}

Another strange thing is that I cannot see any change in performance between windowed mode and fullscreen exclusive mode.
Is that OK?

That's expected behavior. You are profiling the performance difference between a native method call in Java and a function call in C. The time needed to render a cube is close to zero. The art of graphics programming, ESPECIALLY in Java, is to push the bottleneck to the GPU.
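A rough back-of-the-envelope estimate makes this concrete. The sketch below takes the 370 and 620 FPS figures from the first post and counts seven GL calls per cube in the posted loop (push, translate, rotate, translate, rotate, call list, pop) plus glClear and glLoadIdentity. It assumes, as a simplification, that the whole gap between the two frame times is per-call overhead; the class and method names are made up for illustration.

```java
/** Hypothetical helper for estimating per-call overhead from two frame rates. */
final class CallOverheadEstimate {

    /** Time per frame in nanoseconds for a given frame rate. */
    static double frameTimeNanos(double fps) {
        return 1_000_000_000.0 / fps;
    }

    /** Extra time per GL call, assuming the whole gap is per-call overhead. */
    static double extraNanosPerCall(double slowFps, double fastFps, int callsPerFrame) {
        return (frameTimeNanos(slowFps) - frameTimeNanos(fastFps)) / callsPerFrame;
    }

    public static void main(String[] args) {
        int cubes = 1000;
        int callsPerCube = 7; // push, translate, rotate, translate, rotate, callList, pop
        int callsPerFrame = cubes * callsPerCube + 2; // plus glClear and glLoadIdentity

        System.out.printf("Java frame time: %.2f ms%n", frameTimeNanos(370) / 1e6);
        System.out.printf("C frame time:    %.2f ms%n", frameTimeNanos(620) / 1e6);
        System.out.printf("Extra per call:  %.0f ns%n",
                extraNanosPerCall(370, 620, callsPerFrame));
    }
}
```

Under these assumptions the gap comes out at roughly a millisecond per frame, spread over about 7000 calls, so the cost being measured is a small fixed amount per call and not anything the graphics card is doing.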

Run a test: increase the poly count of your cubes (or render something else). You will notice no performance difference in the Java version, but perhaps in the C version.

That's a second proof of a CPU bottleneck.

Comparing the performance of programming languages is a very difficult, if not impossible, task. You will never write a Java program in the same manner as a C program, and if you do, you did something wrong.

BTW, your test results show that a native Java method call runs at around 60% of the speed of a C function call (370 vs. 620 FPS); that's IMO a really good result.

Thank you very much for this answer :slight_smile:

I have two questions about this:

1.) What do you mean by “push the bottleneck to the GPU”? I thought I did this with my display list? How would I improve my speed test, which draws 1000 cubes, so that it stresses the GPU more than the JNI?

2.) What do you mean by poly count? Do you mean I should build a cube from many little cubes?

Greetz

Fbr

It depends on what you want to show with your test. If you want to show that you can render in JOGL at the same speed as in C + GL, then increase your poly count until the FPS of the Java test case decreases significantly (then you have a GPU bottleneck…). Then render the same stuff in the C version and you should get similar FPS (say <5% difference). But the result should be clear, because JOGL is just a Java binding to GL and doesn’t change the rendering speed of GL.

Here is some pseudo rendering code ;D:
Java/C is working a little bit
GPU has a lot of work
=> tests the performance of GL

What you are doing now:
Java/C is working a little bit
GPU has nothing to do and is bored
=> tests the performance difference between JNI and C

Yes, for example. Or a sphere, the Duke, Tux… be creative :wink:
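One way to raise the poly count without changing what the cube looks like would be to split each face into an n x n grid of small quads. The sketch below generates the vertices for the front face only, using the same corner order as the posted display list. The class and method names are invented for illustration, texture coordinates are left out, and the other five faces would be built the same way.

```java
/** Hypothetical helper: splits one cube face into an n x n grid of quads. */
final class TessellatedFace {

    /**
     * Returns the vertices of the front face (z = +half) split into n x n quads,
     * four vertices per quad and three floats per vertex.
     */
    static float[] frontFace(float half, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        float[] v = new float[n * n * 4 * 3];
        float step = 2.0f * half / n;
        int k = 0;
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                // Outer edges are pinned to +/- half so the face keeps its exact size.
                float x0 = -half + col * step;
                float x1 = (col == n - 1) ? half : -half + (col + 1) * step;
                float y1 = half - row * step;
                float y0 = (row == n - 1) ? -half : half - (row + 1) * step;
                // Corner order: top-left, top-right, bottom-right, bottom-left.
                k = put(v, k, x0, y1, half);
                k = put(v, k, x1, y1, half);
                k = put(v, k, x1, y0, half);
                k = put(v, k, x0, y0, half);
            }
        }
        return v;
    }

    /** Number of quads in a whole cube when every face is split n x n. */
    static int quadCount(int n) {
        return 6 * n * n;
    }

    private static int put(float[] v, int k, float x, float y, float z) {
        v[k] = x;
        v[k + 1] = y;
        v[k + 2] = z;
        return k + 3;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 64; n *= 4) {
            System.out.println("n = " + n + ": " + quadCount(n) + " quads per cube");
        }
    }
}
```

With n = 1 this reproduces the original front face; raising n multiplies the work the graphics card does per glCallList while the number of calls made from Java per frame stays the same.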

Additionally, it is important to know how a GPU works. It is highly parallel hardware with a lot of threads, so most of the time it doesn’t run in sync with the CPU.
If you do a lot on the CPU and little on the GPU, the graphics card does nothing and just waits for input. This is called a stall.
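A toy model may help here. It assumes the CPU and the GPU overlap perfectly, so that the frame time is simply the larger of the two per-frame costs, and it treats the frame times measured in this thread (1/370 s and 1/620 s) as pure CPU cost. Both are simplifications, the GPU times are invented inputs, and the class name is made up for illustration.

```java
/** Hypothetical toy model of a pipelined CPU/GPU frame. */
final class BottleneckModel {

    /** Frames per second when CPU submission and GPU rendering overlap fully. */
    static double fps(double cpuMillisPerFrame, double gpuMillisPerFrame) {
        return 1000.0 / Math.max(cpuMillisPerFrame, gpuMillisPerFrame);
    }

    public static void main(String[] args) {
        double javaCpu = 1000.0 / 370; // ms per frame, from the JOGL result in this thread
        double cCpu = 1000.0 / 620;    // ms per frame, from the C/GLUT result in this thread

        // Double the (made-up) GPU cost per frame and watch where each version reacts.
        for (double gpu = 0.5; gpu <= 8.0; gpu *= 2) {
            System.out.printf("GPU %.1f ms: Java %.0f FPS, C %.0f FPS%n",
                    gpu, fps(javaCpu, gpu), fps(cCpu, gpu));
        }
    }
}
```

In this model the Java frame rate stays flat until the GPU cost per frame exceeds the Java CPU cost, the C version starts dropping earlier, and once the GPU cost dominates both report the same FPS.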

That's why I said you will not write a graphics app in Java in the same manner as in C. But if you do, you will hit a CPU bottleneck in Java earlier than in C.

Cool… thanks so much for your help :smiley:

Now I understand what you mean :slight_smile:

Please… just one question about fullscreen mode. If I understand correctly, you mean that only if my speed test fully stresses the GPU and not the CPU would I see the advantages of fullscreen mode?

A long time ago there was a difference in performance between fullscreen mode and windowed mode. That is no longer the case. You can be fill rate limited, which is to say that a given video card can only draw so many pixels per second, but that is a function of the video card, not of Java or C.
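To put numbers on the fill rate idea, the sketch below computes how many pixels per second a given resolution, frame rate and average overdraw add up to, and the frame rate ceiling that a given fill rate would impose. The fill rate value in main is a made-up round number for illustration only, not the specification of any real card, and the names are hypothetical.

```java
/** Hypothetical helper for fill rate arithmetic. */
final class FillRateEstimate {

    /** Pixels written per second for a resolution, average overdraw and frame rate. */
    static double pixelsPerSecond(int width, int height, double overdraw, double fps) {
        return (double) width * height * overdraw * fps;
    }

    /** Upper bound on the frame rate imposed by a given fill rate. */
    static double maxFps(double fillRatePixelsPerSecond, int width, int height, double overdraw) {
        return fillRatePixelsPerSecond / ((double) width * height * overdraw);
    }

    public static void main(String[] args) {
        // 1024x768 as in the first post; an overdraw of 1.0 means each pixel is written once.
        System.out.printf("Pixels per second at 370 FPS: %.0f%n",
                pixelsPerSecond(1024, 768, 1.0, 370));

        double assumedFillRate = 1.0e9; // illustrative value only, not a real card's spec
        System.out.printf("FPS ceiling at overdraw 3.0: %.0f%n",
                maxFps(assumedFillRate, 1024, 768, 3.0));
    }
}
```

The ceiling depends only on the resolution, the overdraw and the card, which is why it comes out the same for a Java program and a C program.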

Hmm… but Sun itself writes in this article:

http://java.sun.com/docs/books/tutorial/extra/fullscreen/exclusivemode.html

That article is almost five years old. I doubt that you’ll notice a difference on a modern system.