Graphics2D.drawImage() is my bottleneck

My latest game is nearing completion and I’m trying to fine tune the works of it. I had it written for phone/pda sizes in an Applet and I have another guy working on converting my code to J2ME code. So now what I’m doing is making it scale up for play at 800x600 and higher for a PC version, but now I’m hitting some performance problems.

I’m using the GAGETimer to keep my framerate at 50fps, and I’ve modified sleepUntil() to return the number of times it yielded, which I then print on my screen each frame. When I play the game the low-res way at 300x300 I get 6000-12000 yields every frame, and I’m happy with that. When I play the high-res way at 800x600 (with different art to match the resolution) the yields are ALWAYS 0. That tells me I need to do a bit of optimizing :slight_smile:

So profiling has revealed where the biggest bottlenecks are. I tried bringing up the biggest one a couple of weeks ago on the Java Gaming Forums at Sun, but I didn’t really get much of a response, so hopefully you guys can help me tune this up. Here’s the biggest one, apparently taking 20% of my processing time each loop. It’s where I draw the background tiles of the game. And that’s with about 15 other ships on the screen firing their weapons at me. Without them it’s over 35%.

for (int i = bgStartX-Constants.SYSTEM_TILE_WIDTH; i < Constants.GAME_WIDTH; i += Constants.SYSTEM_TILE_WIDTH)
      for (int j = bgStartY-Constants.SYSTEM_TILE_HEIGHT; j < Constants.GAME_HEIGHT; j += Constants.SYSTEM_TILE_HEIGHT)
            g2d.drawImage(back, i, j, null);

bgStartX is some number % Constants.SYSTEM_TILE_WIDTH. I use it to place the BufferedImage “back” just far enough off the edge of the screen to produce the scrolling effect.

back is created with:

GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice().getDefaultConfiguration().createCompatibleImage(width, height, Transparency.OPAQUE)

I look forward to any advice I can get. Thanks in advance!

I’m sure you’ve probably heard this, but make sure you copy your images to “Automatic” or “Volatile” images. This speeds up blits by buffering the image inside the card. Other than that, tile drawing is just plain slow. Try to use larger tiles for one thing. (e.g. 64x64) Another optimization is to only redraw what you need to. If you use a backbuffer for the Map, you may not need to redraw it every frame. This is particularly effective for Parallax scrolling. I use this method in GAGE2D and it does help.

IIRC, I also tried the “copy the unchanged portion over a bit, then redraw the ‘dirty’ tiles” method. Unfortunately, it didn’t provide any speed boosts in my testing. Memory bandwidth was probably the limiting factor.

the dimensions of my tiles are 512x512. I’m putting them on the BufferedImage created as I described, and Abuse has instructed that that will keep the image in vram, but is that image too big? Also, since the tiles have no Alpha value, I guess I could use Volatile Images, but would the speed be much different?

I’m going to look into the parallax scrolling to see what I can learn from it. Thanks!

[quote]the dimensions of my tiles are 512x512. I’m putting them on the BufferedImage created as I described, and Abuse has instructed that that will keep the image in vram, but is that image too big?
[/quote]
OUCH. Yeah, 512x512 is probably too big. At 32 bit color, you’re using a megabyte of video RAM for each tile. Remember, if you use powers of two for tile sizes, the memory requirement quadruples every time you double the side length. Allow me to demonstrate (at 4 bytes per pixel):

8x8 = 256 bytes
16x16 = 1 KB
32x32 = 4 KB
64x64 = 16 KB
128x128 = 64 KB
256x256 = 256 KB
512x512 = 1 MB
1024x1024 = 4 MB
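
The figures above can be reproduced with a few lines of Java. This is only a sketch of the arithmetic, assuming 4 bytes per pixel (32 bit color); the class and method names are made up for illustration.

```java
final class TileMemory {
    /** Bytes needed for a width x height tile at the given bytes per pixel. */
    static long bytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Doubling the side length quadruples the memory needed.
        for (int side = 8; side <= 1024; side *= 2) {
            System.out.println(side + "x" + side + " = " + bytes(side, side, 4) + " bytes");
        }
    }
}
```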

[quote]Also, since the tiles have no Alpha value, I guess I could use Volatile Images, but would the speed be much different?
[/quote]
As long as you’re copying your images to “Automatic” (I forget, didn’t we change the name to “Managed”?) images, there should be no difference using Volatile images. “Automatic” images use the best backing store available. With any luck, that should be a Volatile Image.
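
For anyone who does want to try a VolatileImage directly, here is a rough sketch of the usual validate/contentsLost handling, since a VolatileImage can lose its contents at any time. VolatileTile is a hypothetical helper, not something from GAGE or from the code in this thread, and it says nothing about whether this will actually be faster on a given machine.

```java
import java.awt.Graphics2D;
import java.awt.GraphicsConfiguration;
import java.awt.image.BufferedImage;
import java.awt.image.VolatileImage;

final class VolatileTile {
    private final BufferedImage source; // system-memory master copy
    private VolatileImage cached;       // accelerated copy; its contents may be lost

    VolatileTile(BufferedImage source) {
        this.source = source;
    }

    /** Draws the tile at (x, y), re-creating or restoring the VolatileImage when needed. */
    void draw(Graphics2D target, GraphicsConfiguration gc, int x, int y) {
        do {
            int status = (cached == null)
                    ? VolatileImage.IMAGE_INCOMPATIBLE
                    : cached.validate(gc);
            if (status == VolatileImage.IMAGE_INCOMPATIBLE) {
                // No image yet, or the display configuration changed: start over.
                if (cached != null) {
                    cached.flush();
                }
                cached = gc.createCompatibleVolatileImage(source.getWidth(), source.getHeight());
                cached.validate(gc);
                status = VolatileImage.IMAGE_RESTORED;
            }
            if (status == VolatileImage.IMAGE_RESTORED) {
                // Surface was (re)created: copy the master image back onto it.
                Graphics2D g = cached.createGraphics();
                g.drawImage(source, 0, 0, null);
                g.dispose();
            }
            target.drawImage(cached, x, y, null);
        } while (cached.contentsLost()); // repeat if the surface was lost mid-draw
    }
}
```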

I tried this:

BufferedImage tile = Rimscape.getMediaItem(Constants.TILES, systems[currentSystem].getBackgroundTile());
back = GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice().getDefaultConfiguration().createCompatibleVolatileImage(tile.getWidth(), tile.getHeight());
Graphics g = back.getGraphics();
g.drawImage(tile, 0, 0, null);

then drawing “back” like I did before and the costs almost tripled… so I think I was better off with BufferedImage. Before I was just using

back = Rimscape.getMediaItem(Constants.TILES, systems[currentSystem].getBackgroundTile());

Hey I just managed to cut the load by 1/3. I figured since my tiles are friggin huge that drawing before and beyond the borders of the screen might be a problem, so after scratching my head for a few minutes I wrote some code to keep my game from drawing anything outside the bounds of the screen. Here’s what I came up with instead of my previous draw method:

for (i = bgStartX-tW; i < w; i += tW)
      for (j = bgStartY-tH; j < h; j += tH) {
            
            startX = Math.max(0, i);      startY = Math.max(0, j);
            if ((width = Math.min(tW - (startX - i), w - startX)) > 0) {
                  
                  if ((height = Math.min(tH - (startY - j), h - startY)) > 0) {
                  
                        sourceX = (startX == 0 ? 0 - i : 0);
                        sourceY = (startY == 0 ? 0 - j : 0);
                        
                        g2d.drawImage(back,
                              //dest rect
                              startX, startY, startX + width, startY + height,
                              //source rect
                              sourceX, sourceY, sourceX + width, sourceY + height,
                              null
                        );
                  }
            }
      }
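
The clipping math in the loop above is the same on both axes, so it can be pulled out into one small function. This is just a sketch of that idea; TileClip and clipAxis are hypothetical names, and the source offset is computed as dest - tileOrigin, which gives the same result as the startX == 0 test in my loop.

```java
final class TileClip {
    /**
     * Clips one axis of a tile against the screen.
     * Returns {dest, src, length}, or null if nothing of the tile is visible.
     */
    static int[] clipAxis(int tileOrigin, int tileSize, int screenSize) {
        int dest = Math.max(0, tileOrigin);  // first visible screen coordinate
        int src = dest - tileOrigin;         // matching offset inside the tile
        int length = Math.min(tileSize - src, screenSize - dest);
        return length > 0 ? new int[] { dest, src, length } : null;
    }
}
```

Inside the tile loop this would be used as `int[] cx = TileClip.clipAxis(i, tW, w)` and `int[] cy = TileClip.clipAxis(j, tH, h)`; when both are non-null, the call becomes `g2d.drawImage(back, cx[0], cy[0], cx[0] + cx[2], cy[0] + cy[2], cx[1], cy[1], cx[1] + cx[2], cy[1] + cy[2], null)`.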

trembovetski said:

[quote] This one is easy: we have this limit (which we often forget about ourselves) for bitmask images on windows, the size (w * h) must be less than 65536 in order for image to get accelerated. This is because of the way we accelerate bitmask images with DirectX pipeline.
[/quote]
See here:
http://www.java-gaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=2D;action=display;num=1083335469;start=15

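So 256x256 is about the maximum (square) tile size you should use for bitmask images. Note that 256x256 is exactly 65536 pixels, so if the “less than” in the quote is strict, the tile would need to be slightly smaller to get accelerated.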

I tried splitting those into 256x256, 128x128, and 64x64 pixel tiles, but each time I went lower, performance decreased. I was creating the BufferedImages in the same way I described above, which I have come to understand puts them in video memory. Am I wrong? Or is there something else I’m missing? So far though, my best performance has been with the images being at 512x512

We found the 2D performance was a bit pants in general - especially with transparencies so we rolled our own software version.

There are some references in this forum. Basically you make a new ImageProducer/ImageConsumer and use an int array for the screen, then some nice for loops for copying the sprite data to the image, and a single drawImage call at the end to render the screen. The VM seems to do a great job of optimising these loops :slight_smile:

It can be a bit complicated, but I used various references such as:

[quote]Take a glance at TinyPTC http://www.gaffer.org/tinyptc/ which has a correct ImageProducer/ImageConsumer model, although I have not used it plenty of people like it.
[/quote]
and:

[quote]JEF stands for Java Emulation Frame Work and is a Source Forge project which can be located here:
https://sourceforge.net/projects/jef/
[/quote]
These helped me on my way and now we have nice fast 2D rendering with alpha blending and other cool features (like z-buffering so sprites can move behind background features)

Be warned though - it’s not a job for the faint hearted!

  • Dom

Well I’ve spent the last two days reading through most all of the classes for those two projects and browsing most of their pages and surprisingly I found nothing to help me. I’m afraid I don’t understand what it is you’re saying I should be looking for. You say you should be “copying pixels” but when you use sprites with alpha components you have to combine pixels, not just copy them over… so I don’t understand what you mean I think. Could you elaborate more for me?

You need the ImageConsumer and ImageProducer bits from those libs to draw the screen from an array of ints.

From our derived applet class:


      public synchronized boolean render()
      {
            if ( copy11 )
            {
                  // get our graphics buffer (640x480 int array)
                  int pixels[] = myGraphics.dest;

                  // check consumer
                  if (_consumer!=null)
                  {
                        // copy integer pixel data to image consumer
                        _consumer.setPixels(0,0,_width,_height,_model,pixels,0,_width);
                        
                        // notify image consumer that the frame is done
                        _consumer.imageComplete(ImageConsumer.SINGLEFRAMEDONE);
                  }
                  // draw image to graphics context
                  _graphics.drawImage(_image, 0, 0, _width, _height, null);
            }
            else
            {
                  // draw image to graphics context
                  _graphics.drawImage(_image2, 0, 0, _width, _height, null);
            }
            return true;
      }


You should be able to find this bit of code in those libs (plus the rest of the class) as that is where we got it from :slight_smile:

We use a derived Sprite class too, which after loading the original image as normal uses a PixelGrabber to get the sprite data into an array of ints.
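
As a rough illustration of the PixelGrabber step (not our actual Sprite class; SpritePixels and grab are names invented for this sketch), pulling an image into an int array of default ARGB pixels could look like this:

```java
import java.awt.Image;
import java.awt.image.PixelGrabber;

final class SpritePixels {
    /** Grabs a width x height region starting at (0, 0) as default-model ARGB ints. */
    static int[] grab(Image image, int width, int height) {
        int[] pixels = new int[width * height];
        PixelGrabber grabber = new PixelGrabber(image, 0, 0, width, height, pixels, 0, width);
        try {
            // grabPixels() blocks until the pixels are delivered or the grab fails.
            if (!grabber.grabPixels()) {
                throw new IllegalStateException("pixel grab failed or was aborted");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while grabbing pixels", e);
        }
        return pixels;
    }
}
```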

Non-transparent sprites are drawn just by copying the ints:
From our derived ‘Graphics’ object:

            int w = s.width;
            int h = s.height;
            int start = y*width + x;
            int start_end = start + width*h;
            int sp = 0;
            int src[] = s.pixels;
            for(; start < start_end; start+=width)
            {
                  int end = start + w;
                  for( int curr=start; curr<end; curr++, sp++ )
                  {
                        dest[curr] = src[sp];
                  }
            }
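
The loop above assumes the sprite lies completely inside the screen array. A self-contained variant that also clips against the destination bounds might look like the sketch below; OpaqueBlit and its parameter names are illustrative and not part of our library.

```java
final class OpaqueBlit {
    /** Copies a srcW x srcH sprite into dest (row stride destW) at (x, y), clipped to dest. */
    static void blit(int[] dest, int destW, int destH,
                     int[] src, int srcW, int srcH, int x, int y) {
        int x0 = Math.max(0, x);
        int y0 = Math.max(0, y);
        int x1 = Math.min(destW, x + srcW);
        int y1 = Math.min(destH, y + srcH);
        int rowLength = x1 - x0;
        if (rowLength <= 0) {
            return; // sprite is entirely left or right of the screen
        }
        for (int row = y0; row < y1; row++) {
            int srcPos = (row - y) * srcW + (x0 - x);
            // Opaque pixels need no blending, so each visible row is one bulk copy.
            System.arraycopy(src, srcPos, dest, row * destW + x0, rowLength);
        }
    }
}
```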

To do transparency, you do much the same thing but blend using the algorithm here:
http://www.java-gaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=cluebies;action=display;num=1086024248
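
In case that link is not to hand, here is a minimal sketch of a standard per-channel "source over" blend for one pixel. It assumes non-premultiplied ARGB sprite pixels and an opaque destination, and it is not necessarily the same algorithm as the one in the linked post; AlphaBlend is a name made up for the example.

```java
final class AlphaBlend {
    /** Blends a non-premultiplied ARGB source pixel over an opaque destination pixel. */
    static int blend(int src, int dst) {
        int alpha = src >>> 24;
        if (alpha == 255) {
            return src;              // fully opaque: plain copy
        }
        if (alpha == 0) {
            return dst | 0xFF000000; // fully transparent: keep destination
        }
        int inverse = 255 - alpha;
        int r = (((src >> 16) & 0xFF) * alpha + ((dst >> 16) & 0xFF) * inverse) / 255;
        int g = (((src >> 8) & 0xFF) * alpha + ((dst >> 8) & 0xFF) * inverse) / 255;
        int b = ((src & 0xFF) * alpha + (dst & 0xFF) * inverse) / 255;
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }
}
```

In the copy loop this would replace the plain assignment, e.g. `dest[curr] = AlphaBlend.blend(src[sp], dest[curr]);`.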

We have about 16 different sprite draw functions, to do transparency, render a sub-rectangle, scaling, etc.

As I said, it’s quite a bit of work to get up and running, but afterwards you have your own 2D lib that is pretty fast for rendering :slight_smile:

Hope this helps,

  • Dom

Transparency is fine; it’s translucency that will cause Java2D to crawl (unless you are using the experimental transaccel flag).

Malohkan, a couple of questions :-

  1. How much space do the images for your background (and indeed your entire game) take up?

If you have more images than can fit in your available vram, you are going to have problems.
At the moment there is no way to control what (if anything) gets shunted out of vram (it works on a first come, first served basis).
If this is your problem, there isn’t much you can do beyond loading the images that are most expensive to render first
(i.e. the images that are drawn most often).

  2. How are you creating your images?

I would recommend loading your images through ImageIO.read(), and then copying them onto a BufferedImage obtained from GraphicsConfiguration.createCompatibleImage(int,int,int).

This will definitely give you a managed image.

  3. As Onyx said, the size of the images matters.

Although I seem to remember it is only large (>256x256) BITMASK images that will not be cached in vram.
OPAQUE images always are, regardless of size (note: this may be wrong).

  4. What platform are you using?

Everything I have said above is applicable to Windows only.
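
The loading approach from point 2 can be sketched roughly as follows. CompatibleImages is a hypothetical helper name, and the headless fallback is only there so the snippet also runs on a machine without a display; it is not part of the recommendation itself.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.GraphicsConfiguration;
import java.awt.GraphicsEnvironment;
import java.awt.Transparency;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class CompatibleImages {
    /** Loads an image with ImageIO and copies it onto a screen-compatible image. */
    static BufferedImage load(File file, int transparency) throws IOException {
        BufferedImage raw = ImageIO.read(file);
        if (raw == null) {
            throw new IOException("unsupported image format: " + file);
        }
        return toCompatible(raw, transparency);
    }

    /** Copies raw onto an image from GraphicsConfiguration.createCompatibleImage. */
    static BufferedImage toCompatible(BufferedImage raw, int transparency) {
        int w = raw.getWidth();
        int h = raw.getHeight();
        BufferedImage copy;
        if (GraphicsEnvironment.isHeadless()) {
            // Assumption for this sketch only: with no screen, use a plain image.
            int type = (transparency == Transparency.OPAQUE)
                    ? BufferedImage.TYPE_INT_RGB : BufferedImage.TYPE_INT_ARGB;
            copy = new BufferedImage(w, h, type);
        } else {
            GraphicsConfiguration gc = GraphicsEnvironment.getLocalGraphicsEnvironment()
                    .getDefaultScreenDevice().getDefaultConfiguration();
            copy = gc.createCompatibleImage(w, h, transparency);
        }
        Graphics2D g = copy.createGraphics();
        g.setComposite(AlphaComposite.Src); // copy pixels as-is, including alpha
        g.drawImage(raw, 0, 0, null);
        g.dispose();
        return copy;
    }
}
```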

The images take up about 4mb. The class files are about 150k. I’ve tried playing the game while printing out GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice().getAvailableAcceleratedMemory() and it always shows a heck of a lot.

I load in my images using Toolkit and MediaTracker because I’ve never gotten ImageIO to work even when using code you’ve shared on the forums. I then draw that image onto a BufferedImage using the method I described in my initial post up top, using GraphicsConfiguration.

I don’t load very many of my BufferedImages with Transparency.OPAQUE. Most are TRANSLUCENT and none are BITMASK.

I’m using Windows XP.

crystalsquid, I’m working on creating some of the tools you’ve described. I have the basics down, but my drawing of images onto others still isn’t working. I double checked the math and I can’t find anything wrong, but the image is being drawn broken. The funny thing is that it’s drawing in the correct placement, but it’s like it’s grabbing the wrong pixels from the source. I’m sure I just need to tinker more and figure out where I messed up.

Well so far my progress has actually degraded :wink: I made a drawImage for a custom class I made using all this, and then it seems like my coordinate system went retarded. Some things are flipped over y=x, and some things are rotated. I also haven’t gotten alpha to work, but I have managed to make the screen look like it does when your monitor is dying and all the colors go nuts. Soo… yeah this isn’t going well :wink: I think I’m going to have to scratch this project unless maybe you might know where I’ve screwed up. Thanks though!

[quote]You need the ImageConsumer and ImageProducer bits from those libs to draw the screen from an array of ints.
[/quote]
But will it be fast enough to sustain 800x600 at 50 fps? Sounds too much even if you don’t touch the int array.

With Java2D you can’t use TRANSLUCENT if you want decent speed. I think the background is read from the card (very slow), blended, then written back.

You might consider using OpenGL. It is fast even with blending. It’s a lot of work of course, and you’ll have to deploy some extra dlls.

Yeah, good question. When I made my attempt, although it wasn’t working right, it was still only drawing about half of the stuff (because I hadn’t made a drawImage method to take AffineTransform objects) and working with 0 alpha, and I was getting around 20 fps. So that wasn’t very encouraging.

Then that’s your problem.

Unless you have the ‘transaccel’ flag set, and have a graphics card with the necessary support, translucent images will be very slow.

:edit:

just for a quick metric on that.

Balls.jar with rendering TRANSLUCENT 32x32 images onto a BufferStrategy backBuffer (or VolatileImage) on my machine gets :-

with ‘transaccel’ : ~3000 @ 30fps
without ‘transaccel’ : ~30 @ 30fps

:edit:

hmm, i’ve just discovered something very strange with regard to different resolutions, and TRANSLUCENT rendering (with and without acceleration)
I’m gonna post a new thread on it.

Yeah that’s very helpful on my standalone version, but I still want to have it run the best it can in an Applet.

[quote]Yeah that’s very helpful on my standalone version, but I still want to have it run the best it can in an Applet.
[/quote]
a 4mb Applet?! =/

I doubt many ppl will have the patience to wait that long for an applet to load.
If you are not limiting yourself to Java1.1 only capable machines, i see little reason to use Applets.

Use webstart!

I have a low-res set of art that’s only about 350k. But I’d hate to have to have two sets of code and every time I want to make a change I’d have to go through both sets and make simultaneous changes. The 350k version is plenty small for the Applet and will be condensed further.

Anyway, loading the full one as an Applet is faster than loading many of the Flash games at MiniClip.com and plenty of people play those :slight_smile:

So, more questions. Abuse, I saw you made a post today about using VolatileImages. I tried it for my tiling background, since there’s no transparency. The tiles are broken into pieces so that they’ll be sure to be cached in vram. Even without checking whether contents have been lost and all that, when I used VolatileImages I got around 15 fps. When I used BufferedImage, I got 70+ fps. Do VolatileImages choke when the drawImage(image, destRect, sourceRect, null) method is called?

I’m creating them all through the GraphicsConfiguration blah blah so that they’ll be cached in vram. I tried breaking them into 256x256, 128x128, and 64x64. Performance got worse as I went to smaller sizes. With BufferedImage though I never dropped my framerate.

Why do VolatileImages always perform so slowly for me? I know they should be faster, so what am I missing?

Oh, and I won’t be using WebStart because Web Start forces you to jump out of the browser. That means you can’t see the ads on the page, which means bye bye to my only source of revenue for my free games site. That’s not gonna happen :wink: