clip an image

Wow, I stand corrected.
Though I am still dubious as to the cause of the slow-down, and a 3x slowdown!?!?

For an operation that should be graphics card limited, I think there is something going seriously wrong in the 2D pipeline for such a dramatic difference.

I’d be interested to see what speed drawImage(Image img, int dx1, int dy1, int dx2, int dy2, int sx1, int sy1, int sx2, int sy2, ImageObserver observer) gives, as this would eliminate the overhead of the setClip invocations.

as you wish, the benchmark has been updated to include a 3rd method - “drawImage”

it turns out this method is identical to the graphics clipping - my best guess is that there is some kind of internal clipping going on in the background.
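For readers who have not seen the benchmark source, here is a minimal sketch of the two non-subimage methods being compared. The class name, the tiny two-frame sheet and the coordinates are all made up for illustration and are not the benchmark's actual code. It only shows that a clip plus an offset drawImage paints the same pixels as the source/destination rectangle overload; it says nothing about speed.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.image.BufferedImage;

// Hypothetical demo class, not part of the benchmark.
class ClipVersusSourceRect {

    // "Graphics clipping": restrict drawing to the target cell, then draw the
    // whole sheet shifted so the wanted frame lands inside the clip.
    static void drawWithClip(Graphics2D g, BufferedImage sheet,
                             int sx, int sy, int w, int h, int dx, int dy) {
        Shape oldClip = g.getClip();
        g.setClip(dx, dy, w, h);
        g.drawImage(sheet, dx - sx, dy - sy, null);
        g.setClip(oldClip); // restore whatever clip was set before
    }

    // "drawImage": the overload that takes destination and source corners.
    static void drawWithSourceRect(Graphics2D g, BufferedImage sheet,
                                   int sx, int sy, int w, int h, int dx, int dy) {
        g.drawImage(sheet, dx, dy, dx + w, dy + h, sx, sy, sx + w, sy + h, null);
    }

    // Assumed sheet layout: two 8x8 frames, red on the left, blue on the right.
    static BufferedImage makeSheet() {
        BufferedImage sheet = new BufferedImage(16, 8, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = sheet.createGraphics();
        g.setColor(Color.RED);
        g.fillRect(0, 0, 8, 8);
        g.setColor(Color.BLUE);
        g.fillRect(8, 0, 8, 8);
        g.dispose();
        return sheet;
    }

    // Draws the blue frame at (5, 5) on an off-screen "screen" image.
    static BufferedImage render(boolean useClip) {
        BufferedImage screen = new BufferedImage(32, 32, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = screen.createGraphics();
        if (useClip) {
            drawWithClip(g, makeSheet(), 8, 0, 8, 8, 5, 5);
        } else {
            drawWithSourceRect(g, makeSheet(), 8, 0, 8, 8, 5, 5);
        }
        g.dispose();
        return screen;
    }

    static boolean samePixels(BufferedImage a, BufferedImage b) {
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("same pixels: " + samePixels(render(true), render(false)));
    }
}
```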

subimaging is faster simply because it needs less computation than the other 2 methods. it’s just read-from-source -> plot-to-screen.

edit:
after further investigation, it seems hardware acceleration is making the difference!

every subimage in the subimage array is on a hardware accelerated surface. this makes it a hardware blit from source -> screen.

although the spritesheet itself is on a hardware accelerated surface, we’re making Graphics do its own calculations with pixels which makes it a software blit from source -> screen.

So the lesson is - clipping an image kills that drawImage operation’s hardware acceleration.

Does that mean that painting a big, larger-than-the-screen image will forfeit hardware acceleration due to the screen clip?

hm… technically, ANY image that’s drawn outside the boundaries of the screen is clipped, and therefore should lose acceleration. but if you notice, the sprites DO leave the screen a bit on the right/bottom sides… with no obvious effect…

shrug

Mh.

On the first run I got these numbers with 1k sprites:
SI:24
GC:22
DI:23

After a reboot I got these (again 1k sprites):
SI:17
GC:12
DI:13

Geez… why? There is even less stuff running in the background.

Well, there are some things bad about this benchmark. The spritesheet’s size exceeds 2^16 pixels and you’re only using 2 buffers (which is fine for windowed mode, but really bad for fullscreen).

[quote]The spritesheet’s size exceeds 2^16 pixels
[/quote]
ah yes, of course.

However, I would have expected this to cause no hardware acceleration for all versions - as getSubimage uses the same source pixel data as its parent image.
Perhaps BufferedImages created from getSubimage are cached in VRAM independently of their parent image.

yes. that’s how I coded it :wink:
quick code snippet…


protected void loadSprites() {
	// <snip>
	for (y = 0;y < spriteImg.length;y++) {
		for (x = 0;x < spriteImg[y].length;x++) {
			spriteImg[y][x] = createBuffer(Sprite.WIDTH,Sprite.HEIGHT); // <-- new accelerated image from createBuffer
			spriteImg[y][x].createGraphics().drawImage(sheet.getSubimage(x*Sprite.WIDTH,y*Sprite.HEIGHT,Sprite.WIDTH,Sprite.HEIGHT),0,0,null);
		}
	}
	// <snip>
}

that createBuffer method returns a hardware accelerated image surface. I draw a sprite frame onto that and then it is never drawn on again. voilà, yet another accelerated image =o

Java engineers have made it clear in the past (through these forums) that getSubimage shouldn’t be used, because its implementation is not optimal.

They explicitly said to paint the image into a new image instead of using getSubimage.
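A rough sketch of what "paint the image into a new image" could look like. The helper name copyFrame is invented here, and a plain TYPE_INT_ARGB BufferedImage stands in for whatever accelerated surface a real game would create (for example via GraphicsConfiguration.createCompatibleImage). The point is only that the copy owns its own pixels.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper class for illustration.
class FrameCopy {

    // Copies the w x h region at (x, y) of the sheet into a brand new image.
    static BufferedImage copyFrame(BufferedImage sheet, int x, int y, int w, int h) {
        // Assumption: plain ARGB image; swap in a compatible image in real code.
        BufferedImage frame = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = frame.createGraphics();
        g.drawImage(sheet, 0, 0, w, h, x, y, x + w, y + h, null);
        g.dispose();
        return frame;
    }

    // Returns the copy's pixel after the sheet has been modified behind its back.
    static int copiedPixelAfterSheetChange() {
        BufferedImage sheet = new BufferedImage(8, 8, BufferedImage.TYPE_INT_ARGB);
        sheet.setRGB(2, 3, 0xFF00FF00);              // opaque green
        BufferedImage frame = copyFrame(sheet, 2, 3, 2, 2);
        sheet.setRGB(2, 3, 0xFFFF0000);              // later change to the sheet
        return frame.getRGB(0, 0);                   // the copy keeps the old value
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(copiedPixelAfterSheetChange()));
    }
}
```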

I can only assume that the clipping/drawImage implementations have similar implementation issues.

doesn’t make a difference, the benchmark only calls getSubimage during initialization and records the results on a hardware accelerated image; the method is never called again

The reason you’re seeing bad performance with non-subimage modes is that you
use getSubimage() on the “sheet” image, which disables its acceleration
(it’s easy to see if you run with -Dsun.java2d.trace=log to see what primitives are
being used).

I made some simple mods to your test (which is pretty nicely done, btw):


#> diff -c TestOld.java Test.java
*** TestOld.java   Sun May 21 20:24:23 2006
--- ../Test.java        Tue May 23 16:55:31 2006
***************
*** 205,218 ****
                // there 8 total sprites
                // okay, so lets do this
                sheet = loadImage("sprites.png");
                spriteImg = new BufferedImage[sheet.getHeight()/Sprite.HEIGHT][sheet.getWidth()/Sprite.WIDTH];
                int x,y,j = 0;
                for (y = 0;y < spriteImg.length;y++) {
                        for (x = 0;x < spriteImg[y].length;x++) {
                                spriteImg[y][x] = createBuffer(Sprite.WIDTH,Sprite.HEIGHT);
!                               spriteImg[y][x].createGraphics().drawImage(sheet.getSubimage(x*Sprite.WIDTH,y*Sprite.HEIGHT,Sprite.WIDTH,Sprite.HEIGHT),0,0,null);
                        }
                }
                sprites = new ArrayList();
                for (j = 0;j < 25;j++) {
                        sprites.add(new Sprite(rand(0,7),new Point(rand(0,width),rand(0,height)),new Point(rand(0,width),rand(0,height)),rand(1,3)));
--- 205,220 ----
                // there 8 total sprites
                // okay, so lets do this
                sheet = loadImage("sprites.png");
+               BufferedImage sheet1 = loadImage("sprites.png");
                spriteImg = new BufferedImage[sheet.getHeight()/Sprite.HEIGHT][sheet.getWidth()/Sprite.WIDTH];
                int x,y,j = 0;
                for (y = 0;y < spriteImg.length;y++) {
                        for (x = 0;x < spriteImg[y].length;x++) {
                                spriteImg[y][x] = createBuffer(Sprite.WIDTH,Sprite.HEIGHT);
!                               spriteImg[y][x].createGraphics().drawImage(sheet1.getSubimage(x*Sprite.WIDTH,y*Sprite.HEIGHT,Sprite.WIDTH,Sprite.HEIGHT),0,0,null);                        }
                }
+               sheet1.flush(); sheet1 = null;
                sprites = new ArrayList();
                for (j = 0;j < 25;j++) {
                        sprites.add(new Sprite(rand(0,7),new Point(rand(0,width),rand(0,height)),new Point(rand(0,width),rand(0,height)),rand(1,3)));

Basically, I just do getSubImage on a different image instead of the one used
by “mode 0”.

With this change I get much better performance with the drawImage method than with getSubimage.

Especially with the new Direct3D pipeline in Mustang, it costs a lot to change the texture to render from.
In your case, if you have tons of images you’ll change the source texture for every sprite,
which costs a lot.

Also, with tons of smaller images you may be wasting more video memory.

Thanks,
Dmitri
Java2D Team

[quote]Basically, I just do getSubImage on a different image instead of the one used by “mode 0”.
[/quote]

I meant to say that I do getSubImage on a different image instead of the one used by modes other than
“mode 0”.

Also, you might want to change ‘+’ to add like 500 sprites, otherwise it takes a while to get
to the point when the fps falls to less than 60 =)

Dmitri

And regarding the clipping mode. It will be slower because we’ll need to update/calculate the clip
for each sprite.

So, in short - the drawImage(srcRect,dstRect) should be the fastest.

Dmitri
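A small sketch of how the drawImage(srcRect,dstRect) approach could be wrapped for a sprite sheet laid out as a grid. The class name SheetDrawer, the row-major frame numbering and the tiny test sheet are assumptions made for this example, not code from the benchmark.

```java
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical wrapper: draws frames straight from one shared sheet image.
class SheetDrawer {
    private final BufferedImage sheet;
    private final int frameW, frameH, columns;

    SheetDrawer(BufferedImage sheet, int frameW, int frameH) {
        this.sheet = sheet;
        this.frameW = frameW;
        this.frameH = frameH;
        this.columns = sheet.getWidth() / frameW;
    }

    // Frames are assumed to be numbered left to right, top to bottom.
    void drawFrame(Graphics g, int index, int dx, int dy) {
        int sx = (index % columns) * frameW;
        int sy = (index / columns) * frameH;
        g.drawImage(sheet,
                dx, dy, dx + frameW, dy + frameH,   // destination corners
                sx, sy, sx + frameW, sy + frameH,   // source corners
                null);
    }

    // Demo: a 4x4 sheet of 2x2 frames; frame 3 starts at sheet pixel (2, 2).
    static int demoPixel() {
        BufferedImage sheet = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        sheet.setRGB(2, 2, 0xFF123456);
        BufferedImage screen = new BufferedImage(8, 8, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = screen.createGraphics();
        new SheetDrawer(sheet, 2, 2).drawFrame(g, 3, 1, 1);
        g.dispose();
        return screen.getRGB(1, 1);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(demoPixel()));
    }
}
```

Since every frame comes from the same source image, this also fits the earlier remark about not switching the source texture for every sprite.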

thanks dmitri! nice to see some clues to these mysteries…

for sake of argument, I took your change a step further: I separated the Subimaging and Clipping/drawImage tests into 2 applications. take a look…

Subimaging.jar [source]
Clipping.jar [source]

however, @1000 sprites, Subimaging is at 44FPS while Clipping lags behind at 19FPS… what’s going on here?

edit: adding/subtracting sprites now happens in steps of 15; as you pointed out, +1/-1 was too slow :slight_smile:

I was running the test on Mustang with Direct3D pipeline enabled, so I didn’t notice the
problem with clipped images and the default pipeline…

Basically, if you’re using 5.0, or 6.0 with the d3d pipeline disabled, you’re using
DirectDraw acceleration. And we have this funny (rather arbitrary) restriction
on the size of 1-bit transparent images that we can accelerate. If
a 1-bit transparent image with DirectColorModel is larger than
65536 pixels we don’t accelerate it. Your sprites.png is 576 * 128 = 73728 pixels, so it doesn’t
get accelerated. (For those interested, the offending file is WinCachingSurfaceManager.java;
take a look at Mustang’s code on http://mustang.dev.java.net). The restriction has to do with the algorithm
used for calculating a pixel which can be used for masking with DirectDraw’s blit operation.
If the image is too large, it takes too long to calculate.
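To make the arithmetic concrete, here is a tiny sketch of the size comparison. The class and method names are invented, and this is not the actual check in WinCachingSurfaceManager.java, only the numbers quoted above.

```java
// Hypothetical illustration of the 65536-pixel threshold quoted above.
class DirectDrawSizeCheck {
    static final int MAX_PIXELS = 65536; // 2^16, figure taken from the post

    static boolean exceedsLimit(int width, int height) {
        // long arithmetic so very large dimensions cannot overflow
        return (long) width * height > MAX_PIXELS;
    }

    public static void main(String[] args) {
        System.out.println(576 * 128);               // pixel count of the sheet
        System.out.println(exceedsLimit(576, 128));  // the sprite sheet
        System.out.println(exceedsLimit(256, 256));  // exactly 65536 pixels
    }
}
```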

The direct3d and opengl pipelines don’t have this restriction.

Workaround is rather simple. Do not convert your image after loading. The restriction above
is only applicable to images with DirectColorModel. Your image is 8-bit, so it will be IndexColorModel, so it will
get accelerated. Anyway, as of 5.0 all images could potentially be accelerated, no need for converting them to a “compatible image”
as in 1.4 days.

So, in your loadImage() method just return ‘b’.
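Since the benchmark's loadImage() is not shown in this thread, here is only a guess at what the "no conversion" version might look like. The InputStream parameter and the in-memory test PNG are assumptions for the sake of a self-contained example. It reads the image with ImageIO and hands it back as-is, so a palette PNG keeps its IndexColorModel.

```java
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical loader; the real loadImage() in the benchmark may differ.
class LoadWithoutConversion {

    static BufferedImage loadImage(InputStream in) throws IOException {
        BufferedImage b = ImageIO.read(in);
        if (b == null) {
            throw new IOException("no ImageIO reader for this data");
        }
        return b; // returned directly, not redrawn into a "compatible" image
    }

    // Builds a small palette PNG in memory so the demo needs no file on disk.
    static byte[] makeIndexedPng() throws IOException {
        BufferedImage indexed = new BufferedImage(8, 8, BufferedImage.TYPE_BYTE_INDEXED);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(indexed, "png", out);
        return out.toByteArray();
    }

    static boolean loadedImageIsIndexed() {
        try {
            ImageIO.setUseCache(false); // keep the demo from touching the temp dir
            BufferedImage b = loadImage(new ByteArrayInputStream(makeIndexedPng()));
            return b.getColorModel() instanceof IndexColorModel;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("indexed: " + loadedImageIsIndexed());
    }
}
```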

Thanks,
Dmitri

Hi! I come from Italy and I’m working on a project about Java2D for university.
I’m developing a game and I want to know if my “drawing-engine” is good or not.

I do not know how you create a Sprite but in my case I create an ArrayList of BufferedImage and each element is a Frame of this Sprite. Every n milliseconds the Frame changes and is drawn with [b]drawImage[/b] of a Graphics with a double-buffering strategy. The game with 640 by 480 resolution goes well, but can I do something else to increase the performance? Ah I also use the [b]getSubimage[/b] method to draw parts of the sprites (always using [b]drawImage[/b] in this way g.drawImage(image.getSubimage(0, 0, 20, 20), 0, 12, null)).

Thanks in advance for all replies!

Nobody knows?

Don’t use getSubimage() for this case, it will defeat image acceleration. Instead use the variant of Graphics.drawImage() that takes both src and dst parameters:
http://download.java.net/jdk6/docs/api/java/awt/Graphics.html#drawImage(java.awt.Image,%20int,%20int,%20int,%20int,%20int,%20int,%20int,%20int,%20java.awt.image.ImageObserver)

So taking your example:
g.drawImage(image, 0, 12, 20, 12+20, 0, 0, 20, 20, null);

We really should add an item to the FAQ about this.

Chris
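As a sanity check on the rewrite above, this sketch draws the same 20x20 region both ways and compares the results. The class name, the 40x40 test pattern and the off-screen target image are assumptions; only the two drawImage calls come from the posts. It checks the output pixels, not acceleration.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical comparison harness.
class SubimageVersusSourceRect {

    // Assumed test image: every pixel opaque, colour derived from its position.
    static BufferedImage makeImage() {
        BufferedImage image = new BufferedImage(40, 40, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < 40; y++) {
            for (int x = 0; x < 40; x++) {
                image.setRGB(x, y, 0xFF000000 | ((x * 6) << 16) | ((y * 6) << 8));
            }
        }
        return image;
    }

    // The original call from the question.
    static BufferedImage viaSubimage(BufferedImage image) {
        BufferedImage out = new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        g.drawImage(image.getSubimage(0, 0, 20, 20), 0, 12, null);
        g.dispose();
        return out;
    }

    // The suggested replacement with explicit destination and source corners.
    static BufferedImage viaSourceRect(BufferedImage image) {
        BufferedImage out = new BufferedImage(64, 64, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        g.drawImage(image, 0, 12, 20, 12 + 20, 0, 0, 20, 20, null);
        g.dispose();
        return out;
    }

    static boolean samePixels(BufferedImage a, BufferedImage b) {
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BufferedImage image = makeImage();
        System.out.println("same pixels: " + samePixels(viaSubimage(image), viaSourceRect(image)));
    }
}
```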

Or fix this bug, as well as the whole getScaledInstance mess.
Why can’t these methods be implemented just by doing what you describe - creating a new BufferedImage, copying/scaling the stuff and returning it?

lg Clemens

Because of backward compatibility.

Currently both parent image and subimage share single data buffer.
So any changes you make to any of them is reflected in another.
If we do what you suggest this will be broken.

Dmitri
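The sharing described above is easy to observe. This is a minimal sketch with made-up names and sizes: a pixel written through the subimage shows up in the parent at the offset position.

```java
import java.awt.image.BufferedImage;

// Hypothetical demo of parent and subimage sharing one data buffer.
class SharedBufferDemo {

    // Writes through the subimage and reads the matching pixel from the parent.
    static int parentPixelAfterSubimageWrite() {
        BufferedImage parent = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);
        BufferedImage sub = parent.getSubimage(1, 1, 2, 2);
        sub.setRGB(0, 0, 0xFFFF0000);   // subimage (0, 0) is parent (1, 1)
        return parent.getRGB(1, 1);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(parentPixelAfterSubimageWrite()));
    }
}
```

A getSubimage that returned an independent copy would leave the parent pixel untouched here, which is the compatibility break mentioned above.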

That’s an interesting bit of info!
getScaledInstance makes no mention of it sharing the parent image’s data buffer!

It’d be nice for the javadoc to:
a) document all the methods’ behaviour, and
b) mark them as deprecated regardless.