Suppose I have a BufferedImage with some pixels transparent and some opaque. Now, I've drawn this image into a larger BufferedImage that, before the operation, was completely transparent. So now I have two pictures with the same number of opaque pixels, but a different number of transparent ones. Will this affect the speed of the rendering, and if it does, how much?
[quote]two pictures with the same number of opaque pixels, but a different number of transparent ones.[/quote]
???
Yes. The one with more pixels obviously has larger dimensions…
OK, only image size will affect the speed of the rendering, so the larger image will be slower.
But it seems so logical that if the image is of type BITMASK, you can speed up the drawing by creating an array of coordinates containing the coordinates of only the opaque pixels; when you draw, you iterate through this array, thus skipping the check of whether the pixels are transparent or not. That way, the only thing that affects the rendering speed is the number of opaque pixels…
Well, blitting is not done this way, otherwise it would be terribly slow.
lg Clemens
[quote]Well, blitting is not done this way, otherwise it would be terribly slow.[/quote]
Actually, it's very similar to the way we 'accelerate' 1-bit transparent images on X11.
We build a bitmask Pixmap from the image, and then use X11 to copy the original image
(also stored in a Pixmap) to the screen with the transparent pixels masked out using
the bitmask Pixmap. (Check out j2se/src/solaris/native/sun/java2d/x11/X11PMBlitLoops.c and X11CachingSurfaceManager.c
in the same directory, and j2se/src/solaris/classes/sun/java2d/x11/X11CachingSurfaceManager.java
in the mustang j2se workspace.)
Something similar is done on Windows (for accelerating 1-bit transparent images with ddraw),
but for different reasons: there we try to find an unused color so it can be used as a color key
for the ddraw Blt call (see j2se/src/windows/classes/sun/java2d/windows/WinCachingSurfaceManager.java).
Obviously, we don’t do this on every blit call, only if the image has changed since the
last time we processed it.
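For illustration only, here's a naive sketch (mine, not the actual WinCachingSurfaceManager logic) of the "find an unused color" step: pick an RGB value that no opaque pixel uses, so the transparent pixels can be painted with it and keyed out by the hardware blit.

import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

class ColorKeyFinder {
    // Hypothetical helper: find an RGB value not used by any opaque pixel,
    // so the transparent pixels can be filled with it and keyed out by Blt.
    static int findUnusedColorKey(BufferedImage img) {
        Set<Integer> used = new HashSet<Integer>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int argb = img.getRGB(x, y);
                if ((argb >>> 24) != 0) {       // opaque pixel
                    used.add(argb & 0xFFFFFF);  // remember its RGB
                }
            }
        }
        // Naive scan of the 24-bit RGB space for a value that never appears.
        for (int rgb = 0; rgb <= 0xFFFFFF; rgb++) {
            if (!used.contains(rgb)) {
                return rgb;
            }
        }
        return -1; // every color is in use; color keying is not possible
    }
}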
But to (try to) answer the original question: whether the number of truly translucent pixels in an image
(or percentage of those) matters.
Let's first consider a pure software blit case (from a translucent BufferedImage to another,
opaque BufferedImage).
Basically, for every translucent pixel in the source image we’ll have to read the destination pixel to
compute the resulting pixel. So the more translucent pixels there are in the image, the more we’ll have to
read from the dest. surface. The opaque pixels are just copied to the destination (I’m assuming SrcOver
compositing mode here).
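As a rough sketch of what such a software loop does per pixel (my illustration over INT_ARGB int arrays, not the actual Java2D loop code, assuming an opaque destination and ignoring rounding):

class SrcOverBlit {
    // SrcOver composite of src onto an opaque dst, both INT_ARGB pixel arrays.
    static void blit(int[] src, int[] dst, int n) {
        for (int i = 0; i < n; i++) {
            int s = src[i];
            int a = s >>> 24;
            if (a == 255) {
                dst[i] = s;          // opaque: plain store, no destination read
            } else if (a != 0) {
                int d = dst[i];      // translucent: must READ the destination
                int r = ((s >> 16 & 0xFF) * a + (d >> 16 & 0xFF) * (255 - a)) / 255;
                int g = ((s >> 8 & 0xFF) * a + (d >> 8 & 0xFF) * (255 - a)) / 255;
                int b = ((s & 0xFF) * a + (d & 0xFF) * (255 - a)) / 255;
                dst[i] = 0xFF000000 | (r << 16) | (g << 8) | b;
            }                        // a == 0: fully transparent, skip
        }
    }
}

The dst[i] read in the translucent branch is exactly the "read from the dest. surface" cost described above.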
Now if your destination surface is in video memory, those pixels we read will have to be
pulled from VRAM, which is slow. So the more translucent pixels there are in the source,
the slower it will be.
Of course, in general, the more pixels you have to process the more time it takes
to copy an image.
In some cases, however, due to hardware or native API oddities, copying smaller images
wouldn't be much faster than copying larger images (for example, it takes a while to
set up a ddraw blit, so if your image is small, the overhead is quite large).
The story is a bit more complicated with our ddraw pipeline acceleration on Windows because of
our punting mechanism (if we detect that we have to read too much from the destination surface in
VRAM, we punt it to system memory). But in this case we don't count the number of pixels for which
we had to read from the destination surface; we take a rough approximation - the region that has to be
locked in order to work with the pixels. If the area of that region is larger than a certain percentage of the
destination surface, the destination surface is punted to system memory.
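In pseudo-Java the heuristic is roughly this (the names and the threshold value are my invention, not the actual pipeline code):

class PuntHeuristic {
    static final double PUNT_THRESHOLD = 0.25; // hypothetical fraction

    // Punt the VRAM destination to system memory if the region we'd have
    // to lock for software compositing covers too much of the surface.
    static boolean shouldPunt(int lockW, int lockH, int destW, int destH) {
        long lockedArea = (long) lockW * lockH;
        long destArea = (long) destW * destH;
        return lockedArea > destArea * PUNT_THRESHOLD;
    }
}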
This means that if you have two images, a small one and a large one, both containing
translucent pixels, we’re more likely to punt the destination surface to the system memory
if you copy the larger image.
Once the dest. surface (which is typically a back-buffer) is in system memory, all compositing
operations become fast, but copying from this surface to the screen becomes slower.
I hope you’re thoroughly confused by now.
Thanks,
Dmitri
Java2D Team
Thanks for a great answer! I understand much more now, but I haven't gotten my answer - you answered what happens with a translucent pixel (one that has an alpha greater than zero but smaller than one), i.e. in an image created with Transparency.TRANSLUCENT, while I asked about transparent pixels, those that have a zero alpha value and are stored in an image created with Transparency.BITMASK.
To make it more clear, here's the original question, after a couple of changes:
If I have two pictures of different sizes, both created with Transparency.BITMASK, that have the same number of opaque pixels, will they draw at the same speed (more or less)?
Thanks, Noam
I am really stupid - I did not understand your post the right way when I read it and answered.
I interpreted the reply as "you may omit reading of pixels from the src image if the blitted pixel is opaque" - sorry.
Please forgive me.
lg Clemens

[quote]If I have two pictures of different sizes, both created with Transparency.BITMASK, that have the same number of opaque pixels, will they draw at the same speed (more or less)?
Thanks, Noam[/quote]
If we're talking about a BufferedImage to BufferedImage copy, copying a larger image will be
slower. We still have to walk through all the source pixels and test whether they're transparent or not.
The inner loop looks something like this (as you can imagine), if you pardon my French:
for all pixels in src image do
    fetch a pixel from src image
    if pixel is transparent then go to next pixel
    else store the pixel into the destination surface
done
So one would expect that since a larger image has more pixels,
we’ll need to do more checks (which also involves memory access
to get the source pixel).
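A minimal runnable version of that loop (my sketch over INT_ARGB int arrays, not the actual blit code) makes the cost visible - both the fetch and the test happen once per source pixel, no matter how many of those pixels turn out to be transparent:

class BitmaskBlit {
    // Copy only the opaque pixels of src into dst (both INT_ARGB arrays).
    // For a BITMASK image alpha is either 0 or 255, so one test suffices.
    static void blit(int[] src, int[] dst, int n) {
        for (int i = 0; i < n; i++) {
            int s = src[i];            // memory access for every source pixel
            if ((s >>> 24) != 0) {     // transparency test for every source pixel
                dst[i] = s;
            }
        }
    }
}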
If we're talking about a hardware copy VRAM -> VRAM (like in the case of ddraw, or
opengl/d3d), then it depends. I mean, for fast hardware there shouldn't be
too much of a difference (within reasonable limits).
It'd be interesting to know what you are trying to optimize.
Thanks,
Dmitri
Java2D Team
I still don't understand why, when it's a BufferedImage created with BITMASK, you can't use the following method:
you create a two-dimensional array of integers that stores the coordinates of only the opaque pixels; then the initialization code will look something like this:
for all pixels in src image do
    if pixel is opaque then add the pixel's coordinates to opaqueArray
    else go to next pixel
done
and the draw loop will look like this:
for all coordinates in opaqueArray do
    fetch a coordinate from opaqueArray
    get the pixel at that coordinate from src image
    set the dst image's pixel at that coordinate to the src pixel color
done
That way you don't need to check whether the pixels are transparent on every draw loop.
To your question: the question came up when I wanted to add some drawing to an already large pic, but the drawing is very big in one dimension and very small in the other, so it adds a lot of transparent pixels.
I do not want to tell you again something which is not true, but as far as I can imagine, for buffered-image -> buffered-image blits the first variant would be:
- check if pixel is opaque (at least a compare/jump operation, maybe also masking)
- put it into the opaque array
- go through the opaque array and paint
- …
For today's processors such operations are a lot slower than just doing the same thing (load src and dest, multiply src over dest, and write the result) over and over.
With SIMD instruction sets like SSE2 or MMX it's even possible to work on 2 or 4 values at the same time, whereas it's quite hard to achieve the same with code that has tons of conditional branches in it. One drawback is of course RAM bandwidth consumption.
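To make the branch argument concrete, here's a sketch (mine, hedged) of a branch-free SrcOver blend: every pixel goes through identical arithmetic, with opaque (a = 255) and fully transparent (a = 0) pixels falling out of the same formula, which is the kind of uniform loop body that SIMD units and deep pipelines handle well:

class BranchlessBlend {
    // Blend every src pixel over an opaque dst with no per-pixel branch.
    static void blend(int[] src, int[] dst, int n) {
        for (int i = 0; i < n; i++) {
            int s = src[i], d = dst[i];
            int a = s >>> 24;   // 255 -> result = src, 0 -> result = dst
            int r = ((s >> 16 & 0xFF) * a + (d >> 16 & 0xFF) * (255 - a)) / 255;
            int g = ((s >> 8 & 0xFF) * a + (d >> 8 & 0xFF) * (255 - a)) / 255;
            int b = ((s & 0xFF) * a + (d & 0xFF) * (255 - a)) / 255;
            dst[i] = 0xFF000000 | (r << 16) | (g << 8) | b;
        }
    }
}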
For SW -> surface blits it's again another story, since hardware only works efficiently if you say "put this bunch of memory onto the surface". I guess for transparent accelerated images the mask is cached anyway, but it's an image mask and not an array.
lg Clemens

[quote]I do not want to tell you again something which is not true, but as far as I can imagine, for buffered-image -> buffered-image blits the first variant would be:
- check if pixel is opaque (at least a compare/jump operation, maybe also masking)
- put it into the opaque array
- go through the opaque array and paint
- …
For today's processors such operations are a lot slower than just doing the same[/quote]
That's not the thing I had in mind.
You put the opaque pixels into the opaqueArray at initialization, and then you just iterate through it every game loop, thus skipping the checking of the transparent pixels.
So, the order will be:
In initialization:
- check if pixel is opaque (at least a compare/jump operation, maybe also masking)
- put it into the opaque array
In the paint method:
- go through the opaque array and paint
Here's some code to demonstrate my intentions:
// assume: import java.awt.Color; and a List<Pixel> opaqueArray = new ArrayList<Pixel>();
class Pixel
{
    public Pixel(int x, int y, Color color) {
        this.x = x;
        this.y = y;
        this.color = color;
    }
    public Color color;
    public int x;
    public int y;
}

// (in ctor)
{
    for (int i = 0; i < width; i++)
    {
        for (int j = 0; j < height; j++)
        {
            if ((srcImage.getRGB(i, j) >>> 24) != 0) // pixel at (i,j) is opaque
            {
                opaqueArray.add(new Pixel(i, j, new Color(srcImage.getRGB(i, j))));
            }
        }
    }
}

// (in draw function)
{
    for (int i = 0; i < opaqueArray.size(); i++)
    {
        Pixel p = opaqueArray.get(i);
        // no checking for alpha, because opaqueArray contains only opaque pixels
        dstImage.setRGB(p.x, p.y, p.color.getRGB());
    }
}
Of course, I'm sure this has no resemblance to the real code; it's just to clarify the idea I thought of.
Well, I understood it as it was meant (at least I think so after reading your answer post).
For Software -> Software blits (buffered-img to buffered-img), only calculating alpha for non-opaque/translucent pixels would be more expensive than iterating over all of them and using SIMD extensions, plus maybe the 2-3 pipelines modern processors offer, if more than x/y of the pixels are opaque. This would maybe make sense for images that have a lot of (> 85%) opaque pixels; however, you can only find that out by counting them, which is again extra work.
For Software -> VRAM blits (which happen when the image gets accelerated), that is what happens: when the image becomes accelerated, a second mask image is generated, which stands for the thing you call the "opaque array", with the difference that it's supported by hardware.
lg Clemens
Maybe I'm not understanding you, but this line:

[quote]only calculating alpha for non-opaque/translucent pixels would be more expensive[/quote]
makes me think that you don't understand me, because you talk about calculating alpha, though my method only checks the alpha at initialization, and after that it just iterates through the opaque pixels (it's a BITMASK image, so it doesn't have anything other than opaque and completely transparent pixels).
Hmm, maybe I expressed myself in a complicated / not 100% right way.
But this would only help in very few cases, with only a few (I guess less than 10%) opaque pixels - but that's just a guess; it would be really great to see some benchmarks. Another question is how far both concepts could be optimized for highest performance…
lg Clemens

[quote]But this would only help in very few cases, with only a few (I guess less than 10%) opaque pixels[/quote]
Sorry, but I really want to get to the bottom of this: why wouldn't it help in other cases? It gets rid of some alpha checking and doesn't add new calculations, so I think it can only speed up the code, not slow it down. Of course, it'll only work for software rendering of BITMASK images, but that still is a lot, isn't it?
First of all, your Pixel class would use at least 60 bytes of memory per pixel, so I hope it was only pseudocode, or maybe you were joking. What you are saying is that you want the image stored in some kind of Run Length Encoded (RLE) format. That could speed up rendering if there were a lot of non-opaque pixels.
But the reason why you can't do it that way is that you need fast direct access to all the pixels in the image, so that you can scale, rotate, clip, and filter the image. The RLE image can only be used to draw the image as it is.
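For illustration (my own sketch, not anything Java2D actually does): here's what such an RLE-style structure could look like. Per row it stores runs of consecutive opaque pixels and blits them with System.arraycopy, with no per-pixel alpha test in the draw loop - but note it can only do a plain 1:1 copy:

import java.util.ArrayList;
import java.util.List;

class OpaqueRuns {
    final int[][] runs;  // runs[y] = {startX, length, startX, length, ...}
    final int[] pixels;  // source INT_ARGB pixels, row-major, width * height
    final int width;

    OpaqueRuns(int[] pixels, int width, int height) {
        this.pixels = pixels;
        this.width = width;
        this.runs = new int[height][];
        for (int y = 0; y < height; y++) {
            List<Integer> row = new ArrayList<Integer>();
            for (int x = 0; x < width; ) {
                if ((pixels[y * width + x] >>> 24) == 0) { x++; continue; }
                int start = x;
                while (x < width && (pixels[y * width + x] >>> 24) != 0) x++;
                row.add(start);
                row.add(x - start);  // one run of consecutive opaque pixels
            }
            runs[y] = new int[row.size()];
            for (int i = 0; i < row.size(); i++) runs[y][i] = row.get(i);
        }
    }

    // Unscaled, unclipped blit of the opaque runs into dst at (dx, dy).
    void blitTo(int[] dst, int dstWidth, int dx, int dy) {
        for (int y = 0; y < runs.length; y++) {
            int[] row = runs[y];
            for (int i = 0; i < row.length; i += 2) {
                System.arraycopy(pixels, y * width + row[i],
                                 dst, (dy + y) * dstWidth + dx + row[i],
                                 row[i + 1]);
            }
        }
    }
}

As soon as you need to scale, rotate, or clip, you're back to needing random access to arbitrary (x, y) pixels, which is the point above.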

[quote]for all pixels in src image do
    fetch a pixel from src image
    if pixel is transparent then go to next pixel
    else store the pixel into the destination surface
done[/quote]
I'm sure you mean "for all pixels in the destination rectangle do". You do not want to iterate the src image when it is scaled or clipped.
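A sketch of why the loop runs over the destination (again my own illustration, assuming nearest-neighbor scaling over INT_ARGB arrays):

class ScaledBlit {
    // Iterate destination pixels and sample the source; scaling and clipping
    // fall out naturally, which a source-driven loop (or an RLE image) can't do.
    static void blit(int[] src, int srcW, int srcH,
                     int[] dst, int dstW, int dstH) {
        for (int dy = 0; dy < dstH; dy++) {
            for (int dx = 0; dx < dstW; dx++) {
                int sx = dx * srcW / dstW;   // nearest-neighbor source sample
                int sy = dy * srcH / dstH;
                int s = src[sy * srcW + sx];
                if ((s >>> 24) != 0) {       // skip transparent source pixels
                    dst[dy * dstW + dx] = s;
                }
            }
        }
    }
}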
OK, that's a good reason… thanks!