I’ve been writing a software renderer, but alpha blending doesn’t seem to be working properly…
My algorithm for Porter-Duff SrcOver:
void alphaBlend(int[] pixels, int offset, int source, int alpha) {
    int destRGB = pixels[offset];
    int destA = destRGB >>> 24;
    int destR = (destRGB >> 16) & 0xff;
    int destG = (destRGB >> 8) & 0xff;
    int destB = destRGB & 0xff;
    int srcA = source >>> 24;
    int srcR = (source >> 16) & 0xff;
    int srcG = (source >> 8) & 0xff;
    int srcB = source & 0xff;
    srcA *= alpha;
    srcR *= alpha;
    srcG *= alpha;
    srcB *= alpha;
    int oneMinusSrcA = 0xff - (srcA >> 8);
    destR = (srcR + destR * oneMinusSrcA) >> 8;
    destG = (srcG + destG * oneMinusSrcA) >> 8;
    destB = (srcB + destB * oneMinusSrcA) >> 8;
    destA = (srcA + destA * oneMinusSrcA) >> 8;
    //pixels[offset] = (destA << 24) | (destR << 16) | (destG << 8) | destB;
    pixels[offset] =
            ((destA << 24) | (destR << 16) | (destG << 8) | destB);
}
pixels = array of ARGB pixels
offset = position in the array of the pixel to blend
source = color in ARGB format to blend with the destination (pixels[offset])
alpha = extra alpha to take into account while blending (0-255)
The problem is that sometimes it doesn’t blend the alpha properly and instead just replaces the destination alpha with the source alpha. Can anyone see why this happens?
It happens when the source color has a very small alpha value, e.g. 1.
I’ve also written my own software renderer. Just had my first morning coffee so it’s a bit early for bitshifting - let’s hope this is right … ;D
This bit (multiplying by alpha) looks wrong. Assuming your alpha value is in the range 0…255, the result then needs to be shifted back down,
e.g.
srcA = (srcA * alpha) >> 8;
In my code I’m also doing the equivalent of
srcA = (srcA * (alpha + 1)) >> 8;
which seems to be more accurate.
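To show what the +1 buys you, here is a minimal sketch comparing the two shifts against a rounded division by 255. The class and method names are made up for illustration; only the two shift expressions come from the lines above.

```java
// Hypothetical demo class: compares two ways of scaling a 0..255 value by a 0..255 alpha.
final class AlphaScaleDemo {

    // Plain shift: treats alpha 255 as 255/256, so full alpha still darkens slightly.
    static int scaleShift(int value, int alpha) {
        return (value * alpha) >> 8;
    }

    // Multiplier biased by one: alpha 255 becomes 256/256, i.e. exactly 1.0.
    static int scalePlusOne(int value, int alpha) {
        return (value * (alpha + 1)) >> 8;
    }

    // Reference: division by 255 with rounding (slower, but the "true" answer).
    static int scaleExact(int value, int alpha) {
        return (value * alpha + 127) / 255;
    }

    public static void main(String[] args) {
        System.out.println(scaleShift(255, 255));   // 254
        System.out.println(scalePlusOne(255, 255)); // 255
        System.out.println(scaleExact(255, 255));   // 255
    }
}
```

With the plain shift a fully opaque value can never stay at 255, which is the kind of drift that shows up after repeated blends.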
The second issue may be that Porter-Duff algorithms expect pre-multiplied colour data. You haven’t mentioned whether the data is pre-multiplied or not (I’d recommend doing so, as it makes a lot of things easier for both you and the CPU).
If your data isn’t pre-multiplied then I’d also change the code above to be
These are probably still rounding errors caused by the bit shifting. You could try:
public static int blend(int src, int dest, int alpha) {
    return src + mult(dest, 0xFF - alpha);
}
public static int mult(int val, int multiplier) {
    return (val * (multiplier + 1)) >> 8;
}
i.e. force it to add 1 to the multiplier.
I haven’t been able to test it, but I think it might work. I should probably evaluate that fix for my own code too.
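One quick way to sanity-check this is to blend every possible source alpha over a fully opaque destination and see whether the result stays at 255. This is only a sketch: blend and mult are copied from above, while blendPlainShift and countOpaquePreserved are hypothetical helpers added for the comparison.

```java
// Hypothetical check: does an opaque destination alpha survive the blend?
final class BlendCheck {

    static int mult(int val, int multiplier) {
        return (val * (multiplier + 1)) >> 8;
    }

    static int blend(int src, int dest, int alpha) {
        return src + mult(dest, 0xFF - alpha);
    }

    // Same blend, but without adding 1 to the multiplier.
    static int blendPlainShift(int src, int dest, int alpha) {
        return src + ((dest * (0xFF - alpha)) >> 8);
    }

    // Counts how many source alphas (0..255) leave a destination alpha of 255 untouched.
    static int countOpaquePreserved(boolean plusOne) {
        int count = 0;
        for (int srcA = 0; srcA <= 255; srcA++) {
            int out = plusOne ? blend(srcA, 255, srcA) : blendPlainShift(srcA, 255, srcA);
            if (out == 255) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOpaquePreserved(true));  // 256
        System.out.println(countOpaquePreserved(false)); // 1
    }
}
```

Without the +1, only srcA == 255 keeps the destination opaque; every other value drops it to 254.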
Nice catch! Must have been a bit too quick on the copy, paste, replace there.
There are a few other potential optimisations too. The immediate one that comes to mind is to replace the 3 checks for alpha==255 with a single if statement. In fact, I’d done that in the first Add composite but none of the others. I must go back to this class and do some tidying up. I haven’t looked at it for a while and was just glad it worked at all!
Incidentally, what are you writing a software renderer for? A project of its own or something bigger?
And I know I mentioned it before, but I’d really recommend moving to premultiplied data throughout - it’s easier and better performing - trust me, I found out the hard way! ;D
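For what it’s worth, here is roughly why premultiplied data is easier. This is a sketch only, assuming ARGB ints and the same (val * (multiplier + 1)) >> 8 helper as before; the class and method names are invented for the example. Once both pixels are premultiplied, alpha and colour all use the identical formula.

```java
// Hypothetical sketch of SrcOver on premultiplied ARGB pixels.
final class PremultipliedOver {

    static int mult(int val, int multiplier) {
        return (val * (multiplier + 1)) >> 8;
    }

    // Converts a straight (non-premultiplied) ARGB pixel to premultiplied ARGB.
    static int premultiply(int argb) {
        int a = argb >>> 24;
        int r = mult((argb >> 16) & 0xff, a);
        int g = mult((argb >> 8) & 0xff, a);
        int b = mult(argb & 0xff, a);
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    // out = src + dest * (1 - srcA), applied to every channel including alpha.
    // Both arguments are assumed to be premultiplied already.
    static int srcOver(int src, int dest) {
        int invA = 0xff - (src >>> 24);
        int a = (src >>> 24) + mult(dest >>> 24, invA);
        int r = ((src >> 16) & 0xff) + mult((dest >> 16) & 0xff, invA);
        int g = ((src >> 8) & 0xff) + mult((dest >> 8) & 0xff, invA);
        int b = (src & 0xff) + mult(dest & 0xff, invA);
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        int src = premultiply(0x80FF0000);          // half-transparent red
        int out = srcOver(src, 0xFF0000FF);         // over opaque blue
        System.out.println(Integer.toHexString(out)); // ff80007f
    }
}
```

A fully transparent source leaves the destination untouched and a fully opaque source replaces it, with no special cases needed.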
I wasn’t happy with the performance of Java2D, especially in regards to colorizing images on the fly; therefore I had to devise something much faster.
Now that I have switched to a software renderer, I get the following benefits:
Able to use any blendmode I want for brushes (Add, Multiply, Over, Overlay, Screen, Darken, etc)
Able to save the state of any pixels I modify without doing a scan beforehand (VERY useful for undo states)
Able to colorize brush pixels on the fly using a simple OR operation: (source << 24) | color. Previously, in Java2D, I had to create a new image, get a graphics context, set the AlphaComposite, set the color and call drawLine(x, y, x, y) for each pixel I wanted to recolor. There was an INSANE speed increase in brush rendering once I made the switch.
Uses much less memory. Each brush image (with colors, etc.) took about 4 KB of memory before, and about 1000 would be created for a single stroke, which comes to 4 MB. My new software renderer uses a constant 400 KB per brush (for 100 sizes).
Overall, the software renderer is about 15x faster than Java2D and also supports blend modes, state saving, etc. It’s a win-win situation.
I might switch to TYPE_ARGB_PRE later if required, but I’ve actually tried it before and it resulted in weird artifacts when I zoomed in. I’ll save that for investigation another day.
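The colorizing trick mentioned above boils down to something like the sketch below. The names colorize, colorizeMask, coverage and rgb are made up for illustration, and the extra mask on rgb is my own assumption, added so a stray alpha byte in the colour can’t leak into the result.

```java
// Hypothetical sketch: turn an 8-bit brush mask value into a coloured ARGB pixel.
final class BrushColorize {

    // coverage: 0..255 mask value from the brush, rgb: colour as 0xRRGGBB.
    static int colorize(int coverage, int rgb) {
        return (coverage << 24) | (rgb & 0x00ffffff);
    }

    // Colours a whole mask at once; the mask itself is left untouched.
    static int[] colorizeMask(int[] mask, int rgb) {
        int[] out = new int[mask.length];
        for (int i = 0; i < mask.length; i++) {
            out[i] = colorize(mask[i], rgb);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(colorize(0x80, 0x336699))); // 80336699
    }
}
```

Because the mask only stores alpha, the same brush data can be reused for any colour without allocating a new image.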
BTW: Brush class is now:
package as.internal.unnatural;

import as.internal.UndoManager.UndoState;
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;
import java.util.Arrays;

/**
 *
 * @author David
 */
public class Brush {

    private int[][] brushData;
    private int maxPixelValue;
    public static int TypeRound = 1,
            TypeSquare = 2;

    public static Brush getSquareBrush(int count) {
        return new Brush(count, TypeSquare, false);
    }

    public static Brush getRoundBrush(int count) {
        return new Brush(count, TypeRound, true);
    }

    private Brush(int numSizes, int type, boolean antialias) {
        brushData = new int[numSizes][];
        for (int i = 1; i <= numSizes; i++) {
            BufferedImage brush = new BufferedImage(i, i, BufferedImage.TYPE_INT_ARGB);
            Graphics2D g = brush.createGraphics();
            if (antialias) {
                g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
                g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
            }
            g.setColor(Color.BLACK);
            g.setComposite(AlphaComposite.Src);
            if (type == TypeRound) {
                g.fillOval(0, 0, i, i);
            } else if (type == TypeSquare) {
                g.fillRect(0, 0, i, i);
            }
            g.dispose();
            int[] array = ((DataBufferInt) brush.getRaster().getDataBuffer()).getData();
            brushData[i - 1] = Arrays.copyOf(array, array.length);
            int[] data = brushData[i - 1];
            for (int j = 0; j < data.length; j++) {
                data[j] >>>= 24;
                if (data[j] > maxPixelValue) {
                    maxPixelValue = data[j];
                }
            }
            brush.flush();
        }
    }

    public Brush(BufferedImage image, int numSizes) {
        brushData = new int[numSizes][];
        for (int i = 1; i <= numSizes; i++) {
            BufferedImage brush = new BufferedImage(i, i, BufferedImage.TYPE_INT_ARGB);
            Graphics2D g = brush.createGraphics();
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(image, 0, 0, i, i, null);
            g.dispose();
            int[] array = ((DataBufferInt) brush.getRaster().getDataBuffer()).getData();
            brushData[i - 1] = Arrays.copyOf(array, array.length);
            int[] data = brushData[i - 1];
            for (int j = 0; j < data.length; j++) {
                data[j] >>>= 24;
                if (data[j] > maxPixelValue) {
                    maxPixelValue = data[j];
                }
            }
            brush.flush();
        }
    }

    public int getPixelMaxAlpha() {
        return maxPixelValue;
    }

    public int getMaxBrushSize() {
        return brushData.length + 1;
    }

    int clamp(int value) {
        return value > 255 ? 255 : value;
    }

    public static int premultiply(int rgbColor, int alpha) {
        if (alpha <= 0) {
            return 0;
        } else if (alpha >= 255) {
            return 0xff000000 | rgbColor;
        } else {
            int r = (rgbColor >> 16) & 0xff;
            int g = (rgbColor >> 8) & 0xff;
            int b = rgbColor & 0xff;
            r = (alpha * r + 127) / 255;
            g = (alpha * g + 127) / 255;
            b = (alpha * b + 127) / 255;
            return (alpha << 24) | (r << 16) | (g << 8) | b;
        }
    }

    public static int unpremultiply(int preARGBColor) {
        int a = preARGBColor >>> 24;
        if (a == 0) {
            return 0;
        } else if (a == 255) {
            return preARGBColor;
        } else {
            int r = (preARGBColor >> 16) & 0xff;
            int g = (preARGBColor >> 8) & 0xff;
            int b = preARGBColor & 0xff;
            // Clamp to 255: if rounding ever leaves a channel above a,
            // the result would otherwise spill into the next channel.
            r = Math.min(255, 255 * r / a);
            g = Math.min(255, 255 * g / a);
            b = Math.min(255, 255 * b / a);
            return (a << 24) | (r << 16) | (g << 8) | b;
        }
    }

    void alphaBlend(int[] pixels, int offset, int source, int alpha) {
        int destRGB = pixels[offset];
        int destA = destRGB >>> 24;
        //destRGB = premultiply(destRGB, destA);
        int destR = (destRGB >> 16) & 0xff;
        int destG = (destRGB >> 8) & 0xff;
        int destB = destRGB & 0xff;
        int srcA = source >>> 24;
        //source = premultiply(source, srcA);
        int srcR = (source >> 16) & 0xff;
        int srcG = (source >> 8) & 0xff;
        int srcB = source & 0xff;
        srcA *= alpha;
        srcR *= alpha;
        srcG *= alpha;
        srcB *= alpha;
        int oneMinusSrcA = 0xff - (srcA >> 8);
        destR = (srcR + destR * oneMinusSrcA) >> 8;
        destG = (srcG + destG * oneMinusSrcA) >> 8;
        destB = (srcB + destB * oneMinusSrcA) >> 8;
        destA = (srcA + destA * oneMinusSrcA) >> 8;
        //pixels[offset] = (destA << 24) | (destR << 16) | (destG << 8) | destB;
        pixels[offset] = //unpremultiply
                ((destA << 24) | (destR << 16) | (destG << 8) | destB);
    }

    /*thanks nsigma*/
    public static final int ALPHA_MASK = 0xff000000;
    public static final int RED_MASK = 0x00ff0000;
    public static final int GREEN_MASK = 0x0000ff00;
    public static final int BLUE_MASK = 0x000000ff;

    public static int premultiply(int argb) {
        int a = argb >>> 24;
        if (a == 0) {
            return 0;
        } else if (a == 255) {
            return argb;
        } else {
            return (a << 24) | multRGB(argb, a);
        }
    }

    public static int premultiplyEXT(int argb, int a) {
        if (a == 0) {
            return 0;
        } else if (a == 255) {
            return argb;
        } else {
            return (a << 24) | multRGB(argb, a);
        }
    }

    public static int multRGB(int src, int multiplier) {
        multiplier++;
        return ((src & RED_MASK) * multiplier) >> 8 & RED_MASK
                | ((src & GREEN_MASK) * multiplier) >> 8 & GREEN_MASK
                | ((src & BLUE_MASK) * multiplier) >> 8;
    }

    private void blendPixelEXT(int[] pixels, int offset, int srcPx, int alpha) {
        boolean alpha255 = alpha == 255;
        int destPx = pixels[offset];
        destPx = premultiplyEXT(destPx, (destPx) >>> 24);
        int srcA = srcPx >>> 24;
        srcPx = premultiplyEXT(srcPx, srcA);
        if (alpha255 == false) {
            srcA = mult(srcA, alpha);
        }
        int srcR = (alpha255) ? (srcPx & RED_MASK) >>> 16
                : mult((srcPx & RED_MASK) >>> 16, alpha);
        int srcG = (alpha255) ? (srcPx & GREEN_MASK) >>> 8
                : mult((srcPx & GREEN_MASK) >>> 8, alpha);
        int srcB = (alpha255) ? (srcPx & BLUE_MASK)
                : mult(srcPx & BLUE_MASK, alpha);
        int destA = (destPx) >>> 24;
        int destR = (destPx & RED_MASK) >>> 16;
        int destG = (destPx & GREEN_MASK) >>> 8;
        int destB = (destPx & BLUE_MASK);
        pixels[offset] = unpremultiply(blend(srcA, destA, srcA) << 24
                | blend(srcR, destR, srcA) << 16
                | blend(srcG, destG, srcA) << 8
                | blend(srcB, destB, srcA));
    }

    public static int blend(int src, int dest, int alpha) {
        return src + mult(dest, 0xFF - alpha);
    }

    public static int mult(int val, int multiplier) {
        return (val * (multiplier + 1)) >> 8;
    }

    public void drawBrush(int x, int y, int size, int alpha, int color, int[] pixels, int width, int height, UndoState state) {
        if (alpha == 0) {
            return;
        }
        int brushIndex = 0;
        int[] brushPixels = brushData[size - 1];
        for (int i = 0; i < size; i++) {
            final int ycoord = y + i;
            if (ycoord < 0 || ycoord >= height) {
                continue;
            }
            for (int j = 0; j < size; j++) {
                int source = (brushPixels[brushIndex++]);
                if (source > 0) {
                    final int xcoord = x + j;
                    if (xcoord < 0 || xcoord >= width) {
                        continue;
                    }
                    final int pos = ycoord * width + xcoord;
                    state.putPixel(pixels[pos], xcoord, ycoord);
                    blendPixelEXT(pixels, pos, (source << 24) | color, alpha);
                }
            }
        }
    }

    public void drawBrush(int x, int y, int size, int alpha, int color, UserLayer layer) {
        if (size <= 0) {
            return;
        }
        int brushIndex = 0;
        int[] brushPixels = brushData[size - 1];
        int limit = (alpha * maxPixelValue) / 255;
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                int source = (brushPixels[brushIndex++]);
                if (source > 0) {
                    final int xcoord = x + j, ycoord = y + i;
                    if (xcoord < 0 || ycoord < 0 || xcoord >= layer.width || ycoord >= layer.height) {
                        continue;
                    }
                    final int xcoordScaled = xcoord >> 5, ycoordScaled = ycoord >> 5;
                    UserLayer.Tile tile = layer.getTileForNoScaling(xcoordScaled, ycoordScaled);
                    int offset = ((ycoord - (ycoordScaled << 5)) << 5) + xcoord - (xcoordScaled << 5);
                    int destA = tile.pixels[offset] >>> 24;
                    destA = source + ((destA * (0xff - source)) >> 8);
                    destA = Math.min(limit, destA);
                    tile.pixels[offset] = color | (destA << 24);
                }
            }
        }
    }
}
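
In case the tile arithmetic in the second drawBrush looks cryptic: the >> 5 and << 5 imply 32 x 32 pixel tiles. The sketch below spells that out with masks instead of subtraction. The TileAddress class and its method names are hypothetical, and it assumes non-negative coordinates, which the bounds check in drawBrush already guarantees.

```java
// Hypothetical sketch of the 32 x 32 tile addressing used in drawBrush.
final class TileAddress {

    static final int TILE_SHIFT = 5;               // 2^5 = 32 pixels per tile side
    static final int TILE_MASK = (1 << TILE_SHIFT) - 1;

    // Which tile a pixel coordinate falls into.
    static int tileIndex(int coord) {
        return coord >> TILE_SHIFT;
    }

    // Position of the pixel inside its tile's 32 * 32 pixel array.
    static int offsetInTile(int x, int y) {
        return ((y & TILE_MASK) << TILE_SHIFT) | (x & TILE_MASK);
    }

    // The expression as written in drawBrush, kept for comparison.
    static int offsetOriginal(int x, int y) {
        int xs = x >> 5, ys = y >> 5;
        return ((y - (ys << 5)) << 5) + x - (xs << 5);
    }

    public static void main(String[] args) {
        System.out.println(tileIndex(70));           // 2
        System.out.println(offsetInTile(37, 70));    // 197
        System.out.println(offsetOriginal(37, 70));  // 197
    }
}
```

Both forms give the same offset; the masked version just makes the tile size explicit in one place.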