I think you’re getting wrong results because you do the matrix multiplication the wrong way around. Remember that matA * matB != matB * matA. However, I’ve been thinking about this, and I think the whole thing can be simplified.
What we really want to do is rotate the samples around the Z-axis. For the raw sample offsets, this just means rotating the XY coordinates and leaving Z intact. Such a rotation matrix is much easier to construct:
float angle = rand(texCoords) * PI2;
float s = sin(angle);
float c = cos(angle);
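//GLSL fills a mat3 column by column, so as written this rotates by -angle.
//That is harmless here because the angle is random anyway.
mat3 rotation = mat3(
    c,   -s,  0.0,
    s,   c,   0.0,
    0.0, 0.0, 1.0
);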
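//Matrix multiplication is associative, so
//kernelMatrix * (rotation * samplePosition) == (kernelMatrix * rotation) * samplePosition
//and the two matrices only need to be combined once, not once per sample.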
mat3 finalRotation = kernelMatrix * rotation;
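In case it helps, here is a rough sketch of how the combined matrix could then be applied to each sample. The names KERNEL_SIZE, kernel, radius, fragPos and computeSamplePositions are made up for illustration and are not from your shader. I'm also assuming kernelMatrix is the matrix that orients the kernel along the surface normal.

```glsl
#define KERNEL_SIZE 16              // hypothetical kernel size
uniform vec3 kernel[KERNEL_SIZE];   // hypothetical: raw sample offsets, Z pointing away from the surface
uniform float radius;               // hypothetical: sampling radius

// Writes the rotated and oriented sample positions around fragPos.
// finalRotation is kernelMatrix * rotation, computed once per fragment.
void computeSamplePositions(vec3 fragPos, mat3 finalRotation, out vec3 positions[KERNEL_SIZE]) {
    for (int i = 0; i < KERNEL_SIZE; i++) {
        // A single mat3 * vec3 per sample: the Z-rotation and the
        // kernel orientation are already folded into one matrix.
        positions[i] = fragPos + (finalRotation * kernel[i]) * radius;
    }
}
```

The point is that the loop body contains only one matrix-vector product, no matter how the rotation was built.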
This should be faster and easier to get right!