Difference in sampled audio stream buffer wait for write between platforms?

I’ve been working on a sound-based game that’s using the sampled audio stream write(byte[]) call as the timer, since the JavaDoc says that the write command will wait until the buffer has sufficient capacity before writing the data and returning.
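Roughly the pattern I mean (a minimal sketch, not my actual game code; fillNextChunk and updateGameAndRender are placeholders, and the audio format is just an example):

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

public class WriteAsTimer {
    public static void main(String[] args) throws LineUnavailableException {
        // Illustrative format: 44.1 kHz, 16-bit, mono, signed, little-endian.
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();

        byte[] chunk = new byte[1470 * format.getFrameSize()]; // one 30 fps video frame's worth of audio
        while (true) {
            fillNextChunk(chunk);                // placeholder: synthesize the next slice of sound
            line.write(chunk, 0, chunk.length);  // blocks until the line has room, pacing the loop
            updateGameAndRender();               // placeholder: one game/video tick per audio chunk
        }
    }

    private static void fillNextChunk(byte[] chunk) { /* game-specific synthesis */ }
    private static void updateGameAndRender() { /* game-specific update/draw */ }
}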

On Windows JDK 1.5 and 1.6, this method allows for smooth time increments between each write. However, on Linux and Mac OS X (more apparent on Linux), the intervals between the writes seem to fluctuate, making the video appear to stutter (the audio playback doesn’t have any clear interruption, though).

An earlier post references the problem with playing both MIDI and sampled sound, but that’s not what I’m dealing with here. Is the source of the issue the JDK implementation for the platform, or is there something in the Java Sound setup that I can do to make the timing more reliable? Thanks.

I think I’ve resolved the issue with the stuttering - I was invoking the SourceDataLine open() call, instead of the open(AudioFormat, int) call.
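For reference, the change was essentially this (a sketch; the format and the 8x buffer size are just what I happen to be using):

// Before (the bug): outputLine.open();  // the implementation picks an arbitrary buffer size
// After: request a specific buffer so write() blocks at a predictable granularity.
AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
SourceDataLine outputLine = AudioSystem.getSourceDataLine(format);
int bytesPerFrame = 1470 * format.getFrameSize();  // one video frame's worth of audio (illustrative)
outputLine.open(format, bytesPerFrame * 8);        // requested buffer size, in bytes
outputLine.start();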

However, now I’m working through an issue with the delay between when the audio is emptied from the sound buffer and when it’s actually heard. Linux seems to have about a 100ms longer delay than Windows. Is there a technique to discover the difference via the sound API?

Doubtful. It also probably varies on different hardware. :-\

Java Sound is really a piece of crap. Especially on OSX, where it barely works at all and often crashes for no reason.

FWIW, in SingSong I synchronize on the playback time for gameplay. I found on XP the playback time is very accurate, but on Vista, 7, and Linux the playback time does not update very often. I ended up having to interpolate between playback time changes using the system clock. It was a bit tricky, but it worked out well and I have smooth scrolling based on the music on all systems. OSX uses QuickTime.
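The interpolation is roughly this (a sketch from memory, not SingSong’s actual code):

import javax.sound.sampled.SourceDataLine;

/** Smooths over platforms where getMicrosecondPosition() only updates in coarse steps. */
class PlaybackClock {
    private long lastReported;      // last value seen from getMicrosecondPosition()
    private long lastReportedNanos; // System.nanoTime() when that value last changed

    long positionMicros(SourceDataLine line) {
        long reported = line.getMicrosecondPosition();
        long now = System.nanoTime();
        if (reported != lastReported) {
            lastReported = reported;     // fresh value from the mixer, so resync
            lastReportedNanos = now;
            return reported;
        }
        // No update yet: extrapolate from the wall clock since the last change.
        return lastReported + (now - lastReportedNanos) / 1000L;
    }
}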

Thanks for the response. I’ve been finding through some research that this audio latency can get pretty bad. My local Linux box has nearly a second of latency between when the audio is flushed from the Java Sound audio buffer and when it’s actually played through the speakers. I’ve also found that Windows XP has really low latency.

I’m able to have the audio and video sync up fairly well now, but I can’t figure out a way to automatically account for the audio delay. Right now, I’m just dealing with it by having the user specify the “fudge factor” to account for the latency. I’ll see what I can dig up for a better solution, though.

YMMV but it may be the case that you get more reliable results with LWJGL’s OpenAL binding. I can’t speak for JOAL but LWJGL uses a software OpenAL implementation that’s consistent across platforms.
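Roughly what that looks like with LWJGL 2’s OpenAL binding (a rough sketch; the PCM buffer and format are illustrative, and AL_SEC_OFFSET needs an OpenAL 1.1 implementation):

import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import org.lwjgl.BufferUtils;
import org.lwjgl.openal.AL;
import org.lwjgl.openal.AL10;
import org.lwjgl.openal.AL11;

AL.create();                                              // initialise the (software) OpenAL device

IntBuffer buffers = BufferUtils.createIntBuffer(1);
AL10.alGenBuffers(buffers);
ByteBuffer pcm = BufferUtils.createByteBuffer(44100 * 2); // 1 second of 16-bit mono (silence here)
AL10.alBufferData(buffers.get(0), AL10.AL_FORMAT_MONO16, pcm, 44100);

IntBuffer sources = BufferUtils.createIntBuffer(1);
AL10.alGenSources(sources);
AL10.alSourcei(sources.get(0), AL10.AL_BUFFER, buffers.get(0));
AL10.alSourcePlay(sources.get(0));

// Playback position in seconds, which you can use to drive the visuals:
float secs = AL10.alGetSourcef(sources.get(0), AL11.AL_SEC_OFFSET);

// ...and when you're done:
AL.destroy();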

Cas :)

What do you mean by the bold part? When you write audio data, you can’t know when it is going to be played. Audio is very sensitive to missed frames. Your data is going to get buffered to give you time to write more data without missing a frame. The closest you can get to knowing what audio is currently being rendered is SourceDataLine#getMicrosecondPosition(), not tracking what data you’ve sent to the buffer.
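For example (a sketch; FRAMES_PER_SECOND and renderFrame stand in for whatever your game uses):

// Drive the visuals from what the mixer reports as played, not from what you've written.
long playedMicros = outputLine.getMicrosecondPosition();
int currentVideoFrame = (int) (playedMicros * FRAMES_PER_SECOND / 1000000L);
renderFrame(currentVideoFrame);   // placeholder: draw whatever matches this point in the song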

You mean the Java bindings for QuickTime? You might want to check if it still works in 10.6 in that case, as I think they removed those bindings!

Edit: this is from Apple’s java-dev mailing list:

[i]"QuickTime for Java is basically a set of bindings to the legacy QuickTime API, which is also not present in in 64-bit (only QTKit is available in 64-bit). The legacy QuickTime API depends on many aspects of Carbon, and the Java bindings often passed handles as int’s, which obviously would never support 64-bit pointers.

The request number mentioned above is not-to-be-fixed, because it’s not a simple matter of “just recompiling”, because the Quicktime framework that the Java bindings are linking against simply doesn’t exist in 64-bit."[/i]

I’m able to determine approximately what’s being played by knowing the size of the buffer and the amount of space currently available in it; the difference tells me how much of the data I’ve written has already been removed from the buffer. The buffer will (should) only be emptied at a rate related to the sampling rate of the data line, which is why line.write(byte[]) blocks.
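The bookkeeping looks roughly like this (a sketch; totalBytesWritten is a counter I maintain myself as I call write()):

// Bytes still sitting in the Java Sound buffer, waiting to be consumed by the mixer:
int queuedBytes = outputLine.getBufferSize() - outputLine.available();
// Bytes that have already been pulled out of the buffer (not necessarily heard yet):
long consumedBytes = totalBytesWritten - queuedBytes;
double consumedSeconds = consumedBytes
        / (double) (format.getFrameSize() * format.getFrameRate());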

Thanks Mr. Gol. Installing 10.6 and testing SingSong has been on my todo list. I hope there is a solution for using QuickTime from Java because Java Sound is junk!

Groboclown, I am guessing getMicrosecondPosition will be based on the frames the hardware has rendered and will be the easiest and most accurate method.

I attempted this on Windows XP, Linux, and OS X Snow Leopard. On each of these, I ran essentially the following code, with a buffer size of 8 times the frame size:

long startTime = outputLine.getMicrosecondPosition();
for (int i = 0; i < FRAMES_PER_SECOND * 2; i++) {
    outputLine.write(buffer, 0, BYTES_PER_FRAME);  // write 2 seconds' worth of audio, one frame at a time
}
System.out.println(outputLine.getMicrosecondPosition() - startTime);

On each of these systems, the reported microsecond delta is extremely close to 2 seconds after outputting 2 seconds’ worth of audio data. getLongFramePosition() gives roughly the same results. However, empirical tests show the latency to be more like:

Buffer Size: 0.26 seconds
Windows XP: 0.13 seconds
Linux: 0.47 seconds
OS X: 0.37 seconds

As the Windows XP result is below the audio buffer size, it makes me think that Java Sound is accessing the audio driver directly there, while the others, being larger than my allocated buffer size, go through a buffer separate from the audio driver.

Perhaps I’m missing something in my analysis here. Does this look about right?

When you open a SourceDataLine, the buffer size is only a suggestion. If I understand correctly, your test gives you a vague idea of the size of that buffer, whether it’s in software or hardware. However, what can you do with this information?

Why worry about when the write method returns? That doesn’t give you accurate information about what the hardware has rendered. Just hand your data to the SourceDataLine as fast as it can take it and use getMicrosecondPosition to determine the audio’s playback position.
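Something like this, in other words (a sketch; playing, fillChunk, and drawFrameForTime stand in for your game code):

// Feeder: keep the SourceDataLine as full as possible; write() blocks when there's no room.
byte[] chunk = new byte[BYTES_PER_FRAME];
while (playing) {
    fillChunk(chunk);                          // placeholder: produce the next chunk of audio
    outputLine.write(chunk, 0, chunk.length);  // blocks only when the buffer is already full
}

// Renderer (e.g. on another thread): draw whatever matches the reported playback position.
while (playing) {
    long micros = outputLine.getMicrosecondPosition();
    drawFrameForTime(micros);                  // placeholder: pick and show the frame for this time
}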

My main concern with knowing when the write method returns is that I need to synchronize the video and audio output; if I generate the output data too fast, actions in the video can get out of sync with the expected audio output position.

I attempted to use the getMicrosecondPosition method, with mixed results. By waiting from one getMicrosecondPosition call to the next, I can get the video to sync up with the audio, but because I’m not able to construct the audio far enough ahead, the audio starts stuttering due to buffer underruns. I’ll do some more investigation to see if I can come up with another alternative.

Thanks for all your help!

You can’t drop audio frames without it being noticeable, so I would just pump more data to the SourceDataLine whenever it can accept it. You can drop frames in video without it being too noticeable, so maybe you can show video frames for a shorter or longer period based on the audio position. Ideally you’d do it smoothly: e.g. if the video is behind, instead of just dropping video frames to get to the right place for the audio, you’d show frames for less time than usual. That way it catches up over the course of maybe 10 frames (or whatever looks good most of the time) and the viewer never even realizes there was an issue. This also works for slowing down the video.
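In rough code, the catch-up idea is something like this (a sketch; MICROS_PER_VIDEO_FRAME, currentVideoFrame, and showFrame are placeholders, and dividing by 10 is just a guess at what looks smooth):

// Decide how long to show the next video frame, nudging the video toward the audio clock.
long audioMicros = outputLine.getMicrosecondPosition();        // where the audio actually is
long videoMicros = currentVideoFrame * MICROS_PER_VIDEO_FRAME; // where the video thinks it is
long error = audioMicros - videoMicros;                        // positive means the video is behind

// Spread the correction over ~10 frames instead of jumping, so the viewer never notices.
long frameDuration = MICROS_PER_VIDEO_FRAME - error / 10;      // shorter if behind, longer if ahead
frameDuration = Math.max(frameDuration, MICROS_PER_VIDEO_FRAME / 2); // never go too extreme
showFrame(currentVideoFrame++, frameDuration);                 // placeholder: present for that long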

Sounds like you are doing something pretty cool. This is a game? Mind if I ask what you are using for video? I looked into video with Java a long time ago and found it was pretty difficult. I am hoping Java 7 will improve things.

I’m working on a Guitar Hero-style 4k game, which limits my ability to do many fancy things with video/audio synchronization. It currently works well except for the audio latency; I’m attempting to use the write() method as my timer. There are still a few tricks I’m planning to try before falling back on letting the user decide the latency amount.