8 kHz mono sound real-time implementation

Hello (my apologies if this post went to the wrong sub-forum; I'm not sure whether my problems are networking or sound-API related),

I’m currently building a kind of real-time “echo laboratory”. I’m having a tough time making the part of my code that uses JavaSound + plain old sockets meet the requirements, e.g. a buffer size of approximately 80 bytes at an 8 kHz sample rate (mono, 16-bit PCM).

What I’m trying to achieve is basically:
“rec mic” + “network send” + “network receive” + “playback on loudspeaker” <= 20 ms latency (<10 ms if possible would be great!)*
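To put the 80-byte buffer and the 20 ms budget on the same scale, here is a minimal sketch of the byte/millisecond conversion for linear PCM. The class and method names are hypothetical (not from the post); the numbers are the 8 kHz, 16-bit mono format described above.

```java
/** Converts between buffer sizes in bytes and durations in milliseconds for linear PCM. */
final class PcmBufferMath {
    private PcmBufferMath() {}

    /** Bytes needed to hold the given duration, rounded down to whole frames. */
    static int bytesForMillis(float sampleRate, int frameSizeBytes, int millis) {
        int frames = (int) (sampleRate * millis / 1000f);
        return frames * frameSizeBytes;
    }

    /** Duration in milliseconds represented by a buffer of the given size. */
    static double millisForBytes(float sampleRate, int frameSizeBytes, int bytes) {
        return 1000.0 * (bytes / frameSizeBytes) / sampleRate;
    }

    public static void main(String[] args) {
        float sampleRate = 8000f; // 8 kHz, as in the post
        int frameSize = 2;        // 16-bit mono = 2 bytes per frame
        System.out.println(millisForBytes(sampleRate, frameSize, 80)); // 5.0
        System.out.println(bytesForMillis(sampleRate, frameSize, 20)); // 320
    }
}
```

So an 80-byte buffer holds 5 ms of audio, and the whole 20 ms budget corresponds to 320 bytes spread over capture, network, and playback.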

So my questions to you gurus stalking :o this forum are:

  • Is the above scenario realistic to solve using Java with decent hardware and networking support? My guess is yes, otherwise I wouldn’t bother you guys - but still, if not…
  • I’m currently using SourceDataLine/TargetDataLine from the JavaSound API and have trouble with small buffer sizes (i.e. the low-latency requirement). Is it my lack of understanding of the API (hey, I only recently began using it :smiley: ), or should I consider alternative APIs? If so, any suggestions?
  • I’m currently using plain old sockets to do network communication (currently only “point-to-point”), and without any in-depth performance profiling that seems to work ok, i.e., mostly dependent on inherent network lags.
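One thing that may be worth checking with plain TCP sockets and very small payloads is Nagle's algorithm, which can hold back small writes; whether it is a factor in this setup is an assumption, not something the post establishes. The sketch below uses a hypothetical class name and a loopback connection as a stand-in for the real link: it sends one 80-byte frame with `TCP_NODELAY` enabled on both ends and reads the echo back.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical loopback test: one small frame out and back over TCP with Nagle disabled. */
final class NoDelayEcho {
    private NoDelayEcho() {}

    /** Sends the frame over a loopback TCP connection and returns the echoed copy. */
    static byte[] roundTrip(byte[] frame) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket near = new Socket(loopback, listener.getLocalPort());
             Socket far = listener.accept()) {
            // Disable Nagle's algorithm so small writes are sent immediately.
            near.setTcpNoDelay(true);
            far.setTcpNoDelay(true);

            // Near end sends the frame.
            OutputStream nearOut = near.getOutputStream();
            nearOut.write(frame);
            nearOut.flush();

            // Far end receives exactly one frame and echoes it back.
            byte[] received = new byte[frame.length];
            new DataInputStream(far.getInputStream()).readFully(received);
            OutputStream farOut = far.getOutputStream();
            farOut.write(received);
            farOut.flush();

            // Near end reads the echo.
            byte[] echoed = new byte[frame.length];
            new DataInputStream(near.getInputStream()).readFully(echoed);
            return echoed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] frame = new byte[80]; // 5 ms of 8 kHz 16-bit mono audio
        byte[] echoed = roundTrip(frame);
        System.out.println("echoed " + echoed.length + " bytes");
    }
}
```

Everything runs on one thread here only because a single 80-byte frame fits in the socket buffers; a real sender and receiver would sit on separate hosts and threads.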

Thanks in advance

* The final architecture will use an “in-lab network”, so network latency should not be a (big) problem (<< 10 ms round trip).

Network-wise (over the internet), a 10 ms round trip is a very hard requirement. I would also convert from 16-bit to 8-bit. Furthermore, Java has ZipOutputStream/ZipInputStream, which might (?) help reduce packet size: compress the data before sending it over the network.
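As an illustration of the 16-bit to 8-bit suggestion, here is a minimal sketch that keeps only the high byte of each sample. It assumes signed, little-endian 16-bit input; the class and method names are hypothetical. The conversion is lossy: it halves the payload but discards the low 8 bits of every sample.

```java
/** Hypothetical helper: reduces signed 16-bit little-endian PCM to signed 8-bit PCM. */
final class PcmDownconvert {
    private PcmDownconvert() {}

    /** Keeps the high (most significant) byte of each 16-bit sample. */
    static byte[] to8Bit(byte[] pcm16le) {
        if (pcm16le.length % 2 != 0) {
            throw new IllegalArgumentException("expected whole 16-bit samples");
        }
        byte[] out = new byte[pcm16le.length / 2];
        for (int i = 0; i < out.length; i++) {
            // Little-endian: low byte at 2*i, high byte at 2*i + 1.
            out[i] = pcm16le[2 * i + 1];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] frame16 = new byte[160];       // 10 ms at 8 kHz, 16-bit mono
        byte[] frame8 = to8Bit(frame16);
        System.out.println(frame8.length);    // 80
    }
}
```

The output is signed 8-bit; WAV files conventionally store 8-bit samples as unsigned, so an offset of 128 would be needed when writing to that format.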

In a lab, 10 ms should be possible… Hard to tell. Is it a research project?

Yes, it’s a research project and the network is a closed network with probably only 2-10 connected computers/hosts - at first there will only be a point-to-point connection (i.e. 2 computers, not a “network”).

Modifying the data is unfortunately not an option, as that would kind of defeat the purpose. Perhaps ADPCM encoding, or something equivalent, will be used later on - but that is only speculation on my part.

128 kb/s should not be a problem, and neither should fewer than ~2000 packets/s (I guess).
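For reference, the figures above follow directly from the audio format. This is a small sketch with hypothetical names; it counts payload only and ignores protocol header overhead.

```java
/** Hypothetical helper: payload bit rate and packet rate for an uncompressed PCM stream. */
final class StreamRates {
    private StreamRates() {}

    /** Raw audio bit rate in bits per second. */
    static long bitsPerSecond(int sampleRate, int bitsPerSample, int channels) {
        return (long) sampleRate * bitsPerSample * channels;
    }

    /** Packets per second when every packet carries payloadBytes of audio. */
    static double packetsPerSecond(int sampleRate, int frameSizeBytes, int payloadBytes) {
        return (double) sampleRate * frameSizeBytes / payloadBytes;
    }

    public static void main(String[] args) {
        System.out.println(bitsPerSecond(8000, 16, 1));    // 128000
        System.out.println(packetsPerSecond(8000, 2, 80)); // 200.0
        System.out.println(packetsPerSecond(8000, 2, 8));  // 2000.0
    }
}
```

With 80-byte payloads the stream needs 200 packets/s; the 2000 packets/s ceiling would only be reached with 8-byte payloads, i.e. 0.5 ms of audio per packet.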

How do you set up your lines in JavaSound?
What version of Java are you using? It seems JavaSound in Java 6 is capable of much lower latency. I personally never succeeded in getting low-latency results on any Java version below Java 6.

Disclaimer: I’m currently only testing on my (kind of crappy) workstation and not on the lab computer - thus the “soundcard” is of quite dubious quality.

I’m keeping it simple and doing a straightforward test currently:


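import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineEvent;
import javax.sound.sampled.LineListener;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import java.util.logging.Logger; // assumed; the post does not say which Logger class is used

class DefaultLoudspeaker implements Loudspeaker, LineListener {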
	protected static final AudioFormat PCM_FORMAT = new AudioFormat(8000,   /* Sample-rate */
								16,     /* Bit-depth */
								1,      /* Channels */
								true,  /* Signed */
								false); /* Big-endian */
	private SourceDataLine speaker;
	private Logger log;
	private int bufferSize;
	
	public DefaultLoudspeaker(String name, int bufferTimeInMillis) {
		log = Logger.getLogger(name);
		int frameSize = PCM_FORMAT.getFrameSize();
		float sampleRate = PCM_FORMAT.getSampleRate();
		this.bufferSize = (int) (frameSize * sampleRate * bufferTimeInMillis / 1000);
	}

	public void init() throws LineUnavailableException {
		speaker = AudioSystem.getSourceDataLine(PCM_FORMAT);
		speaker.addLineListener(this);
		speaker.open(PCM_FORMAT, bufferSize);
		speaker.start();
		log.info(this + ": Speaker is operational");
		log.info(this + ": Speaker level: " + speaker.getLevel());
		log.info(this + ": Speaker buffer-size: " + speaker.getBufferSize());
	}
	
	/**
	 * Responsible for sending data to a loudspeaker.
	 * @param data 8kHz sampled audio with 16 bits depth.
	 */
	public void output8kHz16bit(byte[] data, int offset, int length) {
		speaker.write(data, offset, length);
	}

	public void update(LineEvent le) {
		log.info(le.toString());
	}

	public void close() {
		speaker.drain();
		speaker.close();
		speaker.removeLineListener(this);
		log.info(this + ": Speaker is closed");
	}
}

My initial conclusion is that the above code leads to an unusable speaker at buffer times below roughly 500 ms. Is this result reasonable, or do you think the poor quality of the soundcard is solely responsible?

Test code:


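import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class DefaultLoudspeakerUT {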

	private AudioInputStream data;
	private File file = new File("H:/Code/Java/near_end.wav");
	private Loudspeaker speaker;
	
	@Before
	public void setUp() throws Exception {
		try {
			assertTrue(file.exists());
			data = AudioSystem.getAudioInputStream(file);
			speaker = new DefaultLoudspeaker("Magic speaker", 500);
			speaker.init();
		} catch (RuntimeException re) {
			re.printStackTrace();
			fail(re.getMessage());
		}
	}

	@After
	public void tearDown() throws Exception {
		data.close();
		speaker.close();
	}

	@Test
	public void testOutput8kHz16bit() {
		System.out.printf("Reading data formatted as: %s\n", data.getFormat().toString());
		try {
			System.out.printf("Data available: %d\n", data.available());
			byte[] buffer = new byte[160];
			int totalRead = 0;
			int read;
			// read() returns -1 at end of stream; available() is only an estimate
			while ((read = data.read(buffer)) != -1) {
				speaker.output8kHz16bit(buffer, 0, read);
				totalRead += read;
			}
			System.out.printf("Data written: %d\n", totalRead);
		} catch (IOException e) {
			e.printStackTrace();
			fail(e.getMessage());
		}
	}

}

I’ve tried various buffer sizes and buffer delays.

Thanks in advance
/Anders

EDIT:
java version “1.7.0-ea”
Java™ SE Runtime Environment (build 1.7.0-ea-b07)
Java HotSpot™ Client VM (build 1.7.0-ea-b07, mixed mode)

I’ve tried a few different versions, including “stable” 1.6, etc.